java.lang.Object
  org.jsoup.parser.Parser
public class Parser
Parses HTML into a Document. Generally best to use one of the more convenient parse methods in Jsoup.
Constructor Summary |
---|
Parser(org.jsoup.parser.TreeBuilder treeBuilder): Create a new Parser, using the specified TreeBuilder. |
Method Summary | |
---|---|
java.util.List<ParseError> | getErrors(): Retrieve the parse errors, if any, from the last parse. |
org.jsoup.parser.TreeBuilder | getTreeBuilder(): Get the TreeBuilder currently in use. |
static Parser | htmlParser(): Create a new HTML parser. |
boolean | isTrackErrors(): Check if parse error tracking is enabled. |
static Document | parse(java.lang.String html, java.lang.String baseUri): Parse HTML into a Document. |
static Document | parseBodyFragment(java.lang.String bodyHtml, java.lang.String baseUri): Parse a fragment of HTML into the body of a Document. |
static Document | parseBodyFragmentRelaxed(java.lang.String bodyHtml, java.lang.String baseUri): Deprecated. Use parseBodyFragment(java.lang.String, java.lang.String) or parseFragment(java.lang.String, org.jsoup.nodes.Element, java.lang.String) instead. |
static java.util.List<Node> | parseFragment(java.lang.String fragmentHtml, Element context, java.lang.String baseUri): Parse a fragment of HTML into a list of nodes. |
Document | parseInput(java.lang.String html, java.lang.String baseUri) |
static java.util.List<Node> | parseXmlFragment(java.lang.String fragmentXml, java.lang.String baseUri): Parse a fragment of XML into a list of nodes. |
Parser | setTrackErrors(int maxErrors): Enable or disable parse error tracking for the next parse. |
Parser | setTreeBuilder(org.jsoup.parser.TreeBuilder treeBuilder): Update the TreeBuilder used when parsing content. |
static java.lang.String | unescapeEntities(java.lang.String string, boolean inAttribute): Utility method to unescape HTML entities from a string. |
static Parser | xmlParser(): Create a new XML parser. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public Parser(org.jsoup.parser.TreeBuilder treeBuilder)
treeBuilder - TreeBuilder to use to parse input into Documents.
Method Detail |
---|
public Document parseInput(java.lang.String html, java.lang.String baseUri)
public org.jsoup.parser.TreeBuilder getTreeBuilder()
public Parser setTreeBuilder(org.jsoup.parser.TreeBuilder treeBuilder)
treeBuilder - current TreeBuilder
public boolean isTrackErrors()
public Parser setTrackErrors(int maxErrors)
maxErrors - the maximum number of errors to track. Set to 0 to disable.
public java.util.List<ParseError> getErrors()
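The error-tracking methods above can be combined as in the following minimal sketch. It assumes jsoup is on the classpath; the class name `TrackErrorsSketch`, the helper `collectErrors`, and the sample markup are illustrative and not part of the library.

```java
import java.util.List;

import org.jsoup.parser.ParseError;
import org.jsoup.parser.Parser;

public class TrackErrorsSketch {

    // Hypothetical helper: parses the input with error tracking capped at
    // maxErrors and returns whatever errors the parser recorded.
    static List<ParseError> collectErrors(String html, int maxErrors) {
        // setTrackErrors returns the Parser, so the call can be chained.
        Parser parser = Parser.htmlParser().setTrackErrors(maxErrors);
        parser.parseInput(html, "");
        return parser.getErrors();
    }

    public static void main(String[] args) {
        // Deliberately malformed markup; the exact errors reported may
        // differ between jsoup versions.
        String html = "<p>Unclosed <b>bold</p></i>";
        for (ParseError error : collectErrors(html, 10)) {
            System.out.println(error);
        }
    }
}
```

Passing 0 as `maxErrors` disables tracking, so `getErrors()` then returns an empty list.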
public static Document parse(java.lang.String html, java.lang.String baseUri)
html - HTML to parse
baseUri - base URI of document (i.e. original fetch location), for resolving relative URLs.
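To illustrate how `baseUri` is used when resolving relative URLs, here is a small sketch. It assumes jsoup is on the classpath; the class name `ParseWithBaseUri`, the helper `resolveFirstLink`, and the example.com URLs are made up for illustration.

```java
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class ParseWithBaseUri {

    // Hypothetical helper: parses the HTML and returns the absolute URL of
    // the first link, resolved against the supplied base URI.
    static String resolveFirstLink(String html, String baseUri) {
        Document doc = Parser.parse(html, baseUri);
        Element link = doc.select("a").first();
        return link.absUrl("href");
    }

    public static void main(String[] args) {
        String html = "<a href=\"guide.html\">Guide</a>";
        // The relative href is resolved against the base URI.
        System.out.println(resolveFirstLink(html, "http://example.com/docs/"));
    }
}
```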
public static java.util.List<Node> parseFragment(java.lang.String fragmentHtml, Element context, java.lang.String baseUri)
fragmentHtml - the fragment of HTML to parse
context - (optional) the element that this HTML fragment is being parsed for (i.e. for inner HTML). This provides stack context (for implicit element creation).
baseUri - base URI of document (i.e. original fetch location), for resolving relative URLs.
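A minimal sketch of fragment parsing with a context element follows. It assumes jsoup is on the classpath and builds a detached `div` as the context; the class name `ParseFragmentSketch` and the helper `parseInDiv` are illustrative only.

```java
import java.util.List;

import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.parser.Parser;
import org.jsoup.parser.Tag;

public class ParseFragmentSketch {

    // Hypothetical helper: parses the fragment as if it were the inner HTML
    // of a <div>, and returns the resulting top-level nodes.
    static List<Node> parseInDiv(String fragmentHtml) {
        Element context = new Element(Tag.valueOf("div"), "");
        return Parser.parseFragment(fragmentHtml, context, "");
    }

    public static void main(String[] args) {
        for (Node node : parseInDiv("<p>One</p><p>Two</p>")) {
            System.out.println(node.nodeName() + ": " + node.outerHtml());
        }
    }
}
```

The context matters for markup that is only valid inside certain parents, such as table cells, because it decides which elements the parser creates implicitly.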
public static java.util.List<Node> parseXmlFragment(java.lang.String fragmentXml, java.lang.String baseUri)
fragmentXml - the fragment of XML to parse
baseUri - base URI of document (i.e. original fetch location), for resolving relative URLs.
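The XML variant can be sketched the same way. This example assumes jsoup is on the classpath; the class name `ParseXmlFragmentSketch`, the helper `parseItems`, and the `item` elements are invented for illustration.

```java
import java.util.List;

import org.jsoup.nodes.Node;
import org.jsoup.parser.Parser;

public class ParseXmlFragmentSketch {

    // Hypothetical helper: parses an XML fragment with an empty base URI and
    // returns its top-level nodes.
    static List<Node> parseItems(String fragmentXml) {
        return Parser.parseXmlFragment(fragmentXml, "");
    }

    public static void main(String[] args) {
        String xml = "<item id=\"1\">One</item><item id=\"2\">Two</item>";
        for (Node node : parseItems(xml)) {
            // Each top-level <item> comes back as its own node.
            System.out.println(node.nodeName() + " id=" + node.attr("id"));
        }
    }
}
```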
public static Document parseBodyFragment(java.lang.String bodyHtml, java.lang.String baseUri)
Parse a fragment of HTML into the body of a Document.
bodyHtml - fragment of HTML
baseUri - base URI of document (i.e. original fetch location), for resolving relative URLs.
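As a rough illustration, the sketch below parses a fragment and reads it back from the document body. It assumes jsoup is on the classpath; the class name `ParseBodyFragmentSketch` and the helper `bodyOf` are hypothetical.

```java
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class ParseBodyFragmentSketch {

    // Hypothetical helper: parses the fragment and returns the <body>
    // element of the Document that wraps it.
    static Element bodyOf(String bodyHtml) {
        Document doc = Parser.parseBodyFragment(bodyHtml, "");
        return doc.body();
    }

    public static void main(String[] args) {
        Element body = bodyOf("<p>Hello</p>");
        // The fragment's elements end up as children of <body>.
        System.out.println(body.children().size());
        System.out.println(body.text());
    }
}
```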
public static java.lang.String unescapeEntities(java.lang.String string, boolean inAttribute)
string - HTML escaped string
inAttribute - if the string is to be escaped in strict mode (as attributes are)
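A short sketch of unescaping text content, assuming jsoup is on the classpath. The class name `UnescapeSketch` and the helper `unescapeText` are illustrative; the example only exercises `inAttribute = false`.

```java
import org.jsoup.parser.Parser;

public class UnescapeSketch {

    // Hypothetical helper: unescapes entities as they would appear in
    // ordinary text content (inAttribute = false).
    static String unescapeText(String escaped) {
        return Parser.unescapeEntities(escaped, false);
    }

    public static void main(String[] args) {
        // prints: Fish & chips <b>
        System.out.println(unescapeText("Fish &amp; chips &lt;b&gt;"));
    }
}
```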
public static Document parseBodyFragmentRelaxed(java.lang.String bodyHtml, java.lang.String baseUri)
Deprecated. Use parseBodyFragment(java.lang.String, java.lang.String) or parseFragment(java.lang.String, org.jsoup.nodes.Element, java.lang.String) instead.
bodyHtml - HTML to parse
baseUri - base URI of document (i.e. original fetch location), for resolving relative URLs.
public static Parser htmlParser()
public static Parser xmlParser()
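The difference between the two factory methods can be seen by feeding the same input to both parsers through `parseInput`. This is a minimal sketch, assuming jsoup is on the classpath; the class name `HtmlVersusXmlSketch`, the helper `hasBody`, and the `item` element are illustrative.

```java
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

public class HtmlVersusXmlSketch {

    // Hypothetical helper: reports whether the parsed document contains a
    // <body> element.
    static boolean hasBody(Parser parser, String input) {
        Document doc = parser.parseInput(input, "");
        return !doc.select("body").isEmpty();
    }

    public static void main(String[] args) {
        String input = "<item>text</item>";
        // The HTML parser wraps the content in html, head, and body elements.
        System.out.println(hasBody(Parser.htmlParser(), input));
        // The XML parser keeps the input structure as given.
        System.out.println(hasBody(Parser.xmlParser(), input));
    }
}
```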