public class EntityResolver extends Object
Decodes (unescapes) HTML entities. The complication is that the input is received one character at a time, so characters must be stored temporarily until the entity is complete. We may also receive some "junk" characters before the actual entity; these are discarded.
This class is designed to be 100% compatible with the corresponding logic in the C version of the com.google.security.streamhtmlparser.HtmlParser, found in htmlparser.c. There are, however, a few intentional differences, outlined below:
- In the C version, processChar returns the output String, whereas in Java we return a status code and provide the String through a separate method, getEntity. This is cleaner because it avoids the need to return empty Strings during incomplete processing.
Valid HTML entities have one of the following three forms:
- &dd; where dd is a number in decimal (base 10) form.
- &x|Xyy; where yy is a hex number (base 16).
- &<html-entity>; where <html-entity> is one of lt, gt, amp, quot or apos.
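The three forms above can be sketched as a small decoder for the text found between `&` and `;`. This is a minimal illustration, not the library's implementation: the class `EntityBodySketch` and the method `decodeBody` are hypothetical names, returning `null` for unrecognized input is an assumption, and so is tolerating a leading `#` (standard HTML writes numeric character references as `&#65;` or `&#x41;`).

```java
// Hypothetical helper for illustration; it is not part of EntityResolver.
final class EntityBodySketch {

    /** Decodes the text between '&' and ';', or returns null if unrecognized. */
    static String decodeBody(String body) {
        // Form 3: one of the five named entities.
        switch (body) {
            case "lt":   return "<";
            case "gt":   return ">";
            case "amp":  return "&";
            case "quot": return "\"";
            case "apos": return "'";
            default:     break;
        }
        // Assumption: accept the '#' that standard HTML numeric references carry.
        String digits = body.startsWith("#") ? body.substring(1) : body;
        // Form 2: a leading x or X selects base 16; otherwise form 1, base 10.
        int radix = 10;
        if (digits.startsWith("x") || digits.startsWith("X")) {
            radix = 16;
            digits = digits.substring(1);
        }
        // Reject empty bodies and anything not starting with a digit (incl. signs).
        if (digits.isEmpty() || Character.digit(digits.charAt(0), radix) < 0) {
            return null;
        }
        try {
            int codePoint = Integer.parseInt(digits, radix);
            if (!Character.isValidCodePoint(codePoint)) {
                return null;
            }
            return new String(Character.toChars(codePoint));
        } catch (NumberFormatException e) {
            return null; // trailing non-digits or a value too large for int
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeBody("lt"));  // named form, prints <
        System.out.println(decodeBody("65"));  // decimal form, prints A
        System.out.println(decodeBody("x41")); // hex form, prints A
    }
}
```

Returning `null` rather than throwing keeps the sketch easy to call from a character loop; a real caller might prefer to pass unrecognized text through unchanged.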
A reset method is provided to facilitate object re-use.
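The calling pattern described above (feed one character at a time, check the returned status, fetch the result separately, then reset) might look like the following sketch. It uses a hypothetical stand-in class, `StreamingEntitySketch`, rather than the real EntityResolver, so that it runs on its own. The `IN_PROGRESS` status, the `IllegalStateException` thrown by `getEntity`, the pass-through of unknown names, and the handling of named entities only are all assumptions of this sketch, not documented behavior.

```java
// Hypothetical stand-in for illustration only; it is NOT the library class.
final class StreamingEntitySketch {

    // NOT_STARTED and COMPLETED appear in the documentation; IN_PROGRESS is assumed.
    enum Status { NOT_STARTED, IN_PROGRESS, COMPLETED }

    private final StringBuilder buffer = new StringBuilder();
    private Status status = Status.NOT_STARTED;
    private String entity = "";

    /** Consumes one character and reports how far decoding has progressed. */
    Status processChar(char input) {
        if (status == Status.COMPLETED) {
            return status; // assumption: ignore further input until reset()
        }
        if (status == Status.NOT_STARTED) {
            if (input == '&') {
                status = Status.IN_PROGRESS;
            }
            return status; // anything before '&' is junk and is discarded
        }
        if (input == ';') {
            entity = resolve(buffer.toString());
            status = Status.COMPLETED;
        } else {
            buffer.append(input); // store the character until the entity ends
        }
        return status;
    }

    /** Returns the decoded entity; only meaningful once COMPLETED. */
    String getEntity() {
        if (status != Status.COMPLETED) {
            // Assumption: the sketch fails loudly on premature calls.
            throw new IllegalStateException("entity not complete: " + status);
        }
        return entity;
    }

    /** Clears stored characters so the same object can decode another entity. */
    void reset() {
        buffer.setLength(0);
        entity = "";
        status = Status.NOT_STARTED;
    }

    private static String resolve(String name) {
        switch (name) {
            case "lt":   return "<";
            case "gt":   return ">";
            case "amp":  return "&";
            case "quot": return "\"";
            case "apos": return "'";
            default:     return "&" + name + ";"; // assumption: pass unknown names through
        }
    }

    public static void main(String[] args) {
        StreamingEntitySketch resolver = new StreamingEntitySketch();
        StringBuilder out = new StringBuilder();
        for (char c : "x&lt;&amp;".toCharArray()) {
            if (resolver.processChar(c) == Status.COMPLETED) {
                out.append(resolver.getEntity());
                resolver.reset(); // re-use the same object for the next entity
            }
        }
        System.out.println(out); // prints <&
    }
}
```

Because the status is returned on every call, the loop never has to test for empty Strings; it asks for the decoded text only when the status says it is ready.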
Modifier and Type | Class and Description |
---|---|
static class | EntityResolver.Status Returned by the processChar method. |
Constructor and Description |
---|
EntityResolver() Constructs an entity resolver that is initially empty and with status NOT_STARTED, see EntityResolver.Status. |
EntityResolver(EntityResolver aEntityResolver) Constructs an entity resolver that is an exact copy of the one provided. |
Modifier and Type | Method and Description |
---|---|
String | getEntity() Returns the decoded HTML Entity. |
EntityResolver.Status | processChar(char input) Processes a character from the input stream and decodes any HTML entities from that processed input stream. |
void | reset() Returns the object to its original state for re-use, deleting any stored characters that may be present. |
String | toString() Returns the full state of the EntityResolver in a human-readable form. |
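public EntityResolver()
Constructs an entity resolver that is initially empty and with status NOT_STARTED, see EntityResolver.Status.
public EntityResolver(EntityResolver aEntityResolver)
Constructs an entity resolver that is an exact copy of the one provided.
Parameters:
aEntityResolver - the entity resolver to copy
public void reset()
Returns the object to its original state for re-use, deleting any stored characters that may be present.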
public String toString()
Returns the full state of the EntityResolver in a human-readable form. The format of the returned String is not specified and is subject to change.
public String getEntity()
Returns the decoded HTML Entity, available once processChar has returned status COMPLETED.
Returns:
String
… if we were called with any status other than COMPLETED
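public EntityResolver.Status processChar(char input)
Processes a character from the input stream and decodes any HTML entities from that processed input stream.
Parameters:
input - the char to process
Returns:
the EntityResolver.Status after processing input. The decoded String is not returned here; it is available from getEntity once the status is COMPLETED, and until then more characters are awaited to complete processing of the entity.
Copyright © 2010–2017 Google. All rights reserved.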