java.lang.Object
java.io.InputStream
java.io.FilterInputStream
com.catcode.odf.OpenDocumentTextInputStream
OpenDocumentTextInputStream reads the content of an OASIS Open Document Format text (word processing) file.
Limitations/restrictions:
You can set two lists of element names (using the OpenDocumentElement class). The capture list is the list of elements whose text you want; the omit list is the list of elements within which text is never output. The default value for the capture list is <text:p> and <text:h>. The default value for the omit list is <text:tracked-changes>.
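The capture/omit rule can be illustrated with a small stand-alone sketch. This is not the library's implementation: the class name CaptureOmitSketch, the use of plain strings in place of OpenDocumentElement, and the depth-counter approach are all assumptions made for illustration. It ignores entities, comments, and attribute values that contain a > character.

```java
import java.util.Arrays;
import java.util.List;

class CaptureOmitSketch {
    // Hypothetical stand-ins for the documented default capture and omit lists.
    static final List<String> CAPTURE = Arrays.asList("text:p", "text:h");
    static final List<String> OMIT = Arrays.asList("text:tracked-changes");

    /** Returns the text found inside capture elements but outside omit elements. */
    static String extract(String xml) {
        StringBuilder out = new StringBuilder();
        int captureDepth = 0; // how many capture elements are currently open
        int omitDepth = 0;    // how many omit elements are currently open
        int i = 0;
        while (i < xml.length()) {
            char c = xml.charAt(i);
            if (c == '<') {
                int end = xml.indexOf('>', i);
                if (end < 0) {
                    break; // unterminated tag: stop
                }
                String tag = xml.substring(i + 1, end);
                boolean closing = tag.startsWith("/");
                boolean selfClosing = tag.endsWith("/");
                // Strip a leading or trailing slash, then keep the element name only.
                String name = tag.replaceAll("^/|/$", "").trim().split("\\s+")[0];
                int delta = closing ? -1 : (selfClosing ? 0 : 1);
                if (CAPTURE.contains(name)) {
                    captureDepth += delta;
                }
                if (OMIT.contains(name)) {
                    omitDepth += delta;
                }
                i = end + 1;
            } else {
                // Text is kept only inside a capture element and outside every omit element.
                if (captureDepth > 0 && omitDepth == 0) {
                    out.append(c);
                }
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<office:text><text:tracked-changes><text:p>old</text:p>"
                   + "</text:tracked-changes><text:h>Title</text:h>"
                   + "<text:p>Body <text:span>text</text:span></text:p>ignored</office:text>";
        System.out.println(extract(xml));
    }
}
```

In this sketch the omit list wins over the capture list, which matches the description of the omit list as elements "within which text is never output".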
Field Summary
Fields inherited from class java.io.FilterInputStream: in
Constructor Summary
OpenDocumentTextInputStream(java.io.InputStream in)
Constructs an OASIS Open Document Text input stream.
OpenDocumentTextInputStream(java.io.InputStream in, java.util.ArrayList capture, java.util.ArrayList omit)
Constructs an OASIS Open Document Text input stream.
Method Summary
protected void analyzeTag(java.lang.String tag)
Set flags to accept or reject characters in this tag.
protected void collectEntity()
Collect all characters up to and including the ending semicolon of the entity.
protected void collectTag()
Collects information between angle brackets into a string buffer.
protected int collectUTF8(int startByte)
Create a UTF-8 character from individual bytes.
protected void createUTF8Output(int value)
Split a Unicode value into UTF-8 bytes.
int read()
Reads the next byte of data from this input stream.
int read(byte[] b)
Reads some number of bytes from the input stream and stores them into the buffer array b.
int read(byte[] b, int off, int len)
Reads up to len bytes of data from the input stream into an array of bytes.
long skip(long n)
Skips the specified number of bytes in the current ODT file entry.
Methods inherited from class java.io.FilterInputStream: available, close, mark, markSupported, reset
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
public OpenDocumentTextInputStream(java.io.InputStream in)
in - the actual input stream
public OpenDocumentTextInputStream(java.io.InputStream in, java.util.ArrayList capture, java.util.ArrayList omit)
binarySearch()
.
If you want an empty list for either one of these, pass in an empty ArrayList. Passing in null selects the default capture or omit list.
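The binarySearch() fragment above is incomplete, but it suggests that the lists are searched with a binary search and so need to be in sorted order; that reading is an assumption. The sketch below shows the two behaviours described here (null selects a default, an empty list stays empty) using plain strings. The class ListDefaultsSketch and its methods are hypothetical, and the real lists hold OpenDocumentElement objects whose constructor and ordering are not shown on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;

class ListDefaultsSketch {
    // Hypothetical default, modelled on the documented default capture list (already sorted).
    static final ArrayList<String> DEFAULT_CAPTURE =
        new ArrayList<>(Arrays.asList("text:h", "text:p"));

    /** null selects the default list; any other list, even an empty one, is used as given. */
    static ArrayList<String> resolve(ArrayList<String> requested) {
        if (requested == null) {
            return new ArrayList<>(DEFAULT_CAPTURE);
        }
        ArrayList<String> sorted = new ArrayList<>(requested);
        Collections.sort(sorted); // Collections.binarySearch requires ascending order
        return sorted;
    }

    /** Looks up an element name in a sorted list. */
    static boolean contains(ArrayList<String> sortedList, String name) {
        return Collections.binarySearch(sortedList, name) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(contains(resolve(null), "text:p"));
        System.out.println(contains(resolve(new ArrayList<String>()), "text:p"));
    }
}
```

Sorting a copy inside resolve() is a choice made for this sketch; the library may instead expect the caller to pass lists that are already sorted.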
in - the actual input stream
capture - an ArrayList of elements whose content will be read by this stream
omit - an ArrayList of elements whose content will be ignored by this stream
Method Detail
public int read() throws java.io.IOException
Reads the next byte of data from this input stream. The value byte is returned as an int in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned. Only bytes within "relevant" elements (as listed in the relevantElement list) are returned.
This method blocks until input data is available, the end of the stream is detected, or an exception is thrown.
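A caller typically drains such a stream with a loop that stops at -1. The sketch below shows that loop against a plain ByteArrayInputStream, which stands in for an OpenDocumentTextInputStream so that the example runs without the library. The class ReadLoopSketch and the drain() helper are hypothetical, and decoding the collected bytes as UTF-8 is an assumption based on the UTF-8 helper methods documented on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ReadLoopSketch {
    /** Reads single bytes until read() returns -1, then decodes them as UTF-8. */
    static String drain(InputStream in) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            int b;
            while ((b = in.read()) != -1) { // each value is in the range 0 to 255
                bytes.write(b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in input; with the library on the classpath this would be the filtering stream.
        InputStream in = new ByteArrayInputStream("caf\u00e9".getBytes(StandardCharsets.UTF_8));
        System.out.println(drain(in));
    }
}
```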
the next byte of data, or -1 if the end of the stream is reached.
java.io.IOException - if an I/O error occurs.
public int read(byte[] b) throws java.io.IOException
Reads some number of bytes from the input stream and stores them into the buffer array b. The number of bytes actually read is returned as an integer.
java.io.IOException
public int read(byte[] b, int off, int len) throws java.io.IOException
Reads up to len bytes of data from the input stream into an array of bytes. The number of bytes actually read is returned as an integer. See InputStream for details. In fact, this code is copied straight from InputStream.
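The reason a filtering subclass needs its own bulk read is that FilterInputStream.read(byte[], int, int) forwards to the wrapped stream and so bypasses an overridden read(). The sketch below shows a byte-by-byte loop modelled on the one in java.io.InputStream. The class BulkReadSketch and its toy filter, which drops ASCII digits, are assumptions for the demonstration and are unrelated to the library's element filtering.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BulkReadSketch extends FilterInputStream {
    BulkReadSketch(InputStream in) {
        super(in);
    }

    /** Toy filter: skips ASCII digits and returns every other byte unchanged. */
    @Override
    public int read() throws IOException {
        int b;
        do {
            b = in.read();
        } while (b >= '0' && b <= '9');
        return b;
    }

    /** Fills the buffer through the filtering read() rather than the wrapped stream. */
    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (off < 0 || len < 0 || len > b.length - off) {
            throw new IndexOutOfBoundsException();
        }
        if (len == 0) {
            return 0;
        }
        int c = read();
        if (c == -1) {
            return -1; // end of stream before any byte was read
        }
        b[off] = (byte) c;
        int i = 1;
        try {
            for (; i < len; i++) {
                c = read();
                if (c == -1) {
                    break;
                }
                b[off + i] = (byte) c;
            }
        } catch (IOException e) {
            // Keep the bytes read so far; the error will resurface on the next call.
        }
        return i;
    }

    /** Demo helper: filters up to 64 bytes of ASCII input with one bulk read. */
    static String filter(String s) {
        byte[] input = s.getBytes(StandardCharsets.US_ASCII);
        try (InputStream stream = new BulkReadSketch(new ByteArrayInputStream(input))) {
            byte[] buf = new byte[64];
            int n = stream.read(buf, 0, buf.length);
            return n <= 0 ? "" : new String(buf, 0, n, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(filter("a1b22c"));
    }
}
```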
java.io.IOException
public long skip(long n) throws java.io.IOException
n - the number of bytes to skip
java.io.IOException - if an I/O error has occurred
java.lang.IllegalArgumentException - if n < 0
protected void collectEntity() throws java.io.IOException
This method will fill the utf8Output[] array, set utf8OutputLength appropriately, and set utf8OutputPosition to zero.
If the end of file is reached, -1 is placed in the utf8 buffer; the main loop in read() will emit it the next time through.
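As an illustration of the idea, the sketch below reads an entity up to its closing semicolon and produces the UTF-8 bytes of the character it names. It is not the library's code: the class EntitySketch, the returned byte array (in place of the utf8Output[] fields), and the '?' placeholder for unknown names are assumptions, and only the five predefined XML entities and numeric references are handled. Unlike the method described above, it returns an empty array at end of input instead of buffering -1.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class EntitySketch {
    /** Reads an entity body (the '&' is already consumed) through ';' and returns UTF-8 bytes. */
    static byte[] collectEntity(InputStream in) throws IOException {
        StringBuilder name = new StringBuilder();
        int b;
        while ((b = in.read()) != -1 && b != ';') {
            name.append((char) b);
        }
        if (b == -1) {
            return new byte[0]; // input ended before the semicolon
        }
        int codePoint = toCodePoint(name.toString());
        return new String(Character.toChars(codePoint)).getBytes(StandardCharsets.UTF_8);
    }

    /** Maps an entity name or numeric reference to a Unicode code point. */
    static int toCodePoint(String name) {
        switch (name) {
            case "amp":  return '&';
            case "lt":   return '<';
            case "gt":   return '>';
            case "quot": return '"';
            case "apos": return '\'';
            default:
                if (name.startsWith("#x")) {
                    return Integer.parseInt(name.substring(2), 16); // hexadecimal reference
                }
                if (name.startsWith("#")) {
                    return Integer.parseInt(name.substring(1));     // decimal reference
                }
                return '?'; // unknown entity: placeholder chosen for this sketch
        }
    }

    /** Demo helper: decodes the text that follows an ampersand. */
    static String decode(String afterAmpersand) {
        byte[] input = afterAmpersand.getBytes(StandardCharsets.US_ASCII);
        try {
            return new String(collectEntity(new ByteArrayInputStream(input)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("amp;"));
        System.out.println(decode("#x20AC;"));
    }
}
```

A malformed numeric reference makes Integer.parseInt throw NumberFormatException in this sketch; a production reader would need a policy for that case.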
java.io.IOException - if an I/O error occurs while reading bytes.
protected void createUTF8Output(int value)
Splits a Unicode value into UTF-8 bytes, places them in utf8Output[], and sets the utf8OutputLength appropriately.
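The split itself follows the standard UTF-8 bit layout. The sketch below returns the bytes as an int array instead of writing to the utf8Output[] and utf8OutputLength fields; the class Utf8EncodeSketch and its encode() method are hypothetical, and the input is assumed to be a valid code point in the range 0 to 0x10FFFF.

```java
class Utf8EncodeSketch {
    /** Splits a Unicode code point into one to four UTF-8 byte values (each 0 to 255). */
    static int[] encode(int value) {
        if (value < 0x80) {
            // 0xxxxxxx
            return new int[] { value };
        }
        if (value < 0x800) {
            // 110xxxxx 10xxxxxx
            return new int[] { 0xC0 | (value >> 6), 0x80 | (value & 0x3F) };
        }
        if (value < 0x10000) {
            // 1110xxxx 10xxxxxx 10xxxxxx
            return new int[] {
                0xE0 | (value >> 12),
                0x80 | ((value >> 6) & 0x3F),
                0x80 | (value & 0x3F)
            };
        }
        // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return new int[] {
            0xF0 | (value >> 18),
            0x80 | ((value >> 12) & 0x3F),
            0x80 | ((value >> 6) & 0x3F),
            0x80 | (value & 0x3F)
        };
    }

    public static void main(String[] args) {
        for (int b : encode(0x20AC)) {
            System.out.printf("%02X ", b);
        }
        System.out.println();
    }
}
```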
protected void collectTag() throws java.io.IOException
Reads from the file until a > symbol is encountered. If a byte has a value greater than 127, collectUTF8() is called to combine it and the following bytes into a Unicode character.
If the end of file is reached, -1 is placed in the utf8 buffer; the main loop in read() will emit it the next time through.
java.io.IOException - if an I/O error occurs while reading bytes.
protected int collectUTF8(int startByte) throws java.io.IOException
startByte - the starting byte of a UTF-8 sequence.
java.io.IOException
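A decoder of this kind inspects the starting byte to learn how many continuation bytes follow, then shifts six bits in from each one. The sketch below is a stand-alone illustration, not the library's method: the class Utf8DecodeSketch, the extra InputStream parameter, and the choices to return -1 on truncated input and to pass unrecognised starting bytes through unchanged are all assumptions. It does not validate continuation bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class Utf8DecodeSketch {
    /** Combines startByte with the continuation bytes read from in into one code point. */
    static int collectUTF8(int startByte, InputStream in) throws IOException {
        int extra;  // number of continuation bytes expected
        int value;  // payload bits gathered so far
        if ((startByte & 0xE0) == 0xC0) {
            extra = 1;
            value = startByte & 0x1F;
        } else if ((startByte & 0xF0) == 0xE0) {
            extra = 2;
            value = startByte & 0x0F;
        } else if ((startByte & 0xF8) == 0xF0) {
            extra = 3;
            value = startByte & 0x07;
        } else {
            return startByte; // ASCII or a stray continuation byte: passed through as is
        }
        for (int i = 0; i < extra; i++) {
            int next = in.read();
            if (next == -1) {
                return -1; // sequence cut short by end of input
            }
            value = (value << 6) | (next & 0x3F);
        }
        return value;
    }

    /** Demo helper: decodes one sequence given as byte values; the first is the start byte. */
    static int decode(int... bytes) {
        byte[] rest = new byte[bytes.length - 1];
        for (int i = 1; i < bytes.length; i++) {
            rest[i - 1] = (byte) bytes[i];
        }
        try {
            return collectUTF8(bytes[0], new ByteArrayInputStream(rest));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.printf("U+%04X%n", decode(0xE2, 0x82, 0xAC));
    }
}
```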
protected void analyzeTag(java.lang.String tag)
tag - the tag to be analyzed