mirror of
https://github.com/yacy/yacy_search_server.git
synced 2024-09-21 00:00:13 +02:00
cd5f349666
*) Extracted text of files larger than 5 MB is now stored in a temp file instead of being kept in memory *) plasmaParserDocument.java: getText now returns an InputStream instead of a byte array *) plasmaParserDocument.java: new function getTextBytes returns the parsed content as a byte array. Attention: the caller of this function has to ensure that enough memory is available, to avoid OutOfMemory exceptions *) httpd.java: better error handling if the SOAP handler is not installed *) pdfParser.java: - better handling of documents with exotic charsets - better handling of large documents - better error logging for encrypted documents *) rtfParser.java: bugfix for UTF-8 support *) tarParser.java: better handling of large documents *) zipParser.java: better handling of large documents *) plasmaCrawlEURL.java: new error code for encrypted documents *) plasmaParserDocument.java: the extracted text can now be passed to this object as a byte array or a temp file git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2679 6c8d7289-2bf4-0310-a012-ef5d649a1542 |
||
---|---|---|
build.xml | ||
rtfParser.java |
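
The commit message describes keeping small extracted texts in memory and spilling larger ones to a temp file, with `getText` returning a stream and `getTextBytes` returning a byte array. A minimal sketch of that idea follows. It is not the actual `plasmaParserDocument` code: the class name `ExtractedText`, the factory `from`, the helper `isFileBacked`, and the exact threshold constant are assumptions made for illustration. Only the method names `getText` and `getTextBytes` and the 5 MB limit come from the commit message.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;

// Hypothetical illustration; not the actual plasmaParserDocument class.
class ExtractedText {
    // Limit named in the commit message; the exact byte count is an assumption.
    static final long MAX_IN_MEMORY_BYTES = 5L * 1024 * 1024;

    private final byte[] bytes;   // non-null when the text is held in memory
    private final File tempFile;  // non-null when the text was spilled to disk

    ExtractedText(byte[] bytes) { this.bytes = bytes; this.tempFile = null; }
    ExtractedText(File tempFile) { this.bytes = null; this.tempFile = tempFile; }

    // Copy the parser output, switching to a temp file once the limit is exceeded.
    static ExtractedText from(InputStream in, long maxInMemory) throws IOException {
        ByteArrayOutputStream mem = new ByteArrayOutputStream();
        OutputStream out = mem;
        File file = null;
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        try {
            while ((n = in.read(buf)) != -1) {
                total += n;
                if (file == null && total > maxInMemory) {
                    file = File.createTempFile("extracted", ".txt");
                    file.deleteOnExit();
                    out = new FileOutputStream(file);
                    mem.writeTo(out); // move what was buffered so far to disk
                    mem = null;       // release the in-memory copy
                }
                out.write(buf, 0, n);
            }
        } finally {
            out.close();
        }
        return (file != null) ? new ExtractedText(file) : new ExtractedText(mem.toByteArray());
    }

    boolean isFileBacked() { return tempFile != null; }

    // Stream access works for both backings and never loads a temp file fully.
    InputStream getText() throws IOException {
        return (tempFile != null)
                ? Files.newInputStream(tempFile.toPath())
                : new ByteArrayInputStream(bytes);
    }

    // Loads everything into memory; the caller must ensure enough heap is available.
    byte[] getTextBytes() throws IOException {
        return (tempFile != null) ? Files.readAllBytes(tempFile.toPath()) : bytes;
    }
}
```

A caller that only needs to scan the text would read from `getText()` and close the stream afterwards. `getTextBytes()` is the convenience path that carries the memory warning from the commit message.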