yacy_search_server/source/de/anomic/plasma/parser
karlchenofhell 0a64047081 - plasmaParserDocument can process subdocuments now (other archive-parsers may want to use this method)
- added 7zip parser
- added 'text/sgml' to realtime parseable mimetypes (sometimes returned by the mime type parser)
- added new cached output stream class, very suitable for parsers because of limited memory

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3740 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-18 23:13:44 +00:00
bzip *) plasmaHTCache: 2006-10-03 11:05:48 +00:00
doc - removed differentiation between longTitle and shortTitle; this cannot be used for search results, 2007-03-18 12:33:19 +00:00
gzip *) plasmaHTCache: 2006-10-03 11:05:48 +00:00
mimeType *) plasmaHTCache: 2006-10-03 11:05:48 +00:00
odt - removed differentiation between longTitle and shortTitle; this cannot be used for search results, 2007-03-18 12:33:19 +00:00
pdf - removed differentiation between longTitle and shortTitle; this cannot be used for search results, 2007-03-18 12:33:19 +00:00
ppt *) adding additional file extension for powerpoint 2007-03-21 16:18:58 +00:00
ps *) adding first version of postscript parser 2007-04-01 15:02:07 +00:00
rpm *) adding rpm packager as author 2007-03-21 13:09:12 +00:00
rss *) RSS-parser extracts the author tags now 2007-03-21 13:35:32 +00:00
rtf - removed differentiation between longTitle and shortTitle; this cannot be used for search results, 2007-03-18 12:33:19 +00:00
sevenzip - plasmaParserDocument can process subdocuments now (other archive-parsers may want to use this method) 2007-05-18 23:13:44 +00:00
swf - removed differentiation between longTitle and shortTitle; this cannot be used for search results, 2007-03-18 12:33:19 +00:00
tar *) Fixed broken compile process. 2007-05-04 21:33:37 +00:00
vcf - removed differentiation between longTitle and shortTitle; this cannot be used for search results, 2007-03-18 12:33:19 +00:00
xls - removed differentiation between longTitle and shortTitle; this cannot be used for search results, 2007-03-18 12:33:19 +00:00
zip - removed differentiation between longTitle and shortTitle; this cannot be used for search results, 2007-03-18 12:33:19 +00:00
AbstractParser.java - plasmaParserDocument can process subdocuments now (other archive-parsers may want to use this method) 2007-05-18 23:13:44 +00:00
Parser.java *) plasmaHTCache: 2006-10-03 11:05:48 +00:00
ParserException.java *) Parser now throws a ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher) 2006-09-20 12:25:07 +00:00
ParserInfo.java