yacy_search_server/source/net/yacy/document/parser
Michael Peter Christen f3a6b6e21e fix for bad URL decoding
2014-07-10 01:59:29 +02:00
Name                   | Last commit | Date
augment                | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
html                   | fix for bad URL decoding | 2014-07-10 01:59:29 +02:00
images                 | strong redesign of html parser: object recursion is now made using a | 2014-04-10 18:58:03 +02:00
rdfa                   | fix: remove obsolete ref to yacy.home | 2014-04-04 02:45:04 +02:00
xml                    | do YaCy p2p connections using a timeout-request which covers the http | 2014-01-19 15:21:23 +01:00
apkParser.java         | added apkParser stub (work in progress) | 2014-06-27 10:15:01 +02:00
audioTagParser.java    | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
bzipParser.java        | - added a new Crawler Balancer: HostBalancer and HostQueues: | 2014-04-16 21:34:28 +02:00
csvParser.java         | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
docParser.java         | extract author and keywords in .doc and .ppt parser | 2014-06-29 02:54:09 +02:00
dwgParser.java         | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
genericParser.java     | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
gzipParser.java        | - added a new Crawler Balancer: HostBalancer and HostQueues: | 2014-04-16 21:34:28 +02:00
htmlParser.java        | added linkScraperParser, a parser which ignores the text like the | 2014-07-07 13:37:17 +02:00
linkScraperParser.java | added linkScraperParser, a parser which ignores the text like the | 2014-07-07 13:37:17 +02:00
mmParser.java          | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
odtParser.java         | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
ooxmlParser.java       | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
pdfParser.java         | optimize pdfParser | 2014-06-10 04:25:20 +02:00
pptParser.java         | extract author and keywords in .doc and .ppt parser | 2014-06-29 02:54:09 +02:00
psParser.java          | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
rdfParser.java         | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
rssParser.java         | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
rtfParser.java         | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
sevenzipParser.java    | - added a new Crawler Balancer: HostBalancer and HostQueues: | 2014-04-16 21:34:28 +02:00
sidAudioParser.java    | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
sitemapParser.java     | do YaCy p2p connections using a timeout-request which covers the http | 2014-01-19 15:21:23 +01:00
swfParser.java         | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
tarParser.java         | - added a new Crawler Balancer: HostBalancer and HostQueues: | 2014-04-16 21:34:28 +02:00
torrentParser.java     | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
vcfParser.java         | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
vsdParser.java         | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
xlsParser.java         | - replaced the properties object in AnchorURL with distinct variables | 2013-09-15 23:27:04 +02:00
zipParser.java         | - added a new Crawler Balancer: HostBalancer and HostQueues: | 2014-04-16 21:34:28 +02:00