yacy_search_server/source/net/yacy/document/parser
orbiter c288fcf634 redesigned CrawlStartScanner user interface and added more features:
- multiple hosts for environment scans can be given (comma-separated)
- each service (ftp, smb, http, https) for the scan can be selected
- the scan result can be accumulated or refreshed each time a network scan is made
- a scheduler was added to repeat a scan and add all found urls to the indexer automatically

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7378 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-12-16 02:15:20 +00:00
html - added CamelCase parser to MultiProtocolURI: generate better to-be-indexed words from urls 2010-12-15 00:03:19 +00:00
images - added a catch-all parser for all documents that cannot be parsed: they will contribute only their document URL to the search index 2010-11-30 16:13:55 +00:00
xml - added new protocol loader for 'file'-type URLs 2010-05-25 12:54:57 +00:00
bzipParser.java fixed bugs in parser and ftp client 2010-12-02 11:05:04 +00:00
csvParser.java Support for indexing of RSS feeds! 2010-08-25 18:24:54 +00:00
docParser.java Support for indexing of RSS feeds! 2010-08-25 18:24:54 +00:00
genericParser.java redesigned CrawlStartScanner user interface and added more features: 2010-12-16 02:15:20 +00:00
gzipParser.java fixed bugs in parser and ftp client 2010-12-02 11:05:04 +00:00
htmlParser.java - moved yacybot user agent string definition to MultiProtocolURI since there are basic access mechanisms where the bot string is needed 2010-09-27 14:54:32 +00:00
odtParser.java Support for indexing of RSS feeds! 2010-08-25 18:24:54 +00:00
ooxmlParser.java Support for indexing of RSS feeds! 2010-08-25 18:24:54 +00:00
pdfParser.java - added a catch-all parser for all documents that cannot be parsed: they will contribute only their document URL to the search index 2010-11-30 16:13:55 +00:00
pptParser.java Support for indexing of RSS feeds! 2010-08-25 18:24:54 +00:00
psParser.java Support for indexing of RSS feeds! 2010-08-25 18:24:54 +00:00
rssParser.java - enhancements for search speed 2010-10-04 11:54:48 +00:00
rtfParser.java Support for indexing of RSS feeds! 2010-08-25 18:24:54 +00:00
sevenzipParser.java Support for indexing of RSS feeds! 2010-08-25 18:24:54 +00:00
sitemapParser.java added a sitemap entry parser and loader for sitemaps 2010-11-03 19:48:33 +00:00
swfParser.java more performance hacks 2010-10-09 08:55:57 +00:00
tarParser.java fixed bugs in parser and ftp client 2010-12-02 11:05:04 +00:00
torrentParser.java - added a catch-all parser for all documents that cannot be parsed: they will contribute only their document URL to the search index 2010-11-30 16:13:55 +00:00
vcfParser.java Support for indexing of RSS feeds! 2010-08-25 18:24:54 +00:00
vsdParser.java Support for indexing of RSS feeds! 2010-08-25 18:24:54 +00:00
xlsParser.java Support for indexing of RSS feeds! 2010-08-25 18:24:54 +00:00
zipParser.java fixed bugs in parser and ftp client 2010-12-02 11:05:04 +00:00