yacy_search_server/source/net/yacy/crawler
Michael Peter Christen 038f956821 fix for sitemap detection: the sitemap URL was not detected if it
appeared after the robots allow/deny declarations for the crawler,
because the sitemap parser terminated once the allow/deny rules had
been found. The parser now reads robots.txt to the end so that it
also discovers sitemap rules at the end of the file.
2013-05-10 04:56:58 +02:00
data Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-04-26 10:50:08 +02:00
retrieval added collection attribute also to the rss feed reader 2013-04-24 01:14:35 +02:00
robots fix for sitemap detection: the sitemap url was not visible if it 2013-05-10 04:56:58 +02:00
Balancer.java infinity timeout bug protection patch 2013-04-30 11:06:48 +02:00
CrawlStacker.java - reduction of the concurrently running processes to make YaCy more 2013-04-25 11:33:17 +02:00
CrawlSwitchboard.java - added a new field for the regular expression in crawl start 2013-04-26 10:49:55 +02:00
HarvestProcess.java fix for wrong display of error urls in HostBrowser 2012-12-07 00:31:10 +01:00
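
The latest commit message above describes the robots.txt fix only in words. The following is a minimal sketch of the idea, not the actual code in the robots package: the class name RobotsSitemapSketch and the method findSitemaps are hypothetical, and the sketch assumes that a Sitemap line may appear anywhere in the file, so the scan has to continue past the allow/deny rules to the last line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of the YaCy source tree.
public class RobotsSitemapSketch {

    /** Collects every Sitemap URL in a robots.txt body, wherever it appears. */
    public static List<String> findSitemaps(String robotsTxt) {
        List<String> sitemaps = new ArrayList<>();
        for (String rawLine : robotsTxt.split("\\r?\\n")) {
            // Drop a trailing comment and surrounding whitespace.
            // (Simplification: a '#' inside the URL would also be cut off.)
            int hash = rawLine.indexOf('#');
            String line = (hash >= 0 ? rawLine.substring(0, hash) : rawLine).trim();
            // Do not stop after Allow/Disallow rules: keep scanning to the end,
            // because the Sitemap line may come last.
            if (line.regionMatches(true, 0, "sitemap:", 0, 8)) {
                String url = line.substring(8).trim();
                if (!url.isEmpty()) {
                    sitemaps.add(url);
                }
            }
        }
        return sitemaps;
    }

    public static void main(String[] args) {
        String robots = "User-agent: *\n"
                + "Disallow: /private/\n"
                + "Allow: /public/\n"
                + "\n"
                + "Sitemap: http://example.org/sitemap.xml\n";
        // prints [http://example.org/sitemap.xml]
        System.out.println(findSitemaps(robots));
    }
}
```

A parser that returned as soon as it had collected the allow/deny rules would miss the Sitemap line in this input, which is the failure the commit message describes.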