yacy_search_server/source/de/anomic/crawler
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain
- some restructuring of the document counting and logging structures was necessary
- better abstraction of CrawlProfiles
- added deletion of logs to the index deletion option (when the index is deleted using the servlets), which is necessary to reset the domain counters for the page limitation
- more refactoring to make the LibraryProvider cleaner
- some refactoring of the Condenser class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7478 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-12 00:01:40 +00:00
retrieval - fixed document number limitation for crawls that restrict the number of documents per domain 2011-02-12 00:01:40 +00:00
Balancer.java - fixed document number limitation for crawls that restrict the number of documents per domain 2011-02-12 00:01:40 +00:00
CrawlProfile.java - fixed document number limitation for crawls that restrict the number of documents per domain 2011-02-12 00:01:40 +00:00
CrawlQueues.java - fixed document number limitation for crawls that restrict the number of documents per domain 2011-02-12 00:01:40 +00:00
CrawlStacker.java - fixed document number limitation for crawls that restrict the number of documents per domain 2011-02-12 00:01:40 +00:00
CrawlSwitchboard.java - fixed document number limitation for crawls that restrict the number of documents per domain 2011-02-12 00:01:40 +00:00
ImporterException.java
Latency.java preparations to move the HTCache into cora: 2010-08-23 12:32:02 +00:00
NoticedURL.java - fixed document number limitation for crawls that restrict the number of documents per domain 2011-02-12 00:01:40 +00:00
ResourceObserver.java same units for memory observer configuration (MiB) 2011-01-02 20:38:01 +00:00
ResultImages.java *) cleaning up the code a little bit 2010-12-27 17:07:21 +00:00
ResultURLs.java - fixed document number limitation for crawls that restrict the number of documents per domain 2011-02-12 00:01:40 +00:00
RobotsEntry.java - replaced pdfbox and fontbox version 1.1.0 with 1.2.1 2010-09-07 17:13:47 +00:00
robotsParser.java added a sitemap entry parser and loader for sitemaps 2010-11-03 19:48:33 +00:00
RobotsTxt.java - moved yacybot user agent string definition to MultiProtocolURI since there are basic access mechanisms where the bot string is needed 2010-09-27 14:54:32 +00:00
RSSLoader.java * fix system update if urls are in blacklist (for example for very general blacklists like *.de) 2010-12-15 19:20:00 +00:00
SitemapImporter.java enhanced crawler: 2010-12-11 00:31:57 +00:00
ZURL.java * fix system update if urls are in blacklist (for example for very general blacklists like *.de) 2010-12-15 19:20:00 +00:00