yacy_search_server/source/de/anomic/crawler
2011-09-05 12:21:25 +00:00
..
retrieval fixes size of document in case the server doesn't give the size in the header 2011-09-05 12:21:25 +00:00
Balancer.java changed handling of RowSet element retrieval: until today all elements had been copied from the underlying byte[] arrays into a new Entry object that again had a copy of a portion of that byte[] in its own byte[]. There was an option to just refer to the underlying byte[] with a pointer but that was almost never used. This commit now changes an interface to the Row class where it is now necessary to tell if a copy is always required. Fortunately the copy is only needed in very rare cases. That means that this change should cause much less memory allocation; it is expected that this happens especially during search situations. 2011-07-15 08:38:10 +00:00
CrawlProfile.java *) Invalid crawl profiles (containing invalid mustmatch/mustnotmatch filters) will be moved from active crawls to invalid crawls (new file: DATA/INDEX/freeworld/QUEUES/crawlProfilesInvalid.heap). This file cannot be edited yet, but it should be easy to extend the CrawlProfileEditor accordingly. 2011-07-03 23:55:55 +00:00
CrawlQueues.java - not doing merge-jobs while short on Memory 2011-08-24 12:07:53 +00:00
CrawlStacker.java hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources: 2011-05-27 08:24:54 +00:00
CrawlSwitchboard.java *) Invalid crawl profiles (containing invalid mustmatch/mustnotmatch filters) will be moved from active crawls to invalid crawls (new file: DATA/INDEX/freeworld/QUEUES/crawlProfilesInvalid.heap). This file cannot be edited yet, but it should be easy to extend the CrawlProfileEditor accordingly. 2011-07-03 23:55:55 +00:00
ImporterException.java added final where possible 2008-08-02 12:12:04 +00:00
Latency.java - refactoring of robots 2011-05-02 14:05:51 +00:00
NoticedURL.java added a handling of appearances of yacy bot entries in robots.txt if this entry addresses the yacy peer 2011-04-03 23:39:45 +00:00
ResourceObserver.java Implementation of strategies for controlling memory resources. 2011-08-22 17:50:03 +00:00
ResultImages.java - fixed a bug in crawl start with file name (npe in new url) 2011-04-18 16:11:16 +00:00
ResultURLs.java refactoring: moved all score-related classes to new ranking package 2011-08-22 22:37:53 +00:00
RobotsTxt.java - enhanced ybr ranking computation 2011-05-26 10:57:02 +00:00
RobotsTxtEntry.java hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources: 2011-05-27 08:24:54 +00:00
RobotsTxtParser.java - refactoring of robots 2011-05-02 14:05:51 +00:00
RSSLoader.java stop loading via http at defined maximum of bytes - even if size is unknown before loading 2011-08-01 23:28:23 +00:00
SitemapImporter.java hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources: 2011-05-27 08:24:54 +00:00
ZURL.java changed handling of RowSet element retrieval: until today all elements had been copied from the underlying byte[] arrays into a new Entry object that again had a copy of a portion of that byte[] in its own byte[]. There was an option to just refer to the underlying byte[] with a pointer but that was almost never used. This commit now changes an interface to the Row class where it is now necessary to tell if a copy is always required. Fortunately the copy is only needed in very rare cases. That means that this change should cause much less memory allocation; it is expected that this happens especially during search situations. 2011-07-15 08:38:10 +00:00