yacy_search_server/source/net/yacy/crawler
luccioman dcad393fe5 Fixed exceeding max size of failreason_s Solr field on large link list
When using 'From Link-List of URL' as a crawl start with lists on the
order of a thousand or more links, the string representation of the URL
must-match filter exceeded the maximum size (32kb) of the failreason_s
Solr field whenever a crawl URL was rejected for not matching the filter.
2018-07-11 08:13:29 +02:00
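
The problem described in this commit message can be illustrated with a minimal sketch: cap the UTF-8 size of a fail reason before it is stored. This is not the code from commit dcad393fe5. The class `FailReasonLimiter`, its methods, the message wording, and the 32,000-byte margin are all hypothetical names and values chosen for illustration; only the failreason_s field and its roughly 32kb limit come from the commit message.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; not part of the YaCy code base. */
final class FailReasonLimiter {

    /** Assumed safety margin below the 32kb limit mentioned in the commit message. */
    public static final int MAX_FAILREASON_BYTES = 32_000;

    private static final String ELLIPSIS = "..."; // 3 ASCII chars, so 3 bytes in UTF-8

    private FailReasonLimiter() {
    }

    /** Returns text unchanged if it fits in maxBytes of UTF-8, else a shortened copy ending in "...". */
    public static String limitUtf8(final String text, final int maxBytes) {
        if (maxBytes < ELLIPSIS.length()) {
            throw new IllegalArgumentException("maxBytes too small: " + maxBytes);
        }
        if (text.getBytes(StandardCharsets.UTF_8).length <= maxBytes) {
            return text;
        }
        final int budget = maxBytes - ELLIPSIS.length();
        final StringBuilder sb = new StringBuilder();
        int used = 0;
        for (int i = 0; i < text.length();) {
            final int cp = text.codePointAt(i);
            // UTF-8 encoded size of this code point
            final int size = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (used + size > budget) {
                break; // never cut a character in the middle
            }
            sb.appendCodePoint(cp);
            used += size;
            i += Character.charCount(cp);
        }
        return sb.append(ELLIPSIS).toString();
    }

    /** Builds a rejection message (wording is illustrative) that stays within the assumed limit. */
    public static String mustMatchFailReason(final String url, final String mustMatchPattern) {
        final String reason = "url " + url + " does not match must-match filter " + mustMatchPattern;
        return limitUtf8(reason, MAX_FAILREASON_BYTES);
    }
}
```

With a link list of several thousand URLs, the generated must-match pattern can reach tens of kilobytes on its own, so under these assumptions `mustMatchFailReason` would return a shortened message rather than one the field cannot hold. Truncating on code-point boundaries keeps the stored string valid UTF-8.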
data Fixed exceeding max size of failreason_s Solr field on large link list 2018-07-11 08:13:29 +02:00
retrieval Added support for enclosures (media links) to the RSS loader 2018-03-21 08:22:29 +01:00
robots Small perf improvement : initialize threads names early when possible 2018-05-23 14:45:35 +02:00
Balancer.java Fixed display of crawler pending URLs counts in HostBrowser.html page. 2017-01-22 12:31:14 +01:00
CrawlStacker.java Fixed exceeding max size of failreason_s Solr field on large link list 2018-07-11 08:13:29 +02:00
CrawlStarterFromScraper.java Updated a license header typo. 2017-10-30 07:38:47 +01:00
CrawlSwitchboard.java Do not block whole server startup on persisted crawl profile load error 2018-06-19 12:48:17 +02:00
FileCrawlStarterTask.java removed transformer 2018-06-19 00:42:23 +02:00
HarvestProcess.java fix for wrong display of error urls in HostBrowser 2012-12-07 00:31:10 +01:00
HostBalancer.java Removed time condition on HostBalancer initialization in JUnit test. 2018-01-26 17:15:27 +01:00
HostQueue.java to prevent crawler to concurrently access and alter same crawl queue 2016-07-05 23:22:35 +02:00
IllegalCrawlProfileException.java Crawl from local file : faster task end when manually terminating crawl. 2016-10-22 09:11:20 +02:00
LegacyBalancer.java use supplied url port to get robots.txt in crawlers hostqueue 2016-03-02 00:12:34 +01:00
RecrawlBusyThread.java Create recrawl requests with the relevant crawl profile. 2018-01-30 21:00:18 +01:00