yacy_search_server

mirror of https://github.com/yacy/yacy_search_server.git synced 2024-09-19 00:01:41 +02:00

luccioman dcad393fe5	2018-07-11 08:13:29 +02:00
Fixed exceeding max size of failreason_s Solr field on large link list
When using 'From Link-List of URL' as a crawl start with a list of a thousand or more links, the string representation of the URL must-match filter exceeded the maximum size (32kb) of the failreason_s Solr field whenever a crawl URL was rejected for not matching the filter.
data	Fixed exceeding max size of failreason_s Solr field on large link list	2018-07-11 08:13:29 +02:00
retrieval	Added support for enclosures (media links) to the RSS loader	2018-03-21 08:22:29 +01:00
robots	Small perf improvement : initialize threads names early when possible	2018-05-23 14:45:35 +02:00
Balancer.java	Fixed display of crawler pending URLs counts in HostBrowser.html page.	2017-01-22 12:31:14 +01:00
CrawlStacker.java	Fixed exceeding max size of failreason_s Solr field on large link list	2018-07-11 08:13:29 +02:00
CrawlStarterFromScraper.java	Updated a license header typo.	2017-10-30 07:38:47 +01:00
CrawlSwitchboard.java	Do not block whole server startup on persisted crawl profile load error	2018-06-19 12:48:17 +02:00
FileCrawlStarterTask.java	removed transformer	2018-06-19 00:42:23 +02:00
HarvestProcess.java	fix for wrong display of error urls in HostBrowser	2012-12-07 00:31:10 +01:00
HostBalancer.java	Removed time condition on HostBalancer initialization in JUnit test.	2018-01-26 17:15:27 +01:00
HostQueue.java	to prevent crawler to concurrently access and alter same crawl queue	2016-07-05 23:22:35 +02:00
IllegalCrawlProfileException.java	Crawl from local file : faster task end when manually terminating crawl.	2016-10-22 09:11:20 +02:00
LegacyBalancer.java	use supplied url port to get robots.txt in crawlers hostqueue	2016-03-02 00:12:34 +01:00
RecrawlBusyThread.java	Create recrawl requests with the relevant crawl profile.	2018-01-30 21:00:18 +01:00
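
The failreason_s commit message above describes a rejection reason that grew past the field's 32kb limit because it embedded the whole must-match filter. A minimal sketch of one way to bound such a string follows. It is not the code from commit dcad393fe5: the class name `FailReasonTruncator`, the method `truncateUtf8`, the `"..."` marker, and the reading of "32kb" as 32 * 1024 bytes are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of YaCy: caps a failure reason at a byte budget.
class FailReasonTruncator {

    // Assumption: "32kb" from the commit message read as 32 * 1024 bytes.
    static final int MAX_FIELD_BYTES = 32 * 1024;

    private static final String MARKER = "..."; // ASCII, so 1 byte per char in UTF-8

    /**
     * Returns s unchanged if its UTF-8 encoding fits in maxBytes; otherwise
     * returns the longest prefix (cut on a code point boundary) that, together
     * with the trailing marker, stays within maxBytes.
     */
    static String truncateUtf8(String s, int maxBytes) {
        if (maxBytes < MARKER.length()) {
            throw new IllegalArgumentException("maxBytes too small for marker");
        }
        if (s.getBytes(StandardCharsets.UTF_8).length <= maxBytes) {
            return s;
        }
        int budget = maxBytes - MARKER.length();
        int bytes = 0;
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            // UTF-8 length of this code point
            int cpBytes = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + cpBytes > budget) {
                break;
            }
            bytes += cpBytes;
            i += Character.charCount(cp); // never split a surrogate pair
        }
        return s.substring(0, i) + MARKER;
    }
}
```

A caller would pass the assembled reason through `FailReasonTruncator.truncateUtf8(reason, FailReasonTruncator.MAX_FIELD_BYTES)` before storing it. Counting bytes rather than characters matters here because the limit quoted in the commit message is a size, and a must-match filter built from URLs may contain non-ASCII characters.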