mirror of
https://github.com/yacy/yacy_search_server.git
synced 2024-09-19 00:01:41 +02:00
aa9ddf3c23
When starting a crawl from a file containing thousands of links, the configuration setting "crawler.MaxActiveThreads" effectively prevents saturating the system with too many outgoing HTTP connection threads launched by the crawler. But robots.txt fetching was not affected by this setting and kept increasing the number of concurrently loading threads indefinitely until most of the connections timed out. To improve performance control, added a pool of threads for Robots.txt, consistently used in its ensureExist() and massCrawlCheck() methods. The maximum size of the Robots.txt thread pool can now be configured on the /PerformanceQueues_p.html page, or with the new "robots.txt.MaxActiveThreads" setting, initialized with the same default value as the crawler's.
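The bounding described in the commit can be sketched with a fixed-size `ExecutorService`: submissions beyond the cap queue instead of spawning new threads. This is a minimal illustration only, not YaCy's actual code; the class, method names, and the default of 30 are assumptions, and only the setting name "robots.txt.MaxActiveThreads" comes from the commit message.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of a bounded robots.txt checker pool. The cap would be read from a
// setting such as "robots.txt.MaxActiveThreads"; names here are illustrative.
public class RobotsPoolSketch {

    // Illustrative default; per the commit, the real default matches the crawler's.
    static final int MAX_ACTIVE_THREADS = 30;

    // Fixed-size pool: tasks beyond the cap wait in the queue rather than
    // launching ever more outgoing connection threads.
    private final ExecutorService pool = Executors.newFixedThreadPool(
            MAX_ACTIVE_THREADS,
            runnable -> {
                Thread t = new Thread(runnable, "robots.txt-fetcher");
                t.setDaemon(true); // background workers; do not block JVM shutdown
                return t;
            });

    /** Submit a robots.txt check for one host; runs on the bounded pool. */
    public Future<Boolean> check(final String host) {
        return pool.submit(() -> {
            // Placeholder for the real fetch and parse of the host's robots.txt.
            return host != null && !host.isEmpty();
        });
    }

    /** Convenience wrapper that blocks for the result. */
    public boolean checkBlocking(final String host) {
        try {
            return check(host).get();
        } catch (Exception e) {
            return false;
        }
    }
}
```

With this shape, a mass crawl check can submit thousands of hosts, and at most MAX_ACTIVE_THREADS robots.txt downloads run concurrently, matching the intent of capping the crawler's own connection threads.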
federatecfg
solr
freeworldKeystore
heuristicopensearch.conf
httpd.mime
oaiListFriendsSource.xml
RDFaParser.xsl
sessionid.names
solr.collection.schema
solr.webgraph.schema
web.xml
yacy.badwords.example
yacy.init
yacy.logging
yacy.network.allip.unit
yacy.network.freeworld.unit
yacy.network.intranet.unit
yacy.network.metager.unit
yacy.network.readme
yacy.network.webportal.unit
yacy.network.zeronet.unit
yacy.networks
yacy.stopwords
yacy.stopwords.de