yacy_search_server

mirror of https://github.com/yacy/yacy_search_server.git synced 2024-09-21 00:00:13 +02:00


Michael Peter Christen 8c3e5b7b6d Added experimental PDF splitting, which lets YaCy split PDFs into individual pages during parsing and add each page under its own URL. These URLs are constructed from the source URL by appending a page=<pagenumber> attribute to the URL's GET/POST properties, which distinguishes the individual page entries. The search result list then replaces that parameter with a URL anchor (#), so the original URL is presented in the search result. These URLs can be opened directly on the correct page using pdf.js, which is now built into Firefox. That is, if you find a search hit on page 5 and click the search result, Firefox opens the PDF viewer and shows page 5.	2014-12-21 18:10:15 +01:00
..
data	reactivated on-demand snapshot loading	2014-12-16 12:09:57 +01:00
retrieval	added experimental pdf splitting which enables YaCy to split pdfs during	2014-12-21 18:10:15 +01:00
robots	more ipv6 bugfixes	2014-10-08 15:21:49 +02:00
Balancer.java	- added a new Crawler Balancer: HostBalancer and HostQueues:	2014-04-16 21:34:28 +02:00
CrawlStacker.java	ViewFile servlet: update index if newer,	2014-12-05 01:13:37 +01:00
CrawlSwitchboard.java	enhanced the snapshot functionality:	2014-12-09 16:20:34 +01:00
HarvestProcess.java	fix for wrong display of error urls in HostBrowser	2012-12-07 00:31:10 +01:00
HostBalancer.java	reduce number of calls to queue.size() because that may be a bottleneck	2014-11-23 20:09:32 +01:00
HostQueue.java	more stacks shall be considered for on-demand loading, not only	2014-11-23 20:11:23 +01:00
LegacyBalancer.java	special strategy for balancer: do not remove targets with zero wait time	2014-04-18 06:50:07 +02:00
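
The URL scheme described in commit 8c3e5b7b6d (one index entry per PDF page, distinguished by an appended page=<pagenumber> parameter that is turned into a URL anchor for display) can be illustrated with a minimal sketch. This is not YaCy's actual code: the class name `PdfPageUrls` and both method names are hypothetical, and the `#page=<n>` fragment form is an assumption based on the fragment syntax that pdf.js accepts, since the commit message only says "a url anchor # mark".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper illustrating the per-page URL scheme; not part of YaCy. */
final class PdfPageUrls {

    // Matches a trailing "?page=<n>" or "&page=<n>" at the end of a URL.
    private static final Pattern PAGE_PARAM = Pattern.compile("[?&]page=(\\d+)$");

    private PdfPageUrls() {}

    /**
     * Builds the URL under which a single PDF page would be indexed,
     * by appending page=<pagenumber> to the query part of the source URL.
     * Assumes the source URL carries no fragment of its own.
     */
    static String pageUrl(String sourceUrl, int page) {
        if (page < 1) {
            throw new IllegalArgumentException("page must be >= 1");
        }
        // Use '&' if the URL already has a query string, '?' otherwise.
        String separator = sourceUrl.indexOf('?') >= 0 ? "&" : "?";
        return sourceUrl + separator + "page=" + page;
    }

    /**
     * Converts an indexed per-page URL into the form shown in search results:
     * the trailing page parameter becomes a "#page=<n>" anchor (assumed format),
     * so the link points at the original document. URLs without a trailing
     * page parameter are returned unchanged.
     */
    static String displayUrl(String indexedUrl) {
        Matcher m = PAGE_PARAM.matcher(indexedUrl);
        return m.find() ? m.replaceFirst("#page=$1") : indexedUrl;
    }
}
```

Under these assumptions, `PdfPageUrls.pageUrl("http://example.org/doc.pdf", 5)` yields `http://example.org/doc.pdf?page=5`, and passing that result to `displayUrl` yields `http://example.org/doc.pdf#page=5`. Keeping the page number in the query part for indexing gives every page a distinct URL, while the anchor form leaves the request sent to the server identical to the original document URL.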