mirror of
https://github.com/yacy/yacy_search_server.git
synced 2024-09-19 00:01:41 +02:00
0a879c98e7
hold a date for each URL to record when the URL was first seen. On recrawl, this date overwrites the modification date if the first-seen date is earlier than the latest document date. This is necessary because many content management systems always attach the current date to every document. With the firstSeen database, a document's real creation date can be approximated, provided the crawler runs frequently on the same domain. As a result, search results ordered by date are of much better quality, and YaCy works better as a search agent for the latest news. | ||
---|---|---|
.. | ||
graphics | ||
operation | ||
Accessible.java | ||
DHTSelection.java | ||
Dispatcher.java | ||
EventChannel.java | ||
Network.java | ||
NewsDB.java | ||
NewsPool.java | ||
NewsQueue.java | ||
PeerActions.java | ||
Protocol.java | ||
RemoteSearch.java | ||
Seed.java | ||
SeedDB.java | ||
Transmission.java |
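
The first-seen rule described in the commit message above can be sketched as follows. This is a minimal illustration under stated assumptions, not YaCy's actual code: the class `FirstSeenSketch`, the method `effectiveDate`, and the in-memory map standing in for the firstSeen database are all hypothetical names chosen for the example.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not YaCy's actual firstSeen implementation.
public class FirstSeenSketch {

    // In-memory stand-in for the firstSeen database: URL -> date first seen.
    private final Map<String, Date> firstSeen = new HashMap<>();

    /**
     * Records the crawl time for a URL on its first visit and returns the
     * date to use for indexing: the earlier of the first-seen date and the
     * date the document reports for itself.
     */
    public Date effectiveDate(String url, Date documentDate, Date crawlTime) {
        // putIfAbsent returns the stored date, or null on the first visit.
        Date seen = firstSeen.putIfAbsent(url, crawlTime);
        if (seen == null) {
            seen = crawlTime;
        }
        // Overwrite only when the first-seen date is before the document date.
        return seen.before(documentDate) ? seen : documentDate;
    }

    public static void main(String[] args) {
        FirstSeenSketch sketch = new FirstSeenSketch();
        String url = "http://example.org/article";
        Date firstCrawl = new Date(1_000_000L);
        Date recrawl = new Date(2_000_000L);

        // First crawl: the CMS stamps the page with the current time.
        System.out.println(sketch.effectiveDate(url, firstCrawl, firstCrawl).getTime());
        // Recrawl: the CMS reports the current time again; the first-seen date is kept.
        System.out.println(sketch.effectiveDate(url, recrawl, recrawl).getTime());
    }
}
```

Both calls in `main` print `1000000`: the recrawl does not push the document date forward, which is the effect the commit message describes for date-ordered search results. A document whose own date is older than its first-seen date keeps its own date.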