yacy_search_server/source/net/yacy/kelondro
Michael Peter Christen 7db0534d8a Added a zim parser to the surrogate import option.
You can now import zim files into YaCy by simply moving them
to the DATA/SURROGATE/IN folder. They will be fetched and, after
parsing, moved to DATA/SURROGATE/OUT.
There are exceptions where the parser is not able to identify the
original URL of the documents in the zim file. In that case the file
is simply ignored.
This commit also carries an important fix to the pdf parser and an
increase of the maximum parsing speed to 60000 PPM (pages per minute),
which should make it possible to index up to 1000 files per second.
2023-11-05 02:16:40 +01:00
blob      fix for "negative seek offset" error during extension of heap files.  2023-10-29 09:32:21 +01:00
data      Added a zim parser to the surrogate import option.                    2023-11-05 02:16:40 +01:00
index     Various javadoc fixes                                                 2022-01-26 11:22:43 +01:00
io        removed finalize() methods, deprecated                                2022-10-04 20:12:47 +02:00
logging   fixed exec start command where a path contains spaces                 2022-12-05 17:30:11 +01:00
rwi       reduced memory footprint during indexing/crawling                     2021-08-24 12:24:52 +02:00
table     removed finalize() methods, deprecated                                2022-10-04 20:12:47 +02:00
util      added npe protection                                                  2023-09-01 12:18:47 +02:00
workflow  added hazelcast and some modifications to align legacy YaCy with      2021-04-15 20:39:22 +02:00