Commit Graph

403 Commits

Author SHA1 Message Date
admin
fbf8ddd32d upgrade of jsoup 1.12.1 -> 1.14.2 2021-08-24 12:23:57 +02:00
Michael Peter Christen
4c889b7ff9 fixed build paths 2021-08-18 19:05:44 +02:00
Michael Peter Christen
e6a87e0426 enhanced crawler
A main problem when crawling is the long waiting time caused by crawl-delay
values from robots.txt entries. That attribute is not supported by
Google and is interpreted by Yandex and Bing in different ways. In large
crawls there is always one host which blocks the whole crawl with
extremely large values. YaCy still obeys crawl-delay but now limits it
to 10 seconds.
Additionally, the blocking logic when loading a new robots.txt was analyzed
and a deadlock was removed. Furthermore, the construction of new queue
lists was redesigned to ensure that a large list of different hosts
is always provided to the loader for host-balancing.
2021-08-17 15:23:21 +02:00
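The crawl-delay cap described in commit e6a87e0426 could be sketched roughly as follows. This is only an illustration of the clamping idea, not YaCy's actual code: the class name `CrawlDelayLimiter`, the method `effectiveDelayMillis`, and the choice to treat non-positive values as "no delay" are all assumptions made for this example. Only the 10-second upper bound comes from the commit message.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of YaCy: caps a robots.txt crawl-delay.
final class CrawlDelayLimiter {

    // Upper bound taken from the commit message ("limits them to 10 seconds").
    static final long MAX_CRAWL_DELAY_MILLIS = TimeUnit.SECONDS.toMillis(10);

    private CrawlDelayLimiter() {
        // static utility, no instances
    }

    /**
     * Returns the delay the crawler should actually wait for a host.
     * Assumption for this sketch: a missing or non-positive crawl-delay
     * means "no extra delay". Larger values are honoured up to the cap.
     */
    static long effectiveDelayMillis(final long robotsCrawlDelayMillis) {
        if (robotsCrawlDelayMillis <= 0L) {
            return 0L;
        }
        return Math.min(robotsCrawlDelayMillis, MAX_CRAWL_DELAY_MILLIS);
    }
}
```

Under these assumptions, a host that requests a 2-second delay is still waited on for 2 seconds, while a host that requests one hour is waited on for only 10 seconds, so a single host with an extreme value can no longer stall the whole crawl.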
Michael Peter Christen
8a2adb2b15 upgraded commons-compress lib
cause: alert in
https://github.com/yacy/yacy_search_server/security/dependabot/pom.xml/org.apache.commons:commons-compress/open
2021-08-07 18:21:54 +02:00
Michael Peter Christen
1cdb21592b added hazelcast and some modifications to align legacy YaCy with
YaCyGrid
2021-04-15 20:39:22 +02:00
sgaebel
88c6bc8cd7 adds missing solr lib: opentracing 0.33.0 2021-03-18 21:42:58 +01:00
Michael Peter Christen
8b4394a6c5 fixes for solr 8.8.1 migration
- replace new guava 30 with older 25 because that is the correct
dependency for solr 8.8.1. The newer one did actually not work!
- index will be created in a DATA/INDEX/freeworld/SEGMENTS/solr_8_8_1
subfolder. The older solr_6_6 index is not touched but also not
migrated. The index starts with fresh (empty) content.
- Older indexes must be migrated by hand (export/import) so far until a
better solution is found.
- Large schema adaptations for lucene 8.8.1
2021-03-08 13:39:27 +01:00
Michael Peter Christen
f4f3808d43 added missing new dependencies for migration to Solr 8
after pulling https://github.com/yacy/yacy_search_server/pull/403
2021-03-06 13:35:32 +01:00
Michael Peter Christen
43a9f4f574 updated solr 6.6.6 -> 7.7.3
dropped GSA support (GSA API is still in YaCy Grid)
The 6.6.6 solr index works without migration also with 7.7.3
2020-12-12 02:06:43 +01:00
Michael Peter Christen
3213d9db37 updated jetty from 9.4.17 to 9.4.35
and fixed a bug in ServerSideIncludes that appeared only in that recent
version of jetty
2020-12-03 00:21:15 +01:00
sgaebel
dd9d4b1188 replace org.junit.Assert.assertThat by
org.hamcrest.MatcherAssert.assertThat from hamcrest 2.2 to avoid
deprecation-warning
2020-07-28 19:09:26 +02:00
sgaebel
e039a797d2 bump to commons-codec-1.14, commons-compress-1.20,
commons-fileupload-1.4, commons-io-2.7, httpclient-4.5.12,
httpcore-4.4.13, httpmime-4.5.12 + remove unused commons-jxpath-1.3,
htmllexer
2020-07-26 21:58:15 +02:00
Michael Peter Christen
59181e8009 removed old jsoup lib from eclipse classpath 2020-01-16 19:54:33 +01:00
Michael Peter Christen
a704ebadcd build path fix 2020-01-16 19:48:52 +01:00
Michael Christen
4ccd1ea3c0 new servlet path "p2p"
with a test class.
Call the class with
http://localhost:8090/p2p/seeds.json
2020-01-15 15:24:36 +01:00
luccioman
8d3e029247 Upgraded Lucene/Solr dependencies from 6.6.5 to 6.6.6 2019-04-27 21:27:11 +02:00
luccioman
385c6a079d Upgraded Jetty dependencies from 9.4.15.v20190215 to 9.4.17.v20190418 2019-04-24 09:50:56 +02:00
luccioman
f8b94f9891 Upgraded PDFBox dependency from 2.0.14 to 2.0.15 2019-04-19 10:33:58 +02:00
luccioman
fc83d35f3f Upgraded PDFBox dependency from 2.0.11 to 2.0.14 2019-03-22 09:52:57 +01:00
luccioman
4c1428bd63 Upgraded Jetty dependencies from 9.4.14.v20181114 to 9.4.15.v20190215 2019-03-18 14:13:10 +01:00
luccioman
d3a114c7a9 Upgraded icu4j dependency from 62.1 to 63.1 2019-01-19 11:29:17 +01:00
sgaebel
eeae816cc4 bump to HTTPclient-4.5.6 2019-01-08 20:45:59 +01:00
luccioman
1b71c6bf87 Upgraded Jetty dependencies from 9.4.12.v20180830 to 9.4.14.v20181114 2018-12-21 15:03:37 +01:00
luccioman
6c3e140083 Upgraded Solr and Lucene dependencies from 6.6.3 to 6.6.5 2018-09-22 14:40:18 +02:00
luccioman
982179a7eb Upgraded BouncyCastle dependencies from jdk15:1.46 to jdk15on:1.60 2018-09-21 12:07:57 +02:00
luccioman
51f4be1807 Upgraded Jetty dependencies from 9.4.11.v20180605 to 9.4.12.v20180830 2018-09-14 14:03:44 +02:00
luccioman
baa7154486 Upgraded Apache PDFBox dependency from 2.0.9 to 2.0.11
Release notes at
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310760&version=12343466
and https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310760&version=12342889
2018-08-18 12:39:58 +02:00
luccioman
685122363d Added a parser for XZ compressed archives.
As suggested by LA_FORGE on mantis 781
(http://mantis.tokeek.de/view.php?id=781)
2018-08-15 10:07:39 +02:00
luccioman
1a91e87b05 Upgraded commons-compress dependency from version 1.16.1 to 1.17 2018-08-08 08:06:24 +02:00
luccioman
2bbf070f57 Upgraded icu4j dependency from 61.1 to 62.1 2018-07-26 09:39:54 +02:00
luccioman
8811700e2e Upgraded Jetty dependency from 9.4.9 to 9.4.11 2018-06-20 09:33:26 +02:00
reger
d5af160e60 upd to slf4j-1.7.25 2018-05-20 21:51:41 +02:00
reger
7525594315 upd to jwat-warc-1.1.1 2018-05-06 00:49:30 +02:00
reger
b81debca2e upd to jsoup-1.11.3 2018-04-28 23:24:24 +02:00
reger
508050f79c upd to icu4j-61.1 2018-04-14 16:16:35 +02:00
reger
e7971fb888 upd to pdfbox-2.0.9 2018-04-08 20:13:53 +02:00
reger
e2b2c89feb upd to jetty-9.4.9.v20180320 2018-04-07 23:39:03 +02:00
luccioman
c867a52d96 Upgraded Solr dependencies from 6.6.2 to 6.6.3 2018-04-05 18:15:45 +02:00
reger
a57a04a003 upd to commons-codec-1.11 2018-03-19 02:02:35 +01:00
luccioman
5753ce0ac5 Upgraded Jaudiotagger dependency from 2.0.3 to 2.2.5 2018-02-26 09:17:26 +01:00
reger
aaa0ec6613 upd to commons-compress-1.16.1 2018-02-23 19:17:09 +01:00
reger
73c6ce7ae5 upd to httpclient-4.5.5 2018-02-10 20:01:35 +01:00
reger
5aa4fb1144 upd to metadata-extractor-2.11.0.jar 2018-01-27 18:32:45 +01:00
reger
cedb53be4e upd to commons-io-2.6 2017-12-28 03:13:42 +01:00
reger
270b77074e upd to httpclient-4.5.4 and httpmime-4.5.4 2017-12-24 01:34:23 +01:00
reger
6db7f5525b upd to icu4j-60.2 2017-12-24 01:02:18 +01:00
reger
c94bc82f6a upd to commons-compress-1.15 2017-12-16 00:49:48 +01:00
reger
e5b4799838 upd to Jetty-9.4.8.v20171121 2017-12-07 00:24:33 +01:00
reger
0704b1d644 upd to httpcore-4.4.8 2017-12-04 01:12:50 +01:00
reger
a1879115dc upd to Jsoup-1.11.2 2017-11-26 22:01:42 +01:00