Commit Graph

4480 Commits

Author SHA1 Message Date
Michael Peter Christen
53b01dbf2e Merge branch 'master' of https://github.com/yacy/yacy_search_server.git 2023-11-01 18:57:04 +01:00
Michael Peter Christen
1c0df28bfb added a zim importer that can be used for surrogate imports.
Can not be used yet because it requires some security additions
to verify that the given urls actually work.
2023-11-01 18:48:40 +01:00
okybaca
4add1f6bc7 replaced all links to the legacy legacy wiki with links to the legacy wiki 2023-10-29 13:12:24 +01:00
Michael Peter Christen
4a54b24703 fix for "negative seek offset" error during extension of heap files.
This would always have happened when a heap file exceeds 2GB.
should fix https://github.com/yacy/yacy_search_server/issues/372
2023-10-29 09:32:21 +01:00
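The commit above does not say why the offset became negative. One common way such an error arises once a file grows past 2GB is 32-bit integer overflow in the offset arithmetic, since java.io.RandomAccessFile.seek rejects negative positions. The sketch below assumes that cause; the class and method names are hypothetical and are not taken from the YaCy source.

```java
// Illustrative sketch only: assumes the offset was computed with int
// arithmetic. The names here are hypothetical, not YaCy identifiers.
final class SeekOffsetSketch {

    // The multiplication is done in int and wraps around once the result
    // exceeds Integer.MAX_VALUE (about 2GB); widening to long comes too late.
    static long offsetWithIntMath(int recordCount, int recordSize) {
        return recordCount * recordSize;
    }

    // Widening one operand first makes the whole multiplication use long.
    static long offsetWithLongMath(int recordCount, int recordSize) {
        return (long) recordCount * recordSize;
    }
}
```

With 3,000,000 records of 1024 bytes the first method yields a negative number, while the second yields the intended position just under 3GB.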
Michael Peter Christen
5ba5fb5d23 upgraded pdfbox to 3.0.0 2023-10-27 12:05:24 +02:00
Michael Peter Christen
4308aa5415 removed concept of empty passwords as "no passwords used",
because we now start YaCy with a default password (yacy).
This has an impact on all functions that check the current state of
password-protection that included the empty password situation,
including the warnings to set a password in case that none is set (which
cannot be the case any more).
2023-10-25 22:56:06 +02:00
Michael Peter Christen
2c60ff14bb fixed default pw comparison 2023-10-25 13:59:02 +02:00
Michael Peter Christen
4da320bebf added a warning message in ConfigBasic in case that the default password
was not changed.
2023-10-24 23:36:26 +02:00
Michael Peter Christen
7830268be1 fix 756c817b5a
must be applied to all code where a transaction token is generated.
2023-10-21 13:00:49 +02:00
Michael Peter Christen
756c817b5a fix for https://github.com/yacy/yacy_search_server/issues/544 2023-10-21 11:45:26 +02:00
Michael Peter Christen
03bf259601 fix for https://github.com/yacy/yacy_search_server/issues/363
We still need to set the load in the process because a demand for higher
crawl speed may require increasing the maximum load limit. However,
following the criticism in the bug report, we never reduce the load limit
again.
2023-10-16 18:26:47 +02:00
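The behaviour described in the commit above (the load limit may be raised on demand but is never lowered again) can be sketched as a value that only ever grows. This is a minimal illustration; the class and method names are hypothetical and do not reflect the actual YaCy code.

```java
// Illustrative sketch only: a load limit that can be raised but not lowered.
// Class and method names are hypothetical, not YaCy identifiers.
final class LoadLimitSketch {
    private float maxLoad;

    LoadLimitSketch(float initialMaxLoad) {
        this.maxLoad = initialMaxLoad;
    }

    // Raises the limit if the crawl needs more; a smaller request is ignored,
    // so the limit is never reduced.
    float raiseIfNeeded(float requiredLoad) {
        maxLoad = Math.max(maxLoad, requiredLoad);
        return maxLoad;
    }
}
```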
mchristen
8fc51f66c6 fixed a test class which prevented compilation on latest jvm 2023-09-26 15:39:34 +02:00
Joel Strasser
53bafa1544 consistent formatting in string concatenation 2023-09-25 23:31:55 +02:00
Joel Strasser
22c4188001 additionally match release stub for YaCy version 2023-09-25 22:41:04 +02:00
Michael Peter Christen
ff8fe7b6a4 fix for ',' or '.' appearing within a word or number. The query
is no longer split into parts around such a character, which makes it
possible to search for numbers or version numbers.
2023-09-03 11:37:25 +02:00
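The rule in the commit above can be illustrated with a small tokenizer that keeps ',' or '.' when the character sits between two letters or digits and treats it as a separator otherwise. This is a sketch under that assumption, not the YaCy tokenizer; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only, not the YaCy tokenizer. Names are hypothetical.
final class QueryTokenizerSketch {

    static List<String> tokenize(String query) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            // ',' or '.' counts as part of the token only when it is
            // surrounded by letters or digits on both sides.
            boolean innerSeparator = (c == ',' || c == '.')
                    && i > 0 && i + 1 < query.length()
                    && Character.isLetterOrDigit(query.charAt(i - 1))
                    && Character.isLetterOrDigit(query.charAt(i + 1));
            if (Character.isLetterOrDigit(c) || innerSeparator) {
                current.append(c);
            } else if (current.length() > 0) {
                // Any other character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

For the query "solr 8.11.2, yacy" this keeps 8.11.2 as a single token, while the trailing comma still acts as a separator.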
Michael Peter Christen
0689f4f0ae Check if the character is a minus sign and is followed by a letter or a
digit. Treat it as part of the word/number.
2023-09-03 10:22:03 +02:00
Michael Peter Christen
5db97a8928 parser can now separate numbers from words even when they are
not separated by a space, e.g. 4.7Ohm
2023-09-02 19:15:22 +02:00
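One way to separate a number from a directly attached word, as in the 4.7Ohm example above, is to split at every position where a digit is immediately followed by a letter. The sketch below uses a zero-width regular expression for that; it is an assumption about how such a split could be done, not the parser's actual implementation, and the names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch only, not the YaCy parser. Names are hypothetical.
final class NumberWordSplitSketch {

    // Zero-width match: the position after a digit and before a letter.
    private static final Pattern DIGIT_THEN_LETTER =
            Pattern.compile("(?<=[0-9])(?=\\p{L})");

    static List<String> split(String token) {
        return Arrays.asList(DIGIT_THEN_LETTER.split(token));
    }
}
```

This sketch only handles the digit-to-letter boundary; a letter followed by a digit is left untouched.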
Michael Peter Christen
e3797de7de enhanced the word tokenizer to recognize numbers in a proper way 2023-09-01 20:10:08 +02:00
Michael Peter Christen
88cd17ea57 migrated solr from 8.9.0 to 8.11.2; activated also migration script. A YaCy index with solr 8.9.0 will automatically be migrated to 8.11.2. This is a preparation step to migrate to 9.0.0 soon. 2023-09-01 18:24:52 +02:00
Michael Peter Christen
0089f234f4 added npe protection 2023-09-01 12:18:47 +02:00
Michael Peter Christen
8285fe715a tab to spaces for classes supporting the condenser.
This is a preparation step to make changes in condenser and parser more
visible; no functional changes so far.
2023-09-01 11:00:42 +02:00
Michael Peter Christen
195bd2e444 extended the maximum header size to 16k to prevent http error 431 2023-08-19 15:21:24 +02:00
Michael Peter Christen
92dad3ed49 removed 7Zip parser because the old library could not be replaced by a maven repository 2023-07-27 23:11:27 +02:00
Michael Peter Christen
5afcba162b updated libraries 2023-07-27 22:55:46 +02:00
Michael Christen
a348146d8f setting connect host to 0.0.0.0 2023-06-29 10:46:05 +02:00
Michael Peter Christen
1c0f50985c fixed documentation and some details of handling of keywords 2023-04-04 12:41:12 +02:00
Michael Christen
3472bcb4d3 patched a 'java.lang.NoSuchMethodError: com.twelvemonkeys.imageio.util.IIOUtil.lookupProviderByName' problem which occurred only on ARM 2023-03-05 01:17:28 +01:00
Michael Christen
f7b6e98ed7 Merge pull request #562 from thkoch2001/fix-warnings
Fix warnings
2023-03-05 00:56:04 +01:00
Michael Peter Christen
a157d01bb5 increased network image size limit for linuxtage poster 2023-02-24 17:50:29 +01:00
Thomas Koch
6bca836f49 fix 3 javac warnings: redundant cast
see GitHub issue #561 for context

    [javac] /home/thk/git/yacy_search_server/source/net/yacy/htroot/ConfigAccounts_p.java:85: warning: [cast] redundant cast to YaCyHttpServer
    [javac]                 final YaCyHttpServer jhttpserver = (YaCyHttpServer)sb.getHttpServer();
    [javac]                                                    ^
    [javac] /home/thk/git/yacy_search_server/source/net/yacy/htroot/ConfigUser_p.java:156: warning: [cast] redundant cast to YaCyHttpServer
    [javac]                 final YaCyHttpServer jhttpserver = (YaCyHttpServer) sb.getHttpServer();
    [javac]                                                    ^
    [javac] /home/thk/git/yacy_search_server/source/net/yacy/htroot/ConfigUser_p.java:167: warning: [cast] redundant cast to YaCyHttpServer
    [javac]             final YaCyHttpServer jhttpserver = (YaCyHttpServer) sb.getHttpServer();
2023-02-11 17:17:46 +02:00
Michael Christen
9012fe4519 extended error message 2023-01-23 09:08:25 +01:00
Michael Christen
74104ff2d3 fix to timeout 2023-01-20 20:22:14 +01:00
Michael Peter Christen
9fcd8f1bda added canonical filter
attention: this is on by default!
(it should do the right thing)
2023-01-16 14:50:30 +01:00
Michael Peter Christen
5a52b01c09 front-end integration of tag valency 2023-01-15 20:13:45 +01:00
Michael Peter Christen
7f728bb4b4 crawl profile storage extension for tag valency 2023-01-15 14:11:32 +01:00
Michael Christen
4304e07e6f crawl profile adaptation to new tag valency attribute 2023-01-15 01:20:12 +01:00
Michael Peter Christen
5acd98f4da introduction of tag-to-indexing relation TagValency 2023-01-13 17:20:18 +01:00
Michael Peter Christen
ab3ef87abf fixed exec start command where a path contains spaces 2022-12-05 17:30:11 +01:00
Michael Peter Christen
17eec667fb better release number representation 2022-12-05 14:46:58 +01:00
Michael Peter Christen
b1199e97f8 enabling new update location release.yacy.net
with new version numbers
2022-12-05 14:26:17 +01:00
Michael Peter Christen
66169d1aad default build properties to remove a barrier to developing in
IDE environments
2022-12-05 12:28:36 +01:00
Michael Peter Christen
309adb814e fixed jsonlist import from searchlab.eu using a direct URL 2022-10-25 00:51:53 +02:00
Michael Peter Christen
5ddc794bb9 code cleanup in http client 2022-10-24 23:34:39 +02:00
Michael Peter Christen
62d177bf59 stub for jsonlist index importer web page 2022-10-23 12:22:31 +02:00
Michael Peter Christen
efa0425f00 refactoring: moved jsonlist importer to importer class 2022-10-23 11:35:32 +02:00
Michael Peter Christen
49daa32a88 yacy can now read searchlab export dump files
using the surrogate input process:
- copy the searchlab export file to DATA/SURROGATE/in
- the file is processed automatically and then moved to
DATA/SURROGATE/OUT
2022-10-23 11:01:58 +02:00
Michael Peter Christen
6042dd99c6 reduced danger that Tray does not initialize 2022-10-06 00:01:42 +02:00
Michael Christen
61b27217b9 throttle number of DNS requests:
as soon as the number of requests is > 50, there is a forced delay
of (10 * (requests - 50)) milliseconds. That means that once the number
of DNS requests reaches 150, there is a one-second delay on each request.

This should prevent a remote DNS server from being flooded with requests
and possibly being damaged.
This is also a fix/enhancement for
https://github.com/yacy/yacy_search_server/issues/513
2022-10-05 22:59:09 +02:00
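The delay rule from the commit above can be written out as a small function. The commit gives only the threshold (50) and the factor (10 milliseconds per extra request); how the request count is measured is not stated, so the sketch takes it as a plain parameter. The class, constant and method names are hypothetical.

```java
// Illustrative sketch only: the delay formula from the commit message.
// Names are hypothetical, not YaCy identifiers.
final class DnsThrottleSketch {
    static final int FREE_REQUESTS = 50;
    static final long MILLIS_PER_EXTRA_REQUEST = 10L;

    // No delay up to 50 requests, then 10 ms for every request above 50.
    static long delayMillis(int requests) {
        if (requests <= FREE_REQUESTS) {
            return 0L;
        }
        return MILLIS_PER_EXTRA_REQUEST * (requests - FREE_REQUESTS);
    }
}
```

At 150 requests this gives 10 * (150 - 50) = 1000 milliseconds, which matches the one-second delay mentioned in the commit message.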
Michael Christen
99174282d8 try to shut down in a somewhat more orderly way
inspired by https://github.com/yacy/yacy_search_server/issues/518
2022-10-05 22:13:06 +02:00
Michael Peter Christen
482f507e65 upgraded solr from 8.8.1 to 8.9.0
should hopefully fix
https://github.com/yacy/yacy_search_server/issues/496
because it includes https://issues.apache.org/jira/browse/SOLR-13034
2022-10-05 17:24:07 +02:00