Michael Christen
7f5d3e3a12
fixed name
2021-10-29 11:07:34 +02:00
Michael Peter Christen
552ab7051b
fix for warc importer
2021-10-25 19:35:15 +02:00
Michael Peter Christen
3c86b7b780
attempt to make a Mac Release using gradle
...
This is almost working with many workarounds:
- run rm lib/yacycore.jar
- run ./gradlew clean build bundleNative
- run ant clean all
- run again rm lib/yacycore.jar
- run ./fixMacBuild.sh
The build is then inside build/mac/YaCy.app
Right now this works, but the build does not contain the correct release
number.
The goal is to make this work for Windows releases as well and to embed the JRE
entirely.
2021-10-25 18:37:39 +02:00
Michael Peter Christen
49cae8ca62
network bootstrapping addresses update
2021-10-25 18:32:57 +02:00
Michael Peter Christen
8e4383c49e
downgrading gradle to 6.9
...
to be able to support org.mini2Dx.parcl
2021-10-25 18:32:34 +02:00
Michael Peter Christen
999c819e3e
Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
2021-10-24 20:50:14 +02:00
Michael Peter Christen
fd770e90e2
spike to identify paths for YaCy within mac application bundles
2021-10-24 20:49:59 +02:00
Michael Peter Christen
d19872fd26
making sure that crawl queues are closed correctly to prevent data loss
2021-10-14 00:30:04 +02:00
sgaebel
90507c0fdc
comments out printing query params to stdout
2021-10-04 18:03:06 +02:00
Michael Peter Christen
be0aebad84
fixes https://github.com/yacy/yacy_search_server/issues/424
2021-10-04 14:38:49 +02:00
Michael Peter Christen
63ad8ce6b2
removed ymarks
...
had not been used for a long time
2021-09-16 22:23:51 +02:00
Michael Peter Christen
ef5a71a592
enhanced crawl start response time
...
for very very large crawl start lists
2021-09-16 21:01:01 +02:00
Michael Peter Christen
1bab4ffe20
calculating the correct size of an export.
...
This can be seen as a fix for
https://github.com/yacy/yacy_search_server/issues/343
However, the export was not flawed. It only gives the impression that
something is wrong: the export size must be smaller than the index
size because the index also contains error documents.
Now an information line is presented that shows, e.g.:
"The local index currently contains 181,319 documents, only 106,887
exportable with status code 200 - the remaining are error documents."
2021-09-16 01:05:09 +02:00
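The information line described in this commit can be reproduced with a minimal sketch like the following. The class and method names are hypothetical and not taken from the YaCy sources; only the wording of the message comes from the commit text. The sketch assumes both counts are already known.

```java
import java.util.Locale;

/** Hypothetical helper, for illustration only; not part of the YaCy code base. */
final class ExportInfoLine {

    /** Number of error documents: everything in the index that is not exportable. */
    static long errorDocuments(final long totalDocs, final long exportableDocs) {
        return totalDocs - exportableDocs;
    }

    /** Builds the information line quoted in the commit message. */
    static String format(final long totalDocs, final long exportableDocs) {
        // %,d with Locale.US inserts ',' as the thousands separator.
        return String.format(Locale.US,
                "The local index currently contains %,d documents, only %,d "
                + "exportable with status code 200 - the remaining are error documents.",
                totalDocs, exportableDocs);
    }

    public static void main(final String[] args) {
        System.out.println(format(181_319L, 106_887L));
        System.out.println("error documents: " + errorDocuments(181_319L, 106_887L));
    }
}
```

With the numbers from the commit message, the difference between index size and export size is the number of error documents.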
Michael Peter Christen
4cadd557dc
removed synchronization in table creation
...
to avoid possible deadlocks when handling OnDemandOpenFileIndex
which happens quite often during wide crawling
2021-09-15 19:34:49 +02:00
Michael Peter Christen
8084960392
disabled citation index
...
that was created but never used
2021-09-15 18:46:37 +02:00
admin
9b7668fa58
reduced memory footprint during indexing/crawling
2021-08-24 12:24:52 +02:00
admin
fbf8ddd32d
upgrade of jsoup 1.12.1 -> 1.14.2
2021-08-24 12:23:57 +02:00
Ian Smirlis
53518a91ab
In case of reload404, load only failed documents
2021-08-19 20:49:59 +03:00
Michael Peter Christen
4c889b7ff9
fixed build paths
2021-08-18 19:05:44 +02:00
Michael Peter Christen
683cac125f
updated bouncy castle 1.60 -> 1.69
2021-08-17 15:48:54 +02:00
Michael Peter Christen
e6a87e0426
enhanced crawler
...
A main problem when crawling is the long waiting time caused by crawl-delay
values from robots.txt entries. That attribute is not supported by
Google and is interpreted by Yandex and Bing in different ways. In large
crawls there is always one host which blocks the whole crawl with
extremely large values. YaCy still obeys crawl-delay but now limits it
to 10 seconds.
Additionally, the blocking logic when loading new robots.txt was analyzed
and a deadlock was removed. Furthermore, the construction of new queue
lists was redesigned so that a large list of different hosts for
host-balancing is always provided to the loader.
2021-08-17 15:23:21 +02:00
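The crawl-delay limit described in this commit amounts to clamping the value read from robots.txt. The following is a minimal sketch of that idea only; the class, constant and method names are hypothetical and do not reflect how YaCy implements it. The 10-second bound is taken from the commit message.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, for illustration only; not part of the YaCy code base. */
final class CrawlDelayLimiter {

    /** Upper bound for an honoured crawl-delay (10 seconds, per the commit message). */
    static final long MAX_CRAWL_DELAY_MILLIS = TimeUnit.SECONDS.toMillis(10);

    /**
     * Returns the delay to apply between two requests to the same host.
     * Zero or negative input (assumed here to mean "no crawl-delay given") yields 0.
     */
    static long effectiveDelayMillis(final long robotsCrawlDelayMillis) {
        if (robotsCrawlDelayMillis <= 0L) {
            return 0L;
        }
        // Obey the requested delay, but never wait longer than the cap.
        return Math.min(robotsCrawlDelayMillis, MAX_CRAWL_DELAY_MILLIS);
    }

    public static void main(final String[] args) {
        System.out.println(effectiveDelayMillis(2_000L));     // small delay is kept
        System.out.println(effectiveDelayMillis(3_600_000L)); // one hour is capped
    }
}
```

A single host that requests an hour between fetches then delays the crawl by at most the capped value per request.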
Michael Peter Christen
e9c5e78868
replaced new Number(Number) with Number.valueOf
...
to remove deprecation warnings for Java 9
2021-08-08 00:39:03 +02:00
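The kind of change this commit describes can be illustrated as follows. The wrapper-class constructors such as `new Integer(int)` are deprecated since Java 9 in favour of the static `valueOf` factories. The class and method names below are hypothetical and only serve as an example.

```java
/** Hypothetical example class, for illustration only; not part of the YaCy code base. */
final class BoxingExample {

    /** Boxes a primitive int without the deprecated constructor. */
    static Integer box(final int value) {
        // Before (deprecated since Java 9): new Integer(value)
        // valueOf may return a cached instance instead of allocating a new object.
        return Integer.valueOf(value);
    }

    /** Parses a decimal string into a boxed Long without the deprecated constructor. */
    static Long parse(final String text) {
        // Before (deprecated since Java 9): new Long(text)
        return Long.valueOf(text);
    }

    public static void main(final String[] args) {
        System.out.println(box(42));
        System.out.println(parse("7"));
    }
}
```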
Michael Peter Christen
9e13d77de4
removed call to class.finalize() because of deprecation in java 9
...
next: removal of finalize() implementation
after testing with assert false
2021-08-07 18:57:49 +02:00
Michael Peter Christen
9ef4503672
fixed some newInstance() warnings
...
.. by adding .getDeclaredConstructor()
2021-08-07 18:46:53 +02:00
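`Class.newInstance()` is deprecated since Java 9, and the replacement named in this commit is to go through the no-argument constructor. A minimal sketch of that pattern follows; the factory class is hypothetical, and wrapping the checked exceptions in an `IllegalStateException` is a choice made for this example, not something the commit states.

```java
/** Hypothetical factory, for illustration only; not part of the YaCy code base. */
final class ReflectiveFactory {

    /** Creates an instance of the given type through its no-argument constructor. */
    static <T> T create(final Class<T> type) {
        try {
            // Before (deprecated since Java 9): type.newInstance()
            return type.getDeclaredConstructor().newInstance();
        } catch (final ReflectiveOperationException e) {
            // Covers NoSuchMethodException, InstantiationException,
            // IllegalAccessException and InvocationTargetException.
            throw new IllegalStateException("cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(final String[] args) {
        final StringBuilder sb = create(StringBuilder.class);
        System.out.println(sb.append("created reflectively"));
    }
}
```

Unlike the deprecated call, this form reports an exception thrown by the constructor as an `InvocationTargetException` instead of propagating it unchecked.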
Michael Peter Christen
82df012442
removed old lib
2021-08-07 18:23:22 +02:00
Michael Peter Christen
8a2adb2b15
upgraded commons-compress lib
...
cause: alert in
https://github.com/yacy/yacy_search_server/security/dependabot/pom.xml/org.apache.commons:commons-compress/open
2021-08-07 18:21:54 +02:00
Michael Peter Christen
9182b3dfca
enhanced default value
2021-08-05 09:18:05 +02:00
Michael Peter Christen
294d56d4a2
addressing better GC behavior after removing Xms with earlier heap increase strategy
2021-08-05 09:16:52 +02:00
Michael Peter Christen
3959d43a5c
fixed documentation link
2021-08-03 16:57:24 +02:00
Michael Peter Christen
c4659f0fb0
removed Debian and Red Hat build process
...
as announced in
https://twitter.com/yacy_search/status/1414608643241152516
because of a lack of community support for these kinds of
distributions. We will still support
tarball, Windows, Mac and Docker releases.
2021-07-19 20:33:52 +02:00
Michael Peter Christen
73360ed52b
add gradle to gitignore
2021-07-19 20:12:03 +02:00
Michael Peter Christen
15b7461bc7
removed Xms java memory startup parameter
...
We will use the default value from now on.
This is much better for resource economy and fits better into a
container/docker/kubernetes strategy.
Furthermore, a small memory footprint is essential for the usage on
small devices like RaspberryPi.
2021-07-19 20:04:11 +02:00
admin
c3b3087077
gradle cleanup
2021-07-14 14:07:49 +02:00
admin
a13986d659
replaced maven with gradle
2021-07-14 13:58:30 +02:00
Michael Peter Christen
1d41380f0a
better support for mac-specific tray functions in java 9
2021-07-12 17:27:59 +02:00
Michael Peter Christen
4377bd2b70
fix for wrong crawlName construction
2021-06-30 18:03:54 +02:00
Michael Peter Christen
e81b770f79
enabled crawl starts with very large sets of start urls
...
e.g. a 10 MB URL list with approx. 0.5 million start points
2021-06-30 10:45:58 +02:00
frankenstein91
4b73b3f9f2
docker has no latest-alpine
...
There is no yacy/yacy_search_server:latest-alpine on docker hub
2021-06-20 22:27:50 +02:00
Michael Peter Christen
c623a3252e
fix for jdk 14 bug
2021-04-23 09:11:03 +02:00
Michael Peter Christen
dbd211a1ad
removed/replaced reflection in memory tool
2021-04-22 20:24:13 +02:00
Michael Peter Christen
160f00e59e
removed reconfigure script which is seven years old and may not be up to
...
standards of current password implementation.
See https://github.com/yacy/yacy_search_server/issues/409 as hint
2021-04-15 20:41:01 +02:00
Michael Peter Christen
1cdb21592b
added hazelcast and some modifications to align legacy YaCy with
...
YaCyGrid
2021-04-15 20:39:22 +02:00
Michael Christen
42ea2a1c6f
Merge pull request #405 from jfhs/jfhs/support-all-html-entities
...
Improve HTML entities support
2021-03-31 01:44:54 +02:00
Michael Christen
b2af745dd6
Merge pull request #404 from lnceballosz/master
...
NGI0 - Updating licensing aspects according to REUSE
2021-03-30 23:48:21 +02:00
jfhs
10bddc2c2d
Decode HTML entities in all property values by default
2021-03-30 22:24:55 +02:00
jfhs
2135d259e3
Replace hardcoded html/xml entities with a file, support decoding all defined HTML entities
2021-03-30 22:24:54 +02:00
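Decoding entities from a table instead of hardcoded replacements could look roughly like the sketch below. This is an assumption-laden illustration, not the code from the pull request: the class name, the in-memory map standing in for the entity file, and the choice to leave unknown entities untouched are all hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical decoder, for illustration only; not part of the YaCy code base. */
final class EntityDecoder {

    /** Matches named entities such as &amp; and captures the name. */
    private static final Pattern ENTITY = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    /** Entity name to replacement text; assumed to be loaded from a file elsewhere. */
    private final Map<String, String> table;

    EntityDecoder(final Map<String, String> table) {
        this.table = table;
    }

    /** Replaces every known named entity; unknown entities are left unchanged. */
    String decode(final String text) {
        final Matcher m = ENTITY.matcher(text);
        final StringBuilder out = new StringBuilder(text.length());
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            final String replacement = this.table.get(m.group(1));
            out.append(replacement != null ? replacement : m.group());
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(final String[] args) {
        final EntityDecoder decoder = new EntityDecoder(Map.of("amp", "&", "eacute", "\u00e9"));
        System.out.println(decoder.decode("caf&eacute; &amp; &unknown;"));
    }
}
```

Keeping the table outside the code means new entities can be supported by extending the data rather than the decoder.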
Michael Peter Christen
8f876a8c72
added concurrency to enhance indexing speed during json surrogate import
2021-03-30 12:07:36 +02:00
Michael Peter Christen
f8cbaeef93
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
2021-03-29 18:46:53 +02:00
Michael Peter Christen
a857e3d3d5
fix for json importer
2021-03-29 18:46:42 +02:00
sgaebel
7fecd859e5
fixes showing metadata from Searchresult, by removing defType=edismax
...
also removes defType=edismax from IndexBrowser, but still does not show
dates
2021-03-21 00:06:26 +01:00