reger
6a8d351df3
Enable Gradle test and add manifest to yacycore.jar, set outputdir to lib
...
as part of migration from Ant to Gradle builds.
2022-01-18 11:55:21 +01:00
reger24
d411566c45
Obsolete since replacement of Maven by Gradle build script
2022-01-16 05:36:24 +01:00
Michael Christen
c94b9b8197
Merge pull request #439 from otteresk/master
...
Adding sort functionality to Crawler Queue tables
2022-01-10 01:08:56 +01:00
Andreas
590f39b403
Add Sorting functionality to Crawler Queue Table
...
Allow sorting by count and host
2022-01-09 16:06:14 +01:00
Andreas
41e87a44bc
Merge pull request #6 from yacy/master
...
Update fork #6
2022-01-09 15:59:03 +01:00
Michael Christen
7abfeb221b
Merge pull request #436 from ZeroCool940711/master
...
Improved the Image search page to have bigger thumbnails, use a bigger area for results and have a smaller sidebar.
2021-12-28 02:55:01 +01:00
ZeroCool940711
7e765b8483
Improved the Image search page to have bigger thumbnails, use a bigger area for results and a smaller left sidebar.
2021-12-26 23:41:04 -07:00
Michael Peter Christen
6fe905bb82
feature https://github.com/yacy/yacy_search_server/issues/434
2021-12-26 23:33:31 +01:00
Michael Peter Christen
d7b17d8935
fixed missing thread name revert after balancer waiting
2021-12-22 01:46:18 +01:00
Michael Peter Christen
9c38b1254e
proper deletion of loadtime index
2021-12-22 01:22:46 +01:00
Michael Peter Christen
bd3f2483a1
replaced url and date retrieval by only url retrieval
...
This should prevent the search index from being used to check the freshness of the
index entry.
2021-12-20 16:23:05 +01:00
Michael Peter Christen
163ba26d90
replaced check for load time method
...
Instead of loading the Solr document, a separate index holding only the last
load time was created. This prevents Solr from having to fetch from its index
while the index is being created. Excessive re-loading of documents while
indexing has been shown to produce deadlocks, so this should now be
prevented.
2021-12-20 03:47:56 +01:00
Michael Peter Christen
1ead7b85b5
remove compiler warning
...
"warning: [try] explicit call to close() on an auto-closeable resource"
2021-12-13 12:28:34 +01:00
Michael Peter Christen
3dc6613096
updating slf4j 1.7.25 -> 1.7.32
2021-12-13 12:26:49 +01:00
Michael Christen
cd0ff48e99
there is no (more) log4j in YaCy
2021-12-12 13:53:19 +01:00
Michael Peter Christen
59777010dc
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
2021-11-18 00:49:56 +01:00
Michael Peter Christen
7898815c41
disabling concurrent logging
...
(maybe temporary)
2021-11-18 00:49:46 +01:00
sgaebel
4bf6954474
uses clientBuilder not HttpClients.custom() to have these inside the
...
Pool too
2021-10-31 23:06:33 +01:00
sgaebel
cdf901270c
always use HTTPClient by 'try with resources' pattern to free up
...
resources
2021-10-31 23:06:23 +01:00
sgaebel
69adaa9f55
makes our HTTPClient closable
2021-10-31 23:06:02 +01:00
sgaebel
fc4275f901
handle all references for client, response, request to be able to close
...
them
2021-10-31 23:05:50 +01:00
sgaebel
1cdc55a425
lets SOLR merge bigger segments (up to 50GB)
...
+ some setting to reduce caches
2021-10-31 11:33:42 +01:00
sgaebel
e7d3a363f2
refactor to use finish()
2021-10-31 11:22:35 +01:00
sgaebel
4fc876f4a3
revert back to use EntityUtils.consumeQuietly - as it simply closes the
...
underlying stream
2021-10-31 11:22:28 +01:00
sgaebel
4f0392e93e
refactor use of AuthSchemeProvider
2021-10-31 11:21:59 +01:00
sgaebel
b74f337859
removes double setting of UserAgent
2021-10-31 11:21:06 +01:00
sgaebel
965748fefb
some refactoring using try with resources
2021-10-31 11:20:28 +01:00
Michael Christen
f4834e8e31
link fix
2021-10-29 11:10:23 +02:00
Michael Christen
7f5d3e3a12
fixed name
2021-10-29 11:07:34 +02:00
Michael Peter Christen
552ab7051b
fix for warc importer
2021-10-25 19:35:15 +02:00
Michael Peter Christen
3c86b7b780
attempt to make a Mac Release using gradle
...
This is almost working with many workarounds:
- run rm lib/yacycore.jar
- run ./gradlew clean build bundleNative
- run ant clean all
- run again rm lib/yacycore.jar
- run ./fixMacBuild.sh
The build is then inside build/mac/YaCy.app
Right now this works, but the build does not have the correct release
number inside.
The goal is to make this work for Windows releases and to embed the JRE
entirely.
2021-10-25 18:37:39 +02:00
Michael Peter Christen
49cae8ca62
network bootstrapping addresses update
2021-10-25 18:32:57 +02:00
Michael Peter Christen
8e4383c49e
downgrading gradle to 6.9
...
to be able to support org.mini2Dx.parcl
2021-10-25 18:32:34 +02:00
Michael Peter Christen
999c819e3e
Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
2021-10-24 20:50:14 +02:00
Michael Peter Christen
fd770e90e2
spike to identify paths for YaCy within mac application bundles
2021-10-24 20:49:59 +02:00
Michael Peter Christen
d19872fd26
making sure that crawl queues are closed correctly to prevent data loss
2021-10-14 00:30:04 +02:00
sgaebel
90507c0fdc
comments out printing query params to std.out
2021-10-04 18:03:06 +02:00
Michael Peter Christen
be0aebad84
fixes https://github.com/yacy/yacy_search_server/issues/424
2021-10-04 14:38:49 +02:00
Michael Peter Christen
63ad8ce6b2
removed ymarks
...
had not been used for a long time
2021-09-16 22:23:51 +02:00
Michael Peter Christen
ef5a71a592
enhanced crawl start response time
...
for very very large crawl start lists
2021-09-16 21:01:01 +02:00
Michael Peter Christen
1bab4ffe20
calculating the correct size of an export.
...
This can be seen as a fix for
https://github.com/yacy/yacy_search_server/issues/343
However, the export was not flawed; it only gives the impression that
something is wrong. The export size must be smaller than the index
size because the index also contains error documents.
Now an information line is presented that shows, e.g.:
"The local index currently contains 181,319 documents, only 106,887
exportable with status code 200 - the remaining are error documents."
2021-09-16 01:05:09 +02:00
Michael Peter Christen
4cadd557dc
removed synchronization in table creation
...
to avoid possible deadlocks when handling OnDemandOpenFileIndex
which happens quite often during wide crawling
2021-09-15 19:34:49 +02:00
Michael Peter Christen
8084960392
disabled citation index
...
that was created but never used
2021-09-15 18:46:37 +02:00
admin
9b7668fa58
reduced memory footprint during indexing/crawling
2021-08-24 12:24:52 +02:00
admin
fbf8ddd32d
upgrade of jsoup 1.12.1 -> 1.14.2
2021-08-24 12:23:57 +02:00
Michael Peter Christen
4c889b7ff9
fixed build paths
2021-08-18 19:05:44 +02:00
Michael Peter Christen
683cac125f
updated bouncy castle 1.60 -> 1.69
2021-08-17 15:48:54 +02:00
Michael Peter Christen
e6a87e0426
enhanced crawler
...
A main problem when crawling is the long waiting time caused by crawl-delay
values from robots.txt entries. That attribute is not supported by
Google and is interpreted by Yandex and Bing in different ways. In large
crawls there is always one host which blocks the whole crawl with
extremely large values. YaCy still obeys crawl-delay but now limits it
to 10 seconds.
Additionally, the blocking logic when loading new robots.txt files was analyzed
and a deadlock was removed. Furthermore, the construction of new queue
lists was redesigned to ensure that the loader is always provided with a
large list of different hosts for host-balancing.
2021-08-17 15:23:21 +02:00
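The crawl-delay cap described in the commit above could look roughly like the following. This is a minimal sketch, not YaCy's actual code: the class name `CrawlDelayCap`, the method `effectiveDelayMillis`, and the treatment of non-positive values as "no delay" are assumptions made for illustration. Only the 10-second limit comes from the commit message.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper; names are illustrative and not taken from YaCy.
final class CrawlDelayCap {

    // Upper bound from the commit message: crawl-delay is limited to 10 seconds.
    static final long MAX_DELAY_MILLIS = TimeUnit.SECONDS.toMillis(10);

    // Returns the delay to obey for a host, given the crawl-delay
    // (in milliseconds) announced in its robots.txt.
    static long effectiveDelayMillis(long robotsDelayMillis) {
        if (robotsDelayMillis <= 0) {
            return 0L; // assumption: missing or invalid values mean no delay
        }
        return Math.min(robotsDelayMillis, MAX_DELAY_MILLIS);
    }

    public static void main(String[] args) {
        // A moderate delay is obeyed unchanged.
        System.out.println(effectiveDelayMillis(3_000L));
        // A one-hour delay is clamped to the cap.
        System.out.println(effectiveDelayMillis(3_600_000L));
    }
}
```

With a clamp of this kind, a single host announcing a very large crawl-delay can no longer stall the whole crawl, while ordinary delays are still respected.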
Michael Peter Christen
e9c5e78868
replaced new Number(Number) with Number.valueOf
...
to remove deprecation warnings for Java 9
2021-08-08 00:39:03 +02:00
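For context on the commit above: the constructors of the boxed number types (for example `new Integer(int)`) are deprecated since Java 9 in favor of the static `valueOf` factory methods. The following sketch illustrates the kind of replacement involved. The class `BoxingExample` and its methods are hypothetical and not taken from the YaCy sources.

```java
// Hypothetical illustration; not code from YaCy.
final class BoxingExample {

    // Before (deprecated since Java 9): return new Integer(value);
    // After: the factory method, which may reuse cached instances.
    static Integer box(int value) {
        return Integer.valueOf(value);
    }

    static Long box(long value) {
        return Long.valueOf(value);
    }

    public static void main(String[] args) {
        Integer a = box(42);
        Long b = box(42L);
        System.out.println(a + " " + b);
    }
}
```

Besides removing the deprecation warning, `valueOf` can avoid allocations because small values are served from a cache.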
Michael Peter Christen
9e13d77de4
removed call to class.finalize() because of deprecation in java 9
...
next: removal of finalize() implementation
after testing with assert false
2021-08-07 18:57:49 +02:00