Commit Graph

481 Commits

Author SHA1 Message Date
Michael Peter Christen
33889b4501 moved more servlets to new location 2022-10-02 22:57:58 +02:00
Michael Peter Christen
6d388bb7bf refactoring - moved htroot/yacy classes 2022-10-02 22:26:53 +02:00
Michael Peter Christen
9c1bc533fa removed hazelcast because it is phoning home, see also:
https://github.com/yacy/yacy_search_server/issues/504
2022-09-28 17:30:37 +02:00
Thomas Koch
3116713672 rm buildDate from build.xml and its usages
The https://reproducible-builds.org project invests a lot of work
to make builds reproducible. This is a security property. It makes it
possible to compare binaries built on different builder machines.
If they are identical, it means that either the builds have not
been manipulated or an attacker managed to attack all builder
machines in exactly the same way.

One problem that the reproducible-builds project often sees is
that projects include the build time in their binaries. This
makes builds unreproducible for apparently no reason. The build
date should not be of interest since binaries built on different
dates but from the same source code should not be different.

Thus I decided to remove the build date instead of re-implementing
the functionality without the GitRev task. Anyway, the reported
date was not the build date but the date of the last git commit
which is even less informative. The git commit ID would have
information value but should only be relevant for "nightly builds".
2022-07-10 11:32:38 +00:00
Michael Peter Christen
3d138d3fdd catch error when initializing hazelcast
should fix https://github.com/yacy/yacy_search_server/issues/468
2022-06-20 17:27:56 +02:00
reger24
05d6d0405f Move sub-menu UI Translations from public Status to secure Sys Administration
- as UI Translation (TransNews_p.html) is a secured page
- it uses the internal News system for publishing but does not really belong to "Community Data"
2022-02-08 22:42:11 +01:00
reger24
6a5f0b3684 Servlet IndexBrowser_p: add button "Add to blacklist"
allows adding the displayed host to the default blacklist
2022-01-30 21:01:23 +01:00
Michael Peter Christen
bd3f2483a1 replaced url and date retrieval by only url retrieval
This should prevent the search index from being used to check the
freshness of the index entry.
2021-12-20 16:23:05 +01:00
Michael Peter Christen
63ad8ce6b2 removed ymarks
had not been used for a long time
2021-09-16 22:23:51 +02:00
Michael Peter Christen
dbd211a1ad removed/replaced reflection in memory tool 2021-04-22 20:24:13 +02:00
Michael Peter Christen
1cdb21592b added hazelcast and some modifications to align legacy YaCy with
YaCyGrid
2021-04-15 20:39:22 +02:00
Michael Peter Christen
198826c362 added network scanner process to discover all YaCy peers in the intranet
this will be used to wire YaCy peers in a kubernetes cluster
2021-01-23 15:14:49 +01:00
Michael Peter Christen
d9602e8325 Implemented a new syntax in the template engine to simplify json APIs
Added also an example for one of the existing APIs. The problem is the
comma separator between objects which must not be there for the last
entry in a sequence. The new syntax adds the separator symbol
automatically.
2021-01-18 00:01:08 +01:00
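The separator problem described in commit d9602e8325 can be illustrated outside the template engine. The following is only a minimal sketch in plain Java, not the syntax that commit introduced: the class and method names are hypothetical, and it simply shows that the comma belongs between entries and never after the last one.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of YaCy: joins already serialized JSON
// objects into a JSON array without a trailing comma.
final class JsonArrayJoin {

    static String toJsonArray(List<String> jsonObjects) {
        // StringJoiner places the delimiter only between elements,
        // so the last entry is never followed by a comma.
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (String object : jsonObjects) {
            joiner.add(object);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJsonArray(List.of("{\"a\":1}", "{\"b\":2}")));
    }
}
```

A hand-written loop that appends "," after every entry would produce a trailing comma, which strict JSON parsers reject.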
Michael Peter Christen
022fb15670 fix for https://github.com/yacy/yacy_search_server/issues/385 2021-01-06 22:12:17 +01:00
Michael Peter Christen
e0ad8ca9da replaced json library from JSON.org with libandroid-json-java
This fixes https://github.com/yacy/yacy_search_server/issues/347
2020-04-24 11:45:25 +02:00
Michael Peter Christen
0a7bda2a21 removed JSON-evil license line
These classes had been my own creative work.
The copyright line had merely appeared there, possibly through careless
copy-paste, without awareness that the line is a non-free addition.
2020-04-22 13:14:26 +02:00
Michael Peter Christen
37827b6788 removed doubles from getpageinfo 2020-01-16 21:09:42 +01:00
luccioman
2f75e2d9c8 Fixed a case of NullPointerException on disconnected RWI data structure 2018-12-17 14:12:21 +01:00
luccioman
88d0ed676c Render http status instead of null responses on snapshot api errors 2018-10-19 10:12:10 +02:00
luccioman
746e0e788d Render a relevant HTTP status code on snapshot image rendering error
Instead of a null response body which is not very helpful.
2018-10-14 10:30:30 +02:00
luccioman
79bd9f623a Updated YaCy home page embedded links from http to https scheme 2018-05-22 17:46:12 +02:00
luccioman
addd18c993 Removed some remaining uses of deprecated Seed.getIP() 2018-04-26 09:39:30 +02:00
luccioman
0a058ba6af Keep https in result message URL when push_p API is requested over https 2018-04-24 08:05:17 +02:00
luccioman
dbf4c1cd76 Improved blacklist entries editing operations:
- Fixes issue #160: properly handle syntax exceptions with a
user-friendly message
- Fixes loss of information when editing multiple blacklist entries
- Fixes loss of entries when moving entries from one list to another
2018-02-13 18:24:26 +01:00
luccioman
5db1c9155a Do locale independent case conversion on hosts, schemes, and file exts.
Required for proper operation when the default system locale is Turkish,
as dotless and dotted i characters have specific case conversion rules
in this language.
2017-12-19 13:52:05 +01:00
luccioman
1de86cf1bf Fixed JPEG snapshot resizing when running on OpenJDK.
Resizing JPEG snapshot images through /api/snapshot.jpg failed when
running on OpenJDK, but rendered successfully with an Oracle JDK.
Details in mantis 772 ( http://mantis.tokeek.de/view.php?id=772 ).

Removing any alpha component (useless in snapshot images) from the
rendered resized image solves the issue.
2017-10-19 09:27:52 +02:00
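The alpha removal described in commit 1de86cf1bf can be sketched with standard java.awt classes. This is a minimal illustration under assumptions, not the code from that commit: the class and method names are hypothetical, and painting transparent areas white is an arbitrary choice made for the example.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of YaCy: copies an image into an RGB-only
// buffer so that no alpha component reaches the JPEG encoder.
final class OpaqueSnapshot {

    static BufferedImage withoutAlpha(BufferedImage source) {
        BufferedImage opaque = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = opaque.createGraphics();
        try {
            // Fill the background first so that transparent source pixels
            // end up white instead of black (assumption for this sketch).
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, opaque.getWidth(), opaque.getHeight());
            g.drawImage(source, 0, 0, null);
        } finally {
            g.dispose();
        }
        return opaque;
    }
}
```

The returned image has no alpha channel and can then be passed to a JPEG writer.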
luccioman
a17a418e78 Fixed NullPointerException cases on snapshot images parsing. 2017-10-18 08:31:18 +02:00
luccioman
285f0d6a39 Consistently encode snapshot image with format requested on the API.
Previously, calling /api/snapshot.png rendered JPEG encoded images.
2017-10-18 07:53:07 +02:00
luccioman
4eba88f2ff Removed some unnecessary uses of java.lang.reflect api.
This improves code browsing and readability, making search by references
or call hierarchy IDE features more accurate.
2017-08-24 18:47:18 +02:00
luccioman
3f0446f14b Ensure proper synchronous robots entry retrieval on first check.
Previously, when checking the robots.txt policy for the first time on an
unknown host (not cached in the robots table), the result was always empty
in the /getpageinfo_p.xml api and in the /CrawlCheck_p.html page. Subsequent
calls, however, returned the correct information.
2017-08-16 09:30:33 +02:00
reger
a21789d4e7 Fix unresolved pattern in api/share.html by initializing some display vars 2017-07-08 22:46:15 +02:00
luccioman
bf55f1d6e5 Started support of partial parsing on large streamed resources.
This enables the getpageinfo_p API to return something in a reasonable
amount of time on resources in the megabyte size range and above.
Support is added first with the generic XML parser; for other formats,
regular crawler limits apply as usual.
2017-07-08 09:04:03 +02:00
luccioman
8da3174867 Ensure lower case conversion consistency with any default locale.
Especially for Turkish-speaking users using "tr" as their system default
locale: strings for technical stuff (URLs, tag names, constants...)
must not be lower cased with the default locale, as 'I' doesn't become
'i' like in other locales such as "en", but becomes 'ı'.
2017-06-27 06:42:33 +02:00
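The behaviour described in commit 8da3174867 is easy to reproduce with the standard library. The following is a minimal sketch, not the code from that commit: the class and method names are hypothetical, and Locale.ROOT is used here as one common locale-neutral choice.

```java
import java.util.Locale;

// Hypothetical helper, not part of YaCy: lower-cases technical strings
// without depending on the default system locale.
final class LocaleSafeLowerCase {

    static String lowerTechnical(String s) {
        // Locale.ROOT applies locale-neutral case mapping rules.
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // With Turkish rules 'I' maps to the dotless 'ı' (U+0131).
        System.out.println("INDEX".toLowerCase(turkish));
        // With locale-neutral rules 'I' maps to 'i'.
        System.out.println(lowerTechnical("INDEX"));
    }
}
```

Calling toLowerCase() without an argument uses the default locale, which is why the result depends on the machine the peer runs on.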
luccioman
0f80c978d6 Limit the number of initially previewed links in crawl start pages.
This prevents rendering a big and inconvenient scrollbar on resources
containing many links.
If really needed, preview of all links is still available with a "Show
all links" button.

Doesn't affect the number of links used once the crawl is effectively
started, as the list is then loaded again server-side.
2017-06-17 09:33:14 +02:00
luccioman
cbccf97361 Added JavaDoc to the getpageinfo_p API servlet. 2017-05-30 17:38:16 +02:00
luccioman
bd88fd303e Deprecated duplicated and internally unused getpageinfo servlet.
Redirections set for the transition of any possible external uses:
 - /api/getpageinfo.xml to /api/getpageinfo_p.xml
 - /api/getpageinfo.json to /api/getpageinfo_p.json
2017-05-30 09:29:28 +02:00
reger
a2afb4bae0 add switchboardconstants for server ports config keys 2017-03-18 20:02:26 +01:00
reger
334c70c37a correct fromDate init value on missing param in api/timeline_p servlet
revert test modification from last commit in AccessTracker.main
2017-02-20 00:14:14 +01:00
luccioman
e048e74072 Added an optional parameter to webstructure.xml api.
This new "documentStructure" parameter can be set to false to only get
the hosts' accumulated references on a resource and thus prevent scraping
the specified URL and getting citation references.

Also set WebStructureGraph constants as final and updated the Javadoc
with example api call URLs.
2017-01-19 12:30:44 +01:00
luccioman
17b7c92009 Made sure webstructure.xml API produces valid XML.
Host names should not contain XML special characters such as quotation
mark, but at this stage the WebGraph may have mistakenly recorded a host
name with such characters. What's more the DigestURL constructor does
not prevent this.
By using serverObjects.putXML to encode host names, we ensure here that
the rendered XML is well formed and can be parsed by external tools
even if a structure entry is incorrect.
2017-01-17 15:59:55 +01:00
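The escaping concern raised in commit 17b7c92009 can be sketched as follows. This is not the implementation of serverObjects.putXML, whose internals are not shown in this log; it is a hypothetical stand-alone helper that replaces the five XML special characters with their predefined entities.

```java
// Hypothetical helper, not part of YaCy: escapes XML special characters
// so that a wrongly recorded host name cannot break the rendered document.
final class XmlEscape {

    static String escapeXml(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A host name that was mistakenly recorded with a quotation mark.
        System.out.println(escapeXml("exa\"mple.org"));
    }
}
```

With such encoding applied to every host name, an incorrect entry still yields a document that external tools can parse.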
luccioman
ed3dd5e31a Fixed webstructure.xml API used with a domain name 'about' parameter.
As described in mantis 720 (http://mantis.tokeek.de/view.php?id=720),
when requesting this API with a domain name instead of a complete URL
only HTTP references on default port were listed.
2017-01-16 16:41:06 +01:00
luccioman
f793d97e56 Factored common code with DigestURL.hosthash() 2017-01-13 16:05:46 +01:00
luccioman
9cea7cbb10 Detailed some Javadoc related to /api/webstructure.xml usage. 2017-01-12 17:52:47 +01:00
reger
c50e23c495 reduce creation of empty legacy RequestHeader() in situations where null
is acceptable (less work for garbage collection).
2016-12-18 02:38:43 +01:00
reger
f45945cada increase use of header const for custom "EXT" header 2016-11-13 01:39:14 +01:00
luccioman
812abfc868 Converted one more set of URLs to pure relative ones.
Easier YaCy peer configuration behind a reverse proxy subfolder: no
need for the reverse proxy to rewrite HTML links or URLs in css files.

Tested on Debian Jessie with an apache2 reverse proxy.

See related mantis issues http://mantis.tokeek.de/view.php?id=106 and
http://mantis.tokeek.de/view.php?id=701
2016-11-12 15:54:35 +01:00
luccioman
74fec066f4 Converted more URLs to pure relative ones.
Easier YaCy peer configuration behind a reverse proxy subfolder: no
need for the reverse proxy to rewrite HTML links or URLs in css files.

Tested on Debian Jessie with an apache2 reverse proxy.

See related mantis issues http://mantis.tokeek.de/view.php?id=106 and
http://mantis.tokeek.de/view.php?id=701
2016-11-12 10:51:54 +01:00
luccioman
734340c128 Fixed errors for Search portal mode or when peer is not reachable.
Same case as reported on issue #87.
2016-11-04 14:31:22 +01:00
luccioman
6e1959f469 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
Conflicts:
	htroot/yacysearchitem.java
	source/net/yacy/cora/federate/solr/responsewriter/YJsonResponseWriter.java
	source/net/yacy/search/schema/CollectionConfiguration.java
	source/net/yacy/server/serverObjects.java
2016-10-14 11:29:55 +02:00
reger
7c81160f45 correct blacklist export as text url to blacklists_p.txt
was using servlet for network access and missing network.unit.name
fix for http://mantis.tokeek.de/view.php?id=694
+ prevent unresolved_pattern in yacy/list servlet
2016-10-07 03:03:41 +02:00