yacy_search_server

mirror of https://github.com/yacy/yacy_search_server.git synced 2024-09-19 00:01:41 +02:00

Author	SHA1	Message	Date
luccioman	2f75e2d9c8	Fixed a case of NullPointerException on disconnected RWI data structure	2018-12-17 14:12:21 +01:00
luccioman	88d0ed676c	Render http status instead of null responses on snapshot api errors	2018-10-19 10:12:10 +02:00
luccioman	746e0e788d	Render a relevant HTTP status code on snapshot image rendering error Instead of a null response body which is not very helpful.	2018-10-14 10:30:30 +02:00
luccioman	79bd9f623a	Updated YaCy home page embedded links from http to https scheme	2018-05-22 17:46:12 +02:00
luccioman	addd18c993	Removed some remaining uses of deprecated Seed.getIP()	2018-04-26 09:39:30 +02:00
luccioman	0a058ba6af	Keep https in result message URL when push_p API is requested over https	2018-04-24 08:05:17 +02:00
luccioman	dbf4c1cd76	Improved blacklist entries editing operations : - Fixes issue #160 : handle properly syntax exceptions with a user friendly message - Fixes loss of information on multiple blacklist entries editions - Fixes loss of entries when moving entries from one list to another	2018-02-13 18:24:26 +01:00
luccioman	5db1c9155a	Do locale independent case conversion on hosts, schemes, and file exts. Required for proper operation when the default system locale is Turkish, as dotless and dotted i characters have specific case conversion rules in this language.	2017-12-19 13:52:05 +01:00
luccioman	1de86cf1bf	Fixed JPEG snapshot resizing when running on OpenJDK. Resizing JPEG snapshot images through /api/snapshot.jpg failed when running on OpenJDK, but rendered successfully with an Oracle JDK. Details in mantis 772 ( http://mantis.tokeek.de/view.php?id=772 ). Removing any alpha component (useless in snapshot images) from the rendered resized image solves the issue.	2017-10-19 09:27:52 +02:00
luccioman	a17a418e78	Fixed NullPointerException cases on snapshot images parsing.	2017-10-18 08:31:18 +02:00
luccioman	285f0d6a39	Consistently encode snapshot image with format requested on the API. Previously, calling /api/snapshot.png rendered JPEG encoded images.	2017-10-18 07:53:07 +02:00
luccioman	4eba88f2ff	Removed some unnecessary uses of java.lang.reflect api. This improves code browsing and readability, making search by references or call hierarchy IDE features more accurate.	2017-08-24 18:47:18 +02:00
luccioman	3f0446f14b	Ensure proper synchronous robots entry retrieval on first check. Previously, when checking the robots.txt policy for the first time on an unknown host (not cached in the robots table), the result was always empty in the /getpageinfo_p.xml api and in the /CrawlCheck_p.html page. Subsequent calls, however, returned the correct information.	2017-08-16 09:30:33 +02:00
reger	a21789d4e7	Fix unresolved pattern in api/share.html by init some display var's	2017-07-08 22:46:15 +02:00
luccioman	bf55f1d6e5	Started support of partial parsing on large streamed resources. Thus enable getpageinfo_p API to return something in a reasonable amount of time on resources over MegaBytes size range. Support added first with the generic XML parser, for other formats regular crawler limits apply as usual.	2017-07-08 09:04:03 +02:00
luccioman	8da3174867	Ensure lower case conversion consistency with any default locale. Especially for Turkish speaking users using "tr" as their system default locale : strings for technical stuff (URLs, tag names, constants...) must not be lower cased with the default locale, as 'I' doesn't become 'i' like in other locales such as "en", but becomes 'ı'.	2017-06-27 06:42:33 +02:00
luccioman	0f80c978d6	Limit the number of initially previewed links in crawl start pages. This prevents rendering a big and inconvenient scrollbar on resources containing many links. If really needed, preview of all links is still available with a "Show all links" button. Doesn't affect the number of links used once the crawl is effectively started, as the list is then loaded again server-side.	2017-06-17 09:33:14 +02:00
luccioman	cbccf97361	Added JavaDoc to the getpageinfo_p API servlet.	2017-05-30 17:38:16 +02:00
luccioman	bd88fd303e	Deprecated duplicated and internally unused getpageinfo servlet. Redirections set for the transition of any eventual external uses: - /api/getpageinfo.xml to /api/getpageinfo_p.xml - /api/getpageinfo.json to /api/getpageinfo_p.json	2017-05-30 09:29:28 +02:00
reger	a2afb4bae0	add switchboardconstants for server ports config keys	2017-03-18 20:02:26 +01:00
reger	334c70c37a	correct fromDate init value on missing param in api/timeline_p servlet revert test modification from last commit in AccessTracker.main	2017-02-20 00:14:14 +01:00
luccioman	e048e74072	Added an optional parameter to webstructure.xml api. This new "documentStructure" parameter can be set to false to only get hosts accumulated references on a resource and thus prevent scraping the specified URL and getting citations references. Also set WebStructureGraph constants as final and updated the Javadoc with example api call URLs.	2017-01-19 12:30:44 +01:00
luccioman	17b7c92009	Made sure webstructure.xml API produces valid XML. Host names should not contain XML special characters such as quotation mark, but at this stage the WebGraph may have mistakenly recorded a host name with such characters. What's more the DigestURL constructor does not prevent this. By the way using serverObjects.putXML to encode host names we ensure here the rendered XML is well formed and can be parsed by external tools even if a structure entry is incorrect.	2017-01-17 15:59:55 +01:00
luccioman	ed3dd5e31a	Fixed webstructure.xml API used with a domain name 'about' parameter. As described in mantis 720 (http://mantis.tokeek.de/view.php?id=720), when requesting this API with a domain name instead of a complete URL only HTTP references on default port were listed.	2017-01-16 16:41:06 +01:00
luccioman	f793d97e56	Factored common code with DigestURL.hosthash()	2017-01-13 16:05:46 +01:00
luccioman	9cea7cbb10	Detailed some Javadoc related to /api/webstructure.xml usage.	2017-01-12 17:52:47 +01:00
reger	c50e23c495	reduce creation of empty legacy RequestHeader() in situation where null is acceptable (less for garbage collection).	2016-12-18 02:38:43 +01:00
reger	f45945cada	increase use of header const for custom "EXT" header	2016-11-13 01:39:14 +01:00
luccioman	812abfc868	Converted one more set of URLs to pure relative ones. Easier YaCy peer configuration behind a reverse proxy subfolder : no need for the reverse proxy to rewrite HTML links or URLs in css files. Tested on Debian Jessie with an apache2 reverse proxy. See related mantis issues http://mantis.tokeek.de/view.php?id=106 and http://mantis.tokeek.de/view.php?id=701	2016-11-12 15:54:35 +01:00
luccioman	74fec066f4	Converted more URLs to pure relative ones. Easier YaCy peer configuration behind a reverse proxy subfolder : no need for the reverse proxy to rewrite HTML links or URLs in css files. Tested on Debian Jessie with an apache2 reverse proxy. See related mantis issues http://mantis.tokeek.de/view.php?id=106 and http://mantis.tokeek.de/view.php?id=701	2016-11-12 10:51:54 +01:00
luccioman	734340c128	Fixed errors for Search portal mode or when peer is not reachable. Same case as reported on issue #87.	2016-11-04 14:31:22 +01:00
luccioman	6e1959f469	Merge branch 'master' of https://github.com/yacy/yacy_search_server.git Conflicts: htroot/yacysearchitem.java source/net/yacy/cora/federate/solr/responsewriter/YJsonResponseWriter.java source/net/yacy/search/schema/CollectionConfiguration.java source/net/yacy/server/serverObjects.java	2016-10-14 11:29:55 +02:00
reger	7c81160f45	correct blacklist export as text url to blacklists_p.txt was using servlet for network access and missing network.unit.name fix for http://mantis.tokeek.de/view.php?id=694 + prevent unresolved_pattern in yacy/list servlet	2016-10-07 03:03:41 +02:00
reger	91ab8a526a	add error msg to api/share.html and skip display of url on nothing uploaded	2016-08-17 03:07:26 +02:00
luccioman	6e96c7341a	Merge remote-tracking branch 'origin/master' Conflicts: htroot/Load_MediawikiWiki.java htroot/Load_PHPBB3.java htroot/ViewImage.java	2016-07-03 18:59:00 +02:00
reger	4e0892962a	fix NPE in citation servlet on empty text field	2016-05-14 03:51:13 +02:00
reger	d9adc2c255	load handler for Transparent Proxy on startup only if feature is activated to save the resources and keep handler chain small if the feature is not used. +add a warning message on settingsack_p page to restart on first activation	2016-03-25 05:26:48 +01:00
Michael Peter Christen	b89465d952	0N - basic dump upload servlet infrastructure, to share index dumps within an experimental new sharing model	2016-03-11 18:12:13 +01:00
Michael Peter Christen	f12a900f3e	harmonization of http post of files for one and several files - this had been handled differently - and wrong for several files. also: base64-encoding for gzipped push files because our data structures currently only support ASCII POST pushes.	2016-03-11 08:59:33 +01:00
luc	8682dfbd5e	Updated getpageinfo outputs to return page icons list.	2016-02-10 09:02:21 +01:00
luc	3cc5619d93	Improved HTML icons indexing and rendering in search results. See http://mantis.tokeek.de/view.php?id=629	2016-02-02 09:57:54 +01:00
luc	571bc55937	Refactoring : use StandardCharsets constants instead of hard-coded charset names.	2016-01-05 23:37:05 +01:00
luc	55a4d15775	Added a note on deprecated default search field and operator.	2015-12-14 23:55:12 +01:00
reger	52a9040ae6	Sort out double keywords (dc_subject) early in parsed documents - by direct using Set vs. List - remove not needed String[] getter	2015-11-13 01:48:28 +01:00
reger	a60b1fb6c2	differentiate api call getLocalPort() from getConfigInt()	2015-10-31 23:09:03 +01:00
sixcooler	87e4abe393	fight the fieldcache by using DocValues: in Solr-5.x the fieldcache has moved and was not cleared anymore. This results in a huge fieldcache. (http://lucene.apache.org/#highlights-of-the-lucene-release-include https://issues.apache.org/jira/browse/LUCENE-5666) Here I try to use DocValues where it is possible. For this I used the Api-Scheme as new basis for the Solr-Schema. This needs at least a complete optimization of the Solr-Index to get a smaller FieldCache. Everything that is indexed with these settings will not use the Fieldcache at all.	2015-08-31 20:24:41 +02:00
Michael Peter Christen	b43811d38c	added surrogate import process for exported solr dumps. Just throw your solr dump file into DATA/SURROGATES/in/ and it will be imported!	2015-05-30 13:19:59 +02:00
reger	3e742d1e34	Init remote crawler on demand. If remote crawl option is not activated, skip init of remoteCrawlJob to save the resources of queue and idling thread. Deploy of the remoteCrawlJob deferred on activation of the option.	2015-05-23 02:06:39 +02:00
reger	609c52e987	refactor getBookmark to consistently check existence by != null (w/o throwing exception on not found)	2015-05-11 00:37:04 +02:00
reger	8a5b8f8789	on bookmarking of search result, remember orig. query in separate bookmark property (instead of using the description field) - adjust display and autosearch - don't overwrite existing bookmark but combine info	2015-05-03 02:31:50 +02:00
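
Note on commits 5db1c9155a and 8da3174867 (locale independent case conversion): a minimal Java sketch of the behaviour those messages describe. The class and method names here (`LocaleSafeCase`, `lowerTechnical`) are hypothetical and are not taken from the YaCy sources; only the standard `String.toLowerCase(Locale)` call and `Locale.ROOT` are assumed.

```java
import java.util.Locale;

// Hypothetical helper for illustration only, not YaCy code.
final class LocaleSafeCase {

    /**
     * Lower-cases a technical string (host, scheme, file extension)
     * with Locale.ROOT, so the result does not depend on the JVM
     * default locale.
     */
    static String lowerTechnical(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // With Turkish rules, 'I' maps to dotless 'ı' (U+0131).
        String withTurkishRules = "FILE".toLowerCase(turkish);

        // With Locale.ROOT, 'I' maps to plain 'i'.
        String withRootRules = lowerTechnical("FILE");

        System.out.println(withTurkishRules.equals("f\u0131le")); // prints "true"
        System.out.println(withRootRules.equals("file"));         // prints "true"
    }
}
```

A comparison such as `"file".equals(ext.toLowerCase())` can therefore fail on a peer whose default locale is "tr", which is the case the two commits address.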

464 Commits