yacy_search_server

mirror of https://github.com/yacy/yacy_search_server.git synced 2024-09-21 00:00:13 +02:00

Author	SHA1	Message	Date
orbiter	de95e5e524	reduced search activity corona strength in network image	2014-04-04 10:08:44 +02:00
reger	227c42bc96	eliminate obsolete URIMetaDataRow class by joining it with/into URIMetaDataNode.	2014-04-03 00:35:15 +02:00
Michael Peter Christen	5b83887da8	npe fix	2014-04-02 02:34:55 +02:00
reger	2953ebe701	fix: port in local target address & button style	2014-03-29 00:34:01 +01:00
Michael Peter Christen	8b44fcf0f4	added missing @Override annotation	2014-03-28 13:48:37 +01:00
reger	a373fb717d	remove more unused from legacy server.http - triggerOnlineAction not used - useTemplateCache not used	2014-03-14 03:12:04 +01:00
reger	dd5bf0b71b	cleanup old reference to HTTPDemon.setAlternativeResolver optimize .yacyh check in AbstractRemoteHandler	2014-03-06 03:08:04 +01:00
orbiter	d68e5ad0c4	NPE fix for Thread name (just committed yesterday, sorry)	2014-03-02 11:20:48 +01:00
Michael Peter Christen	6ed9c0164e	attaching names to all Threads to get a better view in profiling tools like VisualVM	2014-02-28 15:02:01 +01:00
Michael Peter Christen	7640834b37	removed double concurrency to put Solr documents into the index. The writings to the solr index are also buffered in ConcurrentUpdateSolrConnector	2014-02-26 22:21:00 +01:00
Michael Peter Christen	1b5e3d523a	better control over close-state of remote solr connections	2014-02-20 00:39:19 +01:00
Michael Peter Christen	69391e5d9e	changed strategy to test existence of documents in Solr: using the update time. The reason for that is a better caching for the crawler double-check, which needs the update time for crawler steering.	2014-02-19 04:03:45 +01:00
Michael Peter Christen	0dda979801	adapted network image drawing to increased number of peers	2014-02-11 00:53:10 +01:00
Michael Peter Christen	d9858e1b8a	removed warnings and superfluous logging	2014-02-09 12:26:58 +01:00
Michael Peter Christen	d2b8f2b477	enhancements for staticIP and ipv6 handling	2014-01-27 13:48:20 +01:00
orbiter	0002abd583	fix for OOM during remote search and too high load protection	2014-01-22 20:54:03 +01:00
sixcooler	5a917e13c6	use less ram on dht-URL transfer by not using a URIMetadataNode[]	2014-01-22 17:52:07 +01:00
sixcooler	4d77ca52c9	workaround to let dht-out run on small systems like a Pi	2014-01-22 01:26:44 +01:00
Michael Peter Christen	be5e808236	- removed hardcoded load-test which is now handled in BusyQueues steering, see /PerformanceQueues_p.html - changed default values for crawler queue load limit (high, because these jobs are started upon user request)	2014-01-21 17:48:45 +01:00
Michael Peter Christen	1ea17bd9f3	- removed old metadata database and all migration code - refactored all code which uses URIMetadataRow as standard for word hash length and word hash ordering and moved that to the class 'Word', because the class URIMetadataRow defined the old metadata data structure and should be superfluous in the future - removed unused methods from URIMetadataRow as preparation for further removal of that class	2014-01-20 18:31:46 +01:00
reger	97e84439fb	adjusted ConfigHeuristic and changed QueryGoal.getOriginalQueryString to .getQueryString - since the specific Twitter & Blekko heuristics are no longer available or are redundant with OpenSearchHeuristic, adjusted ConfigHeuristic to use OpensearchHeuristic settings only. For this the default OSD search target list is made available (copied) by default and the other configs are removed. - the return of QueryGoal.getOriginalQueryString includes the queryModifier, which are held separately in a modifier object, but in most (all) cases just the query term is expected, so clarified and renamed it to QueryGoal.getQueryString which returns just the search term (if needed a .getOriginalQueryString could be implemented in Queryparameters, adding the modifiers) - started to adjust internal html href references from absolute to relative (currently it is mixed). For future development we should prefer relative href targets (less trouble with context aware servlets)	2014-01-20 00:58:17 +01:00
Michael Peter Christen	022c6d3ce1	do YaCy p2p connections using a timeout-request which moves the http request into a separate thread and ignores the further result of a request if that does not answer within the requested time-out. This is a try to solve a problem with the peer-ping, which hangs whenever a peer appears to be dead or blocked.	2014-01-19 15:21:23 +01:00
orbiter	fd4abc0565	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	2014-01-19 01:50:55 +01:00
orbiter	d5b8e473c8	added load limit for DHT transfer: RWI acceptance only if local load is not too high	2014-01-19 01:50:42 +01:00
reger	2614fa7aeb	Skip remote Solr search if last try showed error As the solr servlet may not be available (e.g. no public search page, old version, individual access setting) a /solr/select error is remembered in the seed.dna of the remote peer. This is not permanent, as flag is not stored and the seed is reloaded on several occasions, it is just a memory of the recent past status. Might also be set to "not available" on time-out of last try.	2014-01-18 18:48:52 +01:00
orbiter	a07e9b3582	concurrency-solid version of transmission limitation	2014-01-18 12:55:05 +01:00
orbiter	60ead31273	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	2014-01-18 10:50:36 +01:00
orbiter	52bf7d1ac8	reduce load during dht transfer	2014-01-18 10:50:24 +01:00
Michael Peter Christen	0bf3cab8c7	- better 'extra'-peer selection - logging of health status for 'extra'-peer selection - concurrency for remote peer IO and interrupting the threads if time-out occurs	2014-01-17 14:54:19 +01:00
Michael Peter Christen	ba44eb1160	when scaling the number of remote peers, also consider the machine load and the number of cores	2014-01-16 17:34:26 +01:00
Michael Peter Christen	f8ce7040ab	remote search peer selection schema change: - all non-dht targets (previously separated into 'robinson' for dht-like queries and 'node' for solr queries) are now 'extra' peers, which are queried using solr - these extra-peers are now selected using a ranking on last-seen, peer-tag-matches, node-peer flags, peer age, and link count. The ranking is done using a weight and a random factor. - the number of extra peers is 50% of the dht peers - the dht peers now exclude too young peers to prevent bad results during strong growth of the network - the number of dht peers (and therefore extra-peers) is reduced when the memory of the peer is low and/or some documents still appear in the indexing-queue. This shall prevent a peer from deadlocks when p2p queries are made in a fast sequence on weak hardware.	2014-01-16 17:27:14 +01:00
Michael Peter Christen	47a82e471c	less blocking in SeedDB which caused deadlocks in peer ping	2014-01-16 13:10:20 +01:00
reger	6932aa4d7a	use configured admin-username for api calls - the admin user name can be configured, in apiExec calls the default "admin" username is used. TODO: the bin/apicall.sh script should likely take that into account.	2014-01-07 21:26:50 +01:00
orbiter	3cb6c7861f	fixed shutdown authentication problem	2014-01-06 01:48:54 +01:00
Michael Peter Christen	1c56befb93	fixed mess with test on localhost (which means local hosts for some cases)	2014-01-05 04:55:30 +01:00
reger	dd8ea0cdd6	fix "add to blacklist" button style in IndexControlRWIs_p - added default filename filter to select field (as only addition to *.black list is permanent) - modified Blacklist_p header/legend to show all active blacklists (to support understanding that all configured lists are active) - removed obsolete code in Blacklist_p servlet	2013-12-30 20:03:59 +01:00
Michael Peter Christen	09412ea3a4	counting search requests in solr interface	2013-12-12 03:37:19 +01:00
Michael Peter Christen	79771c60c0	IPv6 fixes	2013-12-06 14:30:08 +01:00
Michael Peter Christen	9a27bf6e82	removed filter computation in Protocol class for remote searches because that is already done in the QueryParams class	2013-12-04 13:09:15 +01:00
Michael Peter Christen	f1b5db2c45	- performance graph does not show peer ping in memory monitor any more - after a forced GC, the PerformanceMemory view switches to automatic update by default	2013-12-04 12:59:30 +01:00
Michael Peter Christen	2c39b65409	fixes for searches containing stopwords. The fix was done using a reconstruction of the search word set access method to prevent words from being deleted from the sets from outside the QueryGoal class.	2013-11-26 02:24:47 +01:00
orbiter	037cd0a57c	using the BinaryResponseWriter which is supported within the YaCy solr servlet since YaCy 1.63. This is much more performant for the client than using the XMLResponseWriter because parsing of XML data is very CPU intensive. Older YaCy peers are still requested using the XMLResponseWriter but the majority of YaCy peers already respond with the binary writer. This makes remote searches much faster and less CPU intensive.	2013-11-25 21:31:40 +01:00
Michael Peter Christen	ccf2f4e43b	refactoring of seed attributes (introduced more constants)	2013-11-22 14:15:31 +01:00
orbiter	b7f1e5af51	added new servlet which generates the same file as the principal peers upload to a bootstrap position you can call it either with http://localhost:8090/yacy/seedlist.html or to generate json (or jsonp) with http://localhost:8090/yacy/seedlist.json http://localhost:8090/yacy/seedlist.json?callback=seedlist	2013-11-19 15:56:10 +01:00
Michael Peter Christen	e1c1e57877	less overhead calling exist() with only one hash	2013-11-04 09:37:31 +01:00
Michael Peter Christen	9bb7eab389	hacks to prevent storage of data longer than necessary during search and some speed enhancements. This should reduce the memory usage during heavy-load search a bit.	2013-10-25 15:05:30 +02:00
orbiter	d2effd21db	fix for npe during location search	2013-09-21 21:03:58 +02:00
Michael Peter Christen	61c5e40687	- replaced the properties object in AnchorURL with distinct variables for anchor attributes. - as a result, large portions of the parser code had to be adapted as well - added a counter target_order_i for anchor links in webgraph computation	2013-09-15 23:27:04 +02:00
Michael Peter Christen	5e31bad711	- the webgraph shall store all links which appear on a web page and not all unique links! This made it necessary to adapt a large portion of the parser and link processing classes to carry a different type of link collection, which carries a property attribute attached to web anchors. - introduction of a new URL class, AnchorURL - the other url classes, DigestURI and MultiProtocolURI had been renamed and refactored to fit into a new document package schema, document.id - cleanup of net.yacy.cora.document package and refactoring	2013-09-15 00:30:23 +02:00
Michael Peter Christen	049c3b3f2e	added an option to exclude image search results from text search. This is on by default.	2013-09-03 11:14:23 +02:00

272 Commits