2440 Commits

Author SHA1 Message Date
Michael Peter Christen
931541d198 re-inserted default value re-set button to performance queues and
patched missing values for recent new queues
2014-02-06 22:39:19 +01:00
Michael Peter Christen
456e52e0d5 enhanced strategy to clear solr caches
- redesigned the instance mirror class (which was a mess)
- added final method to close a searcher (which otherwise keeps a cache)
- changed cache clear method which iterates over resources and calls
clear to all caches in the searcher resources
2014-02-06 19:13:29 +01:00
reger
bd1685c94a fix not needed getFileExtension().toLower (double)
add missing .getFileExtension
2014-02-05 03:45:02 +01:00
orbiter
a11f072504 enhanced didyoumean 2014-02-04 00:18:11 +01:00
Michael Peter Christen
c0e6a65ec3 enhanced didyoumean 2014-02-03 18:49:03 +01:00
Michael Peter Christen
6d2dab7b21 fixed 'resource leak' warning 2014-02-03 13:38:26 +01:00
orbiter
22e3524797 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2014-02-03 12:45:35 +01:00
orbiter
c40ba51ca6 added new suggest method which replaces more-than-one suggestions:
instead of computing suggest permutations of the given words, the
completion of a phrase using the given words is searched in the fulltext
index.
2014-02-03 12:44:52 +01:00
reger
ad4b213145 remove unused static var from HTTPDProxyHandler 2014-02-02 03:47:12 +01:00
reger
b693ce9759 allow combining selection of different search nav's (facets)
- selecting more than one nav combines the 2 selections (with AND)
- unselecting one nav clears all selected

(e.g. select filetype:pdf and /language/fr shows ~ french pdf's only)
2014-01-30 22:57:27 +01:00
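
The facet behavior described in b693ce9759 could be pictured roughly as follows. This is only a sketch under assumptions: the class `FacetSelectionSketch`, its methods, and the field names are hypothetical and are not taken from the YaCy code base. It shows selected navigators being combined with AND, and unselecting any one of them clearing the whole selection.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the YaCy implementation.
final class FacetSelectionSketch {
    // Currently selected navigator filters, e.g. filetype -> pdf, language -> fr.
    private final Map<String, String> selected = new LinkedHashMap<>();

    // Selecting another navigator adds one more condition.
    void select(String field, String value) {
        selected.put(field, value);
    }

    // Per the commit message, unselecting one navigator clears all selections.
    void unselect(String field) {
        selected.clear();
    }

    // A document matches only if every selected filter matches (AND).
    boolean matches(Map<String, String> doc) {
        for (Map.Entry<String, String> e : selected.entrySet()) {
            if (!e.getValue().equals(doc.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }
}
```

With `filetype=pdf` and `language=fr` both selected, only documents carrying both values pass, which corresponds to the "french pdf's only" example in the message.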
reger
cb71413d19 fix page nav to keep modifier
(was new issue)
2014-01-30 22:00:32 +01:00
orbiter
416481c33e added a boost on appearance of combined words (in the same order the
user submitted that) when searching for more than one word
2014-01-30 10:51:08 +01:00
reger
c589ee8c6e URLproxy access check too tight
respect config ip pattern (was own ip)
2014-01-28 22:39:45 +01:00
Michael Peter Christen
ebfaf753b7 - faster initialization of index files
- removal of not used space if index files shrink (rare, but possible)
2014-01-28 12:39:58 +01:00
Michael Peter Christen
d2b8f2b477 enhancements for staticIP and ipv6 handling 2014-01-27 13:48:20 +01:00
reger
a71718a459 add config value for ssl/https port (default=8443)
adjust server routines to use config
2014-01-27 01:09:56 +01:00
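
A configurable HTTPS port with a default of 8443, as a71718a459 describes, might be read along these lines. This is a minimal sketch: the key name `port.ssl`, the class `SslPortConfigSketch`, and the fallback rules for invalid values are assumptions made for illustration, not confirmed details of the commit.

```java
import java.util.Properties;

// Hypothetical illustration only; key name and fallback rules are assumptions.
final class SslPortConfigSketch {
    static final int DEFAULT_SSL_PORT = 8443;      // default named in the commit message
    static final String SSL_PORT_KEY = "port.ssl"; // assumed config key

    // Returns the configured port, or the default if the value is missing or invalid.
    static int sslPort(Properties config) {
        String raw = config.getProperty(SSL_PORT_KEY);
        if (raw == null) {
            return DEFAULT_SSL_PORT;
        }
        try {
            int port = Integer.parseInt(raw.trim());
            return (port > 0 && port <= 65535) ? port : DEFAULT_SSL_PORT;
        } catch (NumberFormatException e) {
            return DEFAULT_SSL_PORT;
        }
    }
}
```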
reger
a3e2cca8e9 improve isOlder check to not overwrite node index with metadata on equal load date 2014-01-26 01:00:52 +01:00
reger
9b24dae2b7 add language navigation filter clause to rwi results 2014-01-25 22:59:23 +01:00
reger
f307d65dcf prepare for a language navigator
works fine to restrict language for local solrSearches.
More work needs to be done to make rwi/remote searches respect the modifier.language restriction.
2014-01-24 03:11:25 +01:00
reger
cf553e5045 added hint to web.xml and for completeness the full set of hardcoded mappings 2014-01-23 23:56:45 +01:00
Michael Peter Christen
c84bcc878a first try to add a generic solr servlet as luke request servlet 2014-01-23 19:01:31 +01:00
Michael Peter Christen
4cb7e2a2ca refactoring: renamed the SolrServlet to SolrSelectServlet for better
naming of more Solr Servlets
2014-01-23 17:20:49 +01:00
Michael Peter Christen
dc06e407ce added two virtual instances of solr for both cores: collection1 and
webgraph. These cores are now accessible at
/solr/collection1/select instead of /solr/select?core=collection1
and
/solr/webgraph/select instead of /solr/select?core=webgraph
in addition to the old behavior to support compatibility with the old
peers. These new paths are fully solr standard-conform and will allow
the cross-linking between YaCy peers using their public solr API.
2014-01-23 17:14:13 +01:00
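
The relationship between the old and new paths named in dc06e407ce can be sketched as a simple rewrite. The helper `SolrCorePathSketch.toCorePath` is hypothetical and is not part of YaCy or Solr. It also assumes that `core` is the first query parameter, which holds for the two examples in the message but not necessarily for every request.

```java
import java.util.Optional;

// Hypothetical illustration only; not YaCy or Solr code.
final class SolrCorePathSketch {
    private static final String LEGACY_PREFIX = "/solr/select?core=";

    // Maps "/solr/select?core=NAME[&rest]" to "/solr/NAME/select[?rest]".
    // Returns empty if the input does not have the assumed legacy form.
    static Optional<String> toCorePath(String legacyPathAndQuery) {
        if (!legacyPathAndQuery.startsWith(LEGACY_PREFIX)) {
            return Optional.empty();
        }
        String rest = legacyPathAndQuery.substring(LEGACY_PREFIX.length());
        int amp = rest.indexOf('&');
        String core = amp < 0 ? rest : rest.substring(0, amp);
        if (core.isEmpty()) {
            return Optional.empty();
        }
        String remainingQuery = amp < 0 ? "" : "?" + rest.substring(amp + 1);
        return Optional.of("/solr/" + core + "/select" + remainingQuery);
    }
}
```

For example, `/solr/select?core=webgraph` would map to `/solr/webgraph/select` under these assumptions.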
Michael Peter Christen
8b14e92ba4 added button in host browser to re-load 404/failed documents 2014-01-23 15:56:36 +01:00
orbiter
771d8261c1 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2014-01-22 21:53:27 +01:00
orbiter
c351e47a84 fix for bad-formatted lonlat 2014-01-22 21:33:11 +01:00
reger
4c603b216e optimize parse ServerSideInclude 2014-01-22 21:23:32 +01:00
orbiter
5ec0c969c9 fix for http://bugs.yacy.net/view.php?id=354 2014-01-22 20:59:53 +01:00
orbiter
0002abd583 fix for OOM during remote search and too high load protection 2014-01-22 20:54:03 +01:00
sixcooler
5a917e13c6 use less ram on dht-URL transfer by not using a URIMetadataNode[] 2014-01-22 17:52:07 +01:00
Michael Peter Christen
c87cdfca2e do not set a load prerequisite that prevents the start of one-time-jobs 2014-01-22 17:18:53 +01:00
sixcooler
4d77ca52c9 workaround to let dht-out run on small systems like a Pi 2014-01-22 01:26:44 +01:00
Michael Peter Christen
6ada0daae9 making latency_factor and maximum number of same hosts in loader queue
settings available in Crawler_p.html servlet for steering.
2014-01-21 19:28:00 +01:00
Michael Peter Christen
489c3fbc90 code simplifications / removed warnings 2014-01-21 17:53:39 +01:00
Michael Peter Christen
0168f80c28 new crawling factors can now be changed during runtime 2014-01-21 17:52:16 +01:00
Michael Peter Christen
be5e808236 - removed hardcoded load-test which is now handled in BusyQueues
steering, see /PerformanceQueues_p.html
- changed default values for crawler queue load limit (high, because
these jobs are started upon user request)
2014-01-21 17:48:45 +01:00
sixcooler
40a4030b55 configurable max-load values for YaCy-Threads:
try lower values on small systems like a Pi
2014-01-21 17:04:22 +01:00
sixcooler
6d8c023a5e lower client-connection for single-cpu-systems 2014-01-21 16:56:44 +01:00
Michael Peter Christen
77531850b5 reverted crawling strategy from latest commit. 2014-01-21 16:05:55 +01:00
Michael Peter Christen
c0da966dfa enhanced crawler speed 2014-01-20 21:46:40 +01:00
Michael Peter Christen
79809342fa added synchronization to exists() call because the concurrent call to
that method showed in thread dump close to deadlock situations. It's also
better to synchronize IO operations because they become faster then.
2014-01-20 21:09:03 +01:00
Michael Peter Christen
9a6912f2e6 if a http client thread is still running but we do not wait for it any
more, call an interrupt
2014-01-20 18:39:36 +01:00
Michael Peter Christen
0d235a565b cleanup crawl loader jobs 2014-01-20 18:36:00 +01:00
Michael Peter Christen
1ea17bd9f3 - removed old metadata database and all migration code
- refactored all code which uses URIMetadataRow as standard for word
hash length and word hash ordering and moved that to the class 'Word',
because the class URIMetadataRow defined the old metadata data structure
and should be superfluous in the future
- removed unused methods from URIMetadataRow as preparation for further
removal of that class
2014-01-20 18:31:46 +01:00
reger
d3de309953 fix IOexception logging issue in DefaultServlet
reason not sure but .logException triggers another exception
2014-01-20 08:12:35 +01:00
reger
97e84439fb adjusted ConfigHeuristic and changed QueryGoal.getOriginalQueryString to .getQueryString
- since the specific heuristics Twitter & Blekko are no longer available or redundant with OpenSearchHeuristic,
adjusted ConfigHeuristic to use OpensearchHeuristic settings only.
For this the default OSD search target list is made available (copied) by default and the other configs are removed.

- the return of QueryGoal.getOriginalQueryString includes the queryModifier, which are held separately in a modifier object,
but in most (all) cases just the query term is expected, clarified and renamed it to QueryGoal.getQueryString which returns
just the search term (if needed a .getOriginalQueryString could be implemented in Queryparameters, adding the modifiers)

- started to adjust internal html href references from absolute to relative (currently it is mixed).
For future development we should prefer relative href targets (less trouble with context aware servlets)
2014-01-20 00:58:17 +01:00
Michael Peter Christen
022c6d3ce1 do YaCy p2p connections using a timeout-request which covers the http
request into a separate thread and ignores the further result of a
request if that does not answer within the requested time-out. This is a
try to solve a problem with the peer-ping, which hangs whenever a peer
appears to be dead or blocked.
2014-01-19 15:21:23 +01:00
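
The idea in 022c6d3ce1, running a request in a separate thread and ignoring its result once the time-out has passed, can be approximated with the standard `java.util.concurrent` classes. The sketch below is an assumption about the general technique, not the code from the commit. The class `TimeoutRequestSketch` and its method are hypothetical, and creating one executor per call is chosen for brevity rather than efficiency.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; not the YaCy implementation.
final class TimeoutRequestSketch {
    // Runs the request in its own daemon thread and waits at most timeoutMillis.
    // Returns empty if the request times out, fails, or the caller is interrupted.
    static <T> Optional<T> callWithTimeout(Callable<T> request, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "timeout-request");
            t.setDaemon(true); // a hanging request must not block JVM shutdown
            return t;
        });
        Future<T> future = worker.submit(request);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker; any later result is ignored
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            return Optional.empty(); // the request itself threw an exception
        } finally {
            worker.shutdownNow();
        }
    }
}
```

A peer ping wrapped this way returns to the caller after the time-out even if the remote peer never answers. Whether the underlying socket is actually released depends on the HTTP client reacting to the interrupt.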
Michael Peter Christen
42f3733a05 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2014-01-19 14:47:24 +01:00
Michael Peter Christen
25a6c05008 experimental removal of synchronization. This should work for all cases
where the size() and isEmpty() method is used only for statistics, which
happens at many locations in YaCy. If these methods are used for
structural reasons (like accessing the last element in an array) then it
may fail or cause other problems. As far as visible, this is not the
case.
2014-01-19 14:47:11 +01:00
Michael Peter Christen
5695280edd removed superfluous synchronization 2014-01-19 14:44:58 +01:00