Michael Peter Christen
97b7bcf2a6
added a solr search index
- by default, an (empty) solr storage instance is created at
SEGMENTS/solr_36
- the index is written if in /IndexFederated_p.html the flag "embedded
solr search index" is switched on
- a standard solr query interface is available now with a new servlet at
http://127.0.0.1:8090/solr/select
To test this, do the following:
- switch to webportal mode
- switch on the feature as described
- do a crawl. This fills the solr index. The normal YaCy search will NOT
work now!
- do a solr query, like:
http://127.0.0.1:8090/solr/select?q=*:*
http://127.0.0.1:8090/solr/select?q=text_t:Help
play with different search fields as you can see in
/IndexFederated_p.html
You can use the standard solr query attributes as described in
http://wiki.apache.org/solr/SearchHandler
2012-07-19 11:34:05 +02:00
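The two example queries from the commit above can also be built programmatically. The following is a minimal sketch, not code from the commit: the class and method names are hypothetical, the base address is the one quoted in the message, and `rows` is used as one example of the standard Solr query attributes. The query string is percent-encoded, so `:` appears as `%3A` in the resulting URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only; not part of YaCy.
class SolrSelectUrl {

    // Address of the new servlet as quoted in the commit message.
    static final String SELECT = "http://127.0.0.1:8090/solr/select";

    // Builds a select URL for a query such as "*:*" or "text_t:Help".
    // "rows" is a standard Solr parameter that limits the number of results.
    static String selectUrl(String query, int rows) {
        try {
            return SELECT + "?q=" + URLEncoder.encode(query, "UTF-8") + "&rows=" + rows;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectUrl("*:*", 10));
        System.out.println(selectUrl("text_t:Help", 10));
    }
}
```

The printed URLs can be opened in a browser or fetched with any HTTP client once the embedded index has been filled by a crawl.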
Michael Peter Christen
f0a079ac9f
allow larger log entries
2012-07-14 16:28:14 +02:00
Michael Peter Christen
9b48c9fe2e
removed a crawler overhead (terminated loop which searches greatest
stack that has zero-waiting urls). This should cause a slightly faster
crawl for crawl stacks with many different domains in the crawl queue.
2012-07-14 13:11:04 +02:00
Michael Peter Christen
784a4abb18
enhancement in internal data organization which should generate less
synchronizations in database access
2012-07-14 13:09:44 +02:00
Michael Peter Christen
f78ce93a80
collection of speed and memory saving hacks
2012-07-13 21:15:38 +02:00
orbiter
c00a3cf74d
less usage of generic logger to avoid logger generation overhead
2012-07-12 19:54:54 +02:00
orbiter
a196f24f60
prevent enqueueing of non-loggable logging entries
2012-07-12 19:42:42 +02:00
orbiter
482afed07c
reduced logging overhead (a bit)
2012-07-12 19:23:40 +02:00
orbiter
e76159040b
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-12 11:14:04 +02:00
orbiter
bbfa497a3c
replaced more size() > 0 by !isEmpty()
2012-07-12 11:12:21 +02:00
Michael Peter Christen
58e7d1952f
reduction of logging to prevent too much IO caused by logging
2012-07-12 02:08:11 +02:00
Michael Peter Christen
83da68c4c1
fixed a memory leak inside the logger which appeared if the log was
written faster than the logger is able to print it to its output
stream. A very large collection of unwritten log outputs had been seen
during heavy crawling. The new ArrayBlockingQueue is limited to prevent
this case.
2012-07-12 01:23:04 +02:00
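The idea behind the logger fix above, a queue with a fixed capacity between log producers and the writer, can be sketched as follows. This is an assumption-laden illustration and not the YaCy logger: the class name, the capacity and the decision to drop entries when the queue is full are all hypothetical (the commit does not say whether producers block or entries are discarded).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of a bounded log queue; not the actual YaCy class.
class BoundedLogQueue {

    private final BlockingQueue<String> queue;

    BoundedLogQueue(int capacity) {
        // ArrayBlockingQueue never grows beyond the given capacity
        this.queue = new ArrayBlockingQueue<String>(capacity);
    }

    // Non-blocking enqueue: returns false (entry dropped) when the queue is full.
    // Dropping is an assumption made here; put() would block the producer instead.
    boolean enqueue(String line) {
        return queue.offer(line);
    }

    // Called by the writer thread; blocks until an entry is available.
    String next() throws InterruptedException {
        return queue.take();
    }

    // Number of entries waiting to be written.
    int pending() {
        return queue.size();
    }
}
```

With an unbounded queue, a writer that is slower than the producers lets the backlog grow without limit; with the bound, memory use is capped at `capacity` entries.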
Michael Peter Christen
e3aa05b9dd
added creation of subpath pattern when crawl start is 'from file'
2012-07-11 23:18:57 +02:00
orbiter
0cbda0b2b8
- replaced all length() == 0 and size() == 0 with isEmpty()
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
2012-07-10 22:59:03 +02:00
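The kind of replacement described in the commit above looks roughly like the sketch below. The helper names are hypothetical and the null checks are an addition for the example. The commit does not state its motivation; one general reason to prefer `isEmpty()` is that `size()` is not a constant-time operation for every collection (for example `ConcurrentLinkedQueue`).

```java
import java.util.Collection;

// Hypothetical helpers illustrating the pattern; not taken from the YaCy source.
class EmptyChecks {

    // before: s != null && s.length() > 0
    static boolean hasText(String s) {
        return s != null && !s.isEmpty();
    }

    // before: c != null && c.size() > 0
    static boolean hasElements(Collection<?> c) {
        return c != null && !c.isEmpty();
    }
}
```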
orbiter
28b30231c3
fix for url matcher of multiple amp& in an url, see:
http://forum.yacy-websuche.de/viewtopic.php?f=8&t=4439&p=26650#p26650
2012-07-10 17:39:56 +02:00
Roland 'Quix0r' Haeder
aef9dd0350
- removed cleaning of blacklist cache on startup
- added cleaning of blacklist cache if cache is modified in interface
- extended cache saving to all cache types
- moved cache location to DATA/LISTS
- fixed static file path which was relative to the application path but
should be relative to data path - which is different in debian and mac
implementations
2012-07-10 13:08:16 +02:00
orbiter
c7afa8bc48
using SwitchboardConstants for solr attributes
2012-07-10 12:01:20 +02:00
sixcooler
a99ef68422
bump to httpclient-4.2.1
2012-07-09 18:58:33 +02:00
orbiter
c6d8950651
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-09 14:33:11 +02:00
orbiter
5f3b8dc040
fix for RSS reader
2012-07-09 14:32:35 +02:00
orbiter
62202e2d71
refactoring of query attribute variable names for better consistency
with (next) stored query words
2012-07-09 11:14:50 +02:00
Michael Peter Christen
2160f9a819
Release 1.04
2012-07-09 00:13:59 +02:00
Michael Peter Christen
1addbc792c
use less memory for md5 cache
2012-07-08 22:05:04 +02:00
Michael Peter Christen
f32de94723
more logging
2012-07-08 22:04:36 +02:00
Michael Peter Christen
d09d9f2364
filter old peers from bootstrap (now stronger: 60 minutes instead of
240).
2012-07-08 21:25:22 +02:00
Michael Peter Christen
434ee90c59
added classification for control file types which shall not be loaded
but placed onto the noload-queue
2012-07-08 21:17:33 +02:00
Michael Peter Christen
1517a3b7b9
added webm mime-type
2012-07-08 17:59:20 +02:00
Michael Peter Christen
a90bcb48f6
added webm
2012-07-08 17:58:05 +02:00
Michael Peter Christen
801972fe6f
fix for url camel case parser and sentence reader
2012-07-08 16:48:09 +02:00
Michael Peter Christen
fbc1a2030d
fix for sitemap importer: can now also import very large sitemaps within
small memory configurations
2012-07-08 16:11:50 +02:00
Michael Peter Christen
92731e5287
fix for sevenzip parser
2012-07-08 16:11:19 +02:00
Michael Peter Christen
45641b0c23
catch and log a warning in RasterPlotter
2012-07-06 09:21:12 +02:00
Michael Peter Christen
8efc1c1078
- fixed a memory leak (or bad usage) during parsing/snippet fetch
- more logging for errors
2012-07-06 09:05:41 +02:00
Michael Peter Christen
c3db015410
prevent loading of content from the cache when retrieval with IFFRESH is
used and cache is stale. Should speed up snippet generation when cache
strategy is IFFRESH.
2012-07-06 08:29:41 +02:00
Michael Peter Christen
91f14ea38e
fix to solr configuration (case where the external solr was not online)
2012-07-06 01:29:13 +02:00
sixcooler
2c5b68d932
more abstraction of error message
2012-07-05 14:50:37 +02:00
Michael Peter Christen
9758c521ab
abstraction of error message
2012-07-05 14:27:28 +02:00
Michael Peter Christen
ef0d09f103
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-05 14:24:19 +02:00
Michael Peter Christen
b1e7c11fba
fix for pattern matcher in html parser
2012-07-05 14:24:03 +02:00
Michael Peter Christen
8a6edc0031
fix for solr shutdown
2012-07-05 14:23:43 +02:00
Michael Peter Christen
b8bcc06283
fix for urls beginning with "//"
2012-07-05 14:23:29 +02:00
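The commit above does not say what was wrong with URLs beginning with "//". Assuming it concerns scheme-relative references found in crawled pages, the expected resolution can be sketched with the standard `java.net.URI` class: the scheme comes from the base URL, the host and path from the reference. The class name and the example hosts are hypothetical, and this is not the YaCy implementation.

```java
import java.net.URI;

// Hypothetical illustration of resolving a "//host/path" reference; not YaCy code.
class SchemeRelative {

    // Resolves a link found in a page against the URL of that page.
    static String resolve(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }

    public static void main(String[] args) {
        // the reference keeps its own host and path and inherits "http" from the base
        System.out.println(resolve("http://example.org/a/b.html", "//cdn.example.org/x.js"));
    }
}
```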
sixcooler
9b6e4e46ca
fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4430
2012-07-05 14:06:00 +02:00
Michael Peter Christen
b0c408788b
made class methods static where possible
2012-07-05 12:38:41 +02:00
Michael Peter Christen
5bd3c90907
- removed unnecessary semicolons
- added default case for switch
2012-07-05 11:18:31 +02:00
Michael Peter Christen
132afaf687
removed inaccessible code
2012-07-05 11:09:44 +02:00
Michael Peter Christen
7c1ba99755
removed more unused method parameters
2012-07-05 10:44:30 +02:00
Michael Peter Christen
83701a1b4c
removed unused ImageReference package
2012-07-05 10:24:52 +02:00
Michael Peter Christen
0301aba1e9
removed unused method parameters
2012-07-05 10:23:07 +02:00
Michael Peter Christen
241dd8410a
removed snippet pattern filter - it was not used
2012-07-05 09:21:27 +02:00
Michael Peter Christen
d3964253ae
- added @SuppressWarnings to unused servlet method parameters
- removed unnecessary casts
- removed unnecessary throw statements
2012-07-05 09:14:04 +02:00