Michael Peter Christen
6f1ddb2519
Moved solr index-add method to the same method where the YaCy index is
written. Also done some code-cleanup.
2012-07-25 01:53:47 +02:00
Michael Peter Christen
315d83cfa0
cleanup
2012-07-24 22:16:56 +02:00
Michael Peter Christen
1f41d9c6f5
bugfix for a NPE
2012-07-24 17:29:32 +02:00
Michael Peter Christen
76202f068e
extended abstraction of local and remote solr index using one front-end
for index administration and querying.
2012-07-24 17:23:29 +02:00
Michael Peter Christen
d3f243e2e1
fixed node type calculation for principal peers
2012-07-23 23:40:50 +02:00
Michael Peter Christen
7ec7341f60
added user-authentication protection to solr search (same as implemented
for yacysearch)
2012-07-23 21:43:14 +02:00
Michael Peter Christen
e2a97ef8f6
better explain how to access the embedded solr
2012-07-23 21:31:12 +02:00
Michael Peter Christen
826967513b
changed options in IndexFederated_p to switch on/off parts of the index
individually. The settings are experimental and the values of the
settings will be overwritten when an index migration from urldb to solr
starts.
2012-07-23 16:28:39 +02:00
Michael Peter Christen
cba4ab862e
fix for http://bugs.yacy.net/view.php?id=202
2012-07-23 00:36:18 +02:00
Michael Peter Christen
b76836db7b
Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1
2012-07-23 00:35:14 +02:00
reger
36c9875b6e
removed localized number formatting from num-results_totalcount response (this is only used in xml and json where localized format is not valid)
2012-07-23 00:00:40 +02:00
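The reasoning behind the commit above can be illustrated with a minimal sketch. It is not the YaCy code: the class and method names are hypothetical, and it only shows why a locale-formatted count is unsuitable for a machine-readable XML or JSON field while a plain decimal string is safe.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical illustration, not taken from the YaCy sources.
public class TotalCountFormat {

    // Locale-dependent rendering: may contain grouping separators,
    // which is fine for a human-readable page but not for a numeric field.
    public static String localized(long count, Locale locale) {
        return NumberFormat.getIntegerInstance(locale).format(count);
    }

    // Locale-independent rendering: digits only, parseable by any client.
    public static String plain(long count) {
        return Long.toString(count);
    }

    public static void main(String[] args) {
        long total = 1234567L;
        System.out.println("localized: " + localized(total, Locale.GERMANY));
        System.out.println("plain:     " + plain(total));
    }
}
```

A client that parses the response as a number can read the plain form directly, whereas the localized form would first need locale-aware parsing.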
Michael Peter Christen
0640a6f7e6
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-22 21:50:44 +02:00
orbiter
69e743d9e3
- more abstraction for the RWI index as preparation for solr integration
- added options in search index to switch parts of the index on or off
2012-07-22 13:18:45 +02:00
orbiter
6cc5d1094e
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-21 13:34:57 +02:00
orbiter
05a3ffd03a
patches to ensure that solr connectors are active only if they have a
solr object assigned and vice versa
2012-07-20 11:47:50 +02:00
orbiter
5a3c829872
embedded solr is only initiated if it is activated with
IndexFederated_p.html
2012-07-20 11:40:33 +02:00
Michael Peter Christen
161005ceaa
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-20 09:04:14 +02:00
Michael Peter Christen
bf4968d748
source change in classpath
2012-07-20 09:04:02 +02:00
Lotus
3a350a2f83
partial html fix for
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4454
2012-07-20 08:53:12 +02:00
orbiter
49ee31f837
added classpath for htroot/solr
2012-07-20 00:59:58 +02:00
Michael Peter Christen
97b7bcf2a6
added a solr search index
- by default, an (empty) solr storage instance is created at
SEGMENTS/solr_36
- the index is written if in /IndexFederated_p.html the flag "embedded
solr search index" is switched on
- a standard solr query interface is available now with a new servlet at
http://127.0.0.1:8090/solr/select
To test this, do the following:
- switch to webportal mode
- switch on the feature as described
- do a crawl. this fills the solr index. The normal YaCy search will NOT
work now!
- do a solr query, like:
http://127.0.0.1:8090/solr/select?q=*:*
http://127.0.0.1:8090/solr/select?q=text_t:Help
play with different search fields as you can see in
/IndexFederated_p.html
You can use the standard solr query attributes as described in
http://wiki.apache.org/solr/SearchHandler
2012-07-19 11:34:05 +02:00
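The query URLs in the commit message above can also be built programmatically. The following is a minimal sketch, assuming a peer running at the address given in the message; the class and method names are hypothetical, and only the standard `q` parameter is used. It needs Java 10 or later for the `URLEncoder.encode(String, Charset)` overload.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of YaCy: builds a URL for the solr
// select servlet described in the commit message.
public class SolrSelectUrl {

    // Address taken from the commit message; adjust for a different peer.
    static final String BASE = "http://127.0.0.1:8090/solr/select";

    // Percent-encodes the query so that characters such as ':' survive
    // transport as a query-string value.
    public static String forQuery(String q) {
        return BASE + "?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(forQuery("*:*"));          // match-all query
        System.out.println(forQuery("text_t:Help"));  // field query from the message
    }
}
```

The resulting strings can be opened in a browser or passed to any HTTP client once the embedded solr index has been switched on and filled by a crawl.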
Michael Peter Christen
f0a079ac9f
allow larger log entries
2012-07-14 16:28:14 +02:00
Michael Peter Christen
9b48c9fe2e
removed a crawler overhead (terminated loop which searches greatest
stack that has zero-waiting urls). This should cause a slightly faster
crawl for crawl stacks with many different domains in the crawl queue.
2012-07-14 13:11:04 +02:00
Michael Peter Christen
784a4abb18
enhancement in internal data organization which should generate less
synchronizations in database access
2012-07-14 13:09:44 +02:00
Michael Peter Christen
f78ce93a80
collection of speed and memory saving hacks
2012-07-13 21:15:38 +02:00
orbiter
c00a3cf74d
less usage of generic logger to avoid logger generation overhead
2012-07-12 19:54:54 +02:00
orbiter
a196f24f60
prevent enqueueing of non-loggable logging entries
2012-07-12 19:42:42 +02:00
orbiter
482afed07c
reduced logging overhead (a bit)
2012-07-12 19:23:40 +02:00
orbiter
e76159040b
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-12 11:14:04 +02:00
orbiter
bbfa497a3c
replaced more size() > 0 by !isEmpty()
2012-07-12 11:12:21 +02:00
Michael Peter Christen
58e7d1952f
reduction of logging to prevent too much IO caused by logging
2012-07-12 02:08:11 +02:00
Michael Peter Christen
83da68c4c1
fixed a memory leak inside the logger which appeared if the log was
written faster than the logger was able to print it to its output
stream. A very large collection of unwritten log outputs had been seen
during strong crawling. The new ArrayBlockingQueue is limited to prevent
this case.
2012-07-12 01:23:04 +02:00
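The idea of a size-limited log queue, as described in the commit above, can be sketched as follows. This is an illustration only: the class name, the capacity, and the choice to drop entries when the queue is full are assumptions, since the commit message does not say whether the actual logger drops or blocks.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch, not the YaCy logger: a bounded buffer between
// log producers and the thread that writes to the output stream.
public class BoundedLogQueue {

    private final BlockingQueue<String> queue;

    public BoundedLogQueue(int capacity) {
        // ArrayBlockingQueue has a fixed capacity, so unwritten entries
        // cannot accumulate without limit.
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Non-blocking enqueue: returns false when the queue is full, in which
    // case this sketch drops the entry (an assumption, see above).
    public boolean enqueue(String entry) {
        return queue.offer(entry);
    }

    // Called by the writer side; returns null when nothing is pending.
    public String next() {
        return queue.poll();
    }

    public int pending() {
        return queue.size();
    }
}
```

With an unbounded queue, a fast producer and a slow writer let the backlog grow until memory runs out; with a fixed capacity the backlog is capped, at the cost of dropping or delaying entries under load.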
Michael Peter Christen
e3aa05b9dd
added creation of subpath pattern when crawl start is 'from file'
2012-07-11 23:18:57 +02:00
orbiter
0cbda0b2b8
- replaced all length() == 0 and size() == 0 with isEmpty()
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
2012-07-10 22:59:03 +02:00
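The kind of replacement described in the commit above looks roughly like the sketch below. The class and method names are hypothetical and the code is not taken from YaCy; it only shows that both forms give the same result for collections and strings, with `isEmpty()` stating the intent directly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the size()/length() to isEmpty() cleanup.
public class IsEmptyExample {

    // Before: emptiness expressed through a size comparison.
    public static boolean hasEntriesOld(List<String> list) {
        return list.size() > 0;
    }

    // After: the same check expressed with isEmpty().
    public static boolean hasEntriesNew(List<String> list) {
        return !list.isEmpty();
    }

    // The same pattern for strings: length() == 0 becomes isEmpty().
    public static boolean isBlankOld(String s) {
        return s.length() == 0;
    }

    public static boolean isBlankNew(String s) {
        return s.isEmpty();
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        System.out.println(hasEntriesOld(urls) == hasEntriesNew(urls));
        urls.add("http://example.org/");
        System.out.println(hasEntriesOld(urls) == hasEntriesNew(urls));
    }
}
```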
orbiter
28b30231c3
fix for url matcher of multiple amp& in an url, see:
http://forum.yacy-websuche.de/viewtopic.php?f=8&t=4439&p=26650#p26650
2012-07-10 17:39:56 +02:00
Roland 'Quix0r' Haeder
aef9dd0350
- removed cleaning of blacklist cache on startup
- added cleaning of blacklist cache if cache is modified in interface
- extended cache saving to all cache types
- moved cache location to DATA/LISTS
- fixed static file path which was relative to the application path but
should be relative to data path - which is different in debian and mac
implementations
2012-07-10 13:08:16 +02:00
orbiter
c7afa8bc48
using SwitchboardConstants for solr attributes
2012-07-10 12:01:20 +02:00
sixcooler
a99ef68422
bump to httpclient-4.2.1
2012-07-09 18:58:33 +02:00
orbiter
c6d8950651
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-09 14:33:11 +02:00
orbiter
5f3b8dc040
fix for RSS reader
2012-07-09 14:32:35 +02:00
orbiter
62202e2d71
refactoring of query attribute variable names for better consistency
with (next) stored query words
2012-07-09 11:14:50 +02:00
Michael Peter Christen
2160f9a819
Release 1.04
2012-07-09 00:13:59 +02:00
Michael Peter Christen
1addbc792c
use less memory for md5 cache
2012-07-08 22:05:04 +02:00
Michael Peter Christen
f32de94723
more logging
2012-07-08 22:04:36 +02:00
Michael Peter Christen
d09d9f2364
filter old peers from bootstrap (now stronger: 60 minutes instead of
240).
2012-07-08 21:25:22 +02:00
Michael Peter Christen
434ee90c59
added classification for control file types which shall not be loaded
but placed onto the noload-queue
2012-07-08 21:17:33 +02:00
Michael Peter Christen
1517a3b7b9
added webm mime-type
2012-07-08 17:59:20 +02:00
Michael Peter Christen
a90bcb48f6
added webm
2012-07-08 17:58:05 +02:00
Michael Peter Christen
801972fe6f
fix for url camel case parser and sentence reader
2012-07-08 16:48:09 +02:00
Michael Peter Christen
fbc1a2030d
fix for sitemap importer: can now also import very large sitemaps within
small memory configurations
2012-07-08 16:11:50 +02:00