Michael Peter Christen
f9fc5cfaba
better check for bad urls in url transmission
2012-08-17 17:17:00 +02:00
Michael Peter Christen
40c0856489
refactoring
2012-08-17 15:33:02 +02:00
Michael Peter Christen
9bece5ac5f
enhanced snippet fetch - removed a bug that caused documents to be
parsed even if a solr text was available
2012-08-17 14:22:07 +02:00
Michael Peter Christen
395b78a0d8
using the solr search index to concurrently search within solr and the
rwis during local search requests.
2012-08-17 01:21:56 +02:00
Michael Peter Christen
e5ef840f40
- renamed DoubleSolrConnector to MirrorSolrConnector and added a
hit/miss/document cache to the MirrorSolrConnector.
- more abstraction to SolrDocument in Connector interface
- bugfixes in Solr field reader
2012-08-13 13:32:32 +02:00
Michael Peter Christen
94a334f128
another fix to the Solr metadata reading process and to the shutdown
process
2012-08-13 11:13:53 +02:00
Michael Peter Christen
b51df6c7e8
- added coordinate storage in solr schema
- fixed shutdown process
- fixed some solr-to-metadata reading
- added a large number of metadata attributes in ViewFile.html
2012-08-13 10:40:04 +02:00
Michael Peter Christen
f9c0e6e950
- Implemented and integrated the URIMetadataNode object which is a
metadata representation from the solr index. This shall replace metadata
from the built-in database in the future.
- added the Solr-driven metadata into the search index of YaCy which
makes it now possible to run YaCy without the old metadata index. This
is a major step forward to a full migration to Solr.
2012-08-10 13:26:51 +02:00
Michael Peter Christen
dcc72799c4
better abstraction for result writers using controlled vocabularies and
URIRefs
2012-08-10 07:45:43 +02:00
Michael Peter Christen
a12f693ec9
added two response writers for the embedded solr interface:
a rss/opensearch writer and an enhanced solr xml writer.
The enhanced solr writer has less configuration overhead than the
original writer and should be slightly faster. The rss/opensearch writer
is at this time slightly incomplete compared with the already existing
rss search result from YaCy, and snippets are also missing at this time.
To test the new interface, open for example:
http://localhost:8090/solr/select?wt=rss&q=olympia
The wt codes for the new result writers are:
wt=rss for opensearch
wt=exml for the enhanced solr xml writer.
Additionally, the SRU search parameters have been added to the solr
interface which can now also be used for a normal solr/xml search.
2012-08-09 18:06:48 +02:00
sixcooler
f32aa9a49c
prevent merge of blobs that can't be handled in memory
2012-07-31 23:23:16 +02:00
Michael Peter Christen
1687737771
Abstraction of HandleMap and HandleSet
2012-07-27 12:13:53 +02:00
Michael Peter Christen
e432bb9cd9
better calculation of possible saving in HeapReader index data structure
2012-07-26 10:05:06 +02:00
Michael Peter Christen
9549984c65
documentation/comments
2012-07-25 21:34:23 +02:00
Michael Peter Christen
826967513b
changed options in IndexFederated_p to switch on/off parts of the index
individually. The settings are experimental and the values of the
settings will be overwritten when an index migration from urldb to solr
starts.
2012-07-23 16:28:39 +02:00
orbiter
69e743d9e3
- more abstraction for the RWI index as preparation for solr integration
- added options in search index to switch parts of the index on or off
2012-07-22 13:18:45 +02:00
Michael Peter Christen
f0a079ac9f
allow larger log entries
2012-07-14 16:28:14 +02:00
Michael Peter Christen
784a4abb18
enhancement in internal data organization which should generate fewer
synchronizations in database access
2012-07-14 13:09:44 +02:00
Michael Peter Christen
f78ce93a80
collection of speed and memory saving hacks
2012-07-13 21:15:38 +02:00
orbiter
a196f24f60
prevent enqueueing of non-loggable logging entries
2012-07-12 19:42:42 +02:00
orbiter
482afed07c
reduced logging overhead (a bit)
2012-07-12 19:23:40 +02:00
orbiter
e76159040b
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-12 11:14:04 +02:00
orbiter
bbfa497a3c
replaced more size() > 0 by !isEmpty()
2012-07-12 11:12:21 +02:00
Michael Peter Christen
83da68c4c1
fixed a memory leak inside the logger which appeared if the log was
written faster than the logger is able to print it out to its output
stream. A very large collection of unwritten log outputs had been seen
during heavy crawling. The new ArrayBlockingQueue is limited to prevent
this case.
2012-07-12 01:23:04 +02:00
orbiter
0cbda0b2b8
- replaced all length() == 0 and size() == 0 with isEmpty()
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
2012-07-10 22:59:03 +02:00
Michael Peter Christen
1addbc792c
use less memory for md5 cache
2012-07-08 22:05:04 +02:00
Michael Peter Christen
f32de94723
more logging
2012-07-08 22:04:36 +02:00
Michael Peter Christen
8efc1c1078
- fixed a memory leak (or bad usage) during parsing/snippet fetch
- more logging for errors
2012-07-06 09:05:41 +02:00
Michael Peter Christen
b0c408788b
made class methods static where possible
2012-07-05 12:38:41 +02:00
Michael Peter Christen
5bd3c90907
- removed unnecessary semicolons
- added default case for switch
2012-07-05 11:18:31 +02:00
Michael Peter Christen
132afaf687
removed inaccessible code
2012-07-05 11:09:44 +02:00
Michael Peter Christen
7c1ba99755
removed more unused method parameters
2012-07-05 10:44:30 +02:00
Michael Peter Christen
83701a1b4c
removed unused ImageReference package
2012-07-05 10:24:52 +02:00
Michael Peter Christen
0301aba1e9
removed unused method parameters
2012-07-05 10:23:07 +02:00
Michael Peter Christen
d3964253ae
- added @SuppressWarnings to unused servlet method parameters
- removed unnecessary casts
- removed unnecessary throw statements
2012-07-05 09:14:04 +02:00
Michael Peter Christen
ea10766bfd
cleaned unnecessary nested code
2012-07-05 08:44:39 +02:00
Michael Peter Christen
1481037820
replaced non-generic array with collection
2012-07-05 01:02:51 +02:00
Michael Peter Christen
613b45f604
- better data structures in secondary search
- fixed a big memory leak in secondary search
2012-07-03 07:12:20 +02:00
Michael Peter Christen
8a82609360
- smaller caches to save memory
- close cloneable iterators to free memory
2012-07-02 15:40:40 +02:00
Michael Peter Christen
ce8d4b87d9
fixes for new eclipse 'Juno' warning 'Resource leak'.
2012-07-02 10:27:46 +02:00
Michael Peter Christen
0c345d1559
giving threads names so it's easier to see what's happening during
debugging and within a thread dump
2012-07-02 09:51:43 +02:00
Michael Peter Christen
b9d42fd9c8
using com.google.common.io.Files instead of homebrew methods
2012-06-22 11:39:17 +02:00
Michael Peter Christen
de3ef8ad73
removed unimportant warnings
2012-06-19 08:45:34 +02:00
Michael Peter Christen
9264d8b4af
removed old navigation practice using subject tags in favor of
triplestore-tags
2012-06-17 00:33:40 +02:00
Michael Peter Christen
61bb52d55c
- using http://purl.org/dc/terms/references to refer from an
auto-annotated document to a 'pseudo-linked' document which has an url
created with an object-prefix as defined in the vocabulary file
2012-06-12 14:23:51 +02:00
Michael Peter Christen
8b53771db2
changed behavior of navigation processing:
- vocabulary annotation is not done any more into the metadata of urldb
- vocabularies are written into the jena triplestore using a rdf
vocabulary
- vocabularies for rdf triples must be updated; refactoring done
- with the new navigation tags in the triplestore a faster
pre-urldb-lookup is possible: navigation is processed now within the RWI
during pre-ranking retrieval
- also added an OWL vocabulary stub to add the plain-text url to the
triplestore using the owl:sameas predicate
2012-06-11 23:49:30 +02:00
Michael Peter Christen
bef823c247
close the reader if finished
2012-06-11 01:20:54 +02:00
cominch
9cbfc1a1c0
augmentedProxy, which forwards every proxy request to a
rewrite engine to customize existing webpages. originally implemented by
Florian Richter.
Conflicts:
source/de/anomic/http/server/HTTPDProxyHandler.java
2012-06-10 10:15:34 +02:00
Michael Peter Christen
3b992e6b00
using utf8 String compression in Webstructure database
2012-06-09 11:00:33 +02:00
Michael Peter Christen
2280a7b276
- changed initialization order to prefer allocation of memory for table
files first
- bugfixes in memory amount calculation
2012-06-09 09:05:47 +02:00