Michael Peter Christen
d39463a85c
added deleteByQuery to solr connectors
2012-08-17 17:05:46 +02:00
Michael Peter Christen
54bea21c02
bugfix for solr connector, possibly a cause for
http://forum.yacy-websuche.de/viewtopic.php?p=26893#p26893
2012-08-17 14:34:31 +02:00
Michael Peter Christen
a1b2c9a67d
doctype2mime fix, influences metadata conversion between old metadata
and solr
2012-08-16 17:49:35 +02:00
Michael Peter Christen
597bb76e4f
get the peer location more quickly
2012-08-16 16:28:57 +02:00
Michael Peter Christen
1641835fef
replaced yacy xml encoding by solr xml encoding
2012-08-14 13:29:11 +02:00
Michael Peter Christen
89fe13e73d
enhanced GSA and RSS output format: corrected date, added some missing
fields, added xml encoding for utf8
2012-08-14 13:19:29 +02:00
Michael Peter Christen
d988ba50cf
added a very rudimentary, incomplete, non-verified GSA response writer
for solr. Try this:
http://localhost:8090/gsa/searchresult?q=pdf&site=col1&num=10
2012-08-14 12:40:26 +02:00
Michael Peter Christen
9448d9a8a2
oops
2012-08-13 14:01:45 +02:00
Michael Peter Christen
e5ef840f40
- renamed DoubleSolrConnector to MirrorSolrConnector and added a
hit/miss/document cache to the MirrorSolrConnector.
- more abstraction to SolrDocument in Connector interface
- bugfixes in Solr field reader
2012-08-13 13:32:32 +02:00
Michael Peter Christen
b51df6c7e8
- added coordinate storage in solr schema
- fixed shutdown process
- fixed some solr-to-metadata reading
- added a large number of metadata attributes in ViewFile.html
2012-08-13 10:40:04 +02:00
orbiter
39f8eb60c3
tried to prevent calls to bad-hack getSize() method and reduced overhead
of that method a bit.
2012-08-10 18:10:25 +02:00
Michael Peter Christen
b2b480fff2
more abstraction of the YaCySchema -> Opensearch matching process
2012-08-10 09:48:15 +02:00
Michael Peter Christen
24462e9baa
set the title every time, it is possible that it has changed
2012-08-10 07:51:57 +02:00
Michael Peter Christen
dcc72799c4
better abstraction for result writers using controlled vocabularies and
URIRefs
2012-08-10 07:45:43 +02:00
Michael Peter Christen
136fcb1ad9
refactoring
2012-08-10 06:47:13 +02:00
Michael Peter Christen
a12f693ec9
added two response writers for embedded solr interface:
a rss/opensearch writer and an enhanced solr xml writer.
The enhanced solr writer has less configuration overhead than the
original writer and should be slightly faster. The rss/opensearch writer
is at this time slightly incomplete compared with the already existing
rss search result from YaCy, and snippets are also missing at this time.
To test the new interface, open for example:
http://localhost:8090/solr/select?wt=rss&q=olympia
The wt codes for the new result writers are:
wt=rss for opensearch
wt=exml for the enhanced solr xml writer.
Additionally, the SRU search parameters have been added to the solr
interface, which can now also be used for a normal solr/xml search.
2012-08-09 18:06:48 +02:00
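The test URL in the commit above follows a simple pattern: the `wt` parameter selects the response writer and `q` carries the query. A minimal sketch of building such URLs is shown here; the class and method names are hypothetical and are not part of YaCy, and the host and port are taken from the example URL in the commit message.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of YaCy: builds a select URL for the
// embedded solr interface described in the commit message.
final class SolrSelectUrl {

    // wt is the writer code from the commit message ("rss" or "exml").
    static String build(String host, int port, String wt, String query) {
        try {
            return "http://" + host + ":" + port + "/solr/select?wt="
                    + URLEncoder.encode(wt, "UTF-8")
                    + "&q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("localhost", 8090, "rss", "olympia"));
        System.out.println(build("localhost", 8090, "exml", "olympia"));
    }
}
```

The query is form-encoded, so a space in the query becomes `+` in the URL.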
orbiter
67edfd991c
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-08-05 15:49:48 +02:00
orbiter
d9173ba7ed
added more solr fields to integrate values from URIMetadataRow. All
writes to the Metadata-DB are now also done to solr. This includes
metadata transfer during search and rwi transfer.
The new/added solr fields are:
## time when resource was loaded
load_date_dt
## date until resource shall be considered as fresh
fresh_date_dt
## id of the host, a 6-byte hash that is part of the document id
host_id_s
## ids of referrer to this document
referrer_id_ss
## the md5 of the raw source
md5_s
## the name of the publisher of the document
publisher_t
## the language used in the document; starts with primary language
language_ss
## an external ranking value
ranking_i
## the size of the raw source
size_i
## number of links to audio resources
audiolinkscount_i
## number of links to video resources
videolinkscount_i
## number of links to application resources
applinkscount_i
2012-08-05 15:49:27 +02:00
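The field names listed in the commit above end in short type suffixes. The sketch below reads those suffixes according to the naming convention commonly used in Solr schemas (`_dt` date, `_s` string, `_ss` multi-valued string, `_t` text, `_i` integer). That mapping is an assumption: the commit message does not state it, and the actual YaCy schema may declare each field explicitly. The class name is hypothetical.

```java
// Hypothetical helper, not part of YaCy: guesses a field's type from its
// name suffix. The suffix meanings are an assumption based on common Solr
// naming conventions, not something the commit message confirms.
final class SolrFieldSuffix {

    static String describe(String fieldName) {
        if (fieldName.endsWith("_dt")) return "date";
        if (fieldName.endsWith("_ss")) return "string, multi-valued";
        if (fieldName.endsWith("_s")) return "string";
        if (fieldName.endsWith("_t")) return "text";
        if (fieldName.endsWith("_i")) return "integer";
        return "unknown";
    }

    public static void main(String[] args) {
        String[] fields = {"load_date_dt", "host_id_s", "referrer_id_ss", "publisher_t", "size_i"};
        for (String f : fields) {
            System.out.println(f + " -> " + describe(f));
        }
    }
}
```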
Michael Peter Christen
3ce04cecf3
bad hack to prevent a bug appearing in solr
2012-07-31 23:49:07 +02:00
Michael Peter Christen
1687737771
Abstraction of HandleMap and HandleSet
2012-07-27 12:13:53 +02:00
Michael Peter Christen
6f1ddb2519
Moved solr index-add method to the same method where the YaCy index is
written. Also did some code cleanup.
2012-07-25 01:53:47 +02:00
Michael Peter Christen
315d83cfa0
cleanup
2012-07-24 22:16:56 +02:00
Michael Peter Christen
76202f068e
extended abstraction of local and remote solr index using one front-end
for index administration and querying.
2012-07-24 17:23:29 +02:00
Michael Peter Christen
cba4ab862e
fix for http://bugs.yacy.net/view.php?id=202
2012-07-23 00:36:18 +02:00
orbiter
69e743d9e3
- more abstraction for the RWI index as preparation for solr integration
- added options in search index to switch parts of the index on or off
2012-07-22 13:18:45 +02:00
Michael Peter Christen
f78ce93a80
collection of speed and memory saving hacks
2012-07-13 21:15:38 +02:00
orbiter
0cbda0b2b8
- replaced all length() == 0 and size() == 0 with isEmpty()
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
2012-07-10 22:59:03 +02:00
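The replacement described in the commit above can be illustrated with a small sketch. The class names here are hypothetical and are not taken from the YaCy sources; `SimpleBag` stands in for a container that only had `size()` and needed its own `isEmpty()`.

```java
import java.util.List;

// Hypothetical container, not a YaCy class: shows an isEmpty() method
// added next to an existing size() method.
final class SimpleBag {
    private int count;

    void add() { count++; }

    int size() { return count; }

    // Equivalent to size() == 0, but states the intent directly.
    boolean isEmpty() { return count == 0; }
}

final class EmptyChecks {

    // before: s.length() > 0
    static boolean hasText(String s) {
        return s != null && !s.isEmpty();
    }

    // before: list.size() > 0
    static boolean hasItems(List<?> list) {
        return list != null && !list.isEmpty();
    }
}
```

For collections whose `size()` has to count or traverse elements, `isEmpty()` can also be cheaper, although for `String` and `ArrayList` the two forms cost the same.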
orbiter
28b30231c3
fix for url matcher of multiple amp& in an url, see:
http://forum.yacy-websuche.de/viewtopic.php?f=8&t=4439&p=26650#p26650
2012-07-10 17:39:56 +02:00
orbiter
c6d8950651
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-09 14:33:11 +02:00
orbiter
5f3b8dc040
fix for RSS reader
2012-07-09 14:32:35 +02:00
Michael Peter Christen
434ee90c59
added classification for control file types which shall not be loaded
but placed onto the noload-queue
2012-07-08 21:17:33 +02:00
Michael Peter Christen
a90bcb48f6
added webm
2012-07-08 17:58:05 +02:00
Michael Peter Christen
8a6edc0031
fix for solr shutdown
2012-07-05 14:23:43 +02:00
Michael Peter Christen
b8bcc06283
fix for urls beginning with "//"
2012-07-05 14:23:29 +02:00
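A link that begins with "//" is a scheme-relative reference: it names a host but inherits the scheme of the page it appears on. The sketch below shows one way to resolve such links with the standard `java.net.URI` class. It is not the fix made in the commit, whose details are not given in the log; the class name and example URLs are hypothetical.

```java
import java.net.URI;

// Hypothetical helper, not the YaCy fix: resolves a link found in a page
// against the URL of that page using java.net.URI.
final class SchemeRelativeUrls {

    static String resolve(String baseUrl, String link) {
        // For a link starting with "//", URI.resolve keeps the base scheme
        // and takes host and path from the link.
        return URI.create(baseUrl).resolve(link).toString();
    }

    public static void main(String[] args) {
        String base = "https://example.org/dir/page.html";
        System.out.println(resolve(base, "//cdn.example.org/lib.js"));
        System.out.println(resolve(base, "other.html"));
    }
}
```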
Michael Peter Christen
b0c408788b
made class methods static where possible
2012-07-05 12:38:41 +02:00
Michael Peter Christen
5bd3c90907
- removed unnecessary semicolons
- added default case for switch
2012-07-05 11:18:31 +02:00
Michael Peter Christen
0301aba1e9
removed unused method parameters
2012-07-05 10:23:07 +02:00
Michael Peter Christen
d3964253ae
- added @SuppressWarnings to unused servlet method parameters
- removed unnecessary casts
- removed unnecessary throw statements
2012-07-05 09:14:04 +02:00
Michael Peter Christen
ea10766bfd
cleaned unnecessary nested code
2012-07-05 08:44:39 +02:00
orbiter
7f851d62a7
replaced HashARC with SizeLimited Objects which are less costly
2012-07-04 21:56:25 +02:00
orbiter
bb8dcb4911
automatically adapt size of word cache to available memory
2012-07-03 18:22:25 +02:00
Michael Peter Christen
de903a53a0
parser refactoring & hacks
2012-07-03 06:06:38 +02:00
Michael Peter Christen
8a82609360
- smaller caches to save memory
- close cloneable iterators to free memory
2012-07-02 15:40:40 +02:00
Michael Peter Christen
ce8d4b87d9
fixes for new eclipse 'Juno' warning 'Resource leak'.
2012-07-02 10:27:46 +02:00
Michael Peter Christen
0c345d1559
giving threads names so it's easier to see what's happening during
debugging and within a thread dump
2012-07-02 09:51:43 +02:00
sixcooler
97f60010d8
fix crawl start from file
2012-06-26 16:11:39 +02:00
Michael Peter Christen
d763e4d94b
fixed bad referer computation in SSIs which caused an NPE during host
computation. This error was there before the latest IPv6 hack but did
not cause an NPE. The IPv6 hack was not the cause of this bug, but it
exposed the misconfiguration of the 'referer' referrer.
2012-06-26 11:18:29 +02:00
Michael Peter Christen
358b04885e
more IPv6 hacks
2012-06-26 00:25:46 +02:00
Michael Peter Christen
96aeb127e3
generalized localhost naming.
this is also a preparation for a better IPv6 implementation.
2012-06-26 00:08:25 +02:00
Michael Peter Christen
77f795756c
fixing redirects and status codes: storing of status code in
ResponseHeader to make it available for late evaluations, like storage
in solr.
2012-06-25 18:17:31 +02:00