Michael Peter Christen
5ac61591f3
better abstraction for solr query params
2012-09-25 23:59:30 +02:00
Michael Peter Christen
c913b2ba77
- fix for NPEs during remote solr configuration
- fixed remote solr setting switch
- added more logging
2012-09-25 23:59:09 +02:00
Michael Peter Christen
1533bfd63b
refactoring
2012-09-25 21:20:03 +02:00
Michael Peter Christen
e49359cc95
removed tenant query attribute since it is not used any more and is
replaced by the site-operator in the GSA interface. This operator can
also be simulated in the Solr interface using the collections_sxt field.
2012-09-25 21:09:06 +02:00
Michael Peter Christen
872f83ebe0
refactoring
2012-09-25 21:04:58 +02:00
Michael Peter Christen
fb9460f0a8
using the search filter to drill down search to file types.
A search like "mp3 filetype:mp3" may now surprise you.
2012-09-25 17:52:33 +02:00
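The filetype drill-down described in fb9460f0a8 can be illustrated with a minimal sketch that splits a query such as "mp3 filetype:mp3" into plain search terms and a file type filter. This is not YaCy's actual parser: the class name FiletypeOperatorSketch, the Parsed holder and the parse method are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch, not YaCy code: separates a filetype: operator from the search terms. */
final class FiletypeOperatorSketch {

    /** Plain terms plus the optional file type taken from the query. */
    static final class Parsed {
        final List<String> terms;
        final String filetype; // null when the query has no filetype: operator

        Parsed(List<String> terms, String filetype) {
            this.terms = terms;
            this.filetype = filetype;
        }
    }

    private static final String OPERATOR = "filetype:";

    static Parsed parse(String query) {
        List<String> terms = new ArrayList<>();
        String filetype = null;
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) continue; // happens for an empty query
            if (token.startsWith(OPERATOR) && token.length() > OPERATOR.length()) {
                // the operator becomes a filter and is not searched as a word
                filetype = token.substring(OPERATOR.length()).toLowerCase(Locale.ROOT);
            } else {
                terms.add(token);
            }
        }
        return new Parsed(terms, filetype);
    }
}
```

In this sketch the word "mp3" stays a search term while "filetype:mp3" only restricts the result set, so the same string plays two different roles in one query.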
Michael Peter Christen
15ea053c3a
- added xml output in IndexControlURLs to get the storage page of index
dump commands
- adjusted the apicall.sh script to write the downloaded text to stdout,
which is necessary to parse the content out of it
- added indexdump.sh script which creates a solr dump and prints out the
storage path for the index dump
- added synchronization to the Fulltext class to prevent data from being
stored to a non-existent solr index while this index is disabled during
the storage of the dump
2012-09-25 00:19:52 +02:00
Michael Peter Christen
1b474139dd
used the new zip writer/reader to add a solr dump process: the whole
solr index can be written to a zip dump and also restored during runtime
2012-09-24 17:05:28 +02:00
Michael Peter Christen
4a3e684f8c
added a directory-to-zip writer and zip-to-directory reader
2012-09-24 17:04:37 +02:00
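A directory-to-zip writer and zip-to-directory reader of the kind named in 4a3e684f8c could look roughly like the following. This is a minimal sketch built on java.util.zip and java.nio.file, not the code from that commit: the class name ZipDirSketch and both method names are assumptions, and the sketch stores regular files only (empty directories are not recorded).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical sketch, not YaCy code: round-trips a directory through a zip file. */
final class ZipDirSketch {

    /** Writes every regular file below sourceDir into zipFile, using relative entry names. */
    static void zipDirectory(Path sourceDir, Path zipFile) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipFile))) {
            for (Path file : files) {
                // zip entry names always use forward slashes
                String name = sourceDir.relativize(file).toString().replace('\\', '/');
                zos.putNextEntry(new ZipEntry(name));
                Files.copy(file, zos);
                zos.closeEntry();
            }
        }
    }

    /** Restores all entries of zipFile below targetDir. */
    static void unzipToDirectory(Path zipFile, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipInputStream zis = new ZipInputStream(Files.newInputStream(zipFile))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // reject entries such as "../x" that would leave the target directory
                if (!out.startsWith(root)) {
                    throw new IOException("entry escapes target dir: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                    continue;
                }
                Files.createDirectories(out.getParent());
                Files.copy(zis, out, StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}
```

The path check in unzipToDirectory is a precaution added for this sketch; whether the original reader performs a similar check is not stated in the log.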
Michael Peter Christen
d9ebf4a40f
a bit more logging
2012-09-24 15:01:44 +02:00
Michael Peter Christen
5683162bd3
simplifications in DHT Distribution class and more documentation
2012-09-24 12:01:09 +02:00
Michael Peter Christen
e57bf2ca39
simplified DHT classes
2012-09-24 01:04:39 +02:00
orbiter
a053b356ee
added new classes to renovate the YaCy protocol based on simple data
structures in cora:
- added the Peer object, which is a fresh version of Seed
- added the Peers object, which is a fresh version of Network
- added the Network api access class to retrieve a list of peers based
on the Network.xml servlet in all YaCy peers.
2012-09-22 11:10:11 +02:00
Michael Peter Christen
8219a445f3
refactoring
2012-09-21 16:46:57 +02:00
Michael Peter Christen
f879a344e7
fix for no depth limit default value
2012-09-21 16:05:17 +02:00
Michael Peter Christen
00c1c777fa
refactoring
2012-09-21 15:48:16 +02:00
orbiter
563d584420
removed more dependencies in cora from kelondro
2012-09-21 11:02:36 +02:00
orbiter
aa65282259
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-09-21 10:27:30 +02:00
orbiter
63762d8f89
removed kelondro dependencies from cora
2012-09-20 19:38:22 +02:00
orbiter
6e0f4557f8
added ftp to getName
2012-09-20 18:29:04 +02:00
cominch
23204d2245
change parameter to support the smw extension for list import
2012-09-20 15:02:57 +02:00
Michael Peter Christen
c235d5c0f1
fixed size parsing in RSS message parser (for YaCy size parameter)
2012-09-19 06:36:07 +02:00
Michael Peter Christen
5bc8f34150
fix for success query counter
2012-09-18 11:06:36 +02:00
orbiter
60b1e23f05
added new crawl options:
- indexUrlMustMatch and indexUrlMustNotMatch, which can be used to select
loaded pages for indexing. The default patterns are chosen so that all
loaded pages are also indexed (as before), but when doing an expert crawl
start the user may select only specific urls to be indexed.
- crawlerNoDepthLimitMatch is a new pattern that can be used to remove
the crawl depth limitation. This filter is a never-match by default (so
the depth limit applies), but the user can select paths which will
be loaded completely even if the crawl depth is reached.
2012-09-16 21:27:55 +02:00
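The interplay of the three patterns from 60b1e23f05 can be sketched as two small predicates. This is an illustration under assumptions, not the crawler's real code: the class CrawlFilterSketch, its method names and the concrete default regular expressions are hypothetical; only the option names indexUrlMustMatch, indexUrlMustNotMatch and crawlerNoDepthLimitMatch come from the commit message.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch, not YaCy code: how the new crawl patterns could be evaluated. */
final class CrawlFilterSketch {

    /** Assumed default for indexUrlMustMatch: every url matches. */
    static final Pattern MATCH_ALL = Pattern.compile(".*");

    /** Assumed default for the other two patterns: an empty negative lookahead never matches. */
    static final Pattern MATCH_NEVER = Pattern.compile("(?!)");

    /** A loaded page is indexed if it matches the must-match and not the must-not-match pattern. */
    static boolean shouldIndex(String url, Pattern indexUrlMustMatch, Pattern indexUrlMustNotMatch) {
        return indexUrlMustMatch.matcher(url).matches()
            && !indexUrlMustNotMatch.matcher(url).matches();
    }

    /** A url is loaded if it is within the depth limit or the no-depth-limit pattern matches. */
    static boolean shouldLoad(String url, int depth, int maxDepth, Pattern crawlerNoDepthLimitMatch) {
        return depth <= maxDepth || crawlerNoDepthLimitMatch.matcher(url).matches();
    }
}
```

With the assumed defaults, shouldIndex accepts every loaded page and shouldLoad reduces to the plain depth check, which corresponds to the "as before" behaviour described in the message.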
orbiter
4987921d3d
fixed the size() method, which also counted failed pages (which are also
inside the solr index)
2012-09-16 21:22:56 +02:00
Michael Peter Christen
6ec02deec6
added new crawl attributes in crawl profile (not active yet)
2012-09-14 16:49:29 +02:00
Michael Peter Christen
975bc95ddf
added default facet fields for json response format (stub)
2012-09-14 12:09:20 +02:00
Michael Peter Christen
0504b01bdc
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-09-14 00:48:17 +02:00
orbiter
9413f77b65
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-09-13 23:54:26 +02:00
orbiter
a55e77a115
added twitter search heuristic
2012-09-13 23:53:53 +02:00
Michael Peter Christen
e54ac38095
- some corrections in usage of getFile() and getFileName()
- added more attributes in json response writer according to yacy
servlet
2012-09-11 23:28:21 +02:00
Michael Peter Christen
62add1d564
added the protocol and the file name extension to the solr fields since
these fields are probably facets in file search
2012-09-11 22:46:39 +02:00
Michael Peter Christen
e072632a54
no complaints about memory if the database is empty
2012-09-11 22:28:10 +02:00
Michael Peter Christen
b846f585fa
fixed a bug with size_i field usage
2012-09-11 20:24:27 +02:00
Michael Peter Christen
9db032664e
activate two solr fields which will be used by administration interface
(later)
2012-09-11 20:15:54 +02:00
orbiter
fcd5c7eec3
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-09-11 09:16:38 +02:00
orbiter
6171143b4a
added facet stub in JsonResponseWriter
2012-09-11 09:15:47 +02:00
Michael Peter Christen
e84ffdb4f3
enhanced solr writers
2012-09-11 03:02:02 +02:00
Michael Peter Christen
5df553c152
- added a json writer for solr (yes there was one using xslt but this
one writes the same way as yacysearch.json)
- using the new json solr result to change the ajax search in
IndexControlURLs to the new solr search
2012-09-10 14:30:44 +02:00
Michael Peter Christen
4634f0e626
fix for images_withalt
2012-09-10 12:30:03 +02:00
Michael Peter Christen
e65cecc419
- updated lucene libraries to 3.6.1
- added lucene-grouping which enables faceted search; try this:
http://localhost:8090/solr/select?q=*:*&start=0&rows=3&facet=true&facet.field=host_s
2012-09-10 10:12:38 +02:00
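The faceted query from e65cecc419 can also be assembled programmatically. The following minimal sketch rebuilds that URL from its parameters; the class FacetQuerySketch and the method buildSelectUrl are hypothetical helpers, while the base address, the parameter names and the host_s field are taken from the commit message. Parameter values are percent-encoded, so the colon in *:* appears as %3A in the result.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch, not YaCy code: builds a Solr select URL with one facet field. */
final class FacetQuerySketch {

    static String buildSelectUrl(String solrBase, String query, int start, int rows, String facetField) {
        // LinkedHashMap keeps the parameters in insertion order
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);
        params.put("start", Integer.toString(start));
        params.put("rows", Integer.toString(rows));
        params.put("facet", "true");
        params.put("facet.field", facetField);

        StringBuilder url = new StringBuilder(solrBase).append("/select");
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator).append(encode(p.getKey())).append('=').append(encode(p.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a mandatory charset on every Java platform
            throw new IllegalStateException(e);
        }
    }
}
```

Calling buildSelectUrl("http://localhost:8090/solr", "*:*", 0, 3, "host_s") yields the query from the commit message in encoded form; the response then carries the number of documents per host_s value in addition to the three result rows.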
Michael Peter Christen
4d29f59a27
removed warnings
2012-09-10 07:15:52 +02:00
Michael Peter Christen
8c099d2106
Merge remote-tracking branch 'origin/master'
Conflicts:
htroot/api/ymarks/import_ymark.java
source/de/anomic/data/ymark/YMarkEntry.java
source/de/anomic/data/ymark/YMarkTables.java
2012-09-10 07:05:20 +02:00
apfelmaennchen
d31a632951
- added dmoz RDF dump importer
- added indexing to Tables columns to support larger bookmark
collections
- added RDF output (HTTP) for public bookmarks at /YMarks.rdf
- YMarkRDF also provides a Jena RDF Model as "internal" API
- various other changes/fixes for YMarks (mainly backend)
2012-09-09 09:53:58 +02:00
Michael Peter Christen
10b911eed4
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-09-07 22:07:02 +02:00
Michael Peter Christen
be67c70a47
added Solr fields:
inboundlinks_text_chars_val
inboundlinks_text_words_val
inboundlinks_alttag_txt
outboundlinks_text_chars_val
outboundlinks_text_words_val
outboundlinks_alttag_txt
2012-09-07 22:06:51 +02:00
orbiter
d73fff0e0e
added solr field images_withalt_i
2012-09-07 21:33:45 +02:00
sixcooler
e78fe3f477
also do a clearcache on the solr-connector-caches
2012-09-06 22:07:07 +02:00
sixcooler
9ee2e09983
statistics for solr-cache
2012-09-06 22:02:29 +02:00
Michael Peter Christen
d8425e6809
added collections to crawl monitor
2012-09-04 14:47:53 +02:00