Michael Peter Christen
28bd3e62b1
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-10-05 00:04:09 +02:00
orbiter
4fed4a86d8
another fix to location search
2012-10-04 22:44:44 +02:00
orbiter
507c612015
Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1
2012-10-04 21:32:04 +02:00
reger
5650b0333e
adjusted Netbeans-IDE classpath to current jars
change solr jars to 3.6.1 (from 3.6.0)
change lucene jars to 3.6.1 (from 3.6.0)
added jsoup-1.6.3
2012-10-04 21:12:09 +02:00
reger
b58e1f6d67
- add translation for ConfigHeuristics_p.html # section search-result
- removed old/unused scroogle text
2012-10-04 20:57:29 +02:00
orbiter
0f7a54452d
fix for location search query encoding
2012-10-04 14:46:40 +02:00
Michael Peter Christen
679d562908
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-10-04 13:18:52 +02:00
sixcooler
9aa21506be
bump to httpcore-4.2.2 (maintenance release)
2012-10-03 02:15:02 +02:00
Michael Peter Christen
31485a963d
refactoring
2012-10-02 21:57:50 +02:00
Michael Peter Christen
406e1f3e7e
added an option to start indexing right from the host browser
2012-10-02 21:18:27 +02:00
Michael Peter Christen
f8a3ab2d82
added the usage of synonyms to the GSA search interface
2012-10-02 14:29:45 +02:00
Michael Peter Christen
3d33a5bdf6
turned the synonyms_t Text field into a multi-valued String field
synonyms_sxt
2012-10-02 11:13:06 +02:00
Michael Peter Christen
41ab2a2279
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-10-02 10:24:03 +02:00
orbiter
c8b1a693dc
oops, added missing class for last commit
2012-10-02 10:23:10 +02:00
Michael Peter Christen
3b959ee002
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-10-02 10:14:09 +02:00
orbiter
3190347814
added a synonyms_t field to solr and a process to read synonym files.
This can be used to add another stemming to solr using stemming files
that are expressed as synonyms for grammatical alternatives. The
synonym/stemming files must have the following form:
- each line is a comma-separated list of synonyms
- the list of synonyms may be enclosed with {} (like the GSA synonyms
file)
- the file may contain comments which are lines starting with a '#'
The synonym file(s) must be placed in DATA/DICTIONARIES/synonyms/ and
are activated by default whenever a synonym file is in place.
Then, for each word that is found in a document all synonyms are added
to a long text field which is stored into synonyms_t. Processes using
the synonyms must query with that field as optional matcher.
2012-10-02 00:02:50 +02:00
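Note on the synonym file format described in the commit above: a minimal sketch of how such a file could be read, assuming exact, case-sensitive word matching. The class and method names (SynonymLines, parseLine, synonymsFor) are hypothetical and are not taken from the YaCy source; how the real implementation tokenizes and fills synonyms_t is not stated in the commit message.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not the class added by the commit.
final class SynonymLines {

    /** Parses one line of a synonym file; comments and blank lines yield an empty list. */
    static List<String> parseLine(String line) {
        String s = line.trim();
        if (s.isEmpty() || s.startsWith("#")) return new ArrayList<>();
        // The list may optionally be enclosed in {} (as in GSA synonym files).
        if (s.startsWith("{") && s.endsWith("}")) s = s.substring(1, s.length() - 1);
        List<String> words = new ArrayList<>();
        for (String w : s.split(",")) {
            w = w.trim();
            if (!w.isEmpty()) words.add(w);
        }
        return words;
    }

    /** For each document word, collects the other members of every group containing it. */
    static Set<String> synonymsFor(List<String> documentWords, List<List<String>> groups) {
        Set<String> result = new LinkedHashSet<>();
        for (String word : documentWords) {
            for (List<String> group : groups) {
                if (!group.contains(word)) continue; // assumption: exact match
                for (String syn : group) {
                    if (!syn.equals(word)) result.add(syn);
                }
            }
        }
        return result;
    }
}
```

The set returned by synonymsFor is what would be joined into the synonym field under these assumptions; a query would then match against that field as an optional clause.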
Michael Peter Christen
411d0e839b
added an underline text field to solr to record all underlined texts
2012-10-01 14:16:49 +02:00
orbiter
be4c96f3b1
The HostBrowser now offers to index files that are discovered because
they are linked in the web interface.
2012-09-30 13:23:06 +02:00
Michael Peter Christen
c4a3d8870f
fixed computation of links in host browser which are not indexed but
known by the crawler. Such links are now displayed in grey color.
2012-09-29 02:13:11 +02:00
Michael Peter Christen
97a47319c8
added nice links to the host browser:
- click on the file icon to get the metadata of the file
- click on the link icon behind the link to open the original file in
the browser
2012-09-28 23:09:21 +02:00
Michael Peter Christen
f45f7fc12e
added new Host Browser to main menu:
this new search interface is something completely new for search, but
completely common on desktops: browse a web space like one would browse
a file system in a file browser. The file listing is created using the
search index and a faceted restriction to specific domains.
2012-09-28 22:45:16 +02:00
Michael Peter Christen
8556a3d521
extended solr connector with a method to retrieve a single facet.
2012-09-28 13:50:13 +02:00
Michael Peter Christen
d0015df61c
added lucene memory library which is now necessary as solr has to
process more complex queries
2012-09-28 13:48:51 +02:00
Michael Peter Christen
80edd8ecd7
some more after-refactoring fixes
2012-09-28 10:24:57 +02:00
Michael Peter Christen
816cb6ce93
another fix for the debian installer: the installer fails because some
classes had unresolved dependencies. This fix removes the dependencies.
2012-09-28 09:00:40 +02:00
Michael Peter Christen
c461c28c5d
fix for debian package installation (caused by refactoring)
2012-09-27 17:23:10 +02:00
Michael Peter Christen
280e36c90b
allow Cross-Origin Resource Sharing for all stream servlets, that is, the
solr and the gsa search interfaces. JavaScript running in any browser can
now make cross-origin requests to all YaCy search interfaces, which
opens the option of 'YaCy Client in Browser' and 'End-Point Fail-over'
concepts.
2012-09-27 12:02:24 +02:00
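Note on the CORS change above: a minimal sketch of what allowing cross-origin access amounts to on the response side, assuming the servlets answer with the wildcard value "*" (the commit message does not state the exact value). The header name Access-Control-Allow-Origin is the standard CORS response header; the class CorsHeaders and the plain header map are hypothetical stand-ins for the real servlet response object.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; the real servlets set this on their response object.
final class CorsHeaders {
    // Standard CORS response header; the value "*" admits requests from any origin.
    static final String ALLOW_ORIGIN = "Access-Control-Allow-Origin";

    /** Returns a copy of the given response headers with the allow-all CORS header set. */
    static Map<String, String> withCors(Map<String, String> responseHeaders) {
        Map<String, String> out = new LinkedHashMap<>(responseHeaders);
        out.put(ALLOW_ORIGIN, "*"); // assumption: wildcard rather than a specific origin
        return out;
    }
}
```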
Michael Peter Christen
ccd65ecf8d
fixed url search in IndexControlURLs_p.html / using now the solr
interface
2012-09-27 00:31:59 +02:00
Michael Peter Christen
016ffa7434
increased strength of crawling waves in network image
2012-09-26 23:32:13 +02:00
Michael Peter Christen
23f68f2a69
force usage of default faceting mechanisms for search
2012-09-26 18:48:59 +02:00
Michael Peter Christen
24d2ee3c52
- better date ranking
- more protection against NPE and time travel effects
2012-09-26 18:36:32 +02:00
Michael Peter Christen
ca313e404f
- if a "/date" modifier is used, the solr remote query applies an
ordering by date (ascending)
- added also some 'anti-timetravel' protection (check if date is in the
future within any metadata date field)
2012-09-26 16:56:33 +02:00
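Note on the 'anti-timetravel' check above: a minimal sketch of testing whether any metadata date field lies in the future. The class DateGuard and its method are hypothetical; what the real code does once it finds such a date is not stated in the commit message.

```java
import java.util.Date;
import java.util.List;

// Hypothetical illustration of a future-date check over several date fields.
final class DateGuard {

    /** True if any of the given date fields lies after 'now'; null entries are skipped. */
    static boolean anyInFuture(List<Date> dateFields, Date now) {
        for (Date d : dateFields) {
            if (d != null && d.after(now)) return true;
        }
        return false;
    }
}
```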
Michael Peter Christen
a4214694df
We assert that no other metadata storage than solr is used now.
Therefore a property like solrConnected() must be true all the time.
Removal of this method causes removal of all write operations to the old
metadata index.
2012-09-26 16:05:11 +02:00
Michael Peter Christen
abab291162
made the index schema retrieval public and allow cross-domain retrieval
2012-09-26 15:44:50 +02:00
Michael Peter Christen
0cec7e761a
enhanced snippet extractor to find snippets also inside of tokens of a
URL
2012-09-26 15:33:37 +02:00
sixcooler
c65b576a6f
added filename for missing crawlname when crawling from file
2012-09-26 14:05:33 +02:00
sixcooler
6c50d016ed
pdf- and zipParser should not use forced Memory-Limits
2012-09-26 14:03:51 +02:00
Michael Peter Christen
562183932b
- removed ip_s from default profile since that needs a DNS lookup to
create a document entry. This makes remote search much slower.
- removed synchronization of add method if ip_s is activated to prevent
a user configuration from causing bad behavior. The disadvantage is
that an index dump can cause data loss if indexing is running
during the index dump
- caught more exceptions and more NPE
- better abstraction in MirrorSolrConnector
- slight performance enhancement when only the index count is requested
(rows=0 is sufficient to get a total count)
2012-09-26 13:38:04 +02:00
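Note on the rows=0 remark above: Solr reports the total number of matches (numFound) in the response even when no result rows are requested, so a count-only request can skip document retrieval. A minimal sketch of building such request parameters follows; the class CountQuery is hypothetical and is not the connector code touched by the commit.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical illustration: parameters for a count-only Solr request.
final class CountQuery {

    /** Builds the query string for a request that returns no rows, only the total count. */
    static String countOnlyParams(String query) {
        try {
            // rows=0 asks Solr for zero documents; the total is still reported.
            return "q=" + URLEncoder.encode(query, "UTF-8") + "&rows=0";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported by the JVM
        }
    }
}
```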
Michael Peter Christen
24f4ca4d85
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-09-26 12:01:34 +02:00
apfelmaennchen
7efe9eb37b
adding CORS access header for Network.xml to overcome cross domain
restriction (e.g. necessary to build a JavaScript YaCy
client).
2012-09-26 10:36:09 +02:00
apfelmaennchen
116f429e35
fix for java.lang.RuntimeException: TableColumnIndex not available...
2012-09-26 09:56:16 +02:00
Michael Peter Christen
5ac61591f3
better abstraction for solr query params
2012-09-25 23:59:30 +02:00
Michael Peter Christen
c913b2ba77
- fix for NPEs during remote solr configuration
- fixed remote solr setting switch
- added more logging
2012-09-25 23:59:09 +02:00
Michael Peter Christen
b5192e03d7
fixed bad output in stopYACY.sh
2012-09-25 23:20:09 +02:00
Michael Peter Christen
882d54067a
added dummy update servlet
2012-09-25 23:09:32 +02:00
Michael Peter Christen
1533bfd63b
refactoring
2012-09-25 21:20:03 +02:00
Michael Peter Christen
e49359cc95
removed tenant query attribute since it is not used any more and is
replaced by the site-operator in the GSA interface. This operator can
also be simulated in the Solr interface using the collections_sxt field.
2012-09-25 21:09:06 +02:00
Michael Peter Christen
872f83ebe0
refactoring
2012-09-25 21:04:58 +02:00
Michael Peter Christen
fb9460f0a8
using the search filter to drill down search to file types.
A search like "mp3 filetype:mp3" will now maybe surprise you.
2012-09-25 17:52:33 +02:00
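Note on the filetype drill-down above: a minimal sketch of separating a "filetype:" modifier from the remaining search terms, assuming whitespace-separated tokens and that the last modifier wins. The class FiletypeModifier is hypothetical; how YaCy actually parses modifiers and applies the filter is not described in the commit message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of splitting a query into terms and a file type filter.
final class FiletypeModifier {
    static final String PREFIX = "filetype:";

    final List<String> terms = new ArrayList<>();
    String filetype = null; // stays null when the query has no filetype: modifier

    static FiletypeModifier parse(String query) {
        FiletypeModifier m = new FiletypeModifier();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) continue;
            if (token.startsWith(PREFIX) && token.length() > PREFIX.length()) {
                // assumption: extensions are compared in lower case
                m.filetype = token.substring(PREFIX.length()).toLowerCase(Locale.ROOT);
            } else {
                m.terms.add(token);
            }
        }
        return m;
    }
}
```

With the query from the commit message, "mp3 filetype:mp3", this yields the single term "mp3" and the file type "mp3", so the term is searched while the filter restricts results to that extension.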
Michael Peter Christen
bc865ab816
more cleaning (yacy-cora)
2012-09-25 12:19:24 +02:00