cominch
dc468dad01
add content control features for custom filter lists
2012-08-29 09:04:28 +02:00
reger
65d49df865
security fix: clear automatic password only if adminAccountForLocalhost=false to prevent remote access to protected pages after restart.
if adminAccountForLocalhost=true leave automatic password unchanged so access from local host is granted but remote access is prevented from the first second.
2012-08-26 22:28:14 +02:00
Michael Peter Christen
48a82bc705
log queries anonymously from gsa+solr requests
2012-08-22 23:50:40 +02:00
Michael Peter Christen
0cab06c47c
refactoring
2012-08-17 15:52:33 +02:00
Michael Peter Christen
06a78eecb7
code simplification
2012-08-17 14:43:32 +02:00
Michael Peter Christen
18f989dfb1
- refactoring (load -> getMetadata)
- added getDocument to retrieve Solr documents which shall replace
getMetadata
2012-08-17 01:34:38 +02:00
Michael Peter Christen
23226676c6
FOR THE BRAVE.. this is a forced migration to solr which is now ready
for production as a replacement of the metadata-db.
This intermediate release 1.041 will switch on the previously optional
solr index and the old metadata-db will still work as it did before.
Solr+metadata are accessed in mixed mode, no migration is done yet.
If this does not cause a catastrophe by the end of the weekend, we will
do a YaCy 1.1 main release containing this as default.
2012-08-16 18:17:47 +02:00
Michael Peter Christen
b51df6c7e8
- added coordinate storage in solr schema
- fixed shutdown process
- fixed some solr-to-metadata reading
- added a large number of metadata attributes in ViewFile.html
2012-08-13 10:40:04 +02:00
orbiter
39f8eb60c3
tried to prevent calls to bad-hack getSize() method and reduced overhead
of that method a bit.
2012-08-10 18:10:25 +02:00
orbiter
67edfd991c
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-08-05 15:49:48 +02:00
orbiter
d9173ba7ed
added more solr fields to integrate values from URIMetadataRow. All
writes to the Metadata-DB are now also applied to solr. This includes
metadata transfer during search and rwi transfer.
The new/added solr fields are:
## time when resource was loaded
load_date_dt
## date until resource shall be considered as fresh
fresh_date_dt
## id of the host, a 6-byte hash that is part of the document id
host_id_s
## ids of referrer to this document
referrer_id_ss
## the md5 of the raw source
md5_s
## the name of the publisher of the document
publisher_t
## the language used in the document; starts with primary language
language_ss
## an external ranking value
ranking_i
## the size of the raw source
size_i
## number of links to audio resources
audiolinkscount_i
## number of links to video resources
videolinkscount_i
## number of links to application resources
applinkscount_i
2012-08-05 15:49:27 +02:00
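The field names listed in commit d9173ba7ed carry type suffixes. The following is a minimal Java sketch of how those suffixes can be read, assuming they follow the common Solr dynamic-field convention (`_s` string, `_ss` multi-valued string, `_t` text, `_i` integer, `_dt` date). The class `SolrFieldSuffix` and its methods are hypothetical illustrations and not part of YaCy; the actual schema may define these fields differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of YaCy: maps a field name to the type
// its suffix suggests, ASSUMING the usual Solr dynamic-field convention.
final class SolrFieldSuffix {

    static String typeOf(String fieldName) {
        if (fieldName.endsWith("_ss")) return "multi-valued string";
        if (fieldName.endsWith("_dt")) return "date";
        if (fieldName.endsWith("_s")) return "string";
        if (fieldName.endsWith("_t")) return "text";
        if (fieldName.endsWith("_i")) return "integer";
        return "unknown";
    }

    // Field names copied from the commit message; the types are inferred
    // from the suffix only, not read from a real schema.
    static Map<String, String> describeNewFields() {
        String[] fields = {
            "load_date_dt", "fresh_date_dt", "host_id_s", "referrer_id_ss",
            "md5_s", "publisher_t", "language_ss", "ranking_i", "size_i",
            "audiolinkscount_i", "videolinkscount_i", "applinkscount_i"
        };
        Map<String, String> types = new LinkedHashMap<>();
        for (String field : fields) {
            types.put(field, typeOf(field));
        }
        return types;
    }
}
```

Calling `SolrFieldSuffix.describeNewFields()` returns the twelve field names in the order of the commit message, each paired with its inferred type.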
Michael Peter Christen
24d9db1613
snippet retrieval loading processes may use a smaller minimum load time
value than crawling processes. This speeds up the search result
preparation dramatically.
2012-07-30 10:38:23 +02:00
Michael Peter Christen
1687737771
Abstraction of HandleMap and HandleSet
2012-07-27 12:13:53 +02:00
Michael Peter Christen
6f1ddb2519
Moved solr index-add method to the same method where the YaCy index is
written. Also did some code cleanup.
2012-07-25 01:53:47 +02:00
Michael Peter Christen
315d83cfa0
cleanup
2012-07-24 22:16:56 +02:00
Michael Peter Christen
76202f068e
extended abstraction of local and remote solr index using one front-end
for index administration and querying.
2012-07-24 17:23:29 +02:00
Michael Peter Christen
826967513b
changed options in IndexFederated_p to switch on/off parts of the index
individually. The settings are experimental and the values of the
settings will be overwritten when an index migration from urldb to solr
starts.
2012-07-23 16:28:39 +02:00
orbiter
69e743d9e3
- more abstraction for the RWI index as preparation for solr integration
- added options in search index to switch parts of the index on or off
2012-07-22 13:18:45 +02:00
orbiter
05a3ffd03a
patches to ensure that solr connectors are active only if they have a
solr object assigned and vice versa
2012-07-20 11:47:50 +02:00
orbiter
5a3c829872
embedded solr is only initiated if it is activated with
IndexFederated_p.html
2012-07-20 11:40:33 +02:00
Michael Peter Christen
58e7d1952f
reduction of logging to prevent too much IO caused by logging
2012-07-12 02:08:11 +02:00
orbiter
0cbda0b2b8
- replaced all length() == 0 and size() == 0 with isEmpty()
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
2012-07-10 22:59:03 +02:00
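The replacement described in commit 0cbda0b2b8 can be illustrated with a minimal Java sketch. The class `IsEmptyExample` and its method names are hypothetical and only show the before and after forms on standard `String` and `List` types; they are not taken from the YaCy source.

```java
import java.util.List;

// Hypothetical illustration of the size()/length() to isEmpty() rewrite.
final class IsEmptyExample {

    // Before: explicit comparison of the size against zero.
    static boolean hasNoEntriesOld(List<String> entries) {
        return entries.size() == 0;
    }

    // After: the intent is stated directly.
    static boolean hasNoEntries(List<String> entries) {
        return entries.isEmpty();
    }

    // Replaces a check of the form s.length() > 0.
    static boolean hasText(String s) {
        return !s.isEmpty();
    }
}
```

For the standard `String` and collection classes both forms return the same result, so the change affects readability rather than behaviour.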
orbiter
c7afa8bc48
using SwitchboardConstants for solr attributes
2012-07-10 12:01:20 +02:00
Michael Peter Christen
d09d9f2364
filter old peers from bootstrap (now stronger: 60 minutes instead of
240).
2012-07-08 21:25:22 +02:00
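Commit d09d9f2364 tightens the age limit for bootstrap peers from 240 to 60 minutes. A minimal Java sketch of such a filter follows. It assumes that a peer's age is measured from a last-seen timestamp in milliseconds, which the commit message does not state; the class `BootstrapFilter`, its method, and the map layout are hypothetical and not YaCy's actual seed handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not YaCy code: keeps only peers seen recently.
final class BootstrapFilter {

    // Limit named in the commit message (previously 240 minutes).
    static final long MAX_AGE_MINUTES = 60;

    // lastSeenMillis maps a peer name to its last-seen time; measuring age
    // this way is an ASSUMPTION made for this illustration.
    static List<String> freshPeers(Map<String, Long> lastSeenMillis, long nowMillis) {
        long maxAgeMillis = MAX_AGE_MINUTES * 60L * 1000L;
        List<String> fresh = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeenMillis.entrySet()) {
            if (nowMillis - entry.getValue() <= maxAgeMillis) {
                fresh.add(entry.getKey());
            }
        }
        return fresh;
    }
}
```

Passing the current time as a parameter keeps the filter deterministic and easy to test.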
Michael Peter Christen
b0c408788b
made class methods static where possible
2012-07-05 12:38:41 +02:00
Michael Peter Christen
0301aba1e9
removed unused method parameters
2012-07-05 10:23:07 +02:00
Michael Peter Christen
ea10766bfd
cleaned unnecessary nested code
2012-07-05 08:44:39 +02:00
Michael Peter Christen
7249d9c9de
bugfix for concurrent seed loader
2012-07-02 14:37:57 +02:00
Michael Peter Christen
c72d3b12cd
concurrently initialize the seed list during p2p network bootstrap
2012-07-02 14:27:37 +02:00
Michael Peter Christen
1825f165b8
better integration of blacklist according to use case
2012-07-02 13:57:29 +02:00
Michael Peter Christen
c18fa9fa75
Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1
2012-07-02 12:20:57 +02:00
Michael Peter Christen
ce8d4b87d9
fixes for new eclipse 'Juno' warning 'Resource leak'.
2012-07-02 10:27:46 +02:00
Michael Peter Christen
0c345d1559
giving threads names so it's easier to see what's happening during
debugging and within a thread dump
2012-07-02 09:51:43 +02:00
reger
067728bccc
add search result heuristic: adds a crawl job with depth-1 for every displayed search result (crawling every externally linked page of displayed search result pages)
2012-07-01 00:12:20 +02:00
Michael Peter Christen
03280fb161
removed segments-concept and the Segments class:
the segments had been there to create a tenant-infrastructure but were
never used since that was all much too complex. There will be a
replacement using a solr navigation using a segment field in the search
index.
2012-06-28 14:27:29 +02:00
Michael Peter Christen
9116013c64
- allow lazy initialization of solr value (if using 'lazy', then no
0-values and no empty strings are written). This may save a lot of
memory (in ram and on disk) if excessive 0-values or empty strings
appear
- do not allow default boolean values for checkboxes because that does
not make sense: browsers may omit the checkbox attribute name if the box
is not checked. A default value 'true' would not comply with the
semantics of the browser's response.
- add a checkbox in IndexFederated_p for the lazy initialization of solr
fields.
2012-06-27 12:17:58 +02:00
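The lazy initialization described in commit 9116013c64 can be sketched in Java as follows. This is a minimal illustration under stated assumptions: the document is modelled as a plain `Map`, the class `LazySolrFields` and its methods are hypothetical, and treating `null` like an empty value is an added assumption not mentioned in the commit message.

```java
import java.util.Map;

// Hypothetical sketch, not YaCy code: in lazy mode, 0-values and empty
// strings are not written to the document at all.
final class LazySolrFields {

    static void add(Map<String, Object> doc, String field, Object value, boolean lazy) {
        if (lazy && isEmptyValue(value)) {
            return; // skip the field to save memory
        }
        doc.put(field, value);
    }

    private static boolean isEmptyValue(Object value) {
        if (value == null) {
            return true; // ASSUMPTION: null is treated like an empty value
        }
        if (value instanceof String) {
            return ((String) value).isEmpty();
        }
        if (value instanceof Number) {
            return ((Number) value).doubleValue() == 0.0;
        }
        return false;
    }
}
```

With `lazy` set to `false` every value is written as before, so the switch only changes which fields are stored, not their content.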
Michael Peter Christen
96aeb127e3
generalized localhost naming.
this is also a preparation for a better IPv6 implementation.
2012-06-26 00:08:25 +02:00
Michael Peter Christen
77f795756c
fixing redirects and status codes: storing of status code in
ResponseHeader to make it available for late evaluations, like storage
in solr.
2012-06-25 18:17:31 +02:00
Michael Peter Christen
8dd469b9dd
added option to configure the autocommit delay time of solr on-the-fly
2012-06-25 14:59:46 +02:00
Michael Peter Christen
b9dfca4b0a
- fixed IndexFederated Servlet / an embedded Solr can now be selected
- added code stub for an embedded Solr but generation of Solr store is
still commented out (it works but is not yet ready for usage)
2012-06-25 11:34:38 +02:00
Michael Peter Christen
b9d42fd9c8
using com.google.common.io.Files instead of homebrew methods
2012-06-22 11:39:17 +02:00
Michael Peter Christen
a5eb91fa60
refactoring
2012-06-22 00:49:32 +02:00
Michael Peter Christen
0752983fbd
- automatic periodic saving of triplestore
- transaction-safe storage of triplestore
2012-06-17 10:50:12 +02:00
cominch
a95127c9af
Triplestore: initialize per-user triplestores
2012-06-14 11:46:53 +02:00
Michael Peter Christen
4ee6fb1de9
added missing blacklist dht cache storage (maybe due to mistakes in
cherry picking)
2012-06-11 00:38:02 +02:00
Roland 'Quix0r' Haeder
edaa09b9b1
Rewrote all String blacklist types to enum 'BlacklistType', closes bug
#143
Conflicts:
htroot/Supporter.java
htroot/yacy/crawlReceipt.java
htroot/yacy/transferRWI.java
htroot/yacy/transferURL.java
source/de/anomic/crawler/CrawlStacker.java
source/de/anomic/data/ListManager.java
source/net/yacy/peers/Protocol.java
source/net/yacy/repository/Blacklist.java
source/net/yacy/repository/LoaderDispatcher.java
source/net/yacy/search/Switchboard.java
source/net/yacy/search/index/MetadataRepository.java
source/net/yacy/search/index/Segment.java
source/net/yacy/search/query/RWIProcess.java
source/net/yacy/search/snippet/MediaSnippet.java
2012-06-11 00:17:30 +02:00
Roland 'Quix0r' Haeder
af5a597e47
Scroogle is not coming back, remove dead code
Conflicts:
source/net/yacy/search/Switchboard.java
2012-06-10 23:38:41 +02:00
Michael Peter Christen
cde20911bb
saved a bit more ram using UTF8 String compression for OpenGeoDB and
Geonames data files.
2012-06-09 10:07:11 +02:00
Michael Peter Christen
2280a7b276
- changed initialization order to prefer allocation of memory for table
files first
- bugfixes in memory amount calculation
2012-06-09 09:05:47 +02:00
Michael Peter Christen
0746308bc2
only the metadata tables shall be able to use the tail cache
2012-06-08 18:36:11 +02:00