Michael Peter Christen
4815713ec7
added synchronization to solr server requests since lucene is not
thread-safe. We experienced problems as described in
http://stackoverflow.com/questions/5327978/lockobtainfailedexception-updating-lucene-search-index-using-solr
2012-08-31 15:16:33 +02:00
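The synchronization described in the commit above can be pictured with a minimal sketch. It is not the actual YaCy connector code: `UnsafeIndex` and `SynchronizedIndexConnector` are hypothetical names, and the sketch only assumes that every request is routed through one monitor so that a single thread touches the index at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an index that must not be used by two threads at once.
final class UnsafeIndex {
    private final List<String> docs = new ArrayList<>();
    void add(String doc) { docs.add(doc); }
    int size() { return docs.size(); }
}

// Hypothetical connector: every request takes the same lock before touching the index.
final class SynchronizedIndexConnector {
    private final UnsafeIndex index = new UnsafeIndex();
    private final Object lock = new Object();

    void add(String doc) {
        synchronized (lock) { index.add(doc); }
    }

    int size() {
        synchronized (lock) { return index.size(); }
    }

    // Starts several writer threads and returns the document count after all have finished.
    static int demo(int threads, int perThread) {
        SynchronizedIndexConnector connector = new SynchronizedIndexConnector();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) connector.add("doc");
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return connector.size();
    }
}
```

A single lock is the simplest choice; it trades throughput for safety, which matches the intent stated in the commit message.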
Michael Peter Christen
f75b3f8a47
added more patches to work without RWI data structure
2012-08-31 14:35:56 +02:00
Michael Peter Christen
a427a68bac
removed many warnings
2012-08-31 14:07:33 +02:00
Michael Peter Christen
c72c435517
- moved the gsa search interface from /gsa/searchresult? to /gsa/search?
- fixed the NB field data
2012-08-31 14:00:53 +02:00
Michael Peter Christen
31d4d38804
- extended the solr interface by a references-by-word-count method
- reduced danger that a non-existing RWI database causes NPEs
- added Solr queries to did-you-mean: this makes it possible that our
did-you-mean algorithm works together with only Solr and without RWIs
2012-08-31 13:03:00 +02:00
Michael Peter Christen
528d6763fa
- added new solr fields:
title_count_i, title_chars_val, title_words_val
description_count_i, description_chars_val, description_words_val
- added many asserts to ensure data type correctness from YaCy to Solr
and vice versa
- made many fixes according to new findings from these asserts (!)
2012-08-31 10:30:43 +02:00
Michael Peter Christen
3142e675e8
fixed problems with GSA api:
- better FS attribute
- highlighting of searched words in title
2012-08-29 16:48:53 +02:00
Michael Peter Christen
3b19fe7b52
- fixed num parameter in GSA api
- changed FS attribute in GSA api
2012-08-29 16:28:32 +02:00
Michael Peter Christen
2ddc33646a
added new field for solr:
url_paths_sxt
url_parameter_i
url_parameter_key_sxt
url_parameter_value_sxt
url_chars_i
2012-08-29 16:11:23 +02:00
Michael Peter Christen
75d5e3475d
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-08-29 10:13:51 +02:00
cominch
dc468dad01
add content control features for custom filter lists
2012-08-29 09:04:28 +02:00
Michael Peter Christen
316b5fe116
- added a solr type definition verifier
- fixed type definition found by the verifier
- added multivalue-string fields for solr with extension 'sxt'
- added multivalue-integer fields for solr with extension 'val'
- renamed some solr attributes from txt to sxt
- changed solr query line to an explicit AND/OR structure
- added a country code second level domain list to Domains class; with
parser
- added a host string parser to get domain class name, country-code
second-level domain and subdomain out of it
- removed old coordinate attributes
2012-08-28 16:58:06 +02:00
orbiter
a3d5959981
Merge commit '65d49df865f60511d22d86fb15c33a082176e7ab'
2012-08-27 16:56:22 +02:00
Michael Peter Christen
4521d63c92
added boosts to solr search queries
2012-08-27 15:25:25 +02:00
Michael Peter Christen
e8acd542b5
- added faceted drill-down for host and geolocation to solr queries
- added a new geolocation field to index schema, the old values are
migrated if possible
2012-08-27 14:41:33 +02:00
Michael Peter Christen
f00168ecc5
added gsa result attribute 'has'
2012-08-27 12:15:42 +02:00
reger
65d49df865
security fix: clear automatic password only if adminAccountForLocalhost=false to prevent remote access to protected pages after restart.
if adminAccountForLocalhost=true leave automatic password unchanged so access from local host is granted but remote access is prevented from the 1st second.
2012-08-26 22:28:14 +02:00
orbiter
2094df2e4e
- correct length computation for BStringObject (bugfix suggested by
apfelmaennchen)
- using ASCII for string conversion for Strings generated from Integer
2012-08-26 17:46:40 +02:00
orbiter
67f2866cd0
small fixes
2012-08-24 21:44:22 +02:00
orbiter
ce156a01ba
Merge commit 'c2341a175fdd755a34965ff63c7ea437b380352d'
2012-08-24 18:24:24 +02:00
David Rubio
c2341a175f
Fixed a bug that prevented YaCy from indexing files with non-ASCII filenames on FTP servers.
Previously YaCy could read file listings in UTF-8, but couldn't send commands to the FTP server in UTF-8 (the second byte of every multi-byte character was ignored), which caused a lot of errors on the server side.
Now it handles UTF-8 correctly.
2012-08-24 17:45:14 +02:00
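The kind of encoding problem described in the commit above can be illustrated with a short sketch. It is not the YaCy FTP client: `FtpCommandEncoding` and both methods are hypothetical, and the lossy variant only shows one plausible way in which bytes of a multi-byte character get lost (one byte written per `char`).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that turns an FTP command line into the bytes sent on the control connection.
final class FtpCommandEncoding {

    // Lossy variant: writes exactly one byte per char, so non-ASCII characters are damaged.
    static byte[] encodeLossy(String command) {
        String line = command + "\r\n";
        byte[] out = new byte[line.length()];
        for (int i = 0; i < line.length(); i++) {
            out[i] = (byte) line.charAt(i);
        }
        return out;
    }

    // UTF-8 variant: multi-byte characters keep all of their bytes.
    static byte[] encodeUtf8(String command) {
        return (command + "\r\n").getBytes(StandardCharsets.UTF_8);
    }
}
```

For a command such as `RETR é.txt`, the UTF-8 variant produces one byte more than the lossy one, because `é` needs two bytes in UTF-8.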
orbiter
3ebc4264c5
fixed concurrent query
2012-08-24 14:15:40 +02:00
orbiter
29171e2f6c
fixed generation of ontologies from index enumerations
2012-08-24 14:13:42 +02:00
orbiter
7cd302de3e
omit xml parsing when using the embedded solr server
2012-08-24 12:18:30 +02:00
orbiter
787e1c6836
added the
QueryResponse query(SolrParams params)
method to the SolrServerConnector which is necessary to use facets in
solr search.
2012-08-23 11:53:54 +02:00
orbiter
01a63ef595
redesign of YaCySchema and SolrDoc handling
2012-08-23 09:51:45 +02:00
orbiter
479bfca571
refactoring
2012-08-23 09:30:11 +02:00
Michael Peter Christen
48a82bc705
log queries anonymously from gsa+solr requests
2012-08-22 23:50:40 +02:00
Michael Peter Christen
ab6ec4ec52
added snippet computation to solr/rss and gsa result writer
2012-08-22 17:37:34 +02:00
Michael Peter Christen
4716546ef5
- reduced memory usage in index transmission using a transformation of
Node to Row objects
- removed peerDeparture in solr remote search in case that peer does not
answer (this may be normal because it is allowed to switch this off)
2012-08-22 16:30:33 +02:00
Michael Peter Christen
06b0081fdc
fix for NPE during host navigation computation
2012-08-22 01:55:39 +02:00
Michael Peter Christen
feb99bc291
fixed GSA format
2012-08-22 00:48:37 +02:00
Michael Peter Christen
653645c1cf
corrected solr query syntax
2012-08-22 00:48:03 +02:00
Michael Peter Christen
08ae142a3d
- enhanced caching after search queries to solr
- reduced caching after short memory
2012-08-22 00:31:14 +02:00
orbiter
716ea0cfe2
sorted the solr schema into mandatory and optional fields; reduced
number of used fields to reduce solr index size
2012-08-21 23:52:56 +02:00
orbiter
9b8c8c0f47
fix from gaston in
http://forum.yacy-websuche.de/viewtopic.php?p=26909#p26909
2012-08-21 21:03:26 +02:00
orbiter
acb9f04e80
removed unused classes
2012-08-21 18:18:30 +02:00
Michael Peter Christen
0ad52ac4c3
gsa bugfix for date parser
2012-08-21 02:39:28 +02:00
Michael Peter Christen
3ce4c2f937
fixes for gsa result format
2012-08-21 01:57:46 +02:00
Michael Peter Christen
67d235fae9
added gzip encoding to solr2sor http interface, client side (server
already works)
2012-08-20 16:53:21 +02:00
Michael Peter Christen
a049761e0c
fixed double-check
2012-08-20 14:16:37 +02:00
Michael Peter Christen
f42a57cd7d
gsa format update
2012-08-20 12:50:51 +02:00
Michael Peter Christen
b3aad6cc35
bugfix for remote search when search is done to solr
2012-08-20 12:21:36 +02:00
Michael Peter Christen
ff3eaa21b0
added remote search to solr on YaCy peers!
- when doing a remote search, node peers are selected for solr queries
- the solr query is done concurrently to the standard YaCy rwi search
- the solr search result is fed into the same data structure that
prepares the rwi search result
- the same remote search that is done to several outside peers is done to
the local solr index
- the search process works now also without any 'old' RWI data using
solr
2012-08-20 12:16:11 +02:00
Michael Peter Christen
a06123aec6
more abstraction and less parameter overhead for remote search
2012-08-20 01:29:15 +02:00
Michael Peter Christen
f00733186b
code simplifications
2012-08-19 13:17:03 +02:00
Michael Peter Christen
755f5e76cf
removed strange assert statements and simplified code in metadata
transformation
2012-08-19 08:44:39 +02:00
Michael Peter Christen
db0d438709
fix for http://bugs.yacy.net/view.php?id=206
2012-08-19 08:43:56 +02:00
orbiter
404b0aab09
refactoring in remote search and stub for remote node peer selection
2012-08-18 23:59:25 +02:00
orbiter
d7ea45f698
- get nice text_t values from metadata conversions that are stored into
solr as fulltext search index.
- added slow migration from old metadata to solr index entries: each
entry from the old metadata is removed from that data structure and
written into solr.
2012-08-18 19:36:21 +02:00
orbiter
99ef57f103
reduced sleep times
2012-08-18 17:48:20 +02:00
orbiter
780f8974e7
added remaining iteration methods for solr in fulltext class
2012-08-18 15:39:14 +02:00
orbiter
acd2dc3575
hack to remove StringBuilder overhead in query construction
2012-08-18 14:22:00 +02:00
orbiter
ee01c12e56
fixes for putDocument and putMetadata
2012-08-18 13:05:27 +02:00
orbiter
cc47a0876e
reverted bf55f69176
to have a fall-back option in case that memory problems as reported in
http://forum.yacy-websuche.de/viewtopic.php?p=26901#p26901
for full-solr installation are too strong and we have to work with an
a 'small memory footprint' peer system.
2012-08-18 10:28:40 +02:00
Michael Peter Christen
0904afe8fb
added concurrent iterator methods to the solr connectors
2012-08-17 18:22:56 +02:00
Michael Peter Christen
d54b80327a
refactoring
2012-08-17 17:28:27 +02:00
Michael Peter Christen
f9fc5cfaba
better check for bad urls in url transmission
2012-08-17 17:17:00 +02:00
Michael Peter Christen
d39463a85c
added deleteByQuery to solr connectors
2012-08-17 17:05:46 +02:00
Michael Peter Christen
0cab06c47c
refactoring
2012-08-17 15:52:33 +02:00
Michael Peter Christen
bf55f69176
removed write methods to old metadata file type; all metadata now goes
to solr
2012-08-17 15:46:26 +02:00
Michael Peter Christen
40c0856489
refactoring
2012-08-17 15:33:02 +02:00
Michael Peter Christen
06a78eecb7
code simplification
2012-08-17 14:43:32 +02:00
Michael Peter Christen
54bea21c02
bugfix for solr connector, possibly a cause for
http://forum.yacy-websuche.de/viewtopic.php?p=26893#p26893
2012-08-17 14:34:31 +02:00
Michael Peter Christen
9bece5ac5f
enhanced snippet fetch - removed a bug that caused documents to be
parsed even if a solr text was available
2012-08-17 14:22:07 +02:00
Michael Peter Christen
18f989dfb1
- refactoring (load -> getMetadata)
- added getDocument to retrieve Solr documents which shall replace
getMetadata
2012-08-17 01:34:38 +02:00
Michael Peter Christen
395b78a0d8
using the solr search index to concurrently search within solr and the
rwis during local search requests.
2012-08-17 01:21:56 +02:00
Michael Peter Christen
6197caf698
added clear-text search words in query params
2012-08-16 23:05:37 +02:00
Michael Peter Christen
23226676c6
FOR THE BRAVE.. this is a forced migration to solr which is now ready
for production as a replacement of the metadata-db.
This intermediate release 1.041 will switch on the previously optional
solr index and the old metadata-db will still work as it did before.
Solr+metadata are accessed in mixed mode, no migration is done yet.
If this does not cause a catastrophe until the end of the weekend, we will
do a YaCy 1.1 main release containing this as default.
2012-08-16 18:17:47 +02:00
Michael Peter Christen
a1b2c9a67d
doctype2mime fix, influences metadata conversion between old metadata
and solr
2012-08-16 17:49:35 +02:00
Michael Peter Christen
a16206e38b
more attempts to clean the index (cleaning is faster then)
2012-08-16 17:24:25 +02:00
Michael Peter Christen
703f427303
fixed some peer-ping connection details
- larger time-out
- removed too old seedlist
- fixed a bug in connection test
2012-08-16 17:11:54 +02:00
Michael Peter Christen
597bb76e4f
get the peer location more quickly
2012-08-16 16:28:57 +02:00
Michael Peter Christen
1641835fef
replaced yacy xml encoding by solr xml encoding
2012-08-14 13:29:11 +02:00
Michael Peter Christen
89fe13e73d
enhanced GSA and RSS output format: corrected date, added some missing
fields, added xml encoding for utf8
2012-08-14 13:19:29 +02:00
Michael Peter Christen
ea49a8aa8c
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-08-14 12:40:44 +02:00
Michael Peter Christen
d988ba50cf
added a very rudimentary, incomplete, non-verified GSA response writer
for solr. Try this:
http://localhost:8090/gsa/searchresult?q=pdf&site=col1&num=10
2012-08-14 12:40:26 +02:00
Michael Peter Christen
aab0b680c3
- added xslt support for solr result formats.
try e.g.
http://localhost:8090/solr/select?q=*:*&start=0&rows=10&wt=xslt&tr=json.xsl
- added servlet-side mime-type configuration for streamed servlets. this
is used for the result formatters in solr result formats
2012-08-14 11:12:50 +02:00
cominch
e2119f4e76
augmented browsing: replace htmlparser by jsoup, which is more stable
and reliable
2012-08-14 10:06:12 +02:00
Michael Peter Christen
9448d9a8a2
ups
2012-08-13 14:01:45 +02:00
Michael Peter Christen
e5ef840f40
- renamed DoubleSolrConnector to MirrorSolrConnector and added a
hit/miss/document cache to the MirrorSolrConnector.
- more abstraction to SolrDocument in Connector interface
- bugfixes in Solr field reader
2012-08-13 13:32:32 +02:00
Michael Peter Christen
94a334f128
another fix to the Solr metadata reading process and to the shutdown
process
2012-08-13 11:13:53 +02:00
Michael Peter Christen
b51df6c7e8
- added coordinate storage in solr schema
- fixed shutdown process
- fixed some solr-to-metadata reading
- added a large number of metadata attributes in ViewFile.html
2012-08-13 10:40:04 +02:00
Michael Peter Christen
da851c6071
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-08-11 01:21:18 +02:00
Michael Peter Christen
bd4f03bc85
removed unused class
2012-08-11 01:05:40 +02:00
orbiter
39f8eb60c3
tried to prevent calls to bad-hack getSize() method and reduced overhead
of that method a bit.
2012-08-10 18:10:25 +02:00
orbiter
e816b88b55
changed behaviour of metadata storage: in case that any solr is
attached, the metadata is not written to the metadata-db, even if it is
enabled but instead to solr. This prevents that metadata is written in
two store systems at the same time. It is also the next step to migrate
the current metadata-db to solr.
2012-08-10 15:39:10 +02:00
orbiter
2571e0d47a
removed unused classes
2012-08-10 14:47:44 +02:00
Michael Peter Christen
f9c0e6e950
- Implemented and integrated the URIMetadataNode object which is a
metadata representation from the solr index. This shall replace metadata
from the built-in database in the future.
- added the Solr-driven metadata into the search index of YaCy which
makes it now possible to run YaCy without the old metadata index. This
is a major step forward to a full migration to Solr.
2012-08-10 13:26:51 +02:00
Michael Peter Christen
b2b480fff2
more abstraction of the YaCySchema -> Opensearch matching process
2012-08-10 09:48:15 +02:00
Michael Peter Christen
24462e9baa
set the title every time, it is possible that it has changed
2012-08-10 07:51:57 +02:00
Michael Peter Christen
dcc72799c4
better abstraction for result writers using controlled vocabularies and
URIRefs
2012-08-10 07:45:43 +02:00
Michael Peter Christen
136fcb1ad9
refactoring
2012-08-10 06:47:13 +02:00
Michael Peter Christen
a12f693ec9
added two response writers for embedded solr interface:
a rss/opensearch writer and an enhanced solr xml writer.
The enhanced solr writer has less configuration overhead than the
original writer and should be slightly faster. The rss/opensearch writer
is at this time slightly incomplete compared with the already existing
rss search result from YaCy and also snippets are missing at this time.
To test the new interface, open for example:
http://localhost:8090/solr/select?wt=rss&q=olympia
The wt-codes for the new result writers are:
wt=rss for opensearch
wt=exml for the enhanced solr xml writer.
Additionally, the SRU search parameters had been added to the solr
interface which can now also be used for a normal solr/xml search.
2012-08-09 18:06:48 +02:00
Michael Peter Christen
bca4a16603
replaced the multivalue generic string field name suffix _ss by _txt
because _ss is not part of the standard solr example schema.
2012-08-06 17:58:09 +02:00
orbiter
67edfd991c
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-08-05 15:49:48 +02:00
orbiter
d9173ba7ed
added more solr fields to integrate values from URIMetadataRow. All
writings to the Metadata-DB are now also done to solr. This includes
metadata transfer during search and rwi transfer.
The new/added solr fields are:
## time when resource was loaded
load_date_dt
## date until resource shall be considered as fresh
fresh_date_dt
## id of the host, a 6-byte hash that is part of the document id
host_id_s
## ids of referrer to this document
referrer_id_ss
## the md5 of the raw source
md5_s
## the name of the publisher of the document
publisher_t
## the language used in the document; starts with primary language
language_ss
## an external ranking value
ranking_i
## the size of the raw source
size_i
## number of links to audio resources
audiolinkscount_i
## number of links to video resources
videolinkscount_i
## number of links to application resources
applinkscount_i
2012-08-05 15:49:27 +02:00
Michael Peter Christen
3276508d1b
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-31 23:49:56 +02:00
Michael Peter Christen
3ce04cecf3
bad hack to prevent a bug appearing in solr
2012-07-31 23:49:07 +02:00
sixcooler
f32aa9a49c
prevent merge of blobs that can't be handled in memory
2012-07-31 23:23:16 +02:00
Michael Peter Christen
bbd242afb4
fix for a NPE
2012-07-30 14:51:01 +02:00
Michael Peter Christen
24d9db1613
snippet retrieval loading processes may use a smaller minimum load time
value than crawling processes. This speeds up the search result
preparation dramatically.
2012-07-30 10:38:23 +02:00
Michael Peter Christen
ef488a15f7
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-27 12:14:24 +02:00
Michael Peter Christen
1687737771
Abstraction of HandleMap and HandleSet
2012-07-27 12:13:53 +02:00
sixcooler
76b037a20a
check content domain fix:
search image/media should not show pages containing image/media
search text should show all/text but image/media
2012-07-27 04:11:52 +02:00
Michael Peter Christen
e432bb9cd9
better calculation of possible saving in HeapReader index data structure
2012-07-26 10:05:06 +02:00
Michael Peter Christen
9549984c65
documentation/comments
2012-07-25 21:34:23 +02:00
Michael Peter Christen
3bcd9d622b
cleaned up classes and methods which are either superfluous at this time
or will be superfluous or subject of complete redesign after the
migration to solr. Removing these things now will make the transition to
solr more simple.
2012-07-25 14:31:54 +02:00
Michael Peter Christen
6f1ddb2519
Moved solr index-add method to the same method where the YaCy index is
written. Also done some code-cleanup.
2012-07-25 01:53:47 +02:00
Michael Peter Christen
315d83cfa0
cleanup
2012-07-24 22:16:56 +02:00
Michael Peter Christen
1f41d9c6f5
bugfix for a NPE
2012-07-24 17:29:32 +02:00
Michael Peter Christen
76202f068e
extended abstraction of local and remote solr index using one front-end
for index administration and querying.
2012-07-24 17:23:29 +02:00
Michael Peter Christen
d3f243e2e1
fixed node type calculation for principal peers
2012-07-23 23:40:50 +02:00
Michael Peter Christen
826967513b
changed options in IndexFederated_p to switch on/off parts of the index
individually. The settings are experimental and the values of the
settings will be overwritten when an index migration from urldb to solr
starts.
2012-07-23 16:28:39 +02:00
Michael Peter Christen
cba4ab862e
fix for http://bugs.yacy.net/view.php?id=202
2012-07-23 00:36:18 +02:00
orbiter
69e743d9e3
- more abstraction for the RWI index as preparation for solr integration
- added options in search index to switch parts of the index on or off
2012-07-22 13:18:45 +02:00
orbiter
05a3ffd03a
patches to ensure that solr connectors are active only if they have a
solr object assigned and vice versa
2012-07-20 11:47:50 +02:00
orbiter
5a3c829872
embedded solr is only initiated if it is activated with
IndexFederated_p.html
2012-07-20 11:40:33 +02:00
Michael Peter Christen
97b7bcf2a6
added a solr search index
- by default, a (empty) solr storage instance is created at
SEGMENTS/solr_36
- the index is written if in /IndexFederated_p.html the flag "embedded
solr search index" is switched on
- a standard solr query interface is available now with a new servlet at
http://127.0.0.1:8090/solr/select
To test this, do the following:
- switch to webportal mode
- switch on the feature as described
- do a crawl. this fills the solr index. The normal YaCy search will NOT
work now!
- do a solr query, like:
http://127.0.0.1:8090/solr/select?q=*:*
http://127.0.0.1:8090/solr/select?q=text_t:Help
play with different search fields as you can see in
/IndexFederated_p.html
You can use the standard solr query attributes as described in
http://wiki.apache.org/solr/SearchHandler
2012-07-19 11:34:05 +02:00
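The test queries in the commit above can also be built programmatically. The sketch below is an assumption-laden illustration, not part of YaCy: `SolrSelectUrl` is a hypothetical helper, the base URL is the one quoted in the commit message, and `start` and `rows` are the standard Solr paging parameters.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that assembles a URL for the /solr/select servlet.
final class SolrSelectUrl {

    // Percent-encodes the query string and appends paging parameters.
    static String build(String base, String query, int start, int rows) {
        return base
            + "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
            + "&start=" + start
            + "&rows=" + rows;
    }
}
```

Encoding the query keeps characters such as `:` or spaces from breaking the URL; a field query like `text_t:Help` becomes `q=text_t%3AHelp`.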
Michael Peter Christen
f0a079ac9f
allow larger log entries
2012-07-14 16:28:14 +02:00
Michael Peter Christen
784a4abb18
enhancement in internal data organization which should generate less
synchronizations in database access
2012-07-14 13:09:44 +02:00
Michael Peter Christen
f78ce93a80
collection of speed and memory saving hacks
2012-07-13 21:15:38 +02:00
orbiter
c00a3cf74d
less usage of generic logger to avoid logger generation overhead
2012-07-12 19:54:54 +02:00
orbiter
a196f24f60
prevent enqueueing of non-loggable logging entries
2012-07-12 19:42:42 +02:00
orbiter
482afed07c
reduced logging overhead (a bit)
2012-07-12 19:23:40 +02:00
orbiter
e76159040b
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-12 11:14:04 +02:00
orbiter
bbfa497a3c
replaced more size() > 0 by !isEmpty()
2012-07-12 11:12:21 +02:00
Michael Peter Christen
58e7d1952f
reduction of logging to prevent too much IO caused by logging
2012-07-12 02:08:11 +02:00
Michael Peter Christen
83da68c4c1
fixed a memory leak inside the logger which appeared if the log was
written faster than the logger is able to print this out to its out
stream. A very large collection of unwritten log outputs had been seen
during strong crawling. The new ArrayBlockingQueue is limited to prevent
this case.
2012-07-12 01:23:04 +02:00
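How a capacity-limited `ArrayBlockingQueue` bounds the backlog of unwritten log lines can be sketched as follows. This is not the YaCy logger: `BoundedLogBuffer` is a hypothetical class, and dropping entries when the queue is full is an assumption of this sketch (the commit message does not say whether producers block or entries are discarded).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical log buffer whose backlog can never exceed a fixed capacity.
final class BoundedLogBuffer {
    private final BlockingQueue<String> queue;
    private long dropped = 0;

    BoundedLogBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Non-blocking enqueue: returns false and counts a drop when the queue is full.
    synchronized boolean log(String line) {
        boolean accepted = queue.offer(line);
        if (!accepted) dropped++;
        return accepted;
    }

    // Called by the writer thread; returns null when nothing is pending.
    String poll() { return queue.poll(); }

    int pending() { return queue.size(); }

    synchronized long dropped() { return dropped; }
}
```

With `offer`, a fast producer loses log lines instead of exhausting memory; using `put` instead would make the producer wait for the writer.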
orbiter
0cbda0b2b8
- replaced all length() == 0 and size() == 0 with isEmpty()
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
2012-07-10 22:59:03 +02:00
orbiter
28b30231c3
fix for url matcher of multiple amp& in an url, see:
http://forum.yacy-websuche.de/viewtopic.php?f=8&t=4439&p=26650#p26650
2012-07-10 17:39:56 +02:00
Roland 'Quix0r' Haeder
aef9dd0350
- removed cleaning of blacklist cache on startup
- added cleaning of blacklist cache if cache is modified in interface
- extended cache saving to all cache types
- moved cache location to DATA/LISTS
- fixed static file path which was relative to the application path but
should be relative to data path - which is different in debian and mac
implementations
2012-07-10 13:08:16 +02:00
orbiter
c7afa8bc48
using SwitchboardConstants for solr attributes
2012-07-10 12:01:20 +02:00
orbiter
c6d8950651
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2012-07-09 14:33:11 +02:00
orbiter
5f3b8dc040
fix for RSS reader
2012-07-09 14:32:35 +02:00
orbiter
62202e2d71
refactoring of query attribute variable names for better consistency
with (next) stored query words
2012-07-09 11:14:50 +02:00
Michael Peter Christen
1addbc792c
use less memory for md5 cache
2012-07-08 22:05:04 +02:00
Michael Peter Christen
f32de94723
more logging
2012-07-08 22:04:36 +02:00
Michael Peter Christen
d09d9f2364
filter old peers from bootstrap (now stronger: 60 minutes instead of
240).
2012-07-08 21:25:22 +02:00
Michael Peter Christen
434ee90c59
added classification for control file types which shall not be loaded
but placed onto the noload-queue
2012-07-08 21:17:33 +02:00
Michael Peter Christen
a90bcb48f6
added webm
2012-07-08 17:58:05 +02:00
Michael Peter Christen
801972fe6f
fix for url camel case parser and sentence reader
2012-07-08 16:48:09 +02:00
Michael Peter Christen
fbc1a2030d
fix for sitemap importer: can now also import very large sitemaps within
small memory configurations
2012-07-08 16:11:50 +02:00
Michael Peter Christen
92731e5287
fix for sevenzip parser
2012-07-08 16:11:19 +02:00
Michael Peter Christen
45641b0c23
catch and log a warning in RasterPlotter
2012-07-06 09:21:12 +02:00
Michael Peter Christen
8efc1c1078
- fixed a memory leak (or bad usage) during parsing/snippet fetch
- more logging for errors
2012-07-06 09:05:41 +02:00
Michael Peter Christen
c3db015410
prevent loading of content from the cache when retrieval with IFFRESH is
used and cache is stale. Should speed up snippet generation when cache
strategy is IFFRESH.
2012-07-06 08:29:41 +02:00
Michael Peter Christen
b1e7c11fba
fix for pattern matcher in html parser
2012-07-05 14:24:03 +02:00
Michael Peter Christen
8a6edc0031
fix for solr shutdown
2012-07-05 14:23:43 +02:00
Michael Peter Christen
b8bcc06283
fix for urls beginning with "//"
2012-07-05 14:23:29 +02:00
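Links that begin with "//" are scheme-relative references: they take the scheme of the page they appear on. The sketch below shows that resolution using the JDK's `java.net.URI`, which is an assumption of this example and not YaCy's own URL class; `SchemeRelativeUrl` is a hypothetical name.

```java
import java.net.URI;

// Hypothetical helper that resolves a link found in a page against the page URL.
final class SchemeRelativeUrl {

    // For an href starting with "//", the result keeps the base scheme and takes host and path from the href.
    static String resolve(String baseUrl, String href) {
        return URI.create(baseUrl).resolve(href).toString();
    }
}
```

Resolving `//cdn.example.org/x.js` against `https://example.org/a/b.html` gives `https://cdn.example.org/x.js`.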