Roland Haeder
b58ca8622d
Some cleanups:
- added SKINS_PATH_DEFAULT in the same way LISTS_PATH_DEFAULT was added
- Added 'final' keyword to a string
2013-07-27 10:13:57 +02:00
Roland Haeder
e2ee412160
Use SwitchboardConstants.LISTS_PATH_DEFAULT instead of 'DATA/LISTS'
Conflicts:
htroot/api/blacklists_p.java
2013-07-27 10:12:58 +02:00
Roland Haeder
ae19401af0
Removed another duplicate occurrence of Blacklist.BLACKLIST_FILENAME_FILTER
2013-07-27 09:59:09 +02:00
Roland Haeder
59225487ea
Fix for blacklist export, also applied the filename filter here
2013-07-27 09:58:56 +02:00
Roland Haeder
952fc0e7bd
Removed superfluous check for files ending '.black' as the previous commit already excluded all other files (e.g. .ser dumps), added logging in catch-all block
2013-07-27 09:58:38 +02:00
Roland Haeder
060fec1577
Reuse Blacklist.BLACKLIST_FILENAME_FILTER
2013-07-27 09:57:50 +02:00
Roland Haeder
29049c71f5
Possible fix for ticket http://bugs.yacy.net/view.php?id=270 , the filter for only including *.black must be applied
2013-07-27 09:57:07 +02:00
Roland Haeder
7263bb82fb
Fix for NPE on shutdown:
java.lang.NullPointerException
at net.yacy.search.Switchboard.storeDocumentIndex(Switchboard.java:2732)
at net.yacy.search.Switchboard.access00(Switchboard.java:207)
at net.yacy.search.Switchboard.run(Switchboard.java:3049)
2013-07-27 09:55:43 +02:00
Roland Haeder
13433d41a1
Log this exception better
Conflicts:
source/net/yacy/kelondro/blob/Tables.java
2013-07-27 09:54:51 +02:00
orbiter
080d80c9de
do not write an empty failreason if there is no failure. Because
of the lazy instantiation rule this value was not actually written, but
if lazy instantiation is switched on, then this causes all crawl
starts to delete all crawl-start-hosts completely because this looks for
filled error reasons.
2013-07-26 17:53:28 +02:00
Michael Peter Christen
61e015268b
fix in forced deletion: forced commit needed
2013-07-25 09:53:19 +02:00
Michael Peter Christen
83e2921b39
new test case for http://bugs.yacy.net/view.php?id=141
2013-07-25 09:31:48 +02:00
Michael Peter Christen
304aacb2cc
fix for http://bugs.yacy.net/view.php?id=267
2013-07-25 09:26:24 +02:00
Michael Peter Christen
c3b2301b2f
fix for http://bugs.yacy.net/view.php?id=268
2013-07-25 09:21:37 +02:00
reger
aa1a1f1d2c
- small adjustment to make sure genericParser is tried last
-- for some documents (such as .ps and .rdf files) genericParser grabs the document instead of the specific available parser, because the first parser to try is picked in no particular order
- remove redundant file extension registration
2013-07-23 20:24:13 +02:00
orbiter
3e901dcb06
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2013-07-23 19:33:07 +02:00
orbiter
f50b596e0b
do not run dht distribution if system load is over 2.5
2013-07-23 19:32:32 +02:00
orbiter
9c681cc00d
added segment sizes, postprocessing status and cpu load to crawler
monitor
2013-07-23 19:10:11 +02:00
orbiter
86b514cf46
added load info to status_p.xml
2013-07-23 18:20:07 +02:00
orbiter
056b42f5aa
- added information about segment count to status_p.xml
- also moved this information from the old index structure, which is
still in use for the RWI/DHT index to that front-end
2013-07-23 18:03:33 +02:00
orbiter
6fb2811e68
fixes for problems with remote solr and non-activated webgraph index
2013-07-23 16:46:44 +02:00
sixcooler
af740f3058
changed optimization to a segment-size of index-size/5.000.000
+ one if not idle
+ one (and force) if postprocessing
2013-07-23 14:21:12 +02:00
Michael Peter Christen
336f86394c
replaced StringBuffer with StringBuilder
2013-07-23 12:21:27 +02:00
Michael Peter Christen
aeac2fb763
replaced more containsKey() -> get() usages by a simple get(), followed
by a test for NULL. This should increase the application speed and
reduce the lookup time for the affected methods by 50%, because each
key is now looked up once instead of twice
2013-07-23 12:16:51 +02:00
orbiter
5364c4dcc9
delayed first peer-ping to send the first ping out after the http got
up; if the ping comes before the http is up, it cannot be recognized as
senior peer (if at all). See also: http://bugs.yacy.net/view.php?id=266
2013-07-22 18:21:37 +02:00
orbiter
e24016e30a
added the property federated.service.solr.indexing.timeout to yacy.init
to provide a configurable time-out for solr; see also:
http://bugs.yacy.net/view.php?id=254
2013-07-22 17:45:12 +02:00
orbiter
c124037f19
removed forced non-soft commits to prevent index fragmentation
2013-07-22 17:28:20 +02:00
Michael Peter Christen
31483c47e1
fixed problem with remote luke requests
2013-07-22 15:55:20 +02:00
Michael Peter Christen
c15aa758dc
removed failreason_t removal patch because that causes too much
confusion using an external solr. to clean up the index after a schema
change, use the index cleaner function from the online servlet
2013-07-22 14:17:38 +02:00
reger
2b7a38640a
extend content type detection on file extension for .tif .tiff .htm
2013-07-21 22:57:21 +02:00
Michael Peter Christen
ac1aad5064
added a getSegmentCount method and use it to disable optimize if wanted
current segment count is below optimization level
2013-07-18 14:31:42 +02:00
Michael Peter Christen
36035e0a0a
- used reger's LukeRequest to generalize the index info in
SolrServerConnector
- used the LukeRequest in SolrServerConnector to replace the index size
method by a getNumDocs request to a LukeRequest result
2013-07-18 13:26:07 +02:00
Michael Peter Christen
39fceb5ccf
fix for NPE & bug #264
2013-07-18 12:37:32 +02:00
Michael Peter Christen
735a66eff3
enhancements to crawler
2013-07-18 12:29:04 +02:00
orbiter
232100301c
removed double-occurring value assignments
2013-07-17 19:09:25 +02:00
Roland Haeder
be0ff6018f
Removed trailing spaces + some more final
2013-07-17 18:44:24 +02:00
Roland Haeder
aaedc0405d
Fixes and avoidance of catching bad exceptions (some):
- Rewrote usage of HashMap/Map to concurrent versions (to avoid a
CME=ConcurrentModificationException)
- Rewrote ConnectionInfo (as an example) to use a synchronized iterator
instead of synchronizing an already synced HashSet (see Collections call)
- This avoids catching CMEs again
- Commented out noisy ConcurrentLog.logException() call
Conflicts:
source/net/yacy/repository/LoaderDispatcher.java
2013-07-17 18:37:34 +02:00
Roland Haeder
841a28ae76
Added 'final' for all exception blocks as this helps the Java compiler
to optimize memory usage
Conflicts:
source/net/yacy/search/Switchboard.java
2013-07-17 18:31:30 +02:00
Roland Haeder
98e10f95e2
Added some cora package loggers
2013-07-17 18:28:10 +02:00
Roland Haeder
553f83a14e
Recommended cleanup (please, one day, execute this cleanup)
2013-07-17 18:26:50 +02:00
Felix Ableitner
03044589dd
Fixed (?i) appearing in entries, fixed multiple equal lines in file.
2013-07-17 16:42:10 +02:00
Felix Ableitner
376f9cd9d0
Merge branch 'master' of git://gitorious.org/yacy/rc1 into blacklist_structure
2013-07-17 15:58:09 +02:00
Michael Peter Christen
89c0aa0e74
added collection_sxt to error documents
2013-07-17 15:20:56 +02:00
Michael Peter Christen
0df5195cb0
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2013-07-17 12:42:06 +02:00
Michael Peter Christen
1fd006cc56
fixes using the embedded connector
2013-07-17 12:41:54 +02:00
orbiter
d0dc86cf3d
logging of deadlocks (if any) during cleanup process
2013-07-17 12:38:58 +02:00
orbiter
aba7cc5de7
added cpu load information to status page
2013-07-17 12:38:12 +02:00
Michael Peter Christen
c6a6f159e8
fix for crawl stack domain counter
2013-07-16 18:18:55 +02:00
Michael Peter Christen
93d1bac140
do a more frequent optimization, reduces IO after optimization
2013-07-16 17:16:48 +02:00
orbiter
260d0c96c7
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2013-07-16 10:49:36 +02:00