Commit Graph

133 Commits

Author SHA1 Message Date
Michael Peter Christen
8b44fcf0f4 added missing @Override annotation 2014-03-28 13:48:37 +01:00
Michael Peter Christen
6ed9c0164e attaching names to all Threads to get a better view in profiling tools
like VisualVM
2014-02-28 15:02:01 +01:00
Michael Peter Christen
9eb668e951 enhanced the resource observer
The resource observer is now able to recognize free disk space AND
available space for YaCy. The amount of space which is assigned for YaCy
are defined in new settings in the configuration file.
Furthermore, there is now a cleanup process which deletes files in case
that an autodelete is activated. The autodelete is now BY DEFAULT ON if
the disk space is low, which means that YaCy starts to delete documents
when the disk is full!
2014-02-12 01:00:44 +01:00
Michael Peter Christen
fbee98c06f fixed shortcut self-reference bug 2014-02-11 22:14:46 +01:00
Michael Peter Christen
94245ce0a8 fixed "Size in KBytes" calculation in PerformanceQueues_p.html,
see http://bugs.yacy.net/view.php?id=362
2014-02-07 17:19:08 +01:00
Michael Peter Christen
1ea17bd9f3 - removed old metadata database and all migration code
- refactored all code which uses URIMetadataRow as standard for word
hash length and word hash ordering and moved that to the class 'Word',
becuase the class URIMetadataRow defined the old metadata data structure
and should be superfluous in the future
- removed unused methods from URIMetadataRow as preparation for further
removal of that class
2014-01-20 18:31:46 +01:00
Michael Peter Christen
5e31bad711 - the webgraph shall store all links which appear on a web page and not
all unique links! This made it necessary, that a large portion of the
parser and link processing classes must be adopted to carry a different
type of link collection which carry a property attribute which are
attached to web anchors.
- introduction of a new URL class, AnchorURL
- the other url classes, DigestURI and MultiProtocolURI had been renamed
and refactored to fit into a new document package schema, document.id
- cleanup of net.yacy.cora.document package and refactoring
2013-09-15 00:30:23 +02:00
Michael Peter Christen
47b1c81d08 - refactoring
- generalized writing of url attributes to solr documents
- added more url attributes to error documents
2013-08-20 15:46:04 +02:00
orbiter
056b42f5aa - added information about segment count to status_p.xml
- also moved this information from the old index structure, which is
still in use for the RWI/DHT index to that front-end
2013-07-23 18:03:33 +02:00
Michael Peter Christen
5878c1d599 - refactoring of log to ConcurrentLog:
jdk-based logger tend to block
at java.util.logging.Logger.log(Logger.java:476) in concurrent
environments. This makes logging a main performance issue. To overcome
this problem, this is a add-on to jdk logging to put log entries on a
concurrent message queue and log the messages one by one using a
separate process.
- FTPClient uses the concurrent logging instead of the log4j logger
2013-07-09 14:28:25 +02:00
Michael Peter Christen
e20450e798 patch in HTCache and CitationIndex loading in case that a file is
broken: do not crash; instead ignore the file and delete it.
2013-06-07 12:52:03 +02:00
orbiter
47114910d5 fix for possible memory leaks 2013-03-13 17:55:37 +01:00
orbiter
d74472f562 corrected result counter 2013-02-27 22:40:23 +01:00
Michael Peter Christen
38d3feae65 added separate delete commands for the local+remote solr index, the old
metadata and old rwi and for the citation index. The important
advancement is the separation of the citation index deletion because
that index is responsible for the linkdepth calculation. Now a search
index can be deleted without the citation index and that should cause
that less clickdepths must be post-processed.
2013-01-04 16:39:34 +01:00
orbiter
276dd6452b removed warnings 2012-10-23 19:08:44 +02:00
Michael Peter Christen
2f536cb54d code cleanup: removed unised methods and made more methods and objects
private
2012-10-08 10:50:24 +02:00
Michael Peter Christen
a8167e6e5b clean-up: removed unused methods in kelondro 2012-10-06 03:34:52 +02:00
Michael Peter Christen
8219a445f3 refactoring 2012-09-21 16:46:57 +02:00
orbiter
563d584420 removed more dependencies in cora from kelondro 2012-09-21 11:02:36 +02:00
Michael Peter Christen
1687737771 Abstraction of HandleMap and HandleSet 2012-07-27 12:13:53 +02:00
Michael Peter Christen
826967513b changed options in IndexFederated_p to switch on/off parts of the index
individually. The settings are experimental and the values of the
settings will be overwritten when an index migration from urldb to solr
starts.
2012-07-23 16:28:39 +02:00
Michael Peter Christen
f78ce93a80 collection of speed and memory saving hacks 2012-07-13 21:15:38 +02:00
orbiter
bbfa497a3c replaced more size() > 0 by !isEmpty() 2012-07-12 11:12:21 +02:00
orbiter
0cbda0b2b8 - replaced all length() == 0 and size() == 0 with isEmpty()
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
2012-07-10 22:59:03 +02:00
Michael Peter Christen
7c1ba99755 removed more unused method parameters 2012-07-05 10:44:30 +02:00
Michael Peter Christen
ea10766bfd cleaned unnecessary nested code 2012-07-05 08:44:39 +02:00
Michael Peter Christen
8a82609360 - smaller caches to save memory
- close cloneable iterators to free memory
2012-07-02 15:40:40 +02:00
Michael Peter Christen
00f2df1120 a variety of possible memory leak fixes 2012-06-06 18:23:18 +02:00
Michael Peter Christen
a1fe65b115 performance hacks 2012-06-05 12:06:26 +02:00
Michael Peter Christen
15db703808 added missing serialization to remove all warnings 2012-05-15 13:13:07 +02:00
Roland 'Quix0r' Haeder
a093ccf5eb Now used synchronization in all close() methods to make sure all objects
are 'closed' in an ordered way

Conflicts:
	source/de/anomic/http/server/ChunkedInputStream.java
	source/de/anomic/http/server/ChunkedOutputStream.java
	source/de/anomic/http/server/ContentLengthInputStream.java
	source/net/yacy/cora/protocol/Domains.java
	source/net/yacy/cora/services/federated/solr/SolrShardingConnector.java
	source/net/yacy/cora/services/federated/solr/SolrSingleConnector.java
	source/net/yacy/document/content/dao/PhpBB3Dao.java
	source/net/yacy/document/parser/html/AbstractTransformer.java
	source/net/yacy/kelondro/blob/BEncodedHeap.java
	source/net/yacy/kelondro/blob/HeapReader.java
	source/net/yacy/kelondro/index/RAMIndexCluster.java
	source/net/yacy/kelondro/io/ByteCountInputStream.java
	source/net/yacy/kelondro/logging/ConsoleOutErrHandler.java
	source/net/yacy/kelondro/table/SQLTable.java
2012-05-14 07:41:55 +02:00
Michael Peter Christen
ba6aaabc51 refactoring + parser bugfixes 2012-05-04 17:28:27 +02:00
Michael Peter Christen
a02fdf8625 better error messages 2012-01-23 00:47:25 +01:00
Michael Peter Christen
c6ba44468e timeout = 5000 instead 3000 2012-01-23 00:45:32 +01:00
Michael Christen
c04bfaa51b refactoring 2011-12-16 23:59:29 +01:00
Michael Christen
e9dc99fe15 added rules to set specific RWIs as private RWIs which are not
transmitted to remote peers. This will be used for private index copies
and phonetic indexes.
2011-12-14 22:15:51 +01:00
orbiter
bc5df0eef5 updated ranking tables (fresh computation)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8103 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-25 12:37:00 +00:00
orbiter
b250e6466d implemented crawl restrictions for IP pattern and country lists
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7980 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-29 15:17:39 +00:00
orbiter
57d5529a01 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7977 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-28 21:16:40 +00:00
orbiter
ce2a76d603 performance hack for search process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7961 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-16 10:00:51 +00:00
orbiter
2842ce30d6 added synchronization in ReferenceContainer and logging for shrinking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7937 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-07 22:15:01 +00:00
orbiter
cec3836e73 added reference limitation to IndexControlRWIs_p.html servlet
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7936 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-07 21:47:54 +00:00
sixcooler
ecb4986b38 refactored stuff from last commit to ReferenceContainer
see: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=3353&p=23163#p23163
the limiting of references is disabled per default
to enable this set yacy.conf - index.maxReferences to a value of e.g. 100000

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7935 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-07 18:55:16 +00:00
orbiter
44d6416e2d ensure termination of shrink()
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7927 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-05 00:22:21 +00:00
orbiter
52230a6864 replaced catching of Exception with Throwable, which catches also Errors
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7926 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-05 00:09:48 +00:00
orbiter
e1a3d609aa moved merger object from Segment to IndexCell to enable a correct shutdown sequence. This solves a bug where yacy cannot be shut down during an index merge that appears during the shutdown phase.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7924 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-04 23:27:12 +00:00
orbiter
a5541751a8 - added memory computation to termlist_p.xml
- added option to delete terms in termlist_p.xml

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7901 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-25 19:13:45 +00:00
orbiter
45e497a9bd fix for term iteration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7900 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-25 18:29:30 +00:00
orbiter
2c595a6a47 added new methods to count the number of objects in RWIs. lots of refactoring was necessary to introduce new Rating class and to unify naming of methods
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7896 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-25 10:35:25 +00:00
orbiter
75df87832c refactoring/better naming of methods and classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7895 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-08-24 23:08:28 +00:00