Commit Graph

61 Commits

Author SHA1 Message Date
Michael Peter Christen
8219a445f3 refactoring 2012-09-21 16:46:57 +02:00
orbiter
563d584420 removed more dependencies in cora from kelondro 2012-09-21 11:02:36 +02:00
Michael Peter Christen
1687737771 Abstraction of HandleMap and HandleSet 2012-07-27 12:13:53 +02:00
Michael Peter Christen
e432bb9cd9 better calculation of possible saving in HeapReader index data structure 2012-07-26 10:05:06 +02:00
Michael Peter Christen
9549984c65 documentation/comments 2012-07-25 21:34:23 +02:00
Michael Peter Christen
f78ce93a80 collection of speed and memory saving hacks 2012-07-13 21:15:38 +02:00
orbiter
482afed07c reduced logging overhead (a bit) 2012-07-12 19:23:40 +02:00
Michael Peter Christen
de3ef8ad73 removed unimportant warnings 2012-06-19 08:45:34 +02:00
Michael Peter Christen
bef823c247 close the reader if finished 2012-06-11 01:20:54 +02:00
Roland 'Quix0r' Haeder
a093ccf5eb Now used synchronization in all close() methods to make sure all objects
are 'closed' in an ordered way

Conflicts:
	source/de/anomic/http/server/ChunkedInputStream.java
	source/de/anomic/http/server/ChunkedOutputStream.java
	source/de/anomic/http/server/ContentLengthInputStream.java
	source/net/yacy/cora/protocol/Domains.java
	source/net/yacy/cora/services/federated/solr/SolrShardingConnector.java
	source/net/yacy/cora/services/federated/solr/SolrSingleConnector.java
	source/net/yacy/document/content/dao/PhpBB3Dao.java
	source/net/yacy/document/parser/html/AbstractTransformer.java
	source/net/yacy/kelondro/blob/BEncodedHeap.java
	source/net/yacy/kelondro/blob/HeapReader.java
	source/net/yacy/kelondro/index/RAMIndexCluster.java
	source/net/yacy/kelondro/io/ByteCountInputStream.java
	source/net/yacy/kelondro/logging/ConsoleOutErrHandler.java
	source/net/yacy/kelondro/table/SQLTable.java
2012-05-14 07:41:55 +02:00
Michael Peter Christen
0d6176804b emergency disabling of GenerationMemoryStrategy because of non-working
available-method
2012-01-15 21:58:18 +01:00
Michael Peter Christen
87f0210480 enriched log output to find NPE in HeapReader 2012-01-15 12:08:46 +01:00
Michael Christen
c04bfaa51b refactoring 2011-12-16 23:59:29 +01:00
Michael Christen
078fcde0dd bad initialization 2011-12-07 01:02:23 +01:00
orbiter
4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources:
used a ASCII String <-> byte[] conversion wherever possible. Many Strings in YaCy are hashes which are pure ASCII (base64 hashes).
The new ASCII String <-> byte[] conversion method have less computation overhead than the UTF8 conversion.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7746 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-27 08:24:54 +00:00
orbiter
17530ca7b5 fix for bug http://bugs.yacy.net/view.php?id=10
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7642 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-04-04 12:20:20 +00:00
orbiter
b1a8d0c020 enhancements to web cache and less strict caching rules
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7620 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-22 10:35:26 +00:00
orbiter
a35d513bd8 fix for not-deleted .gap and .idx files
see also: http://forum.yacy-websuche.de/viewtopic.php?p=22128#p22128

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-17 17:09:19 +00:00
low012
3b40b98256 *) set SVN properties
*) minor changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7567 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-08 01:51:51 +00:00
orbiter
cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-07 20:36:40 +00:00
orbiter
42d90664f3 - fixed a memory leak in the httpc.post method (no finish)
- patched some more memory-saving relevant code
- some more minor bug fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7541 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-01 09:03:33 +00:00
orbiter
804ae2275b - do not delete idx and gap files if the heap is not modified
this change may have bugs in it which may cause damage to your existing data. please use with care.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-23 11:28:12 +00:00
orbiter
5e45ded8e2 - removed locks from WordReference
- refactoring of HeapReader/Writer

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7514 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-23 00:32:16 +00:00
orbiter
6083f2f171 fix for (false) oom
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7484 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-15 14:26:25 +00:00
orbiter
090c73e32e catch a OOM in HeapReader iteration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7433 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-01-12 12:04:18 +00:00
orbiter
b2ed4cfaf8 more small bugfixes and light refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7401 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-12-28 01:57:05 +00:00
orbiter
09c208a3ab patch for corrupted database files (just work on and forget key)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7177 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-20 14:38:56 +00:00
orbiter
8da4eb5de6 addition to patch in SVN 7111
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7170 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-19 23:12:50 +00:00
orbiter
37baa8bae3 - fixes for concurrency exceptions and failed database integrity verification
- added link to yacystats peer when peer is more than one day old

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7164 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-17 10:20:04 +00:00
orbiter
83ac07874f - corrected return value of put() methods (not used anywhere, so it did not harm before)
- added use of LookAheadIterator which should prevent mistakes when coding iterators with embedded iterators
- added a fail-safe reaction in case of database corruption using iterators over database elements (no interruption then)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7154 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-15 10:43:14 +00:00
orbiter
7dbc357593 patch to identify corrupted database files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7139 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-13 07:20:53 +00:00
orbiter
d865ef77a8 removed re-read of index in case of a bad index. This may not solve the problem but it applies a 100% CPU problem on the peer. I'm afraid bad index files must be abandoned, and cannot be fixed this way.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7111 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-06 09:55:04 +00:00
orbiter
65eaf30f77 redesign of crawl profiles data structure. target will be:
- permanent storage of auto-dom statistics in profile
- storage of profiles in WorkTable data structure
not finished yet. No functional change yet.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7088 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-31 15:47:47 +00:00
orbiter
0f276dd63f - MapHeap now implements Map<byte[], Map<String, String>>
- refactoring of method names to comply with Map method names

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7072 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-24 12:36:56 +00:00
orbiter
7aa860c505 - more logging
- more stability for database heap in case of buffer failure

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7058 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-21 10:16:05 +00:00
orbiter
6388a58fc7 better memory management and slightly less (in total and temporary) RAM allocation:
- confirm that database objects that are not supposed to grow do not have a index memory management that is designed for growth
- changed index sorting method in such a way that it allocates less objects during quicksort
- database classes classes renaming (shorter, naming addresses that objects hold in RAM)
- added a large number of asserts to check if objects actually take the RAM that they should have


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7019 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-04 13:33:12 +00:00
orbiter
5924a0d851 - enhanced concurrency in database index access for multicore
- added statistics about database index caches in PerformanceMemory_p.html
- adoped many classes to use the new statistics
- added missing close statements

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7018 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-03 04:58:48 +00:00
orbiter
b03caaa57a better handling of OOM situations
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6918 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-06-15 19:44:05 +00:00
orbiter
64f29f990e a collection of performance hacks and code cleanup:
- removed usage of URL-Caches which could have been a memory leak
- removed unused classes and methods
- removed not necessary synchronizations
- added synchronization hacks where possible
- fine-tuned crawling speed to prevent IO of balancer
- fixed a bug in IODispatcher that may have caused that no merges were done
- reduced number of parameters in very often called methods (compare methods)
- reduced complexity of data structures of now massively used HandleSet class
- reduction of new String() and getBytes() usage / new methods to support this transition

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6820 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-19 16:42:37 +00:00
orbiter
1a8a134e0c continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775 and continued in SVN 6790
The result should be a less usage of new String() and less memory usage (since a String-encapsulated byte[] has 40 bytes overhead)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6815 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-15 13:22:59 +00:00
orbiter
dde394a977 - shifted some computation out of synchronization to allow more concurrency
- removed synchronization where not necessary

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6814 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-14 23:22:06 +00:00
orbiter
2bc36de336 - fix for bug in svn 6669
- cleanup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6670 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-15 22:06:13 +00:00
sixcooler
787b588c33 reverted a part of svn6636:
- didn't work on blobs >2GB
- should be obsolete since svn6651
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2652&sid=7fa98fd3edfc2a03f26394d545e3e3c1&p=19172#p19172

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6655 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-07 19:32:46 +00:00
sixcooler
089877f32c my first commit - hopefully fix for merge problem
- http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2652

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6651 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-05 19:38:00 +00:00
orbiter
ef9473d92c added another sixcooler suggestion: recycle corrupted records
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6647 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-04 16:25:05 +00:00
orbiter
308a973503 refactoring of tables data organisation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6644 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-04 11:26:23 +00:00
orbiter
3751ab4ae2 added sixcoolers patch and more checks/removed unnecessary code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6636 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-01 16:11:00 +00:00
orbiter
d8d8562c59 fill key with zeros during normalization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6635 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-01 15:40:16 +00:00
orbiter
24060885b6 - added Tables abstraction in data.Tables.java
fix for
http://forum.yacy-websuche.de/viewtopic.php?p=18910#p18910
http://forum.yacy-websuche.de/viewtopic.php?p=18894#p18894
http://forum.yacy-websuche.de/viewtopic.php?p=18814#p18814


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6631 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-29 18:02:09 +00:00
orbiter
0098e6e859 bugfix for heap iterator
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6610 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-22 10:26:50 +00:00