Commit Graph

422 Commits

Author SHA1 Message Date
orbiter
46367afaaa update of memory-protection values
see http://www.yacy-forum.de/viewtopic.php?p=35539#35539

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3709 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-11 18:02:48 +00:00
orbiter
85035dc319 addition to svn 3699: check send/receive if p2p-mode is activated
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3701 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-10 13:27:38 +00:00
orbiter
139c59ebbd - fixed dht selection problem: the seed tables used a wrong ordering
- cleaned some code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3693 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-09 17:59:36 +00:00
orbiter
22a0e9f117 more timeout-control
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3692 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-09 14:53:17 +00:00
orbiter
0831034e07 fixed non-termination bug for robinson remote crawl peer selection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3681 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-07 14:37:50 +00:00
orbiter
191ef16499 fixed wrong ordering that caused bad dht selection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3646 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-04 14:11:50 +00:00
orbiter
35c660654d more debugging lines to fix bug for
http://www.yacy-forum.de/viewtopic.php?p=34935#34935

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3629 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-30 23:05:19 +00:00
orbiter
81844e85b2 - fixed more cluster routing problems
- fixed a problem in remote search when balancer caused shift process to wait too long

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3627 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-30 00:39:53 +00:00
orbiter
64a6d6e5e6 added new set iterator (needed for last commit)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3599 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-26 09:52:37 +00:00
orbiter
f8de19fb2f robinson cluster: added client-side protocol implementation
- the network configuration page shows a new option: robinson clusters
- when a global search is made, all robinson peers are excluded, but:
- robinson peers/clusters that provide peer tags and where search words match
  such tags, they are included in global search. Therefore, robinson peers/clusters
  support the global yacy network with their indexes, without doing DHT-exchange


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3598 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-26 09:51:51 +00:00
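Note on f8de19fb2f: the peer selection rule described in that message can be sketched as below. This is only an illustration, not the YaCy implementation; the class GlobalSearchTargets, the Peer type and its fields are hypothetical names chosen for this sketch. It assumes a peer's tags and the query words are compared as plain lower-case strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: which peers take part in a global search.
final class GlobalSearchTargets {
    private GlobalSearchTargets() {}

    // Minimal stand-in for a peer; not a YaCy class.
    static final class Peer {
        final String name;
        final boolean robinson;
        final Set<String> tags;

        Peer(String name, boolean robinson, String... tags) {
            this.name = name;
            this.robinson = robinson;
            this.tags = new HashSet<>(Arrays.asList(tags));
        }
    }

    // Non-robinson peers are always included. Robinson peers are
    // included only if at least one of their tags matches a query word.
    static List<Peer> select(List<Peer> peers, Set<String> queryWords) {
        List<Peer> targets = new ArrayList<>();
        for (Peer p : peers) {
            boolean tagMatch = !Collections.disjoint(p.tags, queryWords);
            if (!p.robinson || tagMatch) {
                targets.add(p);
            }
        }
        return targets;
    }
}
```

A robinson peer without tags is never selected in this sketch, which matches the "all robinson peers are excluded" default from the message.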
orbiter
2f3b518169 temporary patch for startup-problem:
http://www.yacy-forum.de/viewtopic.php?t=3854
This is a serious problem that is caused by the database bug between 0.511 - 0.513
which produced a large number of double-entries in the RWI index. The uniq()-method
tries to fix this, and it does not terminate when the index is large and the number
of double-occurrences is also large. This patch does simply implement a time-controlled
termination, which does not heal the inconsistency problem. The uniq-method itself
is correct and does not need a bugfix, the non-termination is simply caused by the large number
of data that is shifted during the process. It was possible to reproduce this behaviour
in a test environment.
A real fix would need to:
- enhance the uniq()-method by using a recursive, binary segmentation of the array to be fixed
- uniq() must report the entries that are double
- the double-entries must be deleted from the collection index (from the index and the collections) to heal the problem


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3583 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-20 07:53:58 +00:00
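Note on 2f3b518169: a time-controlled termination of a uniq pass could look roughly like the sketch below. It is not the kelondro code; TimedUniq and its signature are hypothetical, and a sorted List<String> stands in for the row collection. As in the message, giving up on timeout leaves the remaining duplicates in place, so the inconsistency is not healed.

```java
import java.util.List;

// Hypothetical sketch of a uniq pass that gives up after a time limit.
final class TimedUniq {
    private TimedUniq() {}

    // Removes adjacent duplicates from a sorted, mutable list.
    // Returns true if the pass finished, false if the time limit was
    // reached first (later duplicates are then still present).
    static boolean uniq(List<String> sorted, long timeoutMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        int i = 0;
        while (i < sorted.size() - 1) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // time-controlled termination
            }
            if (sorted.get(i).equals(sorted.get(i + 1))) {
                // Each removal shifts the tail of the list; with many
                // duplicates in a large index this shifting dominates.
                sorted.remove(i + 1);
            } else {
                i++;
            }
        }
        return true;
    }
}
```

The caller can use the boolean result to log that the index may still contain double entries.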
orbiter
ba525ebf52 - re-enabled path optimization that was disabled during testing
- re-implemented index load/extend optimization that was removed from kelondroFlexTable,
  this is now part of kelondroIntBytesIndex


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3580 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-19 14:55:19 +00:00
orbiter
595ee10468 fixed database inconsistency bugs
inserted many debug lines
added a huge number of asserts
extended database test methods


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3579 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-19 13:37:02 +00:00
orbiter
7a7a1c7c29 fight against problems with remove-methods and synchronization
- some bugs may have been fixed with wrong removal operations
- removed temporary storage of remove-positions and replaced by direct deletions
- changed synchronization
- added many asserts
- modified dbtest to also test remove during threaded stresstest

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3576 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-17 15:15:47 +00:00
orbiter
b6a5f53020 removed double synchronization from kelondroRecords.USAGE to prevent thread locking.
The method synchronization should be sufficient

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3574 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-15 21:13:54 +00:00
orbiter
063063aa0c fix for 100% cpu bug during dht selection
see also: http://www.yacy-forum.de/viewtopic.php?p=34068#34068

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3570 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-13 13:40:19 +00:00
orbiter
25070822a5 fix for http://www.yacy-forum.de/viewtopic.php?p=33925#33925
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3551 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-05 19:08:59 +00:00
orbiter
159bd0cab5 miscellaneous; b.o. fix for http://www.yacy-forum.de/viewtopic.php?p=33914#33914
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3549 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-05 14:58:29 +00:00
orbiter
cdc7b77a62 fix for http://www.yacy-forum.de/viewtopic.php?p=33916#33916
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3548 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-05 14:47:45 +00:00
orbiter
40c14a4f0e - better implementation of search query properties
- basic protection against start-up problems when database files are corrupted
- auto-delete of not-critical databases during startup when load error occurs
- on-the-fly reset option for all database tables
- automatic on-the-fly reset for seed tables during enumeration exceptions

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3547 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-05 10:14:48 +00:00
orbiter
fcdf000fbc bugfix for http://www.yacy-forum.de/viewtopic.php?p=33838#33838
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-03 22:08:40 +00:00
orbiter
ba2c307ab3 optimized memory allocation in kelondroRow.Entry
such an entry can now be instantiated without allocation of a new byte[]; instead
it can re-use memory from other kelondroRow.Entry objects.
during bugfixing other bugs may also have been solved; possibly the INCONSISTENCY problem
is among them. One cause can be missing synchronization during bulk storage
when a R/W-path optimization is done. To test this case, the optimization is currently
switched off.
More memory enhancements can be done after this initial change to the allocation scheme.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3536 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-03 12:10:12 +00:00
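Note on ba2c307ab3: the idea of an entry that re-uses memory instead of allocating its own byte[] can be sketched as a view onto a shared backing array. This is an assumption about the general technique, not the actual kelondroRow.Entry code; RowEntry and its methods are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: an entry that is a window onto a shared byte[].
final class RowEntry {
    private final byte[] backing;
    private final int offset;
    private final int length;

    // Allocating constructor: the entry owns a fresh array.
    RowEntry(int length) {
        this(new byte[length], 0, length);
    }

    // Non-allocating constructor: the entry re-uses existing memory.
    RowEntry(byte[] backing, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > backing.length) {
            throw new IllegalArgumentException("range outside backing array");
        }
        this.backing = backing;
        this.offset = offset;
        this.length = length;
    }

    // A sub-entry that shares memory with this entry; no copy is made.
    RowEntry slice(int from, int len) {
        if (from < 0 || len < 0 || from + len > length) {
            throw new IllegalArgumentException("slice outside entry");
        }
        return new RowEntry(backing, offset + from, len);
    }

    byte get(int i) {
        checkIndex(i);
        return backing[offset + i];
    }

    void set(int i, byte value) {
        checkIndex(i);
        backing[offset + i] = value;
    }

    // Explicit copy for callers that need an independent array.
    byte[] copy() {
        return Arrays.copyOfRange(backing, offset, offset + length);
    }

    private void checkIndex(int i) {
        if (i < 0 || i >= length) {
            throw new IndexOutOfBoundsException("index " + i);
        }
    }
}
```

Because slices share memory, a write through one entry is visible through every entry that overlaps it. That saves System.arraycopy calls, but it also means shared entries need the same synchronization care the message mentions.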
orbiter
210ede8230 added a class for byte-array management. This was the result of a very large experiment
to replace byte[] objects within kelondro. Frequent System.arraycopy are common when
kelondroRow.Entry objects are handled. This class may be used to prevent this.
However, experimental replacement of byte[] by kelondroByteArray in kelondroRow.Entry
resulted in complete re-write of large parts of kelondro. This experiment did not
completely lead to a result, because then the interface to kelondro had to be changed
also from byte[] to kelondroByteArray, which may have caused a rewrite of large parts
of YaCy. The experiment is therefore abandoned, but this class remains here without
any function but possibly for future use.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3531 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-30 08:44:43 +00:00
orbiter
847349358b less memory usage during collectionIndex-rebuild
should also speed up that process a little bit

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3524 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-27 08:21:03 +00:00
orbiter
602ac42010 fix for OOM case when a kelondroTree Node cache grows
See also: http://www.yacy-forum.de/viewtopic.php?p=33275#33275

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3499 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 13:26:18 +00:00
orbiter
7af188ff9a fix for http://www.yacy-forum.de/viewtopic.php?p=33089#33089
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3491 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-19 11:59:29 +00:00
orbiter
5bbf010107 removed synchronization of size() method from numerous classes to avoid thread locking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3490 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-18 19:45:23 +00:00
orbiter
861f41e67e redesigned NURL-handling:
- the general NURL-index for all crawl stack types was split into separate indexes for these stacks
- the new NURL-index is managed by the crawl balancer
- the crawl balancer does not need an internal index any more, it is replaced by the NURL-index
- the NURL.Entry was generalized and is now a new class plasmaCrawlEntry
- the new class plasmaCrawlEntry replaces also the preNURL.Entry class, and will also replace the switchboardEntry class in the future
- the new class plasmaCrawlEntry is more accurate for date entries (holds milliseconds) and can contain larger 'name' entries (anchor tag names)
- the EURL object was replaced by a new ZURL object, which is a container for the plasmaCrawlEntry and some tracking information
- the EURL index is now filled with ZURL objects
- a new index delegatedURL holds ZURL objects about plasmaCrawlEntry objects to track which url is handed over to other peers
- redesigned handling of plasmaCrawlEntry - handover, because there is no need any more to convert one entry object into another
- found and fixed numerous bugs in the context of crawl state handling
- fixed a serious bug in kelondroCache which caused that entries could not be removed
- fixed some bugs in online interface and adapted monitor output to new entry objects
- adapted yacy protocol to handle new delegatedURL entries
all old crawl queues will disappear after this update!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3483 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-16 13:25:56 +00:00
orbiter
581db87237 more debug code for
http://www.yacy-forum.de/viewtopic.php?p=33009#33009

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3479 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-14 15:04:06 +00:00
orbiter
dd06d4cada more logging to better trace bug
http://www.yacy-forum.de/viewtopic.php?p=33001#33001

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3477 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-14 09:36:54 +00:00
orbiter
96b79bf86d redesigned remove method in kelondroRowSet
This should fix also numerous bugs like
http://www.yacy-forum.de/viewtopic.php?p=31077#31077
(java.lang.ArrayIndexOutOfBoundsException in kelondroRowCollection.removeShift)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3476 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-14 08:55:05 +00:00
orbiter
9f929b5438 better snippet handling in case of snippet load fail
see also http://www.yacy-forum.de/viewtopic.php?p=31096#31096

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3475 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-13 22:18:36 +00:00
orbiter
909d7a8ae9 fixed wrongly implemented row iterator in kelondroFlexSplitTables
this has no effect, until now this iterator was only used on
the Index Administration page.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3464 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-09 13:55:26 +00:00
orbiter
3ef77d2030 fix for http://www.yacy-forum.de/viewtopic.php?p=29878#29878
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3461 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-09 12:14:25 +00:00
orbiter
6ad39bae1e fixed shutdown problem
this fixes the 'inconsistency' messages during start-up

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3457 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-09 08:48:47 +00:00
orbiter
38b93f8cb8 bugfix for my last commit:
iterator did not consider secondary start point in case of rotation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3456 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-08 22:07:17 +00:00
orbiter
d755a8026d - better OOM protection
- better memory allocation for FlexTable indexes
- splitting between static index and dynamic index (only the dynamic part must grow)
- to enable a merge-iteration of the new split index, a huge number of classes needed to be adapted for new iterator classes
- added new iterator classes that support cloneable iterators
- adapted all iterator classes to implement cloneable iterators

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-08 16:15:40 +00:00
orbiter
23338d2070 small fix for RAM computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3447 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-07 23:55:52 +00:00
orbiter
4e8eb1dbe3 some minor changes here and there
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3441 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-07 14:22:10 +00:00
orbiter
3499a364ef a little bit better memory protection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3439 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-07 09:38:14 +00:00
orbiter
1cba31de43 redesigned ram organization for database caches
- each cache can now allocate as much memory as is available
- no more fixed limits
- replaced old performance memory monitor by new one
- added supervision methods as static functions into the classes that provide cache functionality
- steering of ram allocation is done with two simple limits that are ram availability-relative


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-06 22:43:32 +00:00
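Note on 1cba31de43: the message does not say what the two availability-relative limits are. The sketch below assumes one reading: a cache may grow while available memory is above an upper limit, and must shrink once it falls below a lower limit. CacheMemoryGuard, its Action enum and both limit names are hypothetical; only the java.lang.Runtime calls are real API.

```java
// Hypothetical sketch: steering cache growth by available memory.
final class CacheMemoryGuard {
    enum Action { GROW, HOLD, SHRINK }

    private final long shrinkLimit; // below this, caches should give memory back
    private final long growLimit;   // above this, caches may allocate more

    CacheMemoryGuard(long shrinkLimit, long growLimit) {
        if (shrinkLimit < 0 || shrinkLimit > growLimit) {
            throw new IllegalArgumentException("need 0 <= shrinkLimit <= growLimit");
        }
        this.shrinkLimit = shrinkLimit;
        this.growLimit = growLimit;
    }

    // Memory the JVM could still hand out: unused heap plus heap not yet claimed.
    static long available() {
        Runtime rt = Runtime.getRuntime();
        return rt.maxMemory() - rt.totalMemory() + rt.freeMemory();
    }

    // Static-style supervision decision based on an availability reading.
    Action decide(long availableBytes) {
        if (availableBytes < shrinkLimit) {
            return Action.SHRINK;
        }
        if (availableBytes > growLimit) {
            return Action.GROW;
        }
        return Action.HOLD;
    }
}
```

A cache would call decide(CacheMemoryGuard.available()) before each insert, so there is no fixed per-cache size limit, only the two thresholds.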
orbiter
db235f2d61 added some memory protection in collection index multiple merge
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3429 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-04 22:54:04 +00:00
orbiter
b466baa574 added some memory protection
too large collection arrays are now avoided. By default, the biggest
collection index is 7. Larger collections are dumped into a commons
directory, but cannot yet be used. Before doing a dump, the collection
is split into a part which has only root-references, and stored back
to the collection; the remaining part goes to commons

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3426 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-03 00:55:51 +00:00
orbiter
51e12049fa third generation of R/W head path optimization
- data from collection arrays are read in order
- merged data is written in order

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-28 11:13:23 +00:00
orbiter
10a3c20b8d some more enhancements to R/W Head path optimization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3415 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-27 15:54:02 +00:00
orbiter
f4cfd19835 second Generation of collection R/W head path optimization:
- permanent cache flush is switched off. The optimized cache flush
  works better if it is a large number of collections that is flushed
  together
- the flush size can be configured instead of the flush divisor. There is
  only one size for all flushes
- collection records that shall be removed during collection transition
  (jump from one collection file to another) are now not really removed
  but only marked in RAM. add-operations to the collection use these
  marked collection spaces
- index bulk write operations are now separated for each file of a kelondroFlex


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-27 13:01:22 +00:00
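Note on f4cfd19835: marking removed records in RAM and letting later add-operations fill those spaces can be sketched with a simple free list. This is an illustration of the general technique under that assumption, not the kelondro collection code; SlotStore and its methods are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: removed slots are only marked and re-used by add().
final class SlotStore {
    private final List<byte[]> slots = new ArrayList<>();
    private final Deque<Integer> freed = new ArrayDeque<>();

    // Stores a record, preferring a previously marked slot.
    int add(byte[] record) {
        Integer reuse = freed.poll();
        if (reuse != null) {
            slots.set(reuse, record);
            return reuse;
        }
        slots.add(record);
        return slots.size() - 1;
    }

    // Marks a slot as free; the store itself does not shrink.
    void remove(int slot) {
        if (slots.get(slot) == null) {
            return; // already marked
        }
        slots.set(slot, null);
        freed.push(slot);
    }

    // Number of slots ever allocated, including marked ones.
    int capacity() {
        return slots.size();
    }
}
```

The store never shifts or truncates on removal, so a remove followed by an add costs one write instead of two.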
orbiter
1fda50fd3c correct R/W head positioning in kelondroFlex
and some enhancements

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3409 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 22:25:39 +00:00
orbiter
304412a049 first generation of collection index R/W head path optimization
- collections are now handed over as collection lists to collection index for merge operations
- collection index lists are separated into 'new' and 'extend' lists
- lists are written separately
- write operations are done into array sets and array indexes. These are now serialized
- write operations into index files are sorted by index;
  that means that a R/W head does not need to go forward
  and backward, only forward
More enhancements are possible

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3407 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 15:49:23 +00:00
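Note on 304412a049: sorting pending writes by index so that the R/W head only moves forward can be sketched as below. The names OrderedWrites, Write and headTravel are hypothetical, and the head movement is modelled as the distance between consecutive indexes; real disk behaviour is more complex.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: order index writes so the head never moves backward.
final class OrderedWrites {
    private OrderedWrites() {}

    // A pending write of one row at a given index position.
    static final class Write {
        final long index;
        final byte[] row;

        Write(long index, byte[] row) {
            this.index = index;
            this.row = row;
        }
    }

    // Returns a copy of the pending writes sorted by ascending index.
    static List<Write> sortByIndex(List<Write> pending) {
        List<Write> sorted = new ArrayList<>(pending);
        sorted.sort(Comparator.comparingLong((Write w) -> w.index));
        return sorted;
    }

    // Total distance a head starting at index 0 travels for this order.
    static long headTravel(List<Write> writes) {
        long travel = 0;
        long head = 0;
        for (Write w : writes) {
            travel += Math.abs(w.index - head);
            head = w.index;
        }
        return travel;
    }
}
```

For writes arriving at indexes 7, 2, 9, 4 the modelled head travel drops from 24 to 9 once the list is sorted, because every step is a forward move.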
orbiter
32867580ee update to kelondroRecords needed for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3403 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 11:55:36 +00:00
orbiter
8668ac5d91 preparations for collection index cache flush optimization
(hand-over commit, no functional change to current code)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3399 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-25 21:06:26 +00:00