Commit Graph

2264 Commits

Author SHA1 Message Date
orbiter
210ede8230 added a class for byte-array management. This was the result of a very large experiment
to replace byte[] objects within kelondro. Frequent System.arraycopy are common when
kelondroRow.Entry objects are handled. This class may be used to prevent this.
However, experimental replacement of byte[] by kelondroByteArray in kelondroRow.Entry
resulted in complete re-write of large parts of kelondro. This experiment did not
completely lead to a result, because then the interface to kelondro had to be changed
also from byte[] to kelondroByteArray, which may have caused a rewrite of large parts
of YaCy. The experiment is therefore abandoned, but this class remains here without
any function but possibly for future use.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3531 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-30 08:44:43 +00:00
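
The idea behind the byte-array class in the commit above (avoiding frequent System.arraycopy calls when parts of a row are handed around) can be illustrated with a minimal sketch. This is not the kelondroByteArray class itself; the name ByteArrayView and all of its methods are hypothetical and chosen only for illustration. The sketch assumes a view that shares one backing array and copies only on explicit request.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the class added in this commit.
final class ByteArrayView {
    private final byte[] data;  // shared backing array, never copied by the view
    private final int offset;   // start of the view inside data
    private final int length;   // number of bytes visible through the view

    ByteArrayView(byte[] data, int offset, int length) {
        if (offset < 0 || length < 0 || offset > data.length - length) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        }
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    int length() {
        return length;
    }

    byte get(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index=" + index);
        }
        return data[offset + index];
    }

    // Narrows the view; no bytes are copied, both views share the same array.
    ByteArrayView slice(int from, int len) {
        if (from < 0 || len < 0 || from > length - len) {
            throw new IndexOutOfBoundsException("from=" + from + ", len=" + len);
        }
        return new ByteArrayView(data, offset + from, len);
    }

    // The only place where a copy is made, for callers that need a plain byte[].
    byte[] toByteArray() {
        return Arrays.copyOfRange(data, offset, offset + length);
    }
}
```

A caller would wrap a row once and pass slices around instead of copying columns out of it. As the commit message notes, adopting such a type everywhere would mean changing every interface that currently takes byte[].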
theli
1b7fda12ee *) SOAP: separate function to get the active/passive/potential peer list
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3526 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-28 07:34:44 +00:00
orbiter
6488ec8a80 no deletions in index in case that snippet-loading fails and there is no network connection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3525 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-27 08:21:45 +00:00
orbiter
847349358b less memory usage during collectionIndex-rebuild
should also speed up that process a little bit

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3524 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-27 08:21:03 +00:00
auron_x
8ef3ad12a7 *) fix for rare bug in PPM-calc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3523 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-25 21:46:03 +00:00
auron_x
00bc0c1b47 *) new logging for PPM-Calculation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3522 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-25 20:24:12 +00:00
auron_x
5941577076 *) added some logging to PPM-Calculation to find a rare bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3521 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-25 14:56:42 +00:00
orbiter
5c3afb3202 added option to configure a path to a secondary index location.
this shall be used to store a fragment of the index on another physical device,
to split IO load and enhance access speed. The index is split in such a way
that the LURLs are stored to the secondary location, and the RWIs to the primary
location. This is especially useful for environments where symbolic links are
not possible and may cause IO access even if there is no write access to the
device which hosts the symbolic link.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3519 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-24 15:28:17 +00:00
theli
c2e6afbd69 *) bugfix: setting mimeType properly for dir listing with e.g. "?format=xml"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-23 05:37:19 +00:00
orbiter
242c19b480 completed TLD categorization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3515 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-22 13:52:00 +00:00
hydrox
b99f9d870d *) fixed double selection of peers for the same DHT-chunk.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3513 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-22 09:08:38 +00:00
theli
f20b596dc0 *) adding servlet to display all deployed SOAP Services
- soap related servlets are located in htroot/soap
*) new serverContext class for soap

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3511 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-22 08:30:57 +00:00
theli
75d90834a2 *) adding additional file extension for powerpoint
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3507 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 16:18:58 +00:00
orbiter
2cb16824e3 removed support for old database structures.
The new collection index will be more generalized to support other indexes
i.e. YBR block-rank computation. A clean-up of the many conditions to support
the old database was necessary.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3506 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 15:35:35 +00:00
theli
81b4598487 *) peer profile can now be displayed as vcard
e.g. http://localhost:8080/ViewProfile.vcf?hash=localhash

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3504 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 15:08:18 +00:00
orbiter
3688ec33e5 release 0.51
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3501 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 14:00:17 +00:00
theli
1f61c13697 *) RSS-parser extracts the author tags now
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3500 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 13:35:32 +00:00
orbiter
602ac42010 fix for OOM case when a kelondroTree Node cache grows
See also: http://www.yacy-forum.de/viewtopic.php?p=33275#33275

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3499 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 13:26:18 +00:00
theli
b374812f01 *) adding rpm packager as author
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3498 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 13:09:12 +00:00
orbiter
beb772d6cd fixed problem with broken notifier image, occurred only at initial start-up
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3497 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 12:23:27 +00:00
theli
40ce33e664 *) adding RSS feed for yacy news
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3496 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 12:22:18 +00:00
theli
589cbd8cbf *) replacing all yacy-news-category strings with corresponding constants
Note: please use these constants from now on

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3495 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 11:09:15 +00:00
allo
f4af360f7c bugfix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3494 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-20 15:37:19 +00:00
orbiter
7af188ff9a fix for http://www.yacy-forum.de/viewtopic.php?p=33089#33089
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3491 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-19 11:59:29 +00:00
orbiter
5bbf010107 removed synchronization of size() method from numerous classes to avoid thread locking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3490 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-18 19:45:23 +00:00
orbiter
6b9eea3932 - removed differentiation between longTitle and shortTitle; this cannot be used for search results,
and it is difficult to get both types from all document types
- added some author parsing

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3489 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-18 12:33:19 +00:00
orbiter
a738b57b31 added author tag to indexing content
enhanced composition of title tag
TODO: insert author information for external parsers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3488 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-17 01:18:34 +00:00
orbiter
6be57983a8 another update to the crawl balancer
can now alternate between top and bottom of the crawl stack

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3487 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-16 16:54:54 +00:00
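
Alternating between the top and the bottom of a stack, as the commit above describes for the crawl balancer, can be sketched with a double-ended queue. This is a minimal illustration under assumptions, not the balancer's actual code: the class name AlternatingStack is hypothetical, and strict alternation on every pop is an assumption (the commit only says the balancer can alternate).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not the YaCy crawl balancer.
final class AlternatingStack<T> {
    private final Deque<T> entries = new ArrayDeque<>();
    private boolean takeFromTop = true; // flips after every successful pop

    // New entries always go on top of the stack.
    void push(T entry) {
        entries.addLast(entry);
    }

    // Returns the top entry, then the bottom entry, then the top again, and so on.
    // Returns null when the stack is empty.
    T pop() {
        if (entries.isEmpty()) {
            return null;
        }
        T next = takeFromTop ? entries.pollLast() : entries.pollFirst();
        takeFromTop = !takeFromTop;
        return next;
    }

    int size() {
        return entries.size();
    }
}
```

With entries a, b, c, d pushed in that order, successive pops yield d, a, c, b. Taking from both ends keeps the oldest entries from waiting until everything above them has been processed.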
orbiter
91cdc1493f removed query to NAT or responder in case that no other peer is there.
this is not needed any more, there are enough peers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3486 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-16 15:21:24 +00:00
orbiter
4783a30910 - fixed a flush problem in balancer
- return to idle divisor in RWI RAM cache flush

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3485 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-16 15:16:26 +00:00
theli
91c2a042a7 *) bugfix for wrong proxy traffic accounting
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3484 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-16 13:52:48 +00:00
orbiter
861f41e67e redesigned NURL-handling:
- the general NURL-index for all crawl stack types was split into separate indexes for these stacks
- the new NURL-index is managed by the crawl balancer
- the crawl balancer does not need an internal index any more, it is replaced by the NURL-index
- the NURL.Entry was generalized and is now a new class plasmaCrawlEntry
- the new class plasmaCrawlEntry replaces also the preNURL.Entry class, and will also replace the switchboardEntry class in the future
- the new class plasmaCrawlEntry is more accurate for date entries (holds milliseconds) and can contain larger 'name' entries (anchor tag names)
- the EURL object was replaced by a new ZURL object, which is a container for the plasmaCrawlEntry and some tracking information
- the EURL index is now filled with ZURL objects
- a new index delegatedURL holds ZURL objects about plasmaCrawlEntry objects to track which url is handed over to other peers
- redesigned handling of plasmaCrawlEntry - handover, because there is no need any more to convert one entry object into another
- found and fixed numerous bugs in the context of crawl state handling
- fixed a serious bug in kelondroCache which caused that entries could not be removed
- fixed some bugs in online interface and adapted monitor output to new entry objects
- adapted yacy protocol to handle new delegatedURL entries
all old crawl queues will disappear after this update!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3483 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-16 13:25:56 +00:00
hydrox
9b5fb3908d *) a peer-message is now created when a blog-comment is written
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3480 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-15 12:58:17 +00:00
orbiter
581db87237 more debug code for
http://www.yacy-forum.de/viewtopic.php?p=33009#33009

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3479 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-14 15:04:06 +00:00
orbiter
81c4cc6bf7 better debugging of balancer failure
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3478 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-14 12:02:56 +00:00
orbiter
dd06d4cada more logging to better trace bug
http://www.yacy-forum.de/viewtopic.php?p=33001#33001

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3477 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-14 09:36:54 +00:00
orbiter
96b79bf86d redesigned remove method in kelondroRowSet
This should fix also numerous bugs like
http://www.yacy-forum.de/viewtopic.php?p=31077#31077
(java.lang.ArrayIndexOutOfBoundsException in kelondroRowCollection.removeShift)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3476 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-14 08:55:05 +00:00
orbiter
9f929b5438 better snippet handling in case of snippet load fail
see also http://www.yacy-forum.de/viewtopic.php?p=31096#31096

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3475 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-13 22:18:36 +00:00
auron_x
d451ad48d3 *) improved peerloadgraphic:
- unnecessary (0 %) pieces are removed
- percent-values of each thread displayed in legend

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3474 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-12 19:08:17 +00:00
orbiter
a5d668c0c6 added speed-buttons for easy performance setting
appears in crawl start and on indexing monitor page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3473 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-12 16:24:28 +00:00
orbiter
5b0a84ce09 fix for synchronization deadlock with flushMissNameCache.
see also: http://www.yacy-forum.de/viewtopic.php?p=32939#32939

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3472 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-12 09:06:57 +00:00
karlchenofhell
e2ac5f62bd - make the code prettier [from NN's TODO]
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3471 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-11 19:53:14 +00:00
allo
f04097c3dd integrated tor-patch for crawling, if yacyDebugMode is set.
(replaces: http://yacy.deruwe.de/overlay/net-misc/yacy-tor/files/disable_dns_checks-svn3132.patch)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3470 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-11 18:43:11 +00:00
auron_x
22fe14f292 *) first version of Peerload-graphic
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3469 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-11 17:04:11 +00:00
orbiter
432d7d4e9c better catch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3468 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-10 23:38:08 +00:00
orbiter
8f7e8b6ee2 auto-delete for not-fixable db error in crawl stacker.
see also http://www.yacy-forum.de/viewtopic.php?p=32906#32906

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3467 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-10 23:31:36 +00:00
orbiter
7a52b07fcc better memory protection during freemen cycle
see also http://www.yacy-forum.de/viewtopic.php?p=32903#32903

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3466 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-10 23:22:37 +00:00
orbiter
6faa262259 fix for NURL-fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3465 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-09 14:30:53 +00:00
orbiter
909d7a8ae9 fixed wrongly implemented row iterator in kelondroFlexSplitTables
this has no effect, until now this iterator was only used on
the Index Administration page.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3464 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-09 13:55:26 +00:00
orbiter
a1fb8358b2 let's make a well-formed http link so that other crawlers don't have a problem following this link :-)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3463 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-09 12:35:54 +00:00