Commit Graph

1542 Commits

Author SHA1 Message Date
orbiter
14e0bb0dcf allow more references per word for new db
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2458 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-25 12:06:23 +00:00
orbiter
985dcbde7f changed some parameters, which may improve memory usage and indexing speed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2457 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 23:39:52 +00:00
orbiter
b7f4a1521b added options to switch on or off the kelondroFlexTable for NURL, EURL and PreNURL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2456 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 22:21:22 +00:00
orbiter
c26da4893b reverted NURL storage to kelondroTree; kelondroFlexTable still has problems with deleted entries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2454 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 10:03:38 +00:00
orbiter
db1eae0227 * simplified initialization of database objects
* replaced kelondroTree for NURLs by kelondroFlex
* replaced kelondroTree for EURLs by kelondroFlex
take care, may be very buggy
please finish crawls before updating. crawls will be lost.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2452 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 02:19:25 +00:00
hermens
0b73f2b132 Repair DNS prefetch during cacheScan
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2451 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 01:31:08 +00:00
orbiter
27a159b401 * documentation update
* removed doc from release
* release information in doc/News.html
* release 0.46

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2442 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-23 11:36:09 +00:00
theli
f80f776b89 *) Trying to solve NullPointerException problem in function addURLtoErrorDB
See: http://www.yacy-forum.de/viewtopic.php?t=2705

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2441 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-23 10:23:20 +00:00
orbiter
d78b824e85 fixed problem with default path after first start-up
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2440 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-22 13:35:51 +00:00
hydrox
1c99b5a484 *) fixed logging for urldbcleanup
*) changed exception handling in urldbcleanup so that it shows NullPointerException correctly
*) added more Blacklisting to urlcleaner

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2436 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-21 06:42:42 +00:00
orbiter
135e019883 removed one superfluous line from last commit
(hasnot is included in remove)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2435 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-21 01:59:44 +00:00
orbiter
1591a55963 added object cache miss-cache use for remove method
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-21 01:51:27 +00:00
orbiter
8f3f4ab0eb enhanced synchronisation in plasmaWordIndex
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2433 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-21 01:29:26 +00:00
orbiter
f933f00f09 another patch to URL protocol handling for 'news', 'nntp' etc:
reject it! (the java.net.URL class rejects them too)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2432 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-21 01:04:04 +00:00
orbiter
4c6e00d80a more bugfixes for URL class, see:
http://www.yacy-forum.de/viewtopic.php?p=24844#24844

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-21 00:23:39 +00:00
orbiter
23dd972608 fixed memory calculation in performanceMemory web page
fixed also maximum cache size computation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2429 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-20 01:20:34 +00:00
orbiter
b7dc251948 fixed bugs in url class:
- correct backpath ('..') handling
- correct absolute path handling
- included https


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2428 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-19 22:27:01 +00:00
orbiter
1ce3c22761 better memory control:
- added memory monitor for preNURL-db in performanceMemory
- changed default memory assignments

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2427 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-19 13:09:04 +00:00
orbiter
39b4c26bdc more memory control:
- catching of OutOfMemoryError in server threads
- automatic adaptation of word cache size after a Short Mem Cycle

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2426 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-19 00:06:39 +00:00
orbiter
3e9d509c39 some small fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2425 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-18 22:50:05 +00:00
orbiter
276225d79e fix for URL class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2423 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-18 21:33:00 +00:00
orbiter
eb633c0a4f server threads must now supply a method that can be called in case
of short memory. This has been realized for the indexing thread.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2421 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-18 02:07:03 +00:00
orbiter
f5720cb2fa removed most synchronization in wordIndex (for testing)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2420 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-18 01:35:33 +00:00
orbiter
0187c60010 because of a bug in the JRE 1.4.2 there was no memory protection
see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4686462
this commit fixes the bug by using a memory-computation patch.
All uses of Runtime.maxMemory have been replaced by serverMemory.max
The bug is no longer present in Java 1.5

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-18 01:33:54 +00:00
auron_x
4eca0f8830 *) fixed PPM calculation for multiple indexer-threads
*) fixed totalPPM calculation and added total PPM to Network.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2418 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-17 19:15:30 +00:00
orbiter
cfb51fdef1 less synchronization in plasmaWordIndex
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2416 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-17 00:10:50 +00:00
orbiter
d6a928c2da quickfix for http://www.yacy-forum.de/viewtopic.php?t=2705
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2415 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-16 23:20:21 +00:00
orbiter
6ad471ef96 * applied many compiler warning recommendations
* cleaned up code
* added unit test code
* migrated ranking RCI computation to kelondroFlex and kelondroCollectionIndex


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-16 19:49:31 +00:00
allo
cf1186597b utf fix from theli
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2412 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-16 15:26:04 +00:00
hydrox
9da3aa74d3 silly me, fix for the fix as advised by theli
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2408 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-13 17:26:32 +00:00
hydrox
bb3d9a5582 *) e.getMessage().indexOf() can only be used if the exception actually has a message.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2407 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-13 17:18:09 +00:00
hydrox
7a54010a9c *) Iterators can't be cast to IndexContainer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2406 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-13 17:08:39 +00:00
theli
5e0b6f8f83 *) sorting peer name list on Blacklist_p.html
*) restructuring of sharedBlacklist_p.java

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2405 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-13 13:29:50 +00:00
orbiter
cd5f7e137c fixed problem with NURL-generation upon first startup
(a new kelondroFlexTable was generated, which should not happen)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2402 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 23:24:50 +00:00
orbiter
8418af141a added several consistency checks and small changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2400 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 15:59:14 +00:00
theli
9d13aeca13 *) removing class. does not work so far
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2399 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 14:43:07 +00:00
theli
95a84ae469 *) adding missing classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2398 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 14:41:26 +00:00
theli
eee44be602 *) adding an interface for customized blacklist classes
   - now it's possible to use a customized blacklist engine
     instead of the default one
   - this can be done by configuring the property BlackLists.class
   See: http://www.yacy-forum.de/viewtopic.php?t=2108

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2397 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 14:28:14 +00:00
orbiter
6d2f15971a there is a very strange error that corrupts the kelondroRecords structure.
The cause is that the deleted-records-chain has wrong entries,
and one of the pointers in that chain points to a place beyond the end of the file.
This causes an IndexOutOfBoundsException within an IO operation.
I currently don't know why the deleted-records-chain is
corrupted, but the error can be caught. If this now happens with the
assortment database, the database is deleted.
See also:
http://www.yacy-forum.de/viewtopic.php?p=24586#24586

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2396 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 13:45:23 +00:00
theli
d2e8e76218 *) now it's possible to configure the yacy blacklist separately for dht, search, proxy, crawler
See: http://www.yacy-forum.de/viewtopic.php?t=2541
     http://www.yacy-forum.de/viewtopic.php?p=24516

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 02:42:10 +00:00
orbiter
9ae9062bd3 * disabled new kelondroFlex table for NURLs
* added new RAM index Class
* fixed possible synchronization problem in kelondroRecords


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2388 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 00:58:43 +00:00
orbiter
689bbcf9cd replaced kelondroTree db for NURLs by new kelondroFlexTable
The new database is only created if the old one is deleted or does not exist

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2387 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 23:36:58 +00:00
orbiter
7fbba41962 synchronization fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2386 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 23:04:36 +00:00
orbiter
328f9859a5 more synchronization in plasmaWordIndex
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2385 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 22:07:59 +00:00
orbiter
f43c90fa98 fixed handling of null referer in crawlOrder
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2384 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 21:46:34 +00:00
orbiter
130e6d4719 generalized index object for eurl, nurl and lurl to prepare move
of these tables to new kelondroFlexTable Object

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2382 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 17:37:54 +00:00
orbiter
acdf24877f more synchronization against OutOfMemoryError in wordIndex
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2381 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 16:27:56 +00:00
orbiter
95160d7f2c fixed size computation of index elements from the collection index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2380 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 16:01:18 +00:00
orbiter
26116cabde added missing rowdef assignment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 15:31:40 +00:00
orbiter
cfbacbbf08 reverted change in robotsParser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2378 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 15:29:29 +00:00