2290 Commits

Author SHA1 Message Date
theli
79f27c298e *) next bugfix for blacklist
- added items were not displayed properly
   - bugfix for disabled usage

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2395 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 12:18:39 +00:00
theli
4abc04dac0 *) make Blacklist_p.html more failsafe if no blacklist file is available or the user has entered empty strings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2394 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 12:03:44 +00:00
theli
28c84e5b57 *) Bugfix for NullPointerException in Blacklist_p if blacklist directory is empty
See: http://www.yacy-forum.de/viewtopic.php?p=24563

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2392 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 11:03:14 +00:00
theli
66f1eb07d9 *) Bugfix for IllegalArgumentException in transferURL
See: http://www.yacy-forum.de/viewtopic.php?p=24560

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2391 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 10:54:19 +00:00
(no author)
1cab72c93c small changes reflecting the de.lng version
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2390 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 10:22:39 +00:00
theli
d2e8e76218 *) now it's possible to configure the yacy blacklist separately for dht, search, proxy, crawler
See: http://www.yacy-forum.de/viewtopic.php?t=2541
        http://www.yacy-forum.de/viewtopic.php?p=24516

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 02:42:10 +00:00
orbiter
9ae9062bd3 * disabled new kelondroFlex table for NURLs
* added new RAM index Class
* fixed possible synchronization problem in kelondroRecords


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2388 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-12 00:58:43 +00:00
orbiter
689bbcf9cd replaced kelondroTree db for NURLs by new kelondroFlexTable
The new database is only created if the old is deleted or does not exist

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2387 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 23:36:58 +00:00
orbiter
7fbba41962 synchronization fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2386 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 23:04:36 +00:00
orbiter
328f9859a5 more synchronization in plasmaWordIndex
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2385 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 22:07:59 +00:00
orbiter
f43c90fa98 fixed handling of null referer in crawlOrder
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2384 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 21:46:34 +00:00
rramthun
23a99b8283 Small changes to the language
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2383 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 20:55:40 +00:00
orbiter
130e6d4719 generalized index object for eurl, nurl and lurl to prepare move
of these tables to new kelondroFlexTable Object

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2382 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 17:37:54 +00:00
orbiter
acdf24877f more synchronization against outOfMemoryError in wordIndex
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2381 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 16:27:56 +00:00
orbiter
95160d7f2c fixed size computation of index elements from the collection index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2380 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 16:01:18 +00:00
orbiter
26116cabde added missing rowdef assignment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 15:31:40 +00:00
orbiter
cfbacbbf08 reverted change in robotsParser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2378 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 15:29:29 +00:00
orbiter
abf22f6e60 removed url normalform computation from htmlFilterContentScraper.
This method was implemented in de.anomic.net.URL


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2377 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 15:09:22 +00:00
orbiter
740d49751d * strict type and size check in kelondroRow handling
* adapted all code to use the declaration form of kelondroRow
* fixed a bug in kelondroRow which caused wrong parsing of the encoding type
* the bug caused bad database behaviour in the new indexCollection data structure.
  Because of this bug, all existing test databases are now void; a new database is created
* the kelondroFlexTable and indexCollection data structures now store a declaration of the row definition
  in a properties file alongside the database files.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2375 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 03:20:44 +00:00
orbiter
314021453f * more logging
* option in yacy.init to set useCollectionIndex usage

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2374 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-10 21:21:50 +00:00
allo
a52f36787f better template debugging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2371 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-10 14:02:03 +00:00
allo
cbfba2026d removing the prefix option.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2370 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-09 17:51:20 +00:00
allo
3480d36417 added some debug code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2369 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-09 16:57:36 +00:00
orbiter
61b151b083 * added another auto-fix for collection index inconsistency check
* fixed words size computation for collection index


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2368 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-08 00:52:04 +00:00
orbiter
0bbbd129ef small fix for exception message
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2367 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 23:52:12 +00:00
orbiter
718fbc2dae enhancements in kelondroCollectionIndex:
* synchronized array and index objects
* auto-fix function for slightly corrupted index entries
* generalized internal access methods

also extended kelondroIndex interface to support ordering access
which is used in kelondroCollectionIndex for string comparisons

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2366 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 23:29:26 +00:00
orbiter
ec5149ff3b fix for busyCacheFlush detection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2365 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 22:28:09 +00:00
orbiter
f58283def2 better control of index flush
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2364 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 22:07:17 +00:00
orbiter
4be21a3cab oops
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2363 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 21:56:02 +00:00
orbiter
80b6c90d54 enhancements to prevent blocking during dht transfer receive
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2362 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 21:49:39 +00:00
auron_x
4fb8fddd99 *) made the domain list of the blacklist sorted
If a new domain is added, it is still appended to the end of the list and only sorted in on the next refresh; this may need a fix.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2361 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 17:34:27 +00:00
theli
9f298083cd *) adding more urls to the error url
- old error strings were replaced with their corresponding constants
   See: http://www.yacy-forum.de/viewtopic.php?t=2638

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2360 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 15:11:14 +00:00
hermens
d56f06401e - Cache known URLs during indexReceive to avoid getting blocked during loadedURL.exists() whenever possible
- Small logging updates



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2359 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 11:42:00 +00:00
theli
c09f734d06 *) offer router configuration on ConfigBasic.html
- checkbox to allow router configuration is shown if
   - a) the UPnP forwarder is installed
   - b) a UPnP enabled router was found
   - c) no other forwarder was configured
   See: http://www.yacy-forum.de/viewtopic.php?p=24264

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2358 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 11:31:18 +00:00
hermens
dcbb4d0a6b Display the size of HashBlacklistedCache on PerformanceMemory page.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2357 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 11:19:54 +00:00
theli
c7b6389ca1 *) renaming indexDistribution.dhtReceiptLimitEnabled property to indexDistribution.transferRWIReceiptLimitEnabled
so that the default value is taken over by all peers


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2356 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 11:01:01 +00:00
theli
0baadcadca *) enable indexDistribution.dhtReceiptLimitEnabled limit by default
See: http://www.yacy-forum.de/viewtopic.php?p=24425

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2355 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 10:51:24 +00:00
orbiter
d799622da1 better flush limit for index collections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2354 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 00:44:43 +00:00
orbiter
d468d665c9 some changes that may help to prevent deadlocks that cause an OutOfMemoryError
as described in
http://www.yacy-forum.de/viewtopic.php?p=24359

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2353 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 00:19:01 +00:00
theli
988341cf81 *) some comments added
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2352 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-06 14:44:04 +00:00
theli
d54767f634 *) last step of removing embedded html from dir class
- migration finished
*) dir list now sorts the dirlist entries. 
   - directories are listed before files
   - files are sorted alphabetically, case insensitive 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2351 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-06 14:38:07 +00:00
rramthun
96b774e427 Adding link to newsletters as agreed in forum.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2350 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-06 10:22:53 +00:00
theli
8283df2d77 *) first step of removing embedded html from dir class
- dir list generation uses templates now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2349 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-06 08:09:39 +00:00
orbiter
279b1d969d Integrated new indexing data structure 'collections' into the main class
for indexing, the plasmaWordIndex.

The new data structure is ready-to-use, but currently disabled.
It can be activated by setting the static
plasmaWordIndex.useCollectionIndex
to true. This shall be done for testing purposes.

The new index is stored to
DATA/INDEX/PUBLIC/TEXT
The directory PLASMA shall be used only for crawler in the future.

Attention: during testing the data structure in INDEX may change,
and indexes created with the new data structure may become useless.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2348 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-05 22:22:14 +00:00
orbiter
4ff742e42d implemented indexCollectionRI
this is the new database structure that is supposed to replace the
plasmaAssortmentCluster AND the plasmaWordIndexFileCluster
The new structure is not yet active and needs to be integrated into
plasmaWordIndex. This has some migration constraints that are not yet
completely solved.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2347 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-05 19:18:33 +00:00
allo
132cd7da45 no need to copy dir.*
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2346 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-05 18:39:25 +00:00
allo
0164321160 fix for the actions (uploading/deleting, logging in, ...)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-05 18:34:31 +00:00
orbiter
01f95eccd3 re-write of kelondroCollectionIndex. This is the data structure that
shall replace the current assortment files.
* used the kelondroFlexTable to hold the index of collections
* used kelondroRow definitions to declare all data structures
* fixed several bugs that appeared in kelondroRowSet and kelondroRowCollection during testing


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2344 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-04 23:04:03 +00:00
orbiter
ebc2233092 * implemented (finished) class indexRowSetContainer
* replaced indexTreeMapContainer by indexRowSetContainer
* deleted indexTreeMapContainer and abstract class
This is another step to the new database structure

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2343 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-02 23:20:03 +00:00
orbiter
9183d21f25 renamed new index class to old name
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2342 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-02 20:01:59 +00:00