151 Commits

Author SHA1 Message Date
orbiter
069562a14d fixed problem with re-crawl; replaced error file-db with ram-db
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3900 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-15 23:47:08 +00:00
orbiter
139c59ebbd - fixed dht selection problem: the seed tables used a wrong ordering
- cleaned some code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3693 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-09 17:59:36 +00:00
orbiter
7f56c8d4aa fixed some seed selection details
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3685 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-07 22:22:35 +00:00
orbiter
81844e85b2 - fixed more cluster routing problems
- fixed a problem in remote search when balancer caused shift process to wait too long

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3627 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-30 00:39:53 +00:00
orbiter
e48189c710 enhanced cluster routing
- cluster definitions can now contain an addition for local ip addresses
- cluster-cluster communication uses the local ip address instead of the global address, if one is given

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3624 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-29 22:05:34 +00:00
orbiter
b33cef421e better routing for public clusters
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3620 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-29 00:08:38 +00:00
orbiter
f8de19fb2f robinson cluster: added client-side protocol implementation
- the network configuration page shows a new option: robinson clusters
- when a global search is made, all robinson peers are excluded, but:
- robinson peers/clusters that provide peer tags are included in global search
  when the search words match such tags. Therefore, robinson peers/clusters
  support the global yacy network with their indexes, without doing DHT-exchange


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3598 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-26 09:51:51 +00:00
rramthun
e6fb6426a3 *) Some cosmetic changes and corrections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3582 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-19 16:16:54 +00:00
orbiter
25070822a5 fix for http://www.yacy-forum.de/viewtopic.php?p=33925#33925
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3551 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-05 19:08:59 +00:00
orbiter
159bd0cab5 miscellaneous; b.o. fix for http://www.yacy-forum.de/viewtopic.php?p=33914#33914
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3549 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-05 14:58:29 +00:00
orbiter
40c14a4f0e - better implementation of search query properties
- basic protection against start-up problems when database files are corrupted
- auto-delete of non-critical databases during startup when load error occurs
- on-the-fly reset option for all database tables
- automatic on-the-fly reset for seed tables during enumeration exceptions

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3547 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-05 10:14:48 +00:00
orbiter
6ad39bae1e fixed shutdown problem
this fixes the 'inconsistency' messages during start-up

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3457 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-09 08:48:47 +00:00
orbiter
38b93f8cb8 bugfix for my last commit:
iterator did not consider secondary start point in case of rotation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3456 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-08 22:07:17 +00:00
orbiter
1cba31de43 redesigned ram organization for database caches
- each cache can now allocate as much memory as is available
- no more fixed limits
- replaced old performance memory monitor by new one
- added supervision methods as static functions into the classes that provide cache functionality
- ram allocation is steered by two simple limits that are relative to the available ram


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-06 22:43:32 +00:00
theli
26450a1d9a *) avoid nullpointerException on seed.getAddress() (reported by netbude)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-06 16:11:36 +00:00
karlchenofhell
6fbe31425a - some code-cleanup (no more syntax-warnings here)
- added deletion from loadedURLs of URLs to be blacklisted in IndexControl_p

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 12:56:50 +00:00
orbiter
2d7f7da7ce fix for null pointer exception
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3342 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-06 09:50:24 +00:00
orbiter
b2f4087400 redesign of last-seen field inside seed:
the field now contains a time in UTC+0 (instead of relative to the local UTC offset)
this fixes a bug in peer selection, where an iteration over all seeds
ordered by lastseen did not work correctly.
Problems may occur because the new meaning of this field may mix with
the different meaning of that field in older peers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3322 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-02 23:54:27 +00:00
orbiter
fcc11391a8 some redesign attempts because sorting of lastseen does not work correctly
not finished yet
target: better selection of peer-ping targets, which should enhance stabilization of the net

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3319 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-02 13:12:31 +00:00
orbiter
f696d3c1eb added double computation to kelondroMapObjects
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3316 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-01 09:48:31 +00:00
orbiter
306c50ac40 QPM (queries per minute) statistic stub
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3308 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 15:39:11 +00:00
orbiter
9c05e2a820 re-design of kelondroMap
- this class is replaced by an object that can hold any type of object
- this object must be defined as a class that implements kelondroObjectsEntry
- the kelondroMap is now implemented as kelondroMapObjects

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3297 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-29 23:51:10 +00:00
orbiter
d07b132a0d - fixed colors of network graphic
- added option to activate write cache for seed-db
- did not activate write cache because it did not work

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3236 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 19:39:31 +00:00
(no author)
37e53b4a6a replaced tree database structure for seed db by flex data structure
I don't know if this helps, we will find out...

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3177 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-07 23:34:13 +00:00
orbiter
d0c32c6aeb better protection against fraud peers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3104 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-20 01:07:49 +00:00
orbiter
f4b547dc13 limited index transfer to peer with version 0.486
this protects peers with version below 0.486 from new RWI objects
(which they cannot handle)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2988 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-20 02:46:53 +00:00
orbiter
ba967c4875 - bugfixes and debug code
- new generalized index class indexCachedRI

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2930 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-07 01:09:02 +00:00
orbiter
215c4e65f1 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2887 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-31 22:10:25 +00:00
orbiter
985fd807cc bugfixing in collection methods
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2882 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-30 02:39:39 +00:00
orbiter
147d88cf23 re-design of database caching
this should reduce IO a lot, because write caches are now activated for all databases
- added new caching class that combines a read- and write-cache.
- removed old read and write cache classes
- removed superfluous RAM index (can be replaced by kelondroRowSet)
- adapted all current classes that used the old caching methods
- more asserts, more bugfixes


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2865 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-26 13:50:50 +00:00
theli
df49724f28 *) better error handling for seed upload - test download - problems
See: http://www.yacy-forum.de/viewtopic.php?p=26814#26814

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2812 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-19 10:10:53 +00:00
orbiter
bcf2b800b4 applied UTF-8 encoding parameter to yacy-internal protocol communication
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2694 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-02 13:35:38 +00:00
orbiter
5a40ea7866 refactoring of wget string list generation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2692 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-02 09:59:20 +00:00
orbiter
df1629b05a - code cleanup
- version 0.471
- moved surftipps to own web page


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-29 22:27:20 +00:00
orbiter
db1eae0227 * simplified initialization of database objects
* replaced kelondroTree for NURLs by kelondroFlex
* replaced kelondroTree for EURLs by kelondroFlex
take care, may be very buggy
please finish crawls before updating. crawls will be lost.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2452 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 02:19:25 +00:00
orbiter
23dd972608 fixed memory calculation in performanceMemory web page
fixed also maximum cache size computation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2429 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-20 01:20:34 +00:00
orbiter
6ad471ef96 * applied many compiler warning recommendations
* cleaned up code
* added unit test code
* migrated ranking RCI computation to kelondroFlex and kelondroCollectionIndex


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-16 19:49:31 +00:00
orbiter
3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparison.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-13 01:21:53 +00:00
orbiter
92f4cb4d73 added option to configure the start-up delay time for kelondro database files.
the start-up delay is used to pre-load the database node cache

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-03 23:57:33 +00:00
orbiter
66964dc015 removed high/med/low from kelondroRecords cache control.
this was done because testing showed that cache-delete operations
slowed down record access most, even more than actual IO operations.
Cache-delete operations appeared when entries were shifted from low-priority
positions to high-priority positions. During a fill of x entries to a database,
x/2 delete situations happen, each of which caused two or more delete operations.
removing the cache control means that these delete operations are not
necessary any more, but it is more difficult to decide which cache elements
shall be removed in case that the cache is full. There is not yet a stable
solution for this case, but the advantage of a faster cache is more important
than the flush problem.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2244 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-25 10:31:38 +00:00
orbiter
4d8f8ba384 added cache-performance analysis for node caches
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2140 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-24 09:40:08 +00:00
orbiter
82b2bc6932 patch for index-transfer DoS problem
see http://www.yacy-forum.de/viewtopic.php?p=21627#21627
note that this function will make the index-transfer functionality void

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2114 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-18 22:24:51 +00:00
auron_x
55ea4cbfe6 *)reverted patch for memory-display issue
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2095 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-14 18:09:28 +00:00
auron_x
53d9ab6db7 *)fixed bug in PerformanceMemory_p.java which caused negative memory-values on big peers
see http://www.yacy-forum.de/viewtopic.php?t=2370

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2091 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-14 08:42:41 +00:00
orbiter
fd7c17e624 added virtual host support:
all yacy-to-yacy communication now sends the <peer-hexhash>.yacyh
virtual domain inside the http 'Host' property field.
This shall enable running a yacy peer on a virtual host.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-09 13:11:00 +00:00
orbiter
29b1b0823c added monitoring of new object cache to performanceMemory page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2072 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-09 10:03:12 +00:00
orbiter
488a0ed580 replaced old keyIterator and rowIterator by buffered iterators
that are synchronized with database access
Main change is done in kelondroTree, other classes are only adaptations

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1918 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 23:43:24 +00:00
borg-0300
149409ba5c move description -> javadoc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1716 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-20 23:15:39 +00:00
theli
9b941fb773 *) bugfix for usage of yacy with extended port binding (e.g. #eth0:8080, 192.168.0.1:8080, etc.)
- port was reported incorrectly to other peers


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1678 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-17 10:53:20 +00:00
orbiter
218cd6561c fixed problem with wrong hash length in file share
see: http://www.yacy-forum.de/viewtopic.php?p=16565#16565

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1658 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-15 22:12:53 +00:00
hermens
bb1664b63e *) Remove workaround from SVN 1472: It is not needed anymore
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1500 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-31 00:45:48 +00:00
orbiter
3419b3bcdd fix for bug that caused the peer-counter problem.
See http://www.yacy-forum.de/viewtopic.php?p=16016#16016
The kelondroDyn now uses a generic fill character.
kelondroDyn-Tables containing peer/word/url-hashes must not use '_'
as fill character.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1498 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-30 22:16:55 +00:00
hermens
2d1283da34 This is an extremely ugly workaround for an incompatibility between yacySeed hashes and kelondroDyn keys
See: http://www.yacy-forum.de/viewtopic.php?p=15955#15955



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1472 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 15:26:56 +00:00
hermens
ad0de69607 Yet another bug fix for svn 1441. It should work now.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1443 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-25 13:32:04 +00:00
hermens
58fd40e1c1 Aaargh
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1442 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-25 13:28:03 +00:00
hermens
b08af0c2cb *) Force download of seed file when checking upload success
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1441 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-25 13:25:48 +00:00
orbiter
b3dca06bb1 added location column to network pages.
The location is computed from the userAgent string of connecting peers.
Therefore this information is not available right after start-up.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1241 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-22 01:01:46 +00:00
orbiter
0c762daf4b better startup failure handling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1205 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-12 23:59:58 +00:00
orbiter
3d8a5ae652 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1166 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-05 14:24:13 +00:00
orbiter
a04930f025 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1158 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-04 23:51:28 +00:00
orbiter
6e81f2580d try to fix bug with storage of settings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1058 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-11 08:41:13 +00:00
orbiter
79818a320f introduced citation-rank transmission protocol and activate transport for anonymisation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1055 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-10 23:48:20 +00:00
borg-0300
e3179a6394 added getOwnSeedFile()
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1022 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-03 14:07:58 +00:00
theli
e58e85363d *) Bugfix for ConcurrentModificationException while operating on seed properties
*) Bugfix for YACY database inconsistency (no more elements available in db '...seed.new.db'), re-set of db.
   See: http://www.yacy-forum.de/viewtopic.php?p=11836#11836
        http://www.yacy-forum.de/viewtopic.php?p=11814#11814

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@995 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-28 07:15:00 +00:00
orbiter
8d827cdb30 tried to fix problems with order of network list by last-seen (which could also improve the network picture)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@980 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-24 14:07:43 +00:00
theli
446e7e8bef *) Bugfix for Seed-Upload - Permission denied problem
See: http://www.yacy-forum.de/viewtopic.php?p=11648#11648

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@978 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-24 08:54:46 +00:00
theli
02d9af1a70 *) Restructuring and extending of Remote Proxy Support
- remote proxy configuration can now be "really" changed on the fly and takes effect immediately
   - adding possibility to disable remote proxy usage for yacy->yacy communication
   - adding possibility to disable remote proxy usage for ssl
   - restructuring proxy configuration so that it is stored in a single place now

*) Adding possibility to import a foreign word DB (or even more of them in parallel) 
   at runtime into the peers DB
   - this can be done by calling IndexImport_p.html 
   - ATTENTION: please note that at the moment this thread must be aborted via gui
     before a normal server shutdown is done.
   - TODO: integrating IndexImport Thread into normal server shutdown
   - TODO: Adding possibility to import crawl-queues, etc. from foreign peers
   - TODO: removing old import function from yacy.java and calling the new routines instead

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@968 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-22 13:28:04 +00:00
borg-0300
e642a5d8b7 more constants
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@947 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-17 15:46:12 +00:00
theli
c8a35a0130 *) Adding new connection tracking page (currently only for incoming connections)
*) Displaying statistic for incoming connections on status page
*) Bugfix for Loop-Access Bug when trying to access the yacy page while yacy is configured as proxy
   See: http://www.yacy-forum.de/viewtopic.php?p=6826
*) Bugfix for Referer Bug
   See: http://www.yacy-forum.de/viewtopic.php?p=11098#11098
*) Adding reverse Name lookup for yacy-domain names (used by the connection tracking page)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@916 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-12 08:17:43 +00:00
orbiter
e85989510a update to network image; added disconnected peers by disconnection time and changed colors
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@890 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-09 17:20:40 +00:00
theli
a2fa75e688 *) Asynchronous queuing of crawl job URLs (stackCrawl)
various checks like the blacklist check or the robots.txt disallow check are now
   done by a separate thread to unburden the indexer thread(s)
   TODO: maybe we have to introduce a threadpool here if it turns out that this single
         thread is a bottleneck because of the time consuming robots.txt downloads

*) improved index transfer
   The index selection and transmission is done in parallel now to improve index 
   transfer performance.
   TODO: maybe we could speed up performance by using multiple transmission threads in 
         parallel instead of only a single one.

*) gzip encoded post requests
   it is now configurable whether a gzip encoded post request should be sent on
   index transfer/distribution

*) storage Peer (very experimental and not optimized yet)
   Now it's possible to send the result of the yacy indexer thread to a remote peer
   instead of storing the indexed words locally.
   This can be done by setting the property "storagePeerHash" in the yacy config file
   - Please note that if the index transfer fails, the index is stored locally.
   - TODO: currently this index transfer is done by the indexer thread.
     To speed up the indexer
     a) this transmission should be done in parallel and
     b) multiple chunks should be bundled and transferred together


*) general performance improvements  
   - better memory cleanup after http request processing has finished
   - replacing some string concatenations with stringBuffers
   - replacing BufferedInputStreams with serverByteBuffer
   - replacing vectors with arraylists wherever possible
   - replacing hashtables with hashmaps wherever possible
   This was done because function calls to vector or hashtable functions
   take 3 times longer than calls to functions of arraylists or hashmaps.
   TODO: we should take a look at the class serverObject, which inherits from hashmap
         Do we really need synchronization for this class?
   TODO: replace arraylists with linkedLists if random access to the list elements is not needed

*) Robots Parser supports if-modified-since downloads now
   If the downloaded robots.txt file is older than 7 days the robots parser tries to
   download the robots.txt with the if-modified-since header to avoid unnecessary downloads
   if the file was not changed. Additionally the ETag header is used to detect changes.

*) Crawler: better handling of unsupported mimeTypes + FileExtension

*) Bugfix: plasmaWordIndexEntity was not closed correctly in 
   - query.java
   - plasmaswitchboard.java

*) function minimizeUrlDB added to yacy.java 
   this function tests the current urlHashDB for unused urls
   ATTENTION: please don't use this function at the moment because
              it causes the wordIndexDB to flush all words into the
              word directory!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@853 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-05 10:45:33 +00:00
orbiter
7fc822a59b changed handling of time-zones
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@801 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-27 16:28:55 +00:00
orbiter
495bc8bec6 removed cache-control from low and medium priority caches which reduces memory use and computation overhead
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@774 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-22 20:01:26 +00:00
orbiter
71a31f0902 integrated and extended new memory performance menu; found and fixed bug in DHT caching
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@752 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-20 10:54:20 +00:00
orbiter
fb52a82008 added new performance page for memory settings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@751 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-20 10:10:34 +00:00
theli
177e8af5b7 *) Bugfix for ConcurrentModification in kelondroAbstractRA.writeMap caused by yacySeed.getMap()
See: http://www.yacy-forum.de/viewtopic.php?p=9523

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@695 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-10 05:59:12 +00:00
theli
4fd5b95b1f *) Renaming Logger function names to reflect the proper Java Logging API Loglevels
- please use logFine instead of logDebug
   - please use logSevere instead of logFailure and logError
   See: http://www.yacy-forum.de/viewtopic.php?p=8726#8726

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@615 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-30 21:32:59 +00:00
theli
6adf8a4bde *) Renaming Logger function names to reflect the proper Java Logging API Loglevels
- please use logFine instead of logDebug
   - please use logFailure instead of logError
   See: http://www.yacy-forum.de/viewtopic.php?p=8726#8726

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@614 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-30 21:10:39 +00:00
theli
3dfda1c9da *) More verbose output on ftp-seed-upload failure
See: http://www.yacy-forum.de/viewtopic.php?p=8000#8000

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-30 12:18:41 +00:00
allo
66ebce1109 use staticIP more often
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@592 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-28 16:55:52 +00:00
theli
115c4edfcc *) Adding additional logging statements to help debugging seed-upload problems
See: http://www.yacy-forum.de/viewtopic.php?t=975&highlight= 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@561 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-19 09:32:30 +00:00
orbiter
ba0a486328 moved printStackTrace() to logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@539 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-14 23:35:18 +00:00
orbiter
40da910f41 bugfixes and automatic news-cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@481 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-02 16:03:35 +00:00
orbiter
d34eb23e4e fixed news; added news appearance on Network and IndexCreate page; added intention string to global crawl
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@466 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-01 01:12:02 +00:00
orbiter
e24dbde217 better logging for WRONG seed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@463 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-31 11:11:29 +00:00
orbiter
f663f26cfd catch of another IOException
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@451 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-28 17:03:13 +00:00
orbiter
af67c633d5 doc-changes and more strict brute-force handling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-25 09:56:54 +00:00
orbiter
0f663bcebf added global ppm computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@407 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-18 15:22:14 +00:00
orbiter
858cd94299 replaced indexing ram-queue by file-based stack-queue
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@381 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-06 14:48:41 +00:00
theli
d53b2393e5 *) autoconfig.java: ip address was not reported correctly when port-forwarding is on
*) hello.java: reportedip may be empty at peer startup
*) httpc.java: adding method to determine if the connection was already closed or is broken
*) httpdProxyHandler.java: trying to do a better errorhandling
*) server/serverCore.java
- setting myseed ip-address and port correctly if port-forwarding is on
- doing a more failsafe close and adding some debugging output
*) yacyClient.java: adding some logging statements to allow a better detection of 
   "degraded to senior"-bug
*) yacyCore.java: restructuring publishMySeed
   (@Orbiter: please take a look)
- to avoid busy waiting
- to allow a graceful shutdown on server shutdown
- new seed count was not calculated correctly in the previous version
*) yacySeedDB.java: host ip and port was not initialized correctly if port-forwarding
   was activated

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@318 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-23 11:00:26 +00:00
theli
dff937a9a3 *) Adding some logging statements to detect problems with seed upload functionality
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@297 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-17 08:38:22 +00:00
orbiter
5a490aa065 fixed html parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@289 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-16 21:49:56 +00:00
theli
9a98988c3c *) Bugfix for SSL/NIO Bug
See: http://www.yacy-forum.de/viewtopic.php?t=516
   - removing NIO from server/serverCore.java because of massive problems
     with socket close issues
*) Adding support for remote port forwarding via sch
   @Orbiter: Please take a look into
   - hello.java
   - server/serverCore.java.publicIP()
   - yacy/yacyClient.java.publishMySeed(...)
*) Making startup loading of additional content parsers more failsafe


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@281 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-16 07:28:07 +00:00
mkossin
3dd3431c58 Fixes a Problem with mySeed.txt. If this file is accessible but has no content it will be recreated: http://www.yacy-forum.de/viewtopic.php?p=3657#3657
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@202 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-31 13:58:30 +00:00
theli
0e1d9e9722 *) shrinking httpc linebuffer when httpc is returned to pool. This is done to free memory
*) Making Seed-Upload configuration more verbose.
*) Some Changes in SOAP Search API (not finished yet).

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@158 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-23 10:10:51 +00:00
theli
361f05978d Multiple updates regarding the yacy seedUpload facility,
optional content parsers, thread pool configuration ...

Please help me test whether everything works correctly.

*) Migration of yacy seedUpload functionality
See: http://www.yacy-forum.de/viewtopic.php?t=256
- new uploaders can now be easily introduced because of a new modular uploader system
- default uploaders are: none, file, ftp
- adding optional uploader for scp
- each uploader provides its own configuration file that will be 
  included into the settings page using the new template include feature
- Each uploader can define its libx dependencies. If not all needed libs are
  available, the uploader is deactivated automatically.

*) Migration of optional parsers
See: http://www.yacy-forum.de/viewtopic.php?t=198
- Parsers can now also define their libx dependencies
- adding parser for bzip compressed content
- adding parser for gzip compressed content
- adding parser for zip files
- adding parser for tar files
- adding parser to detect the mime-type of a file
  this is needed by the bzip/gzip Parser.java
- adding parser for rtf files
- removing extra configuration file yacy.parser
  the list of enabled parsers is now stored in the main config file

*) Adding configuration option in the performance dialog to configure
See: http://www.yacy-forum.de/viewtopic.php?t=267
- maxActive / maxIdle / minIdle values for httpd-session-threadpool
- maxActive / maxIdle / minIdle values for crawler-threadpool

*) Changing Crawling Filter behaviour
See: http://www.yacy-forum.de/viewtopic.php?p=2631

*) Replacing some hardcoded strings with the proper constants of the httpHeader class

*) Adding new libs to libx directory. These libs are
- needed by new content parsers
- needed by new optional seed uploader
- needed by SOAP API (which will be committed later)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@126 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-17 08:25:04 +00:00
theli
2aa5fe8f50 *) Import statements reorganized
Now it's easier to determine which class really uses which other class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@82 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-05 05:32:19 +00:00
orbiter
f99930c04b fixed brute-force + peer-disconnect - Bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@75 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-01 23:31:21 +00:00
orbiter
b9203bdb50 bug fixes and code cleaning
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@22 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-15 14:18:14 +00:00
orbiter
e374aca2cd enhanced exception handling in kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@14 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-12 15:45:50 +00:00