Commit Graph

215 Commits

Author SHA1 Message Date
(no author)
5141fa5942 combinedVersionString2PrettyString(..) renamed to combined2prettyVersion(..), new parameter "computerName" added to identify the source of problems
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2871 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-28 11:33:05 +00:00
orbiter
2a9d868f6d - removed object cache from kelondroTree
- generalized object caching and added new object caching class
- added object caching wherever kelondroTree was used
- added object caching also to usage of kelondroFlex
- added object buffering (a write cache) to NURLs
- added many assert statements; fixed bugs here and there
- added missing close methods to latest added classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2858 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-24 13:48:16 +00:00
orbiter
278d8c3c7e - more asserts
- bugfix for reading of previously deleted nodes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2845 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-23 00:59:55 +00:00
karlchenofhell
c5a5a9eb1c - patch for NullPointerException by Fuchs: see http://www.yacy-forum.de/viewtopic.php?p=27033#27033
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2840 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-22 07:09:45 +00:00
orbiter
1825540020 another fix for url-db migration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2834 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-21 12:23:06 +00:00
orbiter
11843bba7f fix for Malformed URL Exception in url migration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2825 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-20 13:50:00 +00:00
orbiter
8b56887676 removed unused code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2820 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-19 21:30:02 +00:00
orbiter
06854988da - full integration of new LURL database in INDEX
- added migration method for urlHash.db into INDEX

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2819 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-19 21:14:37 +00:00
orbiter
b79e06615d - added new LURL.Entry class for next database migration
- refactoring of affected classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2802 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-18 22:25:07 +00:00
orbiter
77a59a115d refactoring of indexing methods
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2787 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-16 15:04:16 +00:00
orbiter
a5dd0d41af - refactoring of plasmaCrawlLURL.Entry to prepare new Entry format
- added test migration method to migrate the old LURL to a new LURL
the new LURL will be split into different tables for each month
this solves several problems:
- the biggest table in YaCy is split into different parts and can
  also be managed in filesystems that are limited to 2GB
- the oldest entries can easily be identified, used for re-crawl and
  deleted
- The complete database can be limited to a specific size (as wanted many times)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2755 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-12 23:14:41 +00:00
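The monthly split described in commit a5dd0d41af can be pictured with a small in-memory sketch. This is not the YaCy implementation: the class name `MonthlyTables`, the `lurl-` table prefix, and the use of plain maps in place of database files are all assumptions made for illustration. It only shows why per-month tables make the oldest entries easy to find and let the total size be capped by dropping whole months.

```java
import java.time.YearMonth;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: one table per month, with maps standing in for database files.
final class MonthlyTables {
    // A TreeMap keeps the months sorted, so the oldest partition is always first.
    private final TreeMap<YearMonth, Map<String, String>> tables = new TreeMap<>();

    /** Table name for a month; the "lurl-" prefix is an assumption. */
    static String tableName(YearMonth month) {
        return "lurl-" + month; // YearMonth prints as e.g. 2006-10
    }

    void put(YearMonth loadMonth, String urlHash, String entry) {
        tables.computeIfAbsent(loadMonth, m -> new HashMap<>()).put(urlHash, entry);
    }

    /** Total number of entries across all monthly tables. */
    int size() {
        int n = 0;
        for (Map<String, String> table : tables.values()) n += table.size();
        return n;
    }

    /** The oldest month that still has a table, or null if there is none. */
    YearMonth oldest() {
        return tables.isEmpty() ? null : tables.firstKey();
    }

    /** Drops whole months, oldest first, until at most maxEntries remain. */
    int trimTo(int maxEntries) {
        int dropped = 0;
        while (size() > maxEntries && !tables.isEmpty()) {
            dropped += tables.pollFirstEntry().getValue().size();
        }
        return dropped;
    }

    public static void main(String[] args) {
        MonthlyTables t = new MonthlyTables();
        t.put(YearMonth.of(2006, 8), "hashA", "entryA");
        t.put(YearMonth.of(2006, 10), "hashB", "entryB");
        System.out.println("oldest table: " + tableName(t.oldest()));
        System.out.println("dropped: " + t.trimTo(1));
    }
}
```

Because each month lives in its own table, a size limit is enforced by discarding the first key of the sorted map rather than scanning one large table for old rows.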
rramthun
581dd2ec72 *)Proper arrow-function on Network.html, but ordering is still broken. Perhaps someone could fix that?
*)Removed double creation of DATA directory. New warning message in case of insufficient rights.
*) Removed roland-ramthun.de-seedlist temporarily, because of server changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2747 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-11 18:27:38 +00:00
orbiter
df1629b05a - code cleanup
- version 0.471
- moved surftipps to own web page


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-29 22:27:20 +00:00
theli
64b2ef5aae *) Trying to bugfix shutdown problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2639 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 10:13:23 +00:00
orbiter
40965e183e bugfix for minimizeurldb and urldbcleanup
see http://www.yacy-forum.de/viewtopic.php?p=25539#25539

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2580 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-14 10:12:41 +00:00
orbiter
b7e7808ea6 wordmigration now works also for new index database
if the new database is switched on, no 'too big' messages appear,
all the WORDS files can be completely migrated

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2553 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-12 08:23:47 +00:00
auron_x
005400a137 *) reverted last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2546 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-11 14:41:06 +00:00
auron_x
045ffebbd8 *) added debugline to versionstring-processing to find a possible bug in versiongeneration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2537 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-10 20:50:37 +00:00
orbiter
4866868c0e added write cache for LURLs
This was necessary to speed up the index receive process during global search


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2498 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-07 01:13:03 +00:00
auron_x
b515d49f87 *) fix for new combinedVersionString2PrettyString by bost
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2466 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-02 07:29:12 +00:00
auron_x
24316ba937 *) improved implementation of combinedVersionString2PrettyString by bost
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2465 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-30 16:04:40 +00:00
auron_x
57dda1a92c *)again fixing for wrong version display, now totally working with double instead of float
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2464 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-28 17:54:07 +00:00
auron_x
5e558fbaae *) hopefully fixed the wrong display of yacy-version
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2462 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-27 21:52:58 +00:00
orbiter
b7f4a1521b added options to switch on or off the kelondroFlexTable for NURL, EURL and PreNURL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2456 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 22:21:22 +00:00
orbiter
db1eae0227 * simplified initialization of database objects
* replaced kelondroTree for NURLs by kelondroFlex
* replaced kelondroTree for EURLs by kelondroFlex
take care, may be very buggy
please finish crawls before updating. crawls will be lost.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2452 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 02:19:25 +00:00
hydrox
1c99b5a484 *)fixed logging for urldbcleanup
*)changed exception handling in urldbcleanup so that it shows NullPointerException correctly
*)added more Blacklisting to urlcleaner

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2436 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-21 06:42:42 +00:00
orbiter
0187c60010 because of a bug in the JRE 1.4.2 there was no memory protection
see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4686462
this commit fixes the bug by using a memory-computation patch.
All uses of Runtime.maxMemory have been replaced by serverMemory.max
The bug is not present any more in Java 1.5

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-18 01:33:54 +00:00
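Commit 0187c60010 routes every memory query through one helper instead of calling Runtime.maxMemory directly. The log does not show how serverMemory.max computes its value, so the following is only a sketch of the pattern: the class name `MemoryInfo`, the `fallbackBytes` parameter, and the rule for treating a reported value as unusable are assumptions, not the actual patch.

```java
// Hypothetical sketch of a single access point for heap figures.
final class MemoryInfo {
    private static final Runtime RT = Runtime.getRuntime();

    private MemoryInfo() {}

    /**
     * Heap limit as reported by the JVM, or fallbackBytes if the reported
     * value looks unusable. Which values count as unusable is an assumption.
     */
    static long max(long fallbackBytes) {
        long reported = RT.maxMemory();
        boolean unusable = reported <= 0 || reported == Long.MAX_VALUE;
        return unusable ? fallbackBytes : reported;
    }

    /** Bytes of heap currently occupied by objects. */
    static long used() {
        return RT.totalMemory() - RT.freeMemory();
    }

    /** Bytes that may still be allocated before the limit is reached. */
    static long available(long fallbackBytes) {
        return max(fallbackBytes) - used();
    }

    public static void main(String[] args) {
        long fallback = 64L * 1024 * 1024; // assumed default of 64 MB
        System.out.println("max:       " + max(fallback));
        System.out.println("used:      " + used());
        System.out.println("available: " + available(fallback));
    }
}
```

With all callers going through one class, a JRE-specific workaround only has to be written, and later removed, in a single place.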
orbiter
314021453f * more logging
* option in yacy.init to set useCollectionIndex usage

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2374 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-10 21:21:50 +00:00
orbiter
279b1d969d Integrated new indexing data structure 'collections' into the main class
for indexing, the plasmaWordIndex.

The new data structure is ready-to-use, but currently disabled.
It can be activated by setting the static
plasmaWordIndex.useCollectionIndex
to true. This shall be done for testing purposes.

The new index is stored to
DATA/INDEX/PUBLIC/TEXT
The directory PLASMA shall be used only for crawler in the future.

Attention: during testing the data structure in INDEX may change,
and indexes created with the new data structure may become useless.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2348 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-05 22:22:14 +00:00
orbiter
c4e922885a replaced indexURLEntry by new class that uses a kelondroRow.Entry object
to store the index entry. This is another step to move to the new database structure.
A side effect of this change is that index storage uses much less RAM space,
which affects the index RAM cache.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2341 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-02 19:59:28 +00:00
theli
6e676224d0 *) adding support for upnp
A new port forwarding method for upnp was added.
   If this method is enabled, yacy automatically determines an UPnP 
   capable internet gateway and configures the gateway port forwarding
   settings properly. 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2328 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-26 14:26:45 +00:00
orbiter
417ed5102e redesign of database iterators:
an iteration of key elements in kelondroTree databases is no longer supported.
this is now replaced by an iteration of kelondroRow.Entry objects from the database
Iteration of keys from the database was mostly followed by retrieval of the row
from the database, which caused unnecessary database load.
The index selection was also redesigned to use the new row iteration methods.
This affects many functions, most importantly the DHT selection routine, which is now much faster.



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2327 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-26 11:21:51 +00:00
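The cost difference that commit 417ed5102e describes can be shown with an analogy on a sorted map. This is not kelondroTree code: `CountingTable` and its methods are hypothetical, and a `TreeMap` stands in for the database. Iterating keys and then fetching each row costs one extra lookup per element, while iterating rows directly costs none.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RowIterationDemo {
    /** Hypothetical table that counts get() calls so extra lookups become visible. */
    static final class CountingTable {
        private final TreeMap<String, String> rows = new TreeMap<>();
        int lookups = 0;

        void put(String key, String row) { rows.put(key, row); }

        String get(String key) {
            lookups++;
            return rows.get(key);
        }

        Iterable<String> keys() { return rows.keySet(); }

        Iterable<Map.Entry<String, String>> rowsInOrder() { return rows.entrySet(); }
    }

    /** Old style: iterate keys, then look up every row separately. */
    static List<String> viaKeys(CountingTable table) {
        List<String> out = new ArrayList<>();
        for (String key : table.keys()) out.add(table.get(key));
        return out;
    }

    /** New style: the iterator hands out the rows themselves. */
    static List<String> viaRows(CountingTable table) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> row : table.rowsInOrder()) out.add(row.getValue());
        return out;
    }

    public static void main(String[] args) {
        CountingTable table = new CountingTable();
        table.put("k1", "row1");
        table.put("k2", "row2");
        viaKeys(table);
        System.out.println("lookups after key iteration: " + table.lookups);
        viaRows(table);
        System.out.println("lookups after row iteration: " + table.lookups);
    }
}
```

Both methods return the same rows in the same order; only the number of lookups against the table differs.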
orbiter
ad692fc6c7 implemented option to extract nurls from the database
(plus some iteration enhancements for nurls)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2325 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-24 16:40:59 +00:00
orbiter
1ed3e2daef added option to extract domains and/or urls from the eurl database
when extracting from eurl, the html output format is recommended, since
this format adds also the fail reason to the domain/url.
The complete syntax for domain extraction is now
java -Xmx<megabytes>m -classpath classes yacy -domlist [ -source { lurl | eurl } ] [ -format { text  | zip | gzip | html } ] [ <path to DATA folder> ]


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2322 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-24 08:08:33 +00:00
orbiter
58df8b7bbf a large collection of different changes
* mainly for the transition to the new indexing database structure
* a bugfix for an endless loop inside kelondroTree iteration
* a bugfix for bulk read inside a kelondroTree iteration; the bug caused some elements to be iterated twice
* very strong speed enhancement for url/domain extraction

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2320 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-23 22:39:41 +00:00
orbiter
493b1cd2bf better logging for domain extraction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2319 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 11:43:56 +00:00
orbiter
685430a1b5 bugfix in new URL class, better logging for domain extraction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2317 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 11:33:01 +00:00
orbiter
c57b78722b added some more logging to domain extraction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2316 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 10:56:40 +00:00
orbiter
cc2be7fb43 fix for genurllist in case of bad urls
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2314 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 10:00:21 +00:00
allo
ff3f174a2d case insensitive commandline options
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2306 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-19 11:20:22 +00:00
allo
ff39a7a0d1 Overlay for welcome.*
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2299 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-18 11:07:30 +00:00
allo
8795875800 dirlisting for all empty directories.
no problem to update dir.java anymore, because it is only needed in htroot/htdocsdefault.
migration to delete old dir.* files in the fileshare

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2294 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-17 15:49:42 +00:00
orbiter
3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparison.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-13 01:21:53 +00:00
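The motivation in commit 3879a0ecd0 follows from documented behaviour of java.net.URL: its equals and hashCode methods compare hosts by resolving their names, which can block on DNS. The sketch below shows one way to compare URLs purely as strings. It is not the de.anomic.net.URL class; the name `PlainUrl`, the normalization rules, and the default-port table are assumptions, and parsing is delegated to java.net.URI, which performs no name resolution.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Objects;

// Hypothetical value class: URL comparison without any DNS lookup.
final class PlainUrl {
    private final String scheme;
    private final String host;
    private final int port;
    private final String path;

    PlainUrl(String spec) {
        URI uri = URI.create(spec); // parsing only, no network access
        this.scheme = Objects.requireNonNull(uri.getScheme(), "scheme").toLowerCase(Locale.ROOT);
        this.host = Objects.requireNonNull(uri.getHost(), "host").toLowerCase(Locale.ROOT);
        this.port = uri.getPort() >= 0 ? uri.getPort() : defaultPort(this.scheme);
        String rawPath = uri.getRawPath();
        this.path = (rawPath == null || rawPath.isEmpty()) ? "/" : rawPath;
    }

    // Assumed defaults for the two schemes this sketch cares about.
    private static int defaultPort(String scheme) {
        switch (scheme) {
            case "http": return 80;
            case "https": return 443;
            default: return -1;
        }
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PlainUrl)) return false;
        PlainUrl other = (PlainUrl) o;
        // Host names are compared as text, never resolved to addresses.
        return port == other.port
                && scheme.equals(other.scheme)
                && host.equals(other.host)
                && path.equals(other.path);
    }

    @Override
    public int hashCode() {
        return Objects.hash(scheme, host, port, path);
    }

    public static void main(String[] args) {
        PlainUrl a = new PlainUrl("http://Example.org/index.html");
        PlainUrl b = new PlainUrl("http://example.org:80/index.html");
        System.out.println(a.equals(b));
    }
}
```

A class like this can be used as a key in hash-based collections without the risk of a network round trip on every insert or lookup. The trade-off is that two host names pointing at the same address are treated as different URLs.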
orbiter
92f4cb4d73 added option to configure the start-up delay time for kelondro database files.
the start-up delay is used to pre-load the database node cache

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-03 23:57:33 +00:00
hydrox
53077f5835 *)fixed paths to yacy.logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2252 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-28 09:04:53 +00:00
orbiter
12af69dd86 cosmetics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2212 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-18 11:49:31 +00:00
theli
45b39ee1be *) solving unpacking problems with too long filenames by
   a) renaming the parent folder in the tgz file to yacy
      (can be configured via build properties file)
   b) reconfiguring build file to throw an error if a file
      name is too long
Please note that currently there is _no_ problem with too long
class names because of step a.

      

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2207 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-14 15:18:41 +00:00
theli
f01bd25489 *) Bugfix for OutOfMemory problem during minimizeUrlDB
See: http://www.yacy-forum.de/viewtopic.php?t=2498
*) out of date import functions removed (can be done via web gui)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2189 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-09 05:38:59 +00:00
orbiter
4a907a570f 1st step to migrate kelondroTree to usage of kelondroRow instead of byte[][]
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2162 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-31 23:31:46 +00:00
orbiter
eaa6f012f0 refactoring: better naming for classic DB (files in WORDS)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2151 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-28 11:59:16 +00:00