276 Commits

Author SHA1 Message Date
orbiter
d755a8026d - better OOM protection
- better memory allocation for FlexTable indexes
- splitting between static index and dynamic index (only the dynamic part must grow)
- to enable a merge-iteration of the newly split index, a large number of classes had to be adapted to the new iterator classes
- added new iterator classes that support cloneable iterators
- adapted all iterator classes to implement cloneable iterators

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-08 16:15:40 +00:00
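
The merge-iteration over a static and a dynamic index part mentioned in this commit can be sketched as follows. This is only an illustration under stated assumptions: `CloneableIterator`, `ListCursor`, `MergeCursor` and `copy()` are hypothetical names, both inputs are assumed to be sorted, and null elements are not supported. It is not the actual YaCy class design.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical interface: an iterator that can hand out an independent copy
// positioned at the same element.
interface CloneableIterator<E> extends Iterator<E> {
    CloneableIterator<E> copy();
}

// Cursor over an in-memory sorted list (stands in for one index part).
final class ListCursor<E> implements CloneableIterator<E> {
    private final List<E> data;
    private int pos;

    ListCursor(List<E> data, int pos) { this.data = data; this.pos = pos; }

    public boolean hasNext() { return pos < data.size(); }

    public E next() {
        if (!hasNext()) throw new NoSuchElementException();
        return data.get(pos++);
    }

    public CloneableIterator<E> copy() { return new ListCursor<E>(data, pos); }
}

// Merges two sorted cursors (e.g. static part and dynamic part) into one ordered stream.
final class MergeCursor<E extends Comparable<E>> implements CloneableIterator<E> {
    private final CloneableIterator<E> a, b;
    private E headA, headB; // next pending element of each side, null when exhausted

    MergeCursor(CloneableIterator<E> a, CloneableIterator<E> b) {
        this(a, b, a.hasNext() ? a.next() : null, b.hasNext() ? b.next() : null);
    }

    private MergeCursor(CloneableIterator<E> a, CloneableIterator<E> b, E headA, E headB) {
        this.a = a; this.b = b; this.headA = headA; this.headB = headB;
    }

    public boolean hasNext() { return headA != null || headB != null; }

    public E next() {
        if (!hasNext()) throw new NoSuchElementException();
        boolean takeA = headB == null || (headA != null && headA.compareTo(headB) <= 0);
        E result;
        if (takeA) { result = headA; headA = a.hasNext() ? a.next() : null; }
        else       { result = headB; headB = b.hasNext() ? b.next() : null; }
        return result;
    }

    // Copies both underlying cursors so the clone advances independently.
    public CloneableIterator<E> copy() {
        return new MergeCursor<E>(a.copy(), b.copy(), headA, headB);
    }

    public static void main(String[] args) {
        CloneableIterator<Integer> merged = new MergeCursor<Integer>(
            new ListCursor<Integer>(Arrays.asList(1, 4, 6), 0),
            new ListCursor<Integer>(Arrays.asList(2, 3, 7), 0));
        while (merged.hasNext()) System.out.println(merged.next());
    }
}
```

The point of `copy()` in this sketch is that a merged view can only be duplicated if every underlying iterator can be duplicated too, which is one plausible reason why so many iterator classes had to be touched.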
orbiter
5d5e6ebfcc fix for http://www.yacy-forum.de/viewtopic.php?p=32631#32631
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3436 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-07 08:54:07 +00:00
orbiter
51e12049fa third generation of R/W head path optimization
- data from collection arrays are read in order
- merged data is written in order

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-28 11:13:23 +00:00
karlchenofhell
6fbe31425a - some code-cleanup (no more syntax-warnings here)
- added deletion from loadedURLs of URLs to be blacklisted in IndexControl_p

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 12:56:50 +00:00
orbiter
dc0c06e43d PLEASE MAKE A BACK-UP OF YOUR COMPLETE DATA DIRECTORY BEFORE USING THIS
redesign for better IO performance
improved database seek time by avoiding write operations at distant
positions of a database file. Until now, a USEDC counter was written
to the head section of a kelondroRecords database file (which is the
basic data structure of all kelondro database files) to store the
actual number of records contained in the database. Now, this
value is computed from the database file size. This is done either
only once at start-time, or continuously when running with asserts enabled.
The counter is then updated only in RAM and written when the file is
closed. If the close fails, the correct number can be computed from the
file size, and if this is not equal to the stored number, it is strong
evidence that YaCy was not shut down properly.
To preserve consistency, the complete storage-routine had to be re-written.
Another change speeds up node reads in cases where the data tail
can be read together with the data head. This saves another IO lookup during
each DB node fetch.
Also includes many small bugfixes.
IF ANYTHING GOES WRONG, ALL YOUR DATA IS LOST: PLEASE MAKE A BACK-UP

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3375 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 08:35:51 +00:00
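
The idea of deriving the record count from the file size, as described in this commit, can be sketched as below. This is a minimal illustration only: `RecordCountSketch`, `HEADER_SIZE` and `RECORD_SIZE` are hypothetical, and the real kelondroRecords file layout is not reproduced here. The sketch assumes a fixed-size head section followed by fixed-size records.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

final class RecordCountSketch {
    static final long HEADER_SIZE = 64; // assumed size of the head section, in bytes
    static final long RECORD_SIZE = 32; // assumed fixed record length, in bytes

    // Derive the number of records from the file length alone,
    // so no counter has to be rewritten in the head section on every change.
    static long countFromFileSize(long fileLength) {
        if (fileLength < HEADER_SIZE) {
            throw new IllegalArgumentException("file is shorter than the header");
        }
        long payload = fileLength - HEADER_SIZE;
        if (payload % RECORD_SIZE != 0) {
            throw new IllegalStateException("file ends with a partial record");
        }
        return payload / RECORD_SIZE;
    }

    // A stored counter that disagrees with the computed one hints at an unclean shutdown.
    static boolean wasClosedCleanly(long storedCount, long fileLength) {
        return storedCount == countFromFileSize(fileLength);
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("records", ".db");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.setLength(HEADER_SIZE + 3 * RECORD_SIZE); // header plus three empty records
            System.out.println(countFromFileSize(raf.length()));
        }
    }
}
```

In this sketch the counter lives only in memory while the file is open; `wasClosedCleanly` is the kind of comparison that would be made at start-time against the value written at close.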
karlchenofhell
c016fcb10f - added streaming-support to CrawlURLFetchStack_p servlet
- bugfix for NPE in list.java
- use more constants

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-19 12:47:46 +00:00
orbiter
1f1f398bfa enhanced speed of RAM cache flush by factor 20 (twenty times faster)
- the speed was doubled by avoiding read access during the dump
- the speed was increased further, by at least a factor of 10,
   by flushing the structures to a temporary RAM file first and
   then dumping that file to the file system as a single byte chunk.
The speed enhancements also affect some other parts of the database.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3353 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-08 23:21:46 +00:00
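
The "flush to RAM first, then dump as one chunk" approach from this commit can be sketched roughly as follows. This is a minimal sketch under assumptions: `BufferedDumpSketch`, the `Map<String, Integer>` cache and the entry encoding are made up for illustration and do not reflect YaCy's actual cache structures or file format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

final class BufferedDumpSketch {
    // Serialize all entries into an in-memory buffer; no file access happens here.
    static byte[] serialize(Map<String, Integer> cache) throws IOException {
        ByteArrayOutputStream ram = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(ram);
        for (Map.Entry<String, Integer> e : cache.entrySet()) {
            out.writeUTF(e.getKey());   // 2-byte length prefix plus encoded key
            out.writeInt(e.getValue()); // 4-byte value
        }
        out.flush();
        return ram.toByteArray();
    }

    // Write the whole buffer to disk with a single write call.
    static void dump(Map<String, Integer> cache, File target) throws IOException {
        byte[] chunk = serialize(cache);
        try (FileOutputStream fos = new FileOutputStream(target)) {
            fos.write(chunk);
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, Integer> cache = new LinkedHashMap<String, Integer>();
        cache.put("word", 42);
        File f = File.createTempFile("dump", ".bin");
        f.deleteOnExit();
        dump(cache, f);
        System.out.println(f.length() + " bytes written");
    }
}
```

The design choice illustrated here is to replace many small file writes with one large sequential write; the speed-up factors quoted in the commit message are the author's measurements and are not reproduced by this sketch.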
orbiter
7673f0869b minor enhancements
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3344 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-06 16:01:03 +00:00
orbiter
fcc11391a8 some redesign attempts because sorting of lastseen does not work correctly
not finished yet
target: better selection of peer-ping targets, which should enhance stabilization of the net

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3319 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-02 13:12:31 +00:00
orbiter
306c50ac40 QPM (queries per minute) statistic stub
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3308 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 15:39:11 +00:00
orbiter
7598e1243e removed unused variables/imports
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3306 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 09:28:47 +00:00
allo
98cb777e18 abstract wikiCode in putWiki
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3293 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-29 15:09:58 +00:00
karlchenofhell
15f0334cd3 - fixed IllegalThreadStateException in LogParser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3265 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-21 14:45:52 +00:00
hydrox
814a09a0ed *) reversed r3250 and parts of r3252 (nanotime() is a Java 1.5 function)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 11:10:57 +00:00
hydrox
f7623f5d24 *) added missing measuring points for Parser-Runtime
*) changed precision of Parser-Runtime from ms to ns

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3250 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 09:25:04 +00:00
karlchenofhell
5d540b219e - LogalizerHandler skips interfaces again
- added LogParser stats to LogStatistics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3234 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 17:01:20 +00:00
allo
e1fb3550ab fix for profile
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3231 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 14:52:51 +00:00
hydrox
6faf9b70b7 *) LogParserPLASMA now counts its total runtime.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3229 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 13:35:33 +00:00
hydrox
e5f854bc37 *) added LogalizerHandler-settings to yacy.logging.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3228 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 13:25:11 +00:00
karlchenofhell
77b73aa7a8 - log-entries 'Indexed' are parsed correctly now
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3222 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 18:42:34 +00:00
karlchenofhell
71112b1fe6 - added LogStatistics_p.html servlet based on the logalizer (indexing values not functional yet due to charset/regex problems)
add the following to DATA/LOG/yacy.logging:
---
# Properties for the LogalizerHandler
de.anomic.server.logging.LogalizerHandler.enabled = true
de.anomic.server.logging.LogalizerHandler.debug = false
de.anomic.server.logging.LogalizerHandler.parserPackage = de.anomic.server.logging.logParsers
---
and "de.anomic.server.logging.LogalizerHandler" to the list of global handlers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3219 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 16:13:21 +00:00
allo
0c81bd39d4 XSS-safe put as default.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3217 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 14:07:54 +00:00
karlchenofhell
bdda9e802f - added some commented string constants to ease use of the result-table
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3215 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 05:51:39 +00:00
karlchenofhell
4dce5ec261 - if mem is too low but former GCs helped, the word-cache limit is only decreased now, if a subsequent GC doesn't
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3197 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-13 06:30:17 +00:00
hydrox
c27e88104c *) getResults() should now work and compile properly with Java 1.4
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3191 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-12 11:12:09 +00:00
hydrox
a2fb54afff *) Quickfix for http://www.yacy-forum.de/viewtopic.php?p=29973#29973 getResults() used a Java 1.5 function (output is temporarily disabled until a solution with 1.4 functions is implemented)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3190 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-12 10:39:22 +00:00
hydrox
3acd90033c *) added functions to get results from log-parsers (not documented yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3186 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-11 09:18:10 +00:00
karlchenofhell
0336480a3e - the maxMemory-fix for the Sun JVM 1.4.2 was wrongly also applied to 1.6, thx to NN
- added logging of reducing word-cache (log-level fine)
- disabled memprereq field in PerformanceQueues_p.html, because it is now set by the collections db
- minor changes to ConfigSkins / -Language

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3165 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-05 08:00:05 +00:00
allo
6ff8359b98 possibility to use a bindPort different from the externally reachable port.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3161 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-03 21:00:07 +00:00
karlchenofhell
d6eb699e8e - fix for last commit (didn't know that the paragraph sign has a UTF-8-specific location)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3134 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-23 14:04:21 +00:00
karlchenofhell
41bc31d2c2 - ConfigAdvanced_p => XHTML (no invalid IDs)
- removed unmappable characters from code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3133 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-23 13:35:34 +00:00
borg-0300
a4f63d187d better map2string and NullPointer fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3119 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-22 13:13:41 +00:00
borg-0300
05d0464377 only do 16 checks if "address" starts with "172.";
better readability;


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3087 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-17 13:54:11 +00:00
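
The guarded check described in this commit (only test the 16 possible second octets when the address starts with "172.") might look like the sketch below. This is an assumption about the shape of the check, not the actual YaCy code: `PrivateRangeSketch` and its method names are hypothetical, and the input is assumed to be a dotted IPv4 string.

```java
final class PrivateRangeSketch {
    // True if the address lies in 172.16.0.0–172.31.255.255.
    static boolean isIn172PrivateRange(String address) {
        // Cheap early exit: skip the 16 prefix comparisons for every other address.
        if (!address.startsWith("172.")) return false;
        for (int second = 16; second <= 31; second++) {
            if (address.startsWith("172." + second + ".")) return true;
        }
        return false;
    }

    // Combines the 172.x check with the other private and link-local (APIPA/Zeroconf) prefixes.
    static boolean isPrivate(String address) {
        return address.startsWith("10.")
            || address.startsWith("192.168.")
            || address.startsWith("169.254.")
            || isIn172PrivateRange(address);
    }

    public static void main(String[] args) {
        String[] samples = { "172.16.0.1", "172.32.0.1", "169.254.10.20", "8.8.8.8" };
        for (String s : samples) {
            System.out.println(s + " private: " + isPrivate(s));
        }
    }
}
```

Including the trailing dot in each prefix matters: without it, an address such as 172.160.0.1 would be matched by the prefix "172.16".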
rramthun
1a525710c1 *) cursor now jumps to the search box on search pages again
*) added missing private IP-ranges for APIPA/Zeroconf and 172.16.0.0–172.31.255.255
*) Changed some seed-download-errors to warnings

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3086 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-17 13:21:17 +00:00
karlchenofhell
52abbd4131 - fix for wrong public IP if no hostname was set and IP was from range 192.168.*.*
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3063 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-09 22:00:34 +00:00
hydrox
2c69cc969a *) more special chars removed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3027 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-30 07:39:09 +00:00
hydrox
ebb42906f8 *) removed special characters
*) added Copyright comments

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3026 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-30 07:33:54 +00:00
theli
954db729db *) Bugfix for ArrayIndexOutOfBoundsException during SSL detection
See: http://www.yacy-forum.de/viewtopic.php?p=28247#28247

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3025 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-30 06:38:53 +00:00
orbiter
ceb9e3aa17 - enhanced parser: collection of audio, video, image and application links
- enhanced condenser: better handling of utf-8 and pre-formatted texts


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3017 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-28 15:00:15 +00:00
hydrox
f442af956c *) first version of built-in logalizer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2965 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-17 11:49:21 +00:00
(no author)
24ac4e8860 Bugfix for "-UNRESOLVED_PATTERN- on hostname change" (http://www.yacy-forum.de/viewtopic.php?t=3093)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2952 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-11 17:52:20 +00:00
orbiter
ba967c4875 - bugfixes and debug code
- new generalized index class indexCachedRI

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2930 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-07 01:09:02 +00:00
orbiter
114a76a86e - added flag to urlhash that shows that domain is a local domain
- enhanced local domain detection
- bugfixing for memory assignment in kelondroFlexSplit
- automatic memory assignment to caches according to available RAM
- bugfixes for details during search process

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2924 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-06 02:05:39 +00:00
theli
5e57e0814d *) new soap function to display log
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2902 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-03 14:39:48 +00:00
orbiter
f21ede312e bugfixes for internals of database organization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2860 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-25 01:21:05 +00:00
orbiter
2a9d868f6d - removed object cache from kelondroTree
- generalized object caching and added new object caching class
- added object caching wherever kelondroTree was used
- added object caching also to usage of kelondroFlex
- added object buffering (a write cache) to NURLs
- added many assert statements; fixed bugs here and there
- added missing close methods to latest added classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2858 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-24 13:48:16 +00:00
orbiter
278d8c3c7e - more asserts
- bugfix for reading of previously deleted nodes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2845 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-23 00:59:55 +00:00
orbiter
2d3f1a53fd handling of Missing byte-order mark exception
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2842 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-22 12:18:15 +00:00
allo
c35793fb46 fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2838 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-21 16:41:22 +00:00
allo
a831c83025 create servletProperties, with the servlet-specific functions from serverObjects
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2835 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-21 15:01:53 +00:00