Commit Graph

323 Commits

Author SHA1 Message Date
karlchenofhell
6fbe31425a - some code-cleanup (no more syntax-warnings here)
- added deletion from loadedURLs of URLs to be blacklisted in IndexControl_p

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 12:56:50 +00:00
orbiter
dc0c06e43d PLEASE MAKE A BACK-UP OF YOUR COMPLETE DATA DIRECTORY BEFORE USING THIS
redesign for better IO performance
enhanced database seek-time by avoiding write operations at distant
positions of a database file. until now, a USEDC counter was written
at the head-section of a kelondroRecords database file (which is the
basic data structure of all kelondro database files) to store the
actual number of records that are contained in the database. Now, this
value is computed from the database file size. This is either done
only once at start-time, or continuously when run with asserts enabled.
The counter is then updated only in RAM, and written at close of the
file. If the close fails, the correct number can be computed from the
file size, and if this is not equal to the stored number it is strong
evidence that YaCy was not shut down properly.
To preserve consistency, the complete storage-routine had to be re-written.
Another change enhances read of nodes in some cases, where the data-tail
can be read together with the data-head. This saves another IO lookup during
each DB node fetch.
Includes also many small bugfixes.
IF ANYTHING GOES WRONG, ALL YOUR DATA IS LOST: PLEASE MAKE A BACK-UP

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3375 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 08:35:51 +00:00
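
The idea in dc0c06e43d, deriving the record count from the file size instead of a counter stored in the file head, can be sketched as follows. This is only an illustration: the header size, the record size, and the class and method names are hypothetical, and the real kelondroRecords layout is not shown in this log.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

final class RecordCountSketch {
    // Hypothetical layout: a fixed-size header followed by fixed-size records.
    static final int HEADER_BYTES = 64;
    static final int RECORD_BYTES = 32;

    /** Derives the number of records from the file length alone. */
    static long countFromSize(long fileLength) {
        if (fileLength < HEADER_BYTES) return 0;
        return (fileLength - HEADER_BYTES) / RECORD_BYTES;
    }

    /** A mismatch hints that the file was not closed cleanly. */
    static boolean looksCleanlyClosed(long storedCount, long fileLength) {
        return storedCount == countFromSize(fileLength);
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("records", ".db");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            // Simulate a file that holds the header plus five records.
            raf.setLength(HEADER_BYTES + 5L * RECORD_BYTES);
            System.out.println("records: " + countFromSize(raf.length()));
        }
    }
}
```

With this approach the counter only needs to live in RAM while the file is open, and a stored value that disagrees with the computed one signals an unclean shutdown.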
karlchenofhell
c016fcb10f - added streaming-support to CrawlURLFetchStack_p servlet
- bug for NPE in list.java
- use more constants

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-19 12:47:46 +00:00
orbiter
1f1f398bfa enhanced speed of RAM cache flush by factor 20 (twenty times faster)
- the speed was doubled by avoiding read access during the dump
- the speed was dramatically increased at least by factor 10
   by using a temporary ram-file where the structures are flushed to
   before it is dumped then as a whole byte-chunk to the file system.
The speed enhancements also affect some other parts of the database.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3353 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-08 23:21:46 +00:00
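
The second point of 1f1f398bfa, flushing the structures into a temporary buffer in RAM and then writing that buffer to disk as one chunk, might look roughly like this. The class, the method names, and the use of plain long values as entries are assumptions made for illustration; the actual cache format is not part of this log, and no speed-up figure is implied by the sketch.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

final class BufferedDumpSketch {
    /** Serializes all entries into a RAM buffer before any disk access. */
    static byte[] serializeInRam(long[] entries) {
        ByteBuffer ram = ByteBuffer.allocate(entries.length * 8);
        for (long e : entries) ram.putLong(e);
        return ram.array();
    }

    /** Writes the whole buffer with a single sequential write call. */
    static void dump(long[] entries, File target) throws IOException {
        byte[] chunk = serializeInRam(entries);
        try (OutputStream out = new FileOutputStream(target)) {
            out.write(chunk);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("dump", ".bin");
        f.deleteOnExit();
        dump(new long[] {1L, 2L, 3L}, f);
        System.out.println(f.length() + " bytes written");
    }
}
```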
orbiter
7673f0869b minor enhancements
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3344 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-06 16:01:03 +00:00
orbiter
fcc11391a8 some redesign attempts because sorting of lastseen does not work correctly
not finished yet
target: better selection of peer-ping targets, which should enhance stabilization of the net

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3319 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-02 13:12:31 +00:00
orbiter
306c50ac40 QPM (queries per minute) statistic stub
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3308 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 15:39:11 +00:00
orbiter
7598e1243e removed unused variables/imports
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3306 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 09:28:47 +00:00
allo
98cb777e18 abstract wikiCode in putWiki
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3293 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-29 15:09:58 +00:00
karlchenofhell
15f0334cd3 - fixed IllegalThreadStateException in LogParser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3265 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-21 14:45:52 +00:00
hydrox
814a09a0ed *) reversed r3250 and parts of r3252 (nanoTime() is a Java 1.5 function)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 11:10:57 +00:00
hydrox
f7623f5d24 *) added missing measuring points for Parser-Runtime
*) changed precision of Parser-Runtime from ms to ns

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3250 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 09:25:04 +00:00
karlchenofhell
5d540b219e - LogalizerHandler skips interfaces again
- added LogParser stats to LogStatistics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3234 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 17:01:20 +00:00
allo
e1fb3550ab fix for profile
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3231 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 14:52:51 +00:00
hydrox
6faf9b70b7 *) LogParserPLASMA now counts its total runtime.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3229 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 13:35:33 +00:00
hydrox
e5f854bc37 *) added LogalizerHandler-settings to yacy.logging.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3228 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 13:25:11 +00:00
karlchenofhell
77b73aa7a8 - log-entries 'Indexed' are parsed correctly now
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3222 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 18:42:34 +00:00
karlchenofhell
71112b1fe6 - added LogStatistics_p.html servlet based on the logalizer (indexing values not functional yet due to charset/regex problems)
add the following to DATA/LOG/yacy.logging:
---
# Properties for the LogalizerHandler
de.anomic.server.logging.LogalizerHandler.enabled = true
de.anomic.server.logging.LogalizerHandler.debug = false
de.anomic.server.logging.LogalizerHandler.parserPackage = de.anomic.server.logging.logParsers
---
and "de.anomic.server.logging.LogalizerHandler" to the list of global handlers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3219 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 16:13:21 +00:00
allo
0c81bd39d4 XSS-safe put as default.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3217 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 14:07:54 +00:00
karlchenofhell
bdda9e802f - added some commented string constants to ease use of the result-table
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3215 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 05:51:39 +00:00
karlchenofhell
4dce5ec261 - if mem is too low but former GCs helped, the word-cache limit is only decreased now, if a subsequent GC doesn't
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3197 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-13 06:30:17 +00:00
hydrox
c27e88104c *) getResults() should now work and compile properly with Java 1.4
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3191 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-12 11:12:09 +00:00
hydrox
a2fb54afff *) Quickfix for http://www.yacy-forum.de/viewtopic.php?p=29973#29973 getResults() used a Java 1.5 function (Output is temporarily disabled until a solution with 1.4 functions is implemented)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3190 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-12 10:39:22 +00:00
hydrox
3acd90033c *) added functions to get results from log-parsers (not documented yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3186 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-11 09:18:10 +00:00
karlchenofhell
0336480a3e - the maxMemory-fix for the Sun JVM 1.4.2 wrongly also applied to 1.6, thx to NN
- added logging of reducing word-cache (log-level fine)
- disabled memprereq field in PerformanceQueues_p.html, because it is now set by the collections db
- minor changes to ConfigSkins / -Language

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3165 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-05 08:00:05 +00:00
allo
6ff8359b98 possibility to use another bindPort than the externally reachable port.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3161 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-03 21:00:07 +00:00
karlchenofhell
d6eb699e8e - fix for last commit (didn't know that the paragraph sign has an UTF-8-specific location)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3134 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-23 14:04:21 +00:00
karlchenofhell
41bc31d2c2 - ConfigAdvanced_p => XHTML (no invalid IDs)
- removed unmappable characters from code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3133 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-23 13:35:34 +00:00
borg-0300
a4f63d187d better map2string and NullPointer fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3119 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-22 13:13:41 +00:00
borg-0300
05d0464377 only do 16 checks if "address" starts with "172.";
better readability;


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3087 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-17 13:54:11 +00:00
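
The optimization named in 05d0464377 (only run the 16 checks for 172.16 to 172.31 when the address starts with "172.") can be sketched as below. The class and method names are hypothetical, and the 10.* prefix is added here as an assumption from the usual private ranges; the log itself only mentions 192.168.*.*, APIPA/Zeroconf and 172.16.0.0 to 172.31.255.255.

```java
final class PrivateAddressSketch {
    /** Returns true if the dotted IPv4 string falls into a private or link-local range. */
    static boolean isPrivateIPv4(String address) {
        if (address.startsWith("10.")            // assumption: not named in the log
                || address.startsWith("192.168.")
                || address.startsWith("169.254.")) { // APIPA / Zeroconf
            return true;
        }
        // Only walk the 16 prefixes 172.16. to 172.31. if the address can match at all.
        if (address.startsWith("172.")) {
            for (int i = 16; i <= 31; i++) {
                if (address.startsWith("172." + i + ".")) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isPrivateIPv4("172.20.1.1"));
        System.out.println(isPrivateIPv4("172.32.1.1"));
    }
}
```

The trailing dot in each prefix matters: without it, an address such as 172.160.0.1 would wrongly match the 172.16 prefix.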
rramthun
1a525710c1 *) cursor jumps now to searchbox on searchpages again
*) added missing private IP-ranges for APIPA/Zeroconf and 172.16.0.0–172.31.255.255
*) Changed some seed-download-errors to warnings

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3086 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-17 13:21:17 +00:00
karlchenofhell
52abbd4131 - fix for wrong public IP if no hostname was set and IP was from range 192.168.*.*
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3063 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-09 22:00:34 +00:00
hydrox
2c69cc969a *) more special chars removed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3027 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-30 07:39:09 +00:00
hydrox
ebb42906f8 *) removed special characters
*) added Copyright comments

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3026 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-30 07:33:54 +00:00
theli
954db729db *) Bugfix for ArrayIndexOutOfBoundsException during SSL detection
See: http://www.yacy-forum.de/viewtopic.php?p=28247#28247

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3025 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-30 06:38:53 +00:00
orbiter
ceb9e3aa17 - enhanced parser: collection of audio, video, image and application links
- enhanced condenser: better handling of utf-8 and pre-formatted texts


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3017 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-28 15:00:15 +00:00
hydrox
f442af956c *) first version of built-in logalizer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2965 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-17 11:49:21 +00:00
(no author)
24ac4e8860 Bugfix to "-UNRESOLVED_PATTERN- on hostname change" (http://www.yacy-forum.de/viewtopic.php?t=3093)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2952 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-11 17:52:20 +00:00
orbiter
ba967c4875 - bugfixes and debug code
- new generalized index class indexCachedRI

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2930 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-07 01:09:02 +00:00
orbiter
114a76a86e - added flag to urlhash that shows that domain is a local domain
- enhanced local domain detection
- bugfixing for memory assignment in kelondroFlexSplit
- automatic memory assignment to caches according to available RAM
- bugfixes for details during search process

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2924 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-06 02:05:39 +00:00
theli
5e57e0814d *) new soap function to display log
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2902 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-03 14:39:48 +00:00
orbiter
f21ede312e bugfixes for internals of database organization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2860 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-25 01:21:05 +00:00
orbiter
2a9d868f6d - removed object cache from kelondroTree
- generalized object caching and added new object caching class
- added object caching wherever kelondroTree was used
- added object caching also to usage of kelondroFlex
- added object buffering (a write cache) to NURLs
- added many assert statements; fixed bugs here and there
- added missing close methods to latest added classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2858 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-24 13:48:16 +00:00
orbiter
278d8c3c7e - more asserts
- bugfix for reading of previously deleted nodes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2845 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-23 00:59:55 +00:00
orbiter
2d3f1a53fd handling of Missing byte-order mark exception
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2842 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-22 12:18:15 +00:00
allo
c35793fb46 fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2838 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-21 16:41:22 +00:00
allo
a831c83025 create servletProperties, with the servlet specific functions from serverObjects
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2835 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-21 15:01:53 +00:00
theli
df49724f28 *) better error handling for seed upload - test download - problems
See: http://www.yacy-forum.de/viewtopic.php?p=26814#26814

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2812 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-19 10:10:53 +00:00
theli
5b114249ce *) Bugfix for ViewLog problem with multiline logging messages
See: http://www.yacy-forum.de/viewtopic.php?t=2972

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2774 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-14 13:21:07 +00:00
theli
de5e233766 *) Bugfix for GuiHandler sorting problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2773 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-14 13:06:08 +00:00
theli
fd94aa4bef *) Bugfix for IndexOutOfBound in GuiHandler
*) Bugfix for reversed order displaying of messages

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2772 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-14 12:41:10 +00:00
orbiter
2bb529cedb added peer tags for peers in robinson mode
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2741 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-10 20:09:26 +00:00
orbiter
f25f61d9d3 documentation of compile problem. See
http://www.yacy-forum.de/viewtopic.php?p=26407#26407

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2734 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-09 23:11:03 +00:00
orbiter
db294687ea enhanced logging
- more logging output
- fix in log line preparation
- added filter to log page
- some small bugfixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2707 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-03 22:55:59 +00:00
theli
f17ce28b6d *) plasmaHTCache:
   - method loadResourceContent defined as deprecated.
     Please do not use this function to avoid OutOfMemory Exceptions
     when loading large files
   - new function getResourceContentStream to get an inputstream of a cache file
   - new function getResourceContentLength to get the size of a cached file
*) httpc.java:
   - Bugfix: resource content was loaded into memory even if this was not requested
*) Crawler:
   - new option to hold loaded resource content in memory
   - adding option to use the worker class without the worker pool 
     (needed by the snippet fetcher)
*) plasmaSnippetCache
   - snippet loader does not use a crawl-worker from pool but uses
     a newly created instance to avoid blocking by normal crawling
     activity.
   - now operates on streams instead of byte arrays to avoid OutOfMemory 
     Exceptions when operating on large files 
   - snippet loader now forces the crawl-worker to keep the loaded
     resource in memory to avoid IO 
*) plasmaCondenser: adding new function getWords that can directly operate on input streams
*) Parsers
   - keep resource in memory whenever possible (to avoid IO)
   - when parsing from stream the content length must be passed to the parser function now.
     this length value is needed by the parsers to decide if the parsed resource content is too large
     to hold it in memory and must be stored to file 
   - AbstractParser.java: new function to pass the contentLength of a resource to the parsers
   


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2701 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-03 11:05:48 +00:00
theli
8b2ceddb91 *) Displaying severe and warning logging messages in different colors on ViewLog_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2678 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-30 08:12:22 +00:00
orbiter
df1629b05a - code cleanup
- version 0.471
- moved surftipps to own web page


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-29 22:27:20 +00:00
theli
813a8a8179 *) migration of mimeTypeParser to jmimemagic 0.1
   - better mimetype detection for rss feeds
   - better mimetype detection for odt documents (less memory consuming)
   - two new detector classes implementing MagicDetector interface of jmimemagic

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2650 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-22 11:40:46 +00:00
allo
b0a4fcce8c fix from theli
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2642 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 18:03:24 +00:00
theli
b6c7b91582 *) Parser now throws an ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher)
*) better logging of parser failures
*) simplified usage of plasmaparser through switchboard
*) restructuring of crawler
   - crawler now returns an error message if it is used in sync mode (e.g. by snippet fetcher)
*) snippet-fetcher: more verbose error messages
*) serverByteBuffer.java: adding new function append(String,encoding)
*) serverFileUtils.java: adding functions to copy only a given number of bytes between streams


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2641 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 12:25:07 +00:00
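
The last item of b6c7b91582, copying only a given number of bytes between streams, could be implemented along these lines. This is a sketch, not the code from serverFileUtils.java; the class name, method name and buffer size are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class BoundedCopySketch {
    /** Copies at most maxBytes from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out, long maxBytes) throws IOException {
        byte[] buffer = new byte[4096];
        long copied = 0;
        while (copied < maxBytes) {
            // Never request more than what is still allowed.
            int want = (int) Math.min(buffer.length, maxBytes - copied);
            int n = in.read(buffer, 0, want);
            if (n < 0) break; // source exhausted before the limit was reached
            out.write(buffer, 0, n);
            copied += n;
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[10000]);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(copy(in, out, 6000) + " bytes copied");
    }
}
```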
orbiter
e03427871e enhanced surftipps:
- added switch to show or hide surftipps
- more news contribute to surftipps
- added voting system for surftipps

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2638 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 07:17:41 +00:00
theli
cc667b0aa5 *) htmlFilterContentScraper.java: adding support for link tag
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2633 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-19 16:13:13 +00:00
orbiter
f453c14b5d removed unreachable catch blocks and unused imports
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2619 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 11:23:58 +00:00
theli
ad7f600f25 *) Bugfix. re-enabling inheritance of serverCharBuffer from writer class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2618 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 11:04:16 +00:00
theli
97d2a08ef1 *) restructuring needed to support parsing of documents using various charsets
   - serverFileUtils.java:
   -- adding methods to copy from stream to writer and readers to writers
   -- moving httpc writeX methods into serverFileUtils class
   - serverCharBuffer.java: removing inheritance from Writer class
   - replacing htmlFilterOutputStream by htmlFilterWriter class which handles
     content as char stream
   - htmlFilterContentTransformer.java: deactivating getText mode 
    (still needs to be migrated to use char streams instead of byte streams)
   - changes in several classes to use htmlFilterWriter instead of htmlFilterOutputStream
   - changes in Scraper and Transformer classes to operate on chars instead of bytes
   - httpdProxyHandler.java: bugfix. clientTimeout setting was missing in config file

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2617 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 10:12:11 +00:00
orbiter
3aac5b26da - added automatic tag generation when a web page from the search results is added
- added new image 'B' in front of search results for bookmark generation
- added news generation when a public bookmark is added
- the '+' in front of search results has new meaning: positive rating for that result
- added news generation when a '+' is hit

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2613 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 00:37:02 +00:00
theli
0e84a969d6 *) Bugfix for serverCharBuffer read from file operation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2607 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-16 13:11:32 +00:00
theli
90ef19d778 *) first version of a serverCharBuffer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-16 12:56:03 +00:00
orbiter
1b48473bc5 bugfix to utf8 recognition
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2603 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 23:55:06 +00:00
orbiter
90f7241b59 serverByteBuffer.trim() can now recognize utf-8 characters
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2602 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 23:52:26 +00:00
theli
8115ac47b5 *) charset aware metadata parsing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2598 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 15:01:25 +00:00
theli
74c3e7cf29 *) storing document charset into plasmaParserDocument object (is needed later by the condenser)
*) htmlFilterContentScraper.java: using proper charset for document title
*) serverByteBuffer.java: adding new toString which allows to specify the charset for byte encoding


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2593 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 13:18:12 +00:00
theli
e2f8339827 *) some bugfixes for UTF-8 related problems
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2577 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-14 05:16:36 +00:00
orbiter
82a6054275 - fixed bug with new indexAbstract generation
- added partial evaluation of indexAbstracts during remote searches

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-11 10:39:25 +00:00
orbiter
309accb983 memory control for ymage generation:
the ymageMatrix initializer throws a RuntimeException if there is not
enough memory available to generate a new ymage of wanted size

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2541 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-11 07:01:39 +00:00
orbiter
c2e6cc8c6b small part of Bosts patch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2517 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-08 01:40:23 +00:00
orbiter
a2525072f2 bugfix for kelondroRow - property generation
this bug affected ranking parameters :-(

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2506 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-07 10:55:34 +00:00
theli
f3ac4dbbb9 *) better handling of server shutdown
See: e.g. http://www.yacy-forum.de/viewtopic.php?t=2584

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2468 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-03 14:59:00 +00:00
orbiter
39b4c26bdc more memory control:
- catching of OutOfMemoryError in server threads
- automatic adaptation of word cache size after a Short Mem Cycle

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2426 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-19 00:06:39 +00:00
orbiter
eb633c0a4f server threads must now supply a method that can be called in case
of short memory. This has been realized for the indexing thread.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2421 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-18 02:07:03 +00:00
orbiter
0187c60010 because of a bug in the JRE 1.4.2 there was no memory protection
see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4686462
this commit fixes the bug by using a memory-computation patch.
All uses of Runtime.maxMemory have been replaced by serverMemory.max
The bug is not present any more in Java 1.5

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-18 01:33:54 +00:00
orbiter
314021453f * more logging
* option in yacy.init to set useCollectionIndex usage

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2374 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-10 21:21:50 +00:00
theli
c09f734d06 *) offer router configuration on ConfigBasic.html
- checkbox to allow router configuration is shown if
   - a) the UPnP forwarder is installed
   - b) a UPnP enabled router was found
   - c) no other forwarder was configured
   See: http://www.yacy-forum.de/viewtopic.php?p=24264

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2358 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 11:31:18 +00:00
orbiter
d468d665c9 some changes that may help to prevent deadlocks that cause an OutOfMemoryError
as described in
http://www.yacy-forum.de/viewtopic.php?p=24359

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2353 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-07 00:19:01 +00:00
orbiter
8b77afd72c some fixes to new container merger
and some code cleanup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2336 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-26 22:40:11 +00:00
theli
839806a775 *) serverPortForwardingUpnp.java: code cleanup, license header added
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2332 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-26 15:32:35 +00:00
theli
03230cd887 *) removing old port forwarding classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2330 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-26 14:42:12 +00:00
theli
6e676224d0 *) adding support for upnp
A new port forwarding method for upnp was added.
   If this method is enabled, yacy automatically determines an UPnP 
   capable internet gateway and configures the gateway port forwarding
   settings properly. 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2328 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-26 14:26:45 +00:00
orbiter
1ed3e2daef added option to extract domains and/or urls from the eurl database
when extracting from eurl, the html output format is recommended, since
this format adds also the fail reason to the domain/url.
The complete syntax for domain extraction is now
java -Xmx<megabytes>m -classpath classes yacy -domlist [ -source { lurl | eurl } ] [ -format { text  | zip | gzip | html } ] [ <path to DATA folder> ]


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2322 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-24 08:08:33 +00:00
orbiter
58df8b7bbf a large collection of different changes
* mainly for the transition to the new indexing database structure
* a bugfix for an endless loop inside kelondroTree iteration
* a bugfix for bulk read inside a kelondroTree iteration; the bug caused some elements to be iterated twice
* very strong speed enhancement for url/domain extraction

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2320 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-23 22:39:41 +00:00
orbiter
b3f7e62e03 better handling of whitespace
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2311 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-19 23:53:27 +00:00
orbiter
4149939c02 better handling of whitespace for gettext quotation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2310 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-19 23:18:06 +00:00
orbiter
97fa6788a1 added gettext support:
automatic replacement of string appearances in html files by
gettext quotes.
see also: http://www.yacy-forum.de/viewtopic.php?p=23901#23901

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2309 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-19 22:35:36 +00:00
orbiter
67edd80884 removed tabs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2305 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-19 11:13:14 +00:00
orbiter
3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparison.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-13 01:21:53 +00:00
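
Background for 3879a0ecd0: java.net.URL.equals() and hashCode() may resolve host names, so comparing two URL objects can block on a DNS lookup. The sketch below shows a purely textual comparison; it uses java.net.URI as a stand-in and is not the de.anomic.net.URL class from the commit. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UrlCompareSketch {
    /** Compares two URL strings component-wise without any name resolution. */
    static boolean sameUrl(String a, String b) {
        try {
            // normalize() removes "." and ".." path segments; URI.equals()
            // compares scheme and host case-insensitively and never touches DNS.
            URI ua = new URI(a).normalize();
            URI ub = new URI(b).normalize();
            return ua.equals(ub);
        } catch (URISyntaxException e) {
            return false; // unparsable input is treated as "not equal"
        }
    }

    public static void main(String[] args) {
        System.out.println(sameUrl("http://Example.org/a/../b", "http://example.org/b"));
        System.out.println(sameUrl("http://example.org/a", "http://example.org/b"));
    }
}
```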
allo
6acb6a4d8f tiny performance optimization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2285 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-09 15:37:45 +00:00
theli
fe617d7e54 *) adding function to return the protocol type of a ssl connection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2274 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-03 14:16:46 +00:00
orbiter
018b3e0832 added pause option to server threads.
The pause is started by calling intermission(Long.MAX_VALUE)
and can be stopped by calling intermission(0)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2272 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-03 13:20:14 +00:00
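
The pause semantics described in 018b3e0832 (pause with intermission(Long.MAX_VALUE), resume with intermission(0)) can be illustrated with the following sketch. Only the method name intermission and the two argument values come from the commit message; the class, the field, the isPaused() helper and the millisecond interpretation of the argument are assumptions.

```java
final class IntermissionSketch {
    // Timestamp in epoch milliseconds until which the thread should stay idle; 0 means "not paused".
    private long pausedUntil = 0;

    /** Requests a pause of the given length in milliseconds; 0 ends any running pause. */
    synchronized void intermission(long pauseMillis) {
        if (pauseMillis <= 0) {
            pausedUntil = 0;
            return;
        }
        long now = System.currentTimeMillis();
        // Guard against overflow for very large values such as Long.MAX_VALUE.
        pausedUntil = (pauseMillis > Long.MAX_VALUE - now) ? Long.MAX_VALUE : now + pauseMillis;
    }

    /** A worker loop would check this before starting its next job. */
    synchronized boolean isPaused() {
        return System.currentTimeMillis() < pausedUntil;
    }

    public static void main(String[] args) {
        IntermissionSketch t = new IntermissionSketch();
        t.intermission(Long.MAX_VALUE);
        System.out.println("paused: " + t.isPaused());
        t.intermission(0);
        System.out.println("paused: " + t.isPaused());
    }
}
```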
allo
0621106ef3
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2214 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-18 12:15:26 +00:00
orbiter
12af69dd86 cosmetics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2212 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-18 11:49:31 +00:00