Commit Graph

16 Commits

Author SHA1 Message Date
orbiter
57a5b6fa71 some generalization of remote proxy configuration and setting handling in httpc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4023 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-02 00:42:37 +00:00
orbiter
367fc28928 corrected Brausse->Brausze
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4020 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-01 22:15:51 +00:00
orbiter
40b0547611 - documentaton changes (removed old forum links)
- different handling of link quotation
- different handling of link normalization
- enhanced html/unicode en/de-coding

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3993 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-19 15:32:10 +00:00
orbiter
36a37f758b fix for oom exception during release download
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=101&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-03 22:55:47 +00:00
orbiter
1782ef57e5 - added SSI parser and include directive for <!--# include virtual="<file>" -->
- added chunked file transfer for non-yacy clients
- SSIs are streamed using chunked transfer, partly delivered pages can be seen in browser before transmission is finished
- added client-side network unit identification
- cleaned up code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-26 14:37:10 +00:00
orbiter
e48189c710 enhanced cluster routing
- cluster definitions can now contain an addition for local ip addresses
- cluster-cluster communication uses the local ip address instead the global address, if one is given

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3624 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-29 22:05:34 +00:00
orbiter
861f41e67e redesigned NURL-handling:
- the general NURL-index for all crawl stack types was splitted into separate indexes for these stacks
- the new NURL-index is managed by the crawl balancer
- the crawl balancer does not need an internal index any more, it is replaced by the NURL-index
- the NURL.Entry was generalized and is now a new class plasmaCrawlEntry
- the new class plasmaCrawlEntry replaces also the preNURL.Entry class, and will also replace the switchboardEntry class in the future
- the new class plasmaCrawlEntry is more accurate for date entries (holds milliseconds) and can contain larger 'name' entries (anchor tag names)
- the EURL object was replaced by a new ZURL object, which is a container for the plasmaCrawlEntry and some tracking information
- the EURL index is now filled with ZURL objects
- a new index delegatedURL holds ZURL objects about plasmaCrawlEntry obects to track which url is handed over to other peers
- redesigned handling of plasmaCrawlEntry - handover, because there is no need any more to convert one entry object into another
- found and fixed numerous bugs in the context of crawl state handling
- fixed a serious bug in kelondroCache which caused that entries could not be removed
- fixed some bugs in online interface and adopted monitor output to new entry objects
- adopted yacy protocol to handle new delegatedURL entries
all old crawl queues will disappear after this update!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3483 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-16 13:25:56 +00:00
karlchenofhell
03c5906ae7 - minor bugfixes for url-fetcher & http://www.yacy-forum.de/viewtopic.php?t=3646
- PerformanceMemory_p.html is valid XHTML again

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3440 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-07 11:50:03 +00:00
karlchenofhell
c016fcb10f - added streaming-support to CrawlURLFetchStack_p servlet
- bug for NPE in list.java
- use more constants

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-19 12:47:46 +00:00
karlchenofhell
d114a0136e - crawl profile: don't add null-values
- added some settings and statistics for url-fetcher 'server'-mode
- added own stack for fetchable URLs
- added possibility to fill stack via shift from peer's queues, via POST (addurls=$count and url$num=$url) or via file-upload
- added "htroot" to classpath of linux start-script

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3370 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-17 19:16:53 +00:00
karlchenofhell
b2a9d32f29 why do I always forget some lines? sorry...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3368 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-14 15:11:03 +00:00
karlchenofhell
e6ddf135bb - enabled fetching new crawls via /yacy/list.html?list=queueUrls for testing purposes
- sent URLs are taken off the limit-stack (of the global crawl trigger) (may be moved somewhere else in future versions)
- added option to set the requested chunk-size

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3367 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-14 14:50:55 +00:00
karlchenofhell
67d96249b4 - fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3366 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-13 21:17:43 +00:00
karlchenofhell
c5a2ba3a23 - prepared URL fetch from other peers
- more feedback for user

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3365 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-13 20:18:12 +00:00
karlchenofhell
4e5eda6ef9 huch...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3362 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-10 20:25:45 +00:00
karlchenofhell
50b59e312f - added experimental CrawlURLFetch_p-Servlet to fetch new URLs from a specified location (\n-seperated list). Requested by Theli.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3361 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-10 20:20:00 +00:00