Commit Graph

132 Commits

Author SHA1 Message Date
danielr
c565906050 FTP:
- added maxFileSize-check
- added timeout for download
- fixed dirlist (when all filenames have spaces, change to absolute links)
- enhanced isFolder()
- make sure data connection is closed, so a new can be opened
- refactoring


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4561 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 16:28:27 +00:00
danielr
1a7870df0d FTP: source cleanup (added finals, indention for easier diffs)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4559 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 12:35:53 +00:00
orbiter
7cc4ff05c9 some code enhancements and bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-09 23:48:24 +00:00
orbiter
4a80902081 - added ViewProfile as rdf in foaf syntax
- added link to rdf and vCard version on html page
- can be seen on http://localhost:8080/ViewProfile.html?hash=localhash
- more generics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4411 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-28 18:21:08 +00:00
orbiter
4ffbcd54a4 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=754
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4358 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:10:03 +00:00
hermens
4748d5c1ab Some enhancements to time management:
- remove unnecessary generation of Calendar and Date objects
- synchronized SimpleDateFormat objects in blog-, message- and wikiBoard
- correct use of TimeZones and SimpleDateFormats



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4288 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-20 17:11:35 +00:00
orbiter
daf0f74361 joined anomic.net.URL, plasmaURL and url hash computation:
search profiling showed, that a major amount of time is wasted by computing url hashes. The computation does an intranet-check, which needs a DNS lookup. This caused that each urlhash computation needed 100-200 milliseconds, which caused remote searches to delay at least 1 second more that necessary. The solution to this problem is to attach a URL hash to the URL data structure, because that means that the url hash value can be filled after retrieval of the URL from the database. The redesign of the url/urlhash management caused a major redesign of many parts of the software. Since some parts had been decided to be given up they had been removed during this change to avoid unnecessary maintenance of unused code.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-05 09:01:35 +00:00
orbiter
9ca46a8c69 indexing of local (intranet) urls enabled
To do this, one must create a separate YaCy network that has a local URL domain
A description how to do this is here: http://www.yacy-websuche.de/wiki/index.php/De:Netzdefinition

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4001 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-24 00:46:17 +00:00
orbiter
ec817a2ff5 removed JAR command from ftpc - produces warnings and is not used by YaCy
see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=193&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3997 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-22 20:08:09 +00:00
orbiter
40b0547611 - documentaton changes (removed old forum links)
- different handling of link quotation
- different handling of link normalization
- enhanced html/unicode en/de-coding

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3993 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-19 15:32:10 +00:00
orbiter
36a37f758b fix for oom exception during release download
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=101&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-03 22:55:47 +00:00
orbiter
9bbd39b67c - removed unfinished auto-updater from roland and martin
- added new download-option for releases on the status page
still mising:
- thomas-style restart for linux/mac
- untar/gunzip on shell basis
(comes next)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3931 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-28 14:52:26 +00:00
karlchenofhell
03c6551b0c - fix for http://www.yacy-forum.de/viewtopic.php?t=3747
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3724 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-14 12:44:44 +00:00
orbiter
e602436fda fixed problem with cluster routing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3684 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-07 20:48:24 +00:00
orbiter
871ee1ce0f one step closer to automatic updates:
automatically aquire release information from download archives
web pages from latest.yacy-forum.net and yacy.net are retrieved, parsed,
links wihin are analysed, sorted and the most recent developer and main
releases are provided as direct download link on the status page, if it was
discovered that a more recent version than the current version is available.
This process is done only once during run-time of a peer, to protect our
download archives from DoS by YaCy peers.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-27 09:23:44 +00:00
(no author)
4f4d3d71dd *) Faster appearance of ConfigBasic by bypassing UPNP-scan in case of existing external connects
*) Marked two deprecated source-points
*) Added possibility to dump words from indexing to file. Should not affect performance in the current form.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3592 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-24 16:33:31 +00:00
karlchenofhell
6fbe31425a - some code-cleanup (no more syntax-warnings here)
- added deletion from loadedURLs of URLs to be blacklisted in IndexControl_p

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 12:56:50 +00:00
orbiter
f3f99b19c6 extended search statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3249 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 01:45:29 +00:00
karlchenofhell
b3a2c79fec - fix for last fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3202 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-14 17:47:25 +00:00
karlchenofhell
4b23e79f51 - quick-fix for http://www.yacy-forum.de/viewtopic.php?t=3358
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3201 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-14 17:31:08 +00:00
karlchenofhell
be941c4475 - "javascript:"-URLs are recognized as well (as intended formerly I assume)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3097 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-18 20:19:02 +00:00
karlchenofhell
a619ba3f49 - fix for String index out of range during URL parsing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3096 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-18 19:32:01 +00:00
orbiter
d34f10c63d some tests with reverse dns lookup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2954 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-12 00:28:10 +00:00
orbiter
bd4f43cd66 - fixed a null pointer exception bug
- switched off more write caches
- re-enabled index-abstracts search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2885 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-31 02:45:41 +00:00
theli
d38ef0493d *) be more tolerant against missing ports in url
"http://yacy.net:/" is now interpreted as "http://yacy.net/"
   See: http://www.yacy-forum.de/viewtopic.php?p=27102

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2852 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-24 05:22:54 +00:00
theli
cfe54fedc7 *) Bugfix for resolveBackpath problem with tailing /..
*) Junit testclass for resolveBackpath testing 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2850 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-24 05:07:34 +00:00
karlchenofhell
d13b381f83 - added mint-green skin
- removed test-urls because of problems with text-encoding

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2832 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-21 10:24:32 +00:00
karlchenofhell
ce237aefad - assortment-sizes table from PerformanceQueues_p.html is not shown if not used
- escape query- and fragment-part of an url as well
- new resolveBackpath for urls: http://www.yacy-forum.de/viewtopic.php?t=2679#24867

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2815 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-19 15:27:24 +00:00
karlchenofhell
b14a500b88 - removed debug output from PerformanceMemory_p
- added URL escaping (tested, nevertheless watch out for possibly broken URLs)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2797 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-18 14:51:37 +00:00
orbiter
bcf2b800b4 applied UTF-8 encoding parameter to yacy-internal protocol communication
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2694 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-02 13:35:38 +00:00
orbiter
5a40ea7866 refactoring of wget string list generation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2692 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-02 09:59:20 +00:00
orbiter
df1629b05a - code cleanup
- version 0.471
- moved surftipps to own web page


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-29 22:27:20 +00:00
borg-0300
f18304ddd3 unused/not needed imports removes;
properties added;

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2628 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 22:21:18 +00:00
orbiter
1e7fd48afd added size method to ftpc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2508 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-07 12:21:41 +00:00
orbiter
d4c5e2af01 html-dirlist can now also be generated from existing connections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2493 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-05 10:11:07 +00:00
theli
7930839594 *) URL.java: userinfo was not taken over when generating a new url from a base url and a rel. path
*) CrawlWorker.java: using new dirhtml function of ftpc

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2492 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-05 05:17:57 +00:00
orbiter
17ba468165 added html dirlisting generation in ftpc.java:
ftpc.dirhtml() generates a StringBuffer with a complete web page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2491 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-05 00:11:59 +00:00
theli
ab5a9bee66 *) adding some copyright headers
*) next step of restructuring for new crawlers
   - adding first testversion of ftp crawler class
   -- does not create a htCache entry yet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2483 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 14:38:29 +00:00
theli
5847492537 *) next step of restructuring for new crawlers
- IndexCreate_p.java: correcting problems with ftp urls
   - URL.java does not cutout the userinfo anymore 
    (needed to transport authentication info in ftp urls, e.g. ftp://username:pwd@ftp.irgendwas.de)
   - plasmaCrawlLoader.java: 
   -- hack to re enable https urls
   -- adding function getSupportedProtocols

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2482 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 13:17:11 +00:00
orbiter
6cce47e217 test of ftp-urls in URL class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2481 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 13:10:40 +00:00
orbiter
f933f00f09 another patch to URL protocol handling for 'news', 'nntp' etc:
reject it! (the java.net.URL class rejects them too)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2432 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-21 01:04:04 +00:00
orbiter
4c6e00d80a more bugfixes for URL class, see:
http://www.yacy-forum.de/viewtopic.php?p=24844#24844

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-21 00:23:39 +00:00
orbiter
b7dc251948 fixed bugs in url class:
- correct backpath ('..') handling
- correct absolute path handling
- included https


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2428 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-19 22:27:01 +00:00
orbiter
276225d79e fix for URL class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2423 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-18 21:33:00 +00:00
orbiter
f43c90fa98 fixed handling of null referer in crawlOrder
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2384 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 21:46:34 +00:00
orbiter
abf22f6e60 removed url normalform computation from htmlFilterContentScraper.
This method was implemented in de.anomic.net.URL


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2377 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 15:09:22 +00:00
theli
0db237467f *) bugfix for URL generation from file
see: http://www.yacy-forum.de/viewtopic.php?p=24116

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2326 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-25 16:18:45 +00:00
orbiter
e20ff77c10 another bugfix in new url class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2318 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 11:37:22 +00:00
orbiter
685430a1b5 bugfix in new URL class, better loggin for domain extraction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2317 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 11:33:01 +00:00
orbiter
79af283f6c better debugging in new URL class for wrong port numbers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2315 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 10:21:24 +00:00
orbiter
4bd626572b added hashCode and compareTo to new URL class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2301 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-18 12:00:54 +00:00
theli
a70cbd959b *) further improvements for the anomic.net.url class
- relpath starting with javascript: are ignored now
   - bugfix for concatenation of relpath starting with # or ?
     in this case no slash should be added to the baseURL, otherwise
     we get URLs of the form http://test.de/index.html/?param=value

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2298 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-18 05:12:08 +00:00
theli
8a1f1d96b3 *) Bugfix for url concatenation. Relative urls with / or http:// at the beginning
were not handled correctly on url concatenation via new URL(URL,relPath).
   See: http://www.yacy-forum.de/viewtopic.php?t=2623

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2297 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-18 04:48:18 +00:00
orbiter
3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparisment.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-13 01:21:53 +00:00
hermens
d4645062bc Correct usage of vhost in wget/wput requests:
- yacyClient: don't use own .yacyh domain in requests, instead use .yacyh domain of target peer for everything but ranking distribution
- natLib: use full hostname instead of just SLD.TLD



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-21 14:25:27 +00:00
orbiter
fd7c17e624 added virtual host support:
all yacy-to-yacy communication now send the <peer-hexhash>.yacyh
virtual domain inside the http 'Host' property field.
This shall enable running a yacy peer on a virtual host.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-09 13:11:00 +00:00
theli
a10072b5c0 *) Adding timeout to ftpc.java
See: http://www.yacy-forum.de/viewtopic.php?t=2268
   

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2035 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-23 10:35:46 +00:00
orbiter
59fc55ea1e added checks to protect peers from wrong seeds
see also: http://www.yacy-forum.de/viewtopic.php?p=19249#19249

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1939 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-20 20:31:51 +00:00
allo
b3df5ce2b8 another local iprange
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1875 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-10 18:34:53 +00:00
allo
32d4c3e4c6 static IP is proper
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1779 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-28 08:57:20 +00:00
theli
026dcdfcc0 *) Bugfix from mbirth for ftpc bug
See: http://www.yacy-forum.de/viewtopic.php?p=15496#15496
   

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1381 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-20 13:37:42 +00:00
orbiter
37f88b4017 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1176 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-06 23:51:29 +00:00
orbiter
8f1f2daa5e implemented interactive link deletion of search results.
next steps: attach voting and restrict to administrator
to see the deletion button, move the mouse pointer to the left of a search result

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1172 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-06 16:15:21 +00:00
orbiter
3d8a5ae652 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1166 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-05 14:24:13 +00:00
orbiter
1d6a6d1f85 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1159 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-05 00:17:12 +00:00
orbiter
79818a320f introduced citation-rank transmission protocol and activate transport for anonymisation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1055 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-10 23:48:20 +00:00
theli
d3ad712418 *) Bugfix for Seed file upload problem via ftpc
See: http://www.yacy-forum.de/viewtopic.php?p=11662#11662

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@984 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-25 10:45:14 +00:00
theli
02d9af1a70 *) Restructuring and extending of Remote Proxy Support
- remote proxy configuration can now be "really" changed on the fly and takes effect immediately
   - adding possibility to disable remote proxy usage for yacy->yacy communication
   - adding possibility to disable remote proxy usage for ssl
   - restructuring proxy configuration so that it is stored in a single place now

*) Adding possibility to import a foreign word DB (or even more of them in parallel) 
   at runtime into the peers DB
   - this can be done by calling IndexImport_p.html 
   - ATTENTION: please not that at the moment this thread must be aborted via gui
     before a normal server shutdown is done. 
   - TODO: integrating IndexImport Thread into normal server shutdown
   - TODO: Adding posibility to import crawl-queues, etc. from foreign peers
   - TODO: removing old import function from yacy.java and calling the new routines instead

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@968 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-22 13:28:04 +00:00
theli
a2fa75e688 *) Asynchronous queuing of crawl job URLs (stackCrawl)
various checks like the blacklist check or the robots.txt disallow check are now
   done by a separate thread to unburden the indexer thread(s)
   TODO: maybe we have to introduce a threadpool here if it turn out that this single
         thread is a bottleneck because of the time consuming robots.txt downloads

*) improved index transfer
   The index selection and transmission is done in parallel now to improve index 
   transfer performance.
   TODO: maybe we could speed up performance by unsing multiple transmission threads in 
         parallel instead of only a single one.

*) gzip encoded post requests
   it is now configureable if a gzip encoded post request should be send on
   intex transfer/distribution

*) storage Peer (very experimentell and not optimized yet)
   Now it's possible to send the result of the yacy indexer thread to a remote peer 
   istead of storing the indexed words locally. 
   This could be done by setting the property "storagePeerHash" in the yacy config file
   - Please note that if the index transfer fails, the index ist stored locally.
   - TODO: currently this index transfer is done by the indexer thread. 
     To seedup the indexer
     a) this transmission should be done in parallel and
     b) multiple chunks should be bundled and transfered together


*) general performance improvements  
   - better memory cleanup after http request processing has finished
   - replacing some string concatenations with stringBuffers
   - replacing BufferedInputStreams with serverByteBuffer
   - replacing vectors with arraylists wherever possible
   - replacing hashtables with hashmaps wherever possible
   This was done because function calls to verctor or hashtable functions
   take 3 time longer than calls to functions of arraylists or hashmaps.
   TODO: we should take a look on the class serverObject which is inherited from hashmap
         Do we realy need a synchronization for this class?
   TODO: replace arraylists with linkedLists if random access to the list elements is not needed

*) Robots Parser supports if-modified-since downloads now
   If the downloaded robots.txt file is older than 7 days the robots parser tries to
   download the robots.txt with the if-modified-since header to avoid unnecessary downloads
   if the file was not changed. Additionally the ETag header is used to detect changes.

*) Crawler: better handling of unsupported mimeTypes + FileExtension

*) Bugfix: plasmaWordIndexEntity was not closed correctly in 
   - query.java
   - plasmaswitchboard.java

*) function minimizeUrlDB added to yacy.java 
   this function tests the current urlHashDB for unused urls
   ATTENTION: please don't use this function at the moment because
              it causes the wordIndexDB to flush all words into the
              word directory!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@853 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-05 10:45:33 +00:00
orbiter
3c1d968d29 fix-fix for 792 and small changes in ftpc/download/dir experiments
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@797 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-26 10:36:42 +00:00
orbiter
dc474aa22f various bug-fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@792 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-26 01:10:41 +00:00
rramthun
02c242ae22 minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@693 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-09 21:01:53 +00:00
theli
73ded2f0b6 *) Trying to fix bug for Seed-Upload-Failed
Bug may be caused because of timing issues
   See: http://www.yacy-forum.de/viewtopic.php?p=9439

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@688 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-09 10:44:32 +00:00
borg-0300
718950c5da small change
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@679 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-07 15:20:12 +00:00
theli
3dfda1c9da *) More verbose output on ftp-seed-upload failure
See: http://www.yacy-forum.de/viewtopic.php?p=8000#8000

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@605 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-30 12:18:41 +00:00
allo
66ebce1109 use staticIP more often
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@592 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-28 16:55:52 +00:00
allo
eb6365c069 local Bootstrapping bug.
use yacyDebugMode=true to allow local bootstrapping


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@572 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-22 12:13:19 +00:00
orbiter
2d8557cb10 minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@487 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-03 02:02:39 +00:00
theli
6f4d2e5272 *) fixing replace bug.
using 
      stringvar = stringvar.replace(xxx) 
   istead of 
      stringvar.replace()

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@101 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-10 12:17:50 +00:00
theli
2aa5fe8f50 *) Import statements reorganized
Now it's easier to determine which class really uses which other class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@82 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-05 05:32:19 +00:00
allo
d005d7484e yacyDebugMode - allow Lan-IPs for testing
where was the Code from 0.25 lost?


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@53 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 12:13:49 +00:00
orbiter
248077d3f0 initial load with yacy 0.36
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-07 19:19:42 +00:00