Commit Graph

2633 Commits

Author SHA1 Message Date
theli
043edfa4d8 *) ftp/ResourceInfo.java: ResourceInfo object for ftp resources added
*) ftp/CrawlWorker.java: better error handling for ftp crawler
*) plasmaCrawlEURL.java: some error codes added

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2499 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-07 04:12:52 +00:00
orbiter
4866868c0e added write cache for LURLs
This was necessary to speed up the index receive process during global search


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2498 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-07 01:13:03 +00:00
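The write cache mentioned in this commit can be illustrated with a small write-behind buffer. This is a minimal sketch under assumptions, not the code from the commit: the class name `WriteCacheSketch`, the flush threshold, and the in-memory map standing in for the slow backing table are all hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical write-behind cache; names and threshold are illustrative only.
class WriteCacheSketch {
    // Stands in for the slow on-disk table (assumption for this sketch).
    private final Map<String, String> backingStore = new HashMap<>();
    // Buffered writes that have not reached the backing store yet.
    private final Map<String, String> pending = new LinkedHashMap<>();
    private final int flushThreshold;
    int flushCount = 0; // number of bulk flushes, exposed for demonstration

    WriteCacheSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    // Writes go to the buffer first; the store is touched only in bulk.
    void put(String urlHash, String entry) {
        pending.put(urlHash, entry);
        if (pending.size() >= flushThreshold) {
            flush();
        }
    }

    // Reads must consult the buffer first so unflushed entries stay visible.
    String get(String urlHash) {
        String cached = pending.get(urlHash);
        return cached != null ? cached : backingStore.get(urlHash);
    }

    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        backingStore.putAll(pending);
        pending.clear();
        flushCount++;
    }

    int storedCount() {
        return backingStore.size();
    }
}
```

The point of the design is that many small writes, such as those arriving during index receive, are turned into a few bulk writes, while reads still see entries that have not been flushed yet.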
orbiter
8a0e35618b enhancements to search result preparation
- added detailed count on remote search results
- enhanced search sequence during remote searches (doing local search in sequence)
- strict adherence to timeout limits

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2497 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-06 17:51:28 +00:00
theli
5c1bb53d2a Missing description for last commit
*) next step of restructuring for new crawlers
   > HTCaching should now work independently of the protocol
   -- introduction of new ResourceInfo objects containing protocol-specific metadata
      of a resource.
   -- the ResourceInfo objects now implement old functions like shallIndexCacheForXXX,
      shallStoreCacheForXXX in a protocol-dependent manner
   > Indexing should now also work independently of the protocol

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2496 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-06 14:35:45 +00:00
theli
dae763d8e3 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2495 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-06 14:31:17 +00:00
theli
4825bfaaf3 *) Bugfix for PrintWriter Problem
See: http://www.yacy-forum.de/viewtopic.php?t=2792

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2494 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-05 18:55:45 +00:00
orbiter
d4c5e2af01 html-dirlist can now also be generated from existing connections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2493 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-05 10:11:07 +00:00
theli
7930839594 *) URL.java: userinfo was not carried over when generating a new url from a base url and a relative path
*) CrawlWorker.java: using new dirhtml function of ftpc

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2492 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-05 05:17:57 +00:00
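The behaviour this fix aims for, keeping the userinfo of the base url when a relative path is resolved against it, can be shown with the standard `java.net.URI` class. This is only an illustration of the expected result; it does not use the project's own URL.java, and the host name and credentials are placeholders.

```java
import java.net.URI;

// Illustration with java.net.URI; not the project's own URL class.
class UserInfoResolveSketch {
    // Resolves a relative path against a base url, keeping the base's authority part.
    static URI resolve(String base, String relativePath) {
        return URI.create(base).resolve(relativePath);
    }

    public static void main(String[] args) {
        // Placeholder host and credentials.
        URI resolved = resolve("ftp://username:pwd@ftp.example.org/pub/", "docs/readme.txt");
        System.out.println(resolved);               // full url including userinfo
        System.out.println(resolved.getUserInfo()); // userinfo inherited from the base url
    }
}
```

Because the relative reference carries no authority of its own, the resolved url inherits the authority of the base, including `username:pwd`, which an ftp crawler needs in order to authenticate.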
orbiter
17ba468165 added html dirlisting generation in ftpc.java:
ftpc.dirhtml() generates a StringBuffer with a complete web page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2491 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-05 00:11:59 +00:00
(no author)
2dacf63dd9 Spelling correction of the language list "Slovenky" -> "Slovensky"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2490 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 19:55:21 +00:00
theli
413e6b9855 *) direct access to response headers of sbQueue.Entry removed to make it more http-independent
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2489 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 15:56:49 +00:00
theli
2126c51906 *) bugfix for ViewFile.java. Wrong http headers were used
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2488 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 15:49:51 +00:00
theli
7a35b8e237 *) direct access to response headers of sbQueue.Entry removed to make it more http-independent
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2487 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 15:36:19 +00:00
theli
ffbf416e76 *) direct access to request header of htCache.Entry removed to make it more http-independent
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2486 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 15:29:45 +00:00
theli
3870d615e3 *) setting htCache.Entry fields to private
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2485 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 15:06:58 +00:00
theli
393a7d10be *) setting htCache.Entry fields to private
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2484 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 15:03:54 +00:00
theli
ab5a9bee66 *) adding some copyright headers
*) next step of restructuring for new crawlers
   - adding first testversion of ftp crawler class
   -- does not create a htCache entry yet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2483 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 14:38:29 +00:00
theli
5847492537 *) next step of restructuring for new crawlers
- IndexCreate_p.java: correcting problems with ftp urls
   - URL.java does not cut out the userinfo anymore
    (needed to transport authentication info in ftp urls, e.g. ftp://username:pwd@ftp.irgendwas.de)
   - plasmaCrawlLoader.java:
   -- hack to re-enable https urls
   -- adding function getSupportedProtocols

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2482 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 13:17:11 +00:00
orbiter
6cce47e217 test of ftp-urls in URL class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2481 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 13:10:40 +00:00
theli
fce9e7741b *) next step of restructuring for new crawlers
- renaming of http specific crawler settings

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2480 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 11:56:47 +00:00
theli
e3f0136606 *) next step of restructuring for new crawlers
- adding function isSupportedProtocol to plasmaCrawlLoader.java
   - disabling robots.txt check for protocols other than http(s)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2479 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 11:46:17 +00:00
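The two changes in this commit, a protocol check and a robots.txt check that applies only to http(s), might look roughly like the sketch below. The class name, the set of supported protocols, and the helper `needsRobotsCheck` are assumptions made for illustration; the real plasmaCrawlLoader.java is not reproduced here.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; the supported-protocol list is an assumption.
class ProtocolCheckSketch {
    private static final Set<String> SUPPORTED =
            new HashSet<>(Arrays.asList("http", "https", "ftp"));

    // True if a crawler is available for the given protocol name.
    static boolean isSupportedProtocol(String protocol) {
        return protocol != null && SUPPORTED.contains(protocol.toLowerCase(Locale.ROOT));
    }

    // robots.txt is only consulted for http and https urls.
    static boolean needsRobotsCheck(String url) {
        String scheme = URI.create(url).getScheme();
        return "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
    }
}
```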
theli
9ded4e8d5a *) Bugfix for name resolution in proxy mode
See: http://www.yacy-forum.de/viewtopic.php?p=25241

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2478 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 11:26:53 +00:00
theli
1c8300fcec *) Bugfix for name resolution in proxy mode
See: http://www.yacy-forum.de/viewtopic.php?p=25241

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2477 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 11:23:57 +00:00
theli
4e2a950ac9 *) next step of restructuring for new crawlers
- avoid using the http crawler class directly; use the interface class instead

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2476 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 09:24:24 +00:00
theli
f94131c13d *) Bugfix for Blacklist_p.java
- avoid NullPointerException

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2475 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 09:02:23 +00:00
theli
09b106eb04 *) next step of restructuring for new crawlers
- adding interface class (plasma/crawler/plasmaCrawlWorker.java) for protocol-specific crawl-worker threads
   - moving reusable code into abstract crawl-worker class AbstractCrawlWorker.java
   - the load method of the worker threads should not be called directly anymore (e.g. by the snippet fetcher);
     to crawl a page and wait for the result, use the function plasmaCrawlLoader.loadSync([...])

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2474 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 09:00:18 +00:00
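The structure described in this commit (a worker interface, an abstract base class holding the reusable code, and a loader whose loadSync is the only entry point for callers) can be sketched as follows. All names carry a `Sketch` suffix because they are hypothetical; the signatures of the real plasmaCrawlWorker, AbstractCrawlWorker and plasmaCrawlLoader.loadSync are not shown in the commit message and are not assumed here.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; this is not the project's actual API.
interface CrawlWorkerSketch {
    String load(String url);
}

// Reusable code lives in the abstract class; subclasses add the protocol part.
abstract class AbstractCrawlWorkerSketch implements CrawlWorkerSketch {
    int loadCount = 0; // shared bookkeeping, exposed for demonstration

    @Override
    public final String load(String url) {
        loadCount++;
        return fetch(url);
    }

    // Protocol-specific retrieval, implemented once per protocol.
    protected abstract String fetch(String url);
}

class CrawlLoaderSketch {
    private final Map<String, CrawlWorkerSketch> workers = new HashMap<>();

    void register(String protocol, CrawlWorkerSketch worker) {
        workers.put(protocol, worker);
    }

    // Callers use this method instead of calling a worker's load method directly.
    String loadSync(String url) {
        String protocol = URI.create(url).getScheme();
        CrawlWorkerSketch worker = workers.get(protocol);
        if (worker == null) {
            throw new IllegalArgumentException("unsupported protocol: " + protocol);
        }
        return worker.load(url);
    }
}
```

Routing every request through the loader keeps callers such as a snippet fetcher independent of which worker class handles a given protocol.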
theli
eb9b138986 *) next step of restructuring for new crawlers
- conversion of the crawler pool into a keyed object pool
   - crawlers are now loaded based on the url protocol (of course, this works only for http for now)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2473 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 06:52:55 +00:00
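A keyed object pool keeps a separate set of idle objects per key, here the url protocol. The sketch below uses only the standard library and hypothetical names; libraries such as Apache Commons Pool provide full keyed pools, and the commit message does not say which implementation was used.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical minimal keyed pool; not thread-safe, for illustration only.
class KeyedPoolSketch<T> {
    private final Map<String, Supplier<T>> factories = new HashMap<>();
    private final Map<String, Deque<T>> idle = new HashMap<>();

    // One factory per key, e.g. one crawler factory per protocol.
    void registerFactory(String key, Supplier<T> factory) {
        factories.put(key, factory);
    }

    // Reuses an idle object for the key if one exists, otherwise creates one.
    T borrow(String key) {
        Supplier<T> factory = factories.get(key);
        if (factory == null) {
            throw new IllegalArgumentException("no factory for key: " + key);
        }
        Deque<T> queue = idle.get(key);
        if (queue != null && !queue.isEmpty()) {
            return queue.pop();
        }
        return factory.get();
    }

    // Returns an object to the idle set of its key.
    void giveBack(String key, T object) {
        idle.computeIfAbsent(key, k -> new ArrayDeque<>()).push(object);
    }
}
```

With the protocol as key, supporting a further protocol amounts to registering another factory; the borrowing code does not change.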
theli
1395aae742 *) starting restructuring which is needed to add crawlers for additional protocols
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2472 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 06:09:20 +00:00
theli
857a2d76a2 *) better handling of server shutdown
See: e.g. http://www.yacy-forum.de/viewtopic.php?p=25234

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2471 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 05:47:17 +00:00
theli
b4acbdaa97 *) better handling of server shutdown
See: e.g. http://www.yacy-forum.de/viewtopic.php?p=25234

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2470 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 05:17:37 +00:00
orbiter
7df572756a first step and attempt to solve the snippet marking problem.
See: http://www.yacy-forum.de/viewtopic.php?p=22855#22855

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2469 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-03 23:22:23 +00:00
theli
f3ac4dbbb9 *) better handling of server shutdown
See: e.g. http://www.yacy-forum.de/viewtopic.php?t=2584

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2468 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-03 14:59:00 +00:00
theli
959b779aba *) avoid performance loss if log level is greater than 'fine'
See: http://www.yacy-forum.de/viewtopic.php?p=25180

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2467 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-03 08:42:46 +00:00
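A common way to avoid this kind of performance loss is to check the log level before building an expensive message. The sketch below shows that pattern with `java.util.logging`; whether the commit uses exactly this guard is an assumption, and the class and method names are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of a log-level guard.
class LogGuardSketch {
    static int expensiveCalls = 0; // counts how often the costly message is built

    // Stands in for costly string building, e.g. dumping a large structure.
    static String expensiveDescription() {
        expensiveCalls++;
        return "details";
    }

    static void logFine(Logger logger) {
        // Without this check the message would be built even when it is discarded.
        if (logger.isLoggable(Level.FINE)) {
            logger.fine("state: " + expensiveDescription());
        }
    }
}
```

When the logger is set to INFO or higher, `expensiveDescription()` is never called, so the cost of assembling the message disappears entirely.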
auron_x
b515d49f87 *) fix for new combinedVersionString2PrettyString by bost
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2466 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-02 07:29:12 +00:00
auron_x
24316ba937 *) improved implementation of combinedVersionString2PrettyString by bost
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2465 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-30 16:04:40 +00:00
auron_x
57dda1a92c *) another fix for the wrong version display, now working entirely with double instead of float
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2464 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-28 17:54:07 +00:00
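Why float can corrupt a version display is easy to demonstrate: a float carries only about seven significant decimal digits. The sketch below assumes, purely for illustration, a combined version value such as 0.46102499 (version 0.461 followed by a five-digit revision 02499); the real format and the code of combinedVersionString2PrettyString are not shown in the commit message.

```java
// Hypothetical sketch; the combined version format is an assumption.
class VersionPrecisionSketch {
    // Extracts the trailing five-digit revision using double arithmetic.
    static long revisionViaDouble(String combined) {
        double value = Double.parseDouble(combined);
        return Math.round(value * 1e8) % 100000;
    }

    // Same extraction, but the value passes through a float first.
    static long revisionViaFloat(String combined) {
        float value = Float.parseFloat(combined);
        return Math.round((double) value * 1e8) % 100000;
    }

    public static void main(String[] args) {
        System.out.println(revisionViaDouble("0.46102499"));
        System.out.println(revisionViaFloat("0.46102499"));
    }
}
```

With double the revision 2499 is recovered exactly, while the float path rounds the eighth decimal digit away and no longer yields 2499.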
auron_x
479b74e1dd *) fix for stupid mistake in new ppm-calc which caused decimal digits being written to seedinfo
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2463 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-28 04:43:28 +00:00
auron_x
5e558fbaae *) hopefully fixed the wrong display of yacy-version
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2462 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-27 21:52:58 +00:00
auron_x
348258a557 *) changed PPM-calculation to be much more accurate
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2461 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-27 17:18:34 +00:00
orbiter
18b6876860 new cache flush configuration settings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2460 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-25 22:31:21 +00:00
hermens
f0278b4092 Bugfix for / by zero when the AssortmentCluster is empty
See: http://www.yacy-forum.de/viewtopic.php?t=2746



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2459 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-25 20:23:04 +00:00
orbiter
14e0bb0dcf allow more references per word for new db
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2458 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-25 12:06:23 +00:00
orbiter
985dcbde7f changed some parameters, which may improve memory usage and indexing speed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2457 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 23:39:52 +00:00
orbiter
b7f4a1521b added options to switch on or off the kelondroFlexTable for NURL, EURL and PreNURL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2456 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 22:21:22 +00:00
orbiter
c26da4893b turned back NURL usage of kelondroTree, kelondroFlexTable still has problems with deleted entries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2454 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 10:03:38 +00:00
orbiter
db1eae0227 * simplified initialization of database objects
* replaced kelondroTree for NURLs by kelondroFlex
* replaced kelondroTree for EURLs by kelondroFlex
take care: this may be very buggy.
please finish crawls before updating; crawls will be lost.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2452 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 02:19:25 +00:00
hermens
0b73f2b132 Repair DNS prefetch during cacheScan
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2451 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-24 01:31:08 +00:00
rramthun
e34e07e0a1 - Changed back to dev naming scheme and new 0.461
- Corrected some errors in News.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2450 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-23 19:39:48 +00:00
allo
2d9478d203 installer for 0.46
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2444 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-23 13:09:48 +00:00
orbiter
27a159b401 * documentation update
* removed doc from release
* release information in doc/News.html
* release 0.46

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2442 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-23 11:36:09 +00:00