Commit Graph

969 Commits

Author SHA1 Message Date
orbiter
aea3e00864 cleanup: removed unused temporary index management in indexEntity.
This is replaced by indexContainers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1486 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-30 01:18:25 +00:00
orbiter
03c65742ba changes towards the new index storage scheme:
- replaced usage of temporary IndexEntity by EntryContainer
- added more attributes to word index
- added exact-string search (using quotes in query)
- disabled writing into WORDS during search; EntryContainers are used instead


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1485 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-30 00:42:38 +00:00
rramthun
84a00e5673 Use YaCy logging instead of something I don't understand.
Problem was: YaCy under Linux wrote every CORRECTING ITERATOR message to syslog, and your logfiles get VERY big if you run YaCy 24/7.
Approx. 20MB/day.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1483 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-29 16:04:20 +00:00
theli
ab7a911bb3 *) Trying to solve pool not open problem
See: http://www.yacy-forum.de/viewtopic.php?t=1798

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1482 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-29 08:54:19 +00:00
allo
a6245a302f even better ppm ;-)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1481 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 19:22:30 +00:00
hydrox
d665f3c39c *) fixed Threadnames for stackCrawl-Threads
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1480 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 19:06:21 +00:00
theli
3d5347bc8e *) changing loglevel for some messages
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1479 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 17:40:24 +00:00
theli
0fcd113c42 *) last bugfix part. Seems to work now for the stackCrawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1478 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 17:25:19 +00:00
theli
b9c9eaeb44 *) next try to do a bugfix :-((
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1477 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 17:05:47 +00:00
theli
4b4b93c413 *) next try to do a bugfix :-(
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1476 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 16:55:05 +00:00
theli
d9fbad71b9 *) next try to do a bugfix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1475 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 16:38:25 +00:00
theli
6da97bd2e4 *) next bugfix for threadpool problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1474 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 16:31:40 +00:00
theli
bea2b9edee *) further redesign of threadpools to solve too many thread problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1473 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 16:18:07 +00:00
hermens
2d1283da34 This is an extremely ugly workaround for an incompatibility between yacySeed hashes and kelondroDyn keys
See: http://www.yacy-forum.de/viewtopic.php?p=15955#15955



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1472 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 15:26:56 +00:00
theli
784fd50437 *) more verbose thread names
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1471 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 15:26:47 +00:00
theli
56e4dbeb71 *) displaying current active + current idle threads in PerformanceQueues_p.html now
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1470 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-28 15:17:04 +00:00
theli
859c6a88f5 *) testing various thread pool eviction settings to avoid outOfMemory - Thread creation problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1467 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-27 16:51:29 +00:00
allo
7197f171d3 better ppm calculation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1464 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-27 11:51:27 +00:00
orbiter
f2b18cede9 AND-bugfix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1461 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-27 03:58:38 +00:00
orbiter
b946e28e61 some ranking enhancements
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1460 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-27 02:48:27 +00:00
rramthun
6c02f889f7 Cosmetic changes.
Corrected version numbering as described in http://www.yacy-websuche.de/wiki/index.php/De:Versionsnummern

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-26 15:12:44 +00:00
theli
b191f06d16 *) Adding additional logging message to locate problems with stackcrawl threads
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1452 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-26 14:24:29 +00:00
theli
5c56b9ed59 *) catch exceptions that could occur during url decoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1451 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-26 13:57:49 +00:00
theli
d9bcd73d93 *) Bugfix for exception
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1448 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-26 12:15:59 +00:00
theli
f5abfe8d57 *) more failsafe threadpools
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1446 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-26 09:37:43 +00:00
orbiter
47344e8df0 removed referrer fake (too many complaints, too little use)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1444 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-25 18:54:46 +00:00
hermens
ad0de69607 Yet another bug fix for svn 1441. It should work now.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1443 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-25 13:32:04 +00:00
hermens
58fd40e1c1 Aaargh
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1442 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-25 13:28:03 +00:00
hermens
b08af0c2cb *) Force download of seed file when checking upload success
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1441 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-25 13:25:48 +00:00
hermens
66c889138e *) Bugfix: Principals are reported back as 'principal', so IWasAccessed should also be true
*) make it easier to include legacy peers switching between timezones +0100 and +0200



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1438 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-25 01:50:24 +00:00
orbiter
a56fefe0d3 added missing forced-flush for index cache
see http://www.yacy-forum.de/viewtopic.php?p=15732#15732

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-24 16:24:15 +00:00
hermens
78bcb8014a *) Limit range for selection of indexes for distribution to a DHTDistance of 0.2
   (For wider ranges, finding enough suitable targets is unlikely)
*) Migrate Indexes from ClassicDB back to AssortmentCluster if transfer fails
*) Remove class iterateFiles from plasmaWordIndex
   (The class iterateFiles from plasmaWordIndexClassicDB is used instead)



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1430 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-24 14:58:15 +00:00
hermens
861aae678d *) cleanup cacheAge database when cleaning up the HTCache
*) Log directory deletes with level Fine



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1427 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-24 14:21:01 +00:00
orbiter
b9d73f63e7 replaced String object in loop detection by byte[] to omit String-generation
which could cause locks.
See http://www.yacy-forum.de/viewtopic.php?p=15738#15738

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1425 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-24 13:12:38 +00:00
low012
927c2c3709 *) Fixed a minor bug in code for tables. {|border"1" did not work, {| border"1" did. Now the space is not needed anymore.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1423 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-24 11:15:20 +00:00
theli
75aad0fe66 *) Bugfix for URLs containing spaces
See: http://www.yacy-forum.de/viewtopic.php?t=1640

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1422 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-24 09:30:58 +00:00
theli
754a35877f *) Changing robots parser exclusion policy
   - crawling is now allowed if the server returned a 403 status code
     when trying to download the robots.txt
   See: http://www.yacy-forum.de/viewtopic.php?t=1612

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1421 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-24 08:53:30 +00:00
hermens
a2e2d583f9 *) small bugfix regarding peerPingMaxRunning
*) beautify log



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-24 01:16:22 +00:00
theli
b4e2efef10 *) first test of new iteration function
ATTENTION: please don't use it at the moment

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1418 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-23 17:20:30 +00:00
rramthun
a4e90c4b11 Fixed spelling bug.
I think this is important so that other programmers don't make the same mistake as the original author.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1417 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-23 15:08:39 +00:00
low012
c45517db46 *) replaced code for table with better version (by kane)
*) split replaceHTML into replaceCharacters and replaceHTMLonly, replaceHTML can still be used to ensure compatibility



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1416 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-23 13:59:40 +00:00
orbiter
eabf4a0386 fix for null pointer exception during shut-down
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1415 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-23 13:45:14 +00:00
orbiter
47843e69e2 auto-reset for switchboard queue stack
bugfix for http://www.yacy-forum.de/viewtopic.php?p=15684#15684

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-23 12:41:08 +00:00
orbiter
a70970f993 fixed increment in content iterator
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1413 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-23 12:11:32 +00:00
hermens
62ab8d18c1 *) Bugfix for peer sorting method. This seems to cause funny side effects in the SeedDB
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1412 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-23 01:42:26 +00:00
hermens
75b268f16d *) use majority voting for peer type decision
*) reduce the number of peer pings sent out
see: http://www.yacy-forum.de/viewtopic.php?t=1748



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1411 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-22 23:14:37 +00:00
orbiter
d6581c445b added content iterator for corrupted database files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1406 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-22 17:12:16 +00:00
theli
ecdc1f7547 *) Bugfix for crawling URLs with query parameters
See: http://www.yacy-forum.de/viewtopic.php?p=14065
*) Preparation for http://www.yacy-forum.de/viewtopic.php?t=1719

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1405 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-22 16:39:10 +00:00
low012
eb80156233 *) added Kane's code for tables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1403 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-22 15:41:49 +00:00
low012
ef22fa8bf2 *) beautifying code and a little comment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1401 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-22 14:20:38 +00:00