Commit Graph

6063 Commits

Author SHA1 Message Date
orbiter
ea473e32b8 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6390 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-09 22:27:50 +00:00
orbiter
735e2737e3 * added index segments
This is a major change in the organization of indexes.
Please consider a back-up of your data before you run this update.
All existing index files will be moved and renamed to a new position.
With this change, it will be possible to maintain different indexes for different purposes and to distinguish between DHT-in and DHT-out specific indexes. Tenants may also have their own index, and it may be possible to keep histories and back-ups of indexes. This is just the beginning; many servlets must be adapted after this change, but all functions that had been there should still work.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-09 14:44:20 +00:00
orbiter
09de5da74a once again a performance hack
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6388 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-08 18:26:54 +00:00
orbiter
2f6d88403e
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6387 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-08 18:10:56 +00:00
orbiter
d2615ea5a8 increased memory for scraper buffer to enhance parsing speed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6386 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-08 15:27:13 +00:00
orbiter
4bbbb74ec4 removed unnecessary synchronization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6385 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-08 15:26:28 +00:00
hermens
67e5464cc2 Fix for SVN6380: x[] Arrays are unsuitable Keys for Maps without using a proper Comparator.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6384 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-08 12:55:36 +00:00
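
The point about arrays as map keys in the commit above can be illustrated with a short sketch. It is not taken from the YaCy sources: the class name `ArrayKeyDemo` and the comparator `BY_CONTENT` are hypothetical. Java arrays inherit identity-based `hashCode()` and `equals()` from `Object`, so a hash-based map cannot find an entry through a second array with the same content, while a `TreeMap` given a content-based `Comparator` can.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class ArrayKeyDemo {
    // Hypothetical helper: compares byte arrays by content, lexicographically.
    public static final Comparator<byte[]> BY_CONTENT = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Byte.compare(a[i], b[i]);
            if (c != 0) return c;
        }
        return Integer.compare(a.length, b.length);
    };

    // Looks up an equal-content array in a HashMap keyed by byte[].
    public static boolean hashMapFinds() {
        Map<byte[], String> m = new HashMap<>();
        m.put(new byte[] {1, 2, 3}, "chunk");
        // Arrays use identity equals(), so a different array instance never matches.
        return m.containsKey(new byte[] {1, 2, 3});
    }

    // Same lookup in a TreeMap that orders keys with the content comparator.
    public static boolean treeMapFinds() {
        Map<byte[], String> m = new TreeMap<>(BY_CONTENT);
        m.put(new byte[] {1, 2, 3}, "chunk");
        return m.containsKey(new byte[] {1, 2, 3});
    }

    public static void main(String[] args) {
        System.out.println("HashMap finds key: " + hashMapFinds());
        System.out.println("TreeMap finds key: " + treeMapFinds());
    }
}
```

With a sorted map the comparator alone decides key equality, which is why supplying one is enough to make array keys usable.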
lotus
5f72d2b19f update to jre6u16
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6383 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-08 10:48:33 +00:00
hermens
aeab8c7917 Prevent failed DHT attempts from overwriting newer peer info
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6382 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-08 00:17:29 +00:00
hermens
9324b5b6c5 Enhancements to DHT
- speed up deletion of containers when selected from whole index
- correctly eliminate all references to unavailable URLs, not just the first encountered



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6381 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-08 00:03:16 +00:00
hermens
e49e2d75fe Limit the time Transmission.Chunks stay in the transmissionCloud by using a Map that iterates entries in insertion order.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6380 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-07 23:41:25 +00:00
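
A minimal sketch of the idea in the commit above, assuming a `LinkedHashMap` is the insertion-ordered map in question (the commit does not name the class). `ExpiringCloud`, its methods, and the string chunk ids are hypothetical and do not reflect the actual `Transmission` code. Because iteration starts with the oldest entry, eviction can stop at the first entry that is still young enough.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class ExpiringCloud {
    private final long maxAgeMillis;
    // LinkedHashMap iterates in insertion order by default: oldest entry first.
    private final Map<String, Long> insertedAt = new LinkedHashMap<>();

    public ExpiringCloud(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    // Assumes callers pass non-decreasing timestamps.
    public void add(String chunkId, long now) {
        // put() on an existing key keeps its old position, so remove it first.
        insertedAt.remove(chunkId);
        insertedAt.put(chunkId, now);
    }

    // Removes all entries older than maxAgeMillis and returns how many were dropped.
    public int evictExpired(long now) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = insertedAt.entrySet().iterator();
        while (it.hasNext()) {
            if (now - it.next().getValue() <= maxAgeMillis) {
                break; // every later entry was inserted more recently
            }
            it.remove();
            removed++;
        }
        return removed;
    }

    public int size() {
        return insertedAt.size();
    }
}
```

The early `break` is what the insertion order buys: a plain `HashMap` would require scanning every entry on each eviction pass.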
orbiter
92db7c5d07 increased timeout for index retrieval
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-07 13:03:13 +00:00
lotus
386b9f35f6 activated resource observer for windows 7
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6378 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-07 06:20:24 +00:00
orbiter
6e0dc39a7d - some fixes to prevent blocking situations
- better logging for the crawler
- better default values for the crawler

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6377 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-06 21:52:55 +00:00
orbiter
51f2bbf04b possible fix for problem in http://forum.yacy-websuche.de/viewtopic.php?p=17655#p17655
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6376 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-06 09:56:14 +00:00
orbiter
f8371707e5 - possibly better termination for SplitTable
- better abstraction in DidYouMean

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6375 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-05 22:09:58 +00:00
orbiter
87780f2562 produce did-you-mean also for queries with more than one word
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6374 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-05 21:51:02 +00:00
orbiter
04a548a1e3 - temporarily integrated the transferURL servlet as a static class instead of a class that is called using reflection, to investigate the OOM problems in that class
- fixes for numerous other problems
- removed dead code
- redesign of the strings-method, which now produces less memory overhead and may help to prevent OOMs
- another fix for the deadlock problem in SplitTable

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-05 20:11:41 +00:00
orbiter
ea427df944 fixed a worst case situation of the condenser which may cause a temporary full CPU load because of a bad data structure usage
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6372 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-05 08:26:55 +00:00
lotus
f1bde59c50 logger config cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6371 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-02 18:03:14 +00:00
orbiter
3e38035389 fix for interrupted thread during has() property check
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6370 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-02 10:55:40 +00:00
orbiter
5bd1c1d205 just added some comments that had been produced to learn about OAI-PMH
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6369 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-01 22:56:22 +00:00
suessthomas
d52cf19835 small changes to de.lng (parser settings)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6368 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-01 20:27:05 +00:00
orbiter
6aa474f529 - better logging for web cache access and fail reasons
- better Exception handling for web cache access
- distinction between access of web cache for proxy and crawler


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6367 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-01 13:08:19 +00:00
orbiter
3671c37989 added experimental oai-pmh reader and integrated it with the existing dublin core parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6366 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-30 22:11:00 +00:00
orbiter
0c17b600c6 remote search by default off
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6365 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-30 15:06:29 +00:00
orbiter
58a00205d5 re-activated the emergency close when too many server connections exist
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6364 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-30 14:29:43 +00:00
orbiter
c57d2070e6 more logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6363 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-30 13:25:08 +00:00
orbiter
a995b95367 tried a fix for the httpd access bug (too many unclosed sessions)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6362 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-30 13:18:02 +00:00
orbiter
e1fba41cad better logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6361 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-29 21:52:17 +00:00
orbiter
2275f885a8 possible fix for concurrency problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6360 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-29 21:40:50 +00:00
low012
a6a3090c3d *) blacklist cleaner supports usage of regular expressions now
*) refactored BlacklistCleaner_p.java for better readability
*) moved check of validity of patterns to the Blacklist implementation since patterns might be valid in one implementation, but not in another
*) added method to check validity to Blacklist interface
*) fixed some minor issues like typos or wrong whitespaces
*) set subversion properties for a whole bunch of files

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6359 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-29 21:28:49 +00:00
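
One way a regex-based blacklist implementation could check the validity of a pattern, as described in the commit above, is sketched here. It assumes the implementation relies on `java.util.regex`; the class `PatternCheck` and the method `isValidRegex` are hypothetical names, not the method added to the Blacklist interface.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class PatternCheck {
    // Returns true if the expression compiles as a java.util.regex pattern.
    public static boolean isValidRegex(String expression) {
        try {
            Pattern.compile(expression);
            return true;
        } catch (PatternSyntaxException e) {
            // The expression is not a well-formed regular expression.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex(".*\\.example\\.com/.*"));
        System.out.println(isValidRegex("[abc"));
    }
}
```

Keeping the check inside each implementation fits the commit's reasoning: a different blacklist engine could accept or reject patterns by its own rules.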
orbiter
5a93807781 improved web cache speed:
- removed one computation out of a synchronization
- removed one unnecessary has() call


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6358 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-29 08:41:05 +00:00
orbiter
2e8b2867ff double performance of store method because it avoids one 'has'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6357 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-29 08:23:44 +00:00
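
The commit above does not show how the extra 'has' was avoided. As a generic illustration only, using a plain `HashMap` rather than YaCy's own table classes, a store method can drop the existence check by using the return value of `put()`. `StoreOnce` and both method names are hypothetical, and the sketch assumes null values are never stored.

```java
import java.util.HashMap;
import java.util.Map;

public class StoreOnce {
    private final Map<String, byte[]> table = new HashMap<>();

    // Two lookups per call: an existence check followed by the write.
    public boolean storeWithHas(String key, byte[] value) {
        boolean existed = table.containsKey(key);
        table.put(key, value);
        return !existed;
    }

    // One lookup per call: put() returns the previous value, or null if absent.
    public boolean storeDirect(String key, byte[] value) {
        return table.put(key, value) == null;
    }
}
```

Both methods report whether the key was new; the second touches the map once instead of twice.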
orbiter
afda5b1adc new join method for indexes (not yet used)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6356 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-29 08:16:24 +00:00
orbiter
65b66c2c18 better handling of array files of length 0
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6355 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-29 08:13:44 +00:00
orbiter
1957b5797a fix for seed generation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6354 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-29 08:05:36 +00:00
orbiter
432154f725 new strategy for concurrent database index key retrieval
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6353 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-29 08:04:00 +00:00
suessthomas
09775daa60 slightly changed my skin (vega-aqua):
- search result headline font is slightly larger and has an underline (like g**gle)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6352 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-28 19:50:43 +00:00
orbiter
a11cd9f80f - removed reverse name lookup for http access logging (grr..)
- removed a synchronization in seed info string generation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6351 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-28 15:23:15 +00:00
orbiter
2e6bdce086 - added more logging to balancer
- changed balancer logic slightly

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6350 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-27 22:35:22 +00:00
low012
3c4064932c *) added width and height to prevent the page from "jumping" when the image is reloaded automatically in Opera 10
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6349 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-26 22:32:52 +00:00
low012
5e4f267a36 *) added subversion properties and edited a few comments
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6348 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-26 22:07:40 +00:00
low012
2d01411bbc *) tested string will appear in input box now (when testing a similar URL it does not need to be copied anymore; it can be edited right away)
*) https:// and ftp:// can be used as start of the string to be tested now too
*) better error handling (no text in Java anymore)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6347 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-26 21:33:33 +00:00
orbiter
3faa011e3d added another search integration help page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6346 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-25 14:03:41 +00:00
orbiter
26b81bd1f1 added another search integration help page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-25 14:03:07 +00:00
orbiter
69a091de17 added skin for geocaching search portal
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6344 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-24 22:53:50 +00:00
orbiter
1171a72006 fix for deadlock as seen in http://forum.yacy-websuche.de/viewtopic.php?p=17521#p17521
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6343 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-24 19:14:35 +00:00
orbiter
031e6eefbd some updates to dublin core, metadata browsing, file indexing and parser stability
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6342 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-24 12:54:45 +00:00
hermens
62a7341c4d Fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2204
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6341 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-24 11:38:15 +00:00