Commit Graph

4325 Commits

Author SHA1 Message Date
apfelmaennchen
6dc319fc32 UTF-8
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-24 21:00:03 +00:00
apfelmaennchen
3afdcd0d59 fixed problem with utf-8
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4388 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-24 20:53:26 +00:00
apfelmaennchen
13668830b7 fixed problems with utf-8
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4387 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-24 20:53:09 +00:00
apfelmaennchen
34e5422675 adjusted code for bookmarksDB.getFolderList()
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4386 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-24 20:15:45 +00:00
apfelmaennchen
aa53a46937 adjusted code for getFolderList() and cleanTagsString()
added input for folders in add/edit bookmark

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4385 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-24 20:14:31 +00:00
apfelmaennchen
eebc688f37 moved Login link to submenu
added Folders to add/edit bookmark form

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4384 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-24 20:13:08 +00:00
apfelmaennchen
f3a9e9c542 added getFolderList() to bookmarksDB
added cleanTagsString() to bookmarksDB
added getFoldersString() to Bookmark
modified getTagsString() to exclude folderTags

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4383 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-24 20:11:57 +00:00
orbiter
db25425893 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4382 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 23:08:32 +00:00
apfelmaennchen
6f9f821481 added XBEL Export for YaCy Bookmarks. Tags are stored as
<metadata owner="Mozilla" ShortcutURL="tag1,tag2"/>

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4381 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 22:19:23 +00:00
orbiter
9e7cd4fdbb more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4380 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 21:23:17 +00:00
orbiter
4e70dff8cf more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 21:09:56 +00:00
orbiter
6dc679785f - fixed bad sort behavior of kelondroRowSet, in this case: no sort at all!
see http://forum.yacy-websuche.de/viewtopic.php?p=4841#p4841
- some memory calculation enhancements in kelondroFlex and a little bit more logging

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4378 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 20:18:36 +00:00
orbiter
0b4205eb5a - fix double-deletion in eco tables
- changed behaviour of sort moment (not during a get)
- added some asserts in snippet cache for debugging

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4375 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 11:13:39 +00:00
low012
41a3ff8ccc *) removed unused imports
*) some generics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4374 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 00:10:15 +00:00
low012
c0fbab9cca *) leading, trailing and double commas are removed since they are unnecessary
*) trailing and double slashes in paths are removed; they are not only ugly, but also caused infinite loops

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 23:59:15 +00:00
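
The comma and slash cleanup described in commit c0fbab9cca can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not taken from the YaCy source; the log does not show the actual implementation, so this only demonstrates the normalisation rules the message names.

```java
// Illustrative sketch only: names are hypothetical, not YaCy's real API.
final class BookmarkCleanupSketch {

    // Collapse runs of commas, then strip one leading and one trailing comma:
    // ",a,,b," becomes "a,b".
    static String cleanTags(String tags) {
        return tags.replaceAll(",{2,}", ",").replaceAll("^,|,$", "");
    }

    // Collapse runs of slashes and drop a trailing slash (the root "/" is kept):
    // "/a//b/" becomes "/a/b".
    static String cleanPath(String path) {
        String collapsed = path.replaceAll("/{2,}", "/");
        if (collapsed.length() > 1 && collapsed.endsWith("/")) {
            collapsed = collapsed.substring(0, collapsed.length() - 1);
        }
        return collapsed;
    }
}
```

Normalising paths this way gives every folder exactly one spelling, which is one plausible reason the change removed the infinite loops mentioned in the message.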
low012
089faf1a00 *) added login link at bottom of page
*) empty tags will not be displayed any longer

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4372 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 23:14:57 +00:00
orbiter
4ce6fab428 added special handling for doubles in eco tables after initialization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4370 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 21:40:25 +00:00
orbiter
002a109c4d patch for http://forum.yacy-websuche.de/viewtopic.php?p=4597#p4597
(urls that have no protocol but start with www will be treated as http://www...)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4369 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 20:49:26 +00:00
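
The rule in commit 002a109c4d (a URL without a protocol that starts with www is treated as http://www...) could look roughly like the sketch below. The helper name is hypothetical, and matching on the prefix "www." (with the dot) is an assumption; the log does not say how YaCy detects the prefix.

```java
// Illustrative sketch only: the helper name and the "www." test are assumptions.
final class UrlSchemeSketch {

    // Prepend "http://" when the input has no scheme separator and starts with "www.".
    static String withDefaultScheme(String url) {
        String trimmed = url.trim();
        boolean hasScheme = trimmed.contains("://");
        boolean startsWithWww =
                trimmed.toLowerCase(java.util.Locale.ROOT).startsWith("www.");
        return (!hasScheme && startsWithWww) ? "http://" + trimmed : trimmed;
    }
}
```

Inputs that already carry a scheme, or that do not start with www, are returned unchanged apart from trimming.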
orbiter
634430c48a - more logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4368 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 20:44:12 +00:00
apfelmaennchen
d288987a93 replaced isEmpty() with equals("")
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4367 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 20:19:34 +00:00
orbiter
d372a78aef some fixes to bring back lulabads peer..
see also: http://forum.yacy-websuche.de/viewtopic.php?p=4772#p4772

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4366 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 20:02:20 +00:00
low012
f4799c2334 *) removed since I decided to turn this into a project of its own using Perl to gather n-gram data which YaCy will be able to use
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4365 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:59:21 +00:00
apfelmaennchen
a870ac32b8 reorganized code and added folders; this update might have overwritten latest changes by orbiter - sorry!
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4364 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:39:15 +00:00
apfelmaennchen
62a5df5bfc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4363 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:32:52 +00:00
apfelmaennchen
7aa94f17da
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4362 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:32:42 +00:00
apfelmaennchen
373fface89
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4361 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:32:30 +00:00
apfelmaennchen
f319bbe4c8 adjustments for folder tree
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4360 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:28:55 +00:00
orbiter
6eb8321cb0 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=752&p=4815#p4815
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4359 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:19:56 +00:00
orbiter
4ffbcd54a4 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=754
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4358 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:10:03 +00:00
apfelmaennchen
e81bced2bd reorganized the code and adjusted getTagIterator() to suit folders
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4357 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:08:32 +00:00
apfelmaennchen
4c631d912e added folder view; limited tag view to top25
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4356 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:06:21 +00:00
apfelmaennchen
e68b133b35 added JavaScript for folder tree view
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4355 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:05:36 +00:00
orbiter
85dc62c16f refactoring: more dublin core - compliant naming
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4354 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 19:03:47 +00:00
orbiter
efd0b8371a - added parsing of Dublin Core - compliant metadata (see RFC 5013 and ISO 15836) to html parser
- refactoring of plasmaParserDocument to use Dublin Core - compatible property names
- redesign of url handling in parser and condenser (less String-to-yacyURL conversion)
- more generics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4352 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-22 11:51:43 +00:00
low012
cfd4fecd12 *) blanks in paths for restart and update script are replaced by backslash+blank now (see http://forum.yacy-websuche.de/viewtopic.php?t=745)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4351 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-21 18:04:08 +00:00
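
The escaping described in commit cfd4fecd12 (each blank in a path becomes backslash+blank) amounts to a single literal replacement. The sketch below uses a hypothetical helper name and shows only that replacement; it does not handle other shell metacharacters, and the log does not say whether YaCy does.

```java
// Illustrative sketch only: the helper name is hypothetical.
final class ShellPathSketch {

    // Replace every blank with backslash+blank so a POSIX shell
    // reads the path as a single word.
    static String escapeBlanks(String path) {
        return path.replace(" ", "\\ ");
    }
}
```

For a path such as "/opt/my app/run.sh" (a made-up example), the result contains "my\ app" and can be pasted into a generated shell script without quoting.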
orbiter
f945ee21d2 some security additions, keep maximum byte[] size to 2^27
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4350 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-20 23:46:27 +00:00
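
A size guard of the kind named in commit f945ee21d2 (maximum byte[] size of 2^27) might be sketched as follows. The class and method names are hypothetical, and treating 2^27 as an inclusive upper bound is an assumption; the log gives only the number.

```java
// Illustrative sketch only: names and the inclusive bound are assumptions.
final class BufferLimitSketch {

    // 2^27 bytes = 134,217,728 bytes (128 MiB).
    static final int MAX_ARRAY_SIZE = 1 << 27;

    // Allocate a buffer only if the requested size is within the limit.
    static byte[] allocate(long requested) {
        if (requested < 0 || requested > MAX_ARRAY_SIZE) {
            throw new IllegalArgumentException(
                    "refusing to allocate " + requested + " bytes");
        }
        return new byte[(int) requested];
    }
}
```

Checking the size before allocation turns a corrupt or hostile length field into a controlled exception instead of an OutOfMemoryError.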
orbiter
2f3b2f3481 - extended dbtest for comparison tests
- added initial space option for eco tables
- used initial space value in initialization of collectionIndex, this should avoid OOM failures
- added index consistency check (checks for double-occurrences of primary keys in file)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4349 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-20 21:42:35 +00:00
orbiter
9eb746863d interface enhancements for eco records memory statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4348 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-20 01:51:02 +00:00
orbiter
9abc927645 to fix inconsistencies in collection index, a double reference reporting mechanism has been implemented
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4347 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-20 01:22:46 +00:00
orbiter
58a1f518f8 fixed some problems with eco tables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4346 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-19 12:23:56 +00:00
orbiter
d4d07802ac better RAM protection using eco tables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-19 01:50:24 +00:00
orbiter
f6cfb97b7f added a test servlet (to be used to analyse the remote crawl xml bug)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4344 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-19 01:07:52 +00:00
orbiter
f4e9ff6ce9 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4343 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-19 00:40:19 +00:00
orbiter
cbefc651ac more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4342 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-18 18:43:56 +00:00
orbiter
45339c3db5 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4341 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-18 17:14:02 +00:00
orbiter
94f21d9403 activated new kelondroEcoTable file structure.
This data structure replaces almost all files in the PLASMA directory.
The collection.index and the LURL-db will also be created as Eco-DB if they do not already exist.
Existing Flex-databases will be used as they are (no data is lost).
If you want to force the creation of an Eco-collection.index, simply delete the old index.
The Eco file system will only be used if there is enough memory.
The collection.index RAM limit is 200MB; if you have less, a flex-Table is created.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4340 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-17 21:48:08 +00:00
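
The memory-based fallback described in commit 94f21d9403 can be pictured as a simple threshold decision. Everything in the sketch below is hypothetical: the names, the reading of "200MB" as 200 * 1024 * 1024 bytes, and the way available heap is estimated are assumptions, since the log does not show how YaCy measures memory.

```java
// Illustrative sketch only: names, threshold interpretation and the heap
// estimate are assumptions, not YaCy's actual logic.
final class TableChoiceSketch {

    enum TableKind { ECO, FLEX }

    // Assumed reading of the "200MB" limit from the commit message.
    static final long ECO_MIN_BYTES = 200L * 1024 * 1024;

    // Pick the eco table when enough memory is available, else fall back to flex.
    static TableKind choose(long availableBytes) {
        return availableBytes >= ECO_MIN_BYTES ? TableKind.ECO : TableKind.FLEX;
    }

    // One common way to estimate the heap the JVM could still use.
    static long availableHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.maxMemory() - rt.totalMemory() + rt.freeMemory();
    }
}
```

A caller would write choose(availableHeap()) when creating a new index; per the commit message, an index that already exists as a Flex database is kept as it is, which this sketch does not model.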
low012
739f35d389 *) added previous/next links to blog (in case blog has more entries than get displayed on one page)
The blog still has a major problem: entries are displayed in random(?) order if there are several entries in the blog

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4339 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-17 21:01:30 +00:00
orbiter
a0f7f2faad some more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4338 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-17 18:43:01 +00:00
orbiter
dc26d6262b - removed write buffer from kelondroCache (was never used because buggy; will now be replaced by new EcoBuffer)
- added new data structure 'eco' for an index file that should use only 50% of write-IO compared to kelondroFlex
The new eco index is not used yet, but already successfully tested with the collectionIndex
The main purpose is to replace the kelondroFlex at every point when enough RAM is available.
Otherwise, the kelondroFlex stays as an option in case of low memory (which then can even use a file-index)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4337 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-17 12:12:52 +00:00
orbiter
dbdec0f4d3 another fix for the "too many processes in loader queue, dismissed" - problem:
this was probably caused by http-forward cases; which are cases when urls from the loader queue change
and it was not possible to remove the old urls from the queue because they had been based on url hashes.
The queue is now again stored using the entry.hashCode, which does not change even if the url changes.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4332 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-13 23:10:09 +00:00
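
The reasoning in commit dbdec0f4d3 (keying the loader queue by a value that survives an HTTP forward) can be shown with a small sketch. The classes below are hypothetical stand-ins, not YaCy's loader queue; the sketch assumes the entry does not override hashCode(), so the key stays the same for the lifetime of the object.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: hypothetical stand-ins for the loader queue.
final class LoaderQueueSketch {

    static final class Entry {
        String url; // may be rewritten when the server answers with a forward
        Entry(String url) { this.url = url; }
        // hashCode() is inherited from Object and does not depend on url.
    }

    // Keyed by the entry's own hash code rather than by a hash of its URL.
    private final Map<Integer, Entry> queue = new HashMap<>();

    void add(Entry e) { queue.put(e.hashCode(), e); }

    // Still finds the entry after its url field has changed.
    boolean remove(Entry e) { return queue.remove(e.hashCode()) != null; }

    int size() { return queue.size(); }
}
```

Had the map been keyed by a hash of the URL, a forward would change the key and the stale entry could never be removed, which matches the "too many processes in loader queue" symptom. Note that Object.hashCode() is not guaranteed to be unique, so a production queue would need to handle collisions.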