Commit Graph

25 Commits

Author SHA1 Message Date
orbiter
3d4b826ca5 migration of all databases that use the deprecated BLOBTree format into the BLOBHeap format. Old databases are migrated automatically.
This removes the last very IO-intensive data structures which were still used for Wiki, Blog and Bookmarks. Old database files will still remain in the DATA subdirectory but can be deleted manually if no major bugs appear during migration. There is no need for any user action, all migration is done automatically.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5986 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-27 15:04:04 +00:00
orbiter
26a46b5521 increased default maximum file size for database files to 2GB
Other file sizes can now be configured with the attributes
filesize.max.win and filesize.max.other
the default maximum file size for non-windows OS is now 32GB

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5974 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-25 06:59:21 +00:00
orbiter
21fbca0410 better scaling of HEAP dump writer for small memory configurations;
should prevent OOMs during cache dumps

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5920 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-04 08:29:44 +00:00
orbiter
6e0b57284d better care for states of the IODispatcher
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5919 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-03 22:54:47 +00:00
orbiter
c8624903c6 full redesign of index access data model:
terms (words) are not any more retrieved by their word hash string, but by a byte[] containing the word hash.
this has strong advantages when RWIs are sorted in the ReferenceContainer Cache and compared with the sun.java TreeMap method, which needed getBytes() and new String() transformations before.
Many thousands of such conversions are now omitted every second, which increases the indexing speed by a factor of two.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5812 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-16 15:29:00 +00:00
orbiter
89ec3acb3e - full abstraction of index content type: the kelondro full text index may now also contain indexes about other content than text, i.e. navigation indexes or reverse linking indexes.
- during index joins all word positions are maintained: better ranking for word distance possible; exact phrase match can be implemented soundly


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5804 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-15 06:34:27 +00:00
orbiter
b887f4a116 keep more free mem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5778 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-03 14:27:04 +00:00
orbiter
c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
This is a preparation to introduce other index tables as used now only for reverse text indexes. Next application of the reverse index is a citation index.
Moved to version 0.74

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5777 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-03 13:23:45 +00:00
orbiter
ab656687d7 more strict BLOB initialization .. may also help to save some ram
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5776 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-03 12:42:24 +00:00
orbiter
f21a8c9e9c a different naming scheme for BLOBArray files. This may be necessary if blobs are written more often than once in a second.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5771 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-02 15:08:56 +00:00
orbiter
7ba078daa1 - added fast site-operator
- refactoring merge into BLOBArray

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5770 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-02 13:26:47 +00:00
orbiter
0139988c04 - added writing of temporary file names and renaming to final file name when index dump/merge are done. Interrupted merges can be cleaned up.
- added clean-up of unfinished merges and unused idx/gap files
- enhanced merge file selection method

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5764 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-01 12:39:11 +00:00
orbiter
3621aa96ab - added a memory protection for the IndexCell migration
- fix for bad cell file selection

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5763 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-31 19:17:45 +00:00
orbiter
568e8f1741 fix in unmountBLOB
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5762 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-31 17:03:13 +00:00
orbiter
9da69d6b68 - better selection of files to be merged
- fix for getChannel().close(), which works on windows but not on macs and linux

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5761 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-31 16:49:02 +00:00
orbiter
d2e2420a68 - added another file selection method for index cell merge
- more hacks to check that files are closed propertly and filehandles do not exist after files are closed.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5757 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-30 19:05:08 +00:00
orbiter
96eaecda3e - added migration class to go from index collections to the index cell data structure.
- added better control over file deletion, because this sometimes fails, especially on windows

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5756 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-30 15:31:25 +00:00
orbiter
fa07234d4e fix for clear method: now deletes files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5752 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-29 21:28:14 +00:00
orbiter
37f892b988 added new concurrent merger class for IndexCell RWI data
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5735 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-20 14:54:37 +00:00
orbiter
b3f75e48fa - enhanced balancer: auto-solving of waiting-deadlocks
- removed deprecated cache-init size value
- more debug lines for IndexCell cache dump merge

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5728 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-18 20:21:19 +00:00
orbiter
9a90ea05e0 added a merge operation for IndexCell data structures
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5727 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-18 16:14:31 +00:00
orbiter
44874cb550 added a deleteOnExit for blob file deletion in case that a deletion is not successful.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5713 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-13 22:47:31 +00:00
orbiter
efcd95dc37 simplification of (internal) query process / refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5671 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-06 15:53:20 +00:00
orbiter
83ce65707a (almost) completed partition of classes in kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:44:20 +00:00
orbiter
7ee494fde5 more refactoring of kelondro:
- seperated BLOB from table classes
- renamed 'coding' package to 'order'

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:08:08 +00:00