Commit Graph

8 Commits

Author SHA1 Message Date
orbiter
6c7e83909b - refactoring of data access methods to be prepared for new cell data structure
- removed a memory overhead in collections which prevent OOM Exception in low memory configurations

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5443 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 09:38:08 +00:00
orbiter
e1acdb952c fix for problem with userDB and bookmarksDB which was caused by changes in kelondroRA in SVN 5376
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5385 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-08 00:17:45 +00:00
orbiter
9b0c4b1063 redesign of parts of the new BLOB buffer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5287 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-19 22:30:44 +00:00
orbiter
9663e61449 added another class to handle BLOB writings to the new HTCACHE data storage:
- entries are buffered and written as stream with many entries at once (saves many IO accesses)
- entries are compressed with gzip: increases capacity of cache
- concurrency for stream-writing and compression: all writings to the cache are non-blocking

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5284 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-18 08:57:48 +00:00
orbiter
826ca79735 refactoring and new architecture to store the files of the web cache:
- files are not stored any more as individual files
- a new database structure using BLOBHeap files stores many cache entries in common files
- all file-writing procedures had been migrated to generate byte[] objects which are written with the new database methods

this is only an intermediate step to the final architecture, where cached files are written together with their metadata in one single database structure.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-16 21:24:09 +00:00
orbiter
1e6d12f146 Major update to BLOB data structures:
- introduced a new BLOB file format: kelondroBLOBHeap. This is a flat file with an index in RAM.
  very similar to the eco-tables, but with flexible value sizes. It will replace the kelondroBLOBTree,
  which is based on a kelondroTree, a file-AVL-based index data structure.
- the HTCACHE header file was replaced by the new blob heap file structure
- the robots.txt file was replaced by the new blob heap file structure
- the robots parser was enhanced (bugfixing for double-loading of the same robots.txt)
- other BLOB-dependent data structures were prepared to use also the new BLOB heap
- fixed a bug in the snippet fetch process: the file header was not written to the header index
There should now be less IO during snippet fetch and during crawling


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4978 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-10 00:47:37 +00:00
orbiter
81f75f5056 - removed unnecessary classes (these objects are much easier to handle using generics)
- generalized BLOB referencing. This is the preparation to use another BLOB class, the kelondroHeap


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4977 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-07 23:52:53 +00:00
orbiter
3330181aa0 refactoring:
find a better way to store BLOBs; generalize current BLOG data structure (kelondroDyn)
and prepare it to replace it with something better. The best candidate is the kelondroHeap,
which will become the kelondroBLOBHeap;
removed also some never-used classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4902 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-07 23:12:24 +00:00