Commit Graph

28 Commits

Author SHA1 Message Date
orbiter
4779f314fe first version of next-generation search interface:
- snippets are not fetched by browser using ajax, they are now fetched internally
- YaCy-internat threads control existence of snippets and sort out bad results
- search results are prepared using SSI includes
- the search result page is visible right after the search request, the results drop in when they are detected
- no more time-out strategy during search processes, results are shifted within queues when they arrive from remote peers
- added result page switching! after the first 10 results, the next page can be retrieved
- number of remote results is updated online on the result page as they drop in
- removed old snippet servelet (which had been also a security leak btw)
- media search is broken now, will be redesigned and fixed in another step


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-03 23:43:55 +00:00
orbiter
a34d9b8609 * added a search history cache that maintains search results for 10 minutes
it is necessary for the new search process that will do automatic re-searches
a positive effect is, that when a re-search is done it can be monitored how many
results had been contributed from other peers. The message for this contribution
was moved from the end of the result page to the top.
* enhanced re-search time when a global search was done an the local index has
already a great number of results for this word
* re-organised presearch computation; must be further enhanced

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4059 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-24 23:12:59 +00:00
orbiter
72752bb503 because of a new database structure handling, the memory need for accessing
collection objects has been reduced to 50%:
- set new memory calculation functions for indexing process
- adjusted guessed memory amount
-> Testing needed:
   try new recommended value (see performanceQueues) and see if OOMs occur.
-> report maximum recommended value, so we can set new default values.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4053 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-20 17:36:43 +00:00
orbiter
9628db6cdc enhanced memory allocation during database access:
- refactoring of kelondroRecords; this class is now divided into
  kelondroAbstractRecords, kelondroRecords, kelondroCachedRecords, kelondroHandle and kelondroNode
- better abstraction of kelondroNodes, such nodes may now be crated by different classes
- a new Node defining class kelondroEcoRecords defines Nodes that do not need so much allocation and System.arraycopy
- there is less memory transfer on the bus, especially for collection index
- now half of memory needed for web index access


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4024 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-03 11:44:58 +00:00
orbiter
ba2c307ab3 optimized memory allocation in kelondroRow.Entry
such an entry cannot be instantiated without allocation of new byte[]; instead
it can re-use memory from other kelondroRow.Entry objects.
during bugfixing also other bugs may have been solved, maybe the INCONSISTENCY problem
could have been solved. One cause can be missing synchronization during bulk storage
when a R/W-path optimization is done. To test this case, the optimization is currently
switched off.
More memory enhancements can be done after this initial change to the allocation scheme.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3536 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-03 12:10:12 +00:00
orbiter
2cb16824e3 removed support for old database structures.
The new collection index will be more generalized to support other indexes
i.e. YBR block-rank computation. A clean-up of the many conditions to support
the old database was necessary.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3506 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-21 15:35:35 +00:00
orbiter
5bbf010107 removed synchronization of size() method from numerous classes to avoid thread locking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3490 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-18 19:45:23 +00:00
orbiter
38b93f8cb8 bugfix for my last commit:
iterator did not consider secondary start point in case of rotation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3456 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-08 22:07:17 +00:00
orbiter
d755a8026d - better OOM protection
- better memory allocation for FlexTable indexes
- splitting between static index and dynamic index (only the dynamic part must grow)
- to enable a merge-iteration of new splittet index, a huge number of classes needed to be adopted for new iterator classes
- added new iterator classes that support cloneable iterators
- adopted all iterator classes to implement cloneable itarators

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-08 16:15:40 +00:00
orbiter
1cba31de43 redesigned ram organization for database caches
- each cache can now allocate as much memory as is available
- no more fixed limits
- replaced old performance memory monitor by new one
- added supervision methods as static functions into the classes that provide cache functionality
- steering of ram allocation is done with two simple limits that are ram availability-relative


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3434 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-06 22:43:32 +00:00
orbiter
b466baa574 added some memory protection
too large collection arrays are now avoided. By default, the biggest
collection index is 7. larger collections are dumped into a commons
directory, but cannot yet be used. Bevore doing a dump, the collection
is splittet into a part which has only root-references, and stored back
to the collection; the remaining part goes to commons

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3426 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-03 00:55:51 +00:00
orbiter
51e12049fa third generation of R/W head path optimization
- data from collection arrays are read in order
- merged data is written in order

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-28 11:13:23 +00:00
orbiter
dc0c06e43d PLEASE MAKE A BACK-UP OF YOUR COMPLETE DATA DIRECTORY BEFORE USING THIS
redesign for better IO performance
enhanced database seek-time by avoiding write operations at distant
positions of a database file. until now, a USEDC counter was written
at the head-section of a kelondroRecords database file (which is the
basic data structure of all kelondro database files) to store the
actual number of records that are contained in the database. Now, this
value is computed from the database file size. This is either done
only once at start-time, or continuously when run in asserts enabled.
The counter is then updated only in RAM, and written at close of the
file. If the close fails, the correct number can be computed from the
file size, and if this is not equal to the stored number it is a strong
evidence that YaCY was not shut down properly.
To preserve consistency, the complete storage-routine had to be re-written.
Another change enhances read of nodes in some cases, where the data-tail
can be read together with the data-head. This saves another IO lookup during
each DB node fetch.
Includes also many small bugfixes.
IF ANYTHING GOES WRONG, ALL YOUR DATA IS LOST: PLEASE MAKE A BACK-UP

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3375 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 08:35:51 +00:00
orbiter
1f1f398bfa enhanced speed of RAM cache flush by factor 20 (twenty times faster)
- the speed was doubled by avoiding read access during the dump
- the speed was dramatically increased at least by factor 10
   by using a temporary ram-file where the structures are flushed to
   before it is dumped then as a whole byte-chunk to the file system.
The speed enhancements also affects some other parts of the database.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3353 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-08 23:21:46 +00:00
orbiter
daf2e15f59 some storage process enhancements (write without preceding read)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3348 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-07 23:23:24 +00:00
orbiter
9c2101a852 small enhancement to cache dump
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3346 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-07 00:02:54 +00:00
(no author)
c67d22116e added exists-check based only on RAM index lookup:
- faster double-check during crawling
- less IO

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3179 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-08 13:13:30 +00:00
orbiter
c48374d14a new memory limit computation for indexing queue
shall better prevent outofmemory errors

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3118 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-22 12:54:56 +00:00
orbiter
937ccd4e76 fix for snippet-generation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3060 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-09 02:13:43 +00:00
orbiter
10d888e70c - added a media search for images, audio, video and applications
- new search options on search page
- new option in ViewInfo to display all links of a file
- enhanced collection data structure

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3054 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-07 02:40:57 +00:00
orbiter
9a85f5abc3 cleanup
- removed 'deleteComplete' flag; this was used especially for WORDS indexes
- shifted methods from plasmaSwitchboard to plasmaWordIndex

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3051 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-06 12:51:46 +00:00
orbiter
773ba1e91a - generalized object order handling
- controlled object order for all database tables
- migrated DHT position computation to correct base64-decoded values
  this also closed the 'gaps' in the dht positions

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3049 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-06 03:02:57 +00:00
orbiter
109ed0a0bb - cleaned up code; removed methods to write the old data structures
- added an assortment importer. the old database structures can
  be imported with
  java -classpath classes yacy -migrateassortments
- modified wordmigration. The indexes from WORDS are now imported
  to the collection database. The call is
  java -classpath classes yacy -migratewords
  (as it was)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3044 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-05 02:47:51 +00:00
orbiter
052f28312a removed assortments from indexing data structures
removed options to switch on assortments

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3041 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-02 19:34:59 +00:00
orbiter
e3d75f42bd final version of collection entry type definition
- the test phase of the new collection data structure is finished
- test data that had been generated is void. There will be no migration
- the new collection files are located in DATA/INDEX/PUBLIC/TEXT/RICOLLECTION
- the index dump is void. There will be no migration
- the new index dump is in DATA/INDEX/PUBLIC/TEXT/RICACHE

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2983 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-19 20:05:25 +00:00
orbiter
bb7d4b5d5e refactoring to prepare new RWI entry object
- moved all url and index(RWI) entries to index package
- better naming to distinguish RWI entries and URL entries


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2937 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-08 16:17:47 +00:00
orbiter
ba967c4875 - bugfixes and debug code
- ne generalized index class indexCachedRI

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2930 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-07 01:09:02 +00:00
orbiter
ee4715a21c - more asserts
- bugfix for performaceMemory
- refactoring of index ram cache: renamed indexRAMCacheRI to indexRAMRI, to make space for a cached indexRI, which should be named indexRAMCacheRI

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2925 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-06 10:15:05 +00:00