yacy_search_server/source/de/anomic/kelondro/order
orbiter 16baa7ad24
To translate a MediaWiki dump into the YaCy surrogate format, do the following:
- download a Wikipedia dump, e.g. dewiki-20090311-pages-articles.xml.bz2,
  from http://download.wikimedia.org/dewiki/20090311/
- move dewiki-20090311-pages-articles.xml.bz2 to DATA/HTCACHE/
- start the conversion: open a command shell, change to the YaCy home directory and execute
java -Xmx2000m -cp classes:lib/bzip2.jar de.anomic.tools.mediawikiIndex -convert DATA/HTCACHE/dewiki-20090311-pages-articles.xml.bz2 DATA/SURROGATES/in/ http://de.wikipedia.org/wiki/

This generates a series of files in DATA/SURROGATES/in.

If YaCy is running (it may run concurrently with the conversion), it fetches all new dumps from the surrogate-in directory. The export process is transaction-safe, which means YaCy will not start reading a dump until that dump has been written completely.
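
The commit message does not say how the transaction-safe hand-over is implemented. A common way to get this property is to write each dump under a temporary name and rename it only when it is complete. The following is a minimal sketch of that pattern, not YaCy's actual code: the class name AtomicDumpWriter, the ".tmp" suffix and both method names are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; illustrates the write-then-rename pattern only.
public class AtomicDumpWriter {

    // Assumed suffix that marks a dump which is still being written.
    static final String TMP_SUFFIX = ".tmp";

    // Writer side: write the content to "<name>.tmp", then rename it to "<name>".
    // The rename happens inside one directory, so the final name only ever
    // refers to a completely written file.
    public static Path writeAtomically(Path dir, String name, String content) throws IOException {
        Path tmp = dir.resolve(name + TMP_SUFFIX);
        Path done = dir.resolve(name);
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
        return Files.move(tmp, done, StandardCopyOption.ATOMIC_MOVE);
    }

    // Reader side: list only the files that do not carry the temporary suffix.
    public static List<Path> finishedDumps(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (!p.getFileName().toString().endsWith(TMP_SUFFIX)) {
                    result.add(p);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for DATA/SURROGATES/in.
        Path dir = Files.createTempDirectory("surrogates-in");
        writeAtomically(dir, "dump.0.xml", "<surrogates/>");
        System.out.println(finishedDumps(dir));
    }
}
```

With this pattern the reader can poll the directory at any time: a dump that is still being written is only visible under its temporary name and is skipped.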


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5851 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-21 22:12:19 +00:00
AbstractOrder.java
Base64Order.java To translate a mediawiki dump into the YaCy surrogate format do the following: 2009-04-21 22:12:19 +00:00
Bitfield.java
ByteOrder.java - more efficient comparator calls 2009-03-14 00:07:37 +00:00
CloneableIterator.java
CloneableMapIterator.java
Coding.java
Digest.java some fixes and performance hacks 2009-04-20 23:01:44 +00:00
MergeIterator.java added new class RowSetArray, which arranges RowSet objects like elements in a hashtable but still provides sorted enumeration. The new class is now integrated into the ObjectIndexCache, which is the core class that provides index functions to all database files. The new index access is about twice as fast as before, which speeds up all parts of YaCy considerably. 2009-03-12 23:05:18 +00:00
MicroDate.java
NaturalOrder.java - more efficient comparator calls 2009-03-14 00:07:37 +00:00
Order.java even more efficient comparator calls (less System.arraycopy for primary keys) 2009-03-14 00:41:20 +00:00
RotateIterator.java
StackIterator.java added new row iterator in kelondro table files that enumerates rows 2009-02-24 10:40:20 +00:00