yacy_search_server/source/de/anomic/kelondro
orbiter 100247bdda added an export and a delete feature to URLAnalysis. This completes the clean-up feature for URLs. To do a complete clean-up of the URL database, run the following:
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -incollection DATA/INDEX/freeworld/TEXT/RICOLLECTION used.dump
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -diffurlcol DATA/INDEX/freeworld/TEXT used.dump diffurlcol.dump
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -export DATA/INDEX/freeworld/TEXT xml urls.xml diffurlcol.dump
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -delete DATA/INDEX/freeworld/TEXT diffurlcol.dump

The export feature is optional; its purpose is to provide a back-up of the URLs that are about to be deleted. The export function can also create HTML files with embedded links and simple text files: simply replace the word 'xml' with 'html' or 'text'. The last argument in the call, the diffurlcol.dump value, can also be omitted, in which case the complete URL database is exported. This is an alternative to the web-interface-based export function.
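
The export variants described above can be sketched as a small shell helper. It only prints the command lines and does not run them. The helper name export_cmd and the output file names urls.html, urls.txt and all-urls.xml are assumptions chosen for illustration; only urls.xml appears in the original commands.

```shell
#!/bin/sh
set -eu

INDEX_DIR="DATA/INDEX/freeworld/TEXT"

# export_cmd FORMAT OUTFILE [DUMPFILE]
# Prints (does not execute) the URLAnalysis export call for one format.
# Hypothetical helper; it is not part of YaCy.
export_cmd() {
    format="$1"
    outfile="$2"
    shift 2
    # "$@" is either the dump file or empty (export the complete URL database).
    echo java -Xmx1000m -cp classes de.anomic.data.URLAnalysis \
        -export "$INDEX_DIR" "$format" "$outfile" "$@"
}

export_cmd xml  urls.xml  diffurlcol.dump   # back-up of the URLs to be deleted
export_cmd html urls.html diffurlcol.dump   # HTML file with embedded links
export_cmd text urls.txt  diffurlcol.dump   # simple text file
export_cmd xml  all-urls.xml                # no dump file: complete URL database
```

Copy a printed line into the terminal to run that particular export.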

The delete feature is the only destructive method of the four presented here. Please use it with care. It is advisable to make a back-up of the URL database files before starting the deletion.
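
As a minimal sketch, the four steps plus the recommended back-up could be wrapped in one script. The assumptions here are: the back-up is a plain copy of the whole TEXT directory (the commit message does not say which files make up the URL database), and the names BACKUP_DIR, DRY_RUN, run, analysis and cleanup_urls are hypothetical. By default the script only prints the commands it would run.

```shell
#!/bin/sh
set -eu

INDEX_DIR="${INDEX_DIR:-DATA/INDEX/freeworld/TEXT}"
BACKUP_DIR="${BACKUP_DIR:-url-db-backup}"   # assumed back-up location
DRY_RUN="${DRY_RUN:-1}"                     # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Wrapper around the URLAnalysis command line from the commit message.
analysis() {
    run java -Xmx1000m -cp classes de.anomic.data.URLAnalysis "$@"
}

cleanup_urls() {
    analysis -incollection "$INDEX_DIR/RICOLLECTION" used.dump
    analysis -diffurlcol "$INDEX_DIR" used.dump diffurlcol.dump
    analysis -export "$INDEX_DIR" xml urls.xml diffurlcol.dump
    # Assumption: copying the whole directory is a sufficient back-up.
    run cp -R "$INDEX_DIR" "$BACKUP_DIR"
    # The only destructive step; set -e stops the script before this
    # line if any earlier command fails.
    analysis -delete "$INDEX_DIR" diffurlcol.dump
}

cleanup_urls
```

Run it once as is to review the printed commands, then set DRY_RUN=0 to execute them.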


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5694 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-10 20:52:10 +00:00
..
blob - refactoring of IntegerHandleIndex and LongHandleIndex (better method names) 2009-03-08 21:37:17 +00:00
index added next tool for url analysis: check for references, that occur in the URL-DB but not in the RICOLLECTIONS 2009-03-10 13:38:40 +00:00
io disabled the BufferedIOChunks, because I consider it as broken. 2009-02-11 15:21:48 +00:00
order code-beautification (to be consistent with external documentation paper) 2009-03-09 10:24:15 +00:00
table better logging and startup behaviour for referenceHash computation 2009-03-09 22:32:04 +00:00
text added an export and a delete feature to URLAnalysis. This completes the clean-up feature for URLs. To do a complete clean-up of the URL database, run the following: 2009-03-10 20:52:10 +00:00
util - refactoring of IntegerHandleIndex and LongHandleIndex (better method names) 2009-03-08 21:37:17 +00:00