yacy_search_server/source/de/anomic/kelondro/text
orbiter 100247bdda Also added an export and a delete feature to URLAnalysis. This completes the clean-up feature for URLs. To do a complete clean-up of the URL database, run the following commands in order:
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -incollection DATA/INDEX/freeworld/TEXT/RICOLLECTION used.dump
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -diffurlcol DATA/INDEX/freeworld/TEXT used.dump diffurlcol.dump
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -export DATA/INDEX/freeworld/TEXT xml urls.xml diffurlcol.dump
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -delete DATA/INDEX/freeworld/TEXT diffurlcol.dump

The export feature is optional; its purpose is to provide a back-up of the URLs that are about to be deleted. The export function can also create HTML files with embedded links, or simple text files: replace the word 'xml' with 'html' or 'text'. The last argument of the call, the diffurlcol.dump value, can also be omitted, in which case the complete URL database is exported. This is an alternative to the export function of the web interface.
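
The export variants described above could be wrapped in a small helper, sketched here. The function name urlanalysis_export, the DRY_RUN switch, the INDEX_DIR variable and the output file names urls.html and urls.txt are assumptions made for illustration; only the java command line itself is taken from this commit message.

```shell
#!/bin/sh
# Sketch only: helper around the -export call shown in the commit message.
# INDEX_DIR, DRY_RUN and the function name are illustrative assumptions.
INDEX_DIR="${INDEX_DIR:-DATA/INDEX/freeworld/TEXT}"

# usage: urlanalysis_export FORMAT OUTFILE [DUMPFILE]
# FORMAT is one of xml, html, text. Without DUMPFILE the complete
# URL database is exported.
urlanalysis_export() {
    ue_format="$1"
    ue_outfile="$2"
    ue_dump="${3:-}"
    case "$ue_format" in
        xml|html|text) ;;
        *) echo "unsupported format: $ue_format" >&2; return 1 ;;
    esac
    # Rebuild the positional parameters as the command line to run.
    set -- java -Xmx1000m -cp classes de.anomic.data.URLAnalysis \
        -export "$INDEX_DIR" "$ue_format" "$ue_outfile"
    if [ -n "$ue_dump" ]; then
        set -- "$@" "$ue_dump"
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Dry run: print the command instead of executing it.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

With DRY_RUN=1 set, `urlanalysis_export html urls.html diffurlcol.dump` prints the command for an HTML export of only the URLs listed in the dump, and `urlanalysis_export text urls.txt` prints the command for a plain-text export of the whole database.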

The delete feature is the only destructive method of the four presented here. Please use it with care. It is advisable to make a back-up of the URL database files before starting the deletion.
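
A minimal sketch of the whole sequence with a back-up step in front of it might look as follows. Copying the entire TEXT directory is an assumption (the message does not say which files make up the URL database), and the names BACKUP_DIR, DRY_RUN, run, urlanalysis and cleanup_urls are hypothetical; the four URLAnalysis calls are the ones listed above.

```shell
#!/bin/sh
# Sketch only: back up the index directory, then run the four steps.
# BACKUP_DIR, DRY_RUN and all function names are illustrative assumptions.
INDEX_DIR="${INDEX_DIR:-DATA/INDEX/freeworld/TEXT}"
BACKUP_DIR="${BACKUP_DIR:-${INDEX_DIR}.backup}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

urlanalysis() {
    run java -Xmx1000m -cp classes de.anomic.data.URLAnalysis "$@"
}

cleanup_urls() {
    # Assumption: a copy of the whole TEXT directory serves as the back-up.
    run cp -R "$INDEX_DIR" "$BACKUP_DIR" || return 1
    urlanalysis -incollection "$INDEX_DIR/RICOLLECTION" used.dump || return 1
    urlanalysis -diffurlcol "$INDEX_DIR" used.dump diffurlcol.dump || return 1
    # Optional export of the URLs that are about to be deleted.
    urlanalysis -export "$INDEX_DIR" xml urls.xml diffurlcol.dump || return 1
    # Destructive step: only reached if every earlier step succeeded.
    urlanalysis -delete "$INDEX_DIR" diffurlcol.dump
}
```

Calling `cleanup_urls` with DRY_RUN=1 set only prints the five commands, which allows the paths to be checked before anything is deleted. Each step stops the sequence on failure, so the delete call is never reached after a failed back-up or export.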


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5694 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-10 20:52:10 +00:00
..
AbstractBlacklist.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
Blacklist.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
DefaultBlacklist.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
Document.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
Index.java
IndexCache.java fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1915&hilit=&p=13249#p13249 2009-03-09 10:14:49 +00:00
IndexCell.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
IndexCollection.java better logging and startup behaviour for referenceHash computation 2009-03-09 22:32:04 +00:00
IndexReader.java
MetadataRepository.java added also an export and delete-feature to the URLAnalysis. This completes the clean-up feature for URLs. To do a complete clean-up of the url database, start the following: 2009-03-10 20:52:10 +00:00
MetadataRowContainer.java simplification of (internal) query process / refactoring 2009-03-06 15:53:20 +00:00
Phrase.java
Reference.java
ReferenceContainer.java fixed merge method initialization in ReferenceContainer 2009-03-07 10:45:14 +00:00
ReferenceContainerArray.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
ReferenceContainerCache.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
ReferenceContainerOrder.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
ReferenceOrder.java
ReferenceRow.java
ReferenceVars.java
URLMetadata.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
Word.java