yacy_search_server/source/de/anomic/plasma
orbiter 100247bdda Added an export and a delete feature to URLAnalysis. This completes the clean-up feature for URLs. To do a complete clean-up of the URL database, run the following commands:
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -incollection DATA/INDEX/freeworld/TEXT/RICOLLECTION used.dump
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -diffurlcol DATA/INDEX/freeworld/TEXT used.dump diffurlcol.dump
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -export DATA/INDEX/freeworld/TEXT xml urls.xml diffurlcol.dump
java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -delete DATA/INDEX/freeworld/TEXT diffurlcol.dump

The export feature is optional; its purpose is to provide a back-up of the URLs that are about to be deleted. The export function can also create HTML files with embedded links, or simple text files: replace the word 'xml' with 'html' or 'text'. The last argument of the command, diffurlcol.dump, can be omitted, in which case the complete URL database is exported. This is an alternative to the web-interface-based export function.

The delete feature is the only destructive method of the four presented here. Please use it with care, and make a back-up of the URL database files before starting the deletion.
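
The four steps and the back-up advice can be chained in a small wrapper script. The following is only a sketch under stated assumptions: the DRY_RUN switch, the INDEX_DIR variable, the function names and the choice to back up by copying the whole index directory to "$INDEX_DIR.backup" are illustrative and not part of URLAnalysis or of this commit. The URLAnalysis arguments themselves are copied from the commands above.

```shell
#!/bin/sh
# Sketch: run the URL clean-up steps from the commit message in order.
# Assumptions (not from the source): INDEX_DIR, DRY_RUN, the function
# names and the backup location are hypothetical helpers.

INDEX_DIR="${INDEX_DIR:-DATA/INDEX/freeworld/TEXT}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print a command, and execute it unless DRY_RUN is 1.
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Invoke URLAnalysis with the JVM options used in the commit message.
urlanalysis() {
    run java -Xmx1000m -cp classes de.anomic.data.URLAnalysis "$@"
}

cleanup_urls() {
    # The first three steps are non-destructive; stop at the first failure.
    urlanalysis -incollection "$INDEX_DIR/RICOLLECTION" used.dump || return 1
    urlanalysis -diffurlcol "$INDEX_DIR" used.dump diffurlcol.dump || return 1
    # Optional export: keeps a copy of the URLs that are about to be deleted.
    urlanalysis -export "$INDEX_DIR" xml urls.xml diffurlcol.dump || return 1
    # Assumed backup strategy: copy the whole index directory before deleting.
    run cp -R "$INDEX_DIR" "$INDEX_DIR.backup" || return 1
    # Destructive step: runs only if everything above succeeded.
    urlanalysis -delete "$INDEX_DIR" diffurlcol.dump
}

cleanup_urls
```

With the default DRY_RUN=1 the script only prints the five commands, so the sequence can be reviewed first; setting DRY_RUN=0 executes them. Placing the copy directly before the delete step means the destructive command is never reached unless the back-up succeeded.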


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5694 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-10 20:52:10 +00:00
parser - refactoring of the http client 2009-02-19 16:24:46 +00:00
LogParser.java refactoring of logging 2009-01-30 23:33:47 +00:00
plasmaCondenser.java more refactoring of indexer and kelondro classes; 2009-03-02 10:00:32 +00:00
plasmaDbImporter.java simplification of (internal) query process / refactoring 2009-03-06 15:53:20 +00:00
plasmaGrafics.java update to the server core 2009-02-10 13:26:26 +00:00
plasmaHTCache.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
plasmaParser.java - refactoring of the http client 2009-02-19 16:24:46 +00:00
plasmaParserConfig.java refactoring of logging 2009-01-30 23:33:47 +00:00
plasmaParserDocument.java more bugfixes as recommended by findbugs 2009-02-17 09:12:47 +00:00
plasmaProfiling.java better scaling on performance graph 2009-02-15 17:36:13 +00:00
plasmaRankingCRProcess.java simplification of (internal) query process / refactoring 2009-03-06 15:53:20 +00:00
plasmaRankingDistribution.java moved logging partially to kelondro 2009-01-31 01:06:56 +00:00
plasmaRankingRCIEvaluation.java moved logging partially to kelondro 2009-01-31 01:06:56 +00:00
plasmaSearchAPI.java simplification of (internal) query process / refactoring 2009-03-06 15:53:20 +00:00
plasmaSearchEvent.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
plasmaSearchImages.java simplification of (internal) query process / refactoring 2009-03-06 15:53:20 +00:00
plasmaSearchQuery.java more refactoring of indexer and kelondro classes; 2009-03-02 10:00:32 +00:00
plasmaSearchRankingProcess.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
plasmaSearchRankingProfile.java more performance hacks 2008-12-04 12:54:16 +00:00
plasmaSnippetCache.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
plasmaStore.java
plasmaSwitchboard.java - refactoring of IntegerHandleIndex and LongHandleIndex (better method names) 2009-03-08 21:37:17 +00:00
plasmaSwitchboardConstants.java more refactoring of kelondro.text / deleted de.anomic.index 2009-03-02 11:04:13 +00:00
plasmaWebStructure.java simplification of (internal) query process / refactoring 2009-03-06 15:53:20 +00:00
plasmaWordIndex.java added an export and a delete feature to URLAnalysis. This completes the clean-up feature for URLs. To do a complete clean-up of the URL database, run the following commands: 2009-03-10 20:52:10 +00:00