yacy_search_server/source/net/yacy/data/ymark
Michael Peter Christen 25573bd5ab added a crawl filter based on <div> tag class names
When a crawl is started, a new field is available to exclude content from
scraping. The content to exclude is identified by the class names of div tags.
All text contained in a div tag whose class matches one of the configured
class names is not indexed, while the rest of the page is indexed.
2017-12-09 22:29:35 +01:00
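
The filtering behaviour described in the commit message above can be illustrated with a minimal, self-contained sketch. This is not YaCy's scraper code: the class DivClassFilter, the method visibleText, and the regex-based tag scanning are all assumptions made for illustration, and a real crawler would use a proper HTML parser. The sketch only shows the idea that text inside a div with a configured class name (including text in nested divs) is dropped while the rest of the page is kept.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the YaCy code base.
final class DivClassFilter {

    // group 1: "/" for closing tags, group 2: tag name, group 3: raw attributes
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>");
    // Simplification: only double-quoted class attributes are recognised.
    private static final Pattern CLASS_ATTR =
            Pattern.compile("(?:^|\\s)class\\s*=\\s*\"([^\"]*)\"");

    /** Returns the page text, skipping everything inside divs with an ignored class. */
    static String visibleText(String html, Set<String> ignoredClasses) {
        StringBuilder out = new StringBuilder();
        Deque<Boolean> divStack = new ArrayDeque<>(); // true = this div is excluded
        int excludedDepth = 0; // number of currently open excluded divs
        int pos = 0;
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            if (excludedDepth == 0) {
                // Keep the text before this tag; a space separates text segments
                // (this also splits words around inline tags, fine for a sketch).
                out.append(html, pos, m.start()).append(' ');
            }
            pos = m.end();
            if (!m.group(2).equalsIgnoreCase("div")) {
                continue;
            }
            if (m.group(1).isEmpty()) { // opening <div ...>
                boolean excluded = hasIgnoredClass(m.group(3), ignoredClasses);
                divStack.push(excluded);
                if (excluded) {
                    excludedDepth++;
                }
            } else if (!divStack.isEmpty() && divStack.pop()) { // closing </div>
                excludedDepth--;
            }
        }
        if (excludedDepth == 0) {
            out.append(html, pos, html.length());
        }
        return out.toString().replaceAll("\\s+", " ").trim();
    }

    private static boolean hasIgnoredClass(String attrs, Set<String> ignored) {
        Matcher c = CLASS_ATTR.matcher(attrs);
        if (!c.find()) {
            return false;
        }
        for (String name : c.group(1).trim().split("\\s+")) {
            if (ignored.contains(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String html = "<p>Intro</p><div class=\"nav menu\">Skip <div>nested</div> me</div>"
                + "<div class=\"content\">Keep</div>";
        System.out.println(visibleText(html, Set.of("nav"))); // prints "Intro Keep"
    }
}
```

A depth counter is used here instead of a single flag so that a closing tag of a nested div does not end the exclusion early.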
MonitoredReader.java refactoring 2012-09-21 15:48:16 +02:00
TablesRowComparator.java - the webgraph shall store all links which appear on a web page and not 2013-09-15 00:30:23 +02:00
YMarkAutoTagger.java enhanced timezone management for indexed data: 2015-04-15 13:17:23 +02:00
YMarkCrawlStart.java added a crawl filter based on <div> tag class names 2017-12-09 22:29:35 +01:00
YMarkDate.java - the webgraph shall store all links which appear on a web page and not 2013-09-15 00:30:23 +02:00
YMarkDMOZImporter.java added missing @Override annotation 2014-03-28 13:48:37 +01:00
YMarkEntry.java - the webgraph shall store all links which appear on a web page and not 2013-09-15 00:30:23 +02:00
YMarkHTMLImporter.java added missing @Override annotation 2014-03-28 13:48:37 +01:00
YMarkImporter.java Added 'final' for all exception blocks as this helps the Java compiler 2013-07-17 18:31:30 +02:00
YMarkJSONImporter.java added missing @Override annotation 2014-03-28 13:48:37 +01:00
YMarkMetadata.java harmonize/correct assignment to Ymarkmeta.mime 2015-09-23 00:13:10 +02:00
YMarkTables.java skip loading document on crawl start for YMark bookmarks 2015-12-26 01:15:07 +01:00
YMarkTag.java added missing @Override annotation 2014-03-28 13:48:37 +01:00
YMarkUtil.java Cleaned up some Javadoc warnings. 2017-01-09 16:44:47 +01:00
YMarkXBELImporter.java added missing @Override annotation 2014-03-28 13:48:37 +01:00