yacy_search_server

mirror of https://github.com/yacy/yacy_search_server.git synced 2024-09-19 00:01:41 +02:00

Michael Peter Christen 25573bd5ab added a crawl filter based on <div> tag class names. When a crawl is started, a new field to exclude content from scraping is available. The field can be identified with the class name of div tags. All text contained in a div tag whose class name matches one of the configured class names is not indexed, while the rest of the page is indexed.		2017-12-09 22:29:35 +01:00
MonitoredReader.java	refactoring	2012-09-21 15:48:16 +02:00
TablesRowComparator.java	- the webgraph shall store all links which appear on a web page and not	2013-09-15 00:30:23 +02:00
YMarkAutoTagger.java	enhanced timezone management for indexed data:	2015-04-15 13:17:23 +02:00
YMarkCrawlStart.java	added a crawl filter based on <div> tag class names	2017-12-09 22:29:35 +01:00
YMarkDate.java	- the webgraph shall store all links which appear on a web page and not	2013-09-15 00:30:23 +02:00
YMarkDMOZImporter.java	added missing @Override annotation	2014-03-28 13:48:37 +01:00
YMarkEntry.java	- the webgraph shall store all links which appear on a web page and not	2013-09-15 00:30:23 +02:00
YMarkHTMLImporter.java	added missing @Override annotation	2014-03-28 13:48:37 +01:00
YMarkImporter.java	Added 'final' for all exception blocks as this helps the Java compiler	2013-07-17 18:31:30 +02:00
YMarkJSONImporter.java	added missing @Override annotation	2014-03-28 13:48:37 +01:00
YMarkMetadata.java	harmonize/correct assignment to Ymarkmeta.mime	2015-09-23 00:13:10 +02:00
YMarkTables.java	skip loading document on crawl start for YMark bookmarks	2015-12-26 01:15:07 +01:00
YMarkTag.java	added missing @Override annotation	2014-03-28 13:48:37 +01:00
YMarkUtil.java	Cleaned up some Javadoc warnings.	2017-01-09 16:44:47 +01:00
YMarkXBELImporter.java	added missing @Override annotation	2014-03-28 13:48:37 +01:00