yacy_search_server/source/net/yacy/document
Michael Peter Christen 659178942f - Redesigned crawler and parser to accept embedded links from the NOLOAD
queue and not from virtual documents generated by the parser.
- The parser now generates descriptive texts for NOLOAD entries,
which should make it possible to find media content through the search
index instead of the media prefetch algorithm during search (which
was costly)
- Removed the media-search prefetch process from image search
2012-04-24 16:07:03 +02:00
..
content added changes from copperdust (submitted by email): 2012-02-22 12:21:27 +01:00
geolocalization added autotagging stub .. only reading and parsing of vocabularies at 2012-01-07 17:34:38 +01:00
importer - Redesigned crawler and parser to accept embedded links from the NOLOAD 2012-04-24 16:07:03 +02:00
language added changes from copperdust (submitted by email): 2012-02-22 12:21:27 +01:00
parser - Redesigned crawler and parser to accept embedded links from the NOLOAD 2012-04-24 16:07:03 +02:00
AbstractParser.java added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is the default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored in a Solr index, provided that Solr is also enabled. 2011-09-07 10:08:57 +00:00
Autotagging.java fix for single-word vocabulary lines 2012-01-26 16:44:30 +01:00
Condenser.java new indexing strategy: ALL links that appear anywhere are indexed, not 2012-04-22 02:05:17 +02:00
Document.java - Redesigned crawler and parser to accept embedded links from the NOLOAD 2012-04-24 16:07:03 +02:00
ImageParser.java - enhanced description on search front page 2011-11-26 13:40:33 +00:00
LargeNumberCache.java more performance hacks 2010-10-09 08:55:57 +00:00
LibraryProvider.java added autotagging to document condenser: 2012-01-15 22:17:57 +01:00
Parser.java *) added SID file (Commodore 64) sound file parser 2010-12-28 12:06:04 +00:00
Phrase.java more performance hacks 2010-10-09 08:55:57 +00:00
SentenceReader.java Initial performance improvements 2011-11-30 11:15:54 +00:00
SnippetExtractor.java performance hack 2012-01-25 12:48:48 +01:00
StringBuilderComparator.java replaced String with StringBuilder in suggestion process 2011-11-09 14:42:55 +00:00
TextParser.java - Redesigned crawler and parser to accept embedded links from the NOLOAD 2012-04-24 16:07:03 +02:00
WordCache.java vocabularies are now also used as source for a did-you-mean computation 2012-01-08 02:13:52 +01:00
WordTokenizer.java performance hack 2012-01-25 12:48:48 +01:00