Name                          Last commit                                                               Date
content                       Ensure lower case conversion consistency with any default locale.         2017-06-27 06:42:33 +02:00
importer                      added a crawl filter based on <div> tag class names                       2017-12-09 22:29:35 +01:00
language                      Fixed language detector initialization and NullPointerException cases.    2016-12-05 18:12:21 +01:00
parser                        Added basic support for autotagging microdata annotated item types.       2018-02-06 10:25:38 +01:00
AbstractParser.java           added a crawl filter based on <div> tag class names                       2017-12-09 22:29:35 +01:00
Condenser.java                Added basic support for autotagging microdata annotated item types.       2018-02-06 10:25:38 +01:00
DateDetection.java            Remove old hard-coded holiday dates from DateDetection class.             2017-11-07 19:02:09 +01:00
Document.java                 Added basic support for autotagging microdata annotated item types.       2018-02-06 10:25:38 +01:00
ImageParser.java              BMP and ICO image formats support : integrated /haraldk/TwelveMonkeys     2015-11-20 09:38:16 +01:00
LargeNumberCache.java         Cleaned up some Javadoc warnings.                                         2017-01-09 16:44:47 +01:00
LibraryProvider.java          Cleaned up some Javadoc warnings.                                         2017-01-09 16:44:47 +01:00
Parser.java                   added a crawl filter based on <div> tag class names                       2017-12-09 22:29:35 +01:00
Phrase.java                   more performance hacks                                                    2010-10-09 08:55:57 +00:00
ProbabilisticClassifier.java  Fixed a NullPointerException case.                                        2016-12-02 13:45:45 +01:00
SentenceReader.java           hacks to prevent storage of data longer than necessary during search and  2013-10-25 15:05:30 +02:00
SnippetExtractor.java         skip unused call parameter for hashSentence()                             2014-11-30 19:42:33 +01:00
TextParser.java               added a crawl filter based on <div> tag class names                       2017-12-09 22:29:35 +01:00
Tokenizer.java                Refactoring : documented and extracted autotagging processing functions.  2018-02-02 10:27:36 +01:00
VocabularyScraper.java        added enrichment of synonyms and vocabularies for imported documents      2015-07-02 00:23:50 +02:00
WordTokenizer.java            reactivate sentence counter in WordTokenizer for phrasepos ranking,       2016-09-07 02:16:16 +02:00