luccioman
|
5a14d34a7d
|
Refactoring: documented and extracted autotagging processing functions.
|
2018-02-02 10:27:36 +01:00 |
|
reger
|
b017e97421
|
optimize condenser language detection a little.
langdetect probabilities take letter case into account, so add words from
description, anchors, etc. as is.
+ add it to javadoc
|
2016-10-06 19:03:52 +02:00 |
|
reger
|
ae3717d087
|
adjust Tokenizer sentence count to ignore repeated punctuation (like !!!!)
+ remove unused sentenceword map (we use only the count)
+ update test case for sentence count
|
2016-10-06 03:41:07 +02:00 |
|
reger
|
e310ec5f70
|
fix posInText ranking calculation to score 0 on no position info
+ fix Word posInText calculation in Tokenizer to start at 1
+ test case
|
2016-09-06 00:05:59 +02:00 |
|