.. | |
content | enhanced the surrogate parser: better reading of UTF-8 characters | 2011-04-01 11:05:42 +00:00
geolocalization | replaced more appearance of double values by float values | 2011-02-02 00:06:29 +00:00
importer | fix for mediawiki importer and wikicode parser | 2011-04-13 13:22:27 +00:00
language | *) cleaning up the code a little bit | 2010-12-27 17:07:21 +00:00
parser | more UTF8 getBytes() performance hacks | 2011-04-12 05:02:36 +00:00
AbstractParser.java | redesign of parser interface: | 2010-06-29 19:20:45 +00:00
Classification.java | *) added SID file (Commodore 64) sound file parser | 2010-12-28 12:06:04 +00:00
Condenser.java | - added an index constraint 'has location' to the condenser | 2011-03-31 09:41:30 +00:00
Document.java | more UTF8 getBytes() performance hacks | 2011-04-12 05:02:36 +00:00
ImageParser.java | redesign of parser interface: | 2010-06-29 19:20:45 +00:00
LargeNumberCache.java | more performance hacks | 2010-10-09 08:55:57 +00:00
LibraryProvider.java | - fixed document number limitation for crawls that restrict the number of documents per domain | 2011-02-12 00:01:40 +00:00
Parser.java | *) added SID file (Commodore 64) sound file parser | 2010-12-28 12:06:04 +00:00
Phrase.java | more performance hacks | 2010-10-09 08:55:57 +00:00
SentenceReader.java | *) set SVN properties | 2011-03-08 01:51:51 +00:00
SnippetExtractor.java | always try to guess the size of a StringBuilder to prevent too many memory re-allocations | 2011-03-09 09:29:05 +00:00
TextParser.java | *) added SID file (Commodore 64) sound file parser | 2010-12-28 12:06:04 +00:00
WordCache.java | some enhancements to scoring speed | 2011-04-13 15:17:00 +00:00
WordTokenizer.java | - replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion | 2011-03-10 23:25:07 +00:00