content
|
abstraction of surrogate main element (xmlns:geo was missing for wiki extracts)
|
2011-05-17 08:57:49 +00:00 |
geolocalization
|
added autotagging stub .. only reading and parsing of vocabularies at
|
2012-01-07 17:34:38 +01:00 |
importer
|
!Important: move from Hashtable to HashMap
|
2012-01-09 01:29:18 +01:00 |
language
|
enhanced identificator: using AtomicInteger for counter
|
2011-06-19 13:31:10 +00:00 |
parser
|
PDFParser - return at least first 3 pages of PDF
|
2012-01-21 03:15:12 +01:00 |
AbstractParser.java
|
added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is the default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a Solr index, if this is also enabled.
|
2011-09-07 10:08:57 +00:00 |
Autotagging.java
|
suppress auto-tagged subject entries when sending out or receiving
|
2012-01-17 02:10:05 +01:00 |
Classification.java
|
- added an 'add every media object linked in an HTML document as a new document' feature to the HTML parser. As a result, every image, app, video or audio file that is linked in an HTML file is added as a document. In effect, parsing a single HTML document may insert a number of documents into the search index.
|
2011-09-01 16:05:00 +00:00 |
Condenser.java
|
added autotagging to document condenser:
|
2012-01-15 22:17:57 +01:00 |
Document.java
|
added autotagging to document condenser:
|
2012-01-15 22:17:57 +01:00 |
ImageParser.java
|
- enhanced description on search front page
|
2011-11-26 13:40:33 +00:00 |
LargeNumberCache.java
|
more performance hacks
|
2010-10-09 08:55:57 +00:00 |
LibraryProvider.java
|
added autotagging to document condenser:
|
2012-01-15 22:17:57 +01:00 |
Parser.java
|
*) added SID file (Commodore 64) sound file parser
|
2010-12-28 12:06:04 +00:00 |
Phrase.java
|
more performance hacks
|
2010-10-09 08:55:57 +00:00 |
SentenceReader.java
|
Initial performance improvements
|
2011-11-30 11:15:54 +00:00 |
SnippetExtractor.java
|
finishing up my commits (7855-7858) which could be helpful for
|
2011-08-01 23:35:24 +00:00 |
StringBuilderComparator.java
|
replaced String with StringBuilder in suggestion process
|
2011-11-09 14:42:55 +00:00 |
TextParser.java
|
added switches to ConfigParser to accept/deny documents by their
|
2012-01-17 16:43:34 +01:00 |
WordCache.java
|
vocabularies are now also used as source for a did-you-mean computation
|
2012-01-08 02:13:52 +01:00 |
WordTokenizer.java
|
Initial performance improvements
|
2011-11-30 11:15:54 +00:00 |