Michael Peter Christen
6578ff3ddb
enhanced suggest function
2015-02-09 18:45:07 +01:00
reger
fe6f5a395d
fix Umlaut handling in blekko heuristic search term
http://mantis.tokeek.de/view.php?id=169
observation: blekko seems to block xxxbot agents (=0 results)
2015-02-08 23:40:33 +01:00
reger
ab98f69592
fix: searchoption hint for heuristic
2015-02-08 00:15:30 +01:00
reger
23924348e2
url with semicolon or comma handling in proxy request
apply patch supplied with bugreport http://mantis.tokeek.de/view.php?id=540
2015-02-07 22:01:54 +01:00
sixcooler
b05a2fca1f
small correction for last commit
2015-02-07 13:47:15 +01:00
reger
8fa542a8e1
upd to Jetty 9.2.7
2015-02-07 00:44:09 +01:00
reger
9025fe3518
upd error message for proxy
fix http://mantis.tokeek.de/view.php?id=539
2015-02-07 00:37:43 +01:00
Michael Peter Christen
974d58b01f
IPv6 Fix for push interface
2015-02-04 15:03:34 +01:00
Michael Peter Christen
fe50e5aef6
fix for failed selection of terms in faceted search with vocabularies
2015-02-04 11:55:27 +01:00
Michael Peter Christen
1309619a71
remove remote indexing option in crawl start if not in p2p mode
2015-02-04 11:37:07 +01:00
Michael Peter Christen
6324db1213
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2015-02-04 11:27:31 +01:00
reger
5cb05c3013
adjust table column width to not line wrap crawler traffic line
2015-02-04 03:51:34 +01:00
Michael Peter Christen
606d00c8f2
cloning a crawl now accepts the class name of vocabulary scrapers
2015-02-04 01:50:35 +01:00
Michael Peter Christen
97ba5ddbb7
configuration option for maxload limit for remote search
2015-02-04 01:12:25 +01:00
reger
c454ef69c6
add shortMemory check to heuristic search
and skip operation on shortMemory (no request to remote opensearch systems)
2015-02-03 03:08:34 +01:00
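The shortMemory guard described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the class `ShortMemoryGuard`, its method names, and the byte threshold are all hypothetical, and only standard `Runtime` calls are used.

```java
public class ShortMemoryGuard {

    // Bytes the JVM could still allocate: unused heap plus heap not yet claimed.
    static long available() {
        Runtime rt = Runtime.getRuntime();
        return rt.maxMemory() - rt.totalMemory() + rt.freeMemory();
    }

    // Hypothetical check: memory is "short" if less than 'required' bytes remain.
    static boolean shortMemory(long available, long required) {
        return available < required;
    }

    // Runs the (hypothetical) remote request only when enough memory is left.
    // Returns true if the request was executed, false if it was skipped.
    static boolean runIfMemoryAllows(long requiredBytes, Runnable remoteRequest) {
        if (shortMemory(available(), requiredBytes)) {
            return false; // skip: no request to remote systems
        }
        remoteRequest.run();
        return true;
    }

    public static void main(String[] args) {
        boolean executed = runIfMemoryAllows(16L * 1024 * 1024,
                () -> System.out.println("querying remote system (placeholder)"));
        System.out.println("executed: " + executed);
    }
}
```

The threshold of 16 MB in `main` is an arbitrary value chosen for the example.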
reger
11b21308c0
fix: malformed filename in image search
fix for http://mantis.tokeek.de/view.php?id=533
2015-02-01 05:35:09 +01:00
reger
9e1ec5fec4
refactor: just some more usages of constant for term ":[* TO *]"
2015-02-01 04:26:33 +01:00
reger
8c491f51a5
remove hardcoded initialization of language nav if not used
2015-02-01 00:29:28 +01:00
Marc Nause
a311c97c9b
Added & in start script for *NIX which was lost a few commits ago.
2015-01-30 21:17:23 +01:00
Michael Peter Christen
b5ac29c9a5
added an html field scraper which reads text from html entities of a
given css class and extends a given vocabulary with a term consisting
of the text content of the html class tag. Additionally, the term is
included into the semantic facet of the document. This allows the
creation of faceted search to documents without the pre-creation of
vocabularies; instead, the vocabulary is created on-the-fly, possibly
for use in other crawls. If any of the term scraping for a specific
vocabulary is successful on a document, this vocabulary is excluded for
auto-annotation on the page.
To use this feature, do the following:
- create a vocabulary on /Vocabulary_p.html (if not existent)
- in /CrawlStartExpert.html you will now see the vocabularies as column
in a table. The second column provides text fields where you can name
the class of html entities where the literal of the corresponding
vocabulary shall be scraped out
- when doing a search, you will see the content of the scraped fields in
a navigation facet for the given vocabulary
2015-01-30 13:20:56 +01:00
Michael Peter Christen
1cb290170e
refactoring of autotagging code (combined same code pieces)
2015-01-29 11:39:47 +01:00
Michael Peter Christen
c3b55455fc
enhanced initialization speed of vocabularies by using better
normalization and by removal of unused data structures
2015-01-29 02:45:32 +01:00
Michael Peter Christen
68c605d637
replace with CommonPattern.SPACE for split
2015-01-29 02:28:03 +01:00
Michael Peter Christen
de3e373913
using precompiled CommonPattern.TAB for split
2015-01-29 02:22:28 +01:00
Michael Peter Christen
1f5047b15f
using precompiled pattern CommonPattern.SEMICOLON for splits
2015-01-29 02:19:41 +01:00
Michael Peter Christen
a8a2b7a803
persistency for vocabulary facet switch
2015-01-29 02:16:42 +01:00
Michael Peter Christen
efbc9a3561
introducing a new getConfig method which parses comma-separated lists
from setting fields; refactoring for all places where such lists are
parsed
2015-01-29 01:53:36 +01:00
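A getConfig variant that parses comma-separated lists, as introduced in the commit above, might look like the sketch below. The class `ConfigLists`, the method `getConfigSet`, and the in-memory settings map are assumptions made for illustration; the actual method signature in the code base is not shown in this log.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

public class ConfigLists {

    // Compiled once and reused for every split.
    private static final Pattern COMMA = Pattern.compile(",");

    // Stand-in for the real settings store (assumption).
    private final Map<String, String> settings = new HashMap<>();

    public void setConfig(String key, String value) {
        settings.put(key, value);
    }

    // Hypothetical helper: returns the trimmed, non-empty, de-duplicated
    // entries of a comma-separated setting, in their original order.
    public Set<String> getConfigSet(String key) {
        Set<String> result = new LinkedHashSet<>();
        String value = settings.get(key);
        if (value == null) {
            return result; // unknown key: empty set instead of null
        }
        for (String part : COMMA.split(value)) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ConfigLists config = new ConfigLists();
        config.setConfig("example.list", "a, b,,c ,a");
        System.out.println(config.getConfigSet("example.list"));
    }
}
```

Centralizing the parsing in one method means trimming and empty-entry handling behave the same everywhere such a list is read.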
Michael Peter Christen
69eacdf4eb
applying precompiled CommonPattern.COMMA.split to all places where
split(",") was used
2015-01-29 01:46:22 +01:00
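The replacement of split(",") by a precompiled pattern can be illustrated with the sketch below. The constant `COMMA` here is a local stand-in for `CommonPattern.COMMA`, whose actual definition is not shown in this log. Whether this is measurably faster depends on the JDK: recent `String.split` implementations have a fast path for simple single-character separators, so the gain is mainly for patterns that would otherwise be recompiled on each call.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PrecompiledSplit {

    // Stand-in for CommonPattern.COMMA (assumption): compiled once, reused.
    static final Pattern COMMA = Pattern.compile(",");

    // Splits with the shared precompiled pattern instead of String.split(",").
    static String[] splitComma(String s) {
        return COMMA.split(s);
    }

    public static void main(String[] args) {
        String line = "a,b,,c";
        String[] viaString = line.split(",");    // regex given per call
        String[] viaPattern = splitComma(line);  // precompiled pattern
        // Both approaches yield the same tokens.
        System.out.println(Arrays.equals(viaString, viaPattern));
        System.out.println(Arrays.toString(viaPattern));
    }
}
```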
Michael Peter Christen
ac19690d30
refactoring with CommonPattern.COMMA
2015-01-29 01:35:28 +01:00
Michael Peter Christen
cf9b22ca5c
do not reindex based on vocabulary fields (there are meanwhile many of
them) and some default settings
2015-01-29 01:22:28 +01:00
Michael Peter Christen
5a060c9f26
refactoring of reindexSolr (just replaced constant string)
2015-01-29 00:33:07 +01:00
Michael Peter Christen
b5a55c8b3d
fix for wkhtmltopdf (custom header does not work)
2015-01-28 17:45:25 +01:00
Michael Peter Christen
3d717b749a
fix for urlmaskfilter
2015-01-28 13:40:41 +01:00
Michael Peter Christen
2636582435
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2015-01-28 10:32:17 +01:00
reger
0260d3d800
Allow to hide linkstructure graphic in crawl monitor
using/setting the config param DECORATION_GRAFICS_LINKSTRUCTURE
2015-01-28 03:59:01 +01:00
Michael Peter Christen
bee5ee7cce
removed some warnings
2015-01-27 17:00:20 +01:00
Michael Peter Christen
783cf6fbc7
the LinkedBlockingQueue is much faster than the ArrayBlockingQueue
(strange but this is the result of a test:
ArrayBlockingQueue: 39461 lines / second;
LinkedBlockingQueue: 60774 lines / second)
2015-01-27 16:53:09 +01:00
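A comparison like the one reported in the commit above could be reproduced with a simple producer/consumer sketch such as the following. This is not the test the author ran: the class `QueueThroughput`, the queue capacity, and the line count are assumptions, and a naive timing loop like this is sensitive to JIT warm-up and machine load, so the numbers it prints will differ from those quoted.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueThroughput {

    // Sentinel recognised by identity; tells the consumer to stop.
    private static final String POISON = new String("POISON");

    // Pushes 'lines' strings through the queue to one consumer thread
    // and returns how many lines the consumer received.
    static long transfer(BlockingQueue<String> queue, int lines) {
        final long[] consumed = {0};
        Thread consumer = new Thread(() -> {
            try {
                while (queue.take() != POISON) {
                    consumed[0]++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            for (int i = 0; i < lines; i++) {
                queue.put("line " + i);
            }
            queue.put(POISON);
            consumer.join(); // join makes the consumer's count visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consumed[0];
    }

    // Times one transfer and prints the throughput in lines per second.
    static void report(String name, BlockingQueue<String> queue, int lines) {
        long start = System.nanoTime();
        long count = transfer(queue, lines);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%s: %.0f lines / second%n", name, count / seconds);
    }

    public static void main(String[] args) {
        int lines = 1_000_000; // arbitrary workload size
        report("ArrayBlockingQueue", new ArrayBlockingQueue<String>(1000), lines);
        report("LinkedBlockingQueue", new LinkedBlockingQueue<String>(1000), lines);
    }
}
```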
Michael Peter Christen
6390454652
fix for vocabulary on/off setting
2015-01-27 16:24:27 +01:00
Michael Peter Christen
a3c5995bde
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2015-01-26 14:13:17 +01:00
reger
5ca0762179
fix: OOM on parsing ico file by genericImageParser
trace: java.lang.OutOfMemoryError: Java heap space
at java.awt.image.DataBufferInt.<init>(DataBufferInt.java:75)
at java.awt.image.Raster.createPackedRaster(Raster.java:467)
at java.awt.image.DirectColorModel.createCompatibleWritableRaster(DirectColorModel.java:1032)
at java.awt.image.BufferedImage.<init>(BufferedImage.java:331)
at net.yacy.document.parser.images.bmpParser$IMAGEMAP.<init>(bmpParser.java:149)
at net.yacy.document.parser.images.bmpParser.parse(bmpParser.java:69)
at net.yacy.document.parser.images.genericImageParser.parse(genericImageParser.java:116)
2015-01-24 23:17:07 +01:00
Michael Peter Christen
4cd2d68e03
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2015-01-24 07:10:47 +01:00
Michael Peter Christen
dc5700148f
update to latest code changes from json.org
2015-01-24 07:10:14 +01:00
reger
42b0672be3
Let auto-disabled crawls recover once the low-resource condition has vanished.
Analogous to the auto-disabled DHT, switch auto-disabled crawls back on when
memory is OK again, by remembering the auto-disable in a conf parameter.
2015-01-24 01:53:58 +01:00
Michael Peter Christen
b32e0b5457
fix for shell script
2015-01-23 18:34:38 +01:00
Michael Peter Christen
29f6e9db7a
write java version to status page
2015-01-23 17:57:54 +01:00
Michael Peter Christen
604ccd8072
new development cycle
2015-01-23 11:31:05 +01:00
Michael Peter Christen
287c528f46
replaced old JavaApplicationStub for Mac Application framework with new
script. Adopted the YaCyApp environment and fixed a problem in the
startYACY.sh application wrapper which caused wrong usage of the logging
option -l, so that files were written to the YaCy application folder.
As a result of this fix, it is no longer necessary to change path
settings in Info.plist if libraries are changed.
2015-01-23 11:30:13 +01:00
Michael Peter Christen
2bc2564668
Release 1.82
2015-01-21 12:45:55 +01:00
Michael Peter Christen
4c9d2a7c64
reverted 'do not show all options' strategy. This is actually confusing
new users. It may be activated again if there is an optional
tutorial mode which can be switched on for this special purpose of
running a tutorial.
2015-01-20 18:18:12 +01:00
Michael Peter Christen
7db2888336
fixed font size and print page generation in pdf snapshots
2015-01-20 17:14:14 +01:00