2592 Commits

Author SHA1 Message Date
Michael Peter Christen
ebd44a7080 replaced solr 4.6.1 with solr 4.7.1 and added index migration to
lucene_47
2014-04-06 10:45:03 +02:00
Michael Peter Christen
734778c0c8 fixed a time-out problem in the default servlet which is also a logging
problem because the error log showed the wrong reason (file not found)
instead of the actual reason (time-out).
2014-04-04 15:27:29 +02:00
Michael Peter Christen
466d90ad42 fixed a problem with resource observer; probably coming from uncaught
exceptions within the apache library which appear only in concurrent
environments.
2014-04-04 15:26:39 +02:00
Michael Peter Christen
e8ddd415a8 enhanced the new link structure graph 2014-04-04 14:43:54 +02:00
Michael Peter Christen
926d28dd3f fixed a bug which prevented crawl starts after a network switch 2014-04-04 14:43:35 +02:00
Michael Peter Christen
3ce8eff21b another fix for inbound/outbound detection 2014-04-04 12:41:59 +02:00
Michael Peter Christen
d4b5c457e4 NPE fix 2014-04-04 12:34:34 +02:00
Michael Peter Christen
36a66b0704 fix for parsing of numeric value in case that boolean values are given 2014-04-04 11:59:51 +02:00
orbiter
41730c8048 better logging in template engine: shows filename of servlets where
errors in templates occur
2014-04-04 10:55:46 +02:00
orbiter
3c1274057d fixed thread dump in case of wrong seeds 2014-04-04 10:54:56 +02:00
orbiter
18f9c40302 moved Edge class out of linkstructure servlet as this does not work on
non-eclipse driven environments (all non-dev cases)
2014-04-04 10:54:11 +02:00
orbiter
de95e5e524 reduced search activity corona strength in network image 2014-04-04 10:08:44 +02:00
reger
da413af664 move baseurl after parsing orig source in urlproxyservlet
to calculate absolute href links for rewrite from unmodified source.
2014-04-04 03:11:16 +02:00
reger
af6ad20728 fix: remove obsolete ref to yacy.home
(use Switchboard instead)
2014-04-04 02:45:04 +02:00
Michael Peter Christen
74ab094587 fix for solr query size; too many documents were retrieved when
fewer than _pagesize_ were requested.
2014-04-03 13:42:10 +02:00
Michael Peter Christen
c64c10ef00 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2014-04-03 01:58:06 +02:00
Michael Peter Christen
48fbfa60c1 bugfix to inbound/outbound identification 2014-04-03 01:21:43 +02:00
reger
227c42bc96 eliminate obsolete URIMetaDataRow class
by joining it with/into URIMetaDataNode.
2014-04-03 00:35:15 +02:00
Michael Peter Christen
cca851a417 introduced new solr field crawldepth_i which records the crawl depth of
a document. This is the upper limit for the clickdepth_i value, which may
be smaller if the crawler did not take the shortest path to
the document.
2014-04-02 23:37:01 +02:00
orbiter
b1ba764d81 fix for first start options and added German translation for popup texts 2014-04-02 17:10:59 +02:00
orbiter
429a874222 - added COLS field in GSA response (non-gsa standard by customer
request)
- updated document link in GSA response writer
2014-04-02 16:05:44 +02:00
Michael Peter Christen
1b9ec9a1c5 - added popover to p2p/stealth mode button to explain the peer mode and
privacy issues.
- added popover to first-time use case to explain that specific servlets
are only visible after customization and/or crawl starts
2014-04-02 13:33:43 +02:00
Michael Peter Christen
62a36fa584 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2014-04-02 03:27:08 +02:00
reger
c9f92abddc fix: application link count
(URIMetadataNode)
2014-04-02 03:21:51 +02:00
Michael Peter Christen
a267c46e1a Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2014-04-02 02:35:58 +02:00
Michael Peter Christen
5b83887da8 npe fix 2014-04-02 02:34:55 +02:00
Michael Peter Christen
63c9fcf3e0 free configuration of postprocessing clickdepth maximum depth and time 2014-04-02 02:34:39 +02:00
Michael Peter Christen
39b641d6cd added tutorial mode - some menu items will only appear if you 'qualify'
for them. Thus, the first-time user will only see four menu items. The
other items will unfold as the user interacts.
2014-04-02 02:33:17 +02:00
sixcooler
f06775850f fix receiving DHT / parse multipart
+ another close to fix possible resource leak warning
2014-04-02 01:24:15 +02:00
reger
49e76a1c55 make use of detected charset in htmlParser if none is given. 2014-04-01 04:02:34 +02:00
reger
e11504309f adding a hint to javascript browser shortcut on Url-Proxy page (AugmentedBrowsing_p.html) 2014-03-30 05:11:42 +02:00
reger
b12200cafe alternative UrlProxyServlet (for /proxy.html) using different url rewrite rules
- use JSoup parser for selective rewrite of html body <a href=  links only,
instead of regex which rewrites also header href/src links
- this improves display of pages which use header <base> tag
- tags with src attribute are taken from original location (like css), improving display, and are not routed through the indexer
Disadvantage: scripting links will drop out of proxy

Setting of the servlet through web.xml exclusively (in case one would like to quickly switch back to the YaCyProxyServlet,
leaving the existing code of YaCyProxyServlet untouched and available)
2014-03-30 04:04:02 +02:00
reger
2953ebe701 fix: port in local target address
& button style
2014-03-29 00:34:01 +01:00
Michael Peter Christen
fda591695c fixed visibility of custom icon 2014-03-28 17:25:39 +01:00
Michael Peter Christen
a9b9950d7f Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2014-03-28 14:48:32 +01:00
Michael Peter Christen
b488f33975 added close to fix possible resource leak warning 2014-03-28 14:34:49 +01:00
Michael Peter Christen
56710ecb26 prevent opening of new files as that could be a cause for the latest
too-many-open-files exception. The old file is just truncated if the
table is cleaned.
2014-03-28 14:31:43 +01:00
Michael Peter Christen
8b44fcf0f4 added missing @Override annotation 2014-03-28 13:48:37 +01:00
reger
d7055904a6 fix: proxyservlet path header setting 2014-03-28 02:05:58 +01:00
Michael Peter Christen
e515dd460d added linkscount_i and linksnofollowcount_i to the default solr schema 2014-03-27 23:36:08 +01:00
Michael Peter Christen
1a764135be one more Thread Dump fix for new bootstrap css style 2014-03-27 23:01:28 +01:00
Michael Peter Christen
bb21d825f9 fix for thread dump line spacing 2014-03-27 22:13:37 +01:00
Michael Peter Christen
cbdfef7ce1 changed protocol facet to show also all other counts if one facet is
selected
2014-03-27 13:29:14 +01:00
reger
b9056ef2db remove unused private header entries (HeaderFramework)
X_YACY_ORIGINAL_REQUEST_LINE
X_YACY_KEEP_ALIVE_REQUEST_COUNT
CONNECTION_PROP_REQUESTLINE
2014-03-26 23:28:19 +01:00
sixcooler
6d16fa993d make transparent proxy handle https-connections:
the implemented handle for connect did not work for me - so let's try the
connectHandler
2014-03-26 20:01:15 +01:00
Michael Peter Christen
61ad194065 fix for source and target clickdepth in webgraph index 2014-03-26 16:00:05 +01:00
Marc Nause
809b4e1fd9 Team added support for URLs with unicode characters in host part to
blacklist. Punycode is used to handle unicode characters.
2014-03-25 22:14:54 +01:00
reger
b126b9ba17 add some InputFileStream close at end of reads
to make sure file is released
2014-03-24 02:32:17 +01:00
reger
ca7444dbdf limit filetype nav to known extension also on image/media search
- on text search we limit filetype nav already to known extension, apply filter to image search
2014-03-23 23:10:29 +01:00
reger
651d057e93 surrogate import translate dc:language 3-char codes
OAI records often use 3-char language codes; start converting some 3-char language codes to the internal ISO639-1 2-char code
2014-03-23 00:40:36 +01:00
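
The language-code translation described in commit 651d057e93 can be sketched roughly as follows. This is only an illustration in Python, not the code from the commit: the table `ISO639_3_TO_1` and the function `normalize_language` are hypothetical names, and the handful of codes listed is an assumed sample rather than the set the project actually converts.

```python
# Hypothetical lookup table: a few 3-char language codes (ISO 639-2/639-3)
# mapped to their 2-char ISO 639-1 equivalents. Assumed sample only.
ISO639_3_TO_1 = {
    "eng": "en",
    "deu": "de",
    "ger": "de",  # bibliographic variant of "deu"
    "fra": "fr",
    "fre": "fr",  # bibliographic variant of "fra"
    "spa": "es",
    "ita": "it",
}


def normalize_language(code: str) -> str:
    """Return the 2-char code for a known 3-char code, else the cleaned input."""
    cleaned = code.strip().lower()
    if len(cleaned) == 3:
        # Unknown 3-char codes pass through unchanged instead of being guessed.
        return ISO639_3_TO_1.get(cleaned, cleaned)
    return cleaned
```

Passing unknown codes through unchanged keeps the sketch consistent with the commit message, which says only some 3-char codes are converted.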