Commit Graph

2166 Commits

Author SHA1 Message Date
reger
f46c723398 allow to choose used http server, YaCy-Anomic or Jetty
- defaults to Jetty (in this branch)
- add server version info & config option -> Admin Console -> Advanced Settings -> Http Networking
2013-10-17 03:34:22 +02:00
reger
da4ff5aefa add YaCy HttpCommand "authenticate" check to DefaultServlet 2013-10-17 00:06:17 +02:00
reger
1adb4b8741 merge rc1/master 2013-10-16 03:02:21 +02:00
reger
77a73c7475 add YaCy HttpCommand "location" check to DefaultServlet 2013-10-16 01:48:44 +02:00
reger
cc223b14a4 remove wrong content mod in SSI parser for virtual path /currentyacypeer/
(is handled on start of request handling)
2013-10-15 03:25:24 +02:00
reger
5606291574 fix last commit (not needed test of GZipInputStream) 2013-10-14 04:29:34 +02:00
reger
f9eed8cb44 add support for gzip encoded multipart forms (needed for transferRWI.html)
- quick and dirty reuse of existing HTTPDemon implementation
2013-10-14 04:18:52 +02:00
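Note on f9eed8cb44: a minimal sketch of the general idea, not the YaCy code. It assumes the request body is available as bytes and that the sender marks compression with a Content-Encoding header; the function name decode_form_body and the plain dict of headers are hypothetical.

```python
import gzip


def decode_form_body(headers, body):
    """Return the raw multipart body, decompressing it first if it is gzip encoded.

    headers: dict of request header names to values (hypothetical shape).
    body:    request body as bytes.
    """
    encoding = headers.get("Content-Encoding", "").strip().lower()
    if encoding == "gzip":
        # Decompress before handing the bytes to the multipart parser.
        return gzip.decompress(body)
    # Unencoded bodies are passed through unchanged.
    return body
```

The multipart parser itself stays unchanged in this sketch; only the bytes handed to it differ.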
reger
cf32a92629 - add size check to multipart form data handling of YaCyDefaultServlet (same as in HTTPDemon.parseMultipart)
- reduce Jetty logging 
- give build.run a bit more memory (set to YaCy.default 600m from 512m)
2013-10-13 20:56:03 +02:00
reger
705f147820 - add localpeername.yacy to list of local address detection for AbstractRemoteHandler
- use proxy via header info as in legacy proxy handler
2013-10-13 18:06:42 +02:00
reger
0d4efabaa8 fix YaCy version string in proxy headers
(config parameter vString no longer used)
2013-10-13 17:56:53 +02:00
reger
2226189743 disable domainhandler due to error
- domainhandler causes a closed response output stream in following handlers
  on addresses resolved to the local peer (as in the hello protocol, preventing the peer from switching to senior peer)
2013-10-13 07:24:33 +02:00
reger
eea504c117 update Info.plist
small DefaultServlet refactoring
2013-10-12 23:01:14 +02:00
reger
a44eede8b8 merge rc1/master 2013-10-11 01:50:25 +02:00
sixcooler
d9a02ed277 NPE fix for my last commit 2013-10-11 00:44:04 +02:00
reger
54a0272338 searchpage javascript (latestinfo) causes reset of search statistic after moving to next page
- disabled call via setTimeout in yacysearch.html
2013-10-10 23:23:58 +02:00
sixcooler
61f627eb85 fix for ssl-connections from proxy-usage staying in close-wait-state
+ some extra 'close' in HttpClient
2013-10-10 20:57:37 +02:00
Michael Peter Christen
d328cc4a83 fix for didyoumean, added also more asian alphabets 2013-10-09 16:17:50 +02:00
Michael Peter Christen
90c8577840 enhanced ranking; patches to replace old ranking 2013-10-09 15:10:03 +02:00
reger
e74f548551 make legacy http server (serverCore) implement YaCyHttpServer interface 2013-10-09 01:07:22 +02:00
reger
71d2655c02 downgrade to Jetty 8 to assure support of JRE 1.6
- introduce a YaCyHttp interface to modularize/separate http server
- adjust the Jetty version specific implementation part (in package net.yacy.http)
     - putting the version specific code in classes starting with Jetty8xxxx
     - moved existing Jetty9xxx implementation into a test class (to keep the code)
- adjust build to the changed jars
- make use of the introduced YaCyHttpServer interface in related htroot servlets

- adjust other test cases/classes
2013-10-09 00:40:48 +02:00
Michael Peter Christen
1b61bd40ed - Added new solr field url_file_name_tokens_t which stores the file name
tokens. This can be used to enhance the ranking.
- Added also a rating_i field as basis for later usage.
- enhanced the tokenization process.
2013-10-08 23:48:13 +02:00
orbiter
6efa7532d2 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-10-08 19:04:57 +02:00
orbiter
5f5a97bafc added the anchor text within web pages to the searchable entities of a
web page. This can be of benefit for the ranking if these fields are
used for boosts.
2013-10-08 18:41:07 +02:00
orbiter
705b3338ee list more fields available for search and for ranking boosts 2013-10-08 18:15:35 +02:00
sixcooler
d536092fe4 fix false filling of NAME_CACHE_MISS-DNS-Cache in case of a timeout,
e.g. caused by massive requests when crawling from a file
2013-10-08 18:02:42 +02:00
Michael Peter Christen
78e7aadb26 removed unused initialization method 2013-10-07 23:51:28 +02:00
Michael Peter Christen
4fbc4740df removed warnings 2013-10-07 23:41:50 +02:00
Michael Peter Christen
21aa6a0321 migration to Solr 4.5.0 2013-10-07 17:09:40 +02:00
Michael Peter Christen
ef31d0f279 fix for rss reader, see http://bugs.yacy.net/view.php?id=294 2013-10-07 12:59:54 +02:00
Michael Peter Christen
101a6e6e14 Patch the citation index for links with canonical tags.
This shall fulfill the following requirement:
If a document A links to B and B contains a 'canonical C', then the
citation rank computation shall consider that A links to C and B does
not link to C.
To do so, we first must collect all canonical links, find all references
to them, get the anchor list of the documents and patch the citation
reference of these links.
2013-10-07 11:15:58 +02:00
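Note on 101a6e6e14: the requirement stated in this message can be sketched as follows. This is an illustration under assumptions, not the YaCy implementation: the link graph is modeled as a dict from source document to a set of targets, canonical tags as a dict from document to its canonical URL, and the name patch_citations is hypothetical.

```python
def patch_citations(links, canonical):
    """Rewrite a link graph so that citations point at canonical documents.

    links:     dict mapping each source document to the set of documents it links to.
    canonical: dict mapping a document to the canonical document it declares.
    Returns a new dict; the inputs are not modified.
    """
    patched = {}
    for source, targets in links.items():
        new_targets = set()
        for target in targets:
            # "B does not link to C": drop a document's link to its own canonical.
            if canonical.get(source) == target:
                continue
            # "A links to C": redirect a link to B onto B's canonical, if B has one.
            new_targets.add(canonical.get(target, target))
        patched[source] = new_targets
    return patched
```

With links A -> B and B -> C and a canonical entry B -> C, the patched graph has A linking to C and B linking to nothing.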
reger
daebeb93aa add call to AccessTracker to jetty security handler 2013-10-04 01:16:17 +02:00
reger
172aefaeeb adjust YaCySecurityHandler to Jetty 9 conventions
- mainly adjust prepareConstraintInfo to use the RoleInfo.setChecked as in Jetty Source distribution
- use constraint check behavior as in ConstraintSecurityHandler
  see http://git.eclipse.org/c/jetty/org.eclipse.jetty.project.git/tree/jetty-security/src/main/java/org/eclipse/jetty/security/ConstraintSecurityHandler.java?id=jetty-9.0.5.v20130813
2013-10-03 19:38:03 +02:00
reger
6f9ed439d3 - expand localHostName check of AbstractRemoteHandler
to prevent the request from being handled as a proxy request
- make domain handler not rely on included path in resolved .yacy address
2013-10-01 03:04:32 +02:00
reger
561ea135af fix: forgot to add security handler 2013-09-30 04:35:17 +02:00
reger
c7c706fd9f merge with rc1/master 2013-09-30 03:46:39 +02:00
reger
272b196d05 update Jetty server init() to activate yacy-domain and transparent proxy handler
- adding domain & proxy handler to a context (as it was in initial design)
     (context required for dispatcher)
- make handler context and servlet context parallel available 
     (to allow use of YaCyDefaultServlet to handle legacyServlets)
- set transparent proxy request handled after dispatch.forward to skip further handling for .yacy domain requests
2013-09-30 03:12:52 +02:00
reger
fd119deb00 fix NPE on modified-since check (Response.requestHeader allowed to be null) 2013-09-30 02:50:53 +02:00
reger
66145a0410 - add welcome file (index.html) support to YaCyDefaultServlet
- change SolrServlet default search field (&df) to text_t
2013-09-29 03:34:00 +02:00
Michael Peter Christen
b28d43decc added two more fields source_cr_host_norm_i,target_cr_host_norm_i in
webgraph and an addition to postprocessing to copy all cr ranking
attributes to the link edges associated to the postprocessing documents
2013-09-27 16:57:05 +02:00
Michael Peter Christen
a52f3a597e fix for canonical-from-http-header feature 2013-09-27 15:09:04 +02:00
Michael Peter Christen
2dd7c5be44 added parsing of http-canonical tags (untested, could not find an
example page)
2013-09-27 13:17:50 +02:00
Michael Peter Christen
4476dea5ba do not fail if a wrong boost key is used; instead, print only a warning
See also: http://bugs.yacy.net/view.php?id=293
2013-09-27 12:28:09 +02:00
reger
ab9583d429 add default field (&df) to SolrServlet query if missing 2013-09-26 22:20:35 +02:00
Michael Peter Christen
3bf0104199 fix for crawl domain counter limitation (limit was reached too early) 2013-09-26 13:41:52 +02:00
Michael Peter Christen
82bfd9e00a - crawl profiles shall be deleted from active and passive stacks if they
are deleted to terminate the crawl because otherwise the crawl will go
on after the load-from-passive stack policy.
- better check if a crawl is terminated using the loader queue.
2013-09-26 10:22:31 +02:00
Michael Peter Christen
1b3d26dd23 hack to remove most of the warning: deprecated messages (but not all,
one is left)
2013-09-25 21:14:52 +02:00
Michael Peter Christen
a496313248 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-09-25 20:41:02 +02:00
sixcooler
3c48fc65fd reverted RemoteInstance to deprecated methods of httpClient-4.2
this should work with current remote-Solr-Instances
2013-09-25 18:45:16 +02:00
Michael Peter Christen
91a875dff5 self-healing of mistakenly deactivated crawl profiles. This fixes a bug
which can happen in rare cases when a crawl start and a cleanup process
happen at the same time.
2013-09-25 18:27:54 +02:00
Michael Peter Christen
095053a9b4 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-09-25 17:32:52 +02:00