Commit Graph

2205 Commits

Author SHA1 Message Date
reger
1437c45383 merge rc1/master 2013-11-07 21:30:17 +01:00
reger
082c9a98c1 move writeHeaders from Jetty8 servlet to YaCyDefaultServlet
- after removing Jetty server dependency (of Response using HttpServletResponse only)
2013-11-07 00:32:21 +01:00
reger
b85f702f22 add AccessTracker logging to SolrServlet 2013-11-05 22:57:55 +01:00
reger
de1f02420b implement HtmlResponseWriter to solrServlet (and rss / opensearch responswriter) as in yacy select servlet.
- set contenttype of HTLM/GrepHTML-Reponsewriter to "text/html"
- set a contenttype to GSAsearchServlet
2013-11-04 21:11:12 +01:00
reger
bfdb404867 implement a Jetty reconnect to work with Configbasic_p.html port change
- instead of shutting down the server it should be sufficient to manipulate the Jetty http connector
2013-11-03 21:34:21 +01:00
reger
d6760df3e5 fix servlet class exist check to use default path only (in Jetty8YaCyDefaultServlet)
- del redundant doget code in yacydefaultservlet
   - small declaration code opts
- del obsolete libt/proxyservlet.java
2013-11-03 02:26:00 +01:00
reger
b38de92a16 Merge origin/master into jetty 2013-11-02 00:48:42 +01:00
Michael Peter Christen
cc39667399 Speed enhancements and less CPU usage during Solr searches when using
the embedded Solr (the default). This was obtained by cirumventing solrj
search encapsulation and the implementation of direct index access
methods to Solr.
The effect will not only be seen during search, but this has also a
strong effect on suggestions (much more) and less CPU power usage during
index distribution (which needs many search requests)
2013-11-01 17:24:36 +01:00
Michael Peter Christen
434e13b46d in host browser also show the properties of failed documents including
referrer urls (this is a VERY USEFUL SEO and Web Admin feature!!)
2013-11-01 13:30:53 +01:00
reger
6944225037 - add GSA search /gsa/search servlet for Jetty to Server init
- include SecurityHandler check for /gsa/ /solr/ 
- change one more YaCyDefaultServlet dependency from Jetty to std. javax.Servlet
2013-10-30 23:11:36 +01:00
reger
53cb30a221 reduce logging (by assigning logger to existing logger)
- small additional cleanups
2013-10-30 00:51:04 +01:00
reger
332c6d4fe1 reactivate Domain handler for .yacy / .yacyh handling 2013-10-27 19:15:20 +01:00
reger
b1ce70434e resolve merge conflict
- add missing import statement
2013-10-27 15:24:04 +01:00
reger
7869a4c070 Merge origin/master into jetty
- merge conflict resolve
2013-10-27 15:12:17 +01:00
reger
f017066197 Merge origin/master into jetty 2013-10-27 15:09:24 +01:00
reger
06da6f517c add YaCyProxyServlet to handle /proxy.html?url=proxyurl
- based on Jetty ProxyServlet
- at this time use existing HTTPD ProxyHandler  for url rewrite
- add jetty-client jar (dependency in Jetty ProxyServlet)

reuse ProxyHandler.convertHeaderFromJetty in YaCyDefaultServlet
2013-10-27 05:04:24 +01:00
reger
69599566f9 catch one more malformed url in proxy url rewrite 2013-10-27 04:42:33 +01:00
reger
605530fec5 catch proxy url rewrite exception
malformed url (" http:\/\/" ) may cause error response
 testcase http://localhost:8090/proxy.html?url=http://dictionary.reference.com/browse/test
2013-10-27 04:06:11 +01:00
Michael Peter Christen
9bb7eab389 hacks to prevent storage of data longer than necessary during search and
some speed enhancements. This should reduce the memory usage during
heavy-load search a bit.
2013-10-25 15:05:30 +02:00
orbiter
3c3cb78555 - removed a lot of garbage and bloated code from GuiHandler.
- transformed log lines to String before they are stored because the
storage space is about 1:250 (45kb for one line before transformation,
180 bytes afterwards)
- this saves up to 10MB RAM so we can increase the number of lines to
1000 again.
2013-10-24 20:42:34 +02:00
Michael Peter Christen
5afa6e3aee Automatically flush the log cache if a short memory status is reached.
For the default of 200 lines this can flush about 10MB.
2013-10-24 17:39:50 +02:00
Michael Peter Christen
030d0776ff Enhanced crawl start for very, very large crawl lists (i.e. > 5000)
which had a problem because of badly used concurrency.
This fix also caused a redesign of the whole host deletion process.
This should fix bug http://bugs.yacy.net/view.php?id=250
2013-10-24 16:20:20 +02:00
Michael Peter Christen
6aabc4e5c8 reduced logging line memory, 10000 lines had filled up 450MB! grrr.
(thank you, a bomb from the past)
2013-10-24 16:17:53 +02:00
Michael Peter Christen
1a8783147b enhanced computation of number of solr documents. 2013-10-24 15:48:05 +02:00
Michael Peter Christen
4948c39e48 added concurrency for mass crawl check 2013-10-23 11:27:19 +02:00
Michael Peter Christen
1b4fa2947d - fixed a problem which ocurred when a document was not recognized with
the right content domain (i.e. identifying that it is an image, text
etc.) because it used the file extension and not an existing mime type
assignment.
- fixed the new setting that images shall be loaded for a better image
search.
- both fixes together makes it now possible to crawl
commons.wikimedia.org which makes use of 'funny' document names (i.e.
ending with .jpg while the document is html)
2013-10-23 00:16:54 +02:00
Michael Peter Christen
82621bead0 When doing bootstraping, always accept one seedlist-File without
checking the date of the file. This should help to start the peer in
case that the user has a completely wrong date setting.
2013-10-22 15:34:51 +02:00
Michael Peter Christen
691d7e70fa added hint to development/commit rss feed 2013-10-21 15:16:29 +02:00
orbiter
20bbde8665 fix for mustmatch regex computation: result had correct semantic, but
may have contained multiple same expressions within the disjunction of
domain-restrictions. This fix removes the redundant restrictions and
makes the regex shorter.
2013-10-18 13:55:37 +02:00
reger
cb2dbcb843 add graceful Jetty shutdown option
- as Jetty stop is not synced, yet
- include jetty jars and servlet-3.0 api jar  in Eclipse .classpath
2013-10-18 00:42:38 +02:00
reger
f46c723398 allow to choose used http server, YaCy-Anomic or Jetty
- defaults to Jetty (in this branch)
- add server version info & config option -> Admin Console -> Advanced Settings -> Http Networking
2013-10-17 03:34:22 +02:00
reger
da4ff5aefa add YaCy HttpCommand "authenticate" check to DefaultServlet 2013-10-17 00:06:17 +02:00
Michael Peter Christen
c833d02cf5 fixed webgraph postprocessing (did nothing and repeated to do this...) 2013-10-16 11:49:04 +02:00
Michael Peter Christen
74d0256e93 enhanced postprocessing: fixed bugs, enable proper postprocessing also
without the harvestingkey, remove crawl profiles after postprocessing,
speed-up for clickdepth computation.
2013-10-16 11:27:06 +02:00
reger
1adb4b8741 merge rc1/master 2013-10-16 03:02:21 +02:00
reger
77a73c7475 add YaCy HttpCommand "location" check to DefaultServlet 2013-10-16 01:48:44 +02:00
Michael Peter Christen
7b69c438f7 more methods for the table class 2013-10-15 16:46:59 +02:00
Michael Peter Christen
820b896146 Replaced the inframe loading from yacy.net for donations with the
loading of this iframe from the local host. To make this more flexible,
this iframe is loaded once after startup from yacy.net.
2013-10-15 16:46:06 +02:00
reger
cc223b14a4 remove wrong content mod in SSI parser for virtual path /currentyacypeer/
(is handled on start of request handling)
2013-10-15 03:25:24 +02:00
reger
5606291574 fix last commit (not needed test of GZipInputStream) 2013-10-14 04:29:34 +02:00
reger
f9eed8cb44 add support for gzip encoded multipart forms (needed for transferRWI.html)
- quick and dirty reuse of existing HTTPDemon implementation
2013-10-14 04:18:52 +02:00
reger
cf32a92629 - add size check to multipart form data handling of YaCyDefaultServlet (same as in HTTPDemon.parseMultipart)
- reduce Jetty logging 
- give build.run a bit more memory (set to YaCy.default 600m from 512m)
2013-10-13 20:56:03 +02:00
reger
705f147820 - add localpeername.yacy to list of local address detection for AbstractRemoteHandler
- use proxy via header info as in legacy proxy handler
2013-10-13 18:06:42 +02:00
reger
0d4efabaa8 fix YaCy version string in proxy headers
(config parameter vString not longer used)
2013-10-13 17:56:53 +02:00
reger
2226189743 disable domainhandler due to error
- domainhandler causes closed response output stream in following handlers 
  on addresses resolved to local peer (like in hello protocoll preventing peer to switch to senior peer)
2013-10-13 07:24:33 +02:00
reger
eea504c117 update Info.plist
small DefaultServlet refactoring
2013-10-12 23:01:14 +02:00
reger
a44eede8b8 merge rc1/master 2013-10-11 01:50:25 +02:00
sixcooler
d9a02ed277 NPE fix for my last commit 2013-10-11 00:44:04 +02:00
reger
54a0272338 searchpage javascript (latestinfo) causes reset of search statistic after moving to next page
- disabled call via setTimeout in yacysearch.html
2013-10-10 23:23:58 +02:00
sixcooler
61f627eb85 fix for ssl-connections from proxy-usage staying in close-wait-state
+ some extra 'close' in HttpClient
2013-10-10 20:57:37 +02:00