reger
f111f30ace
Merge origin/master into jetty
2013-11-17 00:18:25 +01:00
Michael Peter Christen
f4172cbb3d
fix for another XSS bug
2013-11-17 00:17:25 +01:00
reger
94293176a3
use writeOptionHeaders with ServletResponse parameter only
2013-11-17 00:02:08 +01:00
orbiter
ff86cb683f
fixed some XSS bugs reported by Marius from http://ctf365.com/
2013-11-16 20:34:31 +01:00
orbiter
da33ee0d77
also extended timeout for webgraph postprocessing
2013-11-16 18:30:06 +01:00
orbiter
74f9e40747
extended timeout during postprocessing of 30 minutes.
2013-11-16 18:29:08 +01:00
orbiter
19a051bec8
more monitoring for postprocessing and enhanced layout in Crawler
...
monitor page
2013-11-16 18:23:14 +01:00
Michael Peter Christen
9cf9727685
fix for wrong counter
2013-11-16 11:33:35 +01:00
Michael Peter Christen
fceac8cffd
more monitoring for postprocessing
2013-11-16 08:23:42 +01:00
Michael Peter Christen
6842783761
fixed and enhanced postprocessing
2013-11-16 08:23:21 +01:00
Michael Peter Christen
219d5934a4
fixed termination bug in Solr Connector
2013-11-16 08:22:29 +01:00
Michael Peter Christen
bf1bdd52a6
prevent requesting of 0-facets (which actually exist)
2013-11-15 15:41:41 +01:00
Michael Peter Christen
9d5895f643
enhanced and fixed postprocessing
2013-11-15 15:41:12 +01:00
Michael Peter Christen
f86fe90eda
enhanced mass storage speed to remote solr servers
2013-11-15 15:40:07 +01:00
Michael Peter Christen
6ed9821209
fixed several problems in solr connectors
2013-11-15 15:39:35 +01:00
Michael Peter Christen
191fd3d7e7
added an optimization option to HandleSet mass data storage structure
2013-11-15 15:38:00 +01:00
Michael Peter Christen
94b565ea0d
fixed keepalive min value
2013-11-15 15:37:01 +01:00
Michael Peter Christen
5ec5be5769
fixed logging for remote solr configuration
2013-11-15 15:36:24 +01:00
reger
b26787dc2d
- DefaultServlet: remove static gzip option
...
YaCy doesn't use pre-gzip'ed static html pages
- ProxyServlet: remove unneeded procedure
- Server init: skip one overlapping servlet context
2013-11-14 01:37:51 +01:00
Michael Peter Christen
24a052ecb9
removed debug code for existsByIds
2013-11-13 13:41:18 +01:00
Michael Peter Christen
087df05e24
added option to Config_Network_p.html to enable remote search while
...
DHT-Receive is switched off.
2013-11-13 13:38:01 +01:00
Michael Peter Christen
1a4a69c226
set more loggers to 'final static'
2013-11-13 06:18:48 +01:00
Michael Peter Christen
c60947360d
logger should be static
2013-11-13 06:04:28 +01:00
Michael Peter Christen
69b8d61c47
fix for search requests in GSA interface which contain 'funny'
...
characters (like ':' etc.)
2013-11-12 15:54:54 +01:00
orbiter
b085cb522b
replaced old existsByIds for embedded Solr with obviously much faster
...
new selection method (including still existing debug code to test that
this is in fact better)
2013-11-11 11:25:01 +01:00
reger
1a6158e338
make test directory available in Maven pom
...
- exclude reference to old slf4j-log4j12
2013-11-10 22:20:35 +01:00
reger
b4fdb8c887
cleanup test directory from Jetty 9 implementation samples
...
- the current Jetty implementation has advanced so far that keeping the code no longer seems beneficial,
as it makes the tests unusable and use of Jetty 9 is not in sight due to its Java 1.7 dependency.
2013-11-10 22:01:31 +01:00
reger
b29d262e70
implement Jetty8HttpServerImpl.generateSocketAddress
...
(code 1:1 copied from serverCore)
2013-11-10 18:59:18 +01:00
orbiter
4234b0ed6c
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2013-11-10 18:50:43 +01:00
orbiter
909bbb49d8
added (partly commented) test code for url rewrite methods .. to be
...
completed
2013-11-10 18:50:34 +01:00
orbiter
74c86a72a0
better default value for crawler user agent
2013-11-10 18:48:00 +01:00
reger
066a1ecf0a
add highlight queryparams to solrservlet if missing
...
- modify query params in Solr parameter map (instead of querystring)
2013-11-10 01:36:57 +01:00
Michael Peter Christen
899e7e92b0
added debug code
2013-11-09 02:37:12 +01:00
Michael Peter Christen
a5c1249ee2
reverted autowarming setting in solrconfig
2013-11-09 01:43:44 +01:00
reger
4684330505
Merge origin/master into jetty
...
Conflicts:
source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java
2013-11-07 21:44:14 +01:00
reger
1437c45383
merge rc1/master
2013-11-07 21:30:17 +01:00
Michael Peter Christen
87a956e881
calculating and showing the number of files and the average size of a
...
file in the HTCACHE in ConfigHTCache_p.html
2013-11-07 12:13:12 +01:00
Michael Peter Christen
acc1f8a749
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
2013-11-07 12:01:37 +01:00
Michael Peter Christen
81d9e23532
fixed another memory leak in the PDF parser:
...
the class org.apache.pdfbox.pdmodel.font.PDFont occupies 8MB of space
which cannot be cleaned if PDFont.clearResources is called.
The attempt to clean the class cache therefore causes the class to be
loaded and this cache to be initialized with some rubbish. I tried to
prevent instantiation of this class by using a hacked findLoadedClass
call to the SystemClassLoader (which is protected ...).
Now, without using the PDF parser at all, 8MB of RAM space is not
occupied; however, when the first PDF arrives this space will be taken
and never given back to GC.
WAKE UP YOU LAZY PDFBOX HACKER AND FIX THIS SHIT!
2013-11-07 11:57:01 +01:00
Michael Peter Christen
c152d996e6
reduced footprint of BookmarksDB which can take quite a lot of memory if
...
the number of bookmarks is high (i.e. > 2000 URLs)
2013-11-07 10:55:02 +01:00
Michael Peter Christen
81bb50118e
found and fixed a huge memory leak in solr caching (inside Solr). The
...
not-flushed Solr cache is now handled in this way:
- it is smaller by default
- a Solr-internal process is started to flush the cache periodically
(this does NOT clean the cache, just removes old objects)
- a Solr-external process (the standard YaCy cleanup-process) now has
direct access to the Solr-internal cache and flushes it completely.
The time frame for such a flush is defined by the cleanup-process
frequency, by default 10 minutes.
2013-11-07 10:01:44 +01:00
reger
7b17cdf6dd
add content_type:image/* to image search
...
- there are numerous index entries with content_type image but without url_file_ext_s (for various reasons) which should be included in the result
- try it yourself with following sample query
/solr/select?q=content_type:image/* AND -url_file_ext_s:[* TO *]&defType=edismax&fl=sku,url_file_ext_s,content_type
This also addresses possible URLs without an extension or with a deviating one.
2013-11-07 03:11:03 +01:00
reger
082c9a98c1
move writeHeaders from Jetty8 servlet to YaCyDefaultServlet
...
- after removing Jetty server dependency (of Response using HttpServletResponse only)
2013-11-07 00:32:21 +01:00
sixcooler
987f410011
URL-export: add query and fix for class-cast exception
2013-11-06 19:22:26 +01:00
Michael Peter Christen
ffe8276063
replaced referrer link masking with 'pure' links to the referring page
...
(that was more useful during testing)
2013-11-06 18:05:46 +01:00
Michael Peter Christen
a8253ca49c
added missing unicode transformation in href link contents during
...
parsing
2013-11-06 18:05:02 +01:00
Michael Peter Christen
0cf9e9580b
added clickdepth and CR computation debug code to verify that the
...
process is complete
2013-11-06 15:01:40 +01:00
Michael Peter Christen
7f768b42d3
we do not need the load-image flag any more since this is now controlled
...
by parser switches
2013-11-06 15:00:57 +01:00
reger
b85f702f22
add AccessTracker logging to SolrServlet
2013-11-05 22:57:55 +01:00
reger
de1f02420b
implement HtmlResponseWriter in solrServlet (and rss / opensearch responsewriter) as in yacy select servlet.
...
- set contenttype of HTML/GrepHTML-Responsewriter to "text/html"
- set a contenttype to GSAsearchServlet
2013-11-04 21:11:12 +01:00