Commit Graph

2427 Commits

Author SHA1 Message Date
reger
6e2fe777af simulate Authorization cookie for yacy servlet header 2014-01-10 19:31:36 +01:00
reger
ea7cef5d05 fix NPE in TemplateEngine
StackTrace For input string: ""
java.lang.NumberFormatException: For input string: ""
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Integer.parseInt(Integer.java:504)
	at java.lang.Integer.parseInt(Integer.java:527)
	at net.yacy.server.http.TemplateEngine.writeTemplate(TemplateEngine.java:241)
	at net.yacy.server.http.TemplateEngine.writeTemplate(TemplateEngine.java:199)
	at net.yacy.http.servlets.YaCyDefaultServlet.handleTemplate(YaCyDefaultServlet.java:896)
2014-01-10 18:11:32 +01:00
reger
cb6d0c2113 implementing YaCy legacy role names
- taking out customized SecurityHandler code as the original/default seems to just work fine
- with this individual sec. constraints can be applied via web.xml (using legacy role names)
2014-01-10 14:07:49 +01:00
reger
f09dbbef96 make SecurityHandler webappcontext ready 2014-01-10 12:36:42 +01:00
reger
37f2a82a5d making root context (htroot) a WebAppContext
- this allows additional features, like servlet configuration via web.xml and many more things.
- currently the standard servlets are still configured in the code (so the supplied defaults/web.xml is not realy needed, yet),
  but could be expanded
- lookup for web.xml - 1. in /DATA/SETTINGS then in /defaults
2014-01-10 10:42:47 +01:00
reger
28eae57e8b spend CrawlQueues a fremem routine
- clears errorStack
- will not get hit often (but better little than nothing on low mem)
2014-01-10 10:24:33 +01:00
reger
b931bf6b48 fix use of url proxy access pattern
pattern of transparent was used.
2014-01-08 08:12:56 +01:00
reger
280c4a3ac1 exclude terms with " for didYouMean suggestion
causes Solr error (and wordindex likely finds suggestion)

org.apache.solr.core.SolrCore org.apache.solr.common.SolrException: org.apache.solr.search.SyntaxError: Cannot parse 'text_t:""d"': Lexical error at line 1, column 12.  Encountered: <EOF> after : ""
	at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:171)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:187)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
	at net.yacy.cora.federate.solr.connector.EmbeddedSolrConnector.query(EmbeddedSolrConnector.java:179)
	at net.yacy.cora.federate.solr.connector.EmbeddedSolrConnector$DocListSearcher.<init>(EmbeddedSolrConnector.java:345)
	at net.yacy.cora.federate.solr.connector.EmbeddedSolrConnector.getCountByQuery(EmbeddedSolrConnector.java:364)
	at net.yacy.cora.federate.solr.connector.MirrorSolrConnector.getCountByQuery(MirrorSolrConnector.java:326)
	at net.yacy.cora.federate.solr.connector.ConcurrentUpdateSolrConnector.getCountByQuery(ConcurrentUpdateSolrConnector.java:440)
	at net.yacy.search.index.Segment.getWordCountGuess(Segment.java:464)
	at net.yacy.data.DidYouMean.getSuggestions(DidYouMean.java:181)
	at suggest.respond(suggest.java:73)
2014-01-08 04:46:21 +01:00
reger
fbc1071f6d Merge origin/master 2014-01-07 22:48:45 +01:00
reger
7b800a0c8e fix: NPE on shutdown via script 2014-01-07 22:44:24 +01:00
Michael Peter Christen
ce4d42d77c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2014-01-07 21:52:38 +01:00
Michael Peter Christen
644573cfc4 using the adminAccountUserName from yacy.conf within apicall.sh 2014-01-07 21:52:19 +01:00
reger
6932aa4d7a use configured admin-username for api calls
- the admin user name can be configured, in apiExec calls the default "admin" username is used. 

TODO: the bin/apicall.sh script should likely take that into account.
2014-01-07 21:26:50 +01:00
orbiter
2ead4e44d9 introduced a new storage path ARCHIVE inside of DATA which will be used
as path for solr index dumps (instead of the SEGMENTS path). This will
make a maintenance of index backups easier. It will also provide a tool
to migrate from an freeworld index to a webportal index.
2014-01-07 17:53:49 +01:00
sixcooler
add0e42804 fix double-escaped urls from proxy-usage 2014-01-07 01:04:33 +01:00
sixcooler
865ce6f974 check blacklist proxyClient config 2014-01-07 01:01:55 +01:00
sixcooler
345f9aba27 make use of our DNS-cache again - this realy speeds up the lookup 2014-01-07 00:18:01 +01:00
reger
e6d284fe1e better solution for prev. commit with MultiMapSolrParams.getFieldInt not returning default parameter 2014-01-06 18:19:54 +01:00
reger
0bc2fc14ab improve NPE chance on missing parameters
java.lang.NullPointerException
	at net.yacy.http.servlets.SolrServlet.service(SolrServlet.java:145)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:501)
2014-01-06 17:52:21 +01:00
reger
f06cef5d5b reimplement proxy access by configured whitlist pattern
was currently limited to own ip.
2014-01-06 15:00:14 +01:00
reger
05d6cc6ea3 setting of IPv4Stack moved earlier
it seems even better to call system.setproperty before isrunning check
(if nothing helps we have to set it in startup script)
2014-01-06 11:28:05 +01:00
reger
30d925a96e reimplemented server access restriction
via Jetty IPAccessHandler to allow only configured IP's to access.
Handler is only loaded if a restriction is configured.

Since IPAcessHandler (Jetty 8) does not support IPv6 system property java.net.preferIPv4Stack=true
Testing showed system.setProperty seems to be sensitive to point of calling (earliest possible time seems to be best = early in yacy.main).
Moved the "isrunning..." just open browser check also to the new routine to preread the yacy.config only once.
2014-01-06 07:00:16 +01:00
orbiter
3cb6c7861f fixed shutdown authenticaton problem 2014-01-06 01:48:54 +01:00
Michael Peter Christen
ed06b5b94b set a realm message to log-in input window which explains that a
password for the account 'admin' can be (re-)set with the script
bin/passwd.sh
2014-01-05 17:43:34 +01:00
Michael Peter Christen
7005ecdabd cleanup 2014-01-05 15:06:40 +01:00
Michael Peter Christen
2939b47986 removed non-working realm setting in http client (auth for localhost was
added in previous commit)
2014-01-05 15:04:18 +01:00
orbiter
9d52b337f3 added http authentification to YaCy http client for all localhost
acesses to enable self-steering of the peer using the API table. This is
necessary in case that an password for the administration pages is set.
2014-01-05 14:46:11 +01:00
Michael Peter Christen
c951945666 modified log-in detail to enable admin-login from localhost with stored
hash even if localhost access is disabled. This is urgently needed for
the apicall.sh script since that is used for high-availability set-up
(checkalive and indexdump for index mirroring)
2014-01-05 11:50:23 +01:00
Michael Peter Christen
9bd71fdbb4 made the access tracker class static because it shall be used by the
jetty auth module
2014-01-05 05:04:28 +01:00
Michael Peter Christen
1c56befb93 fixed mess with test on localhost (which means local hosts for some
cases)
2014-01-05 04:55:30 +01:00
Michael Peter Christen
7d6fc79eb8 refactoring (usage of constant names for attributes of authentication
check)
2014-01-05 04:23:44 +01:00
Michael Peter Christen
b9d36e45e0 removed the &amp explicit encoding of ampersand character since this is
double-translated within the template replacement process.
2014-01-05 03:40:10 +01:00
reger
e2ccb6ce9d modified DefaultServlet parameter on invoke templates
call response with post=0 (if post empty) simulating previous behavior.

(template servlets typically test for post==null,
found one more Crawler.p.java were empty post caused problem,
= defaults not correctly set)
2014-01-04 20:49:26 +01:00
reger
4c38bceafc handle http connect for proxy
refactor header cleanup (reuse existing code)
2014-01-04 13:09:34 +01:00
reger
cfabe8f67a harmonize access restriction for urlproxy servlet
with proxy handler, what is currently
- use switched on in config
- access from a local IP / hostname

fix shutdown exception for crashprotection handler on interrupted connections.
2014-01-03 12:28:40 +01:00
reger
e6b9643fd6 extended request for local peer check to by hostname resolved ip
the current islocal() check did not detect a domain.com address as request for the local peer.
2014-01-03 01:13:56 +01:00
reger
c797f108a1 add error response on deniedl proxy access
send http 403 response
2014-01-02 09:11:08 +01:00
reger
0583f44306 reimplement proxy access log (to Jetty ProxyHandler)
- using existing HTTPDProxyHandler logger
- allow local loopback ip to access proxy
2014-01-02 03:37:33 +01:00
reger
8cbc1c970a Security Hot-Fix: for transparent proxy. 2014-01-01 20:48:35 +01:00
reger
58ecf5e4dd add to blacklist button in CrawlResults
http://bugs.yacy.net/view.php?id=220
introduced Blacklist.add with sourcefile only parameter
2014-01-01 11:01:22 +01:00
reger
e9081c0f17 moved startup execAPIActions call after Jetty startup
execAPIActions require http to be up. The 10s sleep was sufficient to allow Jetty to start, 
but it's more robust to place the call after http is assigned to switchboard/serverSwitch.
2014-01-01 10:28:49 +01:00
reger
19c1a7a5ca change SolrServlet from Filter to Servlet
(as no multicore required)
this allows to simplify context/servlet initialization in Jetty init.
2014-01-01 10:20:32 +01:00
reger
14c977dd26 fix NPE GSAresponseWriter on query=null
java.lang.NullPointerException
	at net.yacy.cora.federate.solr.responsewriter.GSAResponseWriter.highlight(GSAResponseWriter.java:328)
	at net.yacy.cora.federate.solr.responsewriter.GSAResponseWriter.write(GSAResponseWriter.java:263)
	at net.yacy.http.servlets.SolrServlet.service(SolrServlet.java:235)
2013-12-31 23:01:41 +01:00
orbiter
c3dee2d6bd added security patch 2013-12-31 15:25:44 +01:00
orbiter
dcf46ce8f6 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-12-31 15:20:49 +01:00
orbiter
343d2ef49a new data type for access tracker (unfinished) 2013-12-31 15:20:34 +01:00
reger
dd8ea0cdd6 fix "add to blacklist" button style in IndexControlRWIs_p
- added default filename filter to select field (as only addition to *.black list is permanent)

- modified Blacklist_p header/legend to show all active blacklists 
  (to support understanding that all configured lists are active)
- removed obsolete code in Blacklist_p servlet
2013-12-30 20:03:59 +01:00
reger
abbf487023 fix QueryGoal Image query (missing space)
see query log example .. url_file_ext_s:(jpg OR png OR gif) ORcontent_type:(image/*)) ..
2013-12-29 20:14:10 +01:00
reger
26e9d7e066 fix NPE in IndexControlRWIs_p.html
- metatags my be null
Caused by: java.lang.NullPointerException
	at net.yacy.search.query.QueryParams.getFacets(QueryParams.java:445)
	at net.yacy.search.query.QueryParams.getBasicParams(QueryParams.java:400)
	at net.yacy.search.query.QueryParams.solrTextQuery(QueryParams.java:345)
	at net.yacy.search.query.QueryParams.solrQuery(QueryParams.java:334)
	at net.yacy.search.query.SearchEvent.<init>(SearchEvent.java:290)
	at net.yacy.search.query.SearchEventCache.getEvent(SearchEventCache.java:176)
	at IndexControlRWIs_p.genSearchresult(IndexControlRWIs_p.java:641)
	at IndexControlRWIs_p.respond(IndexControlRWIs_p.java:141)
2013-12-29 08:05:37 +01:00
reger
7f9b9315fe Merge origin/master 2013-12-29 02:05:07 +01:00
reger
8eaabb9600 remove dependency from old serverCore.java
- remaining getPortNr not needed 
  (as current release allows only to set plain integer as port,
   see ConfigBasic)
2013-12-29 02:00:44 +01:00
orbiter
2018e55f8b switched back on index deletion (was accidently off because new jetty
framework delivers never null to post arguments .. there may be more of
that kind of problems)
2013-12-29 01:39:30 +01:00
orbiter
3961b643a3 write solr searches to search log 2013-12-29 01:25:44 +01:00
orbiter
15882beb19 fix for strange NPE
java.lang.NullPointerException
        at
net.yacy.search.Switchboard.updateMySeed(Switchboard.java:3667)
        at net.yacy.peers.Network.peerPing(Network.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at
net.yacy.kelondro.workflow.InstantBusyThread.job(InstantBusyThread.java:107)
        at
net.yacy.kelondro.workflow.AbstractBusyThread.run(AbstractBusyThread.java:165)
2013-12-29 00:40:31 +01:00
orbiter
f3ac923a7e ftp client shall be able to open non-anonymous ftp servers if login
details are given
2013-12-28 22:42:02 +01:00
reger
3d913558ab display configured adminUserName in ConfigAccounts_p
- fix read default username in  in loginservice
2013-12-27 21:04:14 +01:00
reger
fbdd89e198 Merge origin/master 2013-12-27 06:53:14 +01:00
reger
65a2f3d5e7 tweak Jetty credentials to work with YaCy UserDB
- user entry in UserDB with admin right can login to access protected pages
- dto. admin user, choosen username is stored in conf (adminAccountUserName=)
2013-12-27 06:45:22 +01:00
Michael Peter Christen
ffdfe5fb9b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-12-27 03:06:38 +01:00
reger
7d6b34a89f Merge origin/master 2013-12-27 03:04:14 +01:00
reger
45e8750ba5 nasty quick fix for admin login with other username as admin
- userDB is not sync'ed with Jetty credentials as of now only the std. admin account can login

switched initial browser open with ssl active back to std. http port
2013-12-27 02:59:19 +01:00
Michael Peter Christen
ee17bd0b69 added option to attach remote solr servers in read-only mode 2013-12-27 02:55:21 +01:00
Michael Peter Christen
25f9c35033 add patch which shall prevent that naive search mistakes like usage of
regular expressions cause no results. Usage of '*' followed by a dot or
any expression will now cause that this expression is used as a filetype
search.
2013-12-27 00:34:55 +01:00
Michael Peter Christen
667a6adddb - use default files from yacy.init property "defaultFiles" if no
jetty-configuration is given for default files.
- fix a problem with default paths if no path is given (i.e.
http://localhost:8090 instead of http://localhost:8090/). Without this
patch the path was resolved automatically to http://localhost:8090//
2013-12-26 23:59:04 +01:00
Michael Peter Christen
77aeb288a2 suppress deprecation warning (for now); TODO: find alternatives 2013-12-26 23:26:21 +01:00
reger
fca7f1d043 run SSL/HTTPS port (8443) ping test in migration only if SSL/HTTPS is on
- see last commit
2013-12-25 05:33:00 +01:00
reger
71cac1a278 added SSL/HTTPS connector to support SSL/https connection on port 8443
!!! attention !!! to make sure YaCy can start, https will be disabled if port 8443 is used
   - added ping test for above to migration 

- as of now port for https is hardcoded to default 8443
- if not urgend required I'd leave it this way (it's standard) to use different ports for http and https 

- post https port on ConfigBasic.html (if active)
2013-12-25 05:20:13 +01:00
Michael Peter Christen
82c0525e71 wrong logger fix 2013-12-23 10:52:02 +01:00
Michael Peter Christen
e17624b6dd added html retrieval from alternative DATA/HTDOCS path 2013-12-23 02:06:33 +01:00
Michael Peter Christen
07cee6b99c removed more unused code 2013-12-23 01:51:48 +01:00
Michael Peter Christen
20b48f894f refactoring: moving all servlets to the same package (the solr servlet
is currently actually a filter which should be changed somehow)
2013-12-23 01:32:29 +01:00
Michael Peter Christen
84167adb49 removed unused anomichttpd code after migration to jetty 2013-12-23 01:23:40 +01:00
Michael Peter Christen
b461a27abb fixed the SolrServlet 2013-12-20 01:51:51 +01:00
Michael Peter Christen
7603e879dc Merge branch 'master' into HEAD
Conflicts:
	.classpath
	source/net/yacy/cora/federate/solr/SolrServlet.java
2013-12-20 01:19:06 +01:00
Michael Peter Christen
25250405f1 solr servlet preparation for join with jetty branch 2013-12-20 00:45:58 +01:00
Michael Peter Christen
2f16770681 migrated to solr 4.6.0 2013-12-19 21:51:05 +01:00
Michael Peter Christen
57f0f71ac6 added patch to allow binary response writer 2013-12-19 10:13:43 +01:00
orbiter
937273d4e3 added parsing of metadata to surrogate reading:
a dublin core record inside of surrogate input files may now contain
tokens within the namespace 'md' (short for: metadata). The token names
must be valid withing the namespace of the solr field names. All
md-tokens inside of surrogate files then overwrite values within solr
documents before they are written to the solr index. This makes it
possible to assign collection names to each surrogate entry and also
ranking information can be added. Please see the example file.
2013-12-17 14:02:27 +01:00
reger
18497f6475 remove unused init parameter from DefaultServlet
- remove "RelativeResourceBase" parameter
2013-12-15 23:39:19 +01:00
orbiter
4de3fefdb5 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-12-15 19:13:00 +01:00
orbiter
7e346e1d79 using stringbuilder in query construction 2013-12-15 19:12:49 +01:00
reger
c84c313fe1 Merge origin/master into jetty 2013-12-14 20:02:24 +01:00
Michael Peter Christen
2702d9e56b - added a SolrQueryResponse2SolrDocumentList method which is able to
work around the unfolding process in Solr's BinaryResponseWriter.
This was a huge performance bottleneck in the embedded solr connector
and the problem is actually on Solr side, but we have now a workaround.
- This made it possible to abstract a high-performance index access
method which is implemented as method getDocumentListByParams. That
method is also implemented in the SolrServerConnector and provides a
very efficient access to a solr index if the index is embedded.
- a popular use of the document list retrieval is a result count which
can now also make use of the new method, via getDocumentCountByParams.
- enhanced the Error cache which now does not store error documents
within the ram cache if the document is also written to solr. When
documents are retrieved from the cache, they are partly read from the
ram cache and if not existent there, from the Solr index.
2013-12-13 15:56:29 +01:00
Michael Peter Christen
74466d731a use pre-compiled patterns in ymark 2013-12-12 11:50:48 +01:00
Michael Peter Christen
34633044b4 made pattern computation static 2013-12-12 10:55:36 +01:00
Michael Peter Christen
ef7ddbc933 added date parser caches to prevent re-calculation of costly date
parsing
2013-12-12 10:55:12 +01:00
Michael Peter Christen
552ef9f18e fix for bad ErrorCache.exists test (bug from latest commit) 2013-12-12 10:38:32 +01:00
Michael Peter Christen
09412ea3a4 counting search requests in solr interface 2013-12-12 03:37:19 +01:00
Michael Peter Christen
303f5694ba avoid usage of existsByQuery. If a document can be loaded by the ID
before testing other fields from the existsByQuery request, then a
document cache fills and queries after that one can be avoided.
2013-12-12 03:36:30 +01:00
reger
b43bbd3cc4 join DefaultServlet and Jetty8 implementation
- removing Jetty 8 specific dependencies
2013-12-09 23:45:57 +01:00
reger
089c5007ee move conditionalHeader to DefaultServlet
- by removing Jetty specific implementation detail
2013-12-08 00:56:45 +01:00
Michael Peter Christen
79771c60c0 IPv6 fixes 2013-12-06 14:30:08 +01:00
reger
92d9c56f9f Merge origin/master into jetty 2013-12-05 22:53:29 +01:00
Michael Peter Christen
78eac85161 better calibration of caches and queue maximum sizes 2013-12-04 23:15:10 +01:00
Michael Peter Christen
c8af19bd37 removed unnecessary check which causes a NPE when searching with empty
search string
2013-12-04 17:58:36 +01:00
Michael Peter Christen
e3c2f09de9 - reduce computation in case that specific postprocessing fields are not
selected
- de-select citation rank computation
2013-12-04 17:48:12 +01:00
Michael Peter Christen
cfa08024c7 removed optimization bevore postprocessing because that may cause a
time-out which will cause that postprocessing fails.
2013-12-04 16:04:29 +01:00
Michael Peter Christen
6f3a923691 fixed urlmask which was not able to combine several constraints 2013-12-04 13:48:01 +01:00
Michael Peter Christen
9a27bf6e82 removed filter computation in Protocol class for remote searches because
that is already done in the QueryParams class
2013-12-04 13:09:15 +01:00
Michael Peter Christen
f1b5db2c45 - performance graph does not shop peer ping in memory monitor any more
- after a forced GC, the PerformanceMemory view switches to automatic
update by default
2013-12-04 12:59:30 +01:00