147 Commits

Author SHA1 Message Date
reger
7d0d19cb8e avoid File.deleteOnExit() on temp files
The JVM registers each such file in a list, regardless of whether it has
already been deleted, and never cleans up the list during runtime.
This accumulates to a considerable amount of memory during large crawls and/or
long uptime.
To tackle this, all temp files are now created in a subdir of java.io.tmpdir,
and the JVM tmpdir property is set to this subdir, which is deleted by
code on shutdown.
Additionally, let pdfParser use this tmp subdir too.
2015-11-17 22:27:07 +01:00
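
The temp-file handling described in commit 7d0d19cb8e could look roughly like the following. This is a minimal sketch, not the code from the commit: the class name `TempDirSketch`, both method names and the subdirectory name passed in are assumptions made for illustration. Note that the JDK's temp-file helpers may read java.io.tmpdir only once, so the property would have to be redirected very early during startup.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, not part of YaCy: keeps all temp files in one
// private directory that is removed in a single pass on shutdown,
// instead of registering every file with File.deleteOnExit().
public final class TempDirSketch {

    /** Creates a subdirectory of java.io.tmpdir and points the property at it. */
    public static Path redirectTmpDir(String subdirName) throws IOException {
        Path base = Paths.get(System.getProperty("java.io.tmpdir"));
        Path sub = Files.createDirectories(base.resolve(subdirName));
        System.setProperty("java.io.tmpdir", sub.toString());
        return sub;
    }

    /** Deletes the directory and everything below it; meant to run once on shutdown. */
    public static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path d, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(d); // directory is empty at this point
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

A caller would invoke `redirectTmpDir(...)` at the start of `main` and `deleteRecursively(...)` from its shutdown routine, so no per-file bookkeeping stays in memory during a long crawl.
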
reger
a60b1fb6c2 differentiate api call getLocalPort() from getConfigInt() 2015-10-31 23:09:03 +01:00
reger
2fb6ebe88a move the Java environment parameter that disables SNI (Server Name Indication) support for https connections from code to the startup script, allowing an admin to easily and transparently alter the YaCy default setting of FALSE.
Background: some users report problems connecting to/crawling some sites via https which require SNI support (switched off by default in YaCy). On the other hand, systems not demanding SNI support are sometimes not properly configured, and due to a bug/feature in Java 1.7 the connection is aborted. The latter is more often the case, so the default is still fine. With the Java start parameter, expert users can now alter the start parameter to -Djsse.enableSNIExtension=true (Java default) if they crawl more hosts requiring SNI support.
The alternative of letting YaCy try both during the https handshake (deep inside the httpclient) is not pursued at this time.
2015-07-29 23:30:05 +02:00
reger
c1dcc8c456 fix display and limit of max server connections after startup
(on restart value returned to default=50)
This has no effect on Jetty but the limit is still respected.
2015-03-29 07:12:23 +02:00
reger
3c818fc912 add a check of java version string >=1.7 to startup class
stopping start with error msg on version < 1.7
2014-11-16 01:26:07 +01:00
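
A startup check like the one described in commit 3c818fc912 might be sketched as follows. The class and method names are assumptions, and the parsing shown here is one plausible way to compare version strings, not necessarily what the startup class does. It accepts both the old "1.x" scheme and the newer scheme where the major number comes first.

```java
// Hypothetical sketch of a Java version gate at startup.
public final class JavaVersionCheckSketch {

    /** Extracts the major version: 7 from "1.7.0_80", 11 from "11.0.2". */
    public static int majorVersion(String version) {
        // Split on anything that is not a digit: "1.7.0_80" -> ["1", "7", "0", "80"].
        String[] parts = version.trim().split("[^0-9]+");
        if (parts.length == 0 || parts[0].isEmpty()) {
            throw new IllegalArgumentException("unparsable version: " + version);
        }
        int first = Integer.parseInt(parts[0]);
        // Up to Java 8 the scheme was "1.x"; later releases report the major number first.
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    /** True if the given version string denotes Java 1.7 or newer. */
    public static boolean isSupported(String version) {
        return majorVersion(version) >= 7;
    }

    public static void main(String[] args) {
        String v = System.getProperty("java.version");
        if (!isSupported(v)) {
            System.err.println("Java 1.7 or newer is required, found " + v);
            System.exit(1);
        }
        System.out.println("Java version OK: " + v);
    }
}
```
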
Michael Peter Christen
11074d8d24 fix for an SSL bug that appears only in Java 7.
The bug was reported in
http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5407&p=30956#p30956
a solution was described in
http://teknosrc.com/javax-net-ssl-sslprotocolexception-handshake-alert-unrecognized_name-solved/
which worked for this example given in the yacy forum
2014-10-17 13:25:17 +02:00
Marc Nause
1e6e69bc40 Finished implementation of UPNP:
*) will try other ports if YaCy standard ports are not available
*) distinguish between internal and external port (not sure if this
works 100%)

Still to add: property in config to enter own external port (in case of
manually configured NAT)
2014-10-07 13:10:06 +02:00
reger
b0c87d8240 fix image search expand box, cut-off of 2nd capture line height
tested with IE11 and Firefox 32 (change worked for both to show 2nd line without cutting off height)

+fix charset parameter in metadataImageParser
+update start errMsgTxt to "java 1.7"
2014-10-03 01:43:05 +02:00
Michael Peter Christen
6491270b3a large IPv6 redesign of peer ping methods!
removed preferred IPv4 in start options and added a new field IP6 in
peer seeds which will contain one or more IPv6 addresses. Now every peer
has one or more IP addresses assigned, even several IPv6 addresses are
possible. The peer-ping process must check all given and possible IP
addresses for a backping and return the one IP which was successful when
pinging the peer. The ping-ing peer must be able to recognize which of
the given IPs are available for outside access of the peer and store
this accordingly. If only one IPv6 address is available and no IPv4,
then the IPv6 is stored in the old IP field of the seed DNA.
Many methods in Seed.java are now marked as @deprecated because they had
been used for a single IP only. There is still a large construction site
left in YaCy now where all these deprecated methods must be replaced
with new method calls. The 'extra'-IPs, used by cluster assignment, have
been removed since that can be replaced with IPv6 usage in p2p clusters.
All clusters must now use IPv6 if they want intranet routing.
2014-09-30 14:53:52 +02:00
reger
8284ea751a catch TimeoutException during ping and do not delete yacy.conf during prereadconfigfile
found a situation after crash (reboot) with existing running semaphore but YaCy not running.
Ping generated exception which finally deleted the conf file (during pre-read procedure)
- change to ping (catch exception solved it)
- additionally removed delete yacy.conf file (if needed we need to make a backup)
2014-09-16 23:14:13 +02:00
Michael Peter Christen
759e7d9538 fix for http://forum.yacy-websuche.de/viewtopic.php?p=30720#p30720 2014-09-16 14:53:30 +02:00
Michael Peter Christen
ca8b2bf099 removed www and welcome servlet, these had been demo servlets and are
not needed any more
2014-09-15 12:48:58 +02:00
reger
e9060d31bd update to Jetty 9
besides adjustments in code it makes the servlet settings in web.xml significant.
This applies to solr, gsa and proxy servlet. There is no longer a default setup in code during init (as jetty 9 checks for double definition).
2014-05-11 01:53:11 +02:00
reger
af6ad20728 fix: remove obsolete ref to yacy.home
(use Switchboard instead)
2014-04-04 02:45:04 +02:00
Michael Peter Christen
8b44fcf0f4 added missing @Override annotation 2014-03-28 13:48:37 +01:00
reger
b126b9ba17 add some FileInputStream close calls at end of reads
to make sure file is released
2014-03-24 02:32:17 +01:00
reger
3b89176b9f use config value htroot in Jetty init (was hardcoded)
- move htroot exist check from old httpdfilehandler to startup, remove from filehandler and legacy proxyhandler
- use SwitchboardConstant.htroot where appropriate
2014-02-27 00:23:34 +01:00
reger
809e976578 remove unused java imports from yacy.java 2014-02-24 05:19:40 +01:00
reger
a9b06f8719 add a -config command line parameter e.g. -config "port=9090" "port.ssl=8043"
- useful for remote installation to set any config file property
- multiple parameters can be set at once; on Windows, enclose each parameter in double quotes
- special handling: "adminAccount=adminuser:adminpwd" sets the admin username and the MD5-encoded admin password

- adjusted windows startbatch to allow command line parameter handling
- remove not needed classpath calculation from startYACY_debug.bat
2014-02-24 05:16:31 +01:00
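
The key=value handling behind the -config parameter of commit a9b06f8719 can be illustrated with a small parser. This is only a sketch under assumptions: the class name `ConfigArgsSketch` and its `parse` method are hypothetical, and the special adminAccount handling is left out on purpose because the commit message does not spell out how the password is encoded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: collects every key=value argument that follows "-config".
public final class ConfigArgsSketch {

    public static Map<String, String> parse(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        boolean inConfig = false;
        for (String arg : args) {
            if (arg.equals("-config")) {
                inConfig = true;
                continue;
            }
            if (!inConfig) {
                continue; // ignore anything before -config
            }
            int eq = arg.indexOf('=');
            if (eq <= 0) {
                continue; // no key or no '=': skip malformed entries
            }
            // Split at the first '=' only, so values may contain '=' themselves.
            props.put(arg.substring(0, eq).trim(), arg.substring(eq + 1).trim());
        }
        return props;
    }
}
```

With the example from the commit message, `parse(new String[] {"-config", "port=9090", "port.ssl=8043"})` yields a map with the two entries `port` and `port.ssl`, which could then be written into the config file.
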
Michael Peter Christen
6e59ca4ebf removed jena library and all code that depended on jena. When jena was
introduced, it was also used for search facets. The generic search
facets are now deduced from generic solr fields which makes jena as tool
for facet semantics superfluous.
2014-02-07 01:20:06 +01:00
Michael Peter Christen
022c6d3ce1 do YaCy p2p connections using a timeout-request which wraps the http
request in a separate thread and ignores any further result of a
request that does not answer within the requested timeout. This is an
attempt to solve a problem with the peer-ping, which hangs whenever a peer
appears to be dead or blocked.
2014-01-19 15:21:23 +01:00
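
The idea in commit 022c6d3ce1, running a request in a separate thread and giving up on it after a deadline, can be sketched with the standard java.util.concurrent classes. The class `TimeoutCallSketch` and its method are assumptions for illustration and do not reproduce YaCy's own timeout-request class.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: runs a task in its own thread and stops waiting after a timeout.
public final class TimeoutCallSketch {

    public static <T> T callWithTimeout(Callable<T> task, long timeoutMs)
            throws TimeoutException, ExecutionException, InterruptedException {
        // Daemon thread: a task stuck in blocking I/O cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "timeout-call");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // any later result of the task is ignored
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A peer ping would pass the blocking HTTP call as the `Callable` and treat a `TimeoutException` as "peer not reachable" instead of hanging.
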
reger
7b800a0c8e fix: NPE on shutdown via script 2014-01-07 22:44:24 +01:00
reger
6932aa4d7a use configured admin-username for api calls
- the admin username can be configured, but apiExec calls used the default "admin" username.

TODO: the bin/apicall.sh script should likely take that into account.
2014-01-07 21:26:50 +01:00
reger
05d6cc6ea3 setting of IPv4Stack moved earlier
it seems even better to call system.setproperty before isrunning check
(if nothing helps we have to set it in startup script)
2014-01-06 11:28:05 +01:00
reger
30d925a96e reimplemented server access restriction
via Jetty IPAccessHandler to allow only configured IP's to access.
Handler is only loaded if a restriction is configured.

Since IPAccessHandler (Jetty 8) does not support IPv6, the system property java.net.preferIPv4Stack=true is set.
Testing showed that System.setProperty seems to be sensitive to the point of calling (the earliest possible time seems to be best = early in yacy.main).
Also moved the "isrunning..." just-open-browser check to the new routine, so that yacy.conf is pre-read only once.
2014-01-06 07:00:16 +01:00
orbiter
3cb6c7861f fixed shutdown authentication problem 2014-01-06 01:48:54 +01:00
orbiter
9d52b337f3 added http authentication to YaCy http client for all localhost
accesses to enable self-steering of the peer using the API table. This is
necessary in case a password for the administration pages is set.
2014-01-05 14:46:11 +01:00
Michael Peter Christen
7d6fc79eb8 refactoring (usage of constant names for attributes of authentication
check)
2014-01-05 04:23:44 +01:00
reger
8eaabb9600 remove dependency from old serverCore.java
- remaining getPortNr not needed 
  (as current release allows only to set plain integer as port,
   see ConfigBasic)
2013-12-29 02:00:44 +01:00
reger
45e8750ba5 nasty quick fix for admin login with a username other than admin
- userDB is not synced with Jetty credentials; as of now only the std. admin account can log in

switched initial browser open with ssl active back to std. http port
2013-12-27 02:59:19 +01:00
reger
71cac1a278 added SSL/HTTPS connector to support SSL/https connection on port 8443
!!! attention !!! to make sure YaCy can start, https will be disabled if port 8443 is already in use
   - added ping test for above to migration 

- as of now port for https is hardcoded to default 8443
- if not urgently required I'd leave it this way (it's standard) to use different ports for http and https

- post https port on ConfigBasic.html (if active)
2013-12-25 05:20:13 +01:00
Michael Peter Christen
84167adb49 removed unused anomichttpd code after migration to jetty 2013-12-23 01:23:40 +01:00
reger
b1ce70434e resolve merge conflict
- add missing import statement
2013-10-27 15:24:04 +01:00
reger
7869a4c070 Merge origin/master into jetty
- merge conflict resolve
2013-10-27 15:12:17 +01:00
reger
f46c723398 allow to choose used http server, YaCy-Anomic or Jetty
- defaults to Jetty (in this branch)
- add server version info & config option -> Admin Console -> Advanced Settings -> Http Networking
2013-10-17 03:34:22 +02:00
reger
71d2655c02 downgrade to Jetty 8 to assure support of JRE 1.6
- introduce a YaCyHttp interface to modularize/separate the http server
- adjust the Jetty version specific implementation part (in package net.yacy.http)
     - putting the version specific code in classes starting with Jetty8xxxx
     - moved existing Jetty9xxx implementation into a test class (to keep the code)
- adjust build to the changed jars
- make use of the introduced YaCyHttpServer interface in related htroot servlets

- adjust other test cases/classes
2013-10-09 00:40:48 +02:00
reger
5c4ba9b5db merge rc1 master 2013-09-22 02:21:24 +02:00
orbiter
70ba74b23a disabled ipv4 preference to enable ipv6-only networks like freifunk 2013-09-20 16:52:37 +02:00
reger
f7f86d8a5d update to Jetty 9 jars
- include javax.servlet 3.0
2013-09-14 20:49:05 +02:00
reger
127adbf5cf remove references to 10_http thread (legacy http server)
and add needed get/set function to jetty http server wrapper
2013-09-12 22:02:11 +02:00
reger
105cf8f593 changes to adjust jetty to recent code changes 2013-09-09 02:37:29 +02:00
reger
aafef72a8a merged current rc1/master into jetty branch to allow further development with latest version
ServerSideIncludes and servlet return values need further work (for working jetty integration)
- TODO: added nasty quickfix to allow SSI -  needs further work
- TODO: YaCy servlet return values/parameters are not handled
2013-09-09 02:36:06 +02:00
Michael Peter Christen
765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user
in intranets and the internet can now choose to appear as Googlebot.
This is essential to be able to compete in the field of
commercial search appliances, since most web pages are these days
optimized only for Google and no other search platform. All
commercial search engine providers have a built-in fake-Google User
Agent to be able to get the same search index as Google. Without
the ability to disregard robots.txt in this case, no
competition is possible any more. YaCy will always obey robots.txt
when it is used for crawling the web in a peer-to-peer network, but to
establish a Search Appliance (like a Google Search Appliance, GSA) it is
necessary to be able to behave exactly like a Google crawler.
With this change, you will be able to switch the user agent when portal
or intranet mode is selected on a per-crawl-start basis. Every crawl start
can have a different user agent.
2013-08-22 14:23:47 +02:00
Roland Haeder
841a28ae76 Added 'final' for all exception blocks as this helps the Java compiler
to optimize memory usage

Conflicts:
	source/net/yacy/search/Switchboard.java
2013-07-17 18:31:30 +02:00
Michael Peter Christen
5878c1d599 - refactoring of log to ConcurrentLog:
JDK-based loggers tend to block
at java.util.logging.Logger.log(Logger.java:476) in concurrent
environments. This makes logging a major performance issue. To overcome
this problem, this is an add-on to JDK logging that puts log entries on a
concurrent message queue and logs the messages one by one using a
separate process.
- FTPClient uses the concurrent logging instead of the log4j logger
2013-07-09 14:28:25 +02:00
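
The queue-based logging described in commit 5878c1d599 can be sketched as a producer/consumer pair: callers only enqueue a message and return, while a single worker writes the messages out one by one. The class `QueuedLogSketch` and the `Consumer<String>` sink are assumptions for illustration; the real ConcurrentLog sits on top of JDK logging and is not reproduced here.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch: log calls never block on the sink, they only enqueue.
public final class QueuedLogSketch implements AutoCloseable {

    // Unique sentinel instance that tells the worker to stop.
    private static final String POISON = new String("stop");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    public QueuedLogSketch(Consumer<String> sink) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    String msg = queue.take();
                    if (msg == POISON) { // identity comparison on purpose
                        break;
                    }
                    sink.accept(msg); // only this thread ever touches the sink
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "queued-log");
        worker.setDaemon(true);
        worker.start();
    }

    /** Called by any thread; returns immediately after enqueueing. */
    public void log(String message) {
        queue.add(message);
    }

    /** Flushes all queued messages, then stops the worker. */
    @Override
    public void close() throws InterruptedException {
        queue.add(POISON);
        worker.join();
    }
}
```

In this sketch the sink could be something like `msg -> java.util.logging.Logger.getLogger("sketch").info(msg)`, so that contention on the underlying logger is limited to the single worker thread.
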
reger
8a7fcb391d enable use of solrcore.properties for property substitution of solrconfig.xml
- move setting of system property solr.directoryFactory=solr.MMapDirectoryFactory to solrcore.properties
- add check of os.arch for 64bit system, if it fails use default/solrcore.x86.properties (if exists) as solrcore.properties
 
reason: on 32bit MMapDirectoryFactory may fail with.....
Caused by: java.io.IOException: Map failed
	at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:849)
	at org.apache.lucene.store.MMapDirectory.map(MMapDirectory.java:283)
2013-06-01 05:43:08 +02:00
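
The os.arch check from commit 8a7fcb391d might be sketched like this. The class and method names are assumptions, and testing for "64" in the os.arch value is a heuristic chosen for this sketch (it matches values such as amd64, x86_64 and aarch64), not necessarily the exact test in the commit. The two file names are taken from the commit message.

```java
// Hypothetical sketch: choose the Solr core properties template by architecture.
public final class SolrArchSketch {

    /** Heuristic: treats any os.arch value containing "64" as a 64-bit system. */
    public static boolean is64Bit(String osArch) {
        return osArch != null && osArch.contains("64");
    }

    /** Returns the template to use as solrcore.properties. */
    public static String selectTemplate(String osArch, boolean x86TemplateExists) {
        if (!is64Bit(osArch) && x86TemplateExists) {
            // 32-bit: avoid the MMapDirectoryFactory setting, which may fail with "Map failed"
            return "solrcore.x86.properties";
        }
        return "solrcore.properties";
    }
}
```

At startup the first argument would be `System.getProperty("os.arch")`.
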
Michael Peter Christen
a8dc4346e8 default configuration of MMapDirectoryFactory for solr, increased lock
timeout, less documents from remote searches (too many results had
easily blocked a peer)
2013-05-30 12:31:28 +02:00
Michael Peter Christen
16e9d4d1dd added a restart hint 2013-03-15 10:00:06 +01:00
reger
c37d718f16 make sure yacy.running is deleted if not running (catch exception)
- to prevent the following log output if YaCy was previously not shut down properly

E ... STARTUP WARNING: the file C:\src\git\yacy-rc1\DATA\yacy.running exists, this usually means that a YaCy instance is still running
E ... STARTUP FATAL ERROR: java.util.concurrent.TimeoutException
java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException
	at net.yacy.cora.protocol.TimeoutRequest.call(TimeoutRequest.java:91)
	at net.yacy.cora.protocol.TimeoutRequest.ping(TimeoutRequest.java:112)
	at net.yacy.yacy.startup(yacy.java:200)
	at net.yacy.yacy.main(yacy.java:638)
Caused by: java.util.concurrent.TimeoutException

- adjust Netbeans path (to solr4.1.jars)
2013-02-11 22:53:19 +01:00
Michael Peter Christen
cb38e860cf After observing that Windows users simply forget that they started
YaCy while it is still running, and that they additionally expect
another double-click on the YaCy icon to simply open the search window
(again), I decided to add a function that meets this
expectation: simply open the browser pop-up page again if the user starts
YaCy while YaCy is still running.
2013-02-07 23:39:00 +01:00