Commit Graph

57 Commits

Author SHA1 Message Date
theli
f8ad65eae1 *) First trial implementation of robots.txt support
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@674 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-07 11:17:21 +00:00
theli
e09f1fe8e4 *) IfsL: Suppressing stacktraces on further proxy errors
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@661 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-05 10:45:56 +00:00
theli
6c722706b7 *) Moving yacyDebugMode intialization to switchboard
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@660 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-05 10:34:34 +00:00
theli
4e07828807 *) httpdProxyHandler.java
- harmonizing proxy exception handling
- adding malformed URL + blacklist check for http head method
- adding malformed URL check to http post method
- chunked encoding is now not used anymore for http post if clients
  are http/0.9 or http/1.0 clients (same behaviour as already implemented for get)
- now an exception will be thrown on internal httpc errors to force an error output
  to the client or a connection close. This should help to fix the "binary data in browser window" bug

*) plasmaSwitchboard.java
- fixing the following Bug
  E 2005/09/03 18:02:42 PLASMA Could not index URL http://mis04.de/FAIL/snot.php: null
  java.lang.NullPointerException
	at de.anomic.plasma.plasmaSwitchboard.processResourceStack(plasmaSwitchboard.java:1000)
	at de.anomic.plasma.plasmaSwitchboard.deQueue(plasmaSwitchboard.java:625)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at de.anomic.server.serverInstantThread.job(serverInstantThread.java:95)
	at de.anomic.server.serverAbstractThread.run(serverAbstractThread.java:243)
  This bug could occure if the cached responseHeader is null
- getting the mimeType now from the parsed document instead of the responseHeader because the 
  mimeType could have been changed during content parsing (e.g. because of the mimetypeParser)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@656 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-05 10:10:00 +00:00
theli
a7256e8f4e *) Adding X-Forwarded-For Header
See: http://www.yacy-forum.de/viewtopic.php?t=1118&highlight=xforwardedfor
*) httpc.java: Bugfix for incorrect http response statuscode parsing 
   In some situations the statustext whas chopped
*) Adding a lot of fileheaders containing YaCy copyright and license
*) httpd.java: Adding additional debugging http header that should help du detect
   the "binary data in browser window" bug.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@653 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-05 08:01:54 +00:00
borg-0300
81cb8feb15 back to 649 :/
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@651 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-04 22:03:44 +00:00
borg-0300
5194511e8e *) attempt to find bug
See: http://www.yacy-forum.de/viewtopic.php?t=1121

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@650 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-04 19:08:51 +00:00
theli
8f3d19b331 *) Suppress stacktrace on proxy error for "Connection reset"
See: http://www.yacy-forum.de/viewtopic.php?t=1107

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@646 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-03 15:27:51 +00:00
theli
a20814291f *) Bugfix for "Race condition zwischen httpc und switchboard"
See: http://www.yacy-forum.de/viewtopic.php?p=9036

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@644 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-03 13:40:32 +00:00
theli
3dc6845bef *) Logging error message to logging output if no errormessage can be send to the user by the proxy
Note: This is only done if you set the logging level of PROXY to FINE

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@632 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-01 21:06:56 +00:00
theli
3df5c7a6cf *) Displaying an proxy error page instead of a white page if the server has closed
the connection before yacy was able to receive the http response line
   See: http://www.yacy-forum.de/viewtopic.php?p=8866#8866
        http://www.yacy-forum.de/viewtopic.php?t=704

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@630 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-01 11:34:32 +00:00
borg-0300
cc493ef8c1 Added change from Hermes
See: http://www.yacy-forum.de/viewtopic.php?t=1050

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@629 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-01 11:18:41 +00:00
theli
4edb5b6f1e *) Bugfix for "ProxyAccess logging" Bug
Loglevel was not set corretly for Proxy.access logger
   See: http://www.yacy-forum.de/viewtopic.php?p=8875#8875

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@628 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-01 10:06:47 +00:00
theli
7a7254713d *) Moving Logging directory per default to DATA/LOG
See: http://www.yacy-forum.de/viewtopic.php?t=940#7656

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@627 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-01 08:56:35 +00:00
theli
4fd5b95b1f *) Renaming Logger function names to reflect the proper Java Logging API Loglevels
- please use logFine instead of logDebug
   - please use logSevere instead of logFailure and logError
   See: http://www.yacy-forum.de/viewtopic.php?p=8726#8726

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@615 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-30 21:32:59 +00:00
theli
6adf8a4bde *) Renaming Logger function names to reflect the proper Java Logging API Loglevels
- please use logFine instead of logDebug
   - please use logFailure instead of logError
   See: http://www.yacy-forum.de/viewtopic.php?p=8726#8726

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@614 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-30 21:10:39 +00:00
theli
0dfa8b62e2 *) Changing Proxy-Useragent string according to thread http://www.yacy-forum.de/viewtopic.php?p=8183#8183
A typical useragent string now e.g. looks like: 
   Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.10; YaCy 0.401/00602; yacy.net) Gecko/20050716 Firefox/1.0.6

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@607 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-30 13:41:47 +00:00
theli
e3aa3a2d75 *) Bugfix for ProxyAccess Logger
URL was accidentally logged without the parameters  

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@604 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-30 11:37:54 +00:00
theli
af7b8f75bd *) Making proxyAccessLogging configureable via yacy.logging file
- logging can be disabled now
   - logging directory / filelimit / rotation count can be configured now
   See: http://www.yacy-forum.de/viewtopic.php?t=965&postdays=0&postorder=asc&start=30#8280

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@595 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-29 11:31:58 +00:00
theli
751a778b54 *) Bugfix for heise newsletter Problem
See: http://www.yacy-forum.de/viewtopic.php?p=7836#7836

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@560 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-19 08:28:43 +00:00
theli
7d8af6b41a *) Bugfix for heise newsletter Problem
See: http://www.yacy-forum.de/viewtopic.php?p=7836#7836

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@559 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-19 08:23:12 +00:00
orbiter
f5259f29e8 word cache behaviour fix and other fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@519 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-11 23:33:19 +00:00
rramthun
eacff63eda Typos...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@482 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-02 16:09:19 +00:00
orbiter
ad90f0ad13 activated RWI distribution to DHT for senior peers (default redundancy 3), necessary now for network growth
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@438 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-27 12:51:00 +00:00
rramthun
69afae514d Updated translations...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@422 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-21 15:42:06 +00:00
orbiter
c64970fa47 re-implemented proxy-busy-check and fixed some other things
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@421 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-21 11:17:04 +00:00
orbiter
86d778f7bc default-button in profile menue
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-20 17:08:58 +00:00
orbiter
40036ba69c fixed dht transmission; added url-blacklist blocking also for remote search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@398 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-12 00:07:09 +00:00
orbiter
311e627363 blocking of blacklisted urls in indexReceive and small changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@397 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-11 15:36:10 +00:00
orbiter
2f0d7ea8d3 removed htcache stati (superfluous now)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@396 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-09 00:33:34 +00:00
theli
13eeaa08f3 *) httpc.java:
- Now it's possible to interrupt pending httpc-actions on server shutdown  
   - this is possible because of a newly introduced registration mechanism for
     open sockets
*) yacyCore.java
   - blocking peerPing threads can now be interrupted on server shutdown
*) serverCore.java
   - restructuring shutdown code 
*) error.html
   - port number is now set correctly if port forwarding was enabled


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-07 13:58:54 +00:00
orbiter
b79070b471 fixed proxy/scraper
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@385 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-07 13:25:41 +00:00
orbiter
419f8fb398 fixed bugs/missing code regarding new crawl stack
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@384 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-07 01:38:49 +00:00
orbiter
858cd94299 replaced indexing ram-queue by file-based stack-queue
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@381 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-06 14:48:41 +00:00
theli
57c30f1d78 *) bugfix for usage of httpc without gzip content encoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@369 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-04 11:25:25 +00:00
theli
08e4334c1d *) Status.java: showing amount of time since last upload of seed-file
*) hello.java: adding additional output for principal-downgrade bug
*) httpd.java, httpdFileHandler.java, httpdProxyHandler.java: improved errorhandling
*) yacyCore.java: trying to fix principal-downgrade bug
*) yacySeed.java: adding some constants

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@329 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-28 11:27:31 +00:00
theli
d53b2393e5 *) autoconfig.java: ip address was not reported correctly when port-forwardin is on
*) hello.java: reportedip my be empty at peer startup
*) httpc.java: adding method to determine if the connection was already closed or is broken
*) httpdProxyHandler.java: trying to do a better errorhandling
*) server/serverCore.java
- setting myseed ip-address and port correctly if port-forwarding is on
- doing a more failsafe close and adding some debugging output
*) yacyClient.java: adding some logging statements to allow a better detection of 
   "degraded to senior"-bug
*) yacyCore.java: restructuring publishMySeed
   (@Orbiter: pleas take a look)
- to avoid buzy waiting
- to allow a gracefull shutdown on server shutdown
- new seed count was not calculated correctly in the previous version
*) yacySeedDB.java: host ip and port was not initialized correctly if port-forwarding
   was activated

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@318 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-23 11:00:26 +00:00
orbiter
a5b40923b6 added word migration to assortments (start with 'java -classpath classes yacy -migratewords')
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@278 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-15 01:22:07 +00:00
theli
c7d294d8d4 *) Bugfix for:
- 302 redirection Problem on Amazon Server
   - Wrong References in proxymsg/error.html
   See: http://www.yacy-forum.de/viewtopic.php?t=515

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@271 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-13 07:50:35 +00:00
theli
bbc46f037f *) avoid exception throwing on already catched exception
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@267 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-10 09:38:38 +00:00
theli
ee9e110366 *) removing old logging configuration properties from yacy.init
*) serverLog.java logging functions now also accept exceptions als
   additional parameters.
   The Stacktrace of this ecceptions will then be appended to the 
   logging message and can e.g. be viewed on the gui logging page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@265 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-10 09:19:24 +00:00
theli
cb17ff4aa2 *) adding support of proxy access logging (much similar to squids
access.log) file
*) splitting doGet function in separate functions for fulfilling
   requests from cache and from web to make error handling easier
*) using connection property and httpHeader constants instead of
   hardcoded strings whenever possible
*) sending back a proxy error message as body of every respond
   containing a http error code
*) correcting problems of messages received from other proxies
   containing 204, 304 status codes.
*) using chunked transfer encoding if the server has not set the
   content length (e.g. because of gzip content encoding) but 
   the client has established a persistent connection to yacy.
   This is only possible for http/1.1 clients. For http/1.0 clients
   the connection will simply be closed on the end of the message.   
*) removing unneeded functions (e.g. respondError) because of newly
   introduced functions of httpd.java
*) removing hop by hop headers (according to rfc)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@245 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-09 10:22:05 +00:00
theli
05ab7c4d68 *) Correcting Problems with "transparent proxy" mode.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@134 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-18 09:45:27 +00:00
theli
361f05978d Multiple updates regarding the yacy seedUpload facility,
optional content parsers, thread pool configuration ...

Please help me testing if everything works correct.

*) Migration of yacy seedUpload functionality
See: http://www.yacy-forum.de/viewtopic.php?t=256
- new uploaders can now be easily introduced because of a new modulare uploader system
- default uploaders are: none, file, ftp
- adding optional uploader for scp
- each uploader provides its own configuration file that will be 
  included into the settings page using the new template include feature
- Each uploader can define its libx dependencies. If not all needed libs are
  available, the uploader is deactivated automatically.

*) Migration of optional parsers
See: http://www.yacy-forum.de/viewtopic.php?t=198
- Parsers can now also define there libx dependencies
- adding parser for bzip compressed content
- adding parser for gzip compressed content
- adding parser for zip files
- adding parser for tar files
- adding parser to detect the mime-type of a file
  this is needed by the bzip/gzip Parser.java
- adding parser for rtf files
- removing extra configuration file yacy.parser
  the list of enabled parsers is now stored in the main config file

*) Adding configuration option in the performance dialog to configure
See: http://www.yacy-forum.de/viewtopic.php?t=267
- maxActive / maxIdle / minIdle values for httpd-session-threadpool
- maxActive / maxIdle / minIdle values for crawler-threadpool

*) Changing Crawling Filter behaviour
See: http://www.yacy-forum.de/viewtopic.php?p=2631

*) Replacing some hardcoded strings with the proper constants of the httpHeader class

*) Adding new libs to libx directory. This libs are
- needed by new content parsers
- needed by new optional seed uploader
- needed by SOAP API (which will be committed later)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@126 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-17 08:25:04 +00:00
theli
4dd387aae9 *) moving constants (see last commit) to proper httpHeader class
*) migrating fileHandler + proxyHandler to use constants instead of hardcoded values

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@114 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-13 09:14:12 +00:00
theli
74f12bb0f3 *) adding transparent proxy support
Now a firewall can transparently redirect all 
   http traffic through yacy.
   

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@96 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-08 22:36:26 +00:00
theli
cbdc499ba6 *) adding many missing (File)?(Input|Output)Stream.close() calls to avoid "Too many open files bug".
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@90 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-08 07:24:33 +00:00
theli
2aa5fe8f50 *) Import statements reorganized
Now it's easier to determine which class really uses which other class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@82 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-05 05:32:19 +00:00
orbiter
c7c6aaf06e many bug-fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@73 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-30 01:22:46 +00:00
orbiter
87a61a01c2 fixed bad-gzip-trailer behaviour (now cuts off trailer)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@42 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-22 13:45:07 +00:00