Commit Graph

101 Commits

Author SHA1 Message Date
allo
ec10220d57 Fix for last Commit: .class Files in htroot, not in the dir of the localized HTML-Files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@955 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-19 07:17:49 +00:00
allo
4db2080188 Bugfix for www and share.
http://www.yacy-forum.de/viewtopic.php?p=11486


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@954 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-19 06:52:43 +00:00
theli
40777556c5 *) Connection Tracking
   - adding automatic refresh
   - accepts new parameter nameLookup which can be used to deactivate 
     yacy-peer name lookup (because we have problems with this on large seed-dbs)

*) ViewFile
   New page that can be used to view 
   - original content 
   - plain text content 
   - parsed content
   - parsed sentences 
   of a webpage specified by its url hash
   Mainly for debugging purposes at the moment

*) Robots.txt 
   Bugfix for if-modified-since usage
   TODO: synchronization of downloads to avoid loading the same robots-file 
   multiple times in parallel by different threads

*) Shutdown
   Better abortion of transferRWI and transferURL sessions on server shutdown

*) Status Page
   Adding icon to start/stop crawling via status page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-18 07:45:27 +00:00
allo
6430fa520e bugfix for broken HTDOCS
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@938 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-14 11:50:11 +00:00
theli
219acc1e8f *) Bugfix for wrong http version in response to http/1.0 requests
See: http://www.yacy-forum.de/viewtopic.php?t=1312

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@926 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-13 06:30:13 +00:00
allo
0f2f783e46 no no-cache for mediaExts
see http://www.yacy-forum.de/viewtopic.php?p=11210#11210


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@924 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-12 20:45:14 +00:00
allo
7ca60f97bf localization Support for Includes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@923 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-12 12:44:05 +00:00
orbiter
b45ffecd39 log to fix http://www.yacy-forum.de/viewtopic.php?p=11111#11111
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@911 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-11 07:46:14 +00:00
theli
1688be8590 *) plasmaSwitchboard.java
   adding more verbose logging output for db initialization
*) httpdFileHandler.java
   adding cache for servlet response methods


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@897 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-10 09:13:17 +00:00
theli
e3a586d7bd *) Using serverByteBuffer instead of ByteArrayOutputStream
   to speed up httpdFileHandler

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@896 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-10 07:15:57 +00:00
orbiter
16a49c1c9d fix for graphics generation bug, see http://www.yacy-forum.de/viewtopic.php?p=10987#10987
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@886 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-09 14:46:33 +00:00
orbiter
5153ec0f3e update to image painter
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@873 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-07 01:25:39 +00:00
orbiter
1b2db0b52a fix for file-share access, which I broke some commits ago :-(
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@870 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-06 22:30:13 +00:00
theli
a2fa75e688 *) Asynchronous queuing of crawl job URLs (stackCrawl)
   various checks like the blacklist check or the robots.txt disallow check are now
   done by a separate thread to unburden the indexer thread(s)
   TODO: maybe we have to introduce a threadpool here if it turns out that this single
         thread is a bottleneck because of the time-consuming robots.txt downloads

*) improved index transfer
   The index selection and transmission is done in parallel now to improve index 
   transfer performance.
   TODO: maybe we could speed up performance by using multiple transmission threads in 
         parallel instead of only a single one.

*) gzip encoded post requests
   it is now configurable whether a gzip encoded post request should be sent on
   index transfer/distribution

*) storage Peer (very experimental and not optimized yet)
   Now it's possible to send the result of the yacy indexer thread to a remote peer 
   instead of storing the indexed words locally. 
   This can be done by setting the property "storagePeerHash" in the yacy config file
   - Please note that if the index transfer fails, the index is stored locally.
   - TODO: currently this index transfer is done by the indexer thread. 
     To speed up the indexer
     a) this transmission should be done in parallel and
     b) multiple chunks should be bundled and transferred together


*) general performance improvements  
   - better memory cleanup after http request processing has finished
   - replacing some string concatenations with stringBuffers
   - replacing BufferedInputStreams with serverByteBuffer
   - replacing vectors with arraylists wherever possible
   - replacing hashtables with hashmaps wherever possible
   This was done because function calls to vector or hashtable functions
   take 3 times longer than calls to functions of arraylists or hashmaps.
   TODO: we should take a look at the class serverObject which inherits from hashmap
         Do we really need a synchronization for this class?
   TODO: replace arraylists with linkedLists if random access to the list elements is not needed

*) Robots Parser supports if-modified-since downloads now
   If the downloaded robots.txt file is older than 7 days the robots parser tries to
   download the robots.txt with the if-modified-since header to avoid unnecessary downloads
   if the file was not changed. Additionally the ETag header is used to detect changes.

*) Crawler: better handling of unsupported mimeTypes + FileExtension

*) Bugfix: plasmaWordIndexEntity was not closed correctly in 
   - query.java
   - plasmaswitchboard.java

*) function minimizeUrlDB added to yacy.java 
   this function tests the current urlHashDB for unused urls
   ATTENTION: please don't use this function at the moment because
              it causes the wordIndexDB to flush all words into the
              word directory!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@853 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-05 10:45:33 +00:00
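The robots.txt revalidation rule described in commit a2fa75e688 (re-request a copy older than 7 days with an if-modified-since header, and also use the ETag) could be sketched roughly as follows. This is not YaCy's code: the class `RobotsRefresh` and its method names are hypothetical, and sending the stored ETag back as `If-None-Match` is an assumption based on standard HTTP conditional requests, since the commit message does not name the request header.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not part of YaCy. */
final class RobotsRefresh {

    /** Age limit taken from the commit message: 7 days. */
    static final Duration MAX_AGE = Duration.ofDays(7);

    /** HTTP date format, always rendered in GMT. */
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.ENGLISH)
            .withZone(ZoneOffset.UTC);

    /** True if the cached robots.txt is older than MAX_AGE. */
    static boolean needsRevalidation(Instant loaded, Instant now) {
        return Duration.between(loaded, now).compareTo(MAX_AGE) > 0;
    }

    /** Builds the conditional request headers; null arguments are skipped. */
    static Map<String, String> conditionalHeaders(Instant lastModified, String eTag) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (lastModified != null) {
            headers.put("If-Modified-Since", HTTP_DATE.format(lastModified));
        }
        if (eTag != null) {
            // Assumption: the stored ETag is echoed back via If-None-Match.
            headers.put("If-None-Match", eTag);
        }
        return headers;
    }
}
```

A server that answers such a request with status 304 signals that the cached file is still current, so the body does not have to be downloaded again.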
orbiter
01db66dc69 implemented image-servlets. the imagetest will stay there only for a limited time. Now images can be generated on-the-fly from servlets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@852 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-05 08:40:20 +00:00
theli
1dc94e7753 *) Adding support for gzip content-encoding of http post requests
   used for transferRWI and transferURL.
   See: http://www.yacy-forum.de/viewtopic.php?t=1167#10020

*) adding yacyVersion.java containing constants defining yacy versions
   that support a given feature.
   Needed to determine if a remote peer is able to decode gzip 
   content-encoded http post bodies properly.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@772 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-22 10:30:55 +00:00
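The gzip content-encoding of POST bodies described in commit 1dc94e7753 can be illustrated with a small round trip using the standard `java.util.zip` classes. This is a sketch only: the class `GzipPostBody` is hypothetical and does not show how YaCy wires the encoding into its HTTP client, nor how yacyVersion.java decides whether a remote peer can decode it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper, not part of YaCy: gzip round trip for a POST body. */
final class GzipPostBody {

    /** Compresses the body; a sender would also set "Content-Encoding: gzip". */
    static byte[] compress(byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so close before reading the bytes.
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Restores the original body on the receiving side. */
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Because a receiver that does not understand the encoding would see only compressed bytes, a sender has to know in advance whether the remote side can decode it, which is the purpose the commit gives for the version constants.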
orbiter
e17df64b54 removed IS_ADMIN - feature. This was covered by plasmaSwitchboard.adminAuthenticated
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@760 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-21 09:22:01 +00:00
theli
b990dc1ad1 *) Replacing jsch 0.1.19 lib with newer version 0.1.21
*) Replacing PDFBox 0.7.1 lib with newer version 0.7.2
*) Refactoring of classes httpd/httpc/httpHeaders to
   make many methods for httpHeader/Requestline parsing
   reusable for new icap implementation
*) adding chunked input stream support
   - needed by new icap implementation
   - needed by future httpc HTTP/1.1 support 
*) httpd.java
   - moving all connection property constants to class httpHeader
   - moving readHeader function to class httpHeader
   - moving parseQuery function to class httpHeader
   - moving handleTransparentProxy function to class httpHeader
*) httpHeader.java
   - adding new function to parse the http response line
   - adding new function to convert http headers to a string that
     can be sent to the client
   - adding a function that generates a proper url using all parsed
     connection properties
*) ICAP Support
   - yacy now supports handling of icap response modification requests
   - this feature can be used by other icap enabled proxies to contact 
     yacy as icap server, and to hand over the downloaded content to yacy
     for indexing
   - functionality was successfully tested with squid 2.5Stable 10 + icap patch
   - further icap services e.g. URL filtering based on yacy's blacklists are possible
*) plasmaSwitchboard.java
   - htcache entries that are still needed for indexing are now properly registered 
     as in use after system restart
   - extended logging: log message now shows parsing and indexing time for each sb. entry
    

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@757 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-20 21:49:47 +00:00
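The chunked input stream support mentioned in commit b990dc1ad1 refers to the HTTP/1.1 chunked transfer coding, in which each chunk is preceded by its size in hexadecimal and a zero-sized chunk ends the body. A minimal decoder might look like the sketch below. The class `ChunkedBodyReader` is hypothetical and is not the stream class added in that commit; it reads the whole body into memory and ignores chunk extensions and trailer fields.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper, not part of YaCy: decodes a chunked HTTP body. */
final class ChunkedBodyReader {

    static byte[] readAll(InputStream in) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        try {
            while (true) {
                // Chunk header: hexadecimal size, optionally followed by ";extension".
                String sizeLine = readLine(in);
                int semicolon = sizeLine.indexOf(';');
                if (semicolon >= 0) {
                    sizeLine = sizeLine.substring(0, semicolon);
                }
                int size = Integer.parseInt(sizeLine.trim(), 16);
                if (size == 0) {
                    break; // last chunk
                }
                byte[] chunk = new byte[size];
                int offset = 0;
                while (offset < size) {
                    int n = in.read(chunk, offset, size - offset);
                    if (n < 0) {
                        throw new IOException("stream ended inside a chunk");
                    }
                    offset += n;
                }
                body.write(chunk, 0, size);
                readLine(in); // consume the CRLF that follows the chunk data
            }
            // Skip optional trailer fields up to the terminating empty line.
            while (!readLine(in).isEmpty()) {
                // trailer fields are ignored in this sketch
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return body.toByteArray();
    }

    /** Reads one line terminated by LF (or end of stream), dropping CR characters. */
    private static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            if (c != '\r') {
                line.append((char) c);
            }
        }
        return line.toString();
    }
}
```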
theli
f783061414 *) Changing redirection code from 307 to 302
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@710 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-12 11:38:46 +00:00
theli
a6a8af0f04 *) httpdFileHandler templateCache can now be disabled
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@708 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-12 10:47:27 +00:00
theli
44b36d087e *) Implementing a Cache for the servlet template files (.html)
   should help to reduce IO
   See: http://www.yacy-forum.de/viewtopic.php?t=749

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@690 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-09 11:14:22 +00:00
theli
bead8a32aa *) IndexCreate_p.java:
   Crawler StartURLs will now also be added to the errorURL-DB if an error occurs on this url
*) kelondroStack.java, plasmaSwitchboardQueue.java
   Adding method which returns a list of all entries in the queue. This list is used by IndexCreate_p.java 
   instead of an iterator to display the indexing-list. 
   Advantages: avoid concurrent modifications of the list while displaying it. 
               Speedup because now we have to access only one sync function instead of multiple ones 
               (one for each entry)
*) IndexCreateIndexingQueue_p.java
   Using new list() function of plasmaSwitchboardQueue
*) httpdFileHandler.java
   If a servlet returns the special value "LOCATION" the httpdFileHandler does a Redirection of 
   the Browser to the URL specified by the servlet. This can e.g. be used when a http get request is
   used instead of a post request, but a refresh should not be allowed.
*) IndexCreateWWWLocalQueue_p.html
   Now it's possible to delete single entries of the local crawler queue

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@626 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-01 07:52:46 +00:00
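The list() approach described in commit bead8a32aa (return a copy of all queue entries from one synchronized call instead of handing out an iterator) can be sketched as follows. The class `SnapshotQueue` is hypothetical and much simpler than kelondroStack or plasmaSwitchboardQueue; it only illustrates why a snapshot avoids concurrent modification while the list is being displayed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical queue, not part of YaCy: display code works on a snapshot. */
final class SnapshotQueue<E> {

    private final List<E> entries = new ArrayList<>();

    synchronized void push(E entry) {
        entries.add(entry);
    }

    /** Removes and returns the oldest entry, or null if the queue is empty. */
    synchronized E pop() {
        return entries.isEmpty() ? null : entries.remove(0);
    }

    /**
     * Copies all entries while holding the lock once. The caller can iterate
     * over the copy without further synchronization, and later push/pop calls
     * do not affect it.
     */
    synchronized List<E> list() {
        return new ArrayList<>(entries);
    }
}
```

The trade-off is that the snapshot may already be stale when it is rendered, which is usually acceptable for a status page.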
theli
ebbd063c92 *) Making mimeTable static final
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@619 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-31 09:22:55 +00:00
theli
4fd5b95b1f *) Renaming Logger function names to reflect the proper Java Logging API Loglevels
   - please use logFine instead of logDebug
   - please use logSevere instead of logFailure and logError
   See: http://www.yacy-forum.de/viewtopic.php?p=8726#8726

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@615 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-30 21:32:59 +00:00
theli
6adf8a4bde *) Renaming Logger function names to reflect the proper Java Logging API Loglevels
   - please use logFine instead of logDebug
   - please use logFailure instead of logError
   See: http://www.yacy-forum.de/viewtopic.php?p=8726#8726

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@614 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-30 21:10:39 +00:00
rramthun
0864ea367d Added preformatted changelog.txt
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@567 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-21 10:38:22 +00:00
allo
41aa3ae72e provide a virtual header field IS_ADMIN.
This allows servlets to check Admin Status.
http://www.yacy-forum.de/viewtopic.php?t=1003


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@566 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-20 21:22:05 +00:00
orbiter
8d6c288f04 display of peer name in headline; see http://www.yacy-forum.de/viewtopic.php?p=7466#7466
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@535 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-14 15:45:48 +00:00
theli
1d83d7e4d7 *) httpdFileHandler.java:
   no stacktrace will be printed into log file for "Connection timed out" Errors now
   See: http://www.yacy-forum.de/viewtopic.php?p=6381

*) plasmaCrawlWorker.java:
   If a "Read timed out" error occurs while crawling a site, the failed crawl will be
   retried.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@493 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-04 11:05:04 +00:00
orbiter
849b194149 fixed news receipt and added processing buttons on News page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@458 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-30 07:15:39 +00:00
orbiter
af67c633d5 doc-changes and more strict brute-force handling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-25 09:56:54 +00:00
theli
13eeaa08f3 *) httpc.java:
   - Now it's possible to interrupt pending httpc-actions on server shutdown
   - this is possible because of a newly introduced registration mechanism for
     open sockets
*) yacyCore.java
   - blocking peerPing threads can now be interrupted on server shutdown
*) serverCore.java
   - restructuring shutdown code 
*) error.html
   - port number is now set correctly if port forwarding was enabled


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-07 13:58:54 +00:00
theli
0e2c33ee55 *) Network.html/Network.java:
- Adding function to manually force peer ping to remote yacy peer
  See: Network.html?page=4
- for debugging purpose only!

*) serverAbstractThread.java:
- Adding possibility to notify a server thread via a synchronization object
- this is needed e.g. by the port forwarding feature to send a notification
  to the peerPing thread to redo peer-ping with the new ip/port Settings_p.html

*) Port Forwarding Feature (it should work now)
- adding a serverThread which is responsible to detect broken port forwarding 
  connections and to do reconnect if needed
- serverCore.java: moving port forwarding initialization into a separate function
- adding possibility to configure the ssh port 
- moving configuration section on the gui into a separate fieldset
- hello.java: only trying to do a second connect to the clientIp address during
  peer handshake if either remote port forwarding is not enabled locally or
  the clientIP is not equal to any local ip

*) httpdFileHandler.java:
- print out a more verbose error message

*) httpc.java
- allowing to deactivate content encoding from outside


 

*) plasmaCrawlWorker.java
- the crawler worker now tries to refetch the content of a website without
  gzip content encoding if a gzip error occurred



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@368 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-04 11:09:48 +00:00
theli
08e4334c1d *) Status.java: showing amount of time since last upload of seed-file
*) hello.java: adding additional output for principal-downgrade bug
*) httpd.java, httpdFileHandler.java, httpdProxyHandler.java: improved errorhandling
*) yacyCore.java: trying to fix principal-downgrade bug
*) yacySeed.java: adding some constants

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@329 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-28 11:27:31 +00:00
theli
0405de7635 *) Avoiding NullpointerException while testing exception test
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@294 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-17 08:04:53 +00:00
theli
9e47ba5ad6 *) adding missing calls for function close() to avoid "too many open file" bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@282 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-16 08:34:52 +00:00
theli
890e3f4d4a *) adding missing calls for function close() to avoid "too many open file" bug
*) bugfix in plasma/plasmaParser.java:
   - parsers with missing dependencies were not ignored correctly
*) passing a logger instance to the parser modules which can be used 
   for logging purposes by the parsers (not done yet)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-13 13:49:17 +00:00
theli
c7d294d8d4 *) Bugfix for:
   - 302 redirection Problem on Amazon Server
   - Wrong References in proxymsg/error.html
   See: http://www.yacy-forum.de/viewtopic.php?t=515

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@271 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-13 07:50:35 +00:00
theli
f0440911e8 *) removing unnecessary stacktrace printout
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@266 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-10 09:34:50 +00:00
theli
ee9e110366 *) removing old logging configuration properties from yacy.init
*) serverLog.java logging functions now also accept exceptions as
   additional parameters.
   The Stacktrace of these exceptions will then be appended to the 
   logging message and can e.g. be viewed on the gui logging page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@265 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-10 09:19:24 +00:00
theli
f157181086 *) starting implementation of Content-MD5 header
   which should help to detect transfer errors on yacy to yacy
   communication
   - not finished yet
*) removing unneeded functions (e.g. respondHeader) because of newly
   introduced functions in class httpd.java
*) httpdFileHandler.java now always sends back a proxy error message
   as body of a response with an error code
*) adding support of gzip content encoding 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@244 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-09 10:12:07 +00:00
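The Content-MD5 header started in commit f157181086 carries the base64-encoded MD5 digest of the message body, so that the receiver can recompute the digest and detect a damaged transfer. A minimal sketch with the standard `java.security` and `java.util.Base64` classes follows. The class `ContentMd5` is hypothetical and is not the unfinished implementation the commit refers to.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Hypothetical helper, not part of YaCy. */
final class ContentMd5 {

    /** Returns the value to send in the Content-MD5 header for this body. */
    static String headerValue(byte[] body) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(body);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Receiver side: recompute the digest and compare it with the header. */
    static boolean matches(byte[] body, String receivedHeader) {
        byte[] expected = headerValue(body).getBytes(StandardCharsets.US_ASCII);
        byte[] received = receivedHeader.trim().getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(expected, received);
    }
}
```

MD5 is used here only as a checksum against accidental transfer errors; it offers no protection against deliberate tampering.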
orbiter
5f90daa265 implemented localization environment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@171 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-24 14:23:04 +00:00
theli
4dd387aae9 *) moving constants (see last commit) to proper httpHeader class
*) migrating fileHandler + proxyHandler to use constants instead of hardcoded values

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@114 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-13 09:14:12 +00:00
theli
cbdc499ba6 *) adding many missing (File)?(Input|Output)Stream.close() calls to avoid "Too many open files bug".
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@90 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-08 07:24:33 +00:00
theli
2aa5fe8f50 *) Import statements reorganized
   Now it's easier to determine which class really uses which other class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@82 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-05 05:32:19 +00:00
orbiter
f99930c04b fixed brute-force + peer-disconnect - Bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@75 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-01 23:31:21 +00:00
orbiter
c7c6aaf06e many bug-fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@73 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-30 01:22:46 +00:00
orbiter
2de90020ed fixed caching+synchronization+brute-force-denial
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@67 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-27 21:09:40 +00:00
(no author)
f39812da91 *) Some performance improvements
- many classes set to final
- implementation of a session-thread pool
- reuse of the server handler class (normally the httpd object)
  within the session thread
- implementation of a httpc object pool
- introduction of a linebuffer in httpd which can be reused
- reusing the properties table in the httpc
- added two apache libs (commons-collections, commons-pool) which 
  are needed for the object/thread pool implementation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@26 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-19 06:55:57 +00:00
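The object pooling introduced in commit f39812da91 relies on the Apache commons-pool library. The idea behind it can be shown with a stand-in built only on the standard library: the class `SimplePool` below is hypothetical, is not the commons-pool API, and leaves out validation, eviction and blocking when the pool is exhausted.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

/** Hypothetical stand-in for an object pool, not the commons-pool API. */
final class SimplePool<T> {

    private final ArrayDeque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int maxIdle;

    SimplePool(Supplier<T> factory, int maxIdle) {
        this.factory = factory;
        this.maxIdle = maxIdle;
    }

    /** Hands out an idle object if one exists, otherwise creates a new one. */
    synchronized T borrow() {
        T object = idle.poll();
        return object != null ? object : factory.get();
    }

    /** Keeps the object for reuse unless the pool already holds maxIdle objects. */
    synchronized void giveBack(T object) {
        if (idle.size() < maxIdle) {
            idle.push(object);
        }
    }
}
```

Reusing expensive objects such as client connections or worker threads saves the cost of creating them for every request, which matches the performance goal stated in the commit.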
(no author)
b7d4389e4b *) support for Proxy Auto-Config File generation added.
   File is accessible using: 
   http://proxy:8080/autoconfig.pac

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@20 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-15 09:06:15 +00:00
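A Proxy Auto-Config file, as generated since commit b7d4389e4b, is a small JavaScript file that defines `FindProxyForURL(url, host)` and returns a proxy directive. The commit message does not show the generated content, so the sketch below is an assumption about what a minimal generator could emit; the class `ProxyAutoConfig` and the fallback to `DIRECT` are hypothetical.

```java
/** Hypothetical generator, not part of YaCy: renders a minimal PAC file. */
final class ProxyAutoConfig {

    /**
     * Returns PAC content that sends every request through the given proxy
     * and falls back to a direct connection if the proxy is unreachable.
     */
    static String render(String proxyHost, int proxyPort) {
        return "function FindProxyForURL(url, host) {\n"
             + "    return \"PROXY " + proxyHost + ":" + proxyPort + "; DIRECT\";\n"
             + "}\n";
    }
}
```

For the address in the commit message, `ProxyAutoConfig.render("proxy", 8080)` produces a file whose directive is `PROXY proxy:8080; DIRECT`. Such a file is conventionally served with the MIME type `application/x-ns-proxy-autoconfig`.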
orbiter
248077d3f0 initial load with yacy 0.36
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-07 19:19:42 +00:00