132 Commits

Author SHA1 Message Date
hydrox
56b9f34411 *)removed unused imports
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1015 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-02 16:30:45 +00:00
hydrox
295aff52a3 *)added offline-browsing-support (onlineMode=0)
*)online-mode now can be changed in Status.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1010 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-31 12:25:40 +00:00
theli
ec3af327f7 *) Bugfix for Proxy-Authentication against remote proxy
See: http://www.yacy-forum.de/viewtopic.php?p=11804#11804

*) Adding first version of db test for mysql
   NOTES:
   - db user + db + db table must be created before starting the test
   - db table must be empty. Entries cannot be updated at the moment
   - db connection properties must be changed in the source code at the moment
   TODOs:
   - accepting connection properties via command line
   - implementing update + remove + read operations
   - 'maybe' adding code to create db + table if it doesn't exist

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@991 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-27 11:28:37 +00:00
theli
02d9af1a70 *) Restructuring and extending of Remote Proxy Support
   - remote proxy configuration can now be "really" changed on the fly and takes effect immediately
   - adding possibility to disable remote proxy usage for yacy->yacy communication
   - adding possibility to disable remote proxy usage for ssl
   - restructuring proxy configuration so that it is stored in a single place now

*) Adding possibility to import a foreign word DB (or even more of them in parallel) 
   at runtime into the peers DB
   - this can be done by calling IndexImport_p.html 
   - ATTENTION: please note that at the moment this thread must be aborted via the GUI
     before a normal server shutdown is done.
   - TODO: integrating IndexImport Thread into normal server shutdown
   - TODO: Adding possibility to import crawl-queues, etc. from foreign peers
   - TODO: removing old import function from yacy.java and calling the new routines instead

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@968 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-22 13:28:04 +00:00
borg-0300
e642a5d8b7 more constants
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@947 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-17 15:46:12 +00:00
allo
f65c939a60 userDB Auth
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@874 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-07 13:49:07 +00:00
theli
a2fa75e688 *) Asynchronous queuing of crawl job URLs (stackCrawl)
   Various checks like the blacklist check or the robots.txt disallow check are now
   done by a separate thread to unburden the indexer thread(s)
   TODO: maybe we have to introduce a threadpool here if it turns out that this single
         thread is a bottleneck because of the time-consuming robots.txt downloads
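The hand-off described above could look roughly like the following sketch. It assumes a single worker thread fed through a blocking queue, and it reduces the blacklist and robots.txt checks to a simple host lookup. All class, method and host names are hypothetical and not taken from the YaCy sources.

```java
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; no class or method name here is taken from the YaCy sources.
class StackCrawlSketch {
    // Unique sentinel object that tells the worker to stop (compared by identity).
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final List<String> accepted = new CopyOnWriteArrayList<>();
    private final Set<String> blacklistedHosts;
    private final Thread worker;

    StackCrawlSketch(Set<String> blacklistedHosts) {
        this.blacklistedHosts = blacklistedHosts;
        this.worker = new Thread(this::drain, "stackCrawl-sketch");
        this.worker.start();
    }

    // Called by the indexer thread; returns at once, the checks run later.
    void enqueue(String url) {
        pending.add(url);
    }

    // Lets the worker finish everything queued so far, then returns the accepted URLs.
    List<String> shutdown() {
        pending.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return accepted;
    }

    // Worker loop: takes one URL at a time and runs the slow checks.
    private void drain() {
        try {
            while (true) {
                String url = pending.take();
                if (url == STOP) return;
                if (passesChecks(url)) accepted.add(url);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stand-in for the blacklist and robots.txt checks: only the host is tested.
    private boolean passesChecks(String url) {
        try {
            String host = URI.create(url).getHost();
            return host != null && !blacklistedHosts.contains(host);
        } catch (IllegalArgumentException e) {
            return false; // unparsable URL
        }
    }

    public static void main(String[] args) {
        StackCrawlSketch stacker = new StackCrawlSketch(Collections.singleton("bad.example"));
        stacker.enqueue("http://good.example/a");
        stacker.enqueue("http://bad.example/b");
        System.out.println(stacker.shutdown()); // [http://good.example/a]
    }
}
```

Replacing the single worker with a thread pool, as the TODO suggests, would mainly mean starting several threads that call the same loop on the shared queue.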

*) improved index transfer
   The index selection and transmission is done in parallel now to improve index 
   transfer performance.
   TODO: maybe we could speed up performance by using multiple transmission threads in 
         parallel instead of only a single one.

*) gzip encoded post requests
   it is now configurable whether a gzip-encoded post request should be sent on
   index transfer/distribution
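As a rough illustration of an optionally gzip-encoded request body, the sketch below compresses the payload only when a flag is set. The flag and method names are assumptions, not YaCy configuration keys. A real sender would also have to announce the compression with a `Content-Encoding: gzip` request header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; the names are illustrative only.
class GzipPostSketch {
    // Returns the bytes to send as the POST body, gzip-compressed if enabled.
    static byte[] encodeBody(String body, boolean gzipEnabled) {
        byte[] raw = body.getBytes(StandardCharsets.UTF_8);
        if (!gzipEnabled) return raw;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray(); // complete only after the gzip stream is closed
    }

    // What a receiving peer would do with a gzip-encoded body.
    static String decodeGzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) out.write(chunk, 0, n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] body = encodeBody("wordhash=abc&count=1", true);
        System.out.println(decodeGzip(body)); // wordhash=abc&count=1
    }
}
```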

*) storage Peer (very experimental and not optimized yet)
   Now it's possible to send the result of the yacy indexer thread to a remote peer
   instead of storing the indexed words locally.
   This can be done by setting the property "storagePeerHash" in the yacy config file
   - Please note that if the index transfer fails, the index is stored locally.
   - TODO: currently this index transfer is done by the indexer thread.
     To speed up the indexer
     a) this transmission should be done in parallel and
     b) multiple chunks should be bundled and transferred together


*) general performance improvements  
   - better memory cleanup after http request processing has finished
   - replacing some string concatenations with stringBuffers
   - replacing BufferedInputStreams with serverByteBuffer
   - replacing vectors with arraylists wherever possible
   - replacing hashtables with hashmaps wherever possible
   This was done because function calls to vector or hashtable functions
   take 3 times longer than calls to functions of arraylists or hashmaps.
   TODO: we should take a look at the class serverObject which is inherited from hashmap
         Do we really need a synchronization for this class?
   TODO: replace arraylists with linkedLists if random access to the list elements is not needed
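A small sketch of the kind of replacement listed above: one StringBuffer instead of repeated string concatenation in a loop, and the unsynchronized HashMap and ArrayList instead of Hashtable and Vector. The helper names are hypothetical. Note that HashMap and ArrayList are not thread-safe, so the swap is only safe where a collection is confined to one thread or guarded externally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the helper names are not taken from the YaCy sources.
class CollectionSwapSketch {
    // Appends into one buffer instead of creating a new String per iteration.
    static String join(List<String> words, char separator) {
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < words.size(); i++) {
            if (i > 0) buffer.append(separator);
            buffer.append(words.get(i));
        }
        return buffer.toString();
    }

    // Uses HashMap where older code would have used the synchronized Hashtable.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            Integer old = counts.get(word);
            counts.put(word, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(); // instead of Vector
        words.add("yacy");
        words.add("peer");
        words.add("yacy");
        System.out.println(join(words, ','));              // yacy,peer,yacy
        System.out.println(countWords(words).get("yacy")); // 2
    }
}
```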

*) Robots Parser supports if-modified-since downloads now
   If the downloaded robots.txt file is older than 7 days the robots parser tries to
   download the robots.txt with the if-modified-since header to avoid unnecessary downloads
   if the file was not changed. Additionally the ETag header is used to detect changes.
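The revalidation logic described above can be sketched as two small helpers: one decides whether a cached robots.txt is older than 7 days, the other builds the conditional request headers. Sending the ETag back as `If-None-Match`, and which stored date goes into `If-Modified-Since`, are assumptions, because the commit message does not spell them out. All names are hypothetical. A server that answers such a request with status 304 signals that the cached copy is still current.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; names and the exact header handling are assumptions.
class RobotsRevalidationSketch {
    static final Duration MAX_AGE = Duration.ofDays(7);

    // HTTP date format, e.g. "Wed, 05 Oct 2005 10:45:33 GMT".
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.ENGLISH)
                             .withZone(ZoneOffset.UTC);

    // True if the cached copy was loaded more than 7 days before 'now'.
    static boolean needsRevalidation(Instant loadedAt, Instant now) {
        return Duration.between(loadedAt, now).compareTo(MAX_AGE) > 0;
    }

    // Headers for a conditional GET; the ETag is optional.
    static Map<String, String> conditionalHeaders(Instant since, String etag) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("If-Modified-Since", HTTP_DATE.format(since));
        if (etag != null) headers.put("If-None-Match", etag);
        return headers;
    }

    public static void main(String[] args) {
        Instant loadedAt = Instant.parse("2005-10-05T10:45:33Z");
        Instant now = loadedAt.plus(Duration.ofDays(8));
        if (needsRevalidation(loadedAt, now)) {
            System.out.println(conditionalHeaders(loadedAt, "\"abc\""));
        }
    }
}
```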

*) Crawler: better handling of unsupported mimeTypes + FileExtension

*) Bugfix: plasmaWordIndexEntity was not closed correctly in 
   - query.java
   - plasmaswitchboard.java
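A common way to guard against this kind of leak is to close the resource in a finally block. The sketch below uses a hypothetical stand-in class, not the real plasmaWordIndexEntity, and is not claimed to be the fix that was applied.

```java
// Hypothetical sketch; Entity is a stand-in, not the real plasmaWordIndexEntity.
class CloseSketch {
    static class Entity implements AutoCloseable {
        private boolean closed = false;

        int size() {
            if (closed) throw new IllegalStateException("already closed");
            return 42;
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Reads from the entity and closes it on normal return and on exception alike.
    static int sizeOf(Entity entity) {
        try {
            return entity.size();
        } finally {
            entity.close();
        }
    }

    public static void main(String[] args) {
        Entity entity = new Entity();
        System.out.println(sizeOf(entity));    // 42
        System.out.println(entity.isClosed()); // true
    }
}
```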

*) function minimizeUrlDB added to yacy.java 
   this function tests the current urlHashDB for unused urls
   ATTENTION: please don't use this function at the moment because
              it causes the wordIndexDB to flush all words into the
              word directory!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@853 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-05 10:45:33 +00:00
orbiter
3c1d968d29 fix-fix for 792 and small changes in ftpc/download/dir experiments
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@797 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-26 10:36:42 +00:00
orbiter
dc474aa22f various bug-fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@792 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-09-26 01:10:41 +00:00
orbiter
2d8557cb10 minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@487 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-03 02:02:39 +00:00
theli
228b04b499 *) Bugfix for "wrong seed-upload timestamp" problem
http://www.yacy-forum.de/viewtopic.php?t=817

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@480 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-08-02 15:36:10 +00:00
orbiter
85877413a0 tried to fix principal bug .. not succeeded
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@440 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-27 13:38:46 +00:00
orbiter
3470a72d48 fixed div by zero, set default delays, fixed release number format and display
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@435 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-26 11:47:50 +00:00
allo
45378323c3 stupid mistake
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@430 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-25 09:52:48 +00:00
allo
e16e4ba32b staticIP/DynDns Settings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-20 08:53:47 +00:00
theli
0e2c33ee55 *) Network.html/Network.java:
- Adding function to manually force peer ping to remote yacy peer
  See:Network.html?page=4
- for debugging purpose only!

*) serverAbstractThread.java:
- Adding possibility to notify a server thread via a synchronization object
- this is needed e.g. by the port forwarding feature to send a notification
  to the peerPing thread to redo peer-ping with the new ip/port Settings_p.html
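Waking an idle thread through a shared synchronization object could be sketched as follows, using Java's built-in wait/notify. The class and method names are hypothetical and do not mirror serverAbstractThread.java.

```java
// Hypothetical sketch; names do not mirror serverAbstractThread.java.
class NotifySketch {
    private final Object sync = new Object();
    private boolean wakeRequested = false;

    // Called from another thread, e.g. after the ip/port settings changed.
    void notifyThread() {
        synchronized (sync) {
            wakeRequested = true;
            sync.notifyAll();
        }
    }

    // Sleeps for at most maxMillis; returns true if woken early by notifyThread().
    boolean idle(long maxMillis) {
        long deadline = System.currentTimeMillis() + maxMillis;
        synchronized (sync) {
            try {
                long remaining = maxMillis;
                // The loop guards against spurious wakeups.
                while (!wakeRequested && remaining > 0) {
                    sync.wait(remaining);
                    remaining = deadline - System.currentTimeMillis();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            boolean woken = wakeRequested;
            wakeRequested = false;
            return woken;
        }
    }

    public static void main(String[] args) {
        NotifySketch peerPing = new NotifySketch();
        new Thread(peerPing::notifyThread).start();
        System.out.println(peerPing.idle(60_000)); // true
    }
}
```

Keeping the flag next to the monitor means a notification that arrives before the thread starts waiting is not lost.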

*) Port Forwarding Feature (it should work now)
- adding a serverThread which is responsible to detect broken port forwarding 
  connections and to do reconnect if needed
- serverCore.java: moving port forwarding initialization into a separate function
- adding possibility to configure the ssh port 
- moving configuration section on the gui into a separate fieldset
- hello.java: only trying to do a second connect to the clientIp address during
  peer handshake if either remote port forwarding is not enabled locally or
  the clientIP is not equal to any local ip

*) httpdFileHandler.java:
- print out a more verbose error message

*) httpc.java
- allowing to deactivate content encoding from outside


 

*) plasmaCrawlWorker.java
- the crawler worker now tries to refetch the content of a website without
  gzip content encoding if a gzip error occurred
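The retry behaviour could be sketched like this. The `Fetcher` interface is a hypothetical stand-in for the HTTP client, and treating `java.util.zip.ZipException` as "a gzip error" is an assumption about which failure triggers the second attempt.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipException;

// Hypothetical sketch; Fetcher stands in for the real HTTP client.
class GzipFallbackSketch {
    interface Fetcher {
        byte[] fetch(String url, boolean acceptGzip) throws IOException;
    }

    // First attempt allows gzip; a gzip decoding error triggers one plain retry.
    static byte[] fetchWithFallback(Fetcher fetcher, String url) {
        try {
            try {
                return fetcher.fetch(url, true);
            } catch (ZipException gzipError) {
                return fetcher.fetch(url, false);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Simulated server whose gzip stream is broken.
        Fetcher broken = (url, acceptGzip) -> {
            if (acceptGzip) throw new ZipException("Not in GZIP format");
            return "plain content".getBytes();
        };
        System.out.println(new String(fetchWithFallback(broken, "http://example.org/")));
    }
}
```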



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@368 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-07-04 11:09:48 +00:00
theli
9d8c66fb5e *) adding possibility to forward received yacy-messages (htroot/yacy/message.java)
   via a command-line email program (e.g. sendmail) to a configured email address
   - the configuration dialog is reachable via Settings_p.html#messageForwarding

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@332 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-29 09:50:48 +00:00
theli
3227a9eba5 *) Adding retry function for seed uploading
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@298 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-17 08:45:44 +00:00
theli
9a98988c3c *) Bugfix for SSL/NIO Bug
See: http://www.yacy-forum.de/viewtopic.php?t=516
   - removing NIO from server/serverCore.java because of massive problems
     with socket close issues
*) Adding support for remote port forwarding via ssh
   @Orbiter: Please take a look into
   - hello.java
   - server/serverCore.java.publicIP()
   - yacy/yacyClient.java.publishMySeed(...)
*) Making startup loading of additional content parsers more failsafe


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@281 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-16 07:28:07 +00:00
theli
a566588e9b *) adding configuration section for new http keep-alive support
*) moving transparent proxy configuration into new config section


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@238 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-09 09:44:07 +00:00
orbiter
594c591223 changes towards 0.38
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@208 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-03 02:43:35 +00:00
orbiter
6f09251bbc added peer-Name settings to Settings page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@172 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-24 15:47:50 +00:00
theli
0e1d9e9722 *) shrinking httpc linebuffer when httpc is returned to pool. This is done to free memory
*) Making Seed-Upload configuration more verbose.
*) Some Changes in SOAP Search API (not finished yet).

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@158 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-23 10:10:51 +00:00
theli
b625aa91fd *) Trying to solve Seed-Upload-Configuration - "Error with submitted information. Nothing changed." Bug:
see: http://www.yacy-forum.de/viewtopic.php?p=3233

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@157 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-23 09:12:20 +00:00
theli
361f05978d Multiple updates regarding the yacy seedUpload facility,
optional content parsers, thread pool configuration ...

Please help me test whether everything works correctly.

*) Migration of yacy seedUpload functionality
See: http://www.yacy-forum.de/viewtopic.php?t=256
- new uploaders can now be easily introduced because of a new modular uploader system
- default uploaders are: none, file, ftp
- adding optional uploader for scp
- each uploader provides its own configuration file that will be 
  included into the settings page using the new template include feature
- Each uploader can define its libx dependencies. If not all needed libs are
  available, the uploader is deactivated automatically.
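One way such a dependency check might work is to probe for a class from each required library and deactivate the module if any probe fails. This is only a sketch: the commit message does not say how dependencies are declared or verified, and the class names below are hypothetical.

```java
// Hypothetical sketch; how YaCy really declares and checks libx dependencies is not shown here.
class DependencyCheckSketch {
    // True only if every named class can be found on the classpath.
    static boolean allAvailable(String... requiredClasses) {
        ClassLoader loader = DependencyCheckSketch.class.getClassLoader();
        for (String name : requiredClasses) {
            try {
                Class.forName(name, false, loader); // look up without initializing the class
            } catch (ClassNotFoundException | LinkageError e) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allAvailable("java.lang.String"));               // true
        System.out.println(allAvailable("no.such.library.ScpUploaderLib")); // false
    }
}
```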

*) Migration of optional parsers
See: http://www.yacy-forum.de/viewtopic.php?t=198
- Parsers can now also define their libx dependencies
- adding parser for bzip compressed content
- adding parser for gzip compressed content
- adding parser for zip files
- adding parser for tar files
- adding parser to detect the mime-type of a file
  this is needed by the bzip/gzip Parser.java
- adding parser for rtf files
- removing extra configuration file yacy.parser
  the list of enabled parsers is now stored in the main config file

*) Adding configuration option in the performance dialog to configure
See: http://www.yacy-forum.de/viewtopic.php?t=267
- maxActive / maxIdle / minIdle values for httpd-session-threadpool
- maxActive / maxIdle / minIdle values for crawler-threadpool

*) Changing Crawling Filter behaviour
See: http://www.yacy-forum.de/viewtopic.php?p=2631

*) Replacing some hardcoded strings with the proper constants of the httpHeader class

*) Adding new libs to libx directory. These libs are
- needed by new content parsers
- needed by new optional seed uploader
- needed by SOAP API (which will be committed later)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@126 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-17 08:25:04 +00:00
rramthun
85c2f3be8a Fixed spelling mistakes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@110 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-12 17:50:45 +00:00
theli
74f12bb0f3 *) adding transparent proxy support
   Now a firewall can transparently redirect all
   http traffic through yacy.
   

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@96 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-08 22:36:26 +00:00
theli
e7f7aa0bb9 *) Import statements reorganized
Now it's easier to determine which class really uses which other class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@83 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-05 05:36:42 +00:00
theli
351c86d5d9 *) Migration of optional Content Parser integration
- each additional parser must be in a subpackage 
  of plasma.parser
- each parser must have its own ant build file (which will 
  be called automatically from the main build file)
- Calling the main build file results in building a separate 
  zip file for each optional parser. This zip file includes:
  + sources of the Parser.java
  + compiled classes of the Parser.java
  + needed additional libs (libx)
- To install an additional parser the user simply needs to
  extract the zip file listed above into his/her yacy directory.
- The configuration (enabling/disabling) of a parser can be done
  via the web interface (currently the settings dialog) and is
  done "on-the-fly". The installation cannot be done "on-the-fly"
  at the moment because of classpath issues.
- The classpath of the linux startup/stop scripts is generated 
  automatically now (including all libraries from lib and libx).

*) Bugfix: File Extension was not calculated correctly by the crawler
   e.g.: file extension was accidentally: .php?param=value
   Corrected.
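   A minimal sketch of an extension lookup that ignores the query string, assuming the URL parses with java.net.URI. The method name is hypothetical and this is not the code from the commit.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical sketch; not the crawler's actual implementation.
class FileExtensionSketch {
    // Returns the lower-cased extension of the last path segment, or "" if there is none.
    static String extensionOf(String url) {
        String path = URI.create(url).getPath(); // path only, without "?query" or "#fragment"
        if (path == null) return "";
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("http://example.org/index.php?param=value")); // php
    }
}
```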

*) Adding additional parser for parsing of rss/atom feeds
- added needed libs to do this.

TODO:
- automatic building classpath for windows startup scripts


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@78 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-03 09:47:56 +00:00
orbiter
ba16da72b4 fixed not-working kelondroRecords-Cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@56 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 14:46:59 +00:00
orbiter
7eb3c81aad name check on new peer names
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-08 23:15:03 +00:00
orbiter
248077d3f0 initial load with yacy 0.36
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-07 19:19:42 +00:00