Commit Graph

228 Commits

Author SHA1 Message Date
theli
af9cd67334 *) migration to Java NIO
   - to avoid busy waiting and
   - make socket blocking interruptible
*) changing reference to logger
*) introduce commandObjMethodCache to improve performance
*) doing a stream shutdown before closing the connection
   to avoid problems when using persistent connections
*) calling method UNKNOWN of the server-handler class when 
   receiving an unknown command
*) calling method EMPTY of the server-handler class when
   receiving an empty command

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@254 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-09 10:51:56 +00:00
theli
1eff96f471 *) removing busy waiting
*) changing reference to logger

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-09 10:47:50 +00:00
theli
829b65c1c8 *) adding additional classes needed for new logging
- ConsoleOutErrHandler.java used to log warnings/errors to stderr 
  and all other messages to stdout
- GuiHandler.java
  used to keep logging messages in memory that can then be viewed
  via the http gui
- serverSimpleLogFormatter.java
  needed to format logging messages for FileHandler, ConsoleOutErrHandler
  and GuiHandler
- serverMiniLogFormatter.java
  needed for proxy access logging

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@233 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-09 09:25:08 +00:00
low012
8c2789b22a to catch is an irregular verb
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@225 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-09 00:07:25 +00:00
orbiter
4574fa4ce7 bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@224 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-08 15:28:29 +00:00
theli
83b41ef2f7 *) Adding timeouts for shutdown
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@223 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-08 13:44:25 +00:00
orbiter
ca3b4ccaf4 added snippet-routines (not yet finished)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@218 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-08 00:52:24 +00:00
orbiter
3771b10b89 implemented automated migration indexCache 0.37 -> indexAssortmentCluster
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@205 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-01 14:24:25 +00:00
theli
0e1d9e9722 *) shrinking httpc linebuffer when httpc is returned to pool. This is done to free memory
*) Making Seed-Upload configuration more verbose.
*) Some Changes in SOAP Search API (not finished yet).

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@158 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-23 10:10:51 +00:00
theli
361f05978d Multiple updates regarding the yacy seedUpload facility,
optional content parsers, thread pool configuration ...

Please help me test whether everything works correctly.

*) Migration of yacy seedUpload functionality
See: http://www.yacy-forum.de/viewtopic.php?t=256
- new uploaders can now be easily introduced because of a new modular uploader system
- default uploaders are: none, file, ftp
- adding optional uploader for scp
- each uploader provides its own configuration file that will be 
  included into the settings page using the new template include feature
- Each uploader can define its libx dependencies. If not all needed libs are
  available, the uploader is deactivated automatically.

*) Migration of optional parsers
See: http://www.yacy-forum.de/viewtopic.php?t=198
- Parsers can now also define their libx dependencies
- adding parser for bzip compressed content
- adding parser for gzip compressed content
- adding parser for zip files
- adding parser for tar files
- adding parser to detect the mime-type of a file
  this is needed by the bzip/gzip Parser.java
- adding parser for rtf files
- removing extra configuration file yacy.parser
  the list of enabled parsers is now stored in the main config file

*) Adding configuration option in the performance dialog to configure
See: http://www.yacy-forum.de/viewtopic.php?t=267
- maxActive / maxIdle / minIdle values for httpd-session-threadpool
- maxActive / maxIdle / minIdle values for crawler-threadpool

*) Changing Crawling Filter behaviour
See: http://www.yacy-forum.de/viewtopic.php?p=2631

*) Replacing some hardcoded strings with the proper constants of the httpHeader class

*) Adding new libs to libx directory. These libs are
- needed by new content parsers
- needed by new optional seed uploader
- needed by SOAP API (which will be committed later)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@126 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-17 08:25:04 +00:00
rramthun
2d751ba831 Fixed a spelling mistake
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@117 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-13 20:43:08 +00:00
orbiter
0cfe94bb66 fixed last commit + added missing files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@106 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-11 15:54:26 +00:00
orbiter
b4030e5023 implemented serverSwitchActions - action-hooks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-11 14:58:03 +00:00
theli
74f12bb0f3 *) adding transparent proxy support
Now a firewall can transparently redirect all 
   http traffic through yacy.
   

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@96 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-08 22:36:26 +00:00
theli
cbdc499ba6 *) adding many missing (File)?(Input|Output)Stream.close() calls to avoid the "Too many open files" bug.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@90 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-08 07:24:33 +00:00
theli
2aa5fe8f50 *) Import statements reorganized
Now it's easier to determine which class really uses which other class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@82 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-05 05:32:19 +00:00
theli
351c86d5d9 *) Migration of optional Content Parser integration
- each additional parser must be in a subpackage 
  of plasma.parser
- each parser must have its own ant build file (which will 
  be called automatically from the main build file)
- Calling the main build file results in building a separate 
  zip file for each optional parser. This zip file includes:
  + sources of the Parser.java
  + compiled classes of the Parser.java
  + needed additional libs (libx)
- To install an additional parser the user simply needs to
  extract the zip file listed above into his/her yacy directory.
- The configuration (enabling/disabling) of a parser can be done
  via the web interface (currently the settings dialog) and is
  done "on-the-fly". The installation cannot be done "on-the-fly"
  at the moment because of classpath issues.
- The classpath of the linux startup/stop scripts is generated 
  automatically now (including all libraries from lib and libx).

*) Bugfix: File Extension was not calculated correctly by the crawler
   e.g.: file extension was accidentally: .php?param=value
   Corrected.

*) Adding additional parser for parsing of rss/atom feeds
- added needed libs to do this.

TODO:
- automatically building the classpath for Windows startup scripts


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@78 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-03 09:47:56 +00:00
orbiter
d0010ff0b0 last changes for release 0.37
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@76 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-02 12:23:15 +00:00
orbiter
f99930c04b fixed brute-force + peer-disconnect - Bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@75 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-01 23:31:21 +00:00
orbiter
2de90020ed fixed caching+synchronization+brute-force-denial
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@67 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-27 21:09:40 +00:00
orbiter
7fb645b0ab enhanced crawling performance, changed memory settings, new performance options
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@51 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-24 23:15:40 +00:00
theli
58b1a0ba40 *) adding a new package for extra content parsers
*) adding content parser for
- pdf (using the pdf-box library)
- doc (using the textmining.org library)
*) adding an interface for content parsers
*) adding a configuration file which can be used to configure which parser is used for which mimeType
*) Semaphore class was moved and renamed to serverSemaphore
*) Changing yacy shutdown behaviour
Busy waiting loop for shutdown was removed and replaced with a blocking call (using the semaphore class mentioned above) to the new switchboard.waitForShutdown method.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@46 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-24 21:24:53 +00:00
theli
c9c0a1f11c *) Trying to speedup local crawling
- introduction of a threadpool for crawling
- introduction of a job queue to avoid busy waiting for a free crawler slot

*) New classes added
- queue for receiving of crawler jobs
- semaphore class to do reader/writer synchronization (mutual exclusion)
- message object to hold all needed data about a crawler job

*) Trying to solve session-thread shutdown problem
- session thread stopped variable is now set from outside before interrupting the
  session thread.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@39 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-21 10:31:40 +00:00
(no author)
942914ffd2 *) Adding additional functions to serverByteBuffer so that it
can be used instead of a ByteArrayOutputStream
*) Using a serverByteBuffer for lineBuffering in class httpc
   instead of a ByteArrayOutputStream

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@35 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-20 07:39:40 +00:00
orbiter
97ec8d65e4 fixed makerelease & clean-up of dead code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@33 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-19 14:04:16 +00:00
(no author)
f39812da91 *) Some performance improvements
- many classes set to final
- implementation of a session-thread pool
- reuse of the server handler class (normally the httpd object)
  within the session thread
- implementation of a httpc object pool
- introduction of a linebuffer in httpd which can be reused
- reusing the properties table in the httpc
- added two Apache libs (commons-collections, commons-pool) which
  are needed for the object/thread pool implementation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@26 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-19 06:55:57 +00:00
orbiter
c0807abd33 new crawl/proxy/cache design + fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@18 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-13 23:00:20 +00:00
orbiter
248077d3f0 initial load with yacy 0.36
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-07 19:19:42 +00:00