Commit Graph

378 Commits

Author SHA1 Message Date
theli
351c86d5d9 *) Migration of optional Content Parser integration
- each additional parser must be in a subpackage 
  of plasma.parser
- each parser must have its own ant build file (which will 
  be called automatically from the main build file)
- Calling the main build file results in building a separate 
  zip file for each optional parser. This zip file includes:
  + sources of the Parser.java
  + compiled classes of the Parser.java
  + needed additional libs (libx)
- To install an additional parser the user simply needs to
  extract the zip file listed above into his/her yacy directory.
- The configuration (enabling/disabling) of a parser can be done
  via the web interface (currently the settings dialog) and is
  done "on-the-fly". The installation cannot be done "on-the-fly"
  at the moment because of classpath issues.
- The classpath of the linux startup/stop scripts is generated 
  automatically now (including all libraries from lib and libx).

*) Bugfix: File Extension was not calculated correctly by the crawler
   e.g.: file extension was accidentally: .php?param=value
   Corrected.

*) Adding additional parser for parsing of rss/atom feeds
- added needed libs to do this.

TODO:
- automatic building classpath for windows startup scripts


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@78 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-03 09:47:56 +00:00
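
The file-extension bugfix in commit 351c86d5d9 above can be illustrated with a small sketch. This is not YaCy's actual code: the class name UrlFileExtension and the method of(...) are hypothetical. It only shows the general idea of cutting the query string (and fragment) off before looking for the last dot, so that ".php?param=value" becomes "php".

```java
import java.util.Locale;

// Hypothetical helper, not taken from the YaCy sources.
// Derives the file extension from the path part of a URL only.
final class UrlFileExtension {

    private UrlFileExtension() {}

    static String of(String url) {
        String s = url;

        // Drop the fragment first, then the query string.
        int hash = s.indexOf('#');
        if (hash >= 0) s = s.substring(0, hash);
        int query = s.indexOf('?');
        if (query >= 0) s = s.substring(0, query);

        // Skip "scheme://host" so a dot in the host name is never mistaken
        // for the start of an extension.
        int scheme = s.indexOf("://");
        int pathStart = (scheme >= 0) ? s.indexOf('/', scheme + 3) : 0;
        if (pathStart < 0) return ""; // URL has no path at all

        // Only the last path segment can carry an extension.
        String path = s.substring(pathStart);
        String name = path.substring(path.lastIndexOf('/') + 1);

        int dot = name.lastIndexOf('.');
        if (dot <= 0 || dot == name.length() - 1) return ""; // none, or dot-file
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}
```

Calling UrlFileExtension.of("http://example.org/index.php?param=value") yields "php" in this sketch; the example URL is made up for illustration.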
allo
1a4ad5a0ac updated the version number ...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@77 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-02 19:16:45 +00:00
orbiter
d0010ff0b0 last changes for release 0.37
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@76 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-02 12:23:15 +00:00
orbiter
f99930c04b fixed brute-force + peer-disconnect - Bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@75 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-01 23:31:21 +00:00
allo
4856f04797 set version
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@74 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-30 11:08:59 +00:00
orbiter
c7c6aaf06e many bug-fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@73 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-30 01:22:46 +00:00
orbiter
48650c082c fixed 100%-CPU-Bug in plasmaCondenser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@72 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-29 12:07:13 +00:00
orbiter
995673d795 several bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@71 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-28 22:04:57 +00:00
allo
52abc456fb new Templates
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@70 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-28 19:38:35 +00:00
allo
9cc4171d6d The version is set in build.xml, too (@mc: update both places on a new version).
The target "all" builds YaCy (like make) and the target "dist" zips the addons (like make install).
(todo: tar.gz creation in the target)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@69 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-28 18:32:50 +00:00
theli
fdf206239a *) adding separate targets to build zip files for each optional content parser.
This archive file can then be extracted to the yacy root to add the new yacy feature.

TODO: but how to solve the java classpath problem for the yacy startup scripts?

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@68 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-28 09:59:44 +00:00
orbiter
2de90020ed fixed caching+synchronization+brute-force-denial
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@67 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-27 21:09:40 +00:00
rramthun
56409402f0 Fixed some spelling mistakes...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@66 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-26 18:10:40 +00:00
orbiter
9156fd53bc fixed bugs in last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@65 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-26 15:47:33 +00:00
rramthun
9cb8779208 Fixed some spelling mistakes...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@64 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-26 15:38:44 +00:00
orbiter
e25f2354c2 removed synchronization and thread blockings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@63 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-26 14:19:44 +00:00
theli
3756e6d20f *) "Httpc object was not returned to object pool." bug fixed.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@62 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-26 10:38:35 +00:00
theli
47e426ff7e *) one possible deadlock (because of nested object locks) removed in class kelondroMap
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@61 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-26 08:33:59 +00:00
theli
58a65b60bd *) synchronized keyword removed from function processLocalCrawling to avoid deadlocks.
This synchronized keyword is not needed anymore because of the crawler jobqueue which
   is responsible for the synchronization now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@60 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-26 06:59:36 +00:00
theli
65fc650109 *) plasmaCrawlLoader shutdown problem fixed (hopefully)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@59 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 16:34:16 +00:00
rramthun
19e69f0efd Changed <head> YACY into YaCy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@58 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 15:34:18 +00:00
allo
4c8cc101d6 Bugfix: Do not show the first X lines, but the last X lines of the log
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@57 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 15:10:24 +00:00
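
The behaviour fixed in commit 4c8cc101d6 above (showing the last X lines of the log rather than the first X) can be sketched with a bounded window over the lines. The class LogTail and the method lastLines(...) are hypothetical names, not YaCy's actual implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not taken from the YaCy sources.
final class LogTail {

    private LogTail() {}

    // Returns at most `count` lines from the end of `lines`, in original order.
    static List<String> lastLines(Iterable<String> lines, int count) {
        if (count <= 0) return new ArrayList<>();

        // Sliding window: once it is full, the oldest line is dropped
        // each time a newer one arrives.
        Deque<String> window = new ArrayDeque<>();
        for (String line : lines) {
            if (window.size() == count) window.removeFirst();
            window.addLast(line);
        }
        return new ArrayList<>(window);
    }
}
```

Because only the window is kept in memory, this approach also works for logs that are much longer than X lines.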
orbiter
ba16da72b4 fixed not-working kelondroRecords-Cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@56 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 14:46:59 +00:00
orbiter
d03d60f8b5 separated yacy-core from yacy-libx; fixed makerelease
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@55 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 12:42:14 +00:00
allo
c09c54c652 staticIP Property, for people with dyndns aliases ;-)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@54 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 12:34:11 +00:00
allo
d005d7484e yacyDebugMode - allow Lan-IPs for testing
where was the Code from 0.25 lost?


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@53 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 12:13:49 +00:00
orbiter
2d9fc71af1 adapted makerelease and start/stop-scripts for new libs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@52 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 11:35:50 +00:00
orbiter
7fb645b0ab enhanced crawling performance, changed memory settings, new performance options
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@51 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-24 23:15:40 +00:00
theli
10078bb354 *) date string was accidentally replaced with the current value
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@50 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-24 21:55:52 +00:00
theli
fd584c113c *) some minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@49 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-24 21:52:11 +00:00
theli
f44b219e44 *) Eclipse has accidentally copied in the wrong file header into the new files (because these headers were accidentally set as default for the whole workspace instead of the project)
Fixed.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@48 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-24 21:47:34 +00:00
theli
081ebd5517 *) I've accidentally used Java 5.0 syntax for enumerations
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@47 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-24 21:42:02 +00:00
theli
58b1a0ba40 *) adding a new package for extra content parsers
*) adding content parser for
- pdf (using the pdf-box library)
- doc (using the textmining.org library)
*) adding an interface for content parsers
*) adding a configuration file which can be used to configure which parser is used for which mimeType
*) Semaphore class was moved and renamed to serverSemaphore
*) Changing yacy shutdown behaviour
Busy waiting loop for shutdown was removed and replaced with a blocking call (using the semaphore class mentioned above) to the new switchboard.waitForShutdown method.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@46 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-24 21:24:53 +00:00
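
The shutdown change in commit 58b1a0ba40 above (a blocking call on a semaphore instead of a busy-waiting loop) might look roughly like the following. This is a sketch under assumptions: it uses java.util.concurrent.Semaphore, whereas the commit refers to YaCy's own serverSemaphore class, and the class name ShutdownSignal and the method requestShutdown() are hypothetical. Only the name waitForShutdown comes from the commit message.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the YaCy switchboard code.
final class ShutdownSignal {

    // Starts with zero permits, so every waiter blocks until shutdown is requested.
    private final Semaphore permit = new Semaphore(0);

    // Called by whoever triggers the shutdown.
    void requestShutdown() {
        permit.release();
    }

    // Blocks without consuming CPU until requestShutdown() has been called.
    void waitForShutdown() throws InterruptedException {
        permit.acquire();
        permit.release(); // pass the permit on so other waiters wake up too
    }

    // Timed variant: returns true if shutdown was requested within the timeout.
    boolean waitForShutdown(long timeout, TimeUnit unit) {
        try {
            if (!permit.tryAcquire(timeout, unit)) return false;
            permit.release();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

Compared with a loop that repeatedly sleeps and checks a flag, the waiting thread here is parked by the runtime and resumes as soon as the permit is released.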
(no author)
17d993cfee *) adding directory and classes needed by the new content parsers for pdf + doc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@45 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-24 21:12:11 +00:00
orbiter
8b31f9e202 enhanced shut-down behaviour & added experimental nio-wrapper for kelondroRA (not active yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@44 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-23 13:00:56 +00:00
rramthun
ff21586a27 Fixed some spelling mistakes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@43 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-22 15:29:28 +00:00
orbiter
87a61a01c2 fixed bad-gzip-trailer behaviour (now cuts off trailer)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@42 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-22 13:45:07 +00:00
orbiter
00f223cfc1 fixed post-parsing (a case when the bluelist is empty)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@41 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-21 17:13:43 +00:00
allo
044b93412a Copyright notice
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@40 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-21 13:49:51 +00:00
theli
c9c0a1f11c *) Trying to speedup local crawling
- introduction of a threadpool for crawling
- introduction of a job queue to avoid busy waiting for a free crawler slot

*) New classes added
- queue for receiving of crawler jobs
- semaphore class to do reader/writer synchronization (mutual exclusion)
- message object to hold all needed data about a crawler job

*) Trying to solve session-thread shutdown problem
- session thread stopped variable is now set from outside before interrupting the
  session thread.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@39 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-21 10:31:40 +00:00
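
The combination described in commit c9c0a1f11c above (a pool of crawler threads fed by a job queue, with a message object per job) can be sketched as follows. This is an illustration only: it relies on java.util.concurrent.BlockingQueue rather than the queue and semaphore classes the commit added, and the names CrawlJobQueueSketch, CrawlJob and processAll(...) are hypothetical. The "crawl" step merely records the URL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch, not the YaCy crawler code.
final class CrawlJobQueueSketch {

    // Message object holding the data a worker needs for one job.
    static final class CrawlJob {
        final String url;
        final int depth;

        CrawlJob(String url, int depth) {
            this.url = url;
            this.depth = depth;
        }
    }

    // Sentinel job that tells a worker to stop.
    private static final CrawlJob STOP = new CrawlJob("", -1);

    private CrawlJobQueueSketch() {}

    // Processes all URLs with `workerCount` threads; returns the URLs handled.
    static Set<String> processAll(List<String> urls, int workerCount) {
        BlockingQueue<CrawlJob> queue = new LinkedBlockingQueue<>();
        Set<String> processed = ConcurrentHashMap.newKeySet();
        List<Thread> workers = new ArrayList<>();

        for (int i = 0; i < workerCount; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        CrawlJob job = queue.take(); // blocks; no busy waiting
                        if (job == STOP) return;
                        processed.add(job.url);      // placeholder for real crawling
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }

        for (String url : urls) queue.add(new CrawlJob(url, 0));
        for (int i = 0; i < workerCount; i++) queue.add(STOP); // one per worker

        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return processed;
    }
}
```

Idle workers block inside queue.take(), so a free crawler slot picks up the next job as soon as one is enqueued instead of being polled for.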
rramthun
4e429ae243 Fixed *.bat start-scripts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@38 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-20 19:44:00 +00:00
rramthun
ce7d8c4fe0 Fixed some spelling mistakes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@37 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-20 16:01:14 +00:00
rramthun
570de9c4f4 Fixed some spelling mistakes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@36 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-20 13:28:32 +00:00
(no author)
942914ffd2 *) Adding additional functions to serverByteBuffer so that it
can be used instead of a ByteArrayOutputStream
*) Using a serverByteBuffer for lineBuffering in class httpc
   instead of a ByteArrayOutputStream

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@35 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-20 07:39:40 +00:00
(no author)
432e01910b *) Bugfix: Image falsification
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@34 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-20 06:41:52 +00:00
orbiter
97ec8d65e4 fixed makerelease & clean-up of dead code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@33 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-19 14:04:16 +00:00
rramthun
b61567a39e Fixed spelling mistake and inserted author as described on the mailing list
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@32 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-19 13:56:59 +00:00
(no author)
4a76ccc6d6 *) Some minor bugfixes
- httpc: wrong error-message on 404
- httpc: error message was accidentally shown when object 
  was released from pool


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@31 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-19 10:42:48 +00:00
(no author)
1fec00bc24 *) Bugfix to avoid NullPointerExceptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@30 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-19 10:39:58 +00:00
(no author)
e2a884031c *) added new lib dir to classpath.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@29 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-19 08:17:16 +00:00