Commit Graph

58 Commits

Author SHA1 Message Date
orbiter
8a4f297324 fixed/enhanced snippet error-handling; suppression of results where no snippet exists
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@347 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-30 00:01:53 +00:00
orbiter
712fe9ef18 bugfixed utf-8 decoding and parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@346 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-29 22:55:37 +00:00
orbiter
3addf58046 enhanced snippet-loading with threads
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@322 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-24 07:41:07 +00:00
orbiter
56d28a16f0 bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@320 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-23 14:40:39 +00:00
orbiter
d6c85228a6 enhanced snippet computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@319 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-23 12:12:12 +00:00
theli
aae9a433a6 *) correcting usage of supportedFileExt-List
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@315 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-23 07:43:59 +00:00
orbiter
1e7f062350 many bugfixes, memory leak fixes, performance enhancements; new kelondroHashtable; activated snippets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@313 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-23 02:07:45 +00:00
orbiter
68dc2b0c6b added kelondroArray, the basis for upcoming kelondroHash and some bug fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@311 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-21 01:17:25 +00:00
orbiter
a19541e563 code-enhancements after analysis with AppPerfect
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@307 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-20 16:36:31 +00:00
orbiter
85075269a6 extended fail-safe memory-managament. prevents too much allocation, too often GC and should help for the 100%CPU-bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@303 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-20 00:46:23 +00:00
orbiter
e3c92818db avoiding OutOfMemoryError routines
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@302 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-19 13:37:17 +00:00
orbiter
3e8ee5a46d enhanced caching in kelondroRecords and added better synchronization/finalizer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@301 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-19 05:27:42 +00:00
orbiter
5d06ded005 enhanced html parser speed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-17 01:26:51 +00:00
orbiter
5a490aa065 fixed html parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@289 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-16 21:49:56 +00:00
orbiter
a25b5b4986 fixed possible memory leak in htmlScraper: be aware that now links can get lost; further work necessary
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@288 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-16 18:31:28 +00:00
theli
9e47ba5ad6 *) adding missing calls for function close() to avoid "too many open file" bug
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@282 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-16 08:34:52 +00:00
orbiter
a1ffc27041 preparations for image/movie/music indexing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@280 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-16 00:31:13 +00:00
orbiter
a5b40923b6 added word migration to assortments (start with 'java -classpath classes yacy -migratewords')
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@278 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-15 01:22:07 +00:00
theli
ee9e110366 *) removing old logging configuration properties from yacy.init
*) serverLog.java logging functions now also accept exceptions als
   additional parameters.
   The Stacktrace of this ecceptions will then be appended to the 
   logging message and can e.g. be viewed on the gui logging page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@265 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-10 09:19:24 +00:00
theli
c1a4e0dc28 *) changing reference to logger
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@252 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-09 10:44:55 +00:00
orbiter
4574fa4ce7 bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@224 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-08 15:28:29 +00:00
orbiter
33f9315e58 implemented multithreading of indexing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@221 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-08 13:19:05 +00:00
orbiter
ca3b4ccaf4 added snippet-routines (not yet finished)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@218 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-08 00:52:24 +00:00
orbiter
594c591223 changes towards 0.38
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@208 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-03 02:43:35 +00:00
orbiter
d8fdc2526e added experimental snipplet-generation (to be disabled for 0.38)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@206 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-02 01:33:10 +00:00
orbiter
3771b10b89 implemented automated migration indexCache 0.37 -> indexAssortmentCluster
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@205 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-06-01 14:24:25 +00:00
orbiter
e89ded9e41 bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@204 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-31 22:12:43 +00:00
orbiter
3d8a2ff937 enhanced parallelization of local/global/remote crawling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@197 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-29 11:56:40 +00:00
orbiter
21110dcd5e fixed bugs with open files and caching
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@175 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-25 13:48:48 +00:00
theli
74eb21f62e *) adding image tag into rss template
*) adding a xslt stylesheet so that the rss document can be viewed in a normal webbrowser
*) adding pubDate tag to each search item

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@173 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-25 08:47:34 +00:00
orbiter
5c6147a54c introduced assortment structure (generalization of singletons)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@139 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-18 21:52:17 +00:00
theli
73e297f30f *) adding proper default values for RealtimeParsableMimeTypes if something goes wrong with the configuration file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@132 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-18 07:40:48 +00:00
theli
361f05978d Multiple updates regarding the yacy seedUpload facility,
optional content parsers, thread pool configuration ...

Please help me testing if everything works correct.

*) Migration of yacy seedUpload functionality
See: http://www.yacy-forum.de/viewtopic.php?t=256
- new uploaders can now be easily introduced because of a new modulare uploader system
- default uploaders are: none, file, ftp
- adding optional uploader for scp
- each uploader provides its own configuration file that will be 
  included into the settings page using the new template include feature
- Each uploader can define its libx dependencies. If not all needed libs are
  available, the uploader is deactivated automatically.

*) Migration of optional parsers
See: http://www.yacy-forum.de/viewtopic.php?t=198
- Parsers can now also define there libx dependencies
- adding parser for bzip compressed content
- adding parser for gzip compressed content
- adding parser for zip files
- adding parser for tar files
- adding parser to detect the mime-type of a file
  this is needed by the bzip/gzip Parser.java
- adding parser for rtf files
- removing extra configuration file yacy.parser
  the list of enabled parsers is now stored in the main config file

*) Adding configuration option in the performance dialog to configure
See: http://www.yacy-forum.de/viewtopic.php?t=267
- maxActive / maxIdle / minIdle values for httpd-session-threadpool
- maxActive / maxIdle / minIdle values for crawler-threadpool

*) Changing Crawling Filter behaviour
See: http://www.yacy-forum.de/viewtopic.php?p=2631

*) Replacing some hardcoded strings with the proper constants of the httpHeader class

*) Adding new libs to libx directory. This libs are
- needed by new content parsers
- needed by new optional seed uploader
- needed by SOAP API (which will be committed later)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@126 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-17 08:25:04 +00:00
theli
ddc5675781 *) Correcting typo
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@120 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-14 11:14:34 +00:00
theli
d2c4e9a55e *) Implementing yacy forum wishlist item: "Pause Crawling"
see: http://www.yacy-forum.de/viewtopic.php?t=48



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@118 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-14 09:41:05 +00:00
orbiter
b4030e5023 implemented serverSwitchActions - action-hooks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-11 14:58:03 +00:00
orbiter
1d7fed87dc redesign of index caching - removed indexCache.db
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@86 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-07 21:11:18 +00:00
rramthun
3f85978519 Fixed one spelling mistake, limited input for ICQ numbers to 9 digits and made ICQ number in peer profiles clickable.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@85 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-07 21:07:43 +00:00
theli
2aa5fe8f50 *) Import statements reorganized
Now it's easier to determine which class really uses which other class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@82 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-05-05 05:32:19 +00:00
orbiter
48650c082c fixed 100%-CPU-Bug in plasmaCondenser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@72 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-29 12:07:13 +00:00
orbiter
995673d795 several bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@71 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-28 22:04:57 +00:00
orbiter
2de90020ed fixed caching+synchronization+brute-force-denial
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@67 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-27 21:09:40 +00:00
orbiter
9156fd53bc fixed bugs in last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@65 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-26 15:47:33 +00:00
orbiter
e25f2354c2 removed synchronization and thread blockings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@63 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-26 14:19:44 +00:00
theli
58a65b60bd *) synchronized keyword removed from function processLocalCrawling to avoid deadlocks.
This synchronized keyword is not needed anymore because of the crawler jobqueue which
   is responsible for the synchronization now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@60 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-26 06:59:36 +00:00
theli
65fc650109 *) plasmaCrawlLoader shutdown problem fixed (hopefully)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@59 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 16:34:16 +00:00
orbiter
ba16da72b4 fixed not-working kelondroRecords-Cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@56 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-25 14:46:59 +00:00
orbiter
7fb645b0ab enhanced crawling performance, changed memory settings, new performace options
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@51 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-24 23:15:40 +00:00
theli
58b1a0ba40 *) adding an new package for extra content parsers
*) adding content parser for
- pdf (using the pdf-box library)
- doc (using the textmining.org library)
*) adding a Interface for content parsers
*) adding a configuration file which can be used to configure which parser is used for which mimeType
*) Sempahore class was moved and renamed to serverSemaphore
*) Changing yacy shutdown behaviour
Buzy waiting loop for shutdown was removed and replaced with a blocking call (using the semaphore class mentioned above) to the new switchboard.waitForShutdown method.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@46 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-24 21:24:53 +00:00
orbiter
8b31f9e202 enhanced shut-down behaviour & added experimental nio-wrapper for kelondroRA (not active yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@44 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-04-23 13:00:56 +00:00