Commit Graph

100 Commits

Author SHA1 Message Date
danielr
dba7ba079e fixed NPE seen with queues_p.xml (serverClassLoader finds already loaded classes)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4952 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-23 16:55:46 +00:00
orbiter
f4ae8082c3 - better error analysis for ooRange Exception in kelondroBase64Ordering
- quadcore support for kelondroRowSet array ordering

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4932 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-15 23:25:57 +00:00
orbiter
e269c12710 small changes in partition routine
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4929 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-14 23:17:56 +00:00
orbiter
31efb8fbee - fix for LOG path generation when the DATA/LOG does not exists (fix for bug introduced in SVN 4923)
- some more/better asserts
- slight performance enhancements in remove method in index management. Works for all who do not run using asserts (the majority)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4928 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-14 22:51:47 +00:00
danielr
68c38c2d34 - WatchCrawler shows status without JavaScript
- Performance can be scaled + DHT-profile
- names for pool-threads
- some small refactorings


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4923 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-14 10:24:58 +00:00
danielr
7feae906aa - organize imports
- removed potential null pointer accesses
- removed unnecessary casts


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4893 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-06 16:01:27 +00:00
orbiter
5fde679acb - fixed problem in performance configuration
- extended rss fetch size for rssTerminal


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4798 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-13 15:28:55 +00:00
danielr
d4bce6affd refactoring (initialized static fields, removed empty if/else, serialized some fields in serializable classes)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4755 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-03 09:06:00 +00:00
orbiter
d0678f7ab9 refactoring as result of
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=959&p=7560#p7560

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4752 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-05-01 22:40:42 +00:00
orbiter
b9a2a2d287 more search performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4735 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-24 15:09:06 +00:00
orbiter
ff755fb858 small corrections and enhancements after search timing profiling
search should be a little bit faster now

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4734 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-24 13:31:55 +00:00
danielr
48ffd61e6a changed "patched wrong" to warning, so it goes to the logfile
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4716 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-19 07:54:44 +00:00
orbiter
2f629d20a7 - tried to fix the '4217666-problem'
- removed more unused code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4715 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-19 04:24:29 +00:00
orbiter
93376acdca fixed a bad chunkcache limit check which could have caused ArrayIndexOutOfBoundsExceptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4695 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-14 03:49:02 +00:00
orbiter
696b8ee3f5 fix for http://forum.yacy-websuche.de/viewtopic.php?p=6806#p6806
- removed all InputStream.available() because this does not work for files > 2GB
- iterator terminate when a IOException occurs
- added handling of non-executing index.add methods to enhance assert usage
- added index for file indexes > 2GB, to be used in new indexHeap

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4666 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-08 11:55:59 +00:00
orbiter
117ae78001 speed enhancement for reading of eco-table indexes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4647 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-06 11:50:15 +00:00
orbiter
3ce3a4a3a1 added stub for new index container heap data structure (purpose: index folding)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4627 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-30 22:58:42 +00:00
orbiter
d6050b9ffb - separated the LURL data storage and Crawl result stack for process supervision.
this is another step to enable multiple, concurrent fulltext-indexes
- another try to make the yacy-httpc more stable

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4602 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-26 14:13:05 +00:00
orbiter
fba46c51d7 fixed non-termination bug in qsort
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4593 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-22 23:15:28 +00:00
orbiter
541b817502 refactoring of switchboard queueing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-22 01:28:37 +00:00
orbiter
fc94fbe224 another improvement to the collection sorting
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4589 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-20 23:11:04 +00:00
orbiter
11270d450e better quicksort-pivot computation: 30% faster (measured with test program)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4588 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-20 22:01:12 +00:00
orbiter
3e44293f07 - fixed a problem with thread pools in row collection
- added a line-viewing feature in threaddump	

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4587 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-20 14:21:58 +00:00
orbiter
433ff855f7 - fixed another concurrency problem in collection sorting
- fixed a typing problem that was introduced in svn 4579 and caused the crawl monitor to fail

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4585 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-19 23:47:24 +00:00
orbiter
f3996e63b8 tried to fix more deadlocks:
- changed connection modes in ftpc
- replaced sort tread pool in row collections by new one using util.concurrent. the old pool had caused blockings

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4582 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-19 11:23:43 +00:00
orbiter
fa1090113d - next try to fix the networking problem:
set the maximum transfer size to less than MTU=1500-52: buffer size <= 1448
- some refactoring of transfer methods (naming)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 00:16:04 +00:00
orbiter
275a226cc5 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4524 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-04 22:45:45 +00:00
orbiter
1dce2f1079 more multithreading support:
- replaced some synchronized classes by classes from util.concurrent
- used a util.concurrent.SynchronousQueue to implement a persistent sorting thread in
  the very basic kelondroRowCollection which supports sorting with a second thread
  in case that a double-core processing CPU is used

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4517 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-27 15:16:47 +00:00
orbiter
a1e9e6e2e6 fix for search result page navigation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-03 02:23:04 +00:00
orbiter
9abc927645 to fix inconsistencies in collection index, a double reference reporting mechanism has been implemented
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4347 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-20 01:22:46 +00:00
orbiter
dc26d6262b - removed write buffer from kelondroCache (was never used because buggy; will now be replaced by new EcoBuffer)
- added new data structure 'eco' for an index file that should use only 50% of write-IO compared to kelondroFlex
The new eco index is not used yet, but already successfully tested with the collectionIndex
The main purpose is to replace the kelondroFlex at every point when enough RAM is available.
Othervise, the kelondroFlex stays as option in case of low memory (which then can even use a file-index)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4337 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-17 12:12:52 +00:00
orbiter
a5054c038d - added large number of generics
- redesign of ordering structures in kelondro (old did not work with strict generics)
- 50% IO reduction during read access on kelondroFlex (ommiting of read on index table)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4320 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-11 00:12:01 +00:00
orbiter
9d8b17188a more generics, bugfixes for wrong cast
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4294 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-28 03:39:36 +00:00
orbiter
4dc438f7e7 moved to Java 1.5:
- changed build script to use java 1.5 compiler
- first stept to resolve missing generics definition (about 400 from over 4100 'missing'-warnings)
- added key-iterator to kelondro databases (for rapid from-memory enumerations, will be used for domain name collection, not used yet)

please set your development environment to use java 1.5!


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4292 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-27 17:56:59 +00:00
orbiter
52dd015218 new release strategy: the standard release is now built the same way as the pro release
a new release type was added: 'embedded' which is the same as the current standard release was
this will not have any effect to the next release 0.56, which will still a pro-release on public download
the transition the the new release strategy must be done now to enable automatic update by the updated in future releases

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4287 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-20 02:46:41 +00:00
orbiter
c527969185 - enhanced monitoring of ranking parameters
for details, please try http://localhost:8080/IndexControlRWIs_p.html
- fixed computation of ranking ordering in some cases

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4220 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-16 14:48:09 +00:00
orbiter
ec7ba0d3d0 - fixed problem with too small sort fields (sortbound was not set)
- slightly changed handling of date in indexURLEntry

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4214 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-13 02:24:10 +00:00
orbiter
64b3b79e44 - fix for termination problem with uniq()
- addition to seed dna interpretation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4208 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-12 14:39:30 +00:00
orbiter
0abf33ed03 - tried to remove deadlock
- enhanced searchtime in kelondroRowSets
- enhanced uniq() - reverse enumeration causes less time in case of mass removal of doubles

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4207 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-12 01:14:51 +00:00
orbiter
d0d2771883 disabled multiprocessoring of rowCollection.sort for testing purpose
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4202 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-11 00:28:22 +00:00
orbiter
edc4da5317 fix for division by zero in test reoutine
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4201 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-10 08:57:00 +00:00
orbiter
df38aaf7bd update to RowCollection sort speed-enhancements:
- better handling of small collections (less overhead)
- usage of pre-sorted limits
- different re-sort limit
- more testing procedures

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4200 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-09 15:34:11 +00:00
orbiter
ecba35de72 enhanced computing speed of kelondro core function: sorting
the enhancement was made by using better organized data structures and
multi-threading during the sort. A sort can be divided into two separate
processes when the first partition of the quicksort algorithm was done.
Generating a separate thread and starting the thread takes only 10 milliseconds,
so using a separate thread makes only sense if the data amount is large.
statistics about the speed-up:
without ehancement: 250 milliseconds for 100000 entries
with data structure enhancement: 170 milliseconds for 100000 entries
with additional second thread (if second processor is present): 130 milliseconds.

For dual-processor systems, this means about 100% speed-up
a test can be made with the following command:
java -classpath classes de.anomic.kelondro.kelondroRowCollection


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4198 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-09 00:51:38 +00:00
orbiter
6eaa5a0e64 enhanced local search speed. The ranking process is now 6 times faster that before.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4197 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-07 22:38:09 +00:00
orbiter
b856e377a9 some additions and a small bugfix to SVN 4158
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4173 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-21 23:26:22 +00:00
fuchsi
b5f7df8d0a Speed up remove operations in rowCollections.
- Array element shifting during remove is only done when it is necessary to keep the order of a row collection.
- This will speed up the most expensive operation "common word shrinking" by a factor of 500-1000 (in the worst cases we shifted > 60 GB of data during this operation)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4158 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-11 17:17:08 +00:00
orbiter
4779f314fe first version of next-generation search interface:
- snippets are not fetched by browser using ajax, they are now fetched internally
- YaCy-internat threads control existence of snippets and sort out bad results
- search results are prepared using SSI includes
- the search result page is visible right after the search request, the results drop in when they are detected
- no more time-out strategy during search processes, results are shifted within queues when they arrive from remote peers
- added result page switching! after the first 10 results, the next page can be retrieved
- number of remote results is updated online on the result page as they drop in
- removed old snippet servelet (which had been also a security leak btw)
- media search is broken now, will be redesigned and fixed in another step


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-03 23:43:55 +00:00
orbiter
1782ef57e5 - added SSI parser and include directive for <!--# include virtual="<file>" -->
- added chunked file transfer for non-yacy clients
- SSIs are streamed using chunked transfer, partly delivered pages can be seen in browser before transmission is finished
- added client-side network unit identification
- cleaned up code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-26 14:37:10 +00:00
orbiter
11ac7688d5 reverted a part of last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3736 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-16 17:52:11 +00:00
orbiter
b3f97b5c38 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3735 6c8d7289-2bf4-0310-a012-ef5d649a1542 2007-05-16 17:45:39 +00:00