Commit Graph

409 Commits

Author SHA1 Message Date
orbiter
75a1702133 - fix for ConcurrentModificationException during shutdown
- fix for Ranking distribution problem (suma-lab peer does not exist any more)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4749 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-30 11:19:52 +00:00
orbiter
9935e83c86 added a new news window to the status page. At the moment it is just a test.
The news items in the window cover peer arrivals and departures, remote search accesses and crawls

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4739 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-26 01:00:10 +00:00
danielr
763f9d4f5d serverCore: setting timeout for new connection before SSLDetect
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4723 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-22 09:03:16 +00:00
orbiter
3c76342619 - added servlet to configure the search page greeting line
- added information output about the current network definition in the network servlet
- better description and usage of profile entries in User Profile servlet regarding FOAF format
- reformatting of menus at status page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4710 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-18 13:58:56 +00:00
orbiter
9a32a4c328 fixed ConcurrentModificationException during hello-process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4693 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-14 03:04:28 +00:00
danielr
959f448e5f - disabled redirects in proxy (so client sees real path)
- added connection stats (only connections currently in use)
- remove "old" connections (closed or idle for some time)
- synchronized shared parts of proxyHandler


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4682 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-12 11:39:48 +00:00
orbiter
444dce7e81 more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-10 15:28:58 +00:00
orbiter
2c2dcd12a2 - enhanced performance of Eco-Tables: less time-consuming size() operations
- will increase speed of indexing and collection.index creation


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4675 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-10 13:24:55 +00:00
orbiter
e356625b22 - refactoring of stream copy handling to support time-consuming operations
- made usage of BufferedStreams explicit to distinguish the different copy methods in serverFileUtils (byte-by-byte and using an own buffer)
- introduced another timeout setting (java internal property)
- more restrictions to clients accessing a single host (a security setting to prevent DoS by mistake)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4674 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-10 09:53:07 +00:00
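The two copy strategies named in commit e356625b22 (byte-by-byte over explicitly buffered streams, and block copy through an own buffer) might look roughly like the sketch below. The class and method names are hypothetical and are not the actual serverFileUtils API; the code also uses current Java syntax rather than what the project compiled with in 2008.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration, not the real serverFileUtils code.
class CopySketch {

    // Byte-by-byte copy. The explicit Buffered* wrappers keep the
    // single-byte reads and writes from reaching the raw streams each time.
    static long copyBuffered(InputStream in, OutputStream out) throws IOException {
        BufferedInputStream bin = new BufferedInputStream(in);
        BufferedOutputStream bout = new BufferedOutputStream(out);
        long count = 0;
        int b;
        while ((b = bin.read()) != -1) {
            bout.write(b);
            count++;
        }
        bout.flush(); // without the flush, buffered bytes may never reach 'out'
        return count;
    }

    // Block copy through a caller-sized buffer (bufferSize must be > 0).
    static long copyWithOwnBuffer(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long count = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            count += n;
        }
        out.flush();
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello yacy".getBytes("UTF-8");
        ByteArrayOutputStream a = new ByteArrayOutputStream();
        ByteArrayOutputStream b = new ByteArrayOutputStream();
        System.out.println(copyBuffered(new ByteArrayInputStream(data), a));
        System.out.println(copyWithOwnBuffer(new ByteArrayInputStream(data), b, 4));
    }
}
```

The flush at the end of the byte-by-byte variant is the kind of call whose absence can truncate a download when the output is buffered.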
orbiter
f97971b63b fixed NPE problems doing a shutdown from command-line
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4671 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-09 22:59:17 +00:00
orbiter
2c1c3bb6eb - some refactoring (sorry Daniel, I went rampaging through your code)
- fixed broken downloads (flush was missing)
- different problem handling when download is corrupted
- different default values in yacy.init

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4669 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-08 21:36:33 +00:00
danielr
d96e2badc7 - fixed POST in proxy
- prepared http connection tracking
- refactoring (mainly moving StreamTools to serverFileUtils)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4668 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-08 21:17:40 +00:00
orbiter
14404d31a8 - enhanced performance graph (more info)
- added conditions for rarely used logging lines to prevent unnecessary CPU usage for non-printed info

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4667 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-08 14:44:39 +00:00
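The logging condition described in commit 14404d31a8 can be illustrated with a level guard. This sketch uses java.util.logging directly as a stand-in; the class name, the counter and describeState() are hypothetical and only serve to show that the costly message is not built when it would not be printed.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration of guarding a rarely printed log line.
class LogGuardSketch {
    private static final Logger LOG = Logger.getLogger("LogGuardSketch");
    static int expensiveCalls = 0;

    // Stands in for a message that is costly to assemble.
    static String describeState() {
        expensiveCalls++;
        return "state at " + System.nanoTime();
    }

    static void setLevel(Level level) {
        LOG.setLevel(level);
    }

    static void logState() {
        // Guard: assemble the message only if FINE output is enabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("details: " + describeState());
        }
    }

    public static void main(String[] args) {
        setLevel(Level.INFO);
        logState();
        System.out.println(expensiveCalls); // message was never built
        setLevel(Level.FINE);
        logState();
        System.out.println(expensiveCalls); // message was built once
    }
}
```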
orbiter
696b8ee3f5 fix for http://forum.yacy-websuche.de/viewtopic.php?p=6806#p6806
- removed all InputStream.available() because this does not work for files > 2GB
- iterators terminate when an IOException occurs
- added handling of non-executing index.add methods to enhance assert usage
- added index for file indexes > 2GB, to be used in new indexHeap

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4666 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-08 11:55:59 +00:00
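The reason behind removing InputStream.available() in commit 696b8ee3f5 is that the method returns an int, so it cannot report more than Integer.MAX_VALUE (about 2 GB) remaining bytes. A minimal sketch of the usual alternative, reading until end-of-stream and counting in a long, is shown below. The class and method names are hypothetical; the commit does not say which replacement was actually used.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration: detect the end of a stream without available().
class StreamLengthSketch {

    // Reads until read() signals end-of-stream with -1; the total is a long,
    // so it does not overflow for inputs larger than 2 GB.
    static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countBytes(new ByteArrayInputStream(new byte[100000])));
        System.out.println("largest value available() can return: " + Integer.MAX_VALUE);
    }
}
```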
orbiter
225f9fd429 various fixes
- shutdown behavior (killing of client sessions)
- EcoFS reading better
- another synchronization in balancer.size()


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4662 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-07 13:12:58 +00:00
danielr
7c149a4ee8 - undo less 'binary data found'
- removed duplicate stackTrace


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4643 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-05 17:46:11 +00:00
danielr
2aef1414f5 removed test (in yacy.init)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4641 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-05 13:49:25 +00:00
danielr
5c3c1fdf41 replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4640 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-05 13:17:16 +00:00
orbiter
764a40e37d speed enhancements for crawler and url retrieval (affects also search speed)
- concurrency for LURL-fetching: this can be done using a concurrent lookup into the separated url databases. Concurrency is possible because there is no IO during lookup. The more LURL-Tables are present, the better the speedup. More CPUs will increase speed.
- because a large number of LURL-lookups are made during crawling (for double-check), the LURL-lookup speed enhancement also improves crawling speed
- search speed also profits from LURL-lookup enhancement
- changed some flushing parameters in word index caching which should make better use of large word index caches and should speed up indexing
- removed flush chunksize parameter, because this was only useful for IO path enhancement feature which was removed some weeks ago to prevent blocking and deadlocks during search requests

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4628 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-31 15:41:19 +00:00
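The concurrent lookup described in commit 764a40e37d can be sketched as one task per table, all submitted to a thread pool. Plain in-memory maps stand in for the separated LURL tables here; the class name, the method name and the example keys are hypothetical, and the code uses current Java syntax.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: look one key up in several tables at once.
class ConcurrentLookupSketch {

    // Queries every table in its own task and returns the first hit, or null.
    // This is only safe because the tables are not modified during the lookup.
    static String lookup(List<Map<String, String>> tables, String key, ExecutorService pool)
            throws InterruptedException, ExecutionException {
        List<Callable<String>> tasks = new ArrayList<>();
        for (Map<String, String> table : tables) {
            tasks.add(() -> table.get(key));
        }
        for (Future<String> result : pool.invokeAll(tasks)) {
            String hit = result.get();
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> t1 = new HashMap<>();
        t1.put("hashA", "http://a.example/");
        Map<String, String> t2 = new HashMap<>();
        t2.put("hashB", "http://b.example/");
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(lookup(Arrays.asList(t1, t2), "hashB", pool));
        } finally {
            pool.shutdown();
        }
    }
}
```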
orbiter
2c34038912 addition/correction to last commit: usage of concurrent-classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4626 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-30 21:17:12 +00:00
orbiter
b2150057d2 removed unnecessary cleanup method
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4625 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-30 20:32:08 +00:00
orbiter
368593e449 enhanced the concurrency handling of indexing process (better queue size control, better data concept, better shutdown behavior)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4617 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-30 00:03:44 +00:00
orbiter
0241d070bc added concurrency to indexing process:
- the methods {parsing, semantic analysis (condensing), structure analysis (web structure)} in the serialized indexing path have been made concurrent.
- four BlockingQueues handle concurrency and hand-over of the indexing objects; the last object in the queue is stored into a BlockingQueue of maximum size 1 to serialize the process for storage (which uses IO and therefore should not be made concurrent here)
- a concurrency of (CPUs + 1) is default. Single-CPU users will profit from the change because large files cannot block the indexing process any more.
- removed the secondary indexing thread, which is superfluous now. Concurrency is default for all users.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4609 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-28 11:56:28 +00:00
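The hand-over scheme described in commit 0241d070bc can be sketched in reduced form: (CPUs + 1) worker threads take documents from one BlockingQueue, and their results pass through a BlockingQueue of capacity 1 to a single storage thread. The commit mentions four queues and several processing stages; this sketch collapses them into one stage. All names are hypothetical, and toUpperCase() is only a placeholder for parsing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical, reduced illustration of a queue-driven indexing pipeline.
class IndexPipelineSketch {
    // End-of-input marker, compared by identity so no real document matches it.
    private static final String POISON = new String("POISON");

    static List<String> run(List<String> documents) throws InterruptedException {
        int workers = Runtime.getRuntime().availableProcessors() + 1;
        BlockingQueue<String> parseQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> storeQueue = new ArrayBlockingQueue<>(1); // serializes storage
        List<String> stored = new ArrayList<>();

        List<Thread> parsers = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    String doc;
                    while ((doc = parseQueue.take()) != POISON) {
                        storeQueue.put(doc.toUpperCase()); // placeholder for parsing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            parsers.add(t);
        }

        Thread storer = new Thread(() -> {
            try {
                String doc;
                while ((doc = storeQueue.take()) != POISON) {
                    stored.add(doc); // only this thread writes to 'stored'
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        storer.start();

        for (String d : documents) parseQueue.put(d);
        for (int i = 0; i < workers; i++) parseQueue.put(POISON); // one marker per worker
        for (Thread t : parsers) t.join();
        storeQueue.put(POISON);
        storer.join();
        return stored; // order depends on thread scheduling
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(Arrays.asList("a", "b", "c")).size());
    }
}
```

Because the workers block in take() instead of sleeping for a pause interval, a single large document occupies one worker while the others keep draining the queue.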
orbiter
bca87f1e38 - refactoring of serverThreads: renaming to distinguish busy-threads and blocking-threads
- added blockingThreads which are threads that are not driven by pause times but by BlockingQueue lookup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-27 12:03:16 +00:00
orbiter
541b817502 refactoring of switchboard queueing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-22 01:28:37 +00:00
orbiter
4c584dff87 disabled soLinger to prevent too many connections from staying open (it's a TEST!)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4565 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 10:46:55 +00:00
orbiter
9c989fe5f7 fixed deadlock
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4562 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 00:49:16 +00:00
orbiter
fa1090113d - next try to fix the networking problem:
set the maximum transfer size to less than MTU=1500-52: buffer size <= 1448
- some refactoring of transfer methods (naming)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 00:16:04 +00:00
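The buffer limit from commit fa1090113d is plain arithmetic: 1500 - 52 = 1448 bytes. The sketch below writes a payload in pieces no larger than that. The names are hypothetical, and the 52 bytes are taken from the commit message as given (presumably protocol header overhead). Note that the size of a write call does not by itself determine how the TCP stack segments the data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical illustration of capping the transfer buffer below the MTU.
class ChunkedTransferSketch {
    static final int MTU = 1500;
    static final int OVERHEAD = 52;              // value from the commit message
    static final int MAX_CHUNK = MTU - OVERHEAD; // 1448

    // Writes 'data' in pieces of at most MAX_CHUNK bytes and
    // returns the number of write calls that were needed.
    static int writeChunked(byte[] data, OutputStream out) throws IOException {
        int writes = 0;
        for (int off = 0; off < data.length; off += MAX_CHUNK) {
            int len = Math.min(MAX_CHUNK, data.length - off);
            out.write(data, off, len);
            writes++;
        }
        out.flush();
        return writes;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // 3000 bytes are split into 1448 + 1448 + 104
        System.out.println(writeChunked(new byte[3000], out)); // prints 3
    }
}
```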
orbiter
d87d295c68 one more try to fix the connection problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4556 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-12 13:13:11 +00:00
orbiter
7cc4ff05c9 some code enhancements and bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-09 23:48:24 +00:00
orbiter
7ce76c8ff8 added missing file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4530 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 22:57:53 +00:00
orbiter
bfed9c2da6 - some refactoring in search process
- separated sidebars in new search interface and placed them in their own files
  which can be put into the search page like plug-ins

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4529 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 21:46:55 +00:00
orbiter
275a226cc5 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4524 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-04 22:45:45 +00:00
danielr
fbe335db73 consistent use of de.anomic.server.serverMemory to get information about memory statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4522 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-02 15:42:50 +00:00
orbiter
8c06436c4a removing the error-db on each start-up.
This is necessary because the table uses a lot of RAM and the content is never re-used after start-up.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4520 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-01 09:44:33 +00:00
orbiter
4fdf695064 - fixed a bug in remote search that prevented any results from being generated (!)
- added a great number of printStackTrace and new exceptions that shall be used to find the cause
  for a bug in yacy client-server communication which causes the interruption of data transfer
  which then causes the parser bug for the seed strings.
- tried to fix the communication bug on server-side (copy functions)
Be aware that the log may be full of errors and bugs - there should not be more bugs but there is more to see


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4519 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-27 23:12:43 +00:00
orbiter
1dce2f1079 more multithreading support:
- replaced some synchronized classes by classes from util.concurrent
- used a util.concurrent.SynchronousQueue to implement a persistent sorting thread in
  the very basic kelondroRowCollection which supports sorting with a second thread
  in case a dual-core CPU is used

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4517 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-27 15:16:47 +00:00
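The idea in commit 1dce2f1079 of a persistent second sorting thread fed through a SynchronousQueue can be sketched as follows: the caller hands the upper half of an array to the worker, sorts the lower half itself, waits for the worker, and merges the halves. This is an illustration with hypothetical names on a plain int array, not the kelondroRowCollection code.

```java
import java.util.Arrays;
import java.util.concurrent.SynchronousQueue;

// Hypothetical illustration of sorting with a persistent helper thread.
class TwoThreadSortSketch {

    // A request to sort a[from..to) in place.
    static final class Job {
        final int[] a;
        final int from;
        final int to;
        Job(int[] a, int from, int to) {
            this.a = a;
            this.from = from;
            this.to = to;
        }
    }

    private static final SynchronousQueue<Job> jobs = new SynchronousQueue<>();
    private static final SynchronousQueue<Job> done = new SynchronousQueue<>();

    static {
        // The worker lives as long as the JVM and blocks while no job is offered.
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    Job j = jobs.take();
                    Arrays.sort(j.a, j.from, j.to);
                    done.put(j);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // synchronized so that only one caller at a time talks to the single worker
    static synchronized void sort(int[] a) throws InterruptedException {
        int mid = a.length / 2;
        jobs.put(new Job(a, mid, a.length)); // upper half goes to the worker
        Arrays.sort(a, 0, mid);              // lower half is sorted here
        done.take();                         // wait until the worker has finished

        // merge the two sorted halves back into 'a'
        int[] tmp = Arrays.copyOf(a, a.length);
        int i = 0, j = mid, k = 0;
        while (i < mid && j < tmp.length) a[k++] = tmp[i] <= tmp[j] ? tmp[i++] : tmp[j++];
        while (i < mid) a[k++] = tmp[i++];
        while (j < tmp.length) a[k++] = tmp[j++];
    }

    public static void main(String[] args) throws InterruptedException {
        int[] a = {5, 3, 9, 1, 4, 8, 2};
        sort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 4, 5, 8, 9]
    }
}
```

A SynchronousQueue has no capacity, so each put() waits for the matching take(); that makes it a direct hand-off between the two threads rather than a buffer.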
orbiter
ff5969901c modified dir servlet to cooperate with intranet indexing from the own HTDOCS repository:
- removed md5 file generation (spoils the own repository)
- removed comments in file share (was never used)
- moved dir list comparator to other place (maybe solves problem, let's see)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4481 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-15 13:12:25 +00:00
orbiter
f890b039ee experiments with OpenStreetMap
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4477 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-12 15:39:32 +00:00
orbiter
3c7b94c119 - fix for online caution delay settings, see
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=738&p=4723#p4723
- removed remote search limitation for non-dht-peers according to discussion in
  http://forum.yacy-websuche.de/viewtopic.php?f=15&t=793&hilit=&p=5277#p5277

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4438 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-03 20:11:50 +00:00
orbiter
acf771d5e1 - fixed bug with too much RAM in crawler queue
- fixed dir bug
- better calculation of TF for join
- better waiting-on-result logic

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4424 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-31 23:40:47 +00:00
orbiter
0f5c4abaca more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-29 10:12:48 +00:00
orbiter
db25425893 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4382 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 23:08:32 +00:00
low012
cfd4fecd12 *) blanks in paths for restart and update script are replaced by backslash+blank now (see http://forum.yacy-websuche.de/viewtopic.php?t=745)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4351 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-21 18:04:08 +00:00
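The replacement described in commit cfd4fecd12 (each blank in a path becomes backslash plus blank, so a shell treats the path as one argument) amounts to a literal string substitution. The class name, the method name and the example path are hypothetical; the commit does not show how the scripts are generated.

```java
// Hypothetical illustration of escaping blanks in a path for a shell script.
class PathEscapeSketch {

    // String.replace(CharSequence, CharSequence) substitutes literally,
    // so no regular-expression escaping is involved.
    static String escapeBlanks(String path) {
        return path.replace(" ", "\\ ");
    }

    public static void main(String[] args) {
        System.out.println(escapeBlanks("/home/my user/yacy dir")); // /home/my\ user/yacy\ dir
    }
}
```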
orbiter
016fc594af more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4311 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-09 09:58:56 +00:00
orbiter
03e7782269 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4305 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-06 19:23:38 +00:00
orbiter
f7c5ccedc7 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4301 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-06 00:31:26 +00:00
orbiter
df2a7a8ac8 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-28 18:47:45 +00:00
orbiter
4dc438f7e7 moved to Java 1.5:
- changed build script to use java 1.5 compiler
- first step to resolve missing generics definition (about 400 of over 4100 'missing'-warnings)
- added key-iterator to kelondro databases (for rapid from-memory enumerations, will be used for domain name collection, not used yet)

please set your development environment to use java 1.5!


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4292 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-27 17:56:59 +00:00
fuchsi
d517e96714 last cleanup bits to serverDate before the release. only safe refactoring (method renaming) changes outside of serverDate.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4289 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-21 00:53:46 +00:00