Commit Graph

495 Commits

Author SHA1 Message Date
orbiter
225f9fd429 various fixes
- shutdown behavior (killing of client sessions)
- EcoFS reading better
- another synchronization in balancer.size()


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4662 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-07 13:12:58 +00:00
danielr
7c149a4ee8 - undo less 'binary data found'
- removed duplicate stackTrace


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4643 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-05 17:46:11 +00:00
danielr
2aef1414f5 removed test (in yacy.init)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4641 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-05 13:49:25 +00:00
danielr
5c3c1fdf41 replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4640 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-04-05 13:17:16 +00:00
orbiter
764a40e37d speed enhancements for crawler and url retrieval (affects also search speed)
- concurrency for LURL-fetching: lookups now run concurrently across the separate url databases. Concurrency is possible because there is no IO during lookup. The more LURL-tables are present, the greater the speedup; more CPUs increase speed further
- because a large number of LURL-lookups are made during crawling (for double-check), the LURL-lookup speed enhancement also improves crawling speed
- search speed also profits from the LURL-lookup enhancement
- changed some flushing parameters in word index caching which should make better use of large word index caches and should speed up indexing
- removed flush chunksize parameter, because it was only useful for the IO path enhancement feature, which was removed some weeks ago to prevent blocking and deadlocks during search requests

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4628 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-31 15:41:19 +00:00
orbiter
2c34038912 addition/correction to last commit: usage of concurrent-classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4626 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-30 21:17:12 +00:00
orbiter
b2150057d2 removed unnecessary cleanup method
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4625 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-30 20:32:08 +00:00
orbiter
368593e449 enhanced the concurrency handling of indexing process (better queue size control, better data concept, better shutdown behavior)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4617 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-30 00:03:44 +00:00
orbiter
0241d070bc added concurrency to indexing process:
- the methods {parsing, semantic analysis (condensing), structure analysis (web structure)} in the serialized indexing path have been made concurrent.
- four BlockingQueues handle concurrency and hand-over of the indexing objects; the last object in the queue is stored into a BlockingQueue of maximum size 1 to serialize the process for storage (which uses IO and should therefore stay serialized)
- a concurrency of (CPUs + 1) is default. Single-CPU users will profit from the change because large files cannot block the indexing process any more.
- removed the secondary indexing thread, which is superfluous now. Concurrency is default for all users.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4609 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-28 11:56:28 +00:00
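
The queue hand-over described in commit 0241d070bc can be pictured roughly as follows. This is a minimal sketch, not YaCy's code: the class name `IndexingPipelineSketch`, the `END` marker, the use of plain strings as "indexing objects" and the lambda syntax (modern Java rather than the Java 1.5 of that time) are all assumptions made for illustration. Three concurrent stages run with (CPUs + 1) workers each, and the last queue has capacity 1 so that a single thread does the storing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.UnaryOperator;

class IndexingPipelineSketch {
    // Hypothetical end-of-input marker passed through every queue.
    private static final String END = "<end-of-input>";

    // Starts `workers` threads that take from `in`, apply `step`, and put into `out`.
    static void startStage(BlockingQueue<String> in, BlockingQueue<String> out,
                           UnaryOperator<String> step, int workers) {
        AtomicInteger running = new AtomicInteger(workers);
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                try {
                    while (true) {
                        String item = in.take();
                        if (END.equals(item)) {
                            // The last worker to finish forwards the marker; the others
                            // hand it back so that their siblings also terminate.
                            if (running.decrementAndGet() == 0) out.put(END);
                            else in.put(END);
                            return;
                        }
                        out.put(step.apply(item));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
    }

    // Runs all documents through the pipeline; the calling thread is the single "storage" thread.
    static List<String> index(List<String> documents) {
        int workers = Runtime.getRuntime().availableProcessors() + 1;
        BlockingQueue<String> toParse = new LinkedBlockingQueue<>();
        BlockingQueue<String> toCondense = new LinkedBlockingQueue<>();
        BlockingQueue<String> toStructure = new LinkedBlockingQueue<>();
        BlockingQueue<String> toStore = new ArrayBlockingQueue<>(1); // capacity 1 serializes storage

        startStage(toParse, toCondense, d -> d + ":parsed", workers);
        startStage(toCondense, toStructure, d -> d + ":condensed", workers);
        startStage(toStructure, toStore, d -> d + ":structured", workers);

        List<String> stored = new ArrayList<>();
        try {
            for (String d : documents) toParse.put(d);
            toParse.put(END);
            for (String item = toStore.take(); !END.equals(item); item = toStore.take()) {
                stored.add(item); // stand-in for the IO-bound storage step
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return stored;
    }

    public static void main(String[] args) {
        System.out.println(index(Arrays.asList("doc1", "doc2", "doc3")));
    }
}
```

Because several workers run per stage, the order in which documents reach storage is not guaranteed. A large document only occupies one worker, which is the effect the commit describes for single-CPU users.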
orbiter
bca87f1e38 - refactoring of serverThreads: renaming to distinguish busy-threads and blocking-threads
- added blockingThreads which are threads that are not driven by pause times but by BlockingQueue lookup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-27 12:03:16 +00:00
orbiter
541b817502 refactoring of switchboard queueing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4591 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-22 01:28:37 +00:00
orbiter
4c584dff87 disabled soLinger to prevent too many connections from staying open (it's a TEST!)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4565 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 10:46:55 +00:00
orbiter
9c989fe5f7 fixed deadlock
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4562 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-15 00:49:16 +00:00
orbiter
fa1090113d - next try to fix the networking problem:
set the maximum transfer size to less than MTU=1500-52: buffer size <= 1448
- some refactoring of transfer methods (naming)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-14 00:16:04 +00:00
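
The buffer limit in commit fa1090113d is simple arithmetic: 1500 - 52 = 1448. A minimal sketch of a copy loop that never hands more than 1448 bytes to the stream at once is given below. The class and method names are hypothetical, and the reading of the 52 bytes as IP header (20) + TCP header (20) + TCP options (12) is an assumption, since the commit does not break the number down.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class TransferSketch {
    static final int MTU = 1500;            // Ethernet MTU as named in the commit
    static final int HEADER_OVERHEAD = 52;  // value from the commit; breakdown is assumed
    static final int MAX_CHUNK = MTU - HEADER_OVERHEAD; // 1448

    // Copies `in` to `out` in chunks of at most MAX_CHUNK bytes; returns the number of writes.
    static int copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[MAX_CHUNK];
        int writes = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                out.flush();
                writes++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return writes;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int writes = copy(new ByteArrayInputStream(new byte[3000]), sink);
        System.out.println(writes + " writes, " + sink.size() + " bytes");
    }
}
```

Note that the operating system's TCP stack still decides how bytes are packed into segments, so a small application buffer is at best a hint; the commit itself calls the change a "next try".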
orbiter
d87d295c68 one more try to fix the connection problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4556 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-12 13:13:11 +00:00
orbiter
7cc4ff05c9 some code enhancements and bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-09 23:48:24 +00:00
orbiter
7ce76c8ff8 added missing file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4530 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 22:57:53 +00:00
orbiter
bfed9c2da6 - some refactoring in search process
- separated sidebars in new search interface and placed them in their own files
  which can be put into the search page like plug-ins

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4529 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-05 21:46:55 +00:00
orbiter
275a226cc5 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4524 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-04 22:45:45 +00:00
danielr
fbe335db73 consistent use of de.anomic.server.serverMemory to get information about memory statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4522 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-02 15:42:50 +00:00
orbiter
8c06436c4a removing the error-db on every start-up.
This is necessary because the table uses a lot of RAM and the content is never re-used after start-up.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4520 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-03-01 09:44:33 +00:00
orbiter
4fdf695064 - fixed a bug in remote search that prevented any results from being generated (!)
- added a great number of printStackTrace and new exceptions that shall be used to find the cause
  for a bug in yacy client-server communication which causes the interruption of data transfer
  which then causes the parser bug for the seed strings.
- tried to fix the communication bug on server-side (copy functions)
Be aware that the log may be full of errors and bugs - there should not be more bugs but there is more to see


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4519 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-27 23:12:43 +00:00
orbiter
1dce2f1079 more multithreading support:
- replaced some synchronized classes by classes from util.concurrent
- used a util.concurrent.SynchronousQueue to implement a persistent sorting thread in
  the very basic kelondroRowCollection which supports sorting with a second thread
  in case that a double-core processing CPU is used

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4517 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-27 15:16:47 +00:00
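
The idea in commit 1dce2f1079, a persistent second thread that receives sort jobs through a SynchronousQueue, might look like the sketch below. It is not the kelondroRowCollection code: the class name, the use of `int[]` and the split-and-merge strategy are assumptions chosen to keep the example small.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;

class TwoThreadSortSketch {
    // Hand-over point between the calling thread and the persistent sorter thread.
    private static final SynchronousQueue<Runnable> jobs = new SynchronousQueue<>();

    static {
        Thread sorter = new Thread(() -> {
            try {
                while (true) jobs.take().run(); // waits until a job is handed over
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "sorter");
        sorter.setDaemon(true); // do not keep the JVM alive for the helper thread
        sorter.start();
    }

    // Sorts the left half on the sorter thread and the right half on the caller, then merges.
    static void sort(int[] a) {
        int mid = a.length / 2;
        CountDownLatch leftDone = new CountDownLatch(1);
        try {
            jobs.put(() -> { Arrays.sort(a, 0, mid); leftDone.countDown(); });
            Arrays.sort(a, mid, a.length);
            leftDone.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Merge: copy the left half out, then merge it with the right half in place.
        int[] left = Arrays.copyOfRange(a, 0, mid);
        int i = 0, j = mid, k = 0;
        while (i < left.length && j < a.length) a[k++] = left[i] <= a[j] ? left[i++] : a[j++];
        while (i < left.length) a[k++] = left[i++];
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 9, 1, 4, 8, 2};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

A SynchronousQueue has no capacity, so `put` returns only once the sorter thread has accepted the job. That makes it a direct hand-over rather than a buffer, which suits a single long-lived helper thread.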
orbiter
ff5969901c modified dir servlet to cooperate with intranet indexing from the own HTDOCS repository:
- removed md5 file generation (spoils the own repository)
- removed comments in file share (was never used)
- moved dir list comparator to other place (maybe solves problem, let's see)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4481 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-15 13:12:25 +00:00
orbiter
f890b039ee experiments with openstreetmaps
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4477 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-12 15:39:32 +00:00
orbiter
3c7b94c119 - fix for online caution delay settings, see
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=738&p=4723#p4723
- removed remote search limitation for non-dht-peers according to discussion in
  http://forum.yacy-websuche.de/viewtopic.php?f=15&t=793&hilit=&p=5277#p5277

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4438 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-03 20:11:50 +00:00
orbiter
acf771d5e1 - fixed bug with too much RAM in crawler queue
- fixed dir bug
- better calculation of TF for join
- better waiting-on-result logic

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4424 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-31 23:40:47 +00:00
orbiter
0f5c4abaca more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4414 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-29 10:12:48 +00:00
orbiter
db25425893 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4382 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-23 23:08:32 +00:00
low012
cfd4fecd12 *) blanks in paths for restart and update script are replaced by backslash+blank now (see http://forum.yacy-websuche.de/viewtopic.php?t=745)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4351 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-21 18:04:08 +00:00
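
The replacement described in commit cfd4fecd12 amounts to a single string substitution. A minimal sketch, with a hypothetical class and method name, and handling blanks only (other shell metacharacters are deliberately left alone here):

```java
class ShellPathSketch {
    // Replaces every blank with backslash+blank so that a shell treats the path as one word.
    static String escapeBlanks(String path) {
        return path.replace(" ", "\\ ");
    }

    public static void main(String[] args) {
        System.out.println(escapeBlanks("/opt/my yacy/startYACY.sh"));
    }
}
```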
orbiter
016fc594af more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4311 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-09 09:58:56 +00:00
orbiter
03e7782269 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4305 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-06 19:23:38 +00:00
orbiter
f7c5ccedc7 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4301 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-01-06 00:31:26 +00:00
orbiter
df2a7a8ac8 more generics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-28 18:47:45 +00:00
orbiter
4dc438f7e7 moved to Java 1.5:
- changed build script to use java 1.5 compiler
- first step to resolve missing generics definitions (about 400 of over 4100 'missing'-warnings)
- added key-iterator to kelondro databases (for rapid from-memory enumerations, will be used for domain name collection, not used yet)

please set your development environment to use java 1.5!


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4292 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-27 17:56:59 +00:00
fuchsi
d517e96714 last cleanup bits to serverDate before the release. only safe refactoring (method renaming) changes outside of serverDate.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4289 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-21 00:53:46 +00:00
fuchsi
1cb6e431a6 Replace the ISO8601 aka W3C datetime parser by one that supports every representation allowed by this standard, see http://www.w3.org/TR/NOTE-datetime
- useful especially for sitemaps parsing, where this date format is used

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4286 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-19 22:45:58 +00:00
fuchsi
33ee6745f6 more cleanup in serverDate
- remove direct accesses to SimpleDateFormat fields in serverDate and use the static parse... methods instead
- remove nowDate() as a Date doesn't store timezone information and a new Date() is always faster
- default formatter methods use a GMT timezone by default now, this is important for interchangeability as some date formats we use don't include a timezone offset.
- continued renaming and rearranging (formatter) methods. all should follow the general naming scheme formatWHAT(...)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4285 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-19 19:39:19 +00:00
fuchsi
3c30c2da75 more cleanup and API consistency changes, more to come...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4284 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-19 13:29:50 +00:00
fuchsi
f41172f850 Merge httpDate into serverDate as suggested. Removed some unnecessary code and fixed a possible synchronization problem.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4283 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-18 22:35:02 +00:00
fuchsi
21b8d1b918 small cosmetic change for static fields in serverCore (special protocol ASCII entities) to improve readability
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4275 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-14 19:17:54 +00:00
fuchsi
1bd02762de Improve HTTP/ICAP header processing.
- workaround for illegal line endings (LF only), closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=595
- fixed bug where we didn't break the processing immediately on EOS (the loop was run until the buffer was completely filled with -1)
- further performance improvements (one simple loop, avoid double processing of every byte and unnecessary temporary buffers)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4270 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-12 06:37:18 +00:00
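
Two of the points in commit 1bd02762de, accepting LF-only line endings and stopping immediately at end of stream, can be illustrated with a small line reader. This is a sketch under assumptions: the class and method names are hypothetical, and it is not tuned for speed the way the commit describes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class HeaderLineSketch {
    // Reads one header line terminated by CRLF or by a bare LF; returns null at end of stream.
    static String readLine(InputStream in) {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        try {
            int b;
            while ((b = in.read()) != -1) {   // -1 means EOS: leave the loop at once
                if (b == '\n') return decode(line);
                line.write(b);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return line.size() == 0 ? null : decode(line);
    }

    // Drops a trailing CR (if any) so that CRLF and LF-only lines give the same result.
    private static String decode(ByteArrayOutputStream line) {
        byte[] bytes = line.toByteArray();
        int len = bytes.length;
        if (len > 0 && bytes[len - 1] == '\r') len--;
        return new String(bytes, 0, len, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] raw = "Host: example.org\nAccept: */*\r\n\n".getBytes(StandardCharsets.ISO_8859_1);
        InputStream in = new ByteArrayInputStream(raw);
        for (String l = readLine(in); l != null; l = readLine(in)) {
            System.out.println("[" + l + "]");
        }
    }
}
```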
orbiter
e22014dc83 some memory enhancements when generating and displaying ymage objects
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-07 02:15:12 +00:00
orbiter
b46bcaa5d8 changed method of profiling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4248 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-12-04 20:19:13 +00:00
fuchsi
18e516317d Fix problem with buggy HTTP-Servers which send illegal control characters in HTTP-Headers, they are ignored now.
Thx to celle for the patch and see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=560 for more information.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4235 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 06:02:45 +00:00
orbiter
2fcd18a972 - fixed bad behaviour of search event worker processes
- fixed export of url lists in xml

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4229 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-23 01:08:16 +00:00
orbiter
6f1308da2f - some enhancements to IndexControlURLs (shows more links, connects referrer to another query)
- some refactoring to search process

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4222 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-17 01:53:02 +00:00
fuchsi
425e4ead66 Allow absolute paths in configuration settings.
- before absolute paths would be expanded incorrectly, e.g.: fooPath=/a/b/c would become /path/to/yacy/root/a/b/c. Now you can put nearly all dynamically generated data with a configurable path in a location outside of YaCy's root dir without having to use symlinks (probably good for third party distribution packaging).
- abstractServerSwitch.getConfigPath(setting, default) returns a File instance, either with an absolute path or relative to the applications root path.

- exceptions (hardcoded): 
  DATA/LOG/yacy.logging
  DATA/SETTINGS/httpProxy.conf
  DATA/SETTINGS/user.db
TODO: all of these are the global configuration files and they should probably be put into _one_ command line configurable settings path, so it would be possible to package them in /etc/ for example.

- add missing workPath to yacy.init (it was used in code, but there was no default in the file)
- fix broken skinPath (was skinsPath in yacy.init but skinsPath in the code) + a few other broken config reading caused by typos.
- replaced path setting names and their default values with the related static fields in plasmaSwitchboard where not already done/existing

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4196 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-04 10:36:25 +00:00
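
The path rule from commit 425e4ead66 (absolute values are used as they are, relative values are resolved against the application root) can be sketched as follows. The commit names abstractServerSwitch.getConfigPath(setting, default); the static helper below, which takes the config map and root directory as extra parameters, is a hypothetical stand-in and not that method.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

class ConfigPathSketch {
    // Returns the configured path: unchanged if absolute, otherwise relative to rootPath.
    static File getConfigPath(Map<String, String> config, File rootPath, String key, String dflt) {
        String value = config.getOrDefault(key, dflt);
        File f = new File(value);
        return f.isAbsolute() ? f : new File(rootPath, value);
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("fooPath", new File("elsewhere").getAbsolutePath()); // an absolute setting
        File root = new File("yacy-root");
        System.out.println(getConfigPath(config, root, "fooPath", "DATA/FOO"));
        System.out.println(getConfigPath(config, root, "barPath", "DATA/BAR")); // falls back to default
    }
}
```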
borg-0300
e8d32d9f62 other loglevel
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4195 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-02 16:06:54 +00:00
borg-0300
a5d28785b1 less OOM (works for me)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4194 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-02 14:55:46 +00:00
hermens
18144043e6 Correct UTC Offset at beginning/end of daylight savings time
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4185 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-30 19:20:02 +00:00
orbiter
a31b9097a4 preparations for mass remote crawls:
two main changes must be implemented to enable mass remote crawls:
- shift control of robots.txt to crawl queue (away from stacker). This is necessary since remote
  crawls can contain unchecked urls. Each peer must check the robots to prevent that it is misused
  as crawl agent for unwanted file retrieval
- implement new index files that control double-check of remotely crawled urls

After removal of robots.txt checking from stacker threads, the multi-threading of this process is void.
Multithreading has been removed. Also the thread pools for the crawl threads had been removed, since
creation of these threads is not resource-consuming, for a detailed explanation see svn 4106

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4181 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-29 01:43:20 +00:00
fuchsi
0e1738899f * Complete number localization and provide a more reasonable interface to serverObjects:
- put(key, value) methods are now used if a value added to the map should be kept as it is. Numbers are transformed (but not formatted) to an equivalent String representation.
- putASIS(...) have been removed, now done with simple put(...) (see above).
- putNum(...) can be used for number values which should be stored in a formatted way, either depending on the current locale setting for yacy (default) or in a "none" locale (see javadocs and setLocalize()).
- putHTML(...) escapes special characters into corresponding HTML entities ('<' => '&lt;') which was done with put(...) before and so was called too often, because it is necessary only for very few cases. Additionally there is a "forXML" mode which only replaces < > & ".
In short: Use put(...) for almost everything, use putXY(...) if you need some special transformation of the value.
A few bugs have been fixed as well, and there should be a small performance improvement for complex pages with a lot of values.

* added additional Sum/Avg rows to access tracker pages, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=456
* removed duplicate code (mostly related to the big changes above).

TODO:
- make sure number formats work as expected _everywhere_, report overlooked stuff at http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437
- probably a good idea to add special putDate() methods as they are used in many pages and create duplicated formatting code + maybe some centralized handling for memory value formatting.
- further improve the speed of page creation for the WatchCrawler.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4178 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-24 21:38:19 +00:00
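
The division of labour between put(...), putNum(...) and putHTML(...) in commit 0e1738899f can be pictured with a toy value map. This is only a sketch: the class `ServerObjectsSketch` is hypothetical, the locale is passed in through the constructor instead of a global setting, and the escaping covers just the four characters the commit lists for the "forXML" mode.

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class ServerObjectsSketch {
    private final Map<String, String> values = new HashMap<>();
    private final Locale locale;

    ServerObjectsSketch(Locale locale) { this.locale = locale; }

    // put: keep the value as it is; numbers are converted to a string but not formatted.
    void put(String key, String value) { values.put(key, value); }
    void put(String key, long value) { values.put(key, Long.toString(value)); }

    // putNum: store the number formatted for the given locale.
    void putNum(String key, long value) {
        values.put(key, NumberFormat.getInstance(locale).format(value));
    }

    // putHTML: escape special characters before storing.
    void putHTML(String key, String value) { values.put(key, escape(value)); }

    String get(String key) { return values.get(key); }

    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ServerObjectsSketch prop = new ServerObjectsSketch(Locale.US);
        prop.put("count", 1234567L);
        prop.putNum("countFormatted", 1234567L);
        prop.putHTML("title", "<b>YaCy & friends</b>");
        System.out.println(prop.get("count") + " | " + prop.get("countFormatted") + " | " + prop.get("title"));
    }
}
```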
fuchsi
f717beecb1 - Changed yFormatter handling to be more flexible and produce more readable code for server pages. There are serverObject.putNum() methods to allow adding of number type values in a formatted form, and put() methods for number types that add them without formatting. This reduces the need to transform them into Strings in server pages and removes the HTML encoding step which is unnecessary for numbers.
- some minor code cleanups (mostly unnecessary casts, null checks)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4166 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-19 04:13:46 +00:00
orbiter
711641f167 extended client connection clean-up:
there are now two time-outs, one for the complete connection time, and one for an idle time
connections that have been idle for more than 2 minutes are closed, and connections that have been alive for more than one hour are also closed
if the total number of connections exceeds 64, the connections beyond 64 that have been idle longest are also closed

During normal operation of peers these forced closings should never appear,
but the existence of the idle connection check ensures the availability of the peer and the usability of the host.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4134 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 15:06:12 +00:00
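
The three clean-up rules of commit 711641f167 (idle for more than 2 minutes, alive for more than one hour, more than 64 connections) can be expressed as a selection function. The sketch below is an illustration only: the class `ConnectionCleanupSketch`, the `Conn` record and the explicit `now` parameter are assumptions, and it returns connection ids instead of closing real sockets.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ConnectionCleanupSketch {
    static final long MAX_IDLE_MS = 2 * 60 * 1000L;    // 2 minutes idle
    static final long MAX_ALIVE_MS = 60 * 60 * 1000L;  // 1 hour total lifetime
    static final int MAX_CONNECTIONS = 64;

    static final class Conn {
        final String id;
        final long openedAt;      // ms timestamp when the connection was opened
        final long lastActiveAt;  // ms timestamp of the last activity
        Conn(String id, long openedAt, long lastActiveAt) {
            this.id = id;
            this.openedAt = openedAt;
            this.lastActiveAt = lastActiveAt;
        }
    }

    // Returns the ids of the connections that should be closed at time `now`.
    static List<String> selectForClosing(List<Conn> conns, long now) {
        List<String> close = new ArrayList<>();
        List<Conn> keep = new ArrayList<>();
        for (Conn c : conns) {
            boolean idleTooLong = now - c.lastActiveAt > MAX_IDLE_MS;
            boolean aliveTooLong = now - c.openedAt > MAX_ALIVE_MS;
            if (idleTooLong || aliveTooLong) close.add(c.id); else keep.add(c);
        }
        // Oldest last activity first, i.e. the connections that have been idle longest.
        keep.sort(Comparator.comparingLong((Conn c) -> c.lastActiveAt));
        while (keep.size() > MAX_CONNECTIONS) close.add(keep.remove(0).id);
        return close;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        List<Conn> conns = new ArrayList<>();
        conns.add(new Conn("idle", now - 300_000L, now - 200_000L));
        conns.add(new Conn("fresh", now - 1_000L, now));
        System.out.println(selectForClosing(conns, now));
    }
}
```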
orbiter
b19bb6e5b1 - reverted svn 4132; this did not solve the problem and removed the emergency method, which caused production failure for sure within some hours
- removed and added some debugging lines

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4133 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 14:34:05 +00:00
orbiter
01e0669264 re-designed some parts of DHT position calculation (effect is the same as before)
and replaced old first hash computation by new method that tries to find a gap in the current dht
to do this, it is necessary that the network bootstrapping is done before the own hash is computed
this made further redesigns in peer initialization order necessary

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4117 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-01 12:30:23 +00:00
orbiter
2f1ff048ba some fixes to socket connection time-out
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4111 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-25 23:45:05 +00:00
orbiter
1488769e1f cleanup of unmaintained and outdated performance methods:
removed object pools in httpc. Object pooling is not recommended,
if the creation of the object is not time-intensive. Object pools are only useful,
if there is much computation necessary to create some basic data that is stored
in the object pool and can be re-used. This does not apply to object pools in YaCy.
Object pooling of client sessions would make sense if they would allow re-use of
living connections to other yacy clients. But every connection is closed after usage
of an object in the client pool, therefore the YaCy server client objects are not such
that hold hardware/network-allocated entities.
See:
http://www.javaperformancetuning.com/news/qotm033.shtml
http://java.sun.com/docs/hotspot/HotSpotFAQ.html#gc_pooling
http://docs.sun.com/source/816-7159-10/pt_chap5.html
http://www.microjava.com/articles/techtalk/recylcle2


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4106 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-23 20:49:52 +00:00
orbiter
3cb9cdc9be try to fix connection problem, possible cause for wrong junior status and non-passive passive peers:
the YaCy client treats disconnections during data transmissions as errors and discards all data transmitted so far
this had not happened until I removed a delay time at the end of the daemon session which prevented this case.
To fix this problem, disconnections during transmissions are no longer treated as errors, which means that end-of-transmissions
with sudden disconnections are not a cause for peer disconnections any more. To be nice to non-updated peers, the sleep time
at the end of server sessions is also re-enabled.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-23 17:31:29 +00:00
fuchsi
ae4b9308ef Fix problems with some web servers which couldn't handle the way yacy was sending requests. Thx to celle for the patch.
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=320

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4089 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-10 09:15:28 +00:00
orbiter
daf0f74361 joined anomic.net.URL, plasmaURL and url hash computation:
search profiling showed that a major amount of time is wasted by computing url hashes. The computation does an intranet-check, which needs a DNS lookup. As a result, each urlhash computation needed 100-200 milliseconds, which delayed remote searches by at least 1 second more than necessary. The solution to this problem is to attach a URL hash to the URL data structure, because that means that the url hash value can be filled after retrieval of the URL from the database. The redesign of the url/urlhash management caused a major redesign of many parts of the software. Since it had been decided to give up some parts, they were removed during this change to avoid unnecessary maintenance of unused code.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-05 09:01:35 +00:00
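
The design change in commit daf0f74361, carrying the hash inside the URL object so it is computed at most once or filled in from the database, can be sketched like this. The class name, the computation counter and the placeholder hash function are assumptions; YaCy's real hash involves the intranet check and DNS lookup mentioned in the commit.

```java
class HashedUrlSketch {
    static int hashComputations = 0; // counts expensive computations, for demonstration only

    private final String url;
    private String hash; // null until computed or supplied

    // New URL: the hash is computed lazily on first use.
    HashedUrlSketch(String url) { this(url, null); }

    // URL loaded from the database: the stored hash is attached and never recomputed.
    HashedUrlSketch(String url, String storedHash) {
        this.url = url;
        this.hash = storedHash;
    }

    String hash() {
        if (hash == null) hash = computeHash(url);
        return hash;
    }

    // Placeholder for the expensive computation; not YaCy's hash function.
    private static String computeHash(String url) {
        hashComputations++;
        return Integer.toHexString(url.hashCode());
    }

    public static void main(String[] args) {
        HashedUrlSketch fresh = new HashedUrlSketch("http://example.org/");
        fresh.hash();
        fresh.hash(); // second call reuses the cached value
        HashedUrlSketch fromDb = new HashedUrlSketch("http://example.org/", "storedHash");
        fromDb.hash(); // no computation at all
        System.out.println("computations: " + hashComputations);
    }
}
```

The sketch is not thread-safe; a shared instance would need synchronization or a volatile field.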
orbiter
a34d9b8609 * added a search history cache that maintains search results for 10 minutes
it is necessary for the new search process that will do automatic re-searches
a positive effect is, that when a re-search is done it can be monitored how many
results had been contributed from other peers. The message for this contribution
was moved from the end of the result page to the top.
* enhanced re-search time when a global search was done and the local index
already has a great number of results for this word
* re-organised presearch computation; must be further enhanced

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4059 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-24 23:12:59 +00:00
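
A search history cache that keeps results for 10 minutes, as introduced in commit a34d9b8609, is essentially a map with a time-to-live check. A minimal sketch follows; the class name, the string-list result type and the explicit `now` parameter are assumptions made so that the example stays small and testable.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SearchCacheSketch {
    static final long TTL_MS = 10 * 60 * 1000L; // results are kept for 10 minutes

    private static final class Entry {
        final List<String> results;
        final long storedAt;
        Entry(List<String> results, long storedAt) {
            this.results = results;
            this.storedAt = storedAt;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();

    void put(String query, List<String> results, long now) {
        cache.put(query, new Entry(results, now));
    }

    // Returns the cached results, or null if there are none or they are older than TTL_MS.
    List<String> get(String query, long now) {
        Entry e = cache.get(query);
        if (e == null) return null;
        if (now - e.storedAt > TTL_MS) {
            cache.remove(query); // expired: drop the entry
            return null;
        }
        return e.results;
    }

    public static void main(String[] args) {
        SearchCacheSketch cache = new SearchCacheSketch();
        long t0 = System.currentTimeMillis();
        cache.put("yacy", Arrays.asList("http://yacy.net/"), t0);
        System.out.println(cache.get("yacy", t0 + 60_000L));      // still cached
        System.out.println(cache.get("yacy", t0 + 11 * 60_000L)); // expired
    }
}
```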
orbiter
bb426565f0 added new yacy protocol for mass url-pull for better remote crawling distribution
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4056 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-22 00:59:05 +00:00
low012
54004e929b *) Better Bourne-Shell (OpenSolaris) compatibility, update and restart really work now. As the Bourne-Shell is the grandfather of most modern shells, it should also work with Linux (tested with Mandriva, works) and OSX (Please test!).
*) Fixed a typo.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4054 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-20 21:52:52 +00:00
orbiter
344911bfaa shorter minimum delay values for intranet crawl targets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4047 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-15 23:18:12 +00:00
orbiter
b5346141b3 made the plasmaHTCache static (there is only one internet, so we need only one cache)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4045 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-15 21:31:31 +00:00
orbiter
757703a938 synchronization of access tracker to avoid java-internal loop in TreeMap during shutdown
see http://forum.yacy-websuche.de/viewtopic.php?p=1178#p1178

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4017 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-31 10:42:11 +00:00
orbiter
9ca46a8c69 indexing of local (intranet) urls enabled
To do this, one must create a separate YaCy network that has a local URL domain
A description how to do this is here: http://www.yacy-websuche.de/wiki/index.php/De:Netzdefinition

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4001 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-24 00:46:17 +00:00
orbiter
511dcbb172 fixed encoding bug made in SVN 3993
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3998 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-23 00:50:57 +00:00
orbiter
40b0547611 - documentation changes (removed old forum links)
- different handling of link quotation
- different handling of link normalization
- enhanced html/unicode en/de-coding

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3993 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-19 15:32:10 +00:00
orbiter
b6d9cca67e - fixed problem with yacyVersion and own version generation
- within this context: generalized date format handling
- extended Update interface:
 * a version lookup can be triggered manually
 * a complete lookup + download + re-boot process can be triggered with one click

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3986 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-16 23:47:21 +00:00
orbiter
5444b07674 fixed bug with decompression of index abstracts
this fixes a problem that occurred when searching for several words

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3968 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-15 12:39:16 +00:00
orbiter
924ae39170 replaced old map loading method with new implementation which is more robust against change of line termination methods
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3967 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-15 11:45:41 +00:00
orbiter
36a37f758b fix for oom exception during release download
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=101&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3950 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-03 22:55:47 +00:00
orbiter
21fabe259b another fix to the restart function; now tested under linux
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3947 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-01 22:43:08 +00:00
orbiter
28baecd41b another fix for the concurrentModificationException in AccessTracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3944 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-30 23:01:57 +00:00
orbiter
19786b73b6 next try for a better restart
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3941 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-29 21:36:18 +00:00
orbiter
c5c268c43e tried to fix restart button
** could someone please test this on their linux platform **
** and give feedback on whether the restart works? **

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3937 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-29 12:46:08 +00:00
orbiter
e03fcf4627 SSI fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=29
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3936 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-29 10:45:13 +00:00
orbiter
9bbd39b67c - removed unfinished auto-updater from roland and martin
- added new download-option for releases on the status page
still missing:
- thomas-style restart for linux/mac
- untar/gunzip on shell basis
(comes next)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3931 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-28 14:52:26 +00:00
orbiter
1782ef57e5 - added SSI parser and include directive for <!--# include virtual="<file>" -->
- added chunked file transfer for non-yacy clients
- SSIs are streamed using chunked transfer, partly delivered pages can be seen in browser before transmission is finished
- added client-side network unit identification
- cleaned up code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-26 14:37:10 +00:00
orbiter
0e57a8062b added network definition for different YaCy networks
(needs much more work)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3919 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-22 14:29:14 +00:00
michitux
25529290ca - 2 small changes in documentation
- hopefully fixed logging of GCs (in order to avoid things like "performed necessary GC, freed 18014398509481565 KB (requested/available/average: 4096 / 1631 / 2957 KB)") with the help of KoH


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3909 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-17 19:32:38 +00:00
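
The commit above does not say what caused the value 18014398509481565 KB. One way such a number can arise, shown here purely as an assumption, is a negative byte difference converted to KB with an unsigned shift: a difference of -429056 bytes shifted by `>>> 10` gives exactly that figure. The sketch below reproduces the effect and shows a clamped variant; the class and method names are hypothetical.

```java
class GcLogSketch {
    // Possible source of the huge value: an unsigned shift on a negative difference.
    static long freedKBUnsigned(long freeBefore, long freeAfter) {
        return (freeAfter - freeBefore) >>> 10;
    }

    // Clamped variant: a GC that "frees" a negative amount is reported as 0 KB.
    static long freedKB(long freeBefore, long freeAfter) {
        return Math.max(0L, freeAfter - freeBefore) >> 10;
    }

    public static void main(String[] args) {
        long freeBefore = 429_056L; // bytes free before the GC
        long freeAfter = 0L;        // fewer bytes free afterwards, e.g. due to concurrent allocation
        System.out.println(freedKBUnsigned(freeBefore, freeAfter));
        System.out.println(freedKB(freeBefore, freeAfter));
    }
}
```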
orbiter
6518bb6c08 changed release strategy:
we will provide two different releases in the future, one standard release and one 'pro'-release.
the 'pro'-release contains all additional parsers AND has different default performance values.
The pro-version therefore differs from the previous 'all'-version in these default values.
The pro-configuration is automatically chosen if the libx-folder exists. Once a version is initialized, its configuration stays the same regardless of whether a libx folder exists.
The ant targets had been changed. There are now 3 different targets to create standard and pro-releases, and one target to upgrade:
- dist: creates a standard release (only, no libx target any more)
- distPro: creates a pro-release (includes the libx)
- distExt: creates a libx-release which includes the libx-folder only. It may be used to upgrade from standard to pro
Furthermore, the naming of 'dev'-releases had been removed.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3902 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-16 14:11:52 +00:00
orbiter
069562a14d fixed problem with re-crawl; replaced error file-db with ram-db
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3900 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-15 23:47:08 +00:00
orbiter
c7a614830a several bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3899 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-15 17:45:49 +00:00
(no author)
2784820ee3 *) moving sleep to a better place
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3895 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-14 16:53:22 +00:00
theli
7a1b811d18 *) bugfix for SocketException:
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3893 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-14 15:58:10 +00:00
orbiter
2b937abef1 slightly different behavior in shutdown sequence for http server threads:
- first close streams
- pause (the pause that was previously made in httpdFileHandler)
- close sockets

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3890 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-14 11:58:20 +00:00
karlchenofhell
e1d809d5f1 - more detailed logging of MEMORY messages
- forced GCs don't contribute to heuristics anymore

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3881 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-13 15:03:56 +00:00
orbiter
0b10ef64ba better server access tracking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3878 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-13 13:05:51 +00:00
orbiter
66ec8b63c1 added a httpd access tracker:
- all requests to the peer's own httpd can now be listed in the access tracker menu
- the search statistics have been renamed to access tracker and extended by this tracker

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3861 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-11 14:05:20 +00:00
karlchenofhell
8bff810d19 - fixed logging output of serverMemory.request()
- don't start up if DATA/yacy.running exists as this is usually a sign of an already started yacy-instance

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3831 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-08 12:45:03 +00:00
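Note on the DATA/yacy.running check in the commit above: a minimal sketch of a marker-file guard, assuming a modern JDK with java.nio.file (the 2007 code base targeted Java 1.4 and will have used different calls). The class and method names are hypothetical. As the commit says, an existing marker is only "usually" a sign of a running instance, since a crash can leave a stale file behind.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration, not YaCy code.
final class RunningMarkerSketch {

    // Creates the marker atomically; returns false if it already exists.
    static boolean tryAcquire(Path marker) {
        try {
            Files.createFile(marker);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // another instance is probably running, or a stale marker was left behind
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Removes the marker on orderly shutdown.
    static void release(Path marker) {
        try {
            Files.deleteIfExists(marker);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Creating the file and testing for its existence in one atomic call avoids the race between a separate exists-check and a later create.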
karlchenofhell
f05ca43780 - the wiki-parser works for remote wiki-code now, not displaying links anymore as if they were local (ViewProfile comment)
- fixed wrong link to CrawlStart on Status-page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3816 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-07 11:35:48 +00:00
karlchenofhell
30c3d909b1 - fixed charset problem in ConfigProfil_p.html (use accept-charset="UTF-8" in forms)
- fixed wrong XML output if no peers are known in Network.xml
- simplified parsing of table properties in wikiCode and ZTableToken
- reimplemented GC heuristics. They are needed to constantly ensure that an amount of free memory is available which is higher than Java's max. limit for performing a Full GC (please use serverMemory.request(long, boolean) rather than serverMemory.available(long, boolean) to provide data for averaging over the last GCs)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3793 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-05 11:37:19 +00:00
theli
e1a5babff1 *) Logging GUI handler: line-size is now set to max-size if max-size was exceeded
See: http://www.yacy-forum.de/viewtopic.php?p=36355

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3786 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-02 21:23:32 +00:00
(no author)
94cc9f05f5 *) Improvements for restart via update wrapper
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3785 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-02 15:25:13 +00:00
orbiter
33ad0c8246 added a web structure computation and logging:
- all web page parsing operations now add to a web structure file
- the file is computed in memory and dumped at shutdown-time to PLASMASB/webStructure.map in readable form (not a database)
- the file can be used externally to analyse the link structure of the crawled pages
- the web structure can also be retrieved using an xml-interface at http://localhost:8080/xml/webstructure.xml
- the short-term purpose is the computation of a link-graph image (before linuxtag!)
- a long-term purpose could be a decentralized computation of the citation rank



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3746 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-22 08:13:48 +00:00
karlchenofhell
baa9402b97 - wiki-parser is now configurable via the config setting wikiParser.class which holds the class-name for the parser to use
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3742 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-20 16:19:25 +00:00
karlchenofhell
601fc7d1c5 - added source to J7Zip-modifed.jar and its license (changelog is still to come)
- moved HTML-*replace-methods from wikiCode to de.anomic.data.htmlTools
- prepared use of different wiki parsers as suggested here: http://www.yacy-forum.de/viewtopic.php?p=34444#34444

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3741 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-20 13:29:12 +00:00
karlchenofhell
0a64047081 - plasmaParserDocument can process subdocuments now (other archive-parsers may want to use this method)
- added 7zip parser
- added 'text/sgml' to realtime parseable mimetypes (sometimes returned by the mime type parser)
- added new cached output stream class, very suitable for parsers because of limited memory

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3740 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-18 23:13:44 +00:00
theli
b30e64daab *) passing homepath to serverLog.configureLogging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3738 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-18 13:04:26 +00:00
orbiter
b3f97b5c38
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3735 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-16 17:45:39 +00:00
karlchenofhell
086239da36 - added servlet: remote crawler queue overview
- added servlet: crawl profile editor

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3731 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-16 10:11:25 +00:00
orbiter
2fa8b50e54 reverting svn 3691+3692
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3696 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-09 19:31:40 +00:00
orbiter
22a0e9f117 more timeout-control
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3692 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-09 14:53:17 +00:00
orbiter
24db55a541 added timeout for httpd-sockets during read
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3691 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-09 14:30:01 +00:00
orbiter
7f56c8d4aa fixed some seed selection details
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3685 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-07 22:22:35 +00:00
theli
0b5fc3c28c *) moving date functions to serverDate class
*) Sitemap-parser
   - logging added
   - parsing of modDate added

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3667 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-06 12:36:49 +00:00
rramthun
d6811ac243 *) Moving tar.jar from libx to lib
*) Enhanced interface

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3649 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-04 19:46:23 +00:00
theli
469583ea80 *) new interface class. should be implemented by the updater to allow communication between the updater and yacy
(not yet functional)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3648 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-04 14:22:10 +00:00
theli
7c902996b5 *) changes required for the uploaderWrapper
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3618 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-28 16:04:37 +00:00
orbiter
595ee10468 fixed database inconsistency bugs
inserted many debug lines
added a huge number of asserts
extended database test methods


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3579 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-19 13:37:02 +00:00
orbiter
7a7a1c7c29 fight against problems with remove-methods and synchronization
- some bugs may have been fixed with wrong removal operations
- removed temporary storage of remove-positions and replaced by direct deletions
- changed synchronization
- added many asserts
- modified dbtest to also test remove during threaded stresstest

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3576 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-17 15:15:47 +00:00
(no author)
6186185775 *) Moved some comments to javadoc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3573 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-14 10:11:37 +00:00
orbiter
fcdf000fbc bugfix for http://www.yacy-forum.de/viewtopic.php?p=33838#33838
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-03 22:08:40 +00:00
orbiter
ba2c307ab3 optimized memory allocation in kelondroRow.Entry
such an entry no longer has to allocate a new byte[] when it is instantiated; instead
it can re-use memory from other kelondroRow.Entry objects.
other bugs may also have been fixed along the way, possibly including the INCONSISTENCY
problem. One cause can be missing synchronization during bulk storage
when a R/W-path optimization is done. To test this case, the optimization is currently
switched off.
More memory enhancements can be done after this initial change to the allocation scheme.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3536 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-03 12:10:12 +00:00
orbiter
a5d668c0c6 added speed-buttons for easy performance setting
appears in crawl start and on indexing monitor page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3473 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-12 16:24:28 +00:00
orbiter
d755a8026d - better OOM protection
- better memory allocation for FlexTable indexes
- splitting between static index and dynamic index (only the dynamic part must grow)
- to enable a merge-iteration of the newly split index, a huge number of classes needed to be adapted for new iterator classes
- added new iterator classes that support cloneable iterators
- adapted all iterator classes to implement cloneable iterators

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-08 16:15:40 +00:00
orbiter
5d5e6ebfcc fix for http://www.yacy-forum.de/viewtopic.php?p=32631#32631
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3436 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-07 08:54:07 +00:00
orbiter
51e12049fa third generation of R/W head path optimization
- data from collection arrays are read in order
- merged data is written in order

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3419 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-28 11:13:23 +00:00
karlchenofhell
6fbe31425a - some code-cleanup (no more syntax-warnings here)
- added deletion from loadedURLs of URLs to be blacklisted in IndexControl_p

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 12:56:50 +00:00
orbiter
dc0c06e43d PLEASE MAKE A BACK-UP OF YOUR COMPLETE DATA DIRECTORY BEFORE USING THIS
redesign for better IO performance
enhanced database seek-time by avoiding write operations at distant
positions of a database file. until now, a USEDC counter was written
at the head-section of a kelondroRecords database file (which is the
basic data structure of all kelondro database files) to store the
actual number of records that are contained in the database. Now, this
value is computed from the database file size. This is either done
only once at start-time, or continuously when run with asserts enabled.
The counter is then updated only in RAM, and written at close of the
file. If the close fails, the correct number can be computed from the
file size, and if this is not equal to the stored number it is strong
evidence that YaCy was not shut down properly.
To preserve consistency, the complete storage-routine had to be re-written.
Another change enhances read of nodes in some cases, where the data-tail
can be read together with the data-head. This saves another IO lookup during
each DB node fetch.
Includes also many small bugfixes.
IF ANYTHING GOES WRONG, ALL YOUR DATA IS LOST: PLEASE MAKE A BACK-UP

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3375 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 08:35:51 +00:00
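Note on the record counter described in the commit above: a minimal sketch of deriving the record count from the file size and comparing it with the stored counter. It assumes a fixed-size head section followed only by fixed-size records. The real kelondroRecords layout is not described in this log, so the parameters headerSize and recordSize, and all names below, are hypothetical.

```java
// Hypothetical illustration, not the kelondroRecords implementation.
final class RecordCountSketch {

    // Number of records implied by the file size, assuming header + n * recordSize.
    static long recordsFromFileSize(long fileSize, long headerSize, long recordSize) {
        if (recordSize <= 0 || headerSize < 0 || fileSize < headerSize) {
            throw new IllegalArgumentException("inconsistent sizes");
        }
        return (fileSize - headerSize) / recordSize;
    }

    // A stored counter that disagrees with the computed one suggests an unclean shutdown.
    static boolean looksCleanlyClosed(long storedCount, long fileSize, long headerSize, long recordSize) {
        return storedCount == recordsFromFileSize(fileSize, headerSize, recordSize);
    }
}
```

For example, with a 1024-byte header and 64-byte records, a 1664-byte file implies 10 records; a stored counter of 9 would then indicate that the last close did not complete.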
karlchenofhell
c016fcb10f - added streaming-support to CrawlURLFetchStack_p servlet
- bugfix for NPE in list.java
- use more constants

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-19 12:47:46 +00:00
orbiter
1f1f398bfa enhanced speed of RAM cache flush by factor 20 (twenty times faster)
- the speed was doubled by avoiding read access during the dump
- the speed was dramatically increased at least by factor 10
   by using a temporary ram-file where the structures are flushed to
   before it is dumped then as a whole byte-chunk to the file system.
The speed enhancements also affect some other parts of the database.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3353 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-08 23:21:46 +00:00
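Note on the flush strategy described in the commit above: a minimal sketch of serializing structures into a RAM buffer first and then handing the result to the file system as one byte chunk. The record format used here (entry count, then key, length, value per entry) is an assumption made for the example, and the names are hypothetical; the actual YaCy cache format is not described in this log.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Map;

// Hypothetical illustration, not YaCy code.
final class BufferedDumpSketch {

    // Builds the complete dump image in memory; no file IO happens here.
    static byte[] serialize(Map<String, byte[]> entries) {
        ByteArrayOutputStream ram = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(ram)) {
            data.writeInt(entries.size());
            for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                data.writeUTF(e.getKey());
                data.writeInt(e.getValue().length);
                data.write(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ram.toByteArray();
    }

    // Writes the prepared image to the target with a single write call.
    static void dump(Map<String, byte[]> entries, OutputStream target) {
        byte[] image = serialize(entries);
        try {
            target.write(image);
            target.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In practice the target would be a FileOutputStream. The point of the design is that many small writes are replaced by one large sequential write.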
orbiter
7673f0869b minor enhancements
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3344 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-06 16:01:03 +00:00
orbiter
fcc11391a8 some redesign attempts because sorting of lastseen does not work correctly
not finished yet
target: better selection of peer-ping targets, which should enhance stabilization of the net

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3319 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-02 13:12:31 +00:00
orbiter
306c50ac40 QPM (queries per minute) statistic stub
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3308 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 15:39:11 +00:00
orbiter
7598e1243e removed unused variables/imports
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3306 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 09:28:47 +00:00
allo
98cb777e18 abstract wikiCode in putWiki
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3293 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-29 15:09:58 +00:00
karlchenofhell
15f0334cd3 - fixed IllegalThreadStateException in LogParser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3265 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-21 14:45:52 +00:00
hydrox
814a09a0ed *) reversed r3250 and parts of r3252 (nanoTime() is a Java 1.5 function)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 11:10:57 +00:00
hydrox
f7623f5d24 *) added missing measuring points for Parser-Runtime
*) changed precision of Parser-Runtime from ms to ns

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3250 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 09:25:04 +00:00
karlchenofhell
5d540b219e - LogalizerHandler skips interfaces again
- added LogParser stats to LogStatistics

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3234 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 17:01:20 +00:00
allo
e1fb3550ab fix for profile
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3231 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 14:52:51 +00:00
hydrox
6faf9b70b7 *) LogParserPLASMA now counts its total runtime.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3229 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 13:35:33 +00:00
hydrox
e5f854bc37 *) added LogalizerHandler-settings to yacy.logging.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3228 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-17 13:25:11 +00:00
karlchenofhell
77b73aa7a8 - log-entries 'Indexed' are parsed correctly now
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3222 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 18:42:34 +00:00
karlchenofhell
71112b1fe6 - added LogStatistics_p.html servlet based on the logalizer (indexing values not functional yet due to charset/regex problems)
add the following to DATA/LOG/yacy.logging:
---
# Properties for the LogalizerHandler
de.anomic.server.logging.LogalizerHandler.enabled = true
de.anomic.server.logging.LogalizerHandler.debug = false
de.anomic.server.logging.LogalizerHandler.parserPackage = de.anomic.server.logging.logParsers
---
and "de.anomic.server.logging.LogalizerHandler" to the list of global handlers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3219 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 16:13:21 +00:00
allo
0c81bd39d4 XSS-safe put as default.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3217 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 14:07:54 +00:00
karlchenofhell
bdda9e802f - added some commented string constants to ease use of the result-table
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3215 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 05:51:39 +00:00
karlchenofhell
4dce5ec261 - if mem is too low but former GCs helped, the word-cache limit is only decreased now, if a subsequent GC doesn't
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3197 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-13 06:30:17 +00:00
hydrox
c27e88104c *) getResults() should now work and compile properly with Java 1.4
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3191 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-12 11:12:09 +00:00
hydrox
a2fb54afff *) Quickfix for http://www.yacy-forum.de/viewtopic.php?p=29973#29973 getResults() used a Java 1.5 function (output is temporarily disabled until a solution with 1.4 functions is implemented)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3190 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-12 10:39:22 +00:00
hydrox
3acd90033c *) added functions to get results from log-parsers (not documented yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3186 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-11 09:18:10 +00:00
karlchenofhell
0336480a3e - the maxMemory-fix for the Sun JVM 1.4.2 wrongly also applied to 1.6, thx to NN
- added logging of reducing word-cache (log-level fine)
- disabled memprereq field in PerformanceQueues_p.html, because it is now set by the collections db
- minor changes to ConfigSkins / -Language

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3165 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-05 08:00:05 +00:00
allo
6ff8359b98 possibility to use a bindPort that differs from the externally reachable port.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3161 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-03 21:00:07 +00:00
karlchenofhell
d6eb699e8e - fix for last commit (didn't know that the paragraph sign has an UTF-8-specific location)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3134 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-23 14:04:21 +00:00
karlchenofhell
41bc31d2c2 - ConfigAdvanced_p => XHTML (no invalid IDs)
- removed unmappable characters from code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3133 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-23 13:35:34 +00:00