Commit Graph

4070 Commits

Author SHA1 Message Date
orbiter
9a7b093eed tried to avoid endless loop, see also:
http://forum.yacy-websuche.de/viewtopic.php?f=6&t=467&hilit=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4175 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-22 14:35:45 +00:00
orbiter
9d539ec621 added option to display the network name as page greeting instead of the page greeting string
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4174 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-22 08:01:44 +00:00
orbiter
b856e377a9 some additions and a small bugfix to SVN 4158
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4173 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-21 23:26:22 +00:00
hermens
501a7aae90 Small correction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4172 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-20 12:02:31 +00:00
hermens
caff520988 Removed unnecessary and unused code.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4171 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-20 11:56:15 +00:00
hermens
d732840f8a Avoid ConcurrentModificationException when accessing the PerformanceQueues page while yacy is indexing.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4170 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-19 23:36:40 +00:00
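Illustration for d732840f8a: the commit message does not say how the exception was avoided. One common approach, sketched here under that assumption, is to let the page iterate over a snapshot copy while the indexing thread keeps mutating the live list. The class and method names are hypothetical and are not taken from the YaCy sources.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a queue that an indexing thread mutates while a
// status page reads it; this is not YaCy's actual class.
final class QueueSnapshotSketch {
    private final List<String> jobs = new ArrayList<>();

    synchronized void add(String job) { jobs.add(job); }

    synchronized void remove(String job) { jobs.remove(job); }

    // Copy under the lock, so the caller iterates a private list that no
    // other thread can modify.
    synchronized List<String> snapshot() { return new ArrayList<>(jobs); }

    public static void main(String[] args) {
        QueueSnapshotSketch queue = new QueueSnapshotSketch();
        queue.add("a");
        queue.add("b");
        for (String job : queue.snapshot()) {
            queue.remove(job); // safe: the loop walks the copy, not the live list
        }
        System.out.println("remaining jobs: " + queue.snapshot().size());
    }
}
```

The copy costs O(n) per page view, which is usually acceptable for a status page that is requested far less often than the queue changes.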
fuchsi
35303f9504 add real size values (KBytes) of the DHT-In/Out-RAM-Caches to the PerformanceQueues page. A lot of users seem to tweak this value and it might help in finding the best size in relation to the peer's memory resources.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4169 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-19 21:47:07 +00:00
fuchsi
a9aef8e5e0 remove duplicate entries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4168 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-19 16:24:37 +00:00
fuchsi
38bbd4a4b3 no code changes. just touched yacyClient.java to trigger a rebuild of the file in an uncleaned tree.
NOTE: run "ant clean" before building SVN 4166/4167 in a tree that includes class files from a previous build to make sure that every class file is rebuilt!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4167 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-19 15:31:38 +00:00
fuchsi
f717beecb1 - Changed yFormatter handling to be more flexible and produce more readable code for server pages. There are serverObject.putNum() methods to allow adding of number type values in a formatted form, and put() methods for number types that add them without formatting. This reduces the need to transform them into Strings in server pages and removes the HTML encoding step which is unnecessary for numbers.
- some minor code cleanups (mostly unnecessary casts, null checks)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4166 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-19 04:13:46 +00:00
fuchsi
ca83f5a8d9 Add external lib FontBox which is part of the PDFBox (they extracted the font handling code into this package in 0.7.3).
Add the packages to the eclipse .classpath.
Closes: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=453

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4165 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-18 19:53:52 +00:00
fuchsi
3352474dd8 Remove grouping separator in Network.xml (yacystats will work without it) and format a few more numbers.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4163 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-16 13:29:11 +00:00
fuchsi
06e6a1ff62 Add a generalized Formatter class yFormatter inspired by http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437
At the current state it allows formatting of numbers (integer + decimal types) for output according to the Locale derived from the language setting in yacy. Network.(html|xml) and Status.html have been changed to use it for now (TODO: should be integrated into other servlets as well to reduce duplicate formatting code).
NOTE: For now the output format for Network.xml simulates the old behaviour which is wrong (it uses '.' as decimal and grouping separator), to make sure external scripts like the yacystats.de one won't break with this update.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4162 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-16 02:12:31 +00:00
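Illustration for 06e6a1ff62: a minimal sketch of Locale-dependent number formatting with the standard java.text.NumberFormat class. It only shows the general technique; the class and method names are hypothetical and do not reproduce yFormatter's API.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical helper, not the yFormatter class from the commit.
final class LocaleNumberSketch {
    // Integer types: grouping separators only.
    static String formatInteger(long value, Locale locale) {
        return NumberFormat.getIntegerInstance(locale).format(value);
    }

    // Decimal types: grouping separators plus exactly two fraction digits.
    static String formatDecimal(double value, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        format.setMinimumFractionDigits(2);
        format.setMaximumFractionDigits(2);
        return format.format(value);
    }

    public static void main(String[] args) {
        System.out.println(formatInteger(1234567L, Locale.US));
        System.out.println(formatInteger(1234567L, Locale.GERMANY));
        System.out.println(formatDecimal(1234.5, Locale.GERMANY));
    }
}
```

Deriving the Locale once from the language setting and passing it in keeps the formatting code free of per-page special cases.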
fuchsi
e77aec8c9d fix handling of encrypted PDF-Documents (with default user password "")
- update PDFBox package to current version 0.7.3
- use new security model in PDFBox to "guess" whether we can decrypt a document or not
NOTE: When upgrading to this version make sure the old PDFBox-0.7.2.jar is removed from libx/

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4161 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-15 13:18:38 +00:00
low012
b54fcd732b *) fixed exceptions that occurred when non-integer values were entered where integers were expected
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4160 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-12 19:09:20 +00:00
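Illustration for b54fcd732b: a minimal sketch of defensive integer parsing for form input, assuming the fix falls back to a default value instead of letting NumberFormatException propagate. The helper name is hypothetical.

```java
// Hypothetical helper; the commit does not show the actual code.
final class SafeIntSketch {
    // Returns the parsed value, or the fallback when the input is missing
    // or is not a valid integer.
    static int parseOrDefault(String raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("42", 0));
        System.out.println(parseOrDefault("forty-two", 0));
    }
}
```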
low012
52c68875bd *) removed (hopefully only) surplus double encodings (http://forum.yacy-websuche.de/viewtopic.php?t=368)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4159 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-12 15:27:23 +00:00
fuchsi
b5f7df8d0a Speed up remove operations in rowCollections.
- Array element shifting during remove is only done when it is necessary to keep the order of a row collection.
- This will speed up the most expensive operation "common word shrinking" by a factor of 500-1000 (in the worst cases we shifted > 60 GB of data during this operation)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4158 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-11 17:17:08 +00:00
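Illustration for b5f7df8d0a: the commit says element shifting is skipped when order does not need to be kept. A common way to do that, assumed here, is to overwrite the removed slot with the last element. The class below is a hypothetical int-based stand-in, not YaCy's rowCollection implementation.

```java
import java.util.Arrays;

// Hypothetical fixed-capacity row store used only to contrast the two removals.
final class SwapRemoveSketch {
    private final int[] rows;
    private int size;

    SwapRemoveSketch(int... initial) {
        this.rows = initial.clone();
        this.size = initial.length;
    }

    // Order-preserving remove: shifts every element after the index, O(n).
    void removeOrdered(int index) {
        checkIndex(index);
        System.arraycopy(rows, index + 1, rows, index, size - index - 1);
        size--;
    }

    // Unordered remove: moves the last element into the hole, O(1).
    void removeUnordered(int index) {
        checkIndex(index);
        rows[index] = rows[size - 1];
        size--;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }

    int[] toArray() {
        return Arrays.copyOf(rows, size);
    }

    public static void main(String[] args) {
        SwapRemoveSketch store = new SwapRemoveSketch(1, 2, 3, 4, 5);
        store.removeUnordered(1);
        System.out.println(Arrays.toString(store.toArray()));
    }
}
```

When many rows are removed in one pass, the unordered variant avoids repeatedly copying the tail of the array, which is where the shifted data volume mentioned in the commit would come from.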
fuchsi
5c91359297 accidentally committed personal testing values as defaults in last commit.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4157 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-10 15:20:22 +00:00
fuchsi
e255888095 Add headless AWT, nice level and memory parameters to the init script. It should work like the startYACY.sh now.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4156 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-10 15:19:07 +00:00
fuchsi
ce0bb1dc8a Increase defaults for the DHT Receive Limits to prevent "busy" states.
see 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4155 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-10 10:07:16 +00:00
low012
fdb0b861f8 *) fixed wrong calculation of network words, network links, network PPM if peer is senior or principal peer
*) added network QPH
*) banner is cached for 1 second to avoid DOS
*) still no logo


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4154 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-09 21:47:37 +00:00
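Illustration for fdb0b861f8: a minimal sketch of a time-limited cache, assuming the banner is re-rendered at most once per second and served from memory otherwise. The class name, the injected clock and the renderer callback are hypothetical.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical cache wrapper; the commit does not show the actual code.
final class TimedCacheSketch<T> {
    private final long ttlMillis;
    private final Supplier<T> renderer;
    private final LongSupplier clock;
    private T cached;
    private long renderedAt;
    private boolean hasValue;

    TimedCacheSketch(long ttlMillis, Supplier<T> renderer, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.renderer = renderer;
        this.clock = clock;
    }

    // Re-renders only when no value exists yet or the cached one has expired.
    synchronized T get() {
        long now = clock.getAsLong();
        if (!hasValue || now - renderedAt >= ttlMillis) {
            cached = renderer.get();
            renderedAt = now;
            hasValue = true;
        }
        return cached;
    }

    public static void main(String[] args) {
        TimedCacheSketch<String> banner = new TimedCacheSketch<>(
                1000L, () -> "rendered at " + System.currentTimeMillis(), System::currentTimeMillis);
        System.out.println(banner.get());
        System.out.println(banner.get()); // served from the cache within one second
    }
}
```

With this shape, a burst of requests triggers at most one expensive rendering per interval, which is the point of caching against request floods.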
fuchsi
3b8540198b finally fix the init script. tested this time.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4153 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-09 14:53:16 +00:00
fuchsi
508de558f7 sbStackCrawlThread is null during first cleanProfiles() run at startup.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4152 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-08 15:56:40 +00:00
fuchsi
70614385ef Attempt to fix the "lost profile handle" bug.
It seems improbable, but it might happen, that during a crawl all queues (indexing, crawling, ...) except the crawl URL stacker ran empty. This commit adds an additional check for an empty crawl stacker queue before executing the profile cleaner.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4151 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-08 15:11:26 +00:00
fuchsi
905e7e60f5 change dir to the yacy base directory before starting it making sure relative path specifications work properly.
see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=405

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4150 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-08 13:40:49 +00:00
low012
507ecd8afa *) added banner that can be displayed like this: http://localhost:8080/Banner.png
possible arguments: textcolor, bgcolor, bordercolor
   example: http://localhost:8000/Banner.png?textcolor=ffffff&bgcolor=121212&bordercolor=ffffff
   take care: YaCy uses CMY color model!
*) there are still some known bugs, but I can't continue coding right now


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4149 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-07 21:59:36 +00:00
fuchsi
70884da0eb Add external package JARs to classpath. (copied from startYACY.sh)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4148 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-07 19:28:21 +00:00
fuchsi
ebfd1e0b42 remove left over '>' in description and replace ' ' by '+' in rss search where URL-encoded parameters are required.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4147 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-05 18:52:15 +00:00
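Illustration for ebfd1e0b42: a minimal sketch of encoding a search term for use as a URL query parameter with the standard java.net.URLEncoder, which writes a space as '+'. The wrapper class is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical wrapper around the standard encoder.
final class QueryParamSketch {
    static String encode(String term) {
        try {
            // application/x-www-form-urlencoded: ' ' becomes '+', reserved
            // characters become %XX escapes of their UTF-8 bytes.
            return URLEncoder.encode(term, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("peer to peer"));
        System.out.println(encode("a&b=c"));
    }
}
```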
orbiter
4ce25b3661 - documentation update
- start of new development cycle.
in case you don't know: commits until 0.553 will not automatically be used by the auto-update function


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4146 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 20:43:52 +00:00
orbiter
1a0f89d7e8 release 0.55
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4145 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 20:12:08 +00:00
fuchsi
ed20531e68 don't encode in channel element as well
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4144 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 12:12:27 +00:00
fuchsi
9b0948cb4c gnarf. mixed up the positions. finally fixed...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4143 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 10:58:01 +00:00
fuchsi
c0f5fc51ef bugfix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4142 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 10:47:48 +00:00
orbiter
33fb2f756d added emergency fail case in remote crawls
in extreme situations this will cause no remote crawls to be sent out any more
this is bad, but it protects against the case where failing remote crawls fill up the local queue too much,
which is even worse

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4141 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 10:40:30 +00:00
fuchsi
c5a8585ac6 fix more encoding problems in yacysearch.rss.
- URL encoding for search terms where required
- removed "ugly" CDATA escaping
- UTF-8 encoding for the XML
- no HTML style escaping for XML/RSS element values
Note: some unicode characters might still be encoded in a wrong way.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4140 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 09:21:03 +00:00
fuchsi
6b00fe0c4e fix ArrayIndexOutOfBoundsException
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4139 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-04 08:50:33 +00:00
low012
e2f3268c13 *) removed double encoding (http://forum.yacy-websuche.de/viewtopic.php?t=368)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4138 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 20:13:32 +00:00
orbiter
3e60ae93b9 modified remote search snippet fetch behavior: do not fetch snippets for more than 300 milliseconds, even if the snippets can be found locally without online fetch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4137 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 16:42:11 +00:00
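Illustration for 3e60ae93b9: a minimal sketch of bounding a fetch to a fixed time budget with java.util.concurrent, assuming the caller simply goes without a snippet when the budget runs out. The class name and the null return convention are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; the commit does not show how the limit is enforced.
final class BoundedFetchSketch {
    // Returns the fetched snippet, or null when the fetch fails or does not
    // finish within the time budget.
    static String fetchWithin(Callable<String> fetch, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> future = pool.submit(fetch);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the fetch that ran over budget
            return null;
        } catch (ExecutionException e) {
            return null; // the fetch itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchWithin(() -> "local snippet", 300L));
    }
}
```

A production version would reuse a shared thread pool rather than create one per call; the per-call executor only keeps the sketch self-contained.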
orbiter
97f1ca52bd fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=390
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4136 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 15:45:12 +00:00
orbiter
143fa40d77 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=394&p=2382#p2382
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4135 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 15:34:16 +00:00
orbiter
711641f167 extended client connection clean-up:
there are now two time-outs, one for the complete connection time, and one for an idle time
connections that have been idle for more than 2 minutes are closed, and connections that have been alive for more than one hour are also closed
if the total number of connections exceeds 64, the connections beyond 64 with the longest idle time are also closed

During normal operation of peers these forced closings should never appear,
but the existence of the idle connection check ensures the availability of the peer and the usability of the host.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4134 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 15:06:12 +00:00
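Illustration for 711641f167: a minimal sketch of the three rules from the commit message (2 minutes idle, one hour total lifetime, at most 64 connections). Only the limits come from the message; the class, the Conn record and the selection method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical policy object; not the actual YaCy connection tracker.
final class ConnectionCleanupSketch {
    static final long MAX_IDLE_MS = 2 * 60 * 1000L;   // idle time-out
    static final long MAX_AGE_MS = 60 * 60 * 1000L;   // complete connection time-out
    static final int MAX_CONNECTIONS = 64;            // cap on open connections

    static final class Conn {
        final String id;
        final long openedAt;
        final long lastUsedAt;

        Conn(String id, long openedAt, long lastUsedAt) {
            this.id = id;
            this.openedAt = openedAt;
            this.lastUsedAt = lastUsedAt;
        }
    }

    // Returns the connections that should be closed at time 'now'.
    static List<Conn> selectForClosing(List<Conn> open, long now) {
        List<Conn> close = new ArrayList<>();
        List<Conn> keep = new ArrayList<>();
        for (Conn c : open) {
            boolean idleTooLong = now - c.lastUsedAt > MAX_IDLE_MS;
            boolean aliveTooLong = now - c.openedAt > MAX_AGE_MS;
            if (idleTooLong || aliveTooLong) {
                close.add(c);
            } else {
                keep.add(c);
            }
        }
        // Most idle first: the smallest lastUsedAt has been idle the longest.
        keep.sort(Comparator.comparingLong((Conn c) -> c.lastUsedAt));
        int excess = keep.size() - MAX_CONNECTIONS;
        for (int i = 0; i < excess; i++) {
            close.add(keep.get(i));
        }
        return close;
    }
}
```

Applying the two time-outs first means the cap only ever removes connections that are otherwise healthy, which matches the message's remark that forced closings should not appear in normal operation.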
orbiter
b19bb6e5b1 - reverted SVN 4132; it did not solve the problem, and it removed the emergency method, which caused certain production failure within a few hours
- removed and added some debugging lines

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4133 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 14:34:05 +00:00
fuchsi
1eba408d2f Make sure that sockets which couldn't be opened aren't handled as active connections, in which case they wouldn't be closed.
Please test this and report any problems (connections that stay open for a very long time according to http://<your_yacy_peer>/Connections_p.html) to http://forum.yacy-websuche.de/viewtopic.php?f=5&t=386

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4132 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 12:18:26 +00:00
fuchsi
03c5b4ad68 more fixes to the yacysearch.rss, it's now 100% valid according to http://feedvalidator.org
- RFC-822 date time had to include the time instead of date only
- <opensearch:link> doesn't exist -> <atom:link>, see http://www.opensearch.org/Specifications/OpenSearch/1.1
- <link> elements are mandatory for <channel> and <item>

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4131 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 04:00:52 +00:00
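Illustration for 03c5b4ad68: a minimal sketch of producing an RFC-822 style date that includes the time, as RSS date elements require, using the standard SimpleDateFormat. The helper class is hypothetical, and fixing the zone to GMT is an assumption made to keep the output reproducible.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper; not the code from the commit.
final class Rfc822DateSketch {
    static String format(Date date) {
        // Locale.US keeps day and month names in English, as the format requires.
        SimpleDateFormat format =
                new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss Z", Locale.US);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date()));
    }
}
```

SimpleDateFormat is not thread-safe, so the sketch creates a new instance per call instead of sharing one.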
fuchsi
e3c6236eef fixed the last opensearch/rss issue. The GUID-Tag in RSS is supposed to contain a unique ID. By default, the ID is supposed to be a permanent link to the feed element (the permalink) in which case its content _must_ match the syntax of a URL. The guid _can_ contain a non-URL ID, but it _must_ be specified as such with an additional isPermaLink="false" attribute in this case.
see http://www.rssboard.org/rss-2-0#ltguidgtSubelementOfLtitemgt

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4130 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 00:46:30 +00:00
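Illustration for e3c6236eef: a minimal sketch of emitting a guid element that follows the rule described in the message. The class, its URL check and its escaping helper are hypothetical simplifications, not the code from the commit.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for building the <guid> element of an RSS item.
final class RssGuidSketch {
    // A plain <guid> is treated as a permalink by default, so non-URL IDs
    // must be marked with isPermaLink="false".
    static String guidElement(String id) {
        String escaped = escapeXml(id);
        if (looksLikeUrl(id)) {
            return "<guid>" + escaped + "</guid>";
        }
        return "<guid isPermaLink=\"false\">" + escaped + "</guid>";
    }

    // Simplified check: an absolute URI with a host part.
    static boolean looksLikeUrl(String value) {
        try {
            URI uri = new URI(value);
            return uri.isAbsolute() && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    static String escapeXml(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(guidElement("http://example.org/item?a=1&b=2"));
        System.out.println(guidElement("b183bf6f42"));
    }
}
```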
orbiter
d69d386f7d added additional forced client connection closing
if a specific number of simultaneous connections is reached
the limit is currently set to 64 connections

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4129 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 00:21:53 +00:00
orbiter
dea7bee049 - increased minimum time before an active connection is interrupted from 1 minute to 10 minutes
- added sorting by connection time in client connection table of connectionTimeComparatorInstance

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4128 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 23:56:04 +00:00
orbiter
f8e69ce4dc removed progress bar in Network list
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4127 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 22:50:47 +00:00
orbiter
c1440d2241 fixed problem with redirection: redirected URLs had not been tested with the double-check
see also: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=348

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4126 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 22:40:53 +00:00
orbiter
b183bf6f42 - fixed opensearch bugs
- added 'full domain' button to expert crawl start
- removed non-working 'only one domain' button; its regex allowed crawling of other domains

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4125 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 21:43:05 +00:00