Commit Graph

4078 Commits

Author SHA1 Message Date
fuchsi
1eba408d2f Make sure that sockets which couldn't be opened aren't handled as active connections, in which case they wouldn't be closed.
Please test this and report any problems (connections that stay open for a very long time according to http://<your_yacy_peer>/Connections_p.html) to http://forum.yacy-websuche.de/viewtopic.php?f=5&t=386

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4132 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 12:18:26 +00:00
fuchsi
03c5b4ad68 more fixes to the yacysearch.rss, it's now 100% valid according to http://feedvalidator.org
- RFC-822 date time had to include the time instead of date only
- <opensearch:link> doesn't exist -> <atom:link>, see http://www.opensearch.org/Specifications/OpenSearch/1.1
- <link> elements are mandatory for <channel> and <item>

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4131 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 04:00:52 +00:00
fuchsi
e3c6236eef fixed the last opensearch/rss issue. The GUID tag in RSS is supposed to contain a unique ID. By default, the ID is supposed to be a permanent link to the feed element (the permalink), in which case its content _must_ match the syntax of a URL. The guid _can_ contain a non-URL ID, but it _must_ be specified as such with an additional isPermaLink="false" attribute in this case.
see http://www.rssboard.org/rss-2-0#ltguidgtSubelementOfLtitemgt

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4130 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 00:46:30 +00:00
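The guid rule described in e3c6236eef can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the class RssGuid and its methods are hypothetical names, and treating "absolute URI with a host" as "is a permalink" is an assumption made for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not taken from YaCy: renders an RSS 2.0 <guid> element.
final class RssGuid {

    // Returns a <guid> element; non-URL IDs are marked with isPermaLink="false".
    static String guidElement(String id) {
        String escaped = escapeXml(id);
        if (isAbsoluteUrl(id)) {
            // isPermaLink defaults to "true", so a URL needs no attribute.
            return "<guid>" + escaped + "</guid>";
        }
        return "<guid isPermaLink=\"false\">" + escaped + "</guid>";
    }

    // Assumption for this sketch: a permalink is an absolute URI with a host part.
    static boolean isAbsoluteUrl(String s) {
        try {
            URI uri = new URI(s);
            return uri.isAbsolute() && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Minimal escaping for element content.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

With this sketch, an opaque ID such as abc123 is emitted with isPermaLink="false", while an http URL is emitted as a plain guid.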
orbiter
d69d386f7d added additional forced client connection closing
if a specific number of simultaneous connections is reached
the limit is currently set to 64 connections

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4129 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-03 00:21:53 +00:00
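The forced closing described in d69d386f7d might look something like the sketch below. The log does not say which connections are closed once the limit is hit, so closing the oldest ones first is an assumption, and ConnectionLimiter is a hypothetical name rather than a YaCy class.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: keeps at most `limit` connections open,
// force-closing the oldest ones when a new connection is registered.
final class ConnectionLimiter {
    private final int limit;
    private final Deque<Closeable> open = new ArrayDeque<>(); // oldest first

    ConnectionLimiter(int limit) {
        this.limit = limit;
    }

    synchronized void register(Closeable connection) {
        open.addLast(connection);
        while (open.size() > limit) {
            Closeable oldest = open.removeFirst();
            try {
                oldest.close();
            } catch (IOException e) {
                // The connection is being discarded anyway; nothing more to do here.
            }
        }
    }

    synchronized int openCount() {
        return open.size();
    }
}
```

A limiter matching the commit message would be created as new ConnectionLimiter(64).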
orbiter
dea7bee049 - increased minimum time before an active connection is interrupted from 1 minute to 10 minutes
- added sorting by connection time in client connection table of connectionTimeComparatorInstance

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4128 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 23:56:04 +00:00
orbiter
f8e69ce4dc removed progress bar in Network list
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4127 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 22:50:47 +00:00
orbiter
c1440d2241 fixed problem with redirection: redirected URLs had not been tested with the double-check
see also: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=348

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4126 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 22:40:53 +00:00
orbiter
b183bf6f42 - fixed opensearch bugs
- added 'full domain' button to expert crawl start
- removed non-working 'only one domain' button, the regex allowed crawling of other domains

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4125 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 21:43:05 +00:00
fuchsi
7404f2c35c Fix some of the issues with the RSS search interface, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=392
Note: the new DateFormatter822 in the plasmaSwitchboard is just a copy of the DateFormatter that always uses the US locale to allow formatting of a locale-independent date String.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4124 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 21:28:29 +00:00
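The locale issue mentioned in 7404f2c35c can be illustrated with a small sketch. It is not the DateFormatter822 from the commit; Rfc822Dates is a hypothetical name, and the pattern string and the GMT time zone are assumptions chosen for the example. The point is that Locale.US keeps day and month names in English regardless of the JVM's default locale.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch of locale-independent RFC-822 date formatting.
final class Rfc822Dates {

    static String format(Date date) {
        // SimpleDateFormat is not thread-safe, so this sketch creates one per call.
        SimpleDateFormat formatter =
                new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss Z", Locale.US);
        formatter.setTimeZone(TimeZone.getTimeZone("GMT"));
        return formatter.format(date);
    }
}
```

For example, Rfc822Dates.format(new Date(0L)) yields Thu, 01 Jan 1970 00:00:00 +0000, including the time of day as well as the date.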
orbiter
98abe0804d another enhancement to crawl starts with link files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4123 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 20:30:42 +00:00
fuchsi
ed2ca8fc4c Add search type to top word suggestion searches.
Closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=391

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4122 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 19:49:50 +00:00
daburna
aef1ab9526 #updated German translation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4121 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 18:42:08 +00:00
orbiter
1b42152a76 fixed and enhanced some details in crawl start with file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4120 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 00:49:38 +00:00
orbiter
16e101f135 - fix for bad xml tag in Network.xml
- switched on automatic deletion of passive peers in pro versions

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4119 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-01 22:45:44 +00:00
orbiter
4465db7399 removed debug information from network graphic
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4118 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-01 12:32:10 +00:00
orbiter
01e0669264 re-designed some parts of DHT position calculation (effect is the same as before)
and replaced old first hash computation by new method that tries to find a gap in the current dht
to do this, it is necessary that the network bootstrapping is done before the own hash is computed
this made further redesigns in peer initialization order necessary

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4117 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-01 12:30:23 +00:00
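The idea of finding a gap in the current DHT, as mentioned in 01e0669264, could be sketched as below. The log does not describe the actual method, so everything here is an assumption: peer positions are modelled as doubles on a ring [0, 1), and the new peer is placed at the midpoint of the largest gap. DhtGap is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical sketch: choose a position in the middle of the largest gap
// between already known peer positions on a ring [0, 1).
final class DhtGap {

    static double midpointOfLargestGap(double[] positions) {
        if (positions.length == 0) {
            return 0.5; // no peers known yet, any position is equally good
        }
        double[] sorted = positions.clone();
        Arrays.sort(sorted);

        // Start with the gap that wraps around from the last to the first position.
        double bestStart = sorted[sorted.length - 1];
        double bestGap = sorted[0] + 1.0 - bestStart;

        for (int i = 0; i + 1 < sorted.length; i++) {
            double gap = sorted[i + 1] - sorted[i];
            if (gap > bestGap) {
                bestGap = gap;
                bestStart = sorted[i];
            }
        }
        double mid = bestStart + bestGap / 2.0;
        return mid >= 1.0 ? mid - 1.0 : mid; // wrap back onto the ring
    }
}
```

This also shows why bootstrapping has to come first: without the positions of the other peers there is no gap to look for.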
hermens
d547c3b4bd Avoid NullPointerException in yacySeedDB.lookupByIP
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4116 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-29 11:18:09 +00:00
orbiter
5b1a937ed8 fix for crawl stack database format change, introduced in SVN 4113
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4115 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-28 08:17:08 +00:00
orbiter
af25c98306 enhanced local search performance in case of a remote search:
there is no waiting until the local search terminates to show the result page.
the local search results appear like all other results from remote peers, using a separate thread.
This has an especially strong effect if the local index for a specific word is large.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4114 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-28 01:36:22 +00:00
orbiter
842308ea97 - redesigned crawl start menu, integrated monitoring pages
- removed web structure picture from indexing menu and grouped it together with htcache monitor
- added a database for terminated crawls, when a crawl is finished it is automatically moved to the new database
- extended crawl profile edit servlet, shows now also terminated crawls
- option that was used to delete profiles is now redesigned to a function that moves the current crawl to the terminated crawls and removes all urls from the current queues!
- fixed here and there problems with indexing queues
- enhanced indexing speed by changing cache flush sizes.
- changed behaviour of crawl result servlet: the list of crawled urls is shown if there is one, otherwise the overview window is shown

attention: the new profile databases are not compatible with the old one. current crawls will be lost! the web index is not touched.
next steps: the database of terminated crawls can be used to start a new crawl from them. This is useful if one wants to re-crawl specific pages and wants to use an old crawl profile.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4113 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-28 01:21:31 +00:00
orbiter
341f7cb327 steps to enhance remote search performance:
- added a file size limitation, that disallows parsing of large documents during (offline-) remote search
- added profiling information to search result computation, visible at search access tracker. this info shows used time for URL fetch and snippet computation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4112 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-26 10:11:50 +00:00
orbiter
2f1ff048ba some fixes to socket connection time-out
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4111 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-25 23:45:05 +00:00
orbiter
3c74014004 automatic deletion of dead client connections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4110 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-25 22:46:11 +00:00
orbiter
49f1c58d64 restoring alternative update location
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4109 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-25 21:43:16 +00:00
orbiter
11b4f80bde - fixed non-closing client connections
- added client connection tracker in connections servlet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4108 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-25 21:36:08 +00:00
orbiter
d352853f2d fix for non-closing client sessions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4107 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-24 08:42:07 +00:00
orbiter
1488769e1f cleanup of unmaintained and outdated performance methods:
removed object pools in httpc. Object pooling is not recommended,
if the creation of the object is not time-intensive. Object pools are only useful,
if there is much computation necessary to create some basic data that is stored
in the object pool and can be re-used. This does not apply to object pools in YaCy.
Object pooling of client sessions would make sense if they would allow re-use of
living connections to other yacy clients. But every connection is closed after usage
of an object in the client pool, therefore the YaCy server client objects are not such
that hold hardware/network-allocated entities.
See:
http://www.javaperformancetuning.com/news/qotm033.shtml
http://java.sun.com/docs/hotspot/HotSpotFAQ.html#gc_pooling
http://docs.sun.com/source/816-7159-10/pt_chap5.html
http://www.microjava.com/articles/techtalk/recylcle2


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4106 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-23 20:49:52 +00:00
orbiter
3cb9cdc9be try to fix connection problem, possible cause for wrong junior status and non-passive passive peers:
the YaCy client treats disconnections during data transmissions as error and discards all data transmitted so far
this did not happen so far until I removed a delay time at the end of the daemon session which prevented this case.
To fix this problem, disconnections during transmissions are not treated as error now, which means that end-of-transmissions
with sudden disconnections are not a cause for peer disconnections any more. To be nice to non-updated peers, the sleep time
at the end of server sessions is also re-enabled.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-23 17:31:29 +00:00
fuchsi
00dab81077 simpler solution to last commit + works with and without navigation column on the left
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4104 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-20 01:52:10 +00:00
fuchsi
eb16a99e94 avoid floating of long page titles around the favicon in search results
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4103 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-19 22:08:56 +00:00
fuchsi
9524b9c16a second try of rev 4100 :). Tested in Iceweasel/Firefox 2.0.6, Konqueror 3.5.7, Opera 9.23 (all linux) and IE6-SP1 (wine)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4102 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-17 19:39:15 +00:00
fuchsi
6b8faaadb6 undo last commit for further evaluation, a progressbar element is used on other pages as well...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4101 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-17 03:36:35 +00:00
fuchsi
1880bba420 A few changes to the progress bar and search result statistics layout influenced by the discussion in <http://forum.yacy-websuche.de/viewtopic.php?f=5&t=268> with the idea of saving vertical space. Please check in every available browser and comment whether it's better than before. ;)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4100 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-16 14:30:53 +00:00
daburna
404ebf1474 # update of de.lng
- NO unused strings anymore!!!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4099 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-16 10:17:22 +00:00
daburna
041922652a # update of de.lng
- removed or updated unused strings
- updated some files

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4098 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-15 13:10:56 +00:00
borg-0300
ba59de773f again and again junior - test
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4097 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-13 17:05:53 +00:00
hermens
9fa75ef4d1 Limit the percentage of the progress indicator to reasonable values
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4096 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-13 16:37:23 +00:00
orbiter
4275727d69 fix for peer ping problem (implemented a 3-time re-ping); cause for 'Connection reset' still unknown
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4095 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-12 00:42:53 +00:00
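The 3-time re-ping from 4275727d69 amounts to a bounded retry loop. The sketch below is only an illustration under assumptions: RePing is a hypothetical name, the ping is modelled as a BooleanSupplier, and no delay between attempts is shown because the log does not mention one.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a bounded re-ping: try up to maxAttempts times
// and report which attempt succeeded.
final class RePing {

    // Returns the 1-based number of the successful attempt, or -1 if all failed.
    static int pingWithRetries(BooleanSupplier ping, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (ping.getAsBoolean()) {
                return attempt;
            }
        }
        return -1;
    }
}
```

A retry like this only masks the symptom, which matches the commit's note that the cause of 'Connection reset' is still unknown.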
fuchsi
e78098be9b According to HTML-Specs "name" and "id" attributes share the same namespace. So we can't have one element with name="offset" and another one with id="offset". Additionally IE6's getElementById() returns elements with matching names as well and Opera is mimicking this behaviour.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4094 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-11 16:21:14 +00:00
orbiter
07d1e98909 fixed round-robin method of peer-ping order (the successfully pinged peer was not updated to current last-seen date)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4093 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-11 16:07:35 +00:00
fuchsi
a1dcd065ad some tweaks to the search results layout
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4092 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-11 15:56:14 +00:00
orbiter
76e4c2d69e fix for peer-ping in case that remote peer does not respond with valid values
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4091 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-11 15:27:01 +00:00
fuchsi
e192f99134 fix small bug introduced in r4089 that appeared when we tried to remove "gzip" encoding from Accept-Encodings header
closes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=336

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4090 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-10 21:46:40 +00:00
fuchsi
ae4b9308ef Fix problems with some web servers which couldn't handle the way yacy was sending requests. Thx to celle for the patch.
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=320

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4089 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-10 09:15:28 +00:00
fuchsi
6601e37512 clear caches after changing blacklists, closes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=241&p=1964#p1964
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4088 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-10 08:15:25 +00:00
fuchsi
5b0c1449e1 various fixes and cleanups for blacklist handling:
1. avoid adding duplicate file name entries in config properties for lists, 
2. correctly merge all path masks from all list files for the same host masks,
3. rewrite helper methods using standard Java methods for Collection transformations,
4. merged various methods with identical functionality for different Collection implementations into one,
5. minor refactoring to improve code readability.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4087 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-10 06:20:27 +00:00
orbiter
e27aeb7fdc patch for bad crawl filter at crawl start
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4086 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-09 19:21:41 +00:00
orbiter
841cf71022 fix for NPE in DHT transfer selection, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=327
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4085 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-09 19:08:13 +00:00
orbiter
3047ae2cd9 fixed some more old links to new homepage location
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4084 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-09 18:43:39 +00:00
orbiter
dbd1eeead5 fix for missing object miss-cache flush value:
the value is always zero because there is no miss-cache flush
see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=288

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4083 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-09 18:35:05 +00:00