Commit Graph

82 Commits

Author SHA1 Message Date
orbiter
89b9b2b02a redesigned remote crawl process:
- instead of pushing urls to other peers, the urls are actively pulled
  by the peer that wants to do a remote crawl
- the remote crawl push process has been removed
- a process that adds urls from remote peers has been added
- the server-side interface for providing 'limit'-urls has existed since 0.55 and works with this version
- the list-interface has been removed
- servlets using the list-interface have been removed (this implementation did not properly manage double-check)
- changes in configuration file to support new pull-process
- fixed a bug in crawl balancer (status was not saved/closed properly)
- the yacy/urls-protocol was extended to support different networks/clusters
- many interface adaptations to the new stack counters

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 02:07:37 +00:00
fuchsi
0e1738899f * Complete number localization and provide a more reasonable interface to serverObjects:
- put(key, value) methods are now used if a value added to the map should be kept as it is. Numbers are transformed (but not formatted) to an equivalent String representation.
- putASIS(...) has been removed, now done with simple put(...) (see above).
- putNum(...) can be used for number values which should be stored in a formatted way, either depending on the current locale setting for yacy (default) or in a "none" locale (see javadocs and setLocalize()).
- putHTML(...) escapes special characters into corresponding HTML entities ('<' => '&lt;'), which was done with put(...) before and so was called too often, because it is necessary only in very few cases. Additionally there is a "forXML" mode which only replaces < > & ".
In short: Use put(...) for almost everything, use putXY(...) if you need some special transformation of the value.
A few bugs have been fixed as well, and there should be a small performance improvement for complex pages with a lot of values.

* added additional Sum/Avg rows to access tracker pages, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=456
* removed duplicate code (mostly related to the big changes above).

TODO:
- make sure number formats work as expected _everywhere_, report anything overlooked at http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437
- probably a good idea to add special putDate() methods as they are used in many pages and create duplicated formatting code + maybe some centralized handling for memory value formatting.
- further improve the speed of page creation for the WatchCrawler.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4178 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-24 21:38:19 +00:00
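
As a rough illustration of the put/putNum/putHTML split described in commit 0e1738899f above, the following Java sketch shows one way such methods could behave. It is not the actual serverObjects code: the class name TemplateValues, the constructor taking a Locale, and the get() helper are hypothetical, and the escaping covers only the four characters the commit names for the "forXML" mode.

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a template value map; not YaCy's serverObjects.
final class TemplateValues {
    private final Map<String, String> values = new HashMap<>();
    private final Locale locale; // assumed to be configurable per instance

    TemplateValues(Locale locale) {
        this.locale = locale;
    }

    // put: the value is kept as it is.
    void put(String key, String value) {
        values.put(key, value);
    }

    // put for numbers: transformed to a String, but not formatted.
    void put(String key, long value) {
        values.put(key, Long.toString(value));
    }

    // putNum: the number is formatted according to the locale.
    void putNum(String key, long value) {
        values.put(key, NumberFormat.getIntegerInstance(locale).format(value));
    }

    // putHTML: special characters are escaped before storing.
    void putHTML(String key, String value) {
        values.put(key, escape(value));
    }

    // Escapes & first so that the entities produced below are not escaped again.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    String get(String key) {
        return values.get(key);
    }

    public static void main(String[] args) {
        TemplateValues v = new TemplateValues(Locale.US);
        v.put("count", 1234567L);
        v.putNum("countFormatted", 1234567L);
        v.putHTML("title", "<a & b>");
        System.out.println(v.get("count"));          // 1234567
        System.out.println(v.get("countFormatted")); // 1,234,567
        System.out.println(v.get("title"));          // &lt;a &amp; b&gt;
    }
}
```

In this sketch the caller picks the transformation by method name, which mirrors the commit's advice to use put(...) for almost everything and a putXY(...) variant only when a special transformation is needed.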
fuchsi
9524b9c16a second try of rev 4100 :). Tested in Iceweasel/Firefox 2.0.6, Konqueror 3.5.7, Opera 9.23 (all linux) and IE6-SP1 (wine)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4102 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-17 19:39:15 +00:00
fuchsi
6b8faaadb6 undo last commit for further evaluation, a progressbar element is used on other pages as well...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4101 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-17 03:36:35 +00:00
fuchsi
1880bba420 A few changes to the progress bar and search result statistics layout influenced by the discussion in <http://forum.yacy-websuche.de/viewtopic.php?f=5&t=268> with the idea of saving vertical space. Please check in every available browser and comment whether it's better than before. ;)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4100 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-16 14:30:53 +00:00
fuchsi
e78098be9b According to the HTML specs, "name" and "id" attributes share the same namespace. So we can't have one element with name="offset" and another one with id="offset". Additionally, IE6's getElementById() returns elements with matching names as well, and Opera is mimicking this behaviour.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4094 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-11 16:21:14 +00:00
orbiter
6c3bcadc1c - re-implemented image search
- generalized search result status bar, is now also visible during text search


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4077 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-06 13:26:38 +00:00
orbiter
4779f314fe first version of next-generation search interface:
- snippets are no longer fetched by the browser using ajax, they are now fetched internally
- YaCy-internal threads check for the existence of snippets and sort out bad results
- search results are prepared using SSI includes
- the search result page is visible right after the search request, the results drop in when they are detected
- no more time-out strategy during search processes, results are shifted within queues when they arrive from remote peers
- added result page switching! after the first 10 results, the next page can be retrieved
- number of remote results is updated online on the result page as they drop in
- removed old snippet servlet (which, by the way, was also a security hole)
- media search is broken now, will be redesigned and fixed in another step


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-03 23:43:55 +00:00
orbiter
71e5d24f4a fix for watch crawler, see http://forum.yacy-websuche.de/viewtopic.php?p=1771#p1771
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4064 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-28 12:20:19 +00:00
orbiter
e332b844b2 - enhanced remote search: during waiting time for remote crawls
some urls are fetched so the url cache can be filled with these urls
- the url-prefetch is used to sort out some unresolved urls
- the snippet-fetcher is triggered with the search event id. This is used
  to remove missing snippets from the search cache so they will not be displayed again


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4060 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-26 18:18:35 +00:00
michitux
5cf634a4a4 New media-search ui:
- uses the progressbar introduced in the image-search
- results are displayed using the same layout as the text-search
- results are displayed in the order they arrive


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4041 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-11 22:20:01 +00:00
orbiter
62347b50f4 added security layer for ViewImage:
- images may be requested by localhost and authorized users only, if the request is done using a clear-text URL
- the image may be requested also using a code that can be a license to retrieve a URL for everyone
- some servlets produce URL licenses for ViewImage, like image search results


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4027 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-03 23:06:53 +00:00
michitux
8ebfd732ce - Fix for the redisplay of hidden results in Opera, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=140 for details.
- Now the message that there are hidden results is hidden when all results are displayed again.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3994 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-19 19:49:48 +00:00
orbiter
89e1848db6 fixed problem with favicons:
target servers were able to see search words from the referrer of the favicon fetch.
This has been fixed by using the getImage servlet for the favicon fetch.
Since java does not support loading of bmp and ico images, parsers for these formats have been added.
The image parsers were written from the original Microsoft documentation.
This also affects the image-search functionality: there can now be a preview
of found bmp-images. Another benefit: favicons for search results are now cached with the HTCACHE.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3965 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-15 01:34:01 +00:00
michitux
110a1a2b16 - fixed the handover of the searchterm and -type on index.html when the user clicks on "more options..."
- some small changes to make index.html and the menu valid XHTML 1.0 strict
- changed the inconsistent eol - characters in index.html to unix-ones


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3940 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-29 19:23:42 +00:00
orbiter
1d0cce8f3a documentation update
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3911 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-17 22:35:16 +00:00
michitux
25529290ca - 2 small changes in documentation
- hopefully fixed logging of GCs (in order to avoid things like "performed necessary GC, freed 18014398509481565 KB (requested/available/average: 4096 / 1631 / 2957 KB)") with the help of KoH


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3909 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-17 19:32:38 +00:00
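
The implausible figure quoted in commit 25529290ca above, 18014398509481565 KB, equals 2^54 - 419, which is what an unsigned right shift by 10 yields for a negative byte count of -419 KB. The commit does not say what the actual cause or fix was, so the following Java sketch only demonstrates that arithmetic under that assumption. The class and method names are hypothetical.

```java
// Hypothetical demo; not taken from the YaCy sources.
final class GcLogDemo {

    // Converts bytes to KB with an unsigned shift. A negative input turns
    // into a huge positive value because the sign bit is shifted in as data.
    static long toKbUnsignedShift(long bytes) {
        return bytes >>> 10;
    }

    // Converts bytes to KB, treating a negative difference as "nothing freed".
    static long toKbClamped(long bytes) {
        return Math.max(0L, bytes) >> 10;
    }

    public static void main(String[] args) {
        // Assumed scenario: free memory after the GC is 419 KB lower than
        // before, e.g. because other threads allocated in the meantime.
        long freedBytes = -419L * 1024;

        System.out.println(toKbUnsignedShift(freedBytes)); // 18014398509481565
        System.out.println(toKbClamped(freedBytes));       // 0
    }
}
```

Clamping is only one possible remedy. Logging the signed value would work as well and would keep the information that memory use grew during the collection.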
michitux
184ba22ce9 New image - search HTML/JavaScript - frontend:
- <noscript> - area for non-JS - Browsers
- progressbar for the loading - process (may be used in other searches too)
- the image that is available first is displayed first, so the images aren't moved around when new results arrive
- the correct number of results is displayed
- successfully tested in IE 5.5 and 6, Opera, Firefox and Konqueror (recent versions)



git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3904 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-16 21:55:15 +00:00
theli
339153d40e *) favicons that are specified in the document content via html link-tags
are now detected and displayed on the search page (requested by allo).

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3845 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-09 15:22:37 +00:00
allo
38c180b28b hide results with a wrong ("red") snippet.
(maybe not as default? But it works pretty well for me)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3842 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-08 23:59:18 +00:00
allo
559d1c447f Bookmarks tag suggestion
AJAX fix for configadvanced
empty bugs are not an interface bug, but a scraper bug.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3821 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-07 22:49:31 +00:00
theli
e75ca857c3 *) Bugfix for problem with ajax graphic
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3815 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-07 07:40:32 +00:00
allo
54ddb3262c enter on webstructure
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3783 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-02 14:50:36 +00:00
allo
d0f8254f95 better refresh ui
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3779 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-01 13:58:48 +00:00
(no author)
5cc8bb075b fixed syntax error
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3764 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-26 19:51:05 +00:00
(no author)
ef24bed406 Sorry...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3760 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-24 16:25:07 +00:00
(no author)
a29cb2e1af blupp
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3759 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-24 16:14:46 +00:00
orbiter
a3ecfe0a45 replaced failed-icon by new 'bad'-icon
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3680 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-07 14:05:49 +00:00
theli
6f46245a51 *) Bookmarks: Ajax icon is displayed while loading title
*) First version of a sitemap parser added
   - currently only autodetection of sitemap files is supported
*) DB-Import restructured
   - pause/resume should work again now


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3666 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-06 09:52:04 +00:00
michitux
56b30d6593 * fixed a bug in ie: class-names for image-snippets were set, but had no influence on ie, they have to be set with className =
* fixed a bug in safari (hopefully, sorry that I removed the old fix, the divs): yacy-logo is now above the fieldset, the fieldset clears and has a margin-left set
 * fixed a bug with the dls: for example in ViewProfile.html the dt's (the terms/keys) did not have the same height as the dd's, so the dt's were not in the same row as the corresponding dd's towards the bottom
 * moved my new css-classes to the right place in base.css

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3572 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-13 23:14:12 +00:00
michitux
e917bfcae3 * Bugfix: changed handling of the query-string to be independent from input-elements
* removed unnecessary divs

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3571 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-13 18:01:02 +00:00
michitux
4990909178 Some bugfixes, new layout/style for image search results:
* removed divide by zero bug when 20_dhtdistribution_busysleep is 0
 * replaced German comment with wrong charset in source/de/anomic/plasma/plasmaCrawlBalancer.java by an English one
 * replaced the table-fix for floating behind snippet images by a br with clear
 * removed unnecessary old xhtml-files (were not in use, they were created when we weren't having xhtml for testing)
 * new layout for image-search results: replaced the old one with spans and tables inside (not valid) with new divs, now each image snippet container has the same size
TODO:
 * the ids of the snippetLoading-divs aren't valid because ids must start with an alphabetic letter or an underscore, they have to be prefixed
 * in the returned snippet-xml is an unresolved pattern for status (the status is only set for text snippets)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3566 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-12 18:21:17 +00:00
karlchenofhell
4f2e6ef47b - WatchCrawler_p shows max. 80 characters of URLs now (maybe dynamically adjustable based on browser width?)
- typo in BlacklistCleaner

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3445 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-07 23:16:25 +00:00
karlchenofhell
bf7a69197d - fix for possible NPE in queues_p
- WatchCrawler_p:
  - display crawler traffic
  - pause/resume local- and global crawler


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-22 22:26:11 +00:00
karlchenofhell
b873ad51ab - fix for http://www.yacy-forum.de/viewtopic.php?t=3369
- merged netBude's alternative for tables in yacysearch.html & search results valid
- added statistic info to index.html as proposed here: http://www.yacy-forum.de/viewtopic.php?p=29762#29762
- fixed error-log in httpTemplate

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3189 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-12 00:52:38 +00:00
orbiter
1d2d1854b9 added size of rwi and urls to WatchCrawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3112 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-21 21:33:35 +00:00
orbiter
61798f0ae6 added option to distinguish between text crawl and media crawl
- for each crawl start, there is now a flag for text and media
- the localCrawl flag is superfluous
- added new crawl profiles
- if an image search is done, only media links are crawled for the snippets


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3100 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-19 03:10:46 +00:00
orbiter
febe6b114a design update of crawler monitor
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3094 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-18 01:18:28 +00:00
orbiter
40049e0635 fixed media search snippet flow
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3092 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-17 22:52:58 +00:00
orbiter
7ff86d6ba6 - image search now shows thumbnails (in bad order, but it works)
- repaired DHT selection

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3081 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-14 02:48:37 +00:00
orbiter
28971da91c fix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-12 02:51:59 +00:00
orbiter
e4570bffaf -implemented a specialized snippet-fetch for media content
-changed search result preparation for media search presentation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3073 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-12 02:09:25 +00:00
orbiter
1377c53aa3 extraction of media links from search results
these links are mixed to the snippets for testing purpose
(a final version will handle this differently)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3069 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-11 01:31:23 +00:00
orbiter
fb9e0f0284 preparations for media snippets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3064 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-09 23:15:58 +00:00
orbiter
b5a29e9651 - fix for snippets that are too short
- added keyword to snippet fetch to suppress removal of not-found snippet words (for debugging)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3009 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-25 00:38:09 +00:00
michitux
567c40f5f0 Bookmark/delete-links now visible when mouse is over the searchresult, in standard-compliant browsers with css, in Microsoft Internet Explorer via JavaScript
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2608 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-16 16:56:22 +00:00
orbiter
d54144a4e3 fixed bad snippet behavior (hopefully)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2596 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 14:17:18 +00:00
orbiter
5015e780c2 - simplified watchCrawler code
- changed display of watchCrawler slightly

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2594 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 13:54:10 +00:00
allo
9bed90f8dc bugfix in js
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2587 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 06:33:22 +00:00
allo
13d0cff257 right dhtml.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2568 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 14:02:34 +00:00