Commit Graph

3180 Commits

Author SHA1 Message Date
orbiter
2a0eb09e08 enhanced html id names and tag cloud visualization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7257 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-18 09:36:49 +00:00
orbiter
863065abc4 added user agent logging to access tracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7256 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-18 08:09:59 +00:00
mikeworks
61e87c0b14 IndexControlRWIs_p.html, IndexControlURLs_p.html, ViewFile.html/.java: changes to HTML output and &nbsp; in case of empty values for XHTML strict / transitional validation
de.lng: Added missing translation for Show Content and changed existing line 
--> Index Administration should now correctly validate XHTML 1.0 Strict / Trans

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7255 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-17 16:51:29 +00:00
apfelmaennchen
a79728b97d some updates to experimental bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7254 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-17 09:58:50 +00:00
apfelmaennchen
ef782cd026 and even more experimental bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-16 10:20:41 +00:00
orbiter
ed4371dcf3 enhanced navigation implementation and enhanced tag cloud computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7252 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 23:45:12 +00:00
orbiter
ca738ac924 - added a tag cloud to search results (using the topics)
- some refactoring of score classes
- added default package for new classes add_ymark and delete_ymark

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7251 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 22:01:39 +00:00
apfelmaennchen
7aca763ca8 Some more experimental bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7250 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 12:53:41 +00:00
apfelmaennchen
4270ed696c Experimental code (I need to transfer the code to my macbook, sorry) for the new bookmarks API based on the Tables concept (same as for crawl starts). Currently you can add a bookmark by api/ymarks/add_ymark.xml?url=http://www.yacy.net&title=YaCy and watch the result via the standard view Tables_p.html.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7249 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 05:40:19 +00:00
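The add-bookmark call described in this commit can also be issued from a script. A minimal sketch, assuming a peer reachable at http://localhost:8080; the base address and the helper name `add_ymark_url` are assumptions, only the path and the `url`/`title` parameters come from the commit message. The sketch percent-encodes the bookmark URL, which the example in the message does not do.

```python
from urllib.parse import urlencode

def add_ymark_url(bookmark_url, title, peer="http://localhost:8080"):
    """Build the request URL for the experimental add_ymark call.

    `peer` is an assumed base address; adjust it to your own peer.
    """
    # urlencode percent-encodes reserved characters such as ':' and '/'
    query = urlencode({"url": bookmark_url, "title": title})
    return f"{peer}/api/ymarks/add_ymark.xml?{query}"

print(add_ymark_url("http://www.yacy.net", "YaCy"))
```

According to the commit message, the stored bookmark should then be visible in the standard view Tables_p.html.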
orbiter
e4d561971e added more score cluster options and made score cluster usage more transparent
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7248 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-14 11:40:02 +00:00
orbiter
e8f90201a5 fix for scheduling of rss feeds
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7247 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-13 13:00:36 +00:00
orbiter
7cd9d9d22a - enhanced DidYouMean computation using a faster count on index entries; this causes that results can be ranked better
- added limitations on DidYouMean result sets according to input and output string length

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7246 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 22:02:10 +00:00
apfelmaennchen
beb65437d2 additional fix for the widget - now a second result page is loaded automatically in case there are too few search results for the scroll event to trigger
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7245 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 21:55:24 +00:00
apfelmaennchen
2bb0c9b503 Fix for search widget keyup event handling. ESC will close the widget window and RIGHT will load additional search results, especially when the scroll event won't work because of too few results.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7244 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 21:28:28 +00:00
orbiter
de722090b5 enhancements in did-you-mean guessing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7243 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 09:45:15 +00:00
orbiter
a59c885ee0 autocomplete and did-you-mean can now understand _all_ languages and can generate suggestions in all languages and character types
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7242 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 08:36:33 +00:00
orbiter
b7acd92ce4 Auto-Suggestions for YaCy Search:
- added a suggest servlet according to opensearch and firefox standard
- integrated the suggest servlet into opensearch description file
- integrated an autocomplete plugin for jquery
- added an autocomplete addition to the yacy search windows showing autosuggest queries

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7241 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 01:23:49 +00:00
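For readers unfamiliar with the suggestion format: in the OpenSearch Suggestions extension, which Firefox search plugins also consume, a response is a JSON array whose first element is the query prefix and whose second element is the list of completions. A minimal sketch of producing such a response; the function name `suggestion_response` is hypothetical and this is not the servlet's actual code.

```python
import json

def suggestion_response(query, completions):
    """Serialize a suggestion answer as [query, [completion, ...]]."""
    # json.dumps takes care of quoting and non-ASCII characters
    return json.dumps([query, list(completions)])

print(suggestion_response("yac", ["yacy", "yacy search"]))
```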
mikeworks
67b3b4b13b de.lng: Added translation for skin color picker on http://demo.zuum.net:8080/ConfigAppearance_p.html
ConfigAppearance_p.html: Some HTML 1.0 Strict changes on the Customization page http://demo.zuum.net:8080/ConfigAppearance_p.html
--> Now all Customization pages should validate XHTML 1.0 strict

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7240 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 01:03:10 +00:00
orbiter
24f1cba7b2 performance hacks:
- faster generation of index abstract compression during remote search
- less synchronization in IO record reading
- request index abstract generation only if necessary and faster time-out in remote search 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7239 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 12:44:07 +00:00
orbiter
6d61b80fb6 added ColorPicker to WatchWebStructure
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7236 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 08:57:40 +00:00
orbiter
45b1ab3d07 custom + generic skins:
- added a generic skin which is filled with actual color assignment using a servlet
- enabled css servlets
- added a generic color scheme in configuration file
- added configuration input in Customization/Appearance servlet
- added a jquery color picker widget
- attached the color picker widget to the generic colour definition input fields

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7235 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 00:00:10 +00:00
orbiter
fcd40cd30f - disabled domZones (buggy, must think about better solution)
- increased time-out for dns resolver and isLocal property

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7233 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-09 10:17:50 +00:00
orbiter
0d363a94d7 more performance hacks
this makes YaCy search results VERY fast for all verify=false search cases
and it enhances the search speed also for all other snippet-fetch cases.
With this change my peer performed 100 Queries Per Second (!!!) while doing 10 queries simultaneously (!!!)
in an intranet index of 20000 URLs on my 16-core Mac

Check this yourself by doing:
cd bin
./searchtestmulti.sh
after finishing the run, divide 1000 by the given time per query in milliseconds (this is the qps for one thread)
and then multiply by 10 (because 10 search threads have been started)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7231 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-09 08:55:57 +00:00
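The arithmetic from this commit message, written out as a small sketch. The function name and the sample input of 100 ms per query are illustrative assumptions, not measured values; the time per query is assumed to be reported in milliseconds.

```python
def queries_per_second(ms_per_query, threads=10):
    """Convert the per-query time of one thread into total throughput."""
    per_thread = 1000.0 / ms_per_query   # qps of a single search thread
    return per_thread * threads          # all threads run in parallel

# hypothetical input: each thread needs 100 ms per query
print(queries_per_second(100.0))
```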
orbiter
11bebe356b fixed crawl start: with SVN 7225 the name of the crawl start url was not given in input field and therefore all crawl starts had contained the empty string as crawl start url
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7229 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 22:02:24 +00:00
orbiter
2971c91988 fix for http://forum.yacy-websuche.de/viewtopic.php?p=20977#p20977
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7228 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 21:44:28 +00:00
orbiter
091dd3f6ec - enhanced intranet search speed
- enhanced intranet portscan speed (better time-out)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7227 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 10:54:13 +00:00
low012
b9f405d1e8 *) added comments
*) more beautiful and easier to understand code (IMO)
*) added display= parameter to a lot of links in Wiki.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7226 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 00:32:50 +00:00
mikeworks
70576e88d2 de.lng: Added some more untranslated strings I found and uncommented old ones that were removed
terminal_p.html: Put back the old ID which was really easy to find
IndexCreate.js: Because XHTML 1.0 Strict does not allow name tags for some elements rewrote most element access functions to use getElementById
Table_API_p.html and all other html pages: Some XHTML 1.0 Strict fixes, changed checkAll javascript, marked the first row with checkboxes as unsortable where applicable
Table_API_p.java and all other java pages: URLencoded lines with possible ampersands & -> &amp; for validation XHTML 1.0 Strict sourcecode
--> All Index Create pages should validate now. Hope I did not break anything else (too much :-)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7225 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-06 00:00:23 +00:00
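Background for the ampersand change: XHTML requires a literal `&` inside attribute values, including the query part of a link, to be written as `&amp;`. A minimal sketch of that escaping step; the helper name and the query parameters are made up for illustration and this is not the code from Table_API_p.java.

```python
from html import escape

def href_attribute(url):
    """Render an href attribute whose value is safe for XHTML validation."""
    # escape() turns & into &amp; (and also escapes <, > and quotes)
    return 'href="' + escape(url, quote=True) + '"'

print(href_attribute("Table_API_p.html?a=1&b=2"))
```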
orbiter
6e6994e328 latest bugfixes to search and indexing function after test of demo presentation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7223 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-05 17:49:53 +00:00
orbiter
c3bf17a3a1 fixed must-match filter for smb crawling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7222 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-05 00:05:08 +00:00
orbiter
099def2a04 small changes in search widget appearance
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7221 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-04 23:43:33 +00:00
orbiter
50586a0dfd rename of widget to 'widget'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7220 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-04 19:57:29 +00:00
apfelmaennchen
dffa142529 Fix for author navigator in yacyui-portalsearch.js
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7219 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-04 19:27:12 +00:00
orbiter
574346f8ce better must-match pattern for intranet file-crawls
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7218 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-04 12:55:39 +00:00
orbiter
aacf572a26 - enhancements for search speed
- bug fixes in many classes including basic data structure classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7217 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-04 11:54:48 +00:00
sixcooler
aa6075402a small fix for crawling from 'sitelist' after the changes from 7214
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7216 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-01 22:41:28 +00:00
orbiter
2c549ae341 fixed a number of small bugs:
- better crawl start for file paths and smb paths
- added time-out wrapper for dns resolving and reverse resolving to prevent blockings
- fixed intranet scanner result list check boxes
- prevented htcache usage in case of file and smb crawling (not necessary, documents are locally available)
- fixed rss feed loader
- fixed sitemap loader which had not been restricted to single files (crawl-depth must be zero)
- clearing of crawl result lists when a network switch was done
- higher maximum file size for crawler

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7214 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-30 23:57:58 +00:00
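The "time-out wrapper for dns resolving" mentioned in this commit follows a common pattern: run the blocking lookup in a worker thread and give up after a deadline. A minimal sketch of that pattern, not YaCy's implementation; the function name and the injectable `resolver` parameter are assumptions added so the behaviour can be tried without a network.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def resolve_with_timeout(host, timeout_s, resolver=socket.gethostbyname):
    """Return the resolved address, or None on timeout or lookup failure."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(resolver, host)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return None          # lookup took too long, caller is not blocked
    except OSError:
        return None          # resolver reported a failure (e.g. unknown host)
    finally:
        # do not wait for a hanging lookup; the worker finishes on its own
        executor.shutdown(wait=False)
```

The caller gets control back after `timeout_s` seconds at the latest, even though the abandoned lookup may still finish in the background.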
orbiter
f6eebb6f99 replaced auto-dom filter with easy-to-understand Site Link-List crawler option
- nobody understands the auto-dom filter without a lengthy introduction about the function of a crawler
- nobody ever used the auto-dom filter other than with a crawl depth of 1
- the auto-dom filter was buggy since the filter did not survive a restart and then a search index contained waste
- the function of the auto-dom filter was in fact to just load a link list from the given start url and then start separate crawls for all these urls restricted by their domain
- the new Site Link-List option shows the target urls in real-time during input of the start url (like the robots check) and gives a transparent feed-back what it does before it can be used
- the new option also fits into the easy site-crawl start menu

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7213 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-30 12:50:34 +00:00
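The behaviour this commit attributes to the old auto-dom filter and the new Site Link-List option (take the links found on the start page, then start one crawl per link, each restricted to that link's domain) can be sketched as follows. The function name and the dictionary keys are hypothetical; link extraction and the actual crawl start are left out.

```python
from urllib.parse import urlparse

def site_crawl_starts(links):
    """Turn a link list into one domain-restricted crawl start per link."""
    starts = []
    seen = set()
    for link in links:
        host = urlparse(link).hostname   # None for links such as mailto:
        if host is None or link in seen:
            continue                     # skip host-less and duplicate links
        seen.add(link)
        starts.append({"start_url": link, "restrict_to_host": host})
    return starts
```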
mikeworks
63e387508c ConfigLanguage_p.java: Fixed the filename for the API call to ConfigLanguage_p.html - previously ConfigLanguage.html was recorded and the action could not be replayed with error 404 - Not found
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7212 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-30 03:16:17 +00:00
mikeworks
f468d377d7 Collage.html and Collage.java: Added <p> in body before images for XHTML 1.0 Transitional validation and alt tag to images as well as closing tag <img (...) />
terminal_p.html: Set new link for starting a crawl to CrawlStartSite_p.html and replaced the old embed object of the Among.us Flash object by their new JS which takes care of adding the object correctly
de.lng: Moved the translations for the JS part from yacyinteractive.html to the yacyinteractive.js part
--> Terminal page is now valid XHTML 1.0 Transitional


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7211 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-29 23:11:29 +00:00
orbiter
3057a0b939 - intranet scanner now produces urls with host names, not ips if possible
- CrawlStartIntranet servlet shows IPs and host names

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7210 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-29 22:44:49 +00:00
mikeworks
421aa6a8bb ConfigLiveSearch.html: Fixed some HTML problems to validate at least XHTML 1.0 Transitional - strict is not possible because iframes are used. Replacing iframes with embedded object tag does not work in IE
ConfigPortal.html: Fixed some HTML problems to validate at least XHTML 1.0 Transitional - for strict the target attribute of the a link has to be removed
yacyinteractive.html: Moved all JS code to an external yacyinteractive.js file in JS folder
yacysearch.html: Removed embedded scripts from in between the body tags - now everything is loaded in the header
de.lng: Just in case JS files will be parsed at some point added translation for yacyinteractive.html result counter

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7208 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-29 20:06:39 +00:00
mikeworks
b7bb0cabaf Blacklist_p.html: Minor HTML and Javascript changes to get XHTML 1.0 Strict validation, lowercase onchange, id tags instead of name tags
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7205 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-29 05:41:32 +00:00
mikeworks
cd505d7e30 de.lng: German translation of the new Intranet Servlet introduced in SVN 7203 in CrawlStartIntranet_p.html
CrawlStartIntranet_p.html: New Intranet Crawl Start Servlet - minor HTML changes to get XHTML 1.0 Strict validation, remove (double) name tags, remove single ending </dt>

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7204 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-28 22:05:12 +00:00
orbiter
e63896f2a8 added an intranet scanner and a servlet which shows all intranet addresses and an option to start a site-crawl for all these addresses at once.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7203 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-28 12:18:54 +00:00
suessthomas
44874f2cb9 Added encoding="UTF-8" in the RSS files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7200 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-27 20:36:49 +00:00
orbiter
d2fd93135c - moved yacybot user agent string definition to MultiProtocolURI since there are basic access mechanisms where the bot string is needed
- migrated the 'yacy' user agent to 'yacybot' in many client methods since the 'yacy' user agent is only used for the proxy

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7199 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-27 14:54:32 +00:00
mikeworks
ad7efe6016 rssTerminal.html: Fixing the 'null' is null or not an object in rss2.js when viewing the YaCy default Status page http://localhost:8080/Status.html with Internet Explorer
feed.xml: copy of feed.rss that helps Internet Explorer also read the Feed - workaround for the fix above
Problem is described in the forums and should be fixed better ;-(http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2766&p=20702)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7196 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-26 22:55:52 +00:00
mikeworks
190de644dd de.lng: Added German translations for some missing table content on Network view
WatchWebStructure_p.html: Added JS verification of RGB color codes (currently only RGB value is checked but this could be enhanced to also check for websafe colors)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7195 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-26 22:28:32 +00:00
orbiter
d5dc88a351 show cleanup button only if servlet was called without post/put arguments.
This should avoid confusion after a search for a word where it is possible to delete the word. If a delete button is shown to delete the word, then there should not be a button available to delete the whole index, to avoid wrong usage when a user searches for a word only to delete it.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7194 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-26 21:11:08 +00:00
orbiter
a83186ac7d fix for bug in cytrails
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7192 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-26 10:32:40 +00:00
mikeworks
b019426811 de.lng: Added German translations for new Index Creation pages RSS Feeds and adapted text in Tables_p.html and CrawlStartExpert_p.html to match some typos, also changed one name tag to id to conform with XHTML 1.0 Strict
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7191 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-26 01:39:51 +00:00
orbiter
48c0d508ac fixes for crawling of smb links (file length not always available)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7190 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-25 22:32:26 +00:00
orbiter
daeea96aea renamed servlet CrawlStart_p.html to CrawlStartSite_p.html to circumvent problem with translation which still showed old expert crawl start page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7183 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-22 21:46:31 +00:00
orbiter
10a9cb1971 simplified snippet computation process and separated the algorithm into two classes
also enhances selection criteria for best snippet line computation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7182 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-22 20:50:02 +00:00
orbiter
84a023cbc8 fixed several search bugs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7180 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-21 21:48:42 +00:00
lotus
937dd956d3 save default number of search items via web interface
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7179 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-21 19:45:49 +00:00
orbiter
4e8cf0c72c added a search box and navigation to api steering servlet
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7178 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-21 13:10:18 +00:00
orbiter
97ee278931 enhanced search speed:
- better control of number of running search threads
- no time-out waiting time when no ranking feeding takes place
- local search queries by a remote peer may be faster up to 300 milliseconds
- a local search may even be faster

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7176 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-20 13:17:25 +00:00
orbiter
377f001e0d sorting of crawl profile names in crawl profile editor, see
http://forum.yacy-websuche.de/viewtopic.php?p=20851#p20851

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7172 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-20 09:09:38 +00:00
orbiter
a2f9974745 some redesign in the access tracker to address sixcooler's question about the "smartest way for deleting the first Object":
- not so much abstraction for a collection, makes use of remove() (no operands) possible
- different way to delete elements in track (destructive, not constructive (less copies of elements in new queue))
- more abstraction for class api since no static class must be used any more

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7169 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-19 23:00:24 +00:00
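The "destructive, not constructive" deletion described in this commit means removing outdated entries from the head of the existing queue instead of copying the surviving entries into a new one. A minimal sketch of the idea, assuming the track holds `(timestamp, entry)` pairs in insertion order; the names are hypothetical and this is not the access tracker's code.

```python
from collections import deque

def drop_expired(track, now, max_age):
    """Remove entries older than max_age from the head of the track.

    Works in place: no new queue is built and no element is copied.
    """
    removed = 0
    while track and now - track[0][0] > max_age:
        track.popleft()      # like Java's Queue.remove() without operands
        removed += 1
    return removed
```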
low012
f32bb5e51f *) Changed image in Steering.html from linked image to embedded image because shutdown is so fast now that browsers can't load the image before the YaCy instance is already gone. Had to make image smaller since IE does not accept large Base64 encoded images.
*) Decreases wait time in Steering.html before first check since 
*) HTML fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7165 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-18 00:18:52 +00:00
orbiter
37baa8bae3 - fixes for concurrency exceptions and failed database integrity verification
- added link to yacystats peer when peer is more than one day old

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7164 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-17 10:20:04 +00:00
orbiter
29fe401f93 - some layout and text enhancement for site crawl start
- Quix0rs patch from http://forum.yacy-websuche.de/viewtopic.php?p=20839#p20839 (parts)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7163 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-16 23:00:07 +00:00
orbiter
8c1da27347 - added more comments for user in site crawl servlet
- added a disable/enable function in case that 'sitemap' is selected for functions that do (not) apply
- better naming of menu items
- limit default crawl depth

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7162 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-16 22:04:14 +00:00
orbiter
58b7417a59 - added a new 'easy' crawl start menu which can be used for the special case of loading a complete domain
- the previous crawl start servlet was renamed to CrawlStartExpert_p
- easy crawl start is now default

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7160 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-16 12:02:43 +00:00
orbiter
461a2a6ec7 enhanced remote crawling:
- 300 ppm is default now (but this is switched off by default; if you switch it on you may want more traffic?)
- better timing for busy queue
- better amount of remote url retrieval
- better time-out values
- better tracking of availability of remote crawl urls
- more logging for result of receipt sending

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7159 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-16 09:34:17 +00:00
orbiter
670ba4d52b - removed the remote crawl option from the network configuration submenu and
- added a remote crawl menu item to the index create menu. This menu also shows a list of peers that provide remote crawl urls
- set remote crawl option by default to off. This option may be important but it also confuses first-time users


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7158 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-16 00:39:05 +00:00
orbiter
ac1c08924e more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7149 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 15:27:27 +00:00
orbiter
39f409a7bb performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7147 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 14:32:24 +00:00
orbiter
2e75879504 fix for latest commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7145 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 13:01:18 +00:00
orbiter
6e4653cf50 remove DoS protection in remote search for intranet hosts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7144 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 12:38:05 +00:00
orbiter
3c0e07ba72 removed all delays in shutdown process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7143 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 09:13:28 +00:00
orbiter
906c572621 - enhanced index create menu structure
- clear search log caches each time a search is done

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7142 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-14 09:06:27 +00:00
orbiter
fc924f024e import of oai sources from a list using a command line interface:
if you have a list of oai servers you can import them all using the linux command:
bin/importOAIList.sh <name-of-oai-list-file>


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7141 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-13 10:13:34 +00:00
orbiter
64860dc1bb enhanced search event logging (to be used for further improvements)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7140 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-13 09:33:04 +00:00
sixcooler
17eebd4ef8 counting crawler traffic again:
fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2808

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7138 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-11 15:58:15 +00:00
lotus
547d5226ae fix banner reload parameters (were no html errors)
adapted default colours

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7137 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-11 11:05:07 +00:00
orbiter
34a25856a5 - added navigation to next/prev search page using arrow keys (left/right)
- better information text for YaCy GUI application

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7134 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-10 10:42:01 +00:00
lotus
5ce679a053 focus search field on load, no click necessary anymore
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7132 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-10 08:45:26 +00:00
orbiter
013926f01c added 'francais' as language option for default configuration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7131 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-10 08:35:47 +00:00
orbiter
4c21d8dc9d - changed default values for online caution (the pausing may not be necessary any more)
- fixed bug in WeakPriorityBlockingQueue
- show favicon faster using pre-loading (same technique as used for fast image search)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7130 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-09 23:25:19 +00:00
orbiter
348dece62f redesign of the SortStack and SortStore classes:
created a WeakPriorityBlockingQueue as special implementation
of a PriorityBlockingQueue with a weak object binding.
- better abstraction of ordering technique
- fixed some bugs according to result numbering (distinguish different counters in Queue)
- fixed an ordering bug in post-ranking (ordering was decreased instead of increased)
- reversed ordering numbering using a reversed ordering. The higher the ranking number the better (now).

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7128 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-09 15:30:25 +00:00
orbiter
114bdd8ba7 fixed old sitemap importer which was not able to parse urls containing post elements
- removed old parser
- removed old importer framework (was only used by removed old parser)
- added a new sitemap parser in parser framework
- linked new parser with parser access in old sitemap processing routines

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7126 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-08 14:13:15 +00:00
lotus
b73ea6581d fix json in case of query includes "
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7125 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-08 11:54:25 +00:00
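The class of bug fixed here appears whenever a query string is pasted into JSON without escaping: a double quote inside the query ends the JSON string early. A minimal sketch of the safe approach using a JSON serializer; the field names `query` and `items` are assumptions, not the actual response layout.

```python
import json

def search_json(query, items):
    """Serialize a search answer; the serializer escapes quotes in the query."""
    return json.dumps({"query": query, "items": items})

print(search_json('say "hello"', []))
```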
orbiter
5fe828fa06 - replaced pdfbox and fontbox version 1.1.0 with 1.2.1
- added some clear statements that shall clear static cache size within the pdfbox library
- the pdfbox library contains a memory leak; it is unsafe to run a peer with pdf parser permanently on.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7120 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-07 17:13:47 +00:00
lotus
5dff8f62c4 fix tray information display for non-windows
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7117 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-06 13:30:40 +00:00
orbiter
fb828f3767 - performance enhancements in search response time using faster query ID computation and an ID cache
- code cleanup

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7113 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-06 10:00:07 +00:00
orbiter
0ab6a462ee - added a missing entry in YaCy interface robots.txt for bookmarks
- changed default robots.txt deny list to include some more interface pages, because loading such pages puts load on a YaCy peer when crawlers come by and the information on these pages is not useful for public search.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7112 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-06 09:58:54 +00:00
orbiter
ae07e11bc5 enhanced image search result display: concurrent loading of images before they are displayed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7109 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-05 23:02:46 +00:00
orbiter
9c0c94683c because of a bug in search result caching count search results had not been generated as fast as possible.
with this fix search results are (even) faster.
Also enhanced: image search. This is now sped up using an image search result look-ahead

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-04 22:57:12 +00:00
low012
5f391fcfa9 *) cleaned up in wikiCode parser (more to be done)
*) HTML fixes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7103 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-04 14:01:34 +00:00
orbiter
7be988768d simple selection of views in ViewFile.html (omit usage of button)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7100 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-03 22:35:07 +00:00
orbiter
d8f52c5b9c added a changelog url to download
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7096 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-03 12:55:36 +00:00
orbiter
5de70c3d7c changed way of storage for search requests:
- the search request cache can now get as large as 1000 entries
- if more entries arrive, unused are deleted
- the elements may stay in the cache up to 10 minutes and longer if they are used
- the elements are deleted earlier than 10 minutes if the memory gets low
This commit was mainly done for metager-feeding peers that have a query load of 50000 queries each day. Also added:
- a monitor for cache hit/cache miss in PerformanceMemory_p.html (see at bottom of page)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7093 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-02 21:52:45 +00:00
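The cache policy listed in this commit (at most 1000 entries, least recently used entries dropped first, a 10 minute lifetime that is extended on use) can be sketched as below. This is an illustration under assumptions, not YaCy's implementation: the class name, the injectable `clock` and the use of an ordered dictionary are hypothetical, and the early eviction on low memory is left out.

```python
import time
from collections import OrderedDict

class SearchEventCache:
    """Size- and age-bounded cache; entries are ordered by last use."""

    def __init__(self, max_entries=1000, ttl_s=600.0, clock=time.monotonic):
        self._entries = OrderedDict()    # key -> (last_used, value)
        self._max_entries = max_entries
        self._ttl_s = ttl_s
        self._clock = clock

    def __len__(self):
        return len(self._entries)

    def put(self, key, value):
        now = self._clock()
        self._entries[key] = (now, value)
        self._entries.move_to_end(key)   # newest entry goes to the tail
        self._evict(now)

    def get(self, key):
        now = self._clock()
        self._evict(now)
        item = self._entries.get(key)
        if item is None:
            return None
        self._entries[key] = (now, item[1])   # using an entry extends its life
        self._entries.move_to_end(key)
        return item[1]

    def _evict(self, now):
        # the head is always the least recently used entry
        while self._entries:
            _, (last_used, _) = next(iter(self._entries.items()))
            if now - last_used <= self._ttl_s:
                break
            self._entries.popitem(last=False)   # expired
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)   # over capacity
```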
orbiter
9d080f387e change in handling of the all-visible home path for storage in YaCy:
the home path can now be distinguished between
- data home; the path where the DATA directory is created
- application home; everything else
This will make it possible to store application data on Mac releases within the
~/Library/YaCy
directory; a place where Mac applications write their data.
Similar techniques will be possible for debian and windows.
To use the new data path, YaCy can be started with
-start <data path>
or
-gui <data path>


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7092 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-02 19:24:22 +00:00
orbiter
875741bcff fix for http://forum.yacy-websuche.de/viewtopic.php?p=20657#p20657
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7090 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-02 10:05:04 +00:00
orbiter
65eaf30f77 redesign of crawl profiles data structure. target will be:
- permanent storage of auto-dom statistics in profile
- storage of profiles in WorkTable data structure
not finished yet. No functional change yet.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7088 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-31 15:47:47 +00:00
orbiter
104318d58a - added nice colors to feed indexing state messages
- added a 'remove all' button for new and scheduled rss feed list
- made adding of new rss feeds concurrent so interface is more responsive

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7078 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-27 11:56:51 +00:00
orbiter
4f22e2df41 bugfixes for
- next-execution-time in scheduler
- deletion of scheduled rss feed loading (now deletes also the scheduling entry)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7075 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-26 16:42:00 +00:00