7024 Commits

Author SHA1 Message Date
orbiter
58e74282af added a word counter statistic in condenser which is used by the did-you-mean to calculate best matches for given search words.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7258 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-18 11:35:09 +00:00
orbiter
2a0eb09e08 enhanced html id names and tag cloud visualization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7257 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-18 09:36:49 +00:00
orbiter
863065abc4 added user agent logging to access tracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7256 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-18 08:09:59 +00:00
mikeworks
61e87c0b14 IndexControlRWIs_p.html, IndexControlURLs_p.html, ViewFile.html/.java: changes to HTML output and &nbsp; in case of empty values for XHTML strict / transitional validation
de.lng: Added missing translation for Show Content and changed existing line 
--> Index Administration should now correctly validate XHTML 1.0 Strict / Trans

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7255 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-17 16:51:29 +00:00
apfelmaennchen
a79728b97d some updates to experimental bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7254 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-17 09:58:50 +00:00
apfelmaennchen
ef782cd026 and even more experimental bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-16 10:20:41 +00:00
orbiter
ed4371dcf3 enhanced navigation implementation and enhanced tag cloud computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7252 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 23:45:12 +00:00
orbiter
ca738ac924 - added a tag cloud to search results (using the topics)
- some refactoring of score classes
- added default package for new classes add_ymark and delete_ymark

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7251 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 22:01:39 +00:00
apfelmaennchen
7aca763ca8 Some more experimental bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7250 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 12:53:41 +00:00
apfelmaennchen
4270ed696c Experimental code (I need to transfer the code to my macbook, sorry) for the new bookmarks API based on the Tables concept (same as for crawl starts). Currently you can add a bookmark by api/ymarks/add_ymark.xml?url=http://www.yacy.net&title=YaCy and watch the result via the standard view Tables_p.html.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7249 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 05:40:19 +00:00
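
The add_ymark call described in the commit above can be sketched as a small URL builder. This is only an illustration: the base address http://localhost:8080 and the helper name add_ymark_url are assumptions, and percent-encoding the parameters is a precaution taken here rather than something the commit message states the servlet requires.

```python
from urllib.parse import urlencode


def add_ymark_url(base: str, url: str, title: str) -> str:
    """Build the request URL for the experimental add_ymark API.

    `base` is a hypothetical peer address; only the path and the
    `url` / `title` parameters come from the commit message.
    """
    # urlencode percent-encodes the values so the bookmarked URL cannot
    # be confused with the query string of the API call itself.
    query = urlencode({"url": url, "title": title})
    return f"{base.rstrip('/')}/api/ymarks/add_ymark.xml?{query}"


if __name__ == "__main__":
    print(add_ymark_url("http://localhost:8080", "http://www.yacy.net", "YaCy"))
```

According to the commit message, the stored bookmark can then be inspected via the standard view Tables_p.html.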
orbiter
e4d561971e added more score cluster options and made score cluster usage more transparent
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7248 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-14 11:40:02 +00:00
orbiter
e8f90201a5 fix for scheduling of rss feeds
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7247 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-13 13:00:36 +00:00
orbiter
7cd9d9d22a - enhanced DidYouMean computation using a faster count on index entries; this allows results to be ranked better
- added limitations on DidYouMean result sets according to input and output string length

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7246 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 22:02:10 +00:00
apfelmaennchen
beb65437d2 additional fix for the widget - now a second result page is loaded automatically if there are too few search results for the scroll event to trigger
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7245 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 21:55:24 +00:00
apfelmaennchen
2bb0c9b503 Fix for search widget keyup event handling. ESC will close the widget window and RIGHT will load additional search results, especially when the scroll event won't work because of too few results.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7244 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 21:28:28 +00:00
orbiter
de722090b5 enhancements in did-you-mean guessing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7243 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 09:45:15 +00:00
orbiter
a59c885ee0 autocomplete and did-you-mean can now understand _all_ languages and can generate suggestions in all languages and character types
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7242 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 08:36:33 +00:00
orbiter
b7acd92ce4 Auto-Suggestions for YaCy Search:
- added a suggest servlet according to opensearch and firefox standard
- integrated the suggest servlet into opensearch description file
- integrated an autocomplete plugin for jquery
- added an autocomplete addition to the yacy search windows showing autosuggest queries

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7241 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 01:23:49 +00:00
mikeworks
67b3b4b13b de.lng: Added translation for skin color picker on http://demo.zuum.net:8080/ConfigAppearance_p.html
ConfigAppearance_p.html: Some XHTML 1.0 Strict changes on the Customization page http://demo.zuum.net:8080/ConfigAppearance_p.html
--> Now all Customization pages should validate XHTML 1.0 strict

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7240 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 01:03:10 +00:00
orbiter
24f1cba7b2 performance hacks:
- faster generation of index abstract compression during remote search
- less synchronization in IO record reading
- request index abstract generation only if necessary and faster time-out in remote search 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7239 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 12:44:07 +00:00
orbiter
6a166c2040 patches for bad proxy behaviour
- accept ipv6 localhost clients
- index media files (url only)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7238 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 11:38:36 +00:00
orbiter
d607b30b6a performance enhancements for search and code review for database functions
- removed read cache from Records data structure because the read cache had no cache hit during search operation
- copied the old read-cache class to CachedRecords; the new Records class no longer has the cache, and a code review checked that its data structures and synchronization are clean
- removed unnecessary synchronization from Table class during get()

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7237 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 11:01:50 +00:00
orbiter
6d61b80fb6 added ColorPicker to WatchWebStructure
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7236 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 08:57:40 +00:00
orbiter
45b1ab3d07 custom + generic skins:
- added a generic skin which is filled with actual color assignment using a servlet
- enabled css servlets
- added a generic color scheme in configuration file
- added configuration input in Customization/Appearance servlet
- added a jquery color picker widget
- attached the color picker widget to the generic colour definition input fields

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7235 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 00:00:10 +00:00
mikeworks
41bf8ef9f9 de.lng: Added translation for Link List option in Crawl Site page http://localhost:8080/CrawlStartSite_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7234 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-10 10:44:15 +00:00
orbiter
fcd40cd30f - disabled domZones (buggy, must think about better solution)
- increased time-out for dns resolver and isLocal property

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7233 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-09 10:17:50 +00:00
orbiter
ec38eca278 fix for new URI equal method
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-09 09:27:31 +00:00
orbiter
0d363a94d7 more performance hacks
this makes YaCy search results VERY fast for all verify=false search cases
and it enhances the search speed also for all other snippet-fetch cases.
With this change my peer performed 100 Queries Per Second (!!!) while doing 10 queries simultaneously (!!!)
in an intranet index of 20000 URLs on my 16-core Mac

Check this yourself by doing:
cd bin
./searchtestmulti.sh
after the run finishes, divide 1000 by the given time per query (this gives the qps for one thread)
and then multiply by 10 (because 10 search threads have been started)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7231 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-09 08:55:57 +00:00
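
The throughput arithmetic from the commit above, written out as a minimal sketch. The function name and the 100 ms sample value are illustrative assumptions; the sample was picked because it reproduces the 100 QPS figure quoted in the message, and the time per query is assumed to be in milliseconds since the message divides 1000 by it.

```python
def queries_per_second(ms_per_query: float, threads: int = 10) -> float:
    """Estimate total queries per second from the time per query.

    ms_per_query: time per query as reported by the test run
                  (assumed to be milliseconds).
    threads:      number of parallel search threads (10 in the commit).
    """
    if ms_per_query <= 0:
        raise ValueError("ms_per_query must be positive")
    # 1000 ms divided by the time per query gives the qps of one thread.
    per_thread_qps = 1000.0 / ms_per_query
    # All threads run in parallel, so the totals add up.
    return per_thread_qps * threads


if __name__ == "__main__":
    # Hypothetical reading of 100 ms per query with 10 threads.
    print(queries_per_second(100.0))  # 100.0
```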
orbiter
b8aee6d402 performance hacks for better search performance
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7230 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 23:50:28 +00:00
orbiter
11bebe356b fixed crawl start: with SVN 7225 the name of the crawl start url was not given in the input field and therefore all crawl starts contained the empty string as crawl start url
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7229 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 22:02:24 +00:00
orbiter
2971c91988 fix for http://forum.yacy-websuche.de/viewtopic.php?p=20977#p20977
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7228 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 21:44:28 +00:00
orbiter
091dd3f6ec - enhanced intranet search speed
- enhanced intranet portscan speed (better time-out)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7227 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 10:54:13 +00:00
low012
b9f405d1e8 *) added comments
*) more beautiful and easier to understand code (IMO)
*) added display= parameter to a lot of links in Wiki.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7226 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 00:32:50 +00:00
mikeworks
70576e88d2 de.lng: Added some more untranslated strings I found and uncommented old ones that were removed
terminal_p.html: Put back the old ID which was really easy to find
IndexCreate.js: Because XHTML 1.0 Strict does not allow name tags for some elements, rewrote most element access functions to use getElementById
Table_API_p.html and all other html pages: Some XHTML 1.0 Strict fixes, changed checkAll javascript, marked the first row with checkboxes as unsortable where applicable
Table_API_p.java and all other java pages: URLencoded lines with possible ampersands & -> &amp; for validation XHTML 1.0 Strict sourcecode
--> All Index Create pages should validate now. Hope I did not break anything else (too much :-)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7225 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-06 00:00:23 +00:00
orbiter
efa59250f8 release 0.98 for SuMa-eV Demo tomorrow
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7224 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-05 17:52:54 +00:00
orbiter
6e6994e328 latest bugfixes to search and indexing function after test of demo presentation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7223 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-05 17:49:53 +00:00
orbiter
c3bf17a3a1 fixed must-match filter for smb crawling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7222 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-05 00:05:08 +00:00
orbiter
099def2a04 small changes in search widget appearance
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7221 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-04 23:43:33 +00:00
orbiter
50586a0dfd rename of widget to 'widget'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7220 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-04 19:57:29 +00:00
apfelmaennchen
dffa142529 Fix for author navigator in yacyui-portalsearch.js
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7219 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-04 19:27:12 +00:00
orbiter
574346f8ce better must-match pattern for intranet file-crawls
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7218 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-04 12:55:39 +00:00
orbiter
aacf572a26 - enhancements for search speed
- bug fixes in many classes including basic data structure classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7217 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-04 11:54:48 +00:00
sixcooler
aa6075402a small fix for crawling from 'sitelist' after the changes from 7214
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7216 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-01 22:41:28 +00:00
sixcooler
61c82f3105 gzip-compression @ transferRWI & transferURL back again
This reduces upload volume to suit the limited bandwidth of home users like me :-)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7215 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-01 00:42:43 +00:00
orbiter
2c549ae341 fixed a number of small bugs:
- better crawl start for file paths and smb paths
- added time-out wrapper for dns resolving and reverse resolving to prevent blockings
- fixed intranet scanner result list check boxes
- prevented htcache usage in case of file and smb crawling (not necessary, documents are locally available)
- fixed rss feed loader
- fixed sitemap loader which had not been restricted to single files (crawl-depth must be zero)
- clearing of crawl result lists when a network switch was done
- higher maximum file size for crawler

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7214 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-30 23:57:58 +00:00
orbiter
f6eebb6f99 replaced auto-dom filter with easy-to-understand Site Link-List crawler option
- nobody understands the auto-dom filter without a lengthy introduction about the function of a crawler
- nobody ever used the auto-dom filter other than with a crawl depth of 1
- the auto-dom filter was buggy since the filter did not survive a restart and then a search index contained waste
- the function of the auto-dom filter was in fact to just load a link list from the given start url and then start separate crawls for all these urls restricted by their domain
- the new Site Link-List option shows the target urls in real-time during input of the start url (like the robots check) and gives a transparent feed-back what it does before it can be used
- the new option also fits into the easy site-crawl start menu

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7213 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-30 12:50:34 +00:00
mikeworks
63e387508c ConfigLanguage_p.java: Fixed the filename for the API call to ConfigLanguage_p.html - previously ConfigLanguage.html was recorded and replaying the action failed with error 404 - Not found
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7212 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-30 03:16:17 +00:00
mikeworks
f468d377d7 Collage.html and Collage.java: Added <p> in body before images for XHTML 1.0 Transitional validation and alt tag to images as well as closing tag <img (...) />
terminal_p.html: Set new link for starting a crawl to CrawlStartSite_p.html and replaced the old embed object of the Among.us Flash object by their new JS which takes care of adding the object correctly
de.lng: Moved the translations for the JS part from yacyinteractive.html to the yacyinteractive.js part
--> Terminal page is now valid XHTML 1.0 Transitional


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7211 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-29 23:11:29 +00:00
orbiter
3057a0b939 - intranet scanner now produces urls with host names, not ips if possible
- CrawStartIntranet servlet shows IPs and host names

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7210 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-29 22:44:49 +00:00
orbiter
75964909aa added missing path to htroot (may only be necessary for cross-linking of servlet classes)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7209 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-29 22:19:41 +00:00