Commit Graph

4712 Commits

Author SHA1 Message Date
orbiter
3d945bb442 fix for ftp client: suppress bad directory listing time-out
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7348 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-29 08:41:29 +00:00
orbiter
d4a1a1850b removed warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7347 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-29 07:52:10 +00:00
low012
3b5830b7d4 *) Fixed typo.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7346 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-28 03:05:22 +00:00
low012
9b3fae9496 *) cleaning up the code a little bit
*) program to interface, not implementation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-28 02:57:31 +00:00
orbiter
7bb4b001ed - view image files from cache
- fixed generic header settings; affects CORS functionality

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7344 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-27 09:16:16 +00:00
low012
e7552bd719 *) cleaning up the code a little bit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7343 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-27 00:54:59 +00:00
orbiter
321eb012fe removed two warnings and reverted one change
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7340 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-26 11:15:42 +00:00
apfelmaennchen
737aaf6952 various small changes to ymarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7339 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-25 21:16:47 +00:00
apfelmaennchen
8a50670546 some code clean up for the last post
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7338 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-24 23:40:55 +00:00
apfelmaennchen
442497868d another step towards an auto tagging function for YMarks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7337 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-24 23:26:29 +00:00
f1ori
741a87a3e9 * make .yacy-domains crawlable (.yacy-domains are local domains, so only in custom networks/peers)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7334 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-22 19:12:51 +00:00
f1ori
fd74bc388c * fix small bug in sessionid-removal
* add testcase for sessionid-removal

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7333 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-21 23:55:40 +00:00
f1ori
dca9e16f51 * don't index redirecting pages twice
* therefore auto-redirection of HTTPClient for crawling is disabled and the old code is reactivated

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7332 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-21 22:46:12 +00:00
low012
eb79b952ef *) cleaner code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7331 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-21 03:39:53 +00:00
low012
38fdf43587 *) renamed classes according to standard Java coding conventions
*) String.isEmpty() was introduced in Java 1.6, but we still use Java 1.5

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7330 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-21 01:29:32 +00:00
low012
025e3f4790 *) renamed classes according to standard Java coding conventions
*) removed unused code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7328 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-21 00:39:21 +00:00
low012
3b9aa0504e *) removed unused code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7327 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-21 00:28:32 +00:00
low012
db3db0fdb9 *) trying to make this class less confusing (probably failing)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7326 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-21 00:13:08 +00:00
apfelmaennchen
54e63b556e intermediate step for a YMark auto-tagging function based on word frequencies.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7325 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-17 15:17:29 +00:00
apfelmaennchen
403ee9c014 added a drill-down for metadata and word count to /api/ymarks/test_treeview.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7324 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-16 00:48:38 +00:00
f1ori
a025b1da89 * fix bug when browsing local filesystem (e. g. repository) with yacy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7323 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-15 14:47:16 +00:00
apfelmaennchen
11ae5b108e enabled rebuildIndex for /Table_YMark_p.html (rebuilds the tags and folders index)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7320 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-13 13:02:56 +00:00
sixcooler
b87bf88ac8 using less memory on merging and rewriting blobs
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7317 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-12 16:02:20 +00:00
apfelmaennchen
94a9be18a4 added a ymark table administration: /Table_YMark_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7316 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-10 22:53:27 +00:00
apfelmaennchen
25339f93c7 more updates to ymarks
- working xbel import/export
- exported xbel includes yacy specific metadata but still validates against PUBLIC DTD


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7315 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-09 17:01:31 +00:00
f1ori
d62e449a11 * fix FilterEngine, forgot comparison operator
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7314 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-08 09:37:44 +00:00
apfelmaennchen
cdd65aca71 update to ymarks
- get_xbel.xml is almost working
- started ymark api documentation info.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7313 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-07 20:03:01 +00:00
apfelmaennchen
808edffaf6 ymarks
- some refactoring
- working xbel and html import (/api/ymarks/test_import.html)
- working treeview (/api/ymarks/test_treeview.html)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7312 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-06 20:26:13 +00:00
f1ori
2c539b514a * add domaincheck (local/global/domainlist) to urlcleaner
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7311 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-06 16:50:33 +00:00
orbiter
117fc86b3d fix for http://forum.yacy-websuche.de/viewtopic.php?p=21199#p21199
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7308 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-05 13:19:37 +00:00
orbiter
441fbc26e2 security patch for WeakPriorityBlockingQueue (produced a deadlock)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7307 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-05 09:38:31 +00:00
orbiter
5dcb838293 - removed thread overhead when calling dns services
- fixed localsearch (changed it by accident)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7306 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-05 00:29:32 +00:00
orbiter
4c50d3428e smaller file size for array stacks to support smaller deletion sizes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7305 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-04 13:29:19 +00:00
orbiter
09badc697b - low-memory patch for crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7304 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-04 13:26:27 +00:00
orbiter
becc463d8a enhanced did-you-mean
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7300 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-04 00:25:19 +00:00
apfelmaennchen
43586a2ace an update to ymarks (please test if you wish):
- import HTML (e.g. FF export) via /api/ymarks/import.html
- view your import via /api/ymarks/test.html
- get an XML list via /api/ymarks/get_ymark_list.xml?tags=&folders=
- delete bookmark tables via standard interface /Tables_p.html
it is still very experimental!! 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7299 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-03 22:52:03 +00:00
orbiter
93c535d111 fixed http://forum.yacy-websuche.de/viewtopic.php?p=21113#p21113
fixed a concurrent modification exception during search and a time-out problem

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7298 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-03 20:58:50 +00:00
orbiter
04932dc268 added rdf data structure for rss feeds
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7297 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-03 20:06:23 +00:00
orbiter
84f2953cd8 fix for rss loader / rss type recognition
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7296 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-03 19:58:01 +00:00
orbiter
4c72885cba added a sitemap entry parser and loader for sitemaps
(a recursion if a sitemap refers to another sitemap)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-03 19:48:33 +00:00
orbiter
790e0b1894 - enhanced index deletion in IndexControlRWIs_p: delete also robots.txt database and cache if demanded
- added option for details of deletion
- added deletion to new ConfigHTCache_p servlet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7294 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-03 18:31:36 +00:00
apfelmaennchen
f5324b27f2 more updates to the new bookmarks (ymarks)....
- split YMarkTables and YMarkIndex in two different classes
- HTML import is working properly
- XBEL import is still broken


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7292 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-03 06:47:02 +00:00
orbiter
445619f3ec added a submenu ConfigHTCache_p.html to set the size of the HTCache separately from the proxy configuration.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7291 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-02 23:57:11 +00:00
sixcooler
85c65475fa small but important correction of last commit @ HTTPClient
(if there is a response it really should be taken to its end)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-02 21:18:44 +00:00
f1ori
acd93b1b31 * add failsafe mechanism to domainlist retrieval
domainlist is saved locally, if none of the given urls in network.unit.domainlist
  could be retrieved, the file from the last boot is used instead

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7289 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-02 17:57:48 +00:00
orbiter
70c95608d4 Added CORS Access header for yacysearch.rss output
used some of the recommendations from Copro:
http://forum.yacy-websuche.de/viewtopic.php?p=21015#p21015
Original Request:
http://forum.yacy-websuche.de/viewtopic.php?p=20829#p20829

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7288 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-11-02 16:28:40 +00:00
lotus
18729351e7 upnp: hint for wrongly detected local ip address
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7286 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-31 20:05:59 +00:00
f1ori
def4253555 * add option to network definition to provide a domainlist (syntax like in blacklists)
* crawler and search allow only urls matching one in domainlist (if list is provided)
* this may be useful to prevent dedicated networks from being "polluted"
* FilterEngine is an improved Blacklist object; Blacklist may inherit from FilterEngine in the future

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7285 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-30 14:44:33 +00:00
orbiter
ac6b503adf untar files without gzip decompression even if the file has gz extension. this is done when the decompression fails.
decompressed gzip files with gz extension may appear if the server sets a gzip compression header

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7282 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-28 23:12:33 +00:00
apfelmaennchen
efe0667fdd more new bookmark (ymarks) code with experimental html and xbel import
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7281 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-28 15:24:15 +00:00
mikeworks
caabebf9be Fixed spelling mistake omiting -> omitting in debug messages in ConfigUpdate_p.java and Switchboard.java
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7280 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-28 04:03:11 +00:00
orbiter
fb92f9ae8e added mime type image/jpeg (image/jpg is wrong but it is left here because it does not harm and this error also exists in configuration of web servers)
see also:
http://forum.yacy-websuche.de/viewtopic.php?p=21129#p21129

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7279 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-27 21:53:11 +00:00
orbiter
155d556568 - better memory protection
- more logging
- little bit of refactoring

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7278 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-27 13:21:18 +00:00
f1ori
7d8de34778 * add a bit documentation to DigestURI, use DigestURI(string) instead of DigestURI(string, null)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7276 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-26 16:10:20 +00:00
orbiter
25a8e55bc9 more logging about bad seeds
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7275 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-26 15:00:22 +00:00
orbiter
959b8c6fa0 - allow greater seed size
- more logging for bad seeds

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7274 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-26 14:40:19 +00:00
orbiter
e103419a56 - removed <3 peers barrier for peer ping feedback
- more logging

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7273 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-26 13:08:09 +00:00
apfelmaennchen
d0e6c03b51 some updates to the new bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7272 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-25 22:44:05 +00:00
orbiter
facfd204e9 added a parent configuration option.
see /ConfigPortal.html
requested here:
http://forum.yacy-websuche.de/viewtopic.php?p=21099#p21099

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7271 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-25 22:16:07 +00:00
orbiter
e3964f2c31 better catch of network definition load error; continue with secondary network load definition location
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7270 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-25 09:20:45 +00:00
low012
65a0381f76 *) cleaning up code (still not done)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7267 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-24 23:57:07 +00:00
orbiter
e3e3b49d52 - enhanced main release recognition
- yacybot user agent now includes the yacy network name (not the peer name!)
- refactoring and clean-up (mostly turned tab into spaces)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7266 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-24 21:43:01 +00:00
apfelmaennchen
9c94ebdee4 small changes to new bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7265 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-22 13:14:09 +00:00
apfelmaennchen
244b56e9d3 an update to the new bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7264 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-21 19:18:17 +00:00
low012
dc40f51b8d *) added headlines as proposed by Vega
*) <pre> will be displayed monospaced in wiki and blog again
*) bugfix for <pre> spanning multiple lines
*) replaced deprecated <s> tag with <span> equivalent

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7262 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-18 23:00:43 +00:00
apfelmaennchen
f035f257da added some more bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7261 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-18 21:09:41 +00:00
low012
22ed9c380c *) fixed bug which was introduced in r7226 (shame on me) which made wiki unusable (all entries were stored with empty subject as key -> edits were lost)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7260 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-18 21:06:23 +00:00
f1ori
60fd2e549d * log failures when writing config file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7259 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-18 15:00:29 +00:00
orbiter
58e74282af added a word counter statistic in condenser which is used by the did-you-mean to calculate best matches for given search words.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7258 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-18 11:35:09 +00:00
orbiter
863065abc4 added user agent logging to access tracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7256 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-18 08:09:59 +00:00
apfelmaennchen
a79728b97d some updates to experimental bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7254 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-17 09:58:50 +00:00
apfelmaennchen
ef782cd026 and even more experimental bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7253 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-16 10:20:41 +00:00
orbiter
ed4371dcf3 enhanced navigation implementation and enhanced tag cloud computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7252 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 23:45:12 +00:00
orbiter
ca738ac924 - added a tag cloud to search results (using the topics)
- some refactoring of score classes
- added default package for new classes add_ymark and delete_ymark

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7251 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 22:01:39 +00:00
apfelmaennchen
7aca763ca8 Some more experimental bookmark code...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7250 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 12:53:41 +00:00
apfelmaennchen
4270ed696c Experimental code (I need to transfer the code to my macbook, sorry) for the new bookmarks API based on the Tables concept (same as for crawl starts). Currently you can add a bookmark by api/ymarks/add_ymark.xml?url=http://www.yacy.net&title=YaCy and watch the result via the standard view Tables_p.html.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7249 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-15 05:40:19 +00:00
orbiter
e4d561971e added more score cluster options and made score cluster usage more transparent
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7248 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-14 11:40:02 +00:00
orbiter
e8f90201a5 fix for scheduling of rss feeds
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7247 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-13 13:00:36 +00:00
orbiter
7cd9d9d22a - enhanced DidYouMean computation using a faster count on index entries; this allows results to be ranked better
- added limitations on DidYouMean result sets according to input and output string length

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7246 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 22:02:10 +00:00
orbiter
de722090b5 enhancements in did-you-mean guessing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7243 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 09:45:15 +00:00
orbiter
a59c885ee0 autocomplete and did-you-mean can now understand _all_ languages and can generate suggestions in all languages and character types
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7242 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 08:36:33 +00:00
orbiter
b7acd92ce4 Auto-Suggestions for YaCy Search:
- added a suggest servlet according to opensearch and firefox standard
- integrated the suggest servlet into opensearch description file
- integrated an autocomplete plugin for jquery
- added an autocomplete addition to the yacy search windows showing autosuggest queries

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7241 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-12 01:23:49 +00:00
orbiter
24f1cba7b2 performance hacks:
- faster generation of index abstract compression during remote search
- less synchronization in IO record reading
- request index abstract generation only if necessary and faster time-out in remote search 

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7239 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 12:44:07 +00:00
orbiter
6a166c2040 patches for bad proxy behaviour
- accept ipv6 localhost clients
- index media files (url only)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7238 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 11:38:36 +00:00
orbiter
d607b30b6a performance enhancements for search and code review for database functions
- removed read cache from Records data structure because the read cache had no cache hit during search operation
- copied the old read-cache class to CachedRecords; the Records class no longer has the cache, and a code review checked that data structures and synchronization are clean
- removed unnecessary synchronization from Table class during get()

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7237 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 11:01:50 +00:00
orbiter
45b1ab3d07 custom + generic skins:
- added a generic skin which is filled with actual color assignment using a servlet
- enabled css servlets
- added a generic color scheme in configuration file
- added configuration input in Customization/Appearance servlet
- added a jquery color picker widget
- attached the color picker widget to the generic colour definition input fields

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7235 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-11 00:00:10 +00:00
orbiter
fcd40cd30f - disabled domZones (buggy, must think about better solution)
- increased time-out for dns resolver and isLocal property

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7233 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-09 10:17:50 +00:00
orbiter
ec38eca278 fix for new URI equal method
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-09 09:27:31 +00:00
orbiter
0d363a94d7 more performance hacks
this makes YaCy search results VERY fast for all verify=false search cases
and it enhances the search speed also for all other snippet-fetch cases.
With this change my peer performed 100 Queries Per Second (!!!) while doing 10 queries simultaneously (!!!)
in an intranet index of 20000 URLs on my 16-core Mac

Check this yourself by doing:
cd bin
./searchtestmulti.sh
after finishing the run, divide 1000 by the given time per query (which is the qps for one thread)
and then multiply again by 10 (because 10 search threads have been started)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7231 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-09 08:55:57 +00:00
orbiter
b8aee6d402 performance hacks for better search performance
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7230 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 23:50:28 +00:00
orbiter
091dd3f6ec - enhanced intranet search speed
- enhanced intranet portscan speed (better time-out)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7227 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 10:54:13 +00:00
low012
b9f405d1e8 *) added comments
*) more beautiful and easier to understand code (IMO)
*) added display= parameter to a lot of links in Wiki.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7226 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 00:32:50 +00:00
orbiter
6e6994e328 latest bugfixes to search and indexing function after test of demo presentation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7223 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-05 17:49:53 +00:00
orbiter
aacf572a26 - enhancements for search speed
- bug fixes in many classes including basic data structure classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7217 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-04 11:54:48 +00:00
sixcooler
61c82f3105 gzip-compression @ transferRWI & transferURL back again
This reduces upload-volume to suit limited bandwidth of home-users like me :-)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7215 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-01 00:42:43 +00:00
orbiter
2c549ae341 fixed a number of small bugs:
- better crawl start for file paths and smb paths
- added time-out wrapper for dns resolving and reverse resolving to prevent blockings
- fixed intranet scanner result list check boxes
- prevented htcache usage in case of file and smb crawling (not necessary, documents are locally available)
- fixed rss feed loader
- fixed sitemap loader which had not been restricted to single files (crawl-depth must be zero)
- clearing of crawl result lists when a network switch was done
- higher maximum file size for crawler

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7214 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-30 23:57:58 +00:00
orbiter
f6eebb6f99 replaced auto-dom filter with easy-to-understand Site Link-List crawler option
- nobody understands the auto-dom filter without a lengthy introduction about the function of a crawler
- nobody ever used the auto-dom filter other than with a crawl depth of 1
- the auto-dom filter was buggy since the filter did not survive a restart and then a search index contained waste
- the function of the auto-dom filter was in fact to just load a link list from the given start url and then start separate crawls for all these urls restricted by their domain
- the new Site Link-List option shows the target urls in real-time during input of the start url (like the robots check) and gives transparent feedback on what it does before it can be used
- the new option also fits into the easy site-crawl start menu

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7213 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-30 12:50:34 +00:00
orbiter
3057a0b939 - intranet scanner now produces urls with host names, not ips if possible
- CrawStartIntranet servlet shows IPs and host names

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7210 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-29 22:44:49 +00:00
orbiter
c60aed4435 no caching in browser of dynamic web pages sent by YaCy http
this may prevent unnecessary IO caused by cache storage of the browser

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7207 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-29 19:56:42 +00:00
orbiter
e63896f2a8 added an intranet scanner and a servlet which shows all intranet addresses and an option to start a site-crawl for all these addresses at once.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7203 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-28 12:18:54 +00:00