Commit Graph

195 Commits

Author SHA1 Message Date
orbiter
ef82cced01 removed default line 'P2P WEB SEARCH' if no line is given
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5553 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-01 00:43:52 +00:00
orbiter
94110df85a moved logging partially to kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5545 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-31 01:06:56 +00:00
orbiter
024da2916b refactoring of logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 23:33:47 +00:00
orbiter
83ce65707a (almost) completed partition of classes in kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:44:20 +00:00
orbiter
7ee494fde5 more refactoring of kelondro:
- separated BLOB from table classes
- renamed 'coding' package to 'order'

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:08:08 +00:00
orbiter
bf93767ec6 refactoring of kelondro database classes
(to be continued)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5540 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 15:33:00 +00:00
orbiter
fc27bf8c4c refactoring of kelondro classes:
kelondro shall become independent from other packages.
moved bytebuffer, date and memory to kelondro

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5539 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 14:48:11 +00:00
low012
b41a06228f *) cleaning up...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5529 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-28 19:48:52 +00:00
low012
ce81391095 *) using parameters like site: in the search field does not affect urlmask anymore
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5528 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-28 19:37:24 +00:00
low012
80e6356860 *) r 5512 introduced a bug which resulted in useless filters if site:, filetype:, or inurl: was used, since the filters included the word "null".
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5517 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-25 22:16:49 +00:00
lotus
5078e837ac better readability / no functional changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5512 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-22 13:41:11 +00:00
lotus
c7c291bc6b allow simultaneous inurl: site: and filetype: search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5478 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-12 14:59:27 +00:00
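A minimal sketch of how the three operators could be pulled out of one query string at the same time. This is not the YaCy implementation; the class name `QueryOperatorSketch`, the `Parsed` holder, and the whitespace-based tokenizing are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of YaCy: splits a query into plain search
// words and the values of the site:, filetype: and inurl: operators.
final class QueryOperatorSketch {
    static final String[] OPERATORS = { "site", "filetype", "inurl" };

    /** Result holder: remaining search words plus operator values. */
    static final class Parsed {
        final List<String> words = new ArrayList<>();
        final Map<String, String> operators = new LinkedHashMap<>();
    }

    static Parsed parse(String query) {
        Parsed parsed = new Parsed();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) continue; // happens for an empty query
            boolean isOperator = false;
            for (String op : OPERATORS) {
                String prefix = op + ":";
                // an operator needs a non-empty value after the colon
                if (token.startsWith(prefix) && token.length() > prefix.length()) {
                    parsed.operators.put(op, token.substring(prefix.length()));
                    isOperator = true;
                    break;
                }
            }
            if (!isOperator) parsed.words.add(token);
        }
        return parsed;
    }

    public static void main(String[] args) {
        Parsed p = parse("test site:edu filetype:pdf inurl:paper");
        System.out.println(p.words + " " + p.operators);
    }
}
```

Keeping the operator values in a separate map means a missing operator is simply absent rather than rendered as a placeholder string inside a filter.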
orbiter
9ef77d57f5 added access control to the search interface using white/blacklists:
in the network configuration, you can configure a whitelist and a blacklist
- blacklisted clients cannot search
- whitelisted clients never get any search restrictions
- for all other clients: apply DoS search restrictions
Please see the example configuration in yacy.network.freeworld.unit
By default, all clients from localhost get whitelisted.
If you have your own YaCy network, please put all the IPs of your peers into the whitelist

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5475 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-12 10:55:48 +00:00
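The three-way decision described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: the class `SearchAccessSketch`, the enum names, the use of plain IP strings, and the choice to check the blacklist before the whitelist are all hypothetical.

```java
import java.util.Set;

// Hypothetical sketch of the white/blacklist decision; not YaCy code.
final class SearchAccessSketch {
    enum Access { BLOCKED, UNRESTRICTED, DOS_RESTRICTED }

    static Access classify(String clientIp, Set<String> whitelist, Set<String> blacklist) {
        // Assumption: the blacklist wins if an address is on both lists.
        if (blacklist.contains(clientIp)) return Access.BLOCKED;
        if (whitelist.contains(clientIp)) return Access.UNRESTRICTED;
        // Everyone else is subject to the DoS search restrictions.
        return Access.DOS_RESTRICTED;
    }

    public static void main(String[] args) {
        Set<String> whitelist = Set.of("127.0.0.1");
        Set<String> blacklist = Set.of("203.0.113.7");
        System.out.println(classify("127.0.0.1", whitelist, blacklist));
        System.out.println(classify("203.0.113.7", whitelist, blacklist));
        System.out.println(classify("198.51.100.20", whitelist, blacklist));
    }
}
```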
lotus
4641ecd6d9 inurl: search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5456 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-08 18:59:29 +00:00
lotus
0d1bd78674 * full site: syntax support, e.g. site:de.wikipedia.org,
would be possible if dots in the query already worked

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-07 21:05:07 +00:00
orbiter
9bed4de280 fix for the search bug introduced in SVN 5449
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5451 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 23:16:10 +00:00
orbiter
b2b7edae18 fixed interactive search
- added dummy servlet class, because otherwise the template engine is not triggered.
That is because the YaCy httpd works much faster as a normal file server without a scan
of the served pages. Therefore each page with templates must now have a class file associated with it.
- fixed json output format of yacysearch

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5449 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 20:04:09 +00:00
lotus
ca80930892 accept leading dots on filetype: and site: search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5444 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 10:04:24 +00:00
low012
1af728ae09 *) regex for site operator changed as proposed by Lotus
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5441 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-05 18:30:34 +00:00
low012
9e58ae036d *) added site operator which can be used to only show results from a certain domain. example: "test site:edu" shows only documents which contain the word test and which come from an edu domain
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5439 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-04 14:58:32 +00:00
orbiter
28d2d28573 added support for filetype search
(just use filetype:<type> in the search query)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5418 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 17:57:04 +00:00
orbiter
47292e696a more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-04 12:54:16 +00:00
orbiter
d39d420b39 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5376 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-03 15:38:29 +00:00
orbiter
0b4808ba3d added new interactive search feature:
- while the user types search queries, the local database is searched
- results are presented interactively

This was implemented using a new JSON result format for search results in YaCy
- added JSON as file format for servlets
- refactoring of current search servlets (xml and html)
- added JSON output format for search results
- added AJAX-based search page that uses the yacysearch.json servlet to print results as a query is typed

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-02 15:24:25 +00:00
lotus
1951d30a62 addendum to last commit
handle words with length < 3 correctly

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5369 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-26 19:43:40 +00:00
orbiter
0edec2b760 FULL redesign of algorithms in htmlTools to encode/decode strings from/to unicode and html.
The old process used a rather inefficient way to detect html encoding strings in texts.
All calling methods have been adapted to call the new class in an enhanced way with fewer parameters.

Many classes in interfaces used XML encoding only (instead of full html conversion from unicode to html); this behavior was not changed with this commit but should be checked again since it points to possible XSS leaks

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-22 18:59:04 +00:00
low012
77e41da7d2 *) further propagation of display value (see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1536)
*) removed another deprecated parameter "time" which led to ugly -UNRESOLVED_PATTERN- in URL

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5285 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-18 19:39:46 +00:00
lotus
7782a43060 fix if LANGUAGE: was not defined at the end of the query
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5242 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-03 11:36:17 +00:00
lotus
fe2792e9ce use accept-language header instead of user agent for language detection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5235 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-01 17:47:11 +00:00
lotus
93ddf206e6 opensearch fix if user agent had no language
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5233 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-30 20:13:18 +00:00
orbiter
6e7d113eac fix for wrong index initialization after network switch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5203 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-23 23:30:25 +00:00
orbiter
1198eeecc7 added language selection to search query:
- the language can be selected using a LANGUAGE:<language> element in the query line, e.g.:
java LANGUAGE:en
- the language can be selected with a post element in google-style syntax with the 'lr' element:
?lr=lang_en&query=java

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5193 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-21 07:28:57 +00:00
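A short sketch of the two selection paths named in this commit message: a `LANGUAGE:<language>` token inside the query line and a google-style `lr=lang_<language>` parameter value. The class `LanguageSelectionSketch` and both method names are hypothetical, and returning `null` when no language is given is an assumption, not documented YaCy behavior.

```java
import java.util.Locale;

// Hypothetical sketch, not YaCy code: extracts a language code from
// either a LANGUAGE:<lang> query token or an lr=lang_<lang> parameter value.
final class LanguageSelectionSketch {
    private static final String QUERY_PREFIX = "LANGUAGE:";
    private static final String LR_PREFIX = "lang_";

    /** Returns the language from the query line, or null if none is given. */
    static String fromQuery(String query) {
        for (String token : query.trim().split("\\s+")) {
            if (token.startsWith(QUERY_PREFIX) && token.length() > QUERY_PREFIX.length()) {
                return token.substring(QUERY_PREFIX.length()).toLowerCase(Locale.ROOT);
            }
        }
        return null;
    }

    /** Returns the language from an lr parameter value such as "lang_en", or null. */
    static String fromLrParameter(String lr) {
        if (lr != null && lr.startsWith(LR_PREFIX) && lr.length() > LR_PREFIX.length()) {
            return lr.substring(LR_PREFIX.length()).toLowerCase(Locale.ROOT);
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(fromQuery("java LANGUAGE:en"));
        System.out.println(fromLrParameter("lang_en"));
    }
}
```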
orbiter
00c1535f84 added ranking and evaluation of language type in a search
the wanted language is taken from the browser user-agent string

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5192 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-21 00:04:42 +00:00
orbiter
4fbee21cea - added fetch-ahead again (had been removed in last commit)
- reverted default query mode to verify=false

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5111 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-03 23:50:13 +00:00
orbiter
fc03b0437a fixed an error case where a second search after a first search with a different search word failed
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5109 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-03 15:55:25 +00:00
orbiter
d3d41e2ee4 - fixed problem with searching with quotes (still not complete, but not as bad as before)
- fixed parsing of crawl-delay statements when seconds were given with float numbers
- enhanced performance of profiling (not too many loggings; not more than one per second)
- removed some debug output
- fixed wrong return type in logging
- added a logging condition in httpd to prevent logging statements from being generated when they are not written (should be added everywhere!)
- fixed wrong word distance computation in RWI management


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5101 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-02 23:49:48 +00:00
lotus
3fbfd5a78b * fix for non-changing offset on new search term
* dht-heap doesn't have to be deleted (5097), we simply write a new one on exit
* do not install YaCy in startup because a Windows-shutdown might corrupt something. Installing YaCy as a service would solve this.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5099 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-02 15:09:31 +00:00
orbiter
536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
- removed distinction between header file types for http and ftp; ftp is simulated by using http properties
- removed all old resourceInfo classes that handled this distinction
- introduced a new distinction between http request and http response objects
- unified new response objects with two other object types that had been introduced elsewhere
- changed all servlet call methods to use the new http request header object type
- divided static object keys for http header properties into request and response types
- refactoring here and there (a large number of type changes and many methods merged/moved)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5079 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-25 18:11:47 +00:00
danielr
621b473b18 * removed some warnings of findbugs (http://findbugs.sf.net)
- removed unnecessary code (unused variables, String.toString)
- corrected some calculations (cast int to double or long ;)
- slightly improved performance (using Integer.valueOf() instead of new Integer)
- log if some File-actions fail (mkdir(), delete(), ...) and some ignored exceptions
- finalized some (more) fields
- finally close some streams
- made inner classes static if not using environment
- generalized some equals (from specificClass to Object)
- fixed some potential nullpointer accesses


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5039 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-06 19:43:12 +00:00
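The `Integer.valueOf()` point in this commit message can be illustrated with a small sketch. `Integer.valueOf(int)` is specified to cache values from -128 to 127, so repeated calls in that range return the same object instead of allocating a new one each time. The class and method names below are hypothetical and only serve the illustration.

```java
// Hypothetical demo class, not YaCy code.
final class BoxingSketch {
    /** True if two valueOf() calls for the same int return the same object. */
    static boolean sharesCachedInstance(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // identity comparison on purpose, not equals()
    }

    public static void main(String[] args) {
        // 100 lies inside the guaranteed cache range of -128..127
        System.out.println(sharesCachedInstance(100));
    }
}
```

Outside the cached range the result depends on the JVM configuration, which is why application code should still compare boxed values with `equals()`.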
danielr
17b7845eb5 * refactoring
- moved constants from plasmaSwitchboard to own class (all 232 ;)
- moved remoteProxy-Methods to httpRemoteProxyConfig, better names
- removed some unnecessary code (else-statements)
* formatting (correct indentation)
* minor bugfixes (due to findbugs.sf.net)
* hopefully fixed "missing quote" (announcing StringParts as UTF-8)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5031 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-02 13:57:00 +00:00
danielr
3bb870bfcd added final where possible
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5030 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-02 12:12:04 +00:00
orbiter
c3d461d191 - removed superfluous copyright statement
- updated my email address

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5011 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-20 17:14:51 +00:00
orbiter
3ca98fee42 removed superfluous copyright statement
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5010 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-20 00:21:07 +00:00
danielr
d14e8d348f - times of LOCAL_SEARCH are shown in milliseconds (also in yacysearch.java ;)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5003 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-15 17:35:02 +00:00
orbiter
b38f467e3c better SRU compliance
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4976 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-07 21:50:24 +00:00
orbiter
a6719dfd2b - refactoring of robots parser
- no more keep-order parameter in remove (it was not possible to make this strict, and not useful)
- some small enhancements in balancer
- robots parser without references in switchboard
- changes synchronization in robots

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4969 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-05 00:35:20 +00:00
orbiter
474e29ce4a added options to configure the 'corporate identity'-icons, the home page link and the greeting line from
the skin menu. Additionally an example is given there how to integrate a search page with an iframe.
Please see the skin menu.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4967 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-03 23:37:04 +00:00
orbiter
c998dc6556 - added security functions to flush url and search caches in case that memory is full
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4933 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-16 21:39:58 +00:00
orbiter
994c609cf8 added new shell script to do a web search from the terminal
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4916 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-11 21:33:36 +00:00
orbiter
f5ef7f222e - fixed a bug in parser (directory paths had not been recognized)
- no access check when a search is made only local without snippet fetch
- added comment and status message in resourceObserver (this takes very long at startup time!)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4911 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-11 09:54:58 +00:00