Commit Graph

312 Commits

Author SHA1 Message Date
orbiter
5841ee83d3 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6400 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-11 21:29:18 +00:00
orbiter
ce8dc575ca refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6398 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-11 00:12:19 +00:00
orbiter
bea3b99aff moved table and util classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6397 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-10 01:14:19 +00:00
orbiter
4446acc8cd moved kelondro order
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6392 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-09 23:22:22 +00:00
orbiter
f677d534b1 start of a really extensive refactoring which will produce a hierarchical package structure with the domain yacy.net as package root
- moved here the logging classes as part of the new net.yacy.kelondro package

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6391 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-09 23:13:30 +00:00
orbiter
735e2737e3 * added index segments
This is a major change in the organization of indexes.
Please consider a back-up of your data before you run this update.
All existing index files will be moved and renamed to a new position.
With this change, it will be possible to maintain different indexes for different purposes and it will be possible to have a distinction between DHT-in and DHT-out specific indexes. Tenants may also have their own index, and it may be possible to have histories and back-ups of indexes. This is just the beginning, many servlets must be adopted after this change, but all functions that had been there should still work.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6389 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-09 14:44:20 +00:00
orbiter
0c17b600c6 remote search by default off
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6365 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-30 15:06:29 +00:00
orbiter
c3a4aee255 some redesign with a possible fix for the ReferenceContainerCache.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6336 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-22 14:33:57 +00:00
orbiter
604c37927f used comparator for did-you-mean that uses index sizes for comparisment, but:
- limit comparisment to only the first 10 elements that had been sorted before without IO
- added a size cache to index computation because the size is computed at least twice in set comparator


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6306 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-08 13:48:17 +00:00
orbiter
a58d9cae7d - show location name in geolocalization search result
- added link from location icon to openstreetmap browser with coordinates

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6305 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-08 10:18:03 +00:00
orbiter
eaddf2d464 - corrected layout of map preview
- added caption to maps containing latitude and longitude information
- prevented that maps occur on second search page
- added location names to did-you-mean
- some refactoring of did-you-mean
- added equal and compareTo test to Coordinates class to make that work in set
- fixed utf-8 support for library files
- fixed a bug in images search icon view caption

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6294 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-04 23:33:47 +00:00
orbiter
2740d9dd79 added integration of osm maps for search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6291 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-04 14:32:36 +00:00
orbiter
1762a7bcd6 - moved DidYouMean to the data package
- added a DidYouMeanLibrary class that shall support the did you mean function with additional word lists


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6281 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-01 13:04:35 +00:00
orbiter
72e5407115 refactoring of snippet cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6268 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-08-27 14:34:41 +00:00
orbiter
e7736d9c8d more refactoring: made all variables in SearchEvent private
to prepare splitting of the class into two parts: local and remote search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6265 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-08-26 15:59:55 +00:00
orbiter
d8ca6e6bf1 more refactoring for search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6263 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-08-25 21:27:01 +00:00
orbiter
72ac5bd80f refactoring of search process.
this is the beginning of some architecture changes that will hopefully bring some more stability, speed and transparency to the search process.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6260 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-08-24 15:24:02 +00:00
orbiter
1d8d51075c refactoring:
- removed the plasma package. The name of that package came from a very early pre-version of YaCy, even before YaCy was named AnomicHTTPProxy. The Proxy project introduced search for cache contents using class files that had been developed during the plasma project. Information from 2002 about plasma can be found here:
http://web.archive.org/web/20020802110827/http://anomic.de/AnomicPlasma/index.html
We stil have one class that comes mostly unchanged from the plasma project, the Condenser class. But this is now part of the document package and all other classes in the plasma package can be assigned to other packages.
- cleaned up the http package: better structure of that class and clean isolation of server and client classes. The old HTCache becomes part of the client sub-package of http.
- because the plasmaSwitchboard is now part of the search package all servlets had to be touched to declare a different package source.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-19 20:37:44 +00:00
orbiter
5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
- The indexing queue was a historic data structure that was introduced at the very beginning at the project as a part of the switchboard organisation object structure. Without the indexing queue the switchboard queue becomes also superfluous. It has been removed as well.
- Removing the switchboard queue requires that all servlets are called without a opaque generic ('<?>'). That caused that all serlets had to be modified.
- Many servlets displayed the indexing queue or the size of that queue. In the past months the indexer was so fast that mostly the indexing queue appeared empty, so there was no use of it any more. Because the queue has been removed, the display in the servlets had also to be removed.
- The surrogate work task had been a part of the indexing queue control structure. Without the indexing queue the surrogates needed its own task management. That has been integrated here.
- Because the indexing queue had a special queue entry object and properties attached to this object, the propterties had to be moved to the queue entry object which is part of the new indexing queue withing the blocking queue, the Response Object. That object has now also the new properties of the removed indexing queue entry object.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6225 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-17 13:59:21 +00:00
f1ori
8931c8d6b4 improvments to debianpackage:
* autoupdate completely disabled, display hint
* restart-button in interface works!

* moved all build-Variables to yacyBuildProperties
* fixed some warnings


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6195 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-11 17:03:22 +00:00
orbiter
0e8647d62f refactoring of search classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6184 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-08 22:14:57 +00:00
orbiter
dafffd0153 refactoring of parsers and document processing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6182 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-08 21:48:08 +00:00
orbiter
154bbc3364 code cleanup: call of static methods directly to the class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6155 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-30 13:01:35 +00:00
orbiter
222850414e simplification of the code: removed unused classes, methods and variables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6154 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-30 09:27:46 +00:00
orbiter
99fa265e1d fix for search bug caused by tenant patch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6125 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-22 22:31:29 +00:00
orbiter
57af311627 fix for wrong urls in navigator when a tenant is used
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6119 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-22 12:25:18 +00:00
orbiter
be1c7ddc64 refactoring of search classes -- moved Ranking Profile to search package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6086 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-16 21:45:40 +00:00
orbiter
b5bc399cea added necessary synchronization for logging statistics (causes deadlock)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6083 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-16 10:37:13 +00:00
orbiter
ce1adf9955 serialized all logging using concurrency:
high-performance search query situations as seen in yacy-metager integration showed deadlock situation caused by synchronization effects inside of sun.java code. It appears that the logger is not completely safe against deadlock situations in concurrent calls of the logger. One possible solution would be a outside-synchronization with 'synchronized' statements, but that would further apply blocking on all high-efficient methods that call the logger. It is much better to do a non-blocking hand-over of logging lines and work off log entries with a concurrent log writer. This also disconnects IO operations from logging, which can also cause IO operation when a log is written to a file. This commit not only moves the logger from kelondro to yacy.logging, it also inserts the concurrency methods to realize non-blocking logging.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6078 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-15 21:19:54 +00:00
orbiter
bc6dd8194b refactoring: moved search query class to new search package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6075 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-15 11:49:00 +00:00
apfelmaennchen
303ccda69f small fix for "did you mean"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6063 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 11:11:30 +00:00
orbiter
7c4d1d471c hand-over of more specific object
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6062 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 10:22:25 +00:00
apfelmaennchen
9150bc0f7d - don't show empty "did you mean"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6061 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 07:02:50 +00:00
apfelmaennchen
09acfa66d1 - improved "did you mean"
- added &meanCount= to query string
- &meanCount=0 ==> no suggestion, no performance loss
- sorting suggestions by sb.indexSegment.termIndex().count()

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6059 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-13 06:20:05 +00:00
apfelmaennchen
54a48b4184 - added "did you mean" to search page
- currently works for single word queries only!

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6057 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-12 20:36:03 +00:00
orbiter
733385cdd7 enahnced database access times by removal of unnecessary synchronization.
added also more hacks that resulted from high-volum query testing

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6047 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 23:02:42 +00:00
orbiter
27fa6a66ad - completed the author navigation
- removed some unused variables

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6037 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-08 23:30:12 +00:00
orbiter
c879783008 added steering of navigator computation:
- by default the navigator computation if off for servlet yacysearch.html, but:
- the servlet is called by default with a option to switch navigator results on
this will prevent that metasearch users will get slow results that are caused by unnecessary computations

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6035 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-07 22:51:15 +00:00
orbiter
c079b18ee7 - refactoring of IntegerHandleIndex and LongHandleIndex: both classes had been merged into the new HandleMap class, which handles (key<byte[]>,n-byte-long) pairs with arbitraty key and value length. This will be useful to get a memory-enhanced/minimized database table indexing.
- added a analysis method that counts bytes that could be saved in case the new HandleMap can be applied in the most efficient way. Look for the log messages beginning with "HeapReader saturation": in most cases we could save about 30% RAM!
- removed the old FlexTable database structure. It was not used any more.
- removed memory statistics in PerformanceMemory about flex tables and node caches (node caches were used by Tree Tables, which are also not used any more)
- add a stub for a steering of navigation functions. That should help to switch off naviagtion computation in cases where it is not demanded by a client

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6034 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-07 21:48:01 +00:00
orbiter
a0c53abbe1 - wait until local results are computed during search, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2167&hilit=&p=15521#p15521
- show only x+1 pages in page navigator

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6022 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-04 20:58:47 +00:00
orbiter
1c77db670f re-designed response format for navigation:
- changed json and rss response templates


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6019 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-04 10:54:49 +00:00
orbiter
a5d481eab1 enhanced navigation
- fixed too early computation of navigation
- moved navigation rendering to yacysearchtrailer
- added more asserts

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6006 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-01 22:45:28 +00:00
orbiter
88426912ad more refactoring to make the segment object easier to use and to be prepared to integrate author navigation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5992 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-29 10:03:35 +00:00
orbiter
99bf0b8e41 refactoring of plasmaWordIndex:
divided that class into three parts:
- the peers object is now hosted by the plasmaSwitchboard
- the crawler elements are now in a new class, crawler.CrawlerSwitchboard
- the index elements are core of the new segment data structure, which is a bundle of different indexes for the full text and (in the future) navigation indexes and the metadata store. The new class is now in kelondro.text.Segment

The refactoring is inspired by the roadmap to create index segments, the option to host different indexes on one peer.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5990 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-28 14:26:05 +00:00
orbiter
63a0255166 - refactoring: added new content package, which will contain connector classes for different types of data sources to import texts into the YaCy index
- refactoring: migrated data objects for the new connector classes
- added a DAO interface class to specify an abstract interface for database retrieval connector methods

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5977 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-26 07:44:22 +00:00
orbiter
f246928c20 first attempt to add 'real' Navigation to yacy search results: host navigation
- after a search is started, it is analysed how many hits are in each site
- this can be done really efficient, because the navigation information is hidden in the url hash and can be computed very fast
- the search result shows a column on the right with the hosts and the hits per host
- after a click on a host the search is modified using the efficient site: - operator

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5976 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-25 22:27:34 +00:00
orbiter
a642d6a7b5 - added navigation icons for search result pages
- modified result page rendering to use new icons instead of numbers
- set different default values in yacy.init for higher indexing performance; removed pro-values
- modified WatchCrawler to accept 30000 PPM instead of only a maximum of 6000 PPM

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5952 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-14 23:11:10 +00:00
lotus
0e01e846ef small fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5898 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-28 18:59:56 +00:00
lotus
d8fca85c11 fixed search: allow dots in operators
added new operator "tld:" which was the former "site:"
"site:" uses fast site operator introduced in r5770

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5897 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-28 17:12:31 +00:00
orbiter
1b8d346b4c fixes in connection with transiton to byte[] hashes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5843 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-20 21:54:00 +00:00
orbiter
c8624903c6 full redesign of index access data model:
terms (words) are not any more retrieved by their word hash string, but by a byte[] containing the word hash.
this has strong advantages when RWIs are sorted in the ReferenceContainer Cache and compared with the sun.java TreeMap method, which needed getBytes() and new String() transformations before.
Many thousands of such conversions are now omitted every second, which increases the indexing speed by a factor of two.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5812 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-16 15:29:00 +00:00
orbiter
12d81e98eb - fixed bad search results when searching for empty string
- simplified result handling and page composition in case that nothing was searched


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5807 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-15 11:22:43 +00:00
orbiter
c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
This is a preparation to introduce other index tables as used now only for reverse text indexes. Next application of the reverse index is a citation index.
Moved to version 0.74

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5777 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-03 13:23:45 +00:00
orbiter
7ba078daa1 - added fast site-operator
- refactoring merge into BLOBArray

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5770 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-04-02 13:26:47 +00:00
orbiter
587838bd09 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5758 6c8d7289-2bf4-0310-a012-ef5d649a1542 2009-03-30 21:13:53 +00:00
orbiter
83792d9233 more refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5722 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-16 16:24:53 +00:00
orbiter
209f25f5f5 refactoring to integrate indexCell data structures
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5718 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-16 00:18:37 +00:00
orbiter
7f67238f8b refactoring of plasmaWordIndex: less methods in the class, separated the index to CachedIndexCollection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5710 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-13 14:56:25 +00:00
orbiter
14a1c33823 refactoring of wordIndex class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5709 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-13 10:34:51 +00:00
orbiter
e2e7949feb replaced old PPM computation with a better one that simply sums up events that had been stored in the profiling table.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5706 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-13 00:13:47 +00:00
apfelmaennchen
4f3bdc64b5 - added ?callback= parameter for JsonP support
- this is needed for json ajax cross domain calls
- see: http://bob.pythonmac.org/archives/2005/12/05/remote-json-jsonp/

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5674 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-07 10:18:47 +00:00
orbiter
aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5664 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-02 11:04:13 +00:00
orbiter
6ffc6e3389 more refactoring of indexer and kelondro classes;
- integrating the indexer into kelondro as package 'text'
- renaming of classes in kelondro.index

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5663 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-02 10:00:32 +00:00
orbiter
76ef5f0f14 refactoring of index package: better names for the classes (to be continued)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5661 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-01 23:58:14 +00:00
orbiter
c25c334b75 replaced old DHT transmission method with new method. Many things have changed! some of them:
- after a index selection is made, the index is splitted into its vertical components
- from differrent index selctions the splitted components can be accumulated before they are placed into the transmission queue
- each splitted chunk gets its own transmission thread
- multiple transmission threads are started concurrently
- the process can be monitored with the blocking queue servlet
To implement that, a new package de.anomic.yacy.dht was created. Some old files have been removed.
The new index distribution model using a vertical DHT was implemented. An abstraction of this model
is implemented in the new dht package as interface. The freeworld network has now a configuration
of two vertial partitions; sixteen partitions are planned and will be configured if the process is bug-free.
This modification has three main targets:
- enhance the DHT transmission speed
- with a vertical DHT, a search will speed up. With two partitions, two times. With sixteen, sixteen times.
- the vertical DHT will apply a semi-dht for URLs, and peers will receive a fraction of the overall URLs they received before.
  with two partitions, the fractions will be halve. With sixteen partitions, a 1/16 of the previous number of URLs.
BE CAREFULL, THIS IS A MAJOR CODE CHANGE, POSSIBLY FULL OF BUGS AND HARMFUL THINGS.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5586 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-10 00:06:59 +00:00
orbiter
86763c42c4 enhanced interactive search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5571 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-04 15:59:22 +00:00
orbiter
9d282d2c16 - renamed interactivesearch to yacyinteractive
- added a configuration option to set the pop up page in Config Appearance
- added a minimized header option to yacyinteractive
- fixed a bug in yacysearch: default values when no query is done


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5569 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-03 13:04:02 +00:00
orbiter
ef82cced01 removed default line 'P2P WEB SEARCH' if no line is given
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5553 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-01 00:43:52 +00:00
orbiter
94110df85a moved logging partially to kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5545 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-31 01:06:56 +00:00
orbiter
024da2916b refactoring of logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 23:33:47 +00:00
orbiter
83ce65707a (almost) completed partition of classes in kelondro
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5543 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:44:20 +00:00
orbiter
7ee494fde5 more refactoring of kelondro:
- seperated BLOB from table classes
- renamed 'coding' package to 'order'

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5542 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 22:08:08 +00:00
orbiter
bf93767ec6 refactoring of kelondro database classes
(to be continued)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5540 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 15:33:00 +00:00
orbiter
fc27bf8c4c refactoring of kelondro classes:
kelondro shall become independent from other packages.
moved bytebuffer, date and memory to kelondro

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5539 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 14:48:11 +00:00
low012
b41a06228f *) cleaning up...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5529 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-28 19:48:52 +00:00
low012
ce81391095 *) using parameters like site: in the search field does not affect urlmask anymore
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5528 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-28 19:37:24 +00:00
low012
80e6356860 *) r 5512 has introduced a bug which resulted in useless filters if site:, filtetype:, or inurl: was used since the filters included the word "null".
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5517 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-25 22:16:49 +00:00
lotus
5078e837ac better readability / no functional changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5512 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-22 13:41:11 +00:00
lotus
c7c291bc6b allow simultaneous inurl: site: and filetype: search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5478 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-12 14:59:27 +00:00
orbiter
9ef77d57f5 added an access control to the search interface using white/blacklists:
in the network configuration, you can configure a whiteliste and a blacklist
- blacklistet clients cannot search
- whitelistet client get never any search restrictions
- for all other clients: apply DoS search restrictions
Please see the example configuriation in yacy.network.freeworld.unit
by default, all clients from localhosts get whitlistet.
If you have your own YaCy network, please put all the IPs of your peers into the whitelist

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5475 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-12 10:55:48 +00:00
lotus
4641ecd6d9 inurl: search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5456 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-08 18:59:29 +00:00
lotus
0d1bd78674 * full site: syntax support e.g. site:de.wikipedia.org
possible if dots in query would work yet

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-07 21:05:07 +00:00
orbiter
9bed4de280 fix for the search bug introduced in SVN 5449
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5451 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 23:16:10 +00:00
orbiter
b2b7edae18 fixed interactive search
- added dummy servlet class, because otherwise the template engine is not triggered.
thats so because the yacy httpd works much faster as normal file server without a scan
of the served pages. Therefore each page with templates must now have a class file associated to it.
- fixed json output format of yacysearch

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5449 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 20:04:09 +00:00
lotus
ca80930892 accept leading dots on filetype: and site: search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5444 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-06 10:04:24 +00:00
low012
1af728ae09 *) regex for site operator changed as proposed by Lotus
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5441 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-05 18:30:34 +00:00
low012
9e58ae036d *) added site operator which can be used to only show results from a certain domain. example: "test site:edu" shows only documents which contain the word test and which come from an edu domain
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5439 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-04 14:58:32 +00:00
orbiter
28d2d28573 added support for filetype search
(just use filetype:<type> in the search query)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5418 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-29 17:57:04 +00:00
orbiter
47292e696a more performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5379 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-04 12:54:16 +00:00
orbiter
d39d420b39 performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5376 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-03 15:38:29 +00:00
orbiter
0b4808ba3d added new interactive search feature:
- during the user types search queries, the local database is searched
- results are presented interactively

This was implemented using a new JSON result format for search results in YaCy
- added JSON as file format for servlets
- refactoring of current search servlets (xml and html)
- added JSON output format for search results
- added AJAX-based search page, that uses the yacysearch.json selrvlet to print results as a query is typed

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5373 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-12-02 15:24:25 +00:00
lotus
1951d30a62 addendum to last commit
handle words with length < 3 correctly

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5369 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-26 19:43:40 +00:00
orbiter
0edec2b760 FULL redesign of algorithms in htmlTools to encode/decode strings from/to unicode and html.
The old process used a not really efficient way to detect html encoding strings in texts.
All calling methods had been adoped to call the new class in an enhanced way with less parameters.

Many classes in interfaces used a XML encoding only (instead of full html conversion from unicode to html); this behavior was not changed with this commit but should be controlled again since it points out possible XSS leaks

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-22 18:59:04 +00:00
low012
77e41da7d2 *) further propagation of display value (see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1536)
*) removed another depreciated parameter "time" which led to ugly -UNRESOLVED_PATTERN- in URL

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5285 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-18 19:39:46 +00:00
lotus
7782a43060 fix if LANGUAGE: was not defined and the end of the query
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5242 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-03 11:36:17 +00:00
lotus
fe2792e9ce use accept-language header instead of user agent for language detection
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5235 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-01 17:47:11 +00:00
lotus
93ddf206e6 opensearch fix if user agent had no language
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5233 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-30 20:13:18 +00:00
orbiter
6e7d113eac fix for wrong index initialization after network switch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5203 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-23 23:30:25 +00:00
orbiter
1198eeecc7 added language selection to search query:
- the language can be selected using a LANGUAGE:<language> element in the query line, i.e.:
java LANGUAGE:en
- the language can be selected with a post element in google-style syntax with the 'rl' element:
?lr=lang_en&query=java

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5193 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-21 07:28:57 +00:00
orbiter
00c1535f84 added ranking and evaluation of language type in a search
the wanted language is taken from the browser user-agent string

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5192 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-21 00:04:42 +00:00