Commit Graph

62 Commits

Author SHA1 Message Date
orbiter
30d79d69a6 fix for wrong display of search statistics
see http://www.yacy-forum.de/viewtopic.php?p=31242#31242

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3352 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-08 10:42:35 +00:00
orbiter
c464157a6e replaced some toString()
see http://www.yacy-forum.de/viewtopic.php?p=31151#31151

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-06 16:26:56 +00:00
orbiter
c2d6edf21d integrated number of remote targets as 'partitions' into remote search protocol
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3317 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-01 13:27:23 +00:00
orbiter
4f6eed5623 QPM increment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3309 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-31 16:21:20 +00:00
orbiter
f3f99b19c6 extended search statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3249 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 01:45:29 +00:00
orbiter
c0851ee943 refactoring: moved and renamed de.anomic.data.searchResults to plasma package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3248 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 00:38:03 +00:00
orbiter
76fab83395 fixed bugs in seach statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3240 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-18 00:26:16 +00:00
allo
0c81bd39d4 XSS-safe put as default.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3217 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-16 14:07:54 +00:00
orbiter
52c6461e6b some bugfix for statistics
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3211 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-15 16:03:00 +00:00
(no author)
fe72b772cf added a monitor page for search requests
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3206 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-15 01:50:57 +00:00
orbiter
0a050bc043 enhanced ranking
- redesign of data storage in plasmaSearchRankingProfile
- profiles are extended by new ranking parameters
- new RWI ranking parameters are considered during ranking
- appearance attributes (i.e. emphasised text) is now considered
- faster ranking
- some attributes that had been checked during post-ranking can now be
  checked during pre-ranking phase
- removed old ranking parameter on index.html page (will be replaced by profiles in the future)
- ranking can now consider appearances of media content
- snippet-loading for media types now work correctly (fetches only from the wanted media)
- ranking-profiles can be handed over the remote peers and apply there also
- re-search of same query with different domain now also re-triggers remote search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3105 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-20 15:44:29 +00:00
orbiter
1377c53aa3 extraction of media links from search results
these links are mixed to the snippets for testing purpose
(a final version will handle this differently)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3069 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-11 01:31:23 +00:00
orbiter
109ed0a0bb - cleaned up code; removed methods to write the old data structures
- added an assortment importer. the old database structures can
  be imported with
  java -classpath classes yacy -migrateassortments
- modified wordmigration. The indexes from WORDS are now imported
  to the collection database. The call is
  java -classpath classes yacy -migratewords
  (as it was)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3044 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-05 02:47:51 +00:00
orbiter
ad1e4aa88e added selection of audio, video, image and application resources
to search procedure. This function can currently not used through the
search interface, but only through remote search.

added accumulation of search attributes to enable the audio, video,
image and application selection.

fixed a problem with external URL representation generation


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3036 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-01 16:21:17 +00:00
orbiter
ceb9e3aa17 - enhanced parser: collection of audio, video, image and application links
- enhanced condenser: better handling of utf-8 and pre-formatted texts


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3017 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-28 15:00:15 +00:00
orbiter
8e7215475b - extended ViewFile to use is as debugging-tool: you can now use the
post-parameter url to submit an url directly
- fixed some bugs in text parser (not all parts had been analysed)
- fixed a bug in remote search interface (could not handle constraints)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3001 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-23 15:47:19 +00:00
orbiter
30888e7a2f implementation of search constraints
Such constraints may formulate specific restrictions to web searches
This is implemented by scraping information for constraints from a web
page during parsing, and storing flags to the pages within the web index.

In this first step, only information for index pages ("index of", directory listings)
are scraped and stored in flags
- added new flag class kelondroBitfield
- added scraper method in condenser
- added bitfield structure for all scrape types (see also condenser)
- added bitfield structure for appearance locations (see RWIEntry)
- added handover protocol for remote search and index distribution
- extended kelondroColumn class to hold bitfield types
- added another search attribute on search page (index.html)
- extended search-filter to enable filtering of non-matching constraints
- set all new database types to be default
- refactoring: moved word hash generation to condenser class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2999 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-23 02:16:30 +00:00
orbiter
497428c8ec refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2949 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-10 01:13:33 +00:00
orbiter
bb7d4b5d5e refactoring to prepare new RWI entry object
- moved all url and index(RWI) entries to index package
- better naming to distinguish RWI entries and URL entries


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2937 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-08 16:17:47 +00:00
orbiter
8fdefd5c68 generalization of payload definition of index storage
this is one step forward to the migration to a new collection data format

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2912 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-05 02:10:40 +00:00
orbiter
d3431433b0 more anonymization in logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2876 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-28 22:28:33 +00:00
orbiter
b79e06615d - added new LURL.Entry class for next database migration
- refactoring of affected classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2802 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-18 22:25:07 +00:00
orbiter
a5dd0d41af - refactoring of plasmaCrawlLURL.Entry to prepare new Entry format
- added test migration method to migrate the old LURL to a new LURL
the new LURL will be splitted into different tables for each month
this solves several problems:
- the biggest table in YaCy is splitted in different parts and can
  also be managed in filesystems that are limited to 2GB
- the oldest entries can easily be identified, used for re-crawl und
  deleted
- The complete database can be limited to a specific size (as wanted many times)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2755 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-12 23:14:41 +00:00
orbiter
afbb547f3d extended options for abstracts generation in remote search interface
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2739 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-10 12:22:16 +00:00
orbiter
2e4aa6a170 refactoring of Advanced Config:
- removed settings that are in Basic Settings
- joined pages that belong together
- moved include pages from yacy/ to /

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2726 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-09 10:24:54 +00:00
orbiter
dbc2e039bb added time-out option parameter to call hierarchy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2691 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-02 09:40:18 +00:00
orbiter
00746ca232 identified and fixed search performance problem caused by
snippet loading. Some access to header-db had been twice and even
more times in some cases. Snippet resource loading fixed.
Furthermore the snippet loading during remote search within the
remote peer has been disabled, but can be switched on remotely by
new flag 'includesnippet=true'

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2688 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-02 01:15:02 +00:00
orbiter
df1629b05a - code cleanup
- version 0.471
- moved surftipps to own web page


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-29 22:27:20 +00:00
orbiter
3aac5b26da - added automatic tag generation when a web page from the search results is added
- added new image 'B' in front of search results for bookmark generation
- added news generation when a public bookmark is added
- the '+' in front of search results has new meaning: positive rating for that result
- added news generation when a '+' is hit

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2613 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 00:37:02 +00:00
orbiter
6e2907135a bugfixes for remote search server part
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2573 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 22:19:34 +00:00
orbiter
cf9884e22b first attempt to implement a secondary search
this is a set of search processes that shall enrich search results
with specialized requests to realize a combination of search results
from different peers.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2571 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-13 17:13:28 +00:00
orbiter
75b198bc02 - updated references to indexContainer
- more bugfixes and debugging for indexAbstract processing

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2555 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-12 11:13:27 +00:00
orbiter
4f9e42d5ed more changes towards better join-search
- fixed problems with index-abstract generation
- added analysis output for index abstract receive

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2551 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-12 00:42:42 +00:00
orbiter
82a6054275 - fixed bug with new indexAbstract generation
- added partly evaluation of indexAbstracts during remote searches

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2544 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-11 10:39:25 +00:00
orbiter
74d1dea30b changes towards better join-search
- added generation of a compressed index within remote peers during global search
- added selection of specific urls within remote peers during secondary global search


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2539 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-10 22:36:47 +00:00
orbiter
96c6e4e322 - enhancements to detailed search page
- enhancements to search ranking computation process
- removed bugs in postranking

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-08 01:26:06 +00:00
orbiter
8a0e35618b enhancements to search result preparation
- added detailed count on remote search results
- enhanced search sequence during remote searches (doing local search in sequence)
- strict adherence to timout limits

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2497 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-06 17:51:28 +00:00
orbiter
a474669338 start with refactoring of index management
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2110 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-16 16:11:55 +00:00
orbiter
dbe96e6541 added hand-over of search filter and prefer ranking to yacy protocol
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2029 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-20 10:15:00 +00:00
orbiter
00a5d435e2 - fixed some bugs with domain filter
- added new ranking filter "prefermask": urls that match the filter are ranked better


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2022 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-13 23:19:36 +00:00
orbiter
bd283b8443 fixed bugs:
- null pointer exception during startup of a robinson-configured peer
- wrong time calculation of default value of re-crawl option

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2005 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-06 16:28:28 +00:00
orbiter
3286b1f498 re-organisation of lurl-creation and -stacking
this was necessary to prevent useless write to the database
in case of blacklist appearance of the url

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1905 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-17 10:16:07 +00:00
orbiter
bae3783d38 added a snippet marking
(search words are now bold in snippets)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1823 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-05 01:11:06 +00:00
orbiter
fb7411d7bb re-structuring of ranking application:
concentration of all ranking attributes in the
plasmaSearchRankingProfile

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1541 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-05 01:47:51 +00:00
orbiter
d98418390b - introduced rankingProfile Class
- selection of ranking and timing profiles for each search


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1539 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-04 23:51:00 +00:00
orbiter
03c65742ba changes towards the new index storage scheme:
- replaced usage of temporary IndexEntity by EntryContainer
- added more attributes to word index
- added exact-string search (using quotes in query)
- disabled writing into WORDS during search; EntryContainers are used instead


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1485 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-30 00:42:38 +00:00
orbiter
f4ffa9aee5 - implemented more attributes to index entries
- implemented hand-over of new word index attributes during remote search
- implemented word-distance computation during search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1382 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-20 15:14:21 +00:00
orbiter
37f88b4017 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1176 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-06 23:51:29 +00:00
orbiter
7920e1547d code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1163 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-05 09:13:13 +00:00
orbiter
b9cc9029e3 added ybr selection for remote search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1119 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-22 16:10:24 +00:00