35 Commits

Author SHA1 Message Date
orbiter
7ff86d6ba6 - image search now shows thumbnails (in bad order, but it works)
- repaired DHT selection

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3081 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-14 02:48:37 +00:00
orbiter
ee3d91cb6b print-out of links that result from constraint-filtering
in search results

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3078 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-13 01:39:34 +00:00
orbiter
e4570bffaf - implemented a specialized snippet-fetch for media content
- changed search result preparation for media search presentation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3073 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-12 02:09:25 +00:00
orbiter
1377c53aa3 extraction of media links from search results
these links are mixed into the snippets for testing purposes
(a final version will handle this differently)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3069 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-11 01:31:23 +00:00
orbiter
10d888e70c - added a media search for images, audio, video and applications
- new search options on search page
- new option in ViewInfo to display all links of a file
- enhanced collection data structure

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3054 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-07 02:40:57 +00:00
orbiter
9a85f5abc3 cleanup
- removed 'deleteComplete' flag; this was used especially for WORDS indexes
- shifted methods from plasmaSwitchboard to plasmaWordIndex

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3051 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-06 12:51:46 +00:00
orbiter
109ed0a0bb - cleaned up code; removed methods to write the old data structures
- added an assortment importer. the old database structures can
  be imported with
  java -classpath classes yacy -migrateassortments
- modified wordmigration. The indexes from WORDS are now imported
  to the collection database. The call is
  java -classpath classes yacy -migratewords
  (as it was)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3044 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-05 02:47:51 +00:00
orbiter
ad1e4aa88e added selection of audio, video, image and application resources
to search procedure. This function cannot currently be used through the
search interface, but only through remote search.

added accumulation of search attributes to enable the audio, video,
image and application selection.

fixed a problem with external URL representation generation


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3036 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-01 16:21:17 +00:00
orbiter
1697fa3dc0 added a 'more options' link to yacysearch page
(which refers to the index.html page with extended options activated)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3031 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-01 02:45:49 +00:00
orbiter
30888e7a2f implementation of search constraints
Such constraints may formulate specific restrictions on web searches.
This is implemented by scraping information for constraints from a web
page during parsing, and storing flags to the pages within the web index.

In this first step, only information for index pages ("index of", directory listings)
is scraped and stored in flags
- added new flag class kelondroBitfield
- added scraper method in condenser
- added bitfield structure for all scrape types (see also condenser)
- added bitfield structure for appearance locations (see RWIEntry)
- added handover protocol for remote search and index distribution
- extended kelondroColumn class to hold bitfield types
- added another search attribute on search page (index.html)
- extended search-filter to enable filtering of non-matching constraints
- set all new database types to be default
- refactoring: moved word hash generation to condenser class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2999 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-23 02:16:30 +00:00
orbiter
49a83f99d9 - fix for wrong DHT ordering in DHT selection
- fix for http://www.yacy-forum.de/viewtopic.php?t=3112&highlight=

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2995 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-21 00:36:41 +00:00
orbiter
e55ef0df28 - automatic migration of old RWI entries to new format during remote search
if new collections are activated
- one more assert in RowSet, control of removeMarker

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2993 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-20 22:55:27 +00:00
orbiter
bb7d4b5d5e refactoring to prepare new RWI entry object
- moved all url and index(RWI) entries to index package
- better naming to distinguish RWI entries and URL entries


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2937 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-08 16:17:47 +00:00
orbiter
b79e06615d - added new LURL.Entry class for next database migration
- refactoring of affected classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2802 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-18 22:25:07 +00:00
orbiter
a5dd0d41af - refactoring of plasmaCrawlLURL.Entry to prepare new Entry format
- added test migration method to migrate the old LURL to a new LURL
the new LURL will be split into different tables for each month
this solves several problems:
- the biggest table in YaCy is split into different parts and can
  also be managed in filesystems that are limited to 2GB
- the oldest entries can easily be identified, used for re-crawl and
  deleted
- The complete database can be limited to a specific size (as requested many times)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2755 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-12 23:14:41 +00:00
theli
a2e3095044 *) Bugfix. Add missing plasmaParserDocument.close() calls
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2680 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-30 10:09:01 +00:00
theli
b6c7b91582 *) Parser now throws a ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher)
*) better logging of parser failures
*) simplified usage of plasmaparser through switchboard
*) restructuring of crawler
   - crawler now returns an error message if it is used in sync mode (e.g. by snippet fetcher)
*) snippet-fetcher: more verbose error messages
*) serverByteBuffer.java: adding new function append(String,encoding)
*) serverFileUtils.java: adding functions to copy only a given number of bytes between streams


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2641 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 12:25:07 +00:00
orbiter
aa38721cf6 new features for surftipps
- new generation with less memory
- removal of duplicates
- positive votes can generate entries without original news (so they can live on)
- link deletions on search results are now also negative votes for surftipps (but they may rarely hit any news)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2640 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-20 12:01:51 +00:00
borg-0300
16ba5d1b46 topwords: only [a-z] words, quality is better;
blanks removed;
properties added;


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2632 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-19 10:44:45 +00:00
orbiter
3aac5b26da - added automatic tag generation when a web page from the search results is added
- added new image 'B' in front of search results for bookmark generation
- added news generation when a public bookmark is added
- the '+' in front of search results has a new meaning: positive rating for that result
- added news generation when a '+' is hit

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2613 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 00:37:02 +00:00
orbiter
96c6e4e322 - enhancements to detailed search page
- enhancements to search ranking computation process
- removed bugs in postranking

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-08 01:26:06 +00:00
orbiter
3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparison.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-13 01:21:53 +00:00
allo
f4d200ffa2 typo
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2281 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-07 20:44:45 +00:00
orbiter
00a5d435e2 - fixed some bugs with domain filter
- added new ranking filter "prefermask": urls that match the filter are ranked better


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2022 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-13 23:19:36 +00:00
orbiter
41afccaf34 small update to search interface
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2020 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-10 23:05:01 +00:00
orbiter
14d6e476c9 tried to solve some problems with new picture viewer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2019 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-10 22:34:47 +00:00
orbiter
d0dd8b14d2 fixed picture tag and presentation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2014 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-07 22:09:59 +00:00
orbiter
f0833b0328 introduced simple search interface
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2007 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-06 21:48:24 +00:00
orbiter
c5087710a4 fixed type/cat properties
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2002 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-05 10:47:42 +00:00
orbiter
47b541b2d1 added better option handling in yacysearch
added depth option for image presentation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2001 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-05 10:34:24 +00:00
orbiter
c9e16bfd48 first try to insert image search (does not work yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2000 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-04 23:12:10 +00:00
orbiter
e2e8d0c188 some kind of refactoring of yacysearch:
made 'room' for new picture search result presentation

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1993 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-03 22:47:59 +00:00
borg-0300
cb23fc3d83 keywords added
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1934 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-19 12:36:39 +00:00
borg-0300
1258df8133 no "[0-9]+" in topwords
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1933 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-19 12:33:50 +00:00
orbiter
f0a38873eb * added yacysearch page with better view on search results
the old search page is obsolete and will be removed
* ConfigBasic.html is now the default page instead of index.html
  as long as no password is set

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1815 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-04 18:52:04 +00:00