Commit Graph

330 Commits

Author SHA1 Message Date
orbiter
104318d58a - added nice colors to feed indexing state messages
- added a 'remove all' button for new and scheduled rss feed list
- made adding of new rss feeds concurrent so interface is more responsive

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7078 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-27 11:56:51 +00:00
orbiter
e10cd115a9 - added a new RSS reader interface. This is not finished but you can now load and look at RSS feeds. It will be used to index RSS feeds in a way that is appropriate for such kind of data.
- refactoring of Mediawiki and PHPBB3 loader interface names (just renamed)
- removed two old not used RSS loader interfaces
- fixed a bug in RSS parser library of cora
- added a new RSS parser component to the set of yacy document parsers

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7053 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-20 11:30:02 +00:00
orbiter
5a994c9796 added a scheduler based on API actions
- every process that is monitored with the API Steering interface can now be scheduled!
- added input methods in Steering interface to set a scheduling time
- added a view on the steering api that shows only crawl jobs inside the Crawl Profile servlet
- added a scheduling call process in the cleanup process handler that triggers the scheduled processes
As a result, the cleanup now also looks for scheduled processes. Such processes are therefore not executed exactly
at the given target execution time; they are executed within the cleanup process time window.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7050 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-19 12:13:54 +00:00
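
The scheduling behaviour described in the commit above (scheduled processes are picked up by the cleanup process rather than run at their exact target time) can be illustrated with a minimal sketch. This is not YaCy's code, which is written in Java; the names `ScheduledAction` and `cleanup_pass` are hypothetical and only show the idea that an action runs at the first cleanup pass after its due time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List


@dataclass
class ScheduledAction:
    """Hypothetical stand-in for a recorded API action with a target time."""
    name: str
    due: datetime
    run: Callable[[], None]


def cleanup_pass(now: datetime, actions: List[ScheduledAction]) -> List[str]:
    """Run every action whose target time has passed and drop it from the list.

    Because this is only called when the cleanup runs, an action executes
    at the first cleanup pass after its due time, not at the due time itself.
    """
    due = [a for a in actions if a.due <= now]
    pending = [a for a in actions if a.due > now]
    for action in due:
        action.run()
    actions[:] = pending  # keep only the actions that are still waiting
    return [a.name for a in due]
```

With a cleanup that runs every ten minutes, an action due at 12:05 would run at the 12:10 pass in this sketch.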
orbiter
63c5634b0f added online documentation for ranking configuration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6997 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-26 10:08:21 +00:00
mikeworks
aa663cda4d ConfigUpdate_p.html and ConfigUpdate_p.java: Added check for downloaded releases and disabled buttons in case no new releases available
de.lng: Updated German translation for additional String in ConfigUpdate_p.html
XHTML 1.0 Strict fixes for all the other .html files
yacy/ui/css/yacyui-portalsearch.css: added .hidden class that was removed from ConfigProperties_p.html
Switchboard.java: Added URL for thread Remote Crawl Job and set URL for Remote Crawl URL Loader to null to fix empty href=""

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6996 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-25 13:19:16 +00:00
mikeworks
0f248e7433 ConfigBasic.html: XHTML 1.0 Strict fixes
DictionaryLoader_p.html: Filled <dt> elements to eliminate warnings
Moved CSS for portalsearch field from header to metas template because it belongs in the <head>er
yacyui-portalsearch.css: Added #yacylivesearch form { display: inline; } because XHTML 1.0 Strict does not allow <form><input> and the added <p> would otherwise provoke a line break
de.lng: Updates translations for added <dt> elements and deactivated statement in DictionaryLoader_p.html


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6994 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-07-24 15:26:53 +00:00
orbiter
11b7853940 added a configuration page for search heuristics. currently you can switch on there:
- a site-operation heuristic that loads all direct links from a portal page if the site-operator is used
- a direct crawl for search results from scroogle for the given search terms
The configuration page can be found directly beside the network configuration page


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6951 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-06-27 21:38:16 +00:00
orbiter
dcd01698b4 added a 'transition feature' that shall lower the barrier to move from g**gle to yacy (yes!):
Here a new concept called 'search heuristics' is introduced. A heuristic is a kind of 'shortcut' to good results in IT, here for good search results. In this case it will be used to get a very transparent way to compare what YaCy is able to produce as search result and what g**gle produces as search result. Here is what you can do now:
- add the phrase 'heuristic:scroogle' to your search query, like 'oil spill heuristic:scroogle' and then a call to scroogle is made to get anonymous search results from g**gle.
- these results are _not_ taken as meta-search results, but are used to instantly feed a crawling and indexing process. This happens very fast: 20 results from scroogle are taken and loaded simultaneously, parsed and indexed immediately, and the search result is fed from the parsed content, alongside the normal p2p search
- when new results from that heuristic (more to come) become part of the search results, it is verified whether such results are redundant (they were already part of the normal YaCy search result anyway) or completely new to YaCy.
- in the search results the new search results from heuristics are marked with a 'H ++' and search results from heuristics that had been already found by YaCy are marked with a 'H ='. That means:
- you can now see YaCy and Scroogle search results in one result page but you also see that you would not have 'missed' the g**gle results if you had used only YaCy.

- to make it short: YaCy now subsumes g**gle results. If you use only YaCy, you miss nothing.

to come: a configuration page that lets you configure the usage of heuristics and get this feature by default.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6944 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-06-25 16:44:57 +00:00
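
The 'H ++' / 'H =' marking described in the commit above can be sketched as follows. This is only an illustration of the comparison, not YaCy's implementation (which is Java); the function name `mark_heuristic_results` and the use of plain URL strings are assumptions.

```python
from typing import Dict, Iterable


def mark_heuristic_results(yacy_urls: Iterable[str],
                           heuristic_urls: Iterable[str]) -> Dict[str, str]:
    """Label each heuristic result as new to YaCy or already known.

    'H ++' : the result came only from the heuristic (new to YaCy)
    'H ='  : the result was already part of the normal YaCy result
    """
    known = set(yacy_urls)
    return {url: ("H =" if url in known else "H ++") for url in heuristic_urls}
```

If every heuristic result comes back as 'H =', the normal YaCy search had already found everything the heuristic delivered.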
orbiter
a33f39832e - small change in display of use cases
- explain usage of ftp, smb and file search domains

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6913 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-06-06 23:26:04 +00:00
orbiter
1610c81dff fixes for embedded search / search widget
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6911 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-06-01 22:09:17 +00:00
orbiter
431852f0a7 testing new 'search on map' image (slightly larger)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6896 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-05-21 13:12:47 +00:00
orbiter
1defd580bc - added option to localization search to distinguish between a search for a location according to the search word only or for the relation between a web search results and locations found in the metadata fields
- used that to display two layers on map: cities and search result locations
- added many marker graphics for the display of the markers on the map
- some refactoring of the yacy news code plus bugfixes for latest move from Tree to Table data structure

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6889 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-05-19 12:53:09 +00:00
orbiter
1e8c6cefae - added 'search on map' - Link to search result page
- added default search option to location search
- show default search in search window on location search page
- added icon for location search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6886 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-05-18 14:48:54 +00:00
orbiter
3661cb692c added dictionary loader servlet that can be used to get the geolocalization file:
/DictionaryLoader_p.html
Will also be used for more dictionary files in the future

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6872 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-05-14 09:52:53 +00:00
orbiter
0769517129 added a robots.txt monitor in the crawler monitor submenu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6733 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-09 11:31:15 +00:00
orbiter
8c88abf685 added follow-me link for twitter in status hints
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6729 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-07 23:29:29 +00:00
orbiter
270fb38674 - fixed some bugs in Table viewer
- added 'select all' feature in Tables_p
- enhanced ViewFile.html: has now an input field to load arbitrary resources from the web and analyze them (!!!)
- included the ViewFile servlet into the Index Administration menu
- show in ViewFile if resource is in url-db and/or in Web cache
- bugfixes to BEncodedHeap and Tables management

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6713 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-05 15:41:15 +00:00
orbiter
047f8718a7 added kiosk-mode button on standard search page and interactive search page
see also:
http://forum.yacy-websuche.de/viewtopic.php?p=19178#p19178

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6667 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-15 12:54:53 +00:00
orbiter
ac492fa2a5 added a close button image
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6666 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-15 10:40:33 +00:00
suessthomas
9e14958115 minor corrections and bug fixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6663 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-11 15:05:38 +00:00
orbiter
8a76f38d26 Added a new steering servlet that can be used to repeat actions that had been made on the yacy interface. This can be used to:
- start again a previously started crawl
- submit settings (again). This option will be used to transmit
  all settings of one peer to another peer if the remote-peer
  steering function is ready
This steering framework will also be used for a 'schedule-everything'
which will also include a new scheduler for crawling.
  

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6642 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-02-03 09:31:12 +00:00
suessthomas
a29b17a2fd Minor HTML changes, images recompressed.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6627 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-27 21:38:54 +00:00
orbiter
8ce936bcdd added an api recording function: it shall be possible to record
all operations on YaCy in a database that should make it possible
1) to re-create a setting on fresh peers
2) to transmit a setting from one peer to another
3) to re-create crawl starts after a complete deletion of the index
This functionality will also support
4) scheduled re-crawls (new implementation)
To implement this, a new database structure has been created that stores maps in blob heaps. To encode maps, the b-encoding technique was used (this is the same encoding that torrent files use)
- added a b-encoder
- enhanced the b-decoder
- added a b-encoded map heap data structure
- added a table organisation based on b-encoded heaps
- added a servlet to maintain such tables (see Tables_p.html)
- integrated the servlet into the Advanced Settings menu
- added an api recording based on the new tables

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6606 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-21 22:06:03 +00:00
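
The b-encoding of maps mentioned in the commit above follows the format used by torrent files. The following is a minimal sketch of that standard format, not YaCy's Java b-encoder; the function name `bencode` and the example keys are assumptions made for illustration.

```python
def bencode(value) -> bytes:
    """Encode ints, strings, lists and dicts in the bencode format.

    int   -> i<digits>e
    bytes -> <length>:<data>
    list  -> l<items>e
    dict  -> d<key><value>...e  (keys are byte strings, sorted)
    """
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it to avoid surprises
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # the format requires sorted keys
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))
```

A recorded action stored as a map, for example `{"url": "http://example.org", "depth": 2}`, becomes a single byte string that can be written to a blob heap as one value.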
orbiter
81035e7080 moved a sub-menu entry
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6578 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-13 16:18:23 +00:00
orbiter
9bbd546e64 in live search, show at least 20 entries instead of only 10
this is a work-around for the problem that the search widget
does not load a second page if the first page did not fill up
the window with enough lines such that a scrollbar is visible.
Because the scrollbar triggers loading of following pages, this
must be enforced with the trick that more result lines must be
shown after the first search.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6570 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-12 15:23:01 +00:00
orbiter
d126d6c1b5 renamed the servlet WatchCrawler_p to Crawler_p
this was done because that servlet may be used for wget/cronjob
triggered crawl starts and it appears to be confusing that the
name of the crawl start servlet looks like a pure monitoring tool.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6568 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-12 10:05:28 +00:00
orbiter
a3b8b7b5c5 some redesign of the main menu structure:
- moved all index generation servlets to their own main menu item, including proxy indexing
- removed external index import because this operation is not recommended any more. Joining an index can simply be done by moving the index files from one peer to the other peer; they will be merged automatically
- fix to prevent endless loops when disconnecting http sessions
- fix to prevent application of bad blacklist entries that can cause a 'Dangling meta character' exception

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6558 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-10 00:10:43 +00:00
orbiter
a44112b562 - moved index cleaner to blacklist submenu, because the index cleaner cleans the index with the blacklist
- version switch to 0.93 to reflect advancements

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6516 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-12-03 14:26:30 +00:00
orbiter
491ba6a1ba - some refactoring in workflow
- some refactoring in search process
- fixed image search for json and rss output
- search navigation on bottom of search result page in cases where there are more than 6 results on page
- fixes for number of displayed documents
- disabled pseudostemming

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6504 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-24 11:13:11 +00:00
orbiter
4c6312d103 enhanced image search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6489 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-18 23:56:05 +00:00
orbiter
e3025ee691 - new icon for OAI-PMH loading action
- added many stack trace outputs for exceptions in crawl profile handler to find the 'missing profile handle' bug
- caught one more timeout exception in httpd file loader

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6457 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-11-05 16:40:15 +00:00
orbiter
a0e891c63d - some redesign in UI menu structure to make room for new 'Content Integration' main menu containing import servlets for Wikimedia Dumps, phpbb3 forum imports and OAI-PMH imports
- extended the OAI-PMH test applet and integrated it into the menu. It still does not import OAI-PMH records, but shows that it is able to read and parse this data
- some redesign in ZURL storage: refactoring of access methods, better concurrency, less synchronization
- added a limitation to the LURL metadata database table cache to 20 million entries: until now this cache was bounded only by the available RAM, which may have caused memory-leak-like behavior.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6440 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-31 11:58:06 +00:00
suessthomas
56a5bd090d Small fixes to header.template for more XHTML compatibility.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6420 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-16 20:31:06 +00:00
orbiter
76bca8cffd show interactive search without menu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6417 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-15 13:26:14 +00:00
orbiter
c864901087 - moved httpd.mime to defaults path
- some documentation fixes
- adopted a default setting for the search window: moves css setting to base.css
- some enhancements for the DocumentIndex class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6410 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-14 13:29:09 +00:00
orbiter
26b81bd1f1 added another search integration help page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6345 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-25 14:03:07 +00:00
orbiter
feece4bfcb slightly changed default skins:
- search result headline font is slightly larger and has an underline (like g**gle)
- no dashed line between results in grey style
- no search results with menu on left side by default (but is still possible)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6334 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-09-21 21:30:24 +00:00
orbiter
5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
- The indexing queue was a historic data structure that was introduced at the very beginning at the project as a part of the switchboard organisation object structure. Without the indexing queue the switchboard queue becomes also superfluous. It has been removed as well.
- Removing the switchboard queue requires that all servlets are called without an opaque generic ('<?>'). As a consequence, all servlets had to be modified.
- Many servlets displayed the indexing queue or the size of that queue. In the past months the indexer was so fast that mostly the indexing queue appeared empty, so there was no use of it any more. Because the queue has been removed, the display in the servlets had also to be removed.
- The surrogate work task had been a part of the indexing queue control structure. Without the indexing queue the surrogates needed its own task management. That has been integrated here.
- Because the indexing queue had a special queue entry object and properties attached to this object, the properties had to be moved to the queue entry object which is part of the new indexing queue within the blocking queue, the Response Object. That object has now also the new properties of the removed indexing queue entry object.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6225 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-17 13:59:21 +00:00
orbiter
7d493cf8cc moved parser configuration in separate servlet
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6207 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-14 06:57:13 +00:00
orbiter
93c69fa1cb - added hints to integrate a yacy search in phpBB3
- added also a phpBB3 crawl start with optimized crawl attributes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6126 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-22 23:38:15 +00:00
apfelmaennchen
b6058a7db1 yacyui-portalsearch:
- more bug fixes
- moved from faviconize to YaCy's favicons

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6122 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-22 18:20:12 +00:00
lotus
48051fef4b another fix for IE
http://forum.yacy-websuche.de/viewtopic.php?p=16030#p16030

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6115 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-22 11:15:00 +00:00
orbiter
a119860b82 moved IndexImportWikimedia into different menu position
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6094 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-19 14:03:28 +00:00
apfelmaennchen
ab09d8ebb3 - small noscript fix
- noscript is now functional but ugly

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6055 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-11 22:10:02 +00:00
apfelmaennchen
36dc9b09ac - partial update to jquery-1.3.2
- partial update to jquery-ui-1.7.2
- yacyportalsearch fixed sidebar for navigators


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6053 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-11 21:34:39 +00:00
apfelmaennchen
5a7dec880e - some improvements for: http://forum.yacy-websuche.de/viewtopic.php?f=9&t=1904#p15668
- portalsearch: introduced yconf.load_js and yconf.load_css
- yacysearch.html still having problems with focus after sidebar is loaded
- yacysearchtrailer.json seems not to be valid json for ?nav=all

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6046 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-10 22:11:31 +00:00
orbiter
3d5f2ff544 - added new servlets to support search portal administrators for the integration of yacy search fields in their web pages
- moved some servlets from here to there..
- changed menu structure
- removed yacyui-portaltest.html which contained an example for the live search which is now integrated on all pages in yacy. The code snippet example from that page is integrated into the ConfigLiveSearch.html servlet


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5994 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-29 14:16:03 +00:00
orbiter
4b4bddca00 added new submenu to crawler menu: import of phpbb3 forum postings from mysql
- yacy can import phpbb3 posts without crawling
- all data is written as surrogate
- indexed surrogate files can be re-used

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5985 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-27 14:53:23 +00:00
orbiter
41dd31cad2 replaced new navigation icons with same images but smaller resolution (16x16 instead of 128x128)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5961 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-17 22:22:29 +00:00
orbiter
a642d6a7b5 - added navigation icons for search result pages
- modified result page rendering to use new icons instead of numbers
- set different default values in yacy.init for higher indexing performance; removed pro-values
- modified WatchCrawler to accept 30000 PPM instead of only a maximum of 6000 PPM

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5952 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-14 23:11:10 +00:00