Commit Graph

98 Commits

Author SHA1 Message Date
orbiter
89b9b2b02a redesigned remote crawl process:
- instead of pushing urls to other peers, the urls are actively pulled
  by the peer that wants to do a remote crawl
- the remote crawl push process has been removed
- a process that adds urls from remote peers has been added
- the server-side interface for providing 'limit'-urls has existed since 0.55 and works with this version
- the list-interface has been removed
- servlets using the list-interface have been removed (this implementation did not properly manage double-check)
- changes in configuration file to support new pull-process
- fixed a bug in crawl balancer (status was not saved/closed properly)
- the yacy/urls-protocol was extended to support different networks/clusters
- many interface adaptations to new stack counters

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-29 02:07:37 +00:00
orbiter
55da871211 preparations for better ranking: better debugging of index properties
to do this, the index administration interface was extended.
It is now possible to select parts of an index.
See properties shown in interface after a word search for details.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4218 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-15 03:03:18 +00:00
orbiter
794d296129 project link update
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4193 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-11-01 20:40:15 +00:00
orbiter
87b297b4d2 update of link to english forum
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4182 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-29 10:50:27 +00:00
orbiter
842308ea97 - redesigned crawl start menu, integrated monitoring pages
- removed web structure picture from indexing menu and grouped it together with htcache monitor
- added a database for terminated crawls, when a crawl is finished it is automatically moved to the new database
- extended crawl profile edit servlet, which now also shows terminated crawls
- the option that was used to delete profiles is now redesigned as a function that moves the current crawl to the terminated crawls and removes all urls from the current queues!
- fixed various problems with indexing queues
- enhanced indexing speed by changing cache flush sizes.
- changed behaviour of crawl result servlet: the list of crawled urls is shown if there is one, otherwise the overview window is shown

attention: the new profile databases are not compatible with the old ones. current crawls will be lost! the web index is not touched.
next steps: the database of terminated crawls can be used to start a new crawl from them. This is useful if one wants to re-crawl specific pages and wants to use an old crawl profile.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4113 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-09-28 01:21:31 +00:00
low012
a493bd88b6 *) updated a few links
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4066 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-08-29 16:19:00 +00:00
orbiter
8d6aa7a66d replaced detailed search page by ranking definition page (this is what it essentially is)
the ranking definition there will influence the normal web search

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4006 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-25 14:04:36 +00:00
orbiter
0c4cf68c45 moved publication options (wiki, blog, share) to a single menu entry 'Publication'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4003 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-24 15:11:17 +00:00
orbiter
7c5c814a47 - simplified code (removed exception handling where not necessary)
- added confirmation dialog for shutdown and restart

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3962 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-13 14:54:01 +00:00
orbiter
527b3decde - restructuring of configuration menus
- added new system update configuration page
- moved system update from status page to system update page
- moved shutdown and restart from status page to main menu
- added new configuration properties to yacy.init (not yet actively used)
- added some methods to handle new automatic update process

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3958 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-10 23:56:25 +00:00
michitux
110a1a2b16 - fixed the handover of the search term and search type on index.html when the user clicks on "more options..."
- some small changes to make index.html and the menu valid XHTML 1.0 strict
- changed the inconsistent EOL characters in index.html to Unix ones


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3940 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-29 19:23:42 +00:00
orbiter
0622dcc392 updated project links, integrated new english forum and yacystats
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3930 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-28 09:00:38 +00:00
daburna
8c691287f6 - new forum link
- deleted newsletter-link

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3929 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-27 21:30:31 +00:00
orbiter
3c19fcf519 harmonisation of servlet naming, headlines and menu entries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3884 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-13 20:53:52 +00:00
michitux
4f1f280abd Fixed Status page for IE (http://www.yacy-forum.de/viewtopic.php?t=4113), added new stylesheets in conditional comments for different IE versions, added hiding of favicons (as proposed by allo).
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3883 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-13 20:27:51 +00:00
orbiter
66ec8b63c1 added a httpd access tracker:
- all requests to the peer's own httpd can now be listed in the access tracker menu
- the search statistics page has been renamed to access tracker and extended by this tracker

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3861 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-11 14:05:20 +00:00
orbiter
5fd1d5a58e added time configuration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3853 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-10 20:25:33 +00:00
orbiter
e07458bad4 added time-out function to web analysis
the default time-out is 1 second

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3852 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-10 20:00:44 +00:00
orbiter
01d4cbc143 moved indexing menu back to top
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3825 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-07 23:56:07 +00:00
orbiter
3f49cd516b split the index create page into two pages:
- one with fewer options but with information about other remote crawls
- one with complete information but without any other information
on both pages the steering options have been removed. They are now on the monitoring page.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3813 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-06 22:27:03 +00:00
orbiter
4fb86b02e4 - fixed problem when surftip url is broken
- more renaming of menu entries
- redesigned network page: moved graphic closer to the top
- changed default link to index monitor: now it points to the web structure picture

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3808 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-06 18:24:35 +00:00
orbiter
4934c5f4d3 - corrected News menu (wrong headline)
- changed order, appearance, naming of menus in main menu
- implemented no-main-menu mode for blog and wiki (similar to simple search page)
The YaCy wiki and blog can now be embedded in iframes inside web pages using the display option,
e.g.
http://localhost:8080/Wiki.html?display=2
http://localhost:8080/Blog.html?display=2

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3803 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-06 12:57:13 +00:00
orbiter
71b0206935 - shifted control queue monitor pages to crawl monitor
- the crawl start menu is now cleaned up and ready for more options

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3802 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-06 12:25:52 +00:00
orbiter
a585b4d41b added web structure image
see http://localhost:8080/WatchWebStructure_p.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3747 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-22 15:20:50 +00:00
karlchenofhell
086239da36 - added servlet: remote crawler queue overview
- added servlet: crawl profile editor

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3731 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-16 10:11:25 +00:00
orbiter
68f5d64ae6 replaced yacy logo by better version
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3675 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-05-07 08:54:27 +00:00
orbiter
89c1511738 - added new Network Configuration menu, can be found in basic settings
- new cluster functions will be available in this menu, but currently not enabled,
  because corresponding interface methods are not ready yet
- shifted remote crawl settings to new network configuration menu
- shifted DHT distribution/receive to the new network configuration menu
- adapted some string constants
- added cluster configuration settings to yacy.init


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3589 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-23 20:47:07 +00:00
orbiter
f88aea4869 reverted another bad idea
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3568 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-12 21:28:31 +00:00
michitux
4990909178 Some bugfixes, new layout/style for image search results:
 * removed divide by zero bug when 20_dhtdistribution_busysleep is 0
 * replaced German comment with wrong charset in source/de/anomic/plasma/plasmaCrawlBalancer.java by an English one
 * replaced the table-fix for floating behind snippet images by a br with clear
 * removed unnecessary old xhtml-files (were not in use, they were created when we weren't having xhtml for testing)
 * new layout for image-search results: replaced the old one with spans and tables inside (not valid) with new divs, now each image snippet container has the same size
TODO:
 * the ids of the snippetLoading-divs aren't valid because ids must start with an alphabetic letter or an underscore, they have to be prefixed
 * in the returned snippet-xml is an unresolved pattern for status (the status is only set for text snippets)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3566 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-12 18:21:17 +00:00
orbiter
4eecf0bb12 templates for embedded display. try:
http://localhost:8080/yacysearch.html?display=2&input=0&search=yacy&resource=local

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3557 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-04-10 14:28:04 +00:00
theli
7edd5a0b77 *) correcting notifier.gif path
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3510 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-22 07:42:28 +00:00
karlchenofhell
dcc13abd59 - fixed small bug at home page, button "peer's console"
- fixed <fieldset><dl> for safari on many pages
- added Blog-link to Network page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3450 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-08 13:39:09 +00:00
orbiter
5741701b59 moved crawl start up, personal web pages down in main menu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3443 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-07 16:08:13 +00:00
karlchenofhell
9623bf7bbe - removed call of java 1.5 method
- added config servlet for local robots.txt
- removed YPStats_p as it is of no use anymore
- supertemplates use XHTML now
- quick-fix for http://www.yacy-forum.de/viewtopic.php?p=32296#32296

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3422 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-03-01 13:54:14 +00:00
orbiter
f7803a6ce4 enhanced crawl balancer
- new domains now get a chance to get crawled early
- less IO operations
- new balancing method
- better dump order at shutdown time
- bugfixes regarding not found url hashes (no more superfluous cache kill)
- domain access time is now shared over all balancer stacks
- viewing the stack no longer disturbs the balancing algorithm as much
- intelligent selection of best next domain using domain access times
- extra double-check (to double-check the double-check)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3384 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-21 16:23:31 +00:00
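The domain selection described in the commit above (new domains first, then the domain that has waited longest) could look roughly like the following Java sketch. This is only an illustration under stated assumptions, not the balancer's actual code: the class `DomainPickSketch` and its methods are hypothetical names, and treating a never-accessed domain as "oldest" is an assumed reading of "new domains now get a chance to get crawled early".

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: pick the next crawl domain by last access time. */
final class DomainPickSketch {
    // Shared access-time table, keyed by domain name (assumed structure).
    private final Map<String, Long> lastAccess = new HashMap<>();

    /** Remember when a domain was last fetched from. */
    void recordAccess(String domain, long timeMillis) {
        lastAccess.put(domain, timeMillis);
    }

    /**
     * Returns the candidate whose last access lies furthest in the past.
     * Domains that were never accessed sort before all others.
     * Returns null when there are no candidates.
     */
    String pickNext(List<String> candidates) {
        String best = null;
        long bestTime = Long.MAX_VALUE;
        for (String domain : candidates) {
            long t = lastAccess.getOrDefault(domain, Long.MIN_VALUE);
            if (best == null || t < bestTime) {
                best = domain;
                bestTime = t;
            }
        }
        return best;
    }
}
```

On ties the first candidate in the list wins, which keeps the choice deterministic for a given stack order.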
orbiter
39b0658839 Redesign of Webinterface menu structure
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3381 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-20 14:35:29 +00:00
karlchenofhell
d114a0136e - crawl profile: don't add null-values
- added some settings and statistics for url-fetcher 'server'-mode
- added own stack for fetchable URLs
- added possibility to fill stack via shift from peer's queues, via POST (addurls=$count and url$num=$url) or via file-upload
- added "htroot" to classpath of linux start-script

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3370 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-17 19:16:53 +00:00
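The POST format mentioned in the commit above (addurls=$count and url$num=$url) might be assembled as in the following Java sketch. The class `AddUrlsBodySketch` is a hypothetical helper, and both the zero-based numbering of the url$num fields and the use of form URL-encoding are assumptions; the commit message does not specify either.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.List;

/** Hypothetical sketch: build a form body of the shape addurls=$count&url$num=$url. */
final class AddUrlsBodySketch {

    /** Builds the body; numbering from 0 is an assumption, not taken from the source. */
    static String formBody(List<String> urls) {
        StringBuilder sb = new StringBuilder("addurls=").append(urls.size());
        for (int i = 0; i < urls.size(); i++) {
            sb.append("&url").append(i).append('=').append(encode(urls.get(i)));
        }
        return sb.toString();
    }

    /** Form-encodes one value as UTF-8. */
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```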
karlchenofhell
f88487e964 - added servlet to clean blacklists (find & delete invalid entries)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3286 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-27 23:30:58 +00:00
orbiter
c82f494b08 reorganisation of search statistics page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3213 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-15 21:36:25 +00:00
daburna
aedb34a988 removed "cookie menu"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3208 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-15 12:10:58 +00:00
(no author)
fe72b772cf added a monitor page for search requests
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3206 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-15 01:50:57 +00:00
rramthun
25a64fe3da -windows-script echoes correct port-number now
-minor changes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3053 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-06 17:27:44 +00:00
orbiter
30888e7a2f implementation of search constraints
Such constraints may formulate specific restrictions to web searches
This is implemented by scraping information for constraints from a web
page during parsing, and storing flags to the pages within the web index.

In this first step, only information for index pages ("index of", directory listings)
is scraped and stored in flags
- added new flag class kelondroBitfield
- added scraper method in condenser
- added bitfield structure for all scrape types (see also condenser)
- added bitfield structure for appearance locations (see RWIEntry)
- added handover protocol for remote search and index distribution
- extended kelondroColumn class to hold bitfield types
- added another search attribute on search page (index.html)
- extended search-filter to enable filtering of non-matching constraints
- set all new database types to be default
- refactoring: moved word hash generation to condenser class

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2999 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-23 02:16:30 +00:00
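To illustrate the flag storage and constraint filtering described in the commit above, here is a minimal Java sketch of a fixed-width bitfield. It is not the kelondroBitfield class itself: `FlagBitfieldSketch`, its methods, and the flag position `FLAG_INDEX_OF` are hypothetical names chosen for this example.

```java
/** Hypothetical sketch of a fixed-width flag bitfield; not YaCy's kelondroBitfield. */
final class FlagBitfieldSketch {
    // Assumed flag position for "page is a directory listing"; illustrative only.
    static final int FLAG_INDEX_OF = 0;

    private final byte[] bits;

    FlagBitfieldSketch(int byteLength) {
        this.bits = new byte[byteLength];
    }

    /** Sets or clears the flag at the given bit position. */
    void set(int pos, boolean value) {
        int slot = pos >> 3;          // which byte holds the bit
        int mask = 1 << (pos & 7);    // which bit inside that byte
        if (value) {
            bits[slot] |= mask;
        } else {
            bits[slot] &= ~mask;
        }
    }

    /** Reads the flag at the given bit position. */
    boolean get(int pos) {
        return (bits[pos >> 3] & (1 << (pos & 7))) != 0;
    }

    /** True if every flag set in the constraint is also set in this bitfield. */
    boolean matchesAll(FlagBitfieldSketch constraint) {
        for (int i = 0; i < constraint.bits.length; i++) {
            int want = constraint.bits[i] & 0xFF;
            int own = (i < bits.length ? bits[i] : 0) & 0xFF;
            if ((own & want) != want) {
                return false;
            }
        }
        return true;
    }
}
```

A search filter in this style would keep a result only when `matchesAll` returns true for the constraint bitfield built from the query.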
low012
75915502ec *) Cursor will jump to textfield on http://localhost:8080/yacysearch.html when page is loaded if JavaScript is enabled. (No changed behavior if JavaScript is disabled.)
*) If text is entered in textbox on http://localhost:8080/yacysearch.html and user clicks on "Web Search" in top menu, text will appear in textfield on http://localhost:8080/index.html if JavaScript is enabled. (No changed behavior if JavaScript is disabled.)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2994 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-21 00:00:18 +00:00
allo
63a2616eb7 -If you click on "Administration", you can log in.
-better link text on Status.html


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2863 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-26 10:11:14 +00:00
daburna
c97984bbac -corrected link and updated language file for simpleheader.template
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2799 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-18 18:41:46 +00:00
daburna
a1736675ca -ups
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2769 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-14 11:19:43 +00:00
low012
f7447894f1 *) fixed link to WatchCrawler_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2740 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-10 12:39:29 +00:00
orbiter
df1629b05a - code cleanup
- version 0.471
- moved surftips to own web page


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-29 22:27:20 +00:00
orbiter
c42b011648 added watch crawler to menu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2584 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-15 01:09:34 +00:00