41 Commits

Author SHA1 Message Date
reger
5cb05c3013 adjust table column width to not line wrap crawler traffic line 2015-02-04 03:51:34 +01:00
reger
0260d3d800 Allow hiding the linkstructure graphic in the crawl monitor
by setting the config param DECORATION_GRAFICS_LINKSTRUCTURE
2015-01-28 03:59:01 +01:00
Ryszard Goń
3cdbd5f5c6 Fix for progress table background not resizing
when the post-processing started/ended.
2015-01-02 00:11:32 +01:00
Ryszard Goń
3144313974 Postprocessing progress bar fix
(Make it work as [probably] actually intended)
2014-12-27 03:02:18 +01:00
Michael Peter Christen
bbadccbd8d better buttons 2014-04-29 16:22:31 +02:00
orbiter
469e0a62f1 added new button to terminate all crawls 2014-04-22 23:14:54 +02:00
Michael Peter Christen
8443255e18 better link structure limit calibration 2014-04-04 12:48:55 +02:00
Michael Peter Christen
a6bb9be97e - added d3.js for visualizations using embedded svg
- added a servlet api/linkstructure.json which generates a link graph
information in json
- added a javascript link graph renderer hypertree.js using d3 and the
new servlet linkstructure.json
- embedded the new link graph in the crawler monitor and the host
browser
2014-04-03 14:51:19 +02:00
Michael Peter Christen
7a49f72480 fix for crawler column width 2014-04-02 01:16:34 +02:00
Michael Peter Christen
656e2ce62a replacing direct html table cellspacing with css set-up for cellspacing 2014-03-31 01:15:35 +02:00
Michael Peter Christen
92655c7fd9 - added bootstrap css framework
- adapted all YaCy administration pages to the new framework
- created new search page layout (working, but still work in progress)
- old skin files are still fully applicable! (and looking good)
- target is a new style based on bootstrap examples, see /test.html
- icons in YaCy may be replaced by glyphicons (to be done)
2014-03-18 13:42:31 +01:00
orbiter
e9abb25b03 tried javascript hack to make statistic divs equal height 2014-03-16 14:56:30 +01:00
Michael Peter Christen
1245cfeb43 small change to crawler monitor to fit in larger translations 2014-02-28 13:58:05 +01:00
orbiter
1960aafd6c better height for statistic windows 2014-02-28 00:17:50 +01:00
orbiter
b0e3e2100d better width for Progress table 2014-02-28 00:11:20 +01:00
malykhin.dmitry
29a7598991 update Russian lang-file and small improvements to the web interface 2014-02-27 07:43:17 +04:00
reger
365f77ea8c make internal page links relative to ease any future development for context aware servlets
note also http://bugs.yacy.net/view.php?id=106
2014-02-10 21:40:42 +01:00
Michael Peter Christen
6ada0daae9 making latency_factor and maximum number of same hosts in loader queue
settings available in Crawler_p.html servlet for steering.
2014-01-21 19:28:00 +01:00
orbiter
19a051bec8 more monitoring for postprocessing and enhanced layout in Crawler
monitor page
2013-11-16 18:23:14 +01:00
Michael Peter Christen
fceac8cffd more monitoring for postprocessing 2013-11-16 08:23:42 +01:00
orbiter
9c681cc00d added segment sizes, postprocessing status and cpu load to crawler
monitor
2013-07-23 19:10:11 +02:00
orbiter
2c3b024196 if the crawl was paused (automatically), show the reason for pausing in
the Crawler_p servlet.
2013-04-09 18:55:26 +02:00
Frank
7763f2554f add the new PPMbar in Crawler_p for a better style and better use. 2013-03-17 11:43:12 +01:00
Michael Peter Christen
788288eb9e added the generation of 50 (!!) new solr fields in the core 'webgraph'.
The default schema uses only some of them and the resulting search index
now has the following properties:
- webgraph size will have about 40 times as many entries as the default
index
- the complete index size will increase and may be about double its
current size
As testing showed, not much indexing performance is lost. The default
index will be smaller (moved fields out of it); thus searching
can be faster.
The new index will allow some old parts of YaCy to be removed,
i.e. specialized webgraph data and the noload crawler. The new index
will make it possible to:
- search within link texts of linked but not indexed documents (about 20
times the document index in size!!)
- get a very detailed link graph
- enhance ranking using a complete link graph

To get full access to the new index, the API to solr now has two
access points: one with attribute core=collection1 for the default
search index and core=webgraph for the new webgraph search index. This is
also available for p2p operation but client access is not yet
implemented.
2013-02-22 15:45:15 +01:00
reger
7761b60325 fix: Broken Link on Crawler_p.html - issue 218
http://bugs.yacy.net/view.php?id=218
- reduced Solr logging (/select)
2012-12-29 04:53:20 +01:00
Michael Peter Christen
eca68fa197 added debug code to crawler monitor 2012-11-25 15:43:42 +01:00
Michael Peter Christen
71ed8e5e07 bugfixes for crawler 2012-11-07 12:52:19 +01:00
Michael Peter Christen
906e51214a the web structure image shows the pivot dot in a different color 2012-10-25 10:18:28 +02:00
Michael Peter Christen
9eaede50e7 enhanced web structure images 2012-10-23 18:11:19 +02:00
Michael Peter Christen
ae6feb5610 showing the web structure graph as animation in the crawl monitor 2012-10-23 02:50:26 +02:00
Michael Peter Christen
a13e5153ac - added the possibility to have not one but a list of crawl start urls
- the list of urls is entered in the expert crawl start in a textfield;
the one-line input field was replaced with a text box
- start urls can also be given in one single line where the urls are
separated by a '|'-character
- as an effect, the crawl profile cannot carry a single start url for
identification because it is possible to have more. Therefore the url was
removed from the crawl profile
- this affects all servlets which display a crawl profile: removed the
url field from all these servlets
- to work consistently with several start urls and the other crawl
starts which computed crawl start url lists from sitelists or sitemaps,
the crawl start servlet was restructured completely
- new rules for must-match patterns were created to make it possible
that site crawl starts also work with several crawl starts at once
2012-09-14 12:25:46 +02:00
sixcooler
bea002dc15 correct table in new look of Crawler_p 2012-06-19 13:13:00 +02:00
Michael Peter Christen
638390930d another patch to fix the Crawler_p layout 2012-05-25 15:56:21 +02:00
Michael Peter Christen
c846e9ca14 redesign of the crawler monitor page: show crawled pages instead of
queue of urls that shall be crawled
2012-05-25 01:45:38 +02:00
Michael Peter Christen
16b21f7a5b Added more steering in Crawler_p.html interface 2012-05-23 18:00:37 +02:00
Michael Peter Christen
8d63a5887c bugfixes 2012-02-02 23:38:23 +01:00
Michael Peter Christen
9ad1d8dde2 complete redesign of crawl queue monitoring: do not look at a
ready-prepared crawl list but at the stacks of the domains that are
stored for balanced crawling. This affects also the balancer since that
does not need to prepare the pre-selected crawl list for monitoring. As
an effect:
- it is no longer possible to see the correct order of next to-be-crawled
links, since that depends on the actual state of the balancer stack the
next time another url is requested for loading
- the balancer works better since the next url can be selected according
to the current situation and not according to a pre-selected order.
2012-02-02 21:33:42 +01:00
Michael Peter Christen
f214f6ebb4 added no-load queues to the crawler monitor 2012-01-07 17:17:11 +01:00
orbiter
abb35addb8 added
accept-charset="UTF-8"
to all forms
this applies patches from http://forum.yacy-websuche.de/viewtopic.php?p=20891#p20891

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7482 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-14 22:57:43 +00:00
orbiter
90c3e5d6f6 - cleanup, removed unused imports
- added crawling queue sizes to /api/status_p.xml, syntax same as in queues_p.html
- fixed a bug in queue enumeration that caused an out-of-bounds exception

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6842 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-04-27 21:47:41 +00:00
orbiter
d126d6c1b5 renamed the servlet WatchCrawler_p to Crawler_p
this was done because that servlet may be used for wget/cronjob
triggered crawl starts and it appears to be confusing that the
name of the crawl start servlet looks like a pure monitoring tool.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6568 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-12 10:05:28 +00:00