17 Commits

Author SHA1 Message Date
Ryszard Goń
3144313974 Postprocessing progress bar fix
(Make it work as [probably] actually intended)
2014-12-27 03:02:18 +01:00
Michael Peter Christen
1245cfeb43 small change to crawler monitor to fit in larger translations 2014-02-28 13:58:05 +01:00
Michael Peter Christen
9e0e39a9a4 small change to start/stop/pause icon style 2014-02-03 17:39:26 +01:00
orbiter
19a051bec8 more monitoring for postprocessing and enhanced layout in Crawler
monitor page
2013-11-16 18:23:14 +01:00
Michael Peter Christen
fceac8cffd more monitoring for postprocessing 2013-11-16 08:23:42 +01:00
orbiter
9c681cc00d added segment sizes, postprocessing status and cpu load to crawler
monitor
2013-07-23 19:10:11 +02:00
Frank
7763f2554f add the new PPMbar in Crawler_p for a better style and better use. 2013-03-17 11:43:12 +01:00
Michael Peter Christen
788288eb9e added the generation of 50 (!!) new solr fields in the core 'webgraph'.
The default schema uses only some of them and the resulting search index
now has the following properties:
- the webgraph will have about 40 times as many entries as the default
index
- the complete index size will increase and may be about double the
current size
As testing showed, not much indexing performance is lost. The default
index will be smaller (moved fields out of it); thus searching
can be faster.
The new index makes it possible to remove some old parts of YaCy,
i.e. specialized webgraph data and the noload crawler. The new index
will make it possible to:
- search within link texts of linked but not indexed documents (about 20
times the size of the document index!!)
- get a very detailed link graph
- enhance ranking using a complete link graph

To get full access to the new index, the solr API now has two
access points: one with attribute core=collection1 for the default
search index and one with core=webgraph for the new webgraph search
index. This is also available for p2p operation, but client access is
not yet implemented.
2013-02-22 15:45:15 +01:00
Michael Peter Christen
b7004043ea - added a field cache for solr queries which call only for a single
value
- fixed a version conflict exception within a solr add request
2012-11-24 22:30:05 +01:00
sixcooler
f64e78497a fix for reload-feature in Crawler_p 2012-06-14 02:13:23 +02:00
Michael Peter Christen
638390930d another patch to fix the Crawler_p layout 2012-05-25 15:56:21 +02:00
Michael Peter Christen
c846e9ca14 redesign of the crawler monitor page: show crawled pages instead of
queue of urls that shall be crawled
2012-05-25 01:45:38 +02:00
Michael Peter Christen
9ad1d8dde2 complete redesign of crawl queue monitoring: do not look at a
ready-prepared crawl list but at the stacks of the domains that are
stored for balanced crawling. This also affects the balancer, since it
no longer needs to prepare the pre-selected crawl list for monitoring.
As a result:
- it is no longer possible to see the correct order of the next
to-be-crawled links, since that depends on the actual state of the
balancer stack the next time another url is requested for loading
- the balancer works better since the next url can be selected according
to the current situation and not according to a pre-selected order.
2012-02-02 21:33:42 +01:00
Michael Peter Christen
f214f6ebb4 added no-load queues to the crawler monitor 2012-01-07 17:17:11 +01:00
low012
b0bdf2d9ed *) Oops!
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7490 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-17 20:42:35 +00:00
low012
de065e594f *) make sure that only positive values are accepted as refresh interval on Crawler Monitor page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7489 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-17 20:40:34 +00:00
orbiter
d126d6c1b5 renamed the servlet WatchCrawler_p to Crawler_p
This was done because the servlet may also be used for wget/cronjob
triggered crawl starts, and it is confusing when the name of the
crawl start servlet makes it look like a pure monitoring tool.


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6568 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-01-12 10:05:28 +00:00