#%env/templates/metas.template%# #%env/templates/header.template%# #%env/templates/submenuCrawlMonitor.template%# #(process)#

Crawl Results Overview

These are monitoring pages for the different indexing queues.

YaCy knows 5 different ways to acquire web indexes. The details of these processes (1-5) are described in the submenus listed above, each of which also shows a table with the indexing results so far. The information in these tables is considered private, so you need to log in with your administration password.

Case (6) is a monitor of the local receipt generator, the opposite case of (1). It also contains an indexing result monitor, but it is not considered private since it shows crawl requests from other peers.

Case (7) occurs if surrogate files are imported.

An illustration of how YaCy works

The image above illustrates the data flow initiated by web index acquisition. Some processes appear twice to document the complex index migration structure.

::

(1) Results of Remote Crawl Receipts

This is the list of web pages whose crawl was initiated by this peer but which were crawled by other peers. This is the 'mirror' case of process (6).

Use Case: You get entries here if you start a local crawl on the 'Index Creation' page and check the 'Do Remote Indexing' flag. Every page that a remote peer indexes upon this peer's request is reported back and can be monitored here.

::

(2) Results of Search Queries

This index transfer was initiated by your peer through a search query. The index entries were crawled and contributed by other peers.

Use Case: This list fills up if you run a search query on the 'Search Page'.

::

(3) Results for Index Transfer

The URL fetch was initiated and executed by other peers. These links were transmitted to you because your peer is the most appropriate one for storage according to the logic of the Global Distributed Hash Table.

Use Case: This list may fill up if you check the 'Index Receive' flag on the 'Index Control' page.

::

(4) Results for Proxy Indexing

These web pages were indexed as a result of your proxy usage. No personal or protected page is indexed; such pages are detected by cookie use or POST parameters (either in the URL or in the HTTP request) and automatically excluded from indexing.

Use Case: You must use YaCy as a proxy to fill this table. Set the proxy port in your browser to the port given on the 'Settings' page in the 'Proxy and Administration Port' field.

::

(5) Results for Local Crawling

These web pages were crawled by your own crawl task.

Use Case: Start a crawl by setting a crawl start point on the 'Index Create' page.

::

(6) Results for Global Crawling

These pages were indexed by your peer, but the crawl was initiated by a remote peer. This is the 'mirror' case of process (1).

Use Case: This list may fill up if you check the 'Accept remote crawling requests' flag on the 'Index Create' page.

::

(7) Results from Surrogates Import

These records were imported from surrogate files in DATA/SURROGATES/in.

Use Case: Place files with Dublin Core metadata content into DATA/SURROGATES/in or use an index import method (e.g. MediaWiki import, OAI-PMH retrieval).

#(/process)# #(table)#

The stack is empty.

::

Statistics about #[domains]# domains in this stack:

#{domains}# #{/domains}#
Domain URLs Blacklist to use
#[domain]# #[count]#

#(size)# Showing all #[all]# entries in this stack. :: Showing latest #[count]# lines from a stack of #[all]# entries. #(/size)#

#(showCollection)#::#(/showCollection)# #(showInit)#::#(/showInit)# #(showExec)#::#(/showExec)# #(showDate)#::#(/showDate)# #(showWords)#::#(/showWords)# #(showTitle)#::#(/showTitle)# #(showCountry)#::#(/showCountry)# #(showIP)#::#(/showIP)# #(showURL)#::#(/showURL)# #{indexed}# #(showCollection)#::#(/showCollection)# #(showInit)#::#(/showInit)# #(showExec)#::#(/showExec)# #(showDate)#::#(/showDate)# #(showWords)#::#(/showWords)# #(showTitle)# :: #(/showTitle)# #(showCountry)#::#(/showCountry)# #(showIP)#::#(/showIP)# #(showURL)# :: #(/showURL)# #{/indexed}#
Collection | Initiator | Executor | Modified | Words | Title | Country | IP of Host | URL
#[collection]##[initiatorSeed]##[executorSeed]##[modified]##[count]# #(available)# -not cached- :: #(nodescr)#no title::#[urldescr]##(/nodescr)# #(/available)# #[country]##[ip]# #(available)# -not cached- :: #[url]# #(/available)#
:: #(/table)# #%env/templates/footer.template%#