yacy_search_server/htroot/IndexReIndexMonitor_p.html
reger 72f6a0b0b2 enhance recrawl job
- allow to modify the query to select documents to  process (after job has started)
- allow to include failed urls (httpstatus <> 200)
2015-06-06 18:45:39 +02:00

109 lines
5.0 KiB
HTML

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" >
<head>
<title>YaCy '#[clientname]#': Field Re-Indexing</title>
#%env/templates/metas.template%#
</head>
<body id="IndexReindexMonitor">
#%env/templates/header.template%#
#%env/templates/submenuIndexControl.template%#
<h2>Field Re-Indexing</h2>
<p>In case that an index schema of the embedded/local index has changed, all documents with missing field entries can be indexed again with a reindex job.</p>
<form action="IndexReIndexMonitor_p.html" method="post" enctype="multipart/form-data" accept-charset="UTF-8">
<table><tr valign="top"><td>
<fieldset>
<table>
<tr>
<td>Documents in current queue</td>
<td>#[querysize]#</td>
<td>#(reindexjobrunning)#::<input type="submit" value="refresh page" class="btn btn-success"/>#(/reindexjobrunning)#</td>
</tr>
<tr>
<td>Documents processed</td>
<td>#[docsprocessed]#</td>
<td></td>
</tr>
<tr>
<td>current select query </td>
<td>#[currentselectquery]#</td>
<td></td>
</tr>
<tr>
<td>&nbsp;</td>
<td></td>
<td></td>
</tr>
</table>
#(reindexjobrunning)#
<input type="submit" name="reindexnow" value="start reindex job now" class="btn btn-primary"/>
::<input type="submit" name="stopreindex" value="stop reindexing" class="btn btn-danger"/>
#(/reindexjobrunning)#
<p class="info">#[infomessage]#</p>
</fieldset>
</td><td>
#(reindexjobrunning)#::
<fieldset><legend>Remaining field list</legend>
<p>reindex documents containing these fields: </p>
<table>
<tr><th>Field</th><th>count</th></tr>
#{fieldlist}#
<tr>
<td>#[fieldname]#</td> <td align="right">#[fieldscore]#</td>
</tr>
#{/fieldlist}#
</table>
</fieldset>
#(/reindexjobrunning)#
</td></tr></table>
</form>
<h2>Re-Crawl Index Documents</h2>
<p>Searches the local index and selects documents to add to the crawler (recrawl the document).
This runs transparent as background job. Documents are added to the crawler only if no other crawls are active
and are added in small chunks.</p>
<form action="IndexReIndexMonitor_p.html?setup=recrawljob" method="post" enctype="multipart/form-data" accept-charset="UTF-8">
<table><tr valign="top"><td>
<fieldset>
#(recrawljobrunning)#
<input type="submit" name="recrawlnow" value="start recrawl job now" class="btn btn-primary"/>
to re-crawl documents with fresh_date_dt before today.
<p></p>
<p><small>after starting the recrawl job you can apply a custom Solr query to select documents to be processed</small></p>
::<input type="submit" name="stoprecrawl" value="stop recrawl job" class="btn btn-danger"/>
#(/recrawljobrunning)#
</fieldset>
</td>
<td>
#(recrawljobrunning)#::
<fieldset><legend>Re-Crawl Query Details</legend>
<table>
<tr>
<td>Documents to process</td><td>#[docCount]#</td>
</tr>
<tr>
<td>Current Query</td><td>#[recrawlquerytext]#</td>
</tr>
<tr>
<td>&nbsp;</td><td> </td>
</tr>
<tr>
<td>&nbsp;</td><td> </td>
</tr>
<tr>
<td>Edit Solr Query</td><td><input type="text" name="recrawlquerytext" size="40" value="#[recrawlquerytext]#" /><input type="submit" name="updquery" value="update" class="btn btn-sm btn-default"/></td>
</tr>
<tr>
<td>include failed urls</td><td><input type="checkbox" name="includefailedurls" onchange="this.form.submit()" #(includefailedurls)#::checked="checked"#(/includefailedurls)# /></td>
</tr>
</table>
</fieldset>
#(/recrawljobrunning)#
</td>
</tr></table>
</form>
#%env/templates/footer.template%#
</body>
</html>