redesign of index creation interface

- the input form remains under the IndexCreation menu item
- after pressing the submit button, the IndexingMonitor is called
- the code for starting new indexing jobs was moved to the IndexingMonitor
- existing crawl profiles can be monitored in the Indexing Monitor
- the code that creates the crawl profile data was moved from the indexing start page to the indexing monitor
- existing crawl profiles can be deleted on the crawl monitor page

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3095 6c8d7289-2bf4-0310-a012-ef5d649a1542
orbiter 2006-12-18 02:56:32 +00:00
parent febe6b114a
commit c500178fd7
4 changed files with 84 additions and 378 deletions
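
As context for the diffs below, here is roughly what the crawl-start handling looks like after the move. The sketch is hypothetical: WatchCrawler_p.java (the receiving servlet) is not among the files shown in this commit, the class and method names are illustrative, and the parameter handling is reduced to defaults; only the switchboard calls (profiles.newEntry(...) and sbStackCrawlThread.stackCrawl(...)) are taken verbatim from the code removed from IndexCreate_p.java further down.

import java.util.Date;
import de.anomic.net.URL;
import de.anomic.plasma.plasmaCrawlProfile;
import de.anomic.plasma.plasmaSwitchboard;
import de.anomic.server.serverObjects;
import de.anomic.yacy.yacyCore;

// Illustrative sketch only; not the actual WatchCrawler_p servlet.
final class CrawlStartSketch {
    // Creates a crawl profile for the posted start URL and stacks it for crawling.
    // Returns null on success, otherwise the rejection reason (as in the removed code).
    static String startCrawl(plasmaSwitchboard switchboard, serverObjects post) throws Exception {
        String filter = post.get("crawlingFilter", ".*");
        int depth = Integer.parseInt(post.get("crawlingDepth", "0"));
        String crawlingStart = post.get("crawlingURL", "").trim();
        if (crawlingStart.indexOf("://") == -1) crawlingStart = "http://" + crawlingStart;
        URL startURL = new URL(crawlingStart);
        // profile creation, now done by the monitor instead of IndexCreate_p
        plasmaCrawlProfile.entry pe = switchboard.profiles.newEntry(
                startURL.getHost(), crawlingStart, filter, filter, depth, depth,
                -1, -1, -1, false, true, true, true, false, true, true, true);
        // stack the start URL; a non-null result is the reason the URL was rejected
        return switchboard.sbStackCrawlThread.stackCrawl(
                crawlingStart, null, yacyCore.seedDB.mySeed.hash, "CRAWLING-ROOT", new Date(), 0, pe);
    }
}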

View File

@@ -3,7 +3,7 @@ javacSource=1.4
javacTarget=1.4
# Release Configuration
releaseVersion=0.493
releaseVersion=0.494
releaseFile=yacy_dev_v${releaseVersion}_${DSTAMP}_${releaseNr}.tar.gz
#releaseFile=yacy_v${releaseVersion}_${DSTAMP}_${releaseNr}.tar.gz
releaseDir=yacy_dev_v${releaseVersion}_${DSTAMP}_${releaseNr}
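
To illustrate the version bump above: Ant's <tstamp> task sets DSTAMP to the build date in yyyyMMdd form, so with an assumed build date of 2006-12-18 and an assumed releaseNr of 3095 (the SVN revision; both values are illustrative), releaseFile would expand to something like

yacy_dev_v0.494_20061218_3095.tar.gz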

View File

@@ -16,7 +16,7 @@
You can define URLs as start points for Web page crawling and start crawling here. "Crawling" means that YaCy will download the given website, extract all links in it and then download the content behind these links. This is repeated up to the depth specified under "Crawling Depth".
</p>
<form action="IndexCreate_p.html" method="post" enctype="multipart/form-data">
<form action="WatchCrawler_p.html" method="post" enctype="multipart/form-data">
<table border="0" cellpadding="5" cellspacing="1">
<tr class="TableHeader">
<td><strong>Attribute</strong></td>
@@ -242,51 +242,15 @@
</table>
</form>
<p>
#(error)#<!-- 0 -->
::<!-- 1 -->
Error with profile management. Please stop YaCy, delete the file DATA/PLASMADB/crawlProfiles0.db and restart.
::<!-- 2 -->
Error: #[errmsg]#
::<!-- 3 -->
Application not yet initialized. Sorry. Please wait some seconds and repeat the request.
::<!-- 4 -->
<strong>ERROR: Crawl filter "#[newcrawlingfilter]#" does not match the crawl root "#[crawlingStart]#".</strong> Please try again with a different filter.
::<!-- 5 -->
Crawling of "#[crawlingURL]#" failed. Reason: #[reasonString]#<br>
::<!-- 6 -->
Error with URL input "#[crawlingStart]#": #[error]#
::<!-- 7 -->
Error with file input "#[crawlingStart]#": #[error]#
#(/error)#
</p>
<p>
#(info)#
::
Set new prefetch depth to "#[newproxyPrefetchDepth]#"
::
Crawling of "#[crawlingURL]#" started.
You can monitor the crawling progress either by watching the URL queues
(<a href="/IndexCreateWWWLocalQueue_p.html">local queue</a>,
<a href="/IndexCreateWWWGlobalQueue_p.html">global queue</a>,
<a href="/IndexCreateLoaderQueue_p.html">loader queue</a>,
<a href="/IndexCreateLoaderQueue_p.html">indexing queue</a>)
or by checking the fill/process count of all queues on the
<a href="/PerformanceQueues_p.html">performance page</a>.
<strong>Please wait some seconds, because the request is enqueued and delayed until the proxy/HTTP-server is idle for a certain time.</strong>
The indexing results are presented on the
<a href="IndexMonitor.html">Index Monitor</a>-page.
<strong>It will take at least 30 seconds until the first result appears there. Please be patient; the crawling will pause each time you use the proxy or web server to ensure maximum availability.</strong>
If you crawl any unwanted pages, you can delete them <a href="IndexCreateWWWLocalQueue_p.html">here</a>.<br />
::
Removed #[numEntries]# entries from crawl queue. This queue may fill again if the loading and indexing queue is not empty.
::
Crawling paused successfully.
::
Continue crawling.
#(/info)#
</p>
#(refreshbutton)#
::
<form action="IndexCreate_p.html" method="post" enctype="multipart/form-data">
@@ -305,50 +269,6 @@
</fieldset>
</form>
<p id="crawlingProfiles"><strong>Crawl Profile List:</strong></p>
<table border="0" cellpadding="2" cellspacing="1">
<colgroup>
<col width="120" />
<col />
<col width="16" />
<col width="60" />
<col width="10" span="2" />
<col />
<col width="10" span="5" />
</colgroup>
<tr class="TableHeader">
<td><strong>Crawl Thread</strong></td>
<td><strong>Start URL</strong></td>
<td><strong>Depth</strong></td>
<td><strong>Filter</strong></td>
<td><strong>MaxAge</strong></td>
<td><strong>Auto Filter Depth</strong></td>
<td><strong>Auto Filter Content</strong></td>
<td><strong>Max Pages Per Domain</strong></td>
<td><strong>Accept '?' URLs</strong></td>
<td><strong>Fill Proxy Cache</strong></td>
<td><strong>Local Indexing</strong></td>
<td><strong>Remote Indexing</strong></td>
<td></td>
</tr>
#{crawlProfiles}#
<tr class="TableCell#(dark)#Light::Dark#(/dark)#">
<td>#[name]#</td>
<td><a href="#[startURL]#">#[startURL]#</a></td>
<td>#[depth]#</td>
<td>#[filter]#</td>
<td>#[crawlingIfOlder]#</td>
<td>#[crawlingDomFilterDepth]#</td>
<td>#[crawlingDomFilterContent]#</td>
<td>#[crawlingDomMaxPages]#</td>
<td>#(withQuery)#no::yes#(/withQuery)#</td>
<td>#(storeCache)#no::yes#(/storeCache)#</td>
<td>#(localIndexing)#no::yes#(/localIndexing)#</td>
<td>#(remoteIndexing)#no::yes#(/remoteIndexing)#</td>
<td>#(deleteButton)#::<form action="IndexCreate_p.html" method="get" enctype="multipart/form-data"><input type="hidden" name="handle" value="#[handle]#" /><input type="submit" name="deleteprofile" value="Delete" /></form>#(/deleteButton)#</td>
</tr>
#{/crawlProfiles}#
</table>
<p id="crawlingStarts"><strong>Recently started remote crawls in progress:</strong></p>
<table border="0" cellpadding="2" cellspacing="1">
<tr class="TableHeader">

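A note on the #(...)# markup in the templates above and below, and on the matching prop.put() calls in the Java servlet that follows: YaCy's template engine picks one of several alternatives, separated by "::", according to the integer the servlet stores under the block's name; #[...]# placeholders are replaced by string values, and #{...}#...#{/...}# sections are repeated once per table row. A concrete pair taken from the removed code (alternatives counted from 0):

#(info)#<!-- 0 -->::<!-- 1 -->...::<!-- 4 -->Crawling paused successfully.::...#(/info)#
prop.put("info", 4);   // old IndexCreate_p.java: renders alternative 4, "Crawling paused successfully."

This is why the hunks below renumber those calls (info 4 becomes info 1 for "crawling paused", info 5 becomes info 2 for "crawling continued"): the old info block was removed from IndexCreate_p.html, presumably leaving a shortened block whose new version is not shown in this excerpt.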
View File

@@ -43,29 +43,13 @@
// javac -classpath .:../classes IndexCreate_p.java
// if the shell's current path is HTROOT
import java.io.File;
import java.io.IOException;
import java.io.Writer;
import java.net.MalformedURLException;
import java.util.Date;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;
import de.anomic.data.wikiCode;
import de.anomic.htmlFilter.htmlFilterContentScraper;
import de.anomic.htmlFilter.htmlFilterWriter;
import de.anomic.http.httpHeader;
import de.anomic.kelondro.kelondroBitfield;
import de.anomic.plasma.plasmaURL;
import de.anomic.net.URL;
import de.anomic.plasma.plasmaCrawlEURL;
import de.anomic.plasma.plasmaCrawlProfile;
import de.anomic.plasma.plasmaSwitchboard;
import de.anomic.server.serverFileUtils;
import de.anomic.server.serverObjects;
import de.anomic.server.serverSwitch;
import de.anomic.server.serverThread;
@@ -81,230 +65,10 @@ public class IndexCreate_p {
plasmaSwitchboard switchboard = (plasmaSwitchboard) env;
serverObjects prop = new serverObjects();
prop.put("error", 0);
prop.put("info", 0);
prop.put("refreshbutton", 0);
if (post != null) {
if (post.containsKey("crawlingstart")) {
// init crawl
if (yacyCore.seedDB == null) {
prop.put("error", 3);
} else {
// set new properties
String newcrawlingfilter = post.get("crawlingFilter", ".*");
env.setConfig("crawlingFilter", newcrawlingfilter);
int newcrawlingdepth = Integer.parseInt(post.get("crawlingDepth", "0"));
env.setConfig("crawlingDepth", Integer.toString(newcrawlingdepth));
boolean crawlingIfOlderCheck = post.get("crawlingIfOlderCheck", "off").equals("on");
int crawlingIfOlderNumber = Integer.parseInt(post.get("crawlingIfOlderNumber", "-1"));
String crawlingIfOlderUnit = post.get("crawlingIfOlderUnit","year");
int crawlingIfOlder = recrawlIfOlderC(crawlingIfOlderCheck, crawlingIfOlderNumber, crawlingIfOlderUnit);
env.setConfig("crawlingIfOlder", crawlingIfOlder);
boolean crawlingDomFilterCheck = post.get("crawlingDomFilterCheck", "off").equals("on");
int crawlingDomFilterDepth = (crawlingDomFilterCheck) ? Integer.parseInt(post.get("crawlingDomFilterDepth", "-1")) : -1;
env.setConfig("crawlingDomFilterDepth", Integer.toString(crawlingDomFilterDepth));
boolean crawlingDomMaxCheck = post.get("crawlingDomMaxCheck", "off").equals("on");
int crawlingDomMaxPages = (crawlingDomMaxCheck) ? Integer.parseInt(post.get("crawlingDomMaxPages", "-1")) : -1;
env.setConfig("crawlingDomMaxPages", Integer.toString(crawlingDomMaxPages));
boolean crawlingQ = post.get("crawlingQ", "off").equals("on");
env.setConfig("crawlingQ", (crawlingQ) ? "true" : "false");
boolean storeHTCache = post.get("storeHTCache", "off").equals("on");
env.setConfig("storeHTCache", (storeHTCache) ? "true" : "false");
boolean localIndexing = post.get("localIndexing", "off").equals("on");
env.setConfig("localIndexing", (localIndexing) ? "true" : "false");
boolean crawlOrder = post.get("crawlOrder", "off").equals("on");
env.setConfig("crawlOrder", (crawlOrder) ? "true" : "false");
boolean xsstopw = post.get("xsstopw", "off").equals("on");
env.setConfig("xsstopw", (xsstopw) ? "true" : "false");
boolean xdstopw = post.get("xdstopw", "off").equals("on");
env.setConfig("xdstopw", (xdstopw) ? "true" : "false");
boolean xpstopw = post.get("xpstopw", "off").equals("on");
env.setConfig("xpstopw", (xpstopw) ? "true" : "false");
String crawlingMode = post.get("crawlingMode","url");
if (crawlingMode.equals("url")) {
// getting the crawljob start url
String crawlingStart = post.get("crawlingURL","");
crawlingStart = crawlingStart.trim();
// adding the prefix http:// if necessary
int pos = crawlingStart.indexOf("://");
if (pos == -1) crawlingStart = "http://" + crawlingStart;
// normalizing URL
try {crawlingStart = new URL(crawlingStart).toNormalform();} catch (MalformedURLException e1) {}
// check if url is proper
URL crawlingStartURL = null;
try {
crawlingStartURL = new URL(crawlingStart);
} catch (MalformedURLException e) {
crawlingStartURL = null;
}
// check if pattern matches
if ((crawlingStartURL == null) /* || (!(crawlingStart.matches(newcrawlingfilter))) */) {
// print error message
prop.put("error", 4); //crawlfilter does not match url
prop.put("error_newcrawlingfilter", newcrawlingfilter);
prop.put("error_crawlingStart", crawlingStart);
} else try {
// check if the crawl filter works correctly
Pattern.compile(newcrawlingfilter);
// stack request
// first delete old entry, if exists
String urlhash = plasmaURL.urlHash(crawlingStart);
switchboard.wordIndex.loadedURL.remove(urlhash);
switchboard.noticeURL.remove(urlhash);
switchboard.errorURL.remove(urlhash);
// stack url
plasmaCrawlProfile.entry pe = switchboard.profiles.newEntry(crawlingStartURL.getHost(), crawlingStart, newcrawlingfilter, newcrawlingfilter, newcrawlingdepth, newcrawlingdepth, crawlingIfOlder, crawlingDomFilterDepth, crawlingDomMaxPages, crawlingQ, storeHTCache, true, localIndexing, crawlOrder, xsstopw, xdstopw, xpstopw);
String reasonString = switchboard.sbStackCrawlThread.stackCrawl(crawlingStart, null, yacyCore.seedDB.mySeed.hash, "CRAWLING-ROOT", new Date(), 0, pe);
if (reasonString == null) {
// liftoff!
prop.put("info", 2);//start msg
prop.put("info_crawlingURL", ((String) post.get("crawlingURL")));
// generate a YaCyNews if the global flag was set
if (crawlOrder) {
Map m = new HashMap(pe.map()); // must be cloned
m.remove("specificDepth");
m.remove("localIndexing");
m.remove("remoteIndexing");
m.remove("xsstopw");
m.remove("xpstopw");
m.remove("xdstopw");
m.remove("storeTXCache");
m.remove("storeHTCache");
m.remove("generalFilter");
m.remove("specificFilter");
m.put("intention", post.get("intention", "").replace(',', '/'));
yacyCore.newsPool.publishMyNews(new yacyNewsRecord("crwlstrt", m));
}
} else {
prop.put("error", 5); //Crawling failed
prop.put("error_crawlingURL", wikiCode.replaceHTML(((String) post.get("crawlingURL"))));
prop.put("error_reasonString", reasonString);
plasmaCrawlEURL.Entry ee = switchboard.errorURL.newEntry(crawlingStartURL, null, yacyCore.seedDB.mySeed.hash, yacyCore.seedDB.mySeed.hash,
crawlingStartURL.getHost(), reasonString, new kelondroBitfield());
ee.store();
switchboard.errorURL.stackPushEntry(ee);
}
} catch (PatternSyntaxException e) {
prop.put("error", 8); //crawlfilter does not match url
prop.put("error_newcrawlingfilter", newcrawlingfilter);
prop.put("error_error", e.getMessage());
} catch (Exception e) {
// something went wrong
prop.put("error", 6);//Error with url
prop.put("error_crawlingStart", crawlingStart);
prop.put("error_error", e.getMessage());
e.printStackTrace();
}
} else if (crawlingMode.equals("file")) {
if (post.containsKey("crawlingFile")) {
// getting the name of the uploaded file
String fileName = (String) post.get("crawlingFile");
try {
// check if the crawl filter works correctly
Pattern.compile(newcrawlingfilter);
// loading the file content
File file = new File(fileName);
// getting the content of the bookmark file
byte[] fileContent = (byte[]) post.get("crawlingFile$file");
// TODO: determine the real charset here ....
String fileString = new String(fileContent,"UTF-8");
// parsing the bookmark file and fetching the headline and contained links
htmlFilterContentScraper scraper = new htmlFilterContentScraper(new URL(file));
//OutputStream os = new htmlFilterOutputStream(null, scraper, null, false);
Writer writer = new htmlFilterWriter(null,null,scraper,null,false);
serverFileUtils.write(fileString,writer);
writer.close();
//String headline = scraper.getHeadline();
HashMap hyperlinks = (HashMap) scraper.getAnchors();
// creating a crawler profile
plasmaCrawlProfile.entry profile = switchboard.profiles.newEntry(fileName, file.toURL().toString(), newcrawlingfilter, newcrawlingfilter, newcrawlingdepth, newcrawlingdepth, crawlingIfOlder, crawlingDomFilterDepth, crawlingDomMaxPages, crawlingQ, storeHTCache, true, localIndexing, crawlOrder, xsstopw, xdstopw, xpstopw);
// loop through the contained links
Iterator iterator = hyperlinks.entrySet().iterator();
int c = 0;
while (iterator.hasNext()) {
Map.Entry e = (Map.Entry) iterator.next();
String nexturlstring = (String) e.getKey();
if (nexturlstring == null) continue;
nexturlstring = nexturlstring.trim();
// normalizing URL
nexturlstring = new URL(nexturlstring).toNormalform();
// generating an url object
URL nexturlURL = null;
try {
nexturlURL = new URL(nexturlstring);
} catch (MalformedURLException ex) {
nexturlURL = null;
c++;
continue;
}
// enqueuing the url for crawling
String rejectReason = switchboard.sbStackCrawlThread.stackCrawl(nexturlstring, null, yacyCore.seedDB.mySeed.hash, (String)e.getValue(), new Date(), 1, profile);
// if something failed add the url into the errorURL list
if (rejectReason == null) {
c++;
} else {
plasmaCrawlEURL.Entry ee = switchboard.errorURL.newEntry(nexturlURL, null, yacyCore.seedDB.mySeed.hash, yacyCore.seedDB.mySeed.hash,
(String) e.getValue(), rejectReason, new kelondroBitfield());
ee.store();
switchboard.errorURL.stackPushEntry(ee);
}
}
} catch (PatternSyntaxException e) {
// print error message
prop.put("error", 8); //crawlfilter does not match url
prop.put("error_newcrawlingfilter", newcrawlingfilter);
prop.put("error_error", e.getMessage());
} catch (Exception e) {
// something went wrong
prop.put("error", 7);//Error with file
prop.put("error_crawlingStart", fileName);
prop.put("error_error", e.getMessage());
e.printStackTrace();
}
}
}
}
}
if (post.containsKey("distributedcrawling")) {
long newBusySleep = Integer.parseInt(env.getConfig("62_remotetriggeredcrawl_busysleep", "100"));
if (post.get("dcr", "").equals("acceptCrawlMax")) {
@@ -328,18 +92,14 @@ public class IndexCreate_p {
if (post.containsKey("pausecrawlqueue")) {
switchboard.pauseCrawlJob(plasmaSwitchboard.CRAWLJOB_LOCAL_CRAWL);
prop.put("info", 4);//crawling paused
prop.put("info", 1);//crawling paused
}
if (post.containsKey("continuecrawlqueue")) {
switchboard.continueCrawlJob(plasmaSwitchboard.CRAWLJOB_LOCAL_CRAWL);
prop.put("info", 5);//crawling continued
prop.put("info", 2);//crawling continued
}
if (post.containsKey("deleteprofile")) {
String handle = (String) post.get("handle");
if (handle != null) switchboard.profiles.removeEntry(handle);
}
}
// define visible variables
@@ -420,43 +180,7 @@ public class IndexCreate_p {
}
// create prefetch table
boolean dark;
// list crawl profiles
int count = 0;
int domlistlength = (post == null) ? 160 : post.getInt("domlistlength", 160);
//try{
Iterator it = switchboard.profiles.profiles(true);
plasmaCrawlProfile.entry profile;
dark = true;
while (it.hasNext()) {
profile = (plasmaCrawlProfile.entry) it.next();
//table += profile.map().toString() + "<br>";
prop.put("crawlProfiles_"+count+"_dark", ((dark) ? 1 : 0));
prop.put("crawlProfiles_"+count+"_name", wikiCode.replaceHTML(profile.name()));
prop.put("crawlProfiles_"+count+"_startURL", wikiCode.replaceHTML(profile.startURL()));
prop.put("crawlProfiles_"+count+"_handle", wikiCode.replaceHTML(profile.handle()));
prop.put("crawlProfiles_"+count+"_depth", profile.generalDepth());
prop.put("crawlProfiles_"+count+"_filter", profile.generalFilter());
prop.put("crawlProfiles_"+count+"_crawlingIfOlder", (profile.recrawlIfOlder() == Long.MAX_VALUE) ? "no re-crawl" : ""+profile.recrawlIfOlder());
prop.put("crawlProfiles_"+count+"_crawlingDomFilterDepth", (profile.domFilterDepth() == Integer.MAX_VALUE) ? "inactive" : ""+profile.domFilterDepth());
prop.put("crawlProfiles_"+count+"_crawlingDomFilterContent", profile.domNames(true, domlistlength));
prop.put("crawlProfiles_"+count+"_crawlingDomMaxPages", (profile.domMaxPages() == Integer.MAX_VALUE) ? "unlimited" : ""+profile.domMaxPages());
prop.put("crawlProfiles_"+count+"_withQuery", ((profile.crawlingQ()) ? 1 : 0));
prop.put("crawlProfiles_"+count+"_storeCache", ((profile.storeHTCache()) ? 1 : 0));
prop.put("crawlProfiles_"+count+"_localIndexing", ((profile.localIndexing()) ? 1 : 0));
prop.put("crawlProfiles_"+count+"_remoteIndexing", ((profile.remoteIndexing()) ? 1 : 0));
prop.put("crawlProfiles_"+count+"_deleteButton", (((profile.name().equals("remote")) ||
(profile.name().equals("proxy")) ||
(profile.name().equals("snippet"))) ? 0 : 1));
prop.put("crawlProfiles_"+count+"_deleteButton_handle", profile.handle());
dark = !dark;
count++;
}
//}catch(IOException e){};
prop.put("crawlProfiles", count);
boolean dark = true;
// create other peer crawl table using YaCyNews
int availableNews = yacyCore.newsPool.size(yacyNewsPool.INCOMING_DB);
@@ -513,9 +237,8 @@ public class IndexCreate_p {
// remote crawl peers
if (yacyCore.seedDB == null) {
//table += "Sorry, cannot show any crawl output now because the system is not completely initialised. Please re-try.";
prop.put("error", 3);
if (yacyCore.seedDB != null) {
prop.put("remoteCrawlPeers", 0);
} else {
Enumeration crawlavail = yacyCore.dhtAgent.getAcceptRemoteCrawlSeeds(plasmaURL.dummyHash, true);
Enumeration crawlpendi = yacyCore.dhtAgent.getAcceptRemoteCrawlSeeds(plasmaURL.dummyHash, false);
@@ -546,22 +269,12 @@
}
prop.put("crawler-paused",(switchboard.crawlJobIsPaused(plasmaSwitchboard.CRAWLJOB_LOCAL_CRAWL))?0:1);
// return rewrite properties
return prop;
}
private static int recrawlIfOlderC(boolean recrawlIfOlderCheck, int recrawlIfOlderNumber, String crawlingIfOlderUnit) {
if (!recrawlIfOlderCheck) return -1;
if (crawlingIfOlderUnit.equals("year")) return recrawlIfOlderNumber * 60 * 24 * 356;
if (crawlingIfOlderUnit.equals("month")) return recrawlIfOlderNumber * 60 * 24 * 30;
if (crawlingIfOlderUnit.equals("day")) return recrawlIfOlderNumber * 60 * 24;
if (crawlingIfOlderUnit.equals("hour")) return recrawlIfOlderNumber * 60;
if (crawlingIfOlderUnit.equals("minute")) return recrawlIfOlderNumber;
return -1;
}
}
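
The deleteprofile branch removed above (post parameter "deleteprofile" plus a profile "handle") is what backs the new Delete buttons in WatchCrawler_p.html below. A minimal sketch of the receiving side, assuming it keeps the same parameters and the same profiles.removeEntry() call; WatchCrawler_p.java itself is not part of this excerpt and the class/method names here are illustrative:

import de.anomic.plasma.plasmaSwitchboard;
import de.anomic.server.serverObjects;

// Illustrative sketch only; not the actual WatchCrawler_p servlet.
final class DeleteProfileSketch {
    // Removes the crawl profile identified by the posted "handle", as the old
    // IndexCreate_p handler did before this commit moved the feature to the monitor.
    static void handleDeleteProfile(plasmaSwitchboard switchboard, serverObjects post) {
        if (post == null || !post.containsKey("deleteprofile")) return;
        String handle = (String) post.get("handle");
        if (handle != null) switchboard.profiles.removeEntry(handle);
    }
}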

View File

@@ -49,7 +49,7 @@
<td width="10">&nbsp;</td>
<td valign="top" ><table border="0" cellpadding="2" cellspacing="1">
<td valign="top"><table border="0" cellpadding="2" cellspacing="1">
<tbody>
<tr class="TableHeader">
<th>Indicator</th>
@@ -66,11 +66,84 @@
<td align="left"><span id="wordcacheSpan">&nbsp;</span></td>
</tr>
</tbody>
</table></td>
</table><br />
<p>
#(info)#<!-- 0 -->
::<!-- 1 -->
Error with profile management. Please stop YaCy, delete the file DATA/PLASMADB/crawlProfiles0.db and restart.
::<!-- 2 -->
Error: #[errmsg]#
::<!-- 3 -->
Application not yet initialized. Sorry. Please wait some seconds and repeat the request.
::<!-- 4 -->
<strong>ERROR: Crawl filter "#[newcrawlingfilter]#" does not match the crawl root "#[crawlingStart]#".</strong> Please try again with a different filter.
::<!-- 5 -->
Crawling of "#[crawlingURL]#" failed. Reason: #[reasonString]#<br>
::<!-- 6 -->
Error with URL input "#[crawlingStart]#": #[error]#
::<!-- 7 -->
Error with file input "#[crawlingStart]#": #[error]#
::<!-- 8 -->
Crawling of "#[crawlingURL]#" started.
<strong>Please be patient; it may take some seconds until the first results appear.</strong>
If you crawl any unwanted pages, you can delete them <a href="IndexCreateWWWLocalQueue_p.html">here</a>.<br />
#(/info)#
</p>
</td>
</tr></table></p>
<p><table border="0" cellpadding="2" cellspacing="1" id="queueTable">
<!-- crawl profile list -->
<p id="crawlingProfiles"><strong>Crawl Profiles:</strong><br />
<table border="0" cellpadding="2" cellspacing="1">
<colgroup>
<col width="120" />
<col />
<col width="16" />
<col width="60" />
<col width="10" span="2" />
<col />
<col width="10" span="5" />
</colgroup>
<tr class="TableHeader">
<td><strong>Crawl Thread</strong></td>
<td><strong>Start URL</strong></td>
<td><strong>Depth</strong></td>
<td><strong>Filter</strong></td>
<td><strong>MaxAge</strong></td>
<td><strong>Auto Filter Depth</strong></td>
<td><strong>Auto Filter Content</strong></td>
<td><strong>Max Pages Per Domain</strong></td>
<td><strong>Accept '?' URLs</strong></td>
<td><strong>Fill Proxy Cache</strong></td>
<td><strong>Local Indexing</strong></td>
<td><strong>Remote Indexing</strong></td>
<td></td>
</tr>
#{crawlProfiles}#
<tr class="TableCell#(dark)#Light::Dark#(/dark)#">
<td>#[name]#</td>
<td><a href="#[startURL]#">#[startURL]#</a></td>
<td>#[depth]#</td>
<td>#[filter]#</td>
<td>#[crawlingIfOlder]#</td>
<td>#[crawlingDomFilterDepth]#</td>
<td>#[crawlingDomFilterContent]#</td>
<td>#[crawlingDomMaxPages]#</td>
<td>#(withQuery)#no::yes#(/withQuery)#</td>
<td>#(storeCache)#no::yes#(/storeCache)#</td>
<td>#(localIndexing)#no::yes#(/localIndexing)#</td>
<td>#(remoteIndexing)#no::yes#(/remoteIndexing)#</td>
<td>#(deleteButton)#::<form action="WatchCrawler_p.html" method="get" enctype="multipart/form-data"><input type="hidden" name="handle" value="#[handle]#" /><input type="submit" name="deleteprofile" value="Delete" /></form>#(/deleteButton)#</td>
</tr>
#{/crawlProfiles}#
</table></p>
<!-- crawl queues -->
<p id="crawlingQueues"><strong>Crawl Queue:</strong><br />
<table border="0" cellpadding="2" cellspacing="1" id="queueTable">
<tbody>
<tr class="TableHeader">
<th>Queue</th>