15 Commits

Author SHA1 Message Date
orbiter
94f3d90af2 added a hint about regular expressions in crawl start
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6021 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-04 20:03:26 +00:00
apfelmaennchen
9ab009b16b fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1890#p13476
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5755 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-30 07:33:43 +00:00
auron_x
03a16f6c20 - more XHTML-validation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5580 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-02-06 14:45:56 +00:00
lotus
7e011de34e hint for recrawls
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5537 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-01-30 08:24:27 +00:00
orbiter
10f5ec1040 reverted last commit (more testing needed)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5356 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-22 00:12:50 +00:00
orbiter
dba7ef5144 extended crawling constraints:
- removed never-used secondary crawl depth
- added a must-not-match filter that can be used to exclude urls from a crawl
- added stub for crawl tags which will be used to identify search results that had been produced from specific crawls
please update the yacybar: replace property name 'crawlFilter' with 'mustmatch'.
Additionally, a new parameter named 'mustnotmatch' can be used; by default it should be the empty string (match-never)

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5342 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-14 09:58:56 +00:00
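The interplay of the two filters described in dba7ef5144 can be sketched roughly as follows. This is an illustration only, not the YaCy implementation: the function name `url_allowed`, the whole-URL matching, and the `.*` default for `mustmatch` are assumptions; only the property names `mustmatch` and `mustnotmatch` and the "empty string means match-never" default come from the commit message.

```python
import re


def url_allowed(url: str, mustmatch: str = ".*", mustnotmatch: str = "") -> bool:
    """Hypothetical helper: decide whether a URL passes both crawl filters."""
    # Assumption: a pattern has to match the whole URL, not just a part of it.
    if re.fullmatch(mustmatch, url) is None:
        return False
    # Per the commit message, an empty must-not-match pattern never matches,
    # so it excludes nothing.
    if mustnotmatch and re.fullmatch(mustnotmatch, url) is not None:
        return False
    return True


if __name__ == "__main__":
    # Restrict the crawl to one host, but skip everything below /private/.
    must = r"http://example\.org/.*"
    must_not = r".*/private/.*"
    for candidate in (
        "http://example.org/index.html",
        "http://example.org/private/notes.html",
        "http://other.net/index.html",
    ):
        print(candidate, url_allowed(candidate, must, must_not))
```

Under these assumptions, a URL is crawled only if it matches `mustmatch` and does not match `mustnotmatch`; leaving `mustnotmatch` empty disables the exclusion.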
lotus
4745e89451 auto-choose crawl type
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5331 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-11-12 14:44:23 +00:00
orbiter
47f0c3b002 replaced the cacheAdmin with the ViewFile servlet, because the cacheAdmin was an interface to the old HTCACHE data structure which does not exist any more. Changed links to point to the ViewFile servlets.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5289 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-21 11:27:50 +00:00
orbiter
7b35d54c6c fixed some problems with network switching (was not completely 'clean')
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5200 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-23 12:11:19 +00:00
daburna
992635c074 translation update
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5107 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-03 13:44:58 +00:00
apfelmaennchen
8d1bedfc3a - added bookmarkTitle to CrawlStart_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5068 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-21 21:07:21 +00:00
apfelmaennchen
e1574fe02e - added autoReCrawl folders to bookmarks (DATA/SETTINGS/autoReCrawl.conf)
- the serverBusyThread checks folders every 60 min. (==> autoReCrawl_idlesleep in yacy.conf)
- added option to create bookmarks from CrawlStart URL

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5033 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-08-04 20:43:36 +00:00
orbiter
8e179f6588 removed option to do a re-crawl with a period of minutes. Such a short period does not make sense and may cause endless indexing loops. Removing the option prevents this misuse.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4964 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-07-01 23:47:33 +00:00
orbiter
40d7f485f3 - fixed several NPE bugs
- fixed losing of own seed hash (hopefully)
- fixed a bug with crawl starts beginning with (bookmark) files
- added better IP recognition during hello process


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4882 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-04 22:24:00 +00:00
orbiter
2f381b8d7a - fixed at least two causes for an NPE after a use case switch.
A large refactoring was necessary
- added another crawl start option: automatic restriction to sub-path
- removed crawlStartSimple and renamed crawl start expert
   to crawl start (without expert)
- some changes to texts in crawl start
- added some more deletions when a web index is deleted:
   also delete queues and robots cache


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4881 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-04 21:34:57 +00:00