17 Commits

Author SHA1 Message Date
orbiter
11bebe356b fixed crawl start: since SVN 7225 the name of the crawl start url was not given in the input field and therefore all crawl starts contained the empty string as crawl start url
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7229 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-08 22:02:24 +00:00
mikeworks
70576e88d2 de.lng: Added some more untranslated strings I found and uncommented old ones that were removed
terminal_p.html: Put back the old ID which was really easy to find
IndexCreate.js: Because XHTML 1.0 Strict does not allow name tags for some elements rewrote most element access functions to use getElementById
Table_API_p.html and all other html pages: Some XHTML 1.0 Strict fixes, changed checkAll javascript, marked the first row with checkboxes as unsortable where applicable
Table_API_p.java and all other java pages: URLencoded lines with possible ampersands & -> &amp; for validation XHTML 1.0 Strict sourcecode
--> All Index Create pages should validate now. Hope I did not break anything else (too much :-)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7225 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-06 00:00:23 +00:00
orbiter
f6eebb6f99 replaced auto-dom filter with easy-to-understand Site Link-List crawler option
- nobody understood the auto-dom filter without a lengthy introduction about the function of a crawler
- nobody ever used the auto-dom filter other than with a crawl depth of 1
- the auto-dom filter was buggy since the filter did not survive a restart and then a search index contained waste
- the function of the auto-dom filter was in fact to just load a link list from the given start url and then start separate crawls for all these urls restricted by their domain
- the new Site Link-List option shows the target urls in real-time during input of the start url (like the robots check) and gives transparent feedback on what it does before it can be used
- the new option also fits into the easy site-crawl start menu

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7213 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-30 12:50:34 +00:00
mikeworks
b019426811 de.lng: Added German translations for new Index Creation pages RSS Feeds and adapted text in Tables_p.html and CrawlStartExpert_p.html to match some typos, also changed one name tag to id to conform with XHTML 1.0 Strict
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7191 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-26 01:39:51 +00:00
orbiter
58b7417a59 - added a new 'easy' crawl start menu which can be used for the special case of loading a complete domain
- the previous crawl start servlet was renamed to CrawlStartExpert_p
- easy crawl start is now default

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7160 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-09-16 12:02:43 +00:00
orbiter
2f381b8d7a - fixed at least two causes for a NPE after a use case switch.
A large refactoring was necessary
- added another crawl start option: automatic restriction to sub-path
- removed crawlStartSimple and renamed crawl start expert
   to crawl start (without expert)
- some changes to texts in crawl start
- added some more deletions when a web index is deleted:
   delete also queues and robots cache


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4881 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-06-04 21:34:57 +00:00
lulabad
fc54d4519e some more XHTML strict errors
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4471 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-10 09:06:17 +00:00
daburna
3636526bd6 replaced re-crawl/min-age as suggested here: http://forum.yacy-websuche.de/viewtopic.php?f=9&t=198
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4466 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-09 15:15:58 +00:00
daburna
a047e7f830 replaced irritating "re-crawl"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4463 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-02-08 22:47:18 +00:00
orbiter
b183bf6f42 - fixed opensearch bugs
- added 'full domain' button to expert crawl start
- removed non-working 'only one domain' button, the regex allowed crawling of other domains

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4125 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-10-02 21:43:05 +00:00
low012
51800539b2 *) changed regex that is created for crawling filter (see http://forum.yacy-websuche.de/viewtopic.php?t=83)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3945 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-07-01 17:09:57 +00:00
orbiter
5009695537 fix for double-entries of crawl tasks.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3920 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-25 14:16:32 +00:00
orbiter
c7a614830a several bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3899 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-15 17:45:49 +00:00
allo
b2a9080a14 fix for when the user hits cancel
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3820 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-07 19:56:59 +00:00
allo
b68fb8a0ba one \ more
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3819 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-07 19:48:15 +00:00
allo
e24b54301e RegEx, not Blacklist-style RegEx ;/
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3818 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-07 19:46:24 +00:00
orbiter
3f49cd516b split the index create page into two pages:
- one with fewer options but with information about other remote crawls
- one with complete information but without any other information
on both pages the steering options have been removed. They are now at the monitoring page.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3813 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-06-06 22:27:03 +00:00