Commit Graph

8042 Commits

Author SHA1 Message Date
apfelmaennchen
4f95f72124 YMarks:
- working direct importer for YaCy Crawl Starts
- working direct import for old bookmarks.db

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8052 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 23:10:53 +00:00
orbiter
a635e43f40 fix for global search attribute when selecting extended search options
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8051 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 22:57:15 +00:00
sixcooler
29c2289b5c Merge branch 'master' of https://git.gitorious.org/yacy/rc1.git 2011-11-16 17:15:18 +01:00
sixcooler
605bc4c10e Merge branch 'master' of https://git.gitorious.org/yacy/rc1.git 2011-11-16 16:56:09 +01:00
orbiter
aa322bc6d0 fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8050 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 15:36:30 +00:00
orbiter
97d1347adb also added a default accept field to robots.txt downloads
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8049 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 15:33:55 +00:00
orbiter
f183d3822c added a default accept header to http requests, since some http fraud detection functions check that this header field exists
see also: http://bad-behavior.ioerror.us/ in source file browser.inc.php

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8048 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 15:27:43 +00:00
orbiter
06352b8d6b more logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8047 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 14:09:50 +00:00
orbiter
a99934226e more logging for debugging of robots.txt
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8046 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 13:56:31 +00:00
orbiter
7a5841e061 fix for robot parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8045 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 13:12:46 +00:00
orbiter
458c20ff72 fix for robot parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8044 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 13:06:46 +00:00
sixcooler
e7dedc56f2 Merge branch 'master' of https://git.gitorious.org/yacy/rc1.git 2011-11-16 11:13:03 +01:00
sixcooler
787f6ef039 Merge branch 'master' of https://git.gitorious.org/yacy/rc1.git 2011-11-16 02:05:11 +01:00
orbiter
017a01714d - enhanced logging in robots.txt parser for remote debugging
- robots.txt is now more robust against database operations

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8043 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 01:03:49 +00:00
sixcooler
7545822db5 Merge branch 'master' of https://git.gitorious.org/yacy/rc1.git 2011-11-16 01:59:48 +01:00
orbiter
5a7cec59f3 moved ynetSearch to get all files out of htroot/api/util/
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8042 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-16 00:21:56 +00:00
apfelmaennchen
a410cfd7f3 - flexigrid images didn't load last time
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8041 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-15 21:55:00 +00:00
apfelmaennchen
a8dfe787ed - updated to jquery flexigrid 1.1
- YMarks.html automatically recognizes if a bookmark is a crawl start

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8040 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-15 21:45:17 +00:00
sixcooler
710ea9fcb9 Merge branch 'master' of https://git.gitorious.org/yacy/rc1.git 2011-11-15 18:32:09 +01:00
cominch
cef8ebc41d getpageinfo: Checks if there is an OAI repository behind the URL.
This check is only performed if the oai parameter is set in the call, e.g. getpageinfo_p.xml?actions=oai

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8039 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-15 12:22:19 +00:00
sixcooler
0aa5e134ea Merge branch 'master' of https://git.gitorious.org/yacy/rc1.git 2011-11-15 02:31:27 +01:00
orbiter
eb1c7c041d write info about robots.txt evaluation into getpageinfo_p.xml
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8038 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-15 00:33:54 +00:00
orbiter
f8b8c82421 - refactoring of getpageinfo_p.xml (moved out of util)
- added more logging in getpageinfo_p.xml

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8037 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-15 00:22:40 +00:00
apfelmaennchen
abba31f02e - bugfix for correctly sorting ymarks
- some tuning for the autotagger (still not perfect)
- /api/ymarks/get_metadata.xml now provides info for crawlstarts
- removed unused code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8036 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-14 22:00:44 +00:00
orbiter
ff32469272 added a link to /api/util/getpageinfo_p.xml as API to crawl start info and to ViewFile.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8035 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-14 20:19:41 +00:00
sixcooler
3b70ff7046 Merge branch 'master' of https://git.gitorious.org/yacy/rc1.git 2011-11-14 19:25:30 +01:00
orbiter
3a15e58e28 - increased stability when opening the robots table
- increased stability when deleting tables

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8034 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-14 15:33:35 +00:00
orbiter
775b44017e refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8033 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-14 15:11:57 +00:00
sixcooler
c99a4c0920 Merge branch 'master' of https://git.gitorious.org/yacy/rc1.git 2011-11-14 14:07:58 +01:00
orbiter
e914a30099 fix for npe
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8032 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-14 12:32:15 +00:00
sixcooler
b92c6bf897 Trying ImageIO instead of awt-Toolkit for parsing 2011-11-14 12:37:11 +01:00
sixcooler
db5ef90b0f Merge branch 'master' of https://git.gitorious.org/yacy/rc1.git 2011-11-14 12:22:57 +01:00
sixcooler
69dcde5cc6 not checking for the pid-file 2011-11-14 12:21:36 +01:00
sixcooler
9f8240b350 script for clean copy of URL-tables 2011-11-14 12:20:59 +01:00
sixcooler
5c58eda45a custom start-script 2011-11-14 12:20:33 +01:00
sixcooler
f40fef8243 custom logging settings 2011-11-14 12:19:58 +01:00
sixcooler
7cf8fac83f some filtering 2011-11-14 12:19:27 +01:00
sixcooler
3ef9f301ba some customization of the Memory-Performance-Graph 2011-11-14 12:16:07 +01:00
sixcooler
8f25070460 weekly rewrite of blobs 2011-11-14 12:14:07 +01:00
sixcooler
d6c1ab4e0f some more unreserved characters 2011-11-14 12:11:22 +01:00
sixcooler
f522f61af0 clean offline copy of URL Tables 2011-11-14 12:09:34 +01:00
sixcooler
ee2f8673a2 memory in Performance Graph - just like it was in the past 2011-11-14 12:08:01 +01:00
sixcooler
2a6712e4be sixcooler.de in seed-list bootstrap locations 2011-11-14 12:06:05 +01:00
sixcooler
54193457bc custom keep alive strategy 2011-11-14 11:54:48 +01:00
sixcooler
249a78ff2a G1 Memory Strategy - not used now 2011-11-14 11:54:03 +01:00
sixcooler
ccf1583188 custom keep alive strategy 2011-11-14 11:52:29 +01:00
sixcooler
f280e339a8 no force on Memory Request for these parsers 2011-11-14 11:46:30 +01:00
apfelmaennchen
5f7dbe1c42 - some refactoring (ymarks)
- improvement for autotagger (is now able to create/detect multi-word tags, e.g. 'open source')

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8031 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-13 23:19:47 +00:00
sixcooler
6567244f2a git testing: 2011-11-12 12:26:42 +01:00
apfelmaennchen
2f03186252 - small bug fix
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8030 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-12 09:25:08 +00:00