Commit Graph

1377 Commits

Author SHA1 Message Date
orbiter
f763923e0a added missing files for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1057 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-11 08:02:46 +00:00
theli
9649d08171 *) More tolerant robots parser
   - converting tabs to spaces
   - cutting off '*' in the disallow section

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1056 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-11 07:49:54 +00:00
orbiter
79818a320f introduced citation-rank transmission protocol and activate transport for anonymisation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1055 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-10 23:48:20 +00:00
borg-0300
9a441e8e77 new Ranking Images
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1054 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-09 13:24:41 +00:00
theli
6f8d7d3bcd *) Adding first version of YaCy bookmarklet
   - this can be used to easily crawl a webpage which is currently opened in the browser
   - to get the bookmarklet javascript simply call http://localhost:8000/QuickCrawlLink_p.html
     and drag and drop the link shown to your browser's toolbar/link bar.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1053 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-08 12:14:51 +00:00
theli
7e0647f692 *) Bugfix for userDB usage during authentication
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1052 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-08 10:17:12 +00:00
hydrox
86c74d209d *) fix for Settings_p.html (wrong variablename in link)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1051 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-08 09:02:26 +00:00
hydrox
886955e38c *) fix for last commit (wrong filename)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1050 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-08 08:58:18 +00:00
hydrox
88669ce008 *) cleaned up Settings_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1049 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-08 08:40:18 +00:00
theli
93cadb47b9 *) More tolerant robots parser for robots files that are missing empty lines between rule blocks
See: http://www.yacy-forum.de/viewtopic.php?p=12471

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1048 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-08 07:41:25 +00:00
orbiter
02f8013013 auto-delete of corrupted word files during word-migration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1047 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-07 14:57:37 +00:00
orbiter
d2731418bf added creation of global ranking files and changed url normal form usage
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1046 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-07 12:33:02 +00:00
theli
6f9f8ed8f8 *) Automatic Reset of Stack Crawler DB on startup errors
See: http://www.yacy-forum.de/viewtopic.php?t=1432

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1045 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-07 12:19:05 +00:00
theli
fb766413d1 *) Changes on httpc dns caching
   - Bugfix: old dns cache did not handle case insensitive hostnames correctly.
   - adding a possibility to set domain name patterns defining hostnames that should not be cached by the httpc dns cache
     e.g. borg-300.dyndns.org
     This can be done by setting the new httpc.nameCacheNoCachingPatterns property
   - using httpc.dnsResolve wherever possible within the sourcecode
     [httpd.java,plasmaCrawlStacker.java]

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1044 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-07 10:57:54 +00:00
allo
89a4cca4df max. num of Entries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1043 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-07 06:53:44 +00:00
allo
8d8d866494 Bugfix for catch up late Peer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1042 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-07 06:52:30 +00:00
orbiter
bc420c62f6 fixed htcache path generation (never change a running system)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1041 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-07 01:31:11 +00:00
borg-0300
795f488222 new urlNormalform version
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1040 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-06 22:35:56 +00:00
orbiter
c86d801b0f removed dyndns domains from dns caching
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1039 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-06 22:12:08 +00:00
orbiter
6dc42a2392 detecting of loops in kelondroTree during last/first-Node search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1038 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-06 21:06:55 +00:00
borg-0300
17d2830394 see: http://www.yacy-forum.de/viewtopic.php?t=1416
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1037 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-06 17:24:57 +00:00
theli
dd24f0252f *) Searchword highlighting for info page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1036 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-06 06:27:17 +00:00
theli
f9fb284fb7 *) Better handling of robots.txt files with incorrect keywords
See: http://www.yacy-forum.de/viewtopic.php?p=12292#12292

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1035 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-06 06:01:08 +00:00
borg-0300
a1406f4617 urlNormalform: no logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1034 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-05 16:21:04 +00:00
borg-0300
72cde1d894 getCachePath: no logging
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1033 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-04 22:47:13 +00:00
borg-0300
1fbd72f9e0 rename "index.html" to "ndx"
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1032 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-04 22:39:33 +00:00
orbiter
26c3f4aa5b link update as requested by domain owner
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1031 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-04 19:59:25 +00:00
borg-0300
cd1107d85e added support for URLs with '?&'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1030 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-04 17:25:15 +00:00
borg-0300
5fb2b017cb small change
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1029 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-04 16:37:56 +00:00
borg-0300
60e869f236 bugfix for last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1028 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-04 15:52:12 +00:00
borg-0300
544e4ea90e small change
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1027 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-04 14:11:46 +00:00
borg-0300
00ab4d8723 cleaned, small change, Properties
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1026 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-04 13:41:51 +00:00
borg-0300
440e6ed747 see http://www.yacy-forum.de/viewtopic.php?t=1416
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1025 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-03 23:49:50 +00:00
theli
b8ceb1ffde *) Adding better https support for crawler
   - solving problems with unknown certificates by implementing a dummy trust Manager
   - adding https support to robots-parser
   - Seed File can now be downloaded from https resources
   - adapting plasmaHTCache.java to support https URLs properly

*) URL Normalization
   - sub URLs are now normalized properly during indexing
   - pointing urlNormalForm function of plasmaParser to htmlFilterContentScraper function
   - normalizing URLs which were received by a crawlOrder request

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1024 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-03 15:28:37 +00:00
borg-0300
d2507c6081 rename setJunior()... to orJunior()...,
added javadoc, 
added getPeerType(), setIP(), setPort(String port)


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1023 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-03 14:16:16 +00:00
borg-0300
e3179a6394 added getOwnSeedFile()
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1022 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-03 14:07:58 +00:00
borg-0300
a803a509ae bugfix: port handling in HTCache
program flow cleared up


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1021 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-03 12:39:24 +00:00
hydrox
2c5999ae00 *)fixed UNRESOLVED PATTERN in ViewLog_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1020 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-03 10:44:53 +00:00
theli
f871408729 *) sharedBlacklist_p.java
   - Setting Pragma: no-cache
   - increasing timeout to 12 sec.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1019 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-03 08:32:43 +00:00
theli
3d0dfd4df4 *) Using StringBuffer instead of String concatenation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1018 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-03 08:08:37 +00:00
low012
452db479cd *) bugfix: "21" was displayed as "21" in yacyWiki
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1017 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-02 23:28:54 +00:00
hydrox
cb69047b91 *)cleanup access static methods and fields
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1016 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-02 17:56:26 +00:00
hydrox
56b9f34411 *)removed unused imports
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1015 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-02 16:30:45 +00:00
hydrox
62b6c2b9e7 *)added news count to News.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1014 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-02 10:31:52 +00:00
orbiter
5f68b6886b introduced new url-hashes for better ranking computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1013 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-02 00:54:55 +00:00
orbiter
aadace1285 fixed network image in search performance monitor
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1012 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-01 00:49:13 +00:00
orbiter
bb369c98de fixed search result ordering by date
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1011 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-31 17:17:48 +00:00
hydrox
295aff52a3 *)added offline-browsing-support (onlineMode=0)
*)online-mode now can be changed in Status.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1010 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-31 12:25:40 +00:00
orbiter
4d1e56e4d9 fixed intermission-bug (removed 'break for intermission' of httpd-thread)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1009 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-31 10:46:13 +00:00
orbiter
b058ecf0bc refactoring of image-generation; added experimental PNG encoder (not active now)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1008 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-10-31 02:43:55 +00:00