karlchenofhell
6fbe31425a
- some code-cleanup (no more syntax-warnings here)
...
- added deletion from loadedURLs of URLs to be blacklisted in IndexControl_p
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3404 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-02-26 12:56:50 +00:00
orbiter
f3f99b19c6
extended search statistics
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3249 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-19 01:45:29 +00:00
karlchenofhell
b3a2c79fec
- fix for last fix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3202 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-14 17:47:25 +00:00
karlchenofhell
4b23e79f51
- quick-fix for http://www.yacy-forum.de/viewtopic.php?t=3358
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3201 6c8d7289-2bf4-0310-a012-ef5d649a1542
2007-01-14 17:31:08 +00:00
karlchenofhell
be941c4475
- "javascript:"-URLs are recognized as well (as intended formerly I assume)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3097 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-18 20:19:02 +00:00
karlchenofhell
a619ba3f49
- fix for String index out of range during URL parsing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3096 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-12-18 19:32:01 +00:00
orbiter
d34f10c63d
some tests with reverse dns lookup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2954 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-11-12 00:28:10 +00:00
orbiter
bd4f43cd66
- fixed a null pointer exception bug
...
- switched off more write caches
- re-enabled index-abstracts search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2885 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-31 02:45:41 +00:00
theli
d38ef0493d
*) be more tolerant against missing ports in url
...
"http://yacy.net:/" is now interpreted as "http://yacy.net/"
See: http://www.yacy-forum.de/viewtopic.php?p=27102
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2852 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-24 05:22:54 +00:00
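Note on d38ef0493d: a minimal sketch of the behaviour the message describes, where a colon with no port number after the host is dropped. The class name `EmptyPortSketch`, the method `dropEmptyPort`, and the regex approach are assumptions for illustration and are not taken from YaCy's URL.java.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the YaCy implementation.
class EmptyPortSketch {
    // Matches "scheme://authority" followed by a ':' that carries no digits,
    // i.e. the ':' is directly followed by '/', '?', '#' or the end of input.
    private static final Pattern EMPTY_PORT =
        Pattern.compile("^([a-zA-Z][a-zA-Z0-9+.-]*://[^/?#]*?):(?=[/?#]|$)");

    // Returns the URL without the dangling ':'; other URLs are returned unchanged.
    static String dropEmptyPort(String url) {
        Matcher m = EMPTY_PORT.matcher(url);
        return m.find() ? m.replaceFirst("$1") : url;
    }

    public static void main(String[] args) {
        System.out.println(dropEmptyPort("http://yacy.net:/"));
        System.out.println(dropEmptyPort("http://yacy.net:8080/"));
    }
}
```

A URL with an explicit port such as "http://yacy.net:8080/" does not match the pattern and passes through untouched.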
theli
cfe54fedc7
*) Bugfix for resolveBackpath problem with trailing /..
...
*) Junit testclass for resolveBackpath testing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2850 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-24 05:07:34 +00:00
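Note on cfe54fedc7: a sketch of one way to resolve ".." segments in an absolute path, including the trailing "/.." case the message mentions. Only the name `resolveBackpath` comes from the log; the class `BackpathSketch`, the signature, and the segment-stack approach are assumptions and may differ from what YaCy actually does.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not the YaCy implementation.
class BackpathSketch {
    // Removes "." and ".." segments from an absolute path.
    // A trailing ".." or "." leaves a directory, so the result ends with '/'.
    static String resolveBackpath(String path) {
        if (path.isEmpty() || path.charAt(0) != '/') return path; // absolute paths only
        Deque<String> kept = new ArrayDeque<>();
        String[] segments = path.substring(1).split("/", -1);
        for (int i = 0; i < segments.length; i++) {
            String s = segments[i];
            boolean last = (i == segments.length - 1);
            if (s.equals("..")) {
                if (!kept.isEmpty()) kept.removeLast(); // step up one level, never above root
                if (last) kept.addLast("");             // keep the trailing slash
            } else if (s.equals(".")) {
                if (last) kept.addLast("");
            } else {
                kept.addLast(s);
            }
        }
        return "/" + String.join("/", kept);
    }

    public static void main(String[] args) {
        System.out.println(resolveBackpath("/a/b/../c"));
        System.out.println(resolveBackpath("/a/b/.."));
    }
}
```

Treating a trailing ".." as a directory reference follows the usual dot-segment removal convention for URIs; whether the original fix chose the same result is not stated in the log.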
karlchenofhell
d13b381f83
- added mint-green skin
...
- removed test-urls because of problems with text-encoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2832 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-21 10:24:32 +00:00
karlchenofhell
ce237aefad
- assortment-sizes table from PerformanceQueues_p.html is not shown if not used
...
- escape query- and fragment-part of an url as well
- new resolveBackpath for urls: http://www.yacy-forum.de/viewtopic.php?t=2679#24867
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2815 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-19 15:27:24 +00:00
karlchenofhell
b14a500b88
- removed debug output from PerformanceMemory_p
...
- added URL escaping (tested, nevertheless watch out for possibly broken URLs)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2797 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-18 14:51:37 +00:00
orbiter
bcf2b800b4
applied UTF-8 encoding parameter to yacy-internal protocol communication
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2694 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-02 13:35:38 +00:00
orbiter
5a40ea7866
refactoring of wget string list generation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2692 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-10-02 09:59:20 +00:00
orbiter
df1629b05a
- code cleanup
...
- version 0.471
- moved surftipps to own web page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-29 22:27:20 +00:00
borg-0300
f18304ddd3
unused/not needed imports removed;
...
properties added;
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2628 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-18 22:21:18 +00:00
orbiter
1e7fd48afd
added size method to ftpc
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2508 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-07 12:21:41 +00:00
orbiter
d4c5e2af01
html-dirlist can now also be generated from existing connections
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2493 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-05 10:11:07 +00:00
theli
7930839594
*) URL.java: userinfo was not taken over when generating a new url from a base url and a rel. path
...
*) CrawlWorker.java: using new dirhtml function of ftpc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2492 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-05 05:17:57 +00:00
orbiter
17ba468165
added html dirlisting generation in ftpc.java:
...
ftpc.dirhtml() generates a StringBuffer with a complete web page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2491 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-05 00:11:59 +00:00
theli
ab5a9bee66
*) adding some copyright headers
...
*) next step of restructuring for new crawlers
- adding first testversion of ftp crawler class
-- does not create a htCache entry yet
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2483 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 14:38:29 +00:00
theli
5847492537
*) next step of restructuring for new crawlers
...
- IndexCreate_p.java: correcting problems with ftp urls
- URL.java does not cutout the userinfo anymore
(needed to transport authentication info in ftp urls, e.g. ftp://username:pwd@ftp.irgendwas.de)
- plasmaCrawlLoader.java:
-- hack to re-enable https urls
-- adding function getSupportedProtocols
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2482 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 13:17:11 +00:00
orbiter
6cce47e217
test of ftp-urls in URL class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2481 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-09-04 13:10:40 +00:00
orbiter
f933f00f09
another patch to URL protocol handling for 'news', 'nntp' etc:
...
reject it! (the java.net.URL class rejects them too)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2432 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-21 01:04:04 +00:00
orbiter
4c6e00d80a
more bugfixes for URL class, see:
...
http://www.yacy-forum.de/viewtopic.php?p=24844#24844
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2431 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-21 00:23:39 +00:00
orbiter
b7dc251948
fixed bugs in url class:
...
- correct backpath ('..') handling
- correct absolute path handling
- included https
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2428 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-19 22:27:01 +00:00
orbiter
276225d79e
fix for URL class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2423 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-18 21:33:00 +00:00
orbiter
f43c90fa98
fixed handling of null referer in crawlOrder
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2384 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 21:46:34 +00:00
orbiter
abf22f6e60
removed url normalform computation from htmlFilterContentScraper.
...
This method was implemented in de.anomic.net.URL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2377 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-08-11 15:09:22 +00:00
theli
0db237467f
*) bugfix for URL generation from file
...
see: http://www.yacy-forum.de/viewtopic.php?p=24116
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2326 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-25 16:18:45 +00:00
orbiter
e20ff77c10
another bugfix in new url class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2318 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 11:37:22 +00:00
orbiter
685430a1b5
bugfix in new URL class, better logging for domain extraction
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2317 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 11:33:01 +00:00
orbiter
79af283f6c
better debugging in new URL class for wrong port numbers
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2315 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-22 10:21:24 +00:00
orbiter
4bd626572b
added hashCode and compareTo to new URL class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2301 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-18 12:00:54 +00:00
theli
a70cbd959b
*) further improvements for the anomic.net.url class
...
- relpath starting with javascript: are ignored now
- bugfix for concatenation of relpath starting with # or ?
in this case no slash should be added to the baseURL, otherwise
we get URLs of the form http://test.de/index.html/?param=value
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2298 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-18 05:12:08 +00:00
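Note on a70cbd959b: a sketch of the concatenation rule described in the message, where a relative part starting with "#" or "?" is appended to the base without an extra slash. The class `RelPathJoinSketch` and the method `join` are hypothetical; relative parts that start with "/" or carry their own scheme are deliberately out of scope, and the base is assumed to be an absolute URL containing "://".

```java
// Hypothetical helper, not the YaCy implementation.
class RelPathJoinSketch {
    static String join(String base, String rel) {
        int hash = base.indexOf('#');
        if (hash >= 0) base = base.substring(0, hash);    // a base fragment never carries over
        if (rel.startsWith("#")) return base + rel;       // no slash: same document, new fragment
        int query = base.indexOf('?');
        if (query >= 0) base = base.substring(0, query);  // a new query replaces the old one
        if (rel.startsWith("?")) return base + rel;       // no slash: same path, new query
        // Plain relative name: replace the last path segment of the base.
        int authority = base.indexOf("://") + 3;          // assumes "://" is present
        int lastSlash = base.lastIndexOf('/');
        String dir = lastSlash >= authority ? base.substring(0, lastSlash + 1) : base + "/";
        return dir + rel;
    }

    public static void main(String[] args) {
        System.out.println(join("http://test.de/index.html", "?param=value"));
        System.out.println(join("http://test.de/index.html", "#top"));
    }
}
```

With this rule the example from the message yields http://test.de/index.html?param=value instead of http://test.de/index.html/?param=value.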
theli
8a1f1d96b3
*) Bugfix for url concatenation. Relative urls with / or http:// at the beginning
...
were not handled correctly on url concatenation via new URL(URL,relPath).
See: http://www.yacy-forum.de/viewtopic.php?t=2623
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2297 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-18 04:48:18 +00:00
orbiter
3879a0ecd0
replaced java.net.URL usage by use of new class de.anomic.net.URL
...
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparison.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-07-13 01:21:53 +00:00
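Note on 3879a0ecd0: the motivation is that `java.net.URL.equals` and `hashCode` may resolve host names, so comparing two URLs can trigger a DNS lookup. The sketch below does not show de.anomic.net.URL; it swaps in the standard `java.net.URI` class, which compares its parsed components without touching the network, purely to illustrate the idea. The class `UrlCompareSketch` and the method `sameUrl` are hypothetical names.

```java
import java.net.URI;

// Hypothetical helper illustrating lookup-free comparison with java.net.URI.
class UrlCompareSketch {
    // Compares two URL strings by their parsed components only; no host resolution.
    static boolean sameUrl(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    public static void main(String[] args) {
        System.out.println(sameUrl("http://yacy.net/index.html", "http://yacy.net/index.html"));
        System.out.println(sameUrl("http://yacy.net/a", "http://yacy.net/b"));
    }
}
```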
hermens
d4645062bc
Correct usage of vhost in wget/wput requests:
...
- yacyClient: don't use own .yacyh domain in requests, instead use .yacyh domain of target peer for everything but ranking distribution
- natLib: use full hostname instead of just SLD.TLD
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-06-21 14:25:27 +00:00
orbiter
fd7c17e624
added virtual host support:
...
all yacy-to-yacy communication now sends the <peer-hexhash>.yacyh
virtual domain inside the http 'Host' property field.
This shall enable running a yacy peer on a virtual host.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-05-09 13:11:00 +00:00
theli
a10072b5c0
*) Adding timeout to ftpc.java
...
See: http://www.yacy-forum.de/viewtopic.php?t=2268
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2035 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-04-23 10:35:46 +00:00
orbiter
59fc55ea1e
added checks to protect peers from wrong seeds
...
see also: http://www.yacy-forum.de/viewtopic.php?p=19249#19249
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1939 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-20 20:31:51 +00:00
allo
b3df5ce2b8
another local iprange
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1875 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-03-10 18:34:53 +00:00
allo
32d4c3e4c6
static IP is proper
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1779 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-02-28 08:57:20 +00:00
theli
026dcdfcc0
*) Bugfix from mbirth for ftpc bug
...
See: http://www.yacy-forum.de/viewtopic.php?p=15496#15496
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1381 6c8d7289-2bf4-0310-a012-ef5d649a1542
2006-01-20 13:37:42 +00:00
orbiter
37f88b4017
code cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1176 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-06 23:51:29 +00:00
orbiter
8f1f2daa5e
implemented interactive link deletion of search results.
...
next steps: attach voting and restrict to administrator
to see the deletion button, move the mouse pointer to the left of a search result
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1172 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-06 16:15:21 +00:00
orbiter
3d8a5ae652
code cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1166 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-05 14:24:13 +00:00
orbiter
1d6a6d1f85
code cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1159 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-12-05 00:17:12 +00:00
orbiter
79818a320f
introduced citation-rank transmission protocol and activate transport for anonymisation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1055 6c8d7289-2bf4-0310-a012-ef5d649a1542
2005-11-10 23:48:20 +00:00