Michael Peter Christen
a33e2742cb
- removed unnecessary synchronized and deadlock in crawler
- removed problem with monitoring object on Balancer.wait
- added missing user agent settings
2012-10-28 19:56:02 +01:00
Michael Peter Christen
1533bfd63b
refactoring
2012-09-25 21:20:03 +02:00
Michael Peter Christen
8219a445f3
refactoring
2012-09-21 16:46:57 +02:00
Michael Peter Christen
00c1c777fa
refactoring
2012-09-21 15:48:16 +02:00
Michael Peter Christen
24d9db1613
snippet retrieval loading processes may use a smaller minimum load time
value than crawling processes. This speeds up the search result
preparation dramatically.
2012-07-30 10:38:23 +02:00
orbiter
0cbda0b2b8
- replaced all length() == 0 and size() == 0 with isEmpty()
- replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be
done automatically
- implemented some isEmpty() methods
2012-07-10 22:59:03 +02:00
Michael Peter Christen
b0c408788b
made class methods static where possible
2012-07-05 12:38:41 +02:00
Michael Peter Christen
1825f165b8
better integration of blacklist according to use case
2012-07-02 13:57:29 +02:00
Michael Peter Christen
77f795756c
fixing redirects and status codes: storing of status code in
ResponseHeader to make it available for late evaluations, like storage
in solr.
2012-06-25 18:17:31 +02:00
Michael Peter Christen
b9d42fd9c8
using com.google.common.io.Files instead of homebrew methods
2012-06-22 11:39:17 +02:00
Michael Peter Christen
0f82fb3628
using double instead of float for better release ordering
2012-05-30 15:28:20 +02:00
Michael Peter Christen
71c3163f3d
- fixes to node identification
- added link to node in network list
- added marking of portal search node peers
2012-05-29 01:38:54 +02:00
Michael Peter Christen
046f3a7e8d
check if httpc has decompressed the release file and rename the file
from .tar.gz to .tar if that happened
2012-04-16 09:50:55 +02:00
Michael Peter Christen
7e4e3fe5b6
free some memory after parsing html
2012-02-02 09:55:27 +01:00
Michael Peter Christen
ef5192f8c9
using the generic document parser for crawl starts instead of the html
parser. This makes it possible that every type of document can be a
crawl start point, not only text documents or html documents. Tested
this with a pdf document.
2012-01-23 17:27:29 +01:00
orbiter
402e9d71ef
changed ordering of release files: the main criterion is no longer the svn number; releases are now ordered by
- release number
- date
- svn number
additionally there is a new option to remove the svn number completely
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8135 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-12-04 07:22:13 +00:00
orbiter
a7df70221e
refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7987 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-04 09:06:24 +00:00
orbiter
d2ea250d99
refactoring:
- moved many classes from de.anomic to net.yacy
- made more sub-packages for search classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-25 16:59:06 +00:00