9687 Commits

Author SHA1 Message Date
orbiter
a9c8046c87 do a light optimization at the end of a crawl postprocessing 2013-07-13 19:09:46 +02:00
orbiter
1b43e02b86 Merge branch 'master' of git://gitorious.org/~quix0r/yacy/quix0rs-yacy-rc1 2013-07-13 18:54:18 +02:00
orbiter
a548354c71 replaced type of solr schema object sku of text_en_splitting_tight by
string
2013-07-13 18:54:09 +02:00
Roland Haeder
59b4fdd5ad Merge remote-tracking branch 'upstream/master' 2013-07-13 15:12:51 +02:00
orbiter
5493389576 stealth mode shall only be available for authorized users, because
unauthorized users can otherwise be monitored by authorized users
2013-07-13 14:49:36 +02:00
Roland Haeder
ebbb3bc5c1 Fixed CHMOD on many files + added missing loggers (e.g. jena) and made some noisy loggers quiet 2013-07-13 13:12:36 +02:00
orbiter
2f1ec8d4a2 npe fix 2013-07-13 11:10:05 +02:00
Michael Peter Christen
bcc623a843 refactoring of load_delay: this is a matter of client identification 2013-07-12 16:24:56 +02:00
orbiter
0d0b3a30f5 activate api actions after postprocessing of crawls 2013-07-12 16:05:48 +02:00
orbiter
3978c5ca5d fix for http://bugs.yacy.net/view.php?id=255 2013-07-12 14:38:30 +02:00
orbiter
2be456e7fb added a postprocessing field into api/status_p.xml to show if the
postprocessing task is running at that time (status: busy) or not
(status:idle)
2013-07-12 14:29:22 +02:00
orbiter
575f913154 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-07-12 14:17:13 +02:00
orbiter
c4efb612e2 added list of crawls to status_p.xml 2013-07-12 14:16:51 +02:00
Lotus
5f666220b7 added files to uninstall 2013-07-11 22:04:01 +02:00
Lotus
bb6caa346c Do not allow automatic update in case YaCy is installed to the Program
Files folder on Windows. There is no permission to write to that folder,
so the update would fail.
2013-07-11 21:50:06 +02:00
Lotus
338cf0d133 Revert "Revert "Windows installer: update logo""
This reverts commit c66631d407.
2013-07-11 21:48:49 +02:00
Lotus
c66631d407 Revert "Windows installer: update logo"
This reverts commit 41cc9be62b.
2013-07-11 21:48:13 +02:00
Lotus
41cc9be62b Windows installer: update logo 2013-07-11 21:46:46 +02:00
Lotus
4fdcfb8230 adapt windows start script parameters to linux start script parameters 2013-07-11 21:46:17 +02:00
orbiter
dac88561ae minimum access time has a tight connection to ClientIdentification,
therefore it is defined there.
2013-07-11 17:04:24 +02:00
Michael Peter Christen
9a29ab469e another patch to prevent CLOSE_WAIT status on solr connections 2013-07-11 12:53:39 +02:00
Michael Peter Christen
5091d627bc fixed parsing of peer flags 2013-07-11 12:53:16 +02:00
Michael Peter Christen
87e9052081 added Connection:close to all http requests in our http client to
prevent CLOSE_WAIT states (as seen in lsof)
2013-07-11 11:54:11 +02:00
orbiter
2a19a60074 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-07-11 10:22:55 +02:00
sixcooler
bff8c753c6 re-insert this file - was deleted by mistake
+ correct another case-typo
2013-07-10 18:32:12 +02:00
orbiter
e609ec388a metager whitelist update 2013-07-10 15:13:04 +02:00
Michael Peter Christen
5c6946dd5f replaced usage of log4j by ConcurrentLog where possible 2013-07-09 14:42:39 +02:00
Michael Peter Christen
5878c1d599 - refactoring of log to ConcurrentLog:
jdk-based loggers tend to block
at java.util.logging.Logger.log(Logger.java:476) in concurrent
environments. This makes logging a main performance issue. To overcome
this problem, this is an add-on to jdk logging that puts log entries on a
concurrent message queue and logs the messages one by one using a
separate process.
- FTPClient uses the concurrent logging instead of the log4j logger
2013-07-09 14:28:25 +02:00
Michael Peter Christen
6d5533c9cd Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-07-09 11:52:20 +02:00
orbiter
c79f687110 enhanced the network scanner: find more hosts automatically by removal
of common subdomains before application of protocol-specific prefix
2013-07-09 11:42:13 +02:00
orbiter
f4f6551c66 better handling of time-out at solrj in case that a commit is done in a
fail-over case during add
2013-07-09 11:01:37 +02:00
orbiter
b4677d1cad fix for bug #252
the naming of the servlet was wrong, the bug may not be present on
systems where upper/lowercase matching is lazy (windows)
2013-07-09 10:50:47 +02:00
Michael Peter Christen
2716dfc46c increase crawler speed by reduction of the busysleep time 2013-07-08 23:40:31 +02:00
Michael Peter Christen
07261fe274 Merge remote-tracking branch 'nutomics/blacklist_structure' 2013-07-08 23:32:15 +02:00
Michael Peter Christen
dea71851d2 - better concurrency for network scanner
- network scanner can now start from the list of all hosts in the search
index
2013-07-08 16:29:30 +02:00
Michael Peter Christen
a34e137e27 fix for citation index generation in case that entry.referrerhash() is
null. This is especially the case if ftp sites are crawled
2013-07-08 16:26:11 +02:00
Michael Peter Christen
a2c8116a8f accept (but ignore) a '+' sign in front of search words 2013-07-08 16:20:40 +02:00
orbiter
9f0cc9b401 enhanced network scanner
- textarea input field can now be used to paste in a large list of hosts
- /31 subnet is possible (only one host)
- auto-detect subdomains for ftp and www subdomains
2013-07-08 13:17:09 +02:00
orbiter
d8354a389c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-07-07 21:31:28 +02:00
Lotus
6e120e90fe do not cut text on submit buttons 2013-07-07 19:17:29 +02:00
orbiter
f8c28efd66 fix for rssTerminal coloring 2013-07-04 21:46:46 +02:00
sixcooler
308d73f855 do not use remote proxy if not switched on - regardless of the proto 2013-07-04 19:16:13 +02:00
sixcooler
69906b1d2e Revert "do not use remote proxy if not switched on - regardless of the proto"
This reverts commit 20f452d228.
2013-07-04 19:13:51 +02:00
sixcooler
20f452d228 do not use remote proxy if not switched on - regardless of the proto 2013-07-04 19:12:50 +02:00
sixcooler
9551720d5c re-enable saved setting for proxy-crawl-profile 2013-07-04 19:10:57 +02:00
sixcooler
d5d8936f9d For indexes that are changing rapidly in NRT situations, fcs (stands for
Field Cache per Segment) may be a better choice than the default fc.
(saves memory)
see: http://wiki.apache.org/solr/SimpleFacetParameters#facet.method
2013-07-04 19:08:53 +02:00
Felix Ableitner
44f8fcf62e Changed class structure of Blacklist. 2013-07-04 18:37:57 +02:00
Michael Peter Christen
3054a6d4b9 added a patch from Sebastian M.B., submitted by email for coloring of
rss terminal
2013-07-04 17:12:19 +02:00
Michael Peter Christen
78af998f8f Merge commit 'fd90fcc4e08f80acbfd1c9a7ec62ce04cd309594' 2013-07-04 16:56:54 +02:00
Michael Peter Christen
57ffdfad4c added a crawl option to obey html-meta-robots-noindex. This is on by
default.
2013-07-03 14:50:06 +02:00