2904 Commits

Author SHA1 Message Date
Michael Peter Christen
67cd4c37bd activated the new apk parser, which was already finished but not
included in the parser initialization. To make the apk parser usable, the
handling of application type links had to be modified. Now all documents which
do not have a parser attached are placed in the noload queue, while all
other documents are parsed using the associated parser class. This may
have side effects on other parsers and the display of different file
classes (images, apps, videos).
2014-09-24 13:32:58 +02:00
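
The routing rule described in the commit above can be pictured roughly as follows. This is only a minimal sketch: the class `ParserDispatchSketch`, its registry map, and the two lists are hypothetical stand-ins and not YaCy's actual parser or queue classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ParserDispatchSketch {
    // Hypothetical registry: MIME type -> parser name (not YaCy's real API).
    private final Map<String, String> parsers = new HashMap<>();
    // Documents without a parser end up here and are not loaded.
    final List<String> noloadQueue = new ArrayList<>();
    // Documents with a parser are recorded as "parserName:url".
    final List<String> parsedDocuments = new ArrayList<>();

    void register(String mimeType, String parserName) {
        parsers.put(mimeType, parserName);
    }

    // Route a document by whether a parser is attached to its MIME type.
    void dispatch(String url, String mimeType) {
        String parser = parsers.get(mimeType);
        if (parser == null) {
            noloadQueue.add(url);
        } else {
            parsedDocuments.add(parser + ":" + url);
        }
    }

    public static void main(String[] args) {
        ParserDispatchSketch d = new ParserDispatchSketch();
        d.register("application/vnd.android.package-archive", "apkParser");
        d.dispatch("http://example.org/app.apk", "application/vnd.android.package-archive");
        d.dispatch("http://example.org/blob.bin", "application/octet-stream");
        System.out.println("parsed: " + d.parsedDocuments);
        System.out.println("noload: " + d.noloadQueue);
    }
}
```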
orbiter
a922b122a3 added a hack to forward solr search results from an external attached
solr to the YaCy built-in solr search servlet. It's not complete and not
fully correct (there is still a utf8 encoding problem) but it is an easy way
to get requests forwarded through YaCy to an external Solr.
2014-09-22 15:28:54 +02:00
Michael Peter Christen
025516f682 fix for failing crawl limit on the number of pages 2014-09-20 13:06:46 +02:00
Michael Peter Christen
2645dc816a added warning for not well-formed postprocessing queries 2014-09-18 14:36:57 +02:00
Michael Peter Christen
437ce3b8a0 added internal api for partial updates to Solr 2014-09-18 14:26:45 +02:00
orbiter
3ac31614a3 added option to reverse-sort YaCy tables (internal API change only) 2014-09-18 11:11:09 +02:00
Michael Peter Christen
6d3d4c4ea6 changed the concurrent enumeration of query results in such a way that
it is now possible to get the results in two steps:
- first retrieve all IDs as given for a query
- then retrieve each document individually

This was necessary for very large result sets where a query may run for
hours and is possibly terminated by a solr-internal timeout. This occurs
regularly during postprocessing and therefore this commit may fix
unwanted postprocessing terminations.
2014-09-17 13:58:55 +02:00
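
The two-step pattern from the commit above can be sketched as follows. This is an illustration under stated assumptions: `TwoStepRetrievalSketch` and its in-memory map are hypothetical and stand in for the real Solr connector, which is not shown in this log. The point is that no single request has to stay open for the whole result set.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TwoStepRetrievalSketch {
    // Hypothetical in-memory stand-in for the index: document ID -> body.
    private final Map<String, String> index = new LinkedHashMap<>();

    void put(String id, String body) {
        index.put(id, body);
    }

    // Step 1: collect only the IDs of all documents matching the query.
    List<String> matchingIds(String term) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> e : index.entrySet()) {
            if (e.getValue().contains(term)) {
                ids.add(e.getKey());
            }
        }
        return ids;
    }

    // Step 2: load one document by ID; each call is a short, independent lookup.
    String getById(String id) {
        return index.get(id);
    }

    public static void main(String[] args) {
        TwoStepRetrievalSketch s = new TwoStepRetrievalSketch();
        s.put("a", "yacy peer statistics");
        s.put("b", "unrelated text");
        s.put("c", "yacy image search");
        for (String id : s.matchingIds("yacy")) {
            System.out.println(id + " -> " + s.getById(id));
        }
    }
}
```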
Michael Peter Christen
ad35d9294f added a 'stats' table which records some peer statistics twice every
hour. The table can be shown with
http://localhost:8090/Tables_p.html?table=stats

The entries have the following meaning: 
aM: activeLastMonth
aW: activeLastWeek
aD: activeLastDay
aH: activeLastHour
cC: countConnected (Active Senior)
cD: countDisconnected (Passive Senior)
cP: countPotential (Junior)
cR: count of the RWI entries
cI: size of the index (number of documents)

The entry keys are abbreviated to reduce the space in the table as the
name is written again for every row.

This is the beginning of a 'yacystats' micro-alternative as a built-in
function in YaCy. Graphics may follow after some time if enough test
data is available.
2014-09-17 12:54:50 +02:00
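
For readers of the 'stats' table, the abbreviations listed in the commit above can be expanded with a small lookup. The sketch below only restates that list; the class `StatsKeysSketch` and the method `describe` are hypothetical helpers and not part of YaCy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StatsKeysSketch {
    // Abbreviated column key -> meaning, copied from the commit message.
    static final Map<String, String> MEANING = new LinkedHashMap<>();
    static {
        MEANING.put("aM", "activeLastMonth");
        MEANING.put("aW", "activeLastWeek");
        MEANING.put("aD", "activeLastDay");
        MEANING.put("aH", "activeLastHour");
        MEANING.put("cC", "countConnected (Active Senior)");
        MEANING.put("cD", "countDisconnected (Passive Senior)");
        MEANING.put("cP", "countPotential (Junior)");
        MEANING.put("cR", "count of the RWI entries");
        MEANING.put("cI", "size of the index (number of documents)");
    }

    // Returns the long name for a key, or the key itself if it is unknown.
    static String describe(String key) {
        return MEANING.getOrDefault(key, key);
    }

    public static void main(String[] args) {
        for (String key : MEANING.keySet()) {
            System.out.println(key + ": " + describe(key));
        }
    }
}
```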
reger
8284ea751a catch TimeoutException during ping and do not delete yacy.conf during prereadconfigfile
found a situation after a crash (reboot) where the running semaphore still existed but YaCy was not running.
The ping threw an exception, which finally caused the conf file to be deleted (during the pre-read procedure)
- change to ping (catching the exception solved it)
- additionally removed the deletion of the yacy.conf file (if needed, we need to make a backup)
2014-09-16 23:14:13 +02:00
reger
ffa7c7116f better fix for NPE in image search
replace 8931e14514
2014-09-16 16:43:17 +02:00
Michael Peter Christen
759e7d9538 fix for http://forum.yacy-websuche.de/viewtopic.php?p=30720#p30720 2014-09-16 14:53:30 +02:00
Michael Peter Christen
bf18a39d0e replaced warning with info 2014-09-16 14:41:04 +02:00
Michael Peter Christen
f1032fb8fe more enhancements to image search for the case that the search
is restricted to a single domain
2014-09-16 13:41:01 +02:00
Michael Peter Christen
475125f9d7 hack to get more results when doing a remote site search 2014-09-16 00:13:26 +02:00
Michael Peter Christen
81f9b34da7 increased ability to search for all images on a single server within
the p2p remote search
2014-09-15 20:33:22 +02:00
Michael Peter Christen
2c26013c50 better contentdom abstraction 2014-09-15 14:00:41 +02:00
Michael Peter Christen
6a8fb8190b changed default value for maximum number of connections to 50 2014-09-15 13:50:40 +02:00
Michael Peter Christen
ca8b2bf099 removed www and welcome servlet, these had been demo servlets and are
not needed any more
2014-09-15 12:48:58 +02:00
reger
03a7a29db3 limit OAI import urn resolver attempts for the German National Library
The resolver service of the National Library uses the name space nbn; limit the use of nbn-resolving.de to urn:nbn: accordingly
- add resolver for RFCs
2014-09-14 01:38:27 +02:00
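
The namespace restriction from the commit above might look roughly like the sketch below. The URL patterns for nbn-resolving.de and for the RFC target, the `urn:ietf:rfc:` prefix, and the class `UrnResolverSketch` itself are assumptions made for illustration; the commit message does not state what the importer actually uses.

```java
import java.util.Locale;

final class UrnResolverSketch {
    // Assumed URL patterns; the commit does not state the exact targets.
    static final String NBN_RESOLVER = "http://nbn-resolving.de/";
    static final String RFC_BASE = "https://www.rfc-editor.org/rfc/rfc";
    static final String RFC_PREFIX = "urn:ietf:rfc:";

    // Returns a resolvable URL for the URN, or null if no resolver applies.
    static String resolve(String urn) {
        String lower = urn.toLowerCase(Locale.ROOT);
        if (lower.startsWith("urn:nbn:")) {
            // Only the nbn name space is sent to the nbn resolver.
            return NBN_RESOLVER + urn;
        }
        if (lower.startsWith(RFC_PREFIX)) {
            return RFC_BASE + urn.substring(RFC_PREFIX.length());
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(resolve("urn:nbn:de:example-123"));
        System.out.println(resolve("urn:ietf:rfc:2141"));
        System.out.println(resolve("urn:isbn:0451450523"));
    }
}
```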
Michael Peter Christen
0838326a76 changed error message, see http://mantis.tokeek.de/view.php?id=439 2014-09-13 17:02:26 +02:00
reger
b5e0f70197 - remove repositoryPath post from ConfigBasic (obsolete)
- remove static snippetComputationTime from ResultEntry (not used)
2014-09-13 03:21:52 +02:00
reger
8931e14514 fix NPE in image search 2014-09-13 00:27:39 +02:00
Michael Peter Christen
1735dbc9d9 enhanced image search: bugfixes and performance enhancements 2014-09-12 16:37:01 +02:00
Michael Peter Christen
ebd0be2cea fixes and speed updates for search process 2014-09-10 14:24:03 +02:00
Michael Peter Christen
7611bf79bd Merge branch 'master' of gitorious.org:yacy/icewindxs-rc1
Conflicts:
	locales/ru.lng
2014-09-10 13:24:49 +02:00
Michael Peter Christen
524bedc00a fixed text in startup tray icon and added shutdown icon during shutdown 2014-09-10 13:19:08 +02:00
Michael Peter Christen
4709d8417c npe fix for non-tray users 2014-09-08 10:26:28 +02:00
orbiter
5b5635e187 replaced font for boot tray icon with image and added some more images
for further tray icon displays
2014-09-08 00:21:29 +02:00
orbiter
aa6cdc4ab5 speed-up of start process if remote DNS waits for timeout 2014-09-07 12:28:19 +02:00
orbiter
40b3977c21 added an animation of the tray icon during the boot phase of YaCy.
Additionally, there is a tooltip and a new headline at the tray menu
which states the current booting status.
2014-09-07 12:04:35 +02:00
Michael Peter Christen
ec6082c872 very bad language detection hack fix hack 2014-09-05 23:29:09 +02:00
Michael Peter Christen
39615de3f9 adding the buffer size is not wrong but may produce confusing
numbers when the buffer is emptied by a flush whose documents are not yet
available in Solr because Solr is still waiting for a commit. In such cases
the counter would run backwards, which is prevented by ignoring the buffer
size.
2014-09-05 14:57:40 +02:00
Michael Peter Christen
395edec6f1 changed strategy to count the number of documents: get the max of
solr+buffer and the hit cache. This should help during first crawls to
show a running document counter even if there has been no commit to solr
in the meantime. To support that strategy, the hit cache must be written earlier.
2014-09-05 14:50:22 +02:00
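
The counting rule stated in the commit above is plain arithmetic and can be sketched as follows. The class `DocumentCountSketch` and its parameter names are hypothetical; only the formula, the maximum of solr+buffer and the hit cache, is taken from the message.

```java
final class DocumentCountSketch {
    // Counter value as described: max of (solr + buffer) and the hit cache.
    static long displayedCount(long solrCount, long bufferSize, long hitCacheCount) {
        return Math.max(solrCount + bufferSize, hitCacheCount);
    }

    public static void main(String[] args) {
        // First crawl, nothing committed yet: the hit cache drives the counter.
        System.out.println(displayedCount(0, 0, 120));
        // After a commit, the solr count plus pending buffer dominates.
        System.out.println(displayedCount(100, 20, 50));
    }
}
```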
Michael Peter Christen
e87dc08c0d set the correct fail time in error docs 2014-09-05 14:46:11 +02:00
Michael Peter Christen
cfb20bc0ce removing the [] for ipv6 addresses may be a bad idea.. 2014-09-04 18:17:38 +02:00
orbiter
b6d57f06eb enhanced the apk parser (up to being production-ready).
The parser is not yet activated and will be activated after the next release step.
2014-09-04 09:41:42 +02:00
Michael Peter Christen
a7dd89c4de changed method to write the citation index: do not collect references
during document parsing; instead use the same references that would also
be written into the webgraph. This should ensure that the webgraph and
the citation index express exactly the same semantics.
2014-09-02 13:22:12 +02:00
Michael Peter Christen
57ce7eeff3 fixed localhost authorization and replaced the adminRealm with an info
string which is visible in the browser. This allows the browser to
tell the user how to change a forgotten admin password
(during runtime).
2014-09-02 13:15:19 +02:00
orbiter
f318d7c285 enhanced date-ordered ranking 2014-09-01 13:01:30 +02:00
reger
a6891ff7f8 fix Querygoal.parse exception on +/-null-term
covers http://mantis.tokeek.de/view.php?id=452
2014-09-01 00:16:26 +02:00
reger
c7335318eb remove unused legacy procedure from httpserver
(deleted  generateSocketAddress(port) )
2014-08-31 00:33:05 +02:00
Michael Peter Christen
eab0d3e1a9 bugfix for wrong lock display, see
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5321&p=30484#p30484
2014-08-28 12:50:45 +02:00
orbiter
49d4f95faf bugfix to latest commit 2014-08-27 00:16:50 +02:00
orbiter
68211f8244 enable Crawler_p servlet if an RSS feed or a wiki dump import was
submitted.
2014-08-27 00:15:31 +02:00
orbiter
a65df4ce7e do not push noindex errors into log if in intranet mode. noindex
attributes are attached to artificially constructed index.html files which
list directories. Such files are naturally rejected by the crawler and
should not appear in the error log because these files are part of the
construction of file crawlers and confuse users if they see them in the
error log.
2014-08-27 00:10:51 +02:00
orbiter
688c6d8954 Merge branch 'master' of git@gitorious.org:yacy/rc1.git 2014-08-27 00:04:36 +02:00
orbiter
4ae7aead28 addon to latest fix 2014-08-27 00:03:49 +02:00
Marc Nause
2af56fa37d Improved UPnP. (still not perfect)
*) set HTTPS port if enabled
*) improved data structures (may not be final)
*) moved UPnP to own package
2014-08-26 22:47:13 +02:00
orbiter
b3ebd38079 removed the HTDOCS repository concept because the concept to host files
on the YaCy http server is obsolete; YaCy can index file:// and smb://
paths
2014-08-26 19:02:53 +02:00
reger
1fdcc2d67b change seedfile upload ip check to allow intranet ip in intranet mode
- this allows setting up a principal peer in an intranet environment
2014-08-25 01:25:22 +02:00