Commit Graph

9850 Commits

Author SHA1 Message Date
Michael Peter Christen
43ca359e24 Merge branch 'master' of ssh://gitorious.org/yacy/rc1 2013-04-23 21:01:08 +02:00
Michael Peter Christen
2d60dfb3e1 Merge branch 'master' of git://gitorious.org/~saranshupscale/yacy/yacy-india-rc1 2013-04-23 21:00:49 +02:00
orbiter
f7571386a3 added a 'collection' property attribute in yacysearch.html which can be
used to select between different collections as defined during a crawl
start with the 'collection' attribute. This actually implements the
ability to prepare search tenants which restrict their search results to
a specific collection. The main use for this is to provide tenants to
the yaml4 interface (at this time).
2013-04-23 20:42:54 +02:00
Saransh Sharma
04b61e08c8 More Translation 2013-04-23 19:31:17 +05:30
orbiter
3e79bd4b1f Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-04-23 12:15:46 +02:00
orbiter
d571e739b6 increased row limitation for authorized users from 10000 to 100000000 in
solr interface
2013-04-23 12:15:33 +02:00
Michael Peter Christen
d937c55204 extended limitation of dom export size from 100000 to 100000000 2013-04-22 22:33:13 +02:00
Michael Peter Christen
fc2095ac67 some extensions to raster plotter to transform a RGB picture to an
indexed color scheme. This is needed for gif animations
2013-04-22 14:33:04 +02:00
Michael Peter Christen
c1a2175fbc added transparency to gif image animation and the integration to the
YaCy httpd for on-the-fly generated gifs (including animated gifs)
2013-04-21 12:29:05 +02:00
Michael Peter Christen
a1fffe8e86 fixed default ranking values 2013-04-21 12:27:27 +02:00
orbiter
5d442dad82 avoid NPE in regex checker 2013-04-20 10:53:49 +02:00
Michael Peter Christen
24bcf54100 Merge branch 'master' of git://gitorious.org/~saranshupscale/yacy/yacy-india-rc1 2013-04-19 09:55:33 +02:00
Saransh Sharma
b31793f5d6 Hello world 2013-04-19 13:12:23 +05:30
Michael Peter Christen
50421171c3 added new schema fields:
hreflang_url_sxt and hreflang_cc_sxt
for
http://support.google.com/webmasters/bin/answer.py?hl=de&answer=189077

navigation_url_sxt and navigation_type_sxt
for
http://googlewebmastercentral.blogspot.de/2011/09/pagination-with-relnext-and-relprev.html

publisher_url_s
for http://support.google.com/plus/answer/1713826?hl=de

all fields are disabled by default and not written to the index.
2013-04-18 17:21:17 +02:00
Michael Peter Christen
566d6c980c checking of document signature for a double-document check now refers
only to documents within the same domain
2013-04-17 16:15:27 +02:00
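Note on 566d6c980c: the domain-scoped double-document check can be pictured with the sketch below. It is not YaCy code. The function name `find_duplicates`, the `(url, signature)` input shape, and the use of the URL's host name as a stand-in for "domain" are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

def find_duplicates(docs):
    """Return (duplicate_url, first_seen_url) pairs.

    Hypothetical helper: two documents count as doubles only when they
    share both the content signature and the host of their URL.
    """
    seen = {}   # (host, signature) -> first URL seen with that key
    dups = []
    for url, signature in docs:
        key = (urlsplit(url).hostname, signature)
        if key in seen:
            dups.append((url, seen[key]))
        else:
            seen[key] = url
    return dups

docs = [
    ("http://a.example/1", 42),
    ("http://a.example/2", 42),  # same host, same signature: a double
    ("http://b.example/1", 42),  # same signature, other host: not a double
]
duplicates = find_duplicates(docs)
```

Keying on the pair rather than on the signature alone is what keeps identical content on unrelated hosts from being flagged.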
Michael Peter Christen
1d30082446 added hindi translation configuration 2013-04-17 12:57:27 +02:00
Saransh Sharma
ee9d50e4b8 Hindi Some parts only 2013-04-17 14:41:55 +05:30
Michael Peter Christen
d05dc07cff setting of new default values for ranking 2013-04-16 15:02:00 +02:00
Michael Peter Christen
97775fbebc fixed ranking for add-function queries: these did not work, so the
option was removed. All function queries are now boosts (they multiply the
score according to a function). This is also the recommended way to boost
rankings based on functions, as explained in
http://nolanlawson.com/2012/06/02/comparing-boost-methods-in-solr/
2013-04-16 14:45:14 +02:00
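Note on 97775fbebc: a small sketch of why a multiplicative boost tends to behave more predictably than an additive one. This is an illustration only, not Solr or YaCy code. The `rank` helper, the document tuples, and the `scale` parameter (standing in for relevance scores that vary in magnitude between queries) are assumptions.

```python
import operator

def rank(docs, combine, scale=1.0):
    """Order document names by combine(relevance * scale, function_value).

    Hypothetical helper: `combine` is operator.add for an additive
    function query and operator.mul for a multiplicative boost.
    """
    scored = [(combine(rel * scale, fn), name) for name, rel, fn in docs]
    return [name for _, name in sorted(scored, reverse=True)]

# (name, relevance score, value of the ranking function)
docs = [("a", 0.2, 3.0), ("b", 0.5, 1.0)]

add_small = rank(docs, operator.add)             # relevance as given
add_large = rank(docs, operator.add, scale=100)  # same query, larger scores
mul_small = rank(docs, operator.mul)
mul_large = rank(docs, operator.mul, scale=100)
```

With addition, the ordering flips when the relevance scores are scaled up, because the function value is swamped. With multiplication, the ordering stays the same at either scale.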
Michael Peter Christen
ac5fa9fe48 fix for result counter logging 2013-04-16 13:32:13 +02:00
Michael Peter Christen
298bf2deb5 fix to ranking configuration servlet 2013-04-16 12:38:16 +02:00
Michael Peter Christen
2db058b551 added in RankingSolr_p.html a select box to switch between different
ranking situations. By default, four situations can be configured.
2013-04-16 11:38:51 +02:00
Michael Peter Christen
6fbca35215 fixed api table navigation 2013-04-16 01:39:30 +02:00
Michael Peter Christen
7ab5093321 added new solr title_exact_signature_l and
description_exact_signature_l to be able to identify unique title and
unique description fields.
2013-04-16 01:35:15 +02:00
Michael Peter Christen
f24ac518e6 redesign of exists()-query (can now be called with query) and the
CachedSolrConnector which based its cache on the key value. This will be
used to correct the title_unique_b and description_unique_b field.
2013-04-15 14:08:30 +02:00
Michael Peter Christen
27d6222880 added new field host_extent_i which, after a crawl and postprocessing,
holds the number of documents for the host where the document is hosted.
This is necessary for ranking and the norming of references per local
host in the ranking computation.
2013-04-14 20:52:40 +02:00
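Note on 27d6222880: the value stored in host_extent_i is described as the number of documents for the document's host. A minimal sketch of such a count, assuming a plain list of indexed URLs as input (the `host_extents` name and the input shape are hypothetical, and the real postprocessing works on the index, not on a list):

```python
from collections import Counter
from urllib.parse import urlsplit

def host_extents(urls):
    """Count documents per host; each document's host_extent_i would be
    the count for its own host (illustrative assumption)."""
    return Counter(urlsplit(u).hostname for u in urls)

urls = [
    "http://a.example/index.html",
    "http://a.example/about.html",
    "http://b.example/",
]
extents = host_extents(urls)
```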
Michael Peter Christen
579eb01a49 now showing the details of the reference counts in the host browser:
external (ext), internal (int) and external hosts (hosts) for each
indexed document.
2013-04-14 11:30:57 +02:00
reger
0f4237d8e5 add admin option to delete load errors from index 2013-04-14 05:33:01 +02:00
reger
518b20147c skip postprocessing during document.store if no citation index connected (prevent null pointer exception) 2013-04-14 02:01:27 +02:00
Marc Nause
ac478384d3 *) did some long overdue refactoring 2013-04-13 23:04:44 +02:00
Marc Nause
e99c8789ff *) fixed encoding of query in link to map (in case geolocalization is
enabled, "Show search results for "köln" on map")
*) applied suggestions of Checkstyle plugin
2013-04-13 21:50:48 +02:00
Michael Peter Christen
ada3f27de7 added three new fields for better ranking: references_internal_i,
references_external_i and references_exthosts_i. These can be used to
count and evaluate the number of external links to every web page. An
experimental ranking function could be, e.g.:
div(add(references_internal_i,product(references_external_i,references_exthosts_i)),add(clickdepth_i,1))
2013-04-12 16:17:14 +02:00
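Note on ada3f27de7: the function query above uses Solr's div, add and product functions, so it reads as (internal + external * exthosts) / (clickdepth + 1). The sketch below evaluates that arithmetic for one document. The `experimental_rank` name and the dict-shaped document are assumptions for illustration; in practice Solr evaluates the expression itself.

```python
def experimental_rank(doc):
    """Evaluate
    div(add(references_internal_i,
            product(references_external_i, references_exthosts_i)),
        add(clickdepth_i, 1))
    for a document given as a dict of field values (hypothetical shape).
    """
    internal = doc["references_internal_i"]
    external = doc["references_external_i"]
    exthosts = doc["references_exthosts_i"]
    depth = doc["clickdepth_i"]
    # The +1 keeps the divisor non-zero for documents at click depth 0.
    return (internal + external * exthosts) / (depth + 1)

doc = {
    "references_internal_i": 4,
    "references_external_i": 3,
    "references_exthosts_i": 2,
    "clickdepth_i": 1,
}
score = experimental_rank(doc)  # (4 + 3 * 2) / (1 + 1) = 5.0
```

External links weigh more when they come from several distinct hosts, and the whole value shrinks as the page sits deeper in the click path.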
Michael Peter Christen
082e3274d6 - setting the same default ranking in the solr interface as for YaCy
search interfaces if no other ranking attributes are given
- using the YaCy ranking in the GSA interface only if no GSA-style sort
attribute was given
- to avoid confusion about correct ranking attributes, only the default
'0'-ranking profile is used and not a scenario-adapted one (site, date),
because that should be configurable in the web interface before it is
actually used for ranking.
2013-04-12 10:48:41 +02:00
Michael Peter Christen
a20941c067 resume paused crawls on startup; user expects that restarts 'heal'
everything
2013-04-11 15:07:08 +02:00
Michael Peter Christen
edc0b33f6d - showing references count and clickdepth in host browser
- fixed generation and presentation of both values
2013-04-11 14:46:13 +02:00
orbiter
2c3b024196 if the crawl was paused (automatically), show the reason for pausing in
the Crawler_p servlet.
2013-04-09 18:55:26 +02:00
reger
566a3b0294 fix: Index Administration > Reverse Word Index (IndexControlRWIs_p) corrected use of word search to word-hash search
- removed duplicate QueryParams.hashes2Handles, redundant with .hashes2Set
2013-04-08 21:25:21 +02:00
reger
989575b447 Merge branch 'master' of git://gitorious.org/yacy/rc1.git 2013-04-07 20:20:09 +02:00
Michael Peter Christen
27907c9739 added missing library after solr upgrade 2013-04-07 10:36:05 +02:00
reger
f37b4c984c adjust Netbeans IDE project.xml classpath for Solr 4.2.1 jars 2013-04-06 23:00:48 +02:00
Michael Peter Christen
c6c01a3ca2 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2013-04-06 16:11:33 +02:00
Michael Peter Christen
cf0acd2cb4 upgrade to solr 4.2.1 2013-04-06 16:11:24 +02:00
reger
40b3f2c5fe comment out dead menu link 2013-04-06 02:34:56 +02:00
reger
bf1e1ddca1 fix typo in prev commit 2013-04-06 02:29:49 +02:00
reger
d4d93be779 uncomment "used time" calculation for remote search log 2013-04-06 02:08:01 +02:00
reger
36202f27b0 improve remote search log, set "Returned Results" to transmitcount (instead of no value) 2013-04-05 03:33:33 +02:00
reger
e89491271f - fix opensearch discover err msg - webgraph not enabled - if no opensearchdescription link found in index
- remove search2.net from sample config (is down)
2013-04-04 00:40:59 +02:00
reger
6a9d0b60a3 make sure configured port is reported on recreated mySeed.txt 2013-04-01 03:51:57 +02:00
reger
254074b11d Merge branch 'master' of git://gitorious.org/yacy/rc1.git 2013-03-22 03:46:26 +01:00
Michael Peter Christen
870aedf3c6 fixes for better search interface integration in yaml templates 2013-03-20 16:19:49 +01:00