13784 Commits

Author SHA1 Message Date
luccioman
f511e16d50 Prevent duplication of Solr query highlight fields parameters
That was caused by concurrent modifications (with addHighlightField()
function) to the same SolrQuery instance when requesting Solr on remote
peers in p2p search.
2018-05-14 15:26:44 +02:00
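The kind of duplication described in f511e16d50 can be illustrated with a minimal sketch in which each remote peer request works on its own copy of the query instead of a shared instance. `HighlightQuery`, `RemotePeerQueries` and the field name used in any call are hypothetical stand-ins chosen for illustration; this is not the SolrJ `SolrQuery` class and not necessarily how the commit itself fixes the problem.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a mutable query object (NOT the SolrJ SolrQuery class). */
final class HighlightQuery {
    private final List<String> highlightFields = new ArrayList<>();

    HighlightQuery() {
    }

    /** Copy constructor: the copy owns its own list of highlight fields. */
    HighlightQuery(final HighlightQuery other) {
        this.highlightFields.addAll(other.highlightFields);
    }

    void addHighlightField(final String field) {
        this.highlightFields.add(field);
    }

    /** Returns a defensive copy so callers cannot alter the internal list. */
    List<String> getHighlightFields() {
        return new ArrayList<>(this.highlightFields);
    }
}

/** Hypothetical helper building one independent query per remote peer. */
final class RemotePeerQueries {
    static List<HighlightQuery> perPeerQueries(final HighlightQuery base, final int peerCount, final String field) {
        final List<HighlightQuery> queries = new ArrayList<>();
        for (int i = 0; i < peerCount; i++) {
            // Each peer request modifies its own copy, never the shared base instance.
            final HighlightQuery copy = new HighlightQuery(base);
            copy.addHighlightField(field);
            queries.add(copy);
        }
        return queries;
    }
}
```

With this approach the base query is left untouched and each per-peer query carries the highlight field exactly once, however many peers are requested.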
luccioman
4f0ab318ef Fixed snippets statistics displayed "provided by Solr" count 2018-05-14 15:21:21 +02:00
luccioman
e357ade47d Reduced memory footprint of text snippet extraction
Instead of first parsing and storing all sentences of a document, only
the ones needed to compute the snippet are now parsed on the fly.
2018-05-13 10:29:52 +02:00
luccioman
e115e57cc7 Reduced text snippet extraction processing time.
By not generating MD5 hashes on all words of indexed texts, processing
time is reduced by 30 to 50% on indexed documents with more than 1 MB
of plain text.
2018-05-11 15:42:53 +02:00
reger
7525594315 upd to jwat-warc-1.1.1 2018-05-06 00:49:30 +02:00
luccioman
1e2f094b9e Removed unnecessary html end line tag with invalid syntax 2018-05-03 09:00:09 +02:00
luccioman
ce289ebaf7 Upgraded ConfigNetwork_p html doctype and added language attribute 2018-05-03 08:53:07 +02:00
luccioman
16254fac1e Removed unpaired select closing tag 2018-05-03 08:37:38 +02:00
luccioman
f1061e0897 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git 2018-05-02 08:40:19 +02:00
luccioman
692c1cfdde Added a UI section to configure encryption of peers communications 2018-05-02 08:38:58 +02:00
sgaebel
4b79851e12 corrected icons_sizes_sxt to SolrType.string 2018-05-01 14:04:15 +02:00
luccioman
3b89c232db Easier tracking of longest text snippets initializations
When text snippets statistics are enabled and FINE log level is enabled
on the TextSnippetStatistics class.
2018-05-01 09:58:05 +02:00
luccioman
3c4344cb12 Fixed text snippet max init time statistic rendering 2018-05-01 09:39:41 +02:00
reger
a8234b7ea7 Make sure enabled image pixel size index fields are filled for image resource URLs
if at least one of the image size fields is enabled in the index (images_height_val,
images_width_val, images_pixel_val).
Previously all fields were required to be enabled (hint: the default setting
is height + width enabled)
2018-04-30 04:59:34 +02:00
luccioman
e67df103b5 Removed more remaining uses of deprecated Seed.getIP() function. 2018-04-29 08:26:53 +02:00
reger
b81debca2e upd to jsoup-1.11.3 2018-04-28 23:24:24 +02:00
luccioman
addd18c993 Removed some remaining uses of deprecated Seed.getIP() 2018-04-26 09:39:30 +02:00
luccioman
c35d0568b6 Support for preferred https in peers communication on more operations 2018-04-24 08:08:24 +02:00
luccioman
0a058ba6af Keep https in result message URL when push_p API is requested over https 2018-04-24 08:05:17 +02:00
luccioman
e914d17aca Updated call to function deprecated since commons-codec version 1.11 2018-04-23 08:07:56 +02:00
luccioman
a9e054ac06 Removed Docker Cloud deploy button as service will soon be shut down
See Docker notification at
http://success.docker.com/article/cloud-migration
2018-04-18 09:18:49 +02:00
luccioman
8bc36506f2 Enforced access controls on basic administration settings pages.
Ensuring http post method is used for operations with server-side
effects (in respect of http semantics), and a valid transaction token is
provided by the user-agent.
2018-04-18 08:10:51 +02:00
luccioman
02673379df Added a start script option to run as a foreground process without JMX
Unlike the -d/--debug option, which opens port 9999 and thus allows
remote monitoring with JVM tools such as JConsole.
2018-04-17 08:16:37 +02:00
luccioman
a3ec7a7a5f Added analysis optional setting to compute statistics on text snippets
Thus producing some basic stats on processing times for snippets
generation and counts on snippets per source type.
2018-04-15 09:55:08 +02:00
reger
508050f79c upd to icu4j-61.1 2018-04-14 16:16:35 +02:00
luccioman
1889d484de Added Solr HTML writer support for responses from remote instances 2018-04-12 09:23:00 +02:00
luccioman
72808655a5 Added controls on mode switch when attached to remote Solr instance(s)
- to prevent unwanted exposure of index entries about private
local/intranet documents when switching from "Intranet Indexing" mode
while attached to remote Solr instance(s)
- to warn user about remote Solr instance(s) still attached when
switching from modes other than "Intranet Indexing"
2018-04-11 07:56:41 +02:00
luccioman
2af3bf79c7 Improve rendering of remote Solr admin URLs
- properly handle IPv6 loopback address replacement
- replace loopback address or host only when accessing peer remotely
- replace loopback part with the peer hostname as requested rather than
with its seed public IP as this works better for Intranet mode and when
peer is behind a reverse proxy.
2018-04-10 11:15:31 +02:00
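The loopback replacement rules listed in 2af3bf79c7 can be sketched roughly as follows. The class `LoopbackUrlRewriter`, its method names and the fixed list of loopback hosts are assumptions made for this illustration, not the code from the commit.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

/** Hypothetical helper: swaps a loopback host for the hostname the peer was requested with. */
final class LoopbackUrlRewriter {

    /** Assumption: only these literal forms are treated as loopback (IPv4, IPv6, localhost). */
    static boolean isLoopbackHost(final String host) {
        if (host == null) {
            return false;
        }
        final String h = host.toLowerCase(Locale.ROOT);
        return h.equals("localhost") || h.equals("127.0.0.1") || h.equals("[::1]") || h.equals("::1");
    }

    /**
     * @param url           the Solr admin URL as configured on the peer
     * @param requestedHost the hostname used by the client to reach the peer
     * @param remoteAccess  true when the peer is accessed from another machine
     * @return the URL with its loopback host replaced, or the URL unchanged
     */
    static String rewrite(final String url, final String requestedHost, final boolean remoteAccess) {
        final URI uri = URI.create(url);
        // Leave the URL alone on local access or when the host is not a loopback one.
        if (!remoteAccess || !isLoopbackHost(uri.getHost())) {
            return url;
        }
        try {
            return new URI(uri.getScheme(), uri.getUserInfo(), requestedHost, uri.getPort(),
                    uri.getPath(), uri.getQuery(), uri.getFragment()).toString();
        } catch (final URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rewrite URL: " + url, e);
        }
    }
}
```

Using the requested hostname rather than a public IP keeps the rewritten link usable in Intranet mode and behind a reverse proxy, which is the motivation given in the commit message.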
luccioman
d5f9dc0848 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git 2018-04-10 07:20:40 +02:00
luccioman
bb74de7d59 Removed unnecessary "/admin" suffix from remote Solr instance admin URL
For quite a long time now, the Solr /admin URL suffix indeed
redirects to the Solr base context (see
https://issues.apache.org/jira/browse/SOLR-3337)
2018-04-09 00:01:45 +02:00
reger
e7971fb888 upd to pdfbox-2.0.9 2018-04-08 20:13:53 +02:00
reger
e2b2c89feb upd to jetty-9.4.9.v20180320 2018-04-07 23:39:03 +02:00
luccioman
0d34034f17 Ensure an embedded Solr is available for Solr dump/restore operations
Otherwise, these operations triggered NullPointerException when only an
external Solr index is attached.
2018-04-07 13:42:06 +02:00
luccioman
d92b191942 Ensure no remote Solr is attached before "Shut Down and Re-Start Solr"
Otherwise, once this operation is applied, the remote Solr instance(s)
are disconnected and the embedded Solr is connected even if disabled by
setting "core.service.fulltext".

Also use constants for related default setting values.
2018-04-06 20:34:54 +02:00
luccioman
26d8ad591c Adjusted Solr select servlet output when using an external Solr only
- Use the EnhancedXMLResponseWriter only when requested output is "exml"
- Use the Standard Solr writers when possible, for example for json, xml
or javabin output formats
- Return an error when the requested format cannot be rendered with
an external Solr server only

Important: this modification is necessary for peers using exclusively
an external Solr server to be reachable as robinson targets in p2p
search, as the binary format ("javabin") is the default Solr exchange
format for peers.
Before this, when a peer requested a remote one attached only to an
external Solr (no embedded one), it ended with "Invalid type" error, as
the remote peer answered with xml although binary format was requested.
2018-04-06 15:16:54 +02:00
luccioman
c867a52d96 Upgraded Solr dependencies from 6.6.2 to 6.6.3 2018-04-05 18:15:45 +02:00
luccioman
69690c13a0 Optionally allow external Solr server with self-signed certificate
This is necessary when you want to attach to a dedicated external Solr
server protected with basic http authentication and requested over https
but having only a self-signed certificate.
2018-04-04 18:16:26 +02:00
Marc Nause
1e4ceaac3f Removed seed URLs pointing to server low.audioattack.de since it will not be updated anymore. 2018-04-03 23:19:05 +02:00
luccioman
b882f85900 Fixed NPE case in Solr select servlet on external Solr only setup
Regression introduced with commit
0d7625ecfb
2018-04-03 15:36:17 +02:00
luccioman
6784c9be68 Updated external Solr setup basic instructions 2018-04-03 15:34:44 +02:00
luccioman
211f3d04ab Added hint message prompting users to check account settings on fresh install
When unrestricted access from localhost is set and the accounts config
page has not been visited at all.
2018-04-02 19:48:11 +02:00
luccioman
2fd4d05e2f Added a shared Java constant for setting key server.servlets.called 2018-04-02 15:16:10 +02:00
luccioman
033f7c4c00 Adjusted localhost/qualified account admin access informational texts.
Following remarks from @etam on issue #170
2018-04-02 15:04:56 +02:00
luccioman
05702c2ced Adjusted api table query matching strategies
When inlined (for example in the CrawlProfileEditor_p.html page):
search only on the comment, as the url is not visible

On regular display: search on comment OR url, instead of comment AND
url. Otherwise searching on comment terms is almost useless as these
terms are not necessarily present in the url.
2018-03-30 11:12:48 +02:00
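The two matching strategies described in 05702c2ced can be sketched as a single predicate. `ApiTableMatcher` and the case-insensitive substring comparison are assumptions made for illustration; the commit message does not say how terms are actually compared.

```java
import java.util.Locale;

/** Hypothetical matcher for api table rows; names and comparison rule are illustrative. */
final class ApiTableMatcher {

    /**
     * @param query   the search term typed by the user
     * @param comment the row comment, may be null
     * @param url     the row url, may be null
     * @param inlined true when the table is embedded in another page and the url is hidden
     */
    static boolean matches(final String query, final String comment, final String url, final boolean inlined) {
        final String q = query.toLowerCase(Locale.ROOT);
        final boolean inComment = comment != null && comment.toLowerCase(Locale.ROOT).contains(q);
        if (inlined) {
            // The url is not visible when inlined, so only the comment is searched.
            return inComment;
        }
        final boolean inUrl = url != null && url.toLowerCase(Locale.ROOT).contains(q);
        // Regular display: comment OR url, so a comment-only term still matches.
        return inComment || inUrl;
    }
}
```

With an AND combination, a term such as "news" that appears only in the comment would never match a row whose url does not also contain it, which is the weakness the commit message points out.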
luccioman
65451a3d62 Fixed start record on the last api table results page
When the last results page size was lower than maximumRecords, results
from the previous page were displayed again.
2018-03-30 10:53:06 +02:00
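As a rough illustration of the start record issue in 65451a3d62, the sketch below computes the zero-based start index of the last results page so that it never overlaps the previous page. `ApiTablePaging` and `lastPageStart` are hypothetical names, and zero-based indexing is an assumption; the actual computation in the commit may differ.

```java
/** Hypothetical paging helper; not the code from the commit. */
final class ApiTablePaging {

    /**
     * @param totalRecords   total number of matching records
     * @param maximumRecords maximum number of records per page, must be positive
     * @return the zero-based index of the first record on the last page
     */
    static int lastPageStart(final int totalRecords, final int maximumRecords) {
        if (maximumRecords <= 0) {
            throw new IllegalArgumentException("maximumRecords must be positive");
        }
        if (totalRecords <= 0) {
            return 0;
        }
        // Integer division keeps only full pages before the last one.
        return ((totalRecords - 1) / maximumRecords) * maximumRecords;
    }
}
```

For 25 records and 10 per page, this gives a start of 20, so the last page shows records 20 to 24. For contrast, a start computed as the total minus the page size would give 15 and repeat records 15 to 19 from the previous page.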
luccioman
86c902b853 Enable api table page navigation with search query
Applied the same default results page size as when a type filter is
defined for proper and consistent page navigation when combining type
filter and search query.
2018-03-30 10:21:42 +02:00
luccioman
9c7faa04d8 Display the total number of matching items when filtering on table API
Notably for a proper page navigation of the crawl scheduler table
(CrawlProfileEditor_p.html page).
2018-03-29 14:24:25 +02:00
luccioman
311e91ff77 Added hint to clarify results rendered dates and 'Sort by date' switch 2018-03-27 18:05:20 +02:00
luccioman
90dc580158 Fixed initial ViewFile mode and suggestions links from previous commit 2018-03-27 08:25:40 +02:00
luccioman
0b6aed4de6 Keep the selected view mode when typing a new URL in the ViewFile page
Otherwise, when interested in viewing `Link List` for example, each time
you typed a new URL, `Parsed Sentences` view mode was selected as
default and you had to select again the view mode you are interested
in.
2018-03-27 07:42:26 +02:00