5895 Commits

Author SHA1 Message Date
luccioman
0efc6c89ef Fixed rendering of crawl queues page for URLs with raw IPV6 addresses 2018-08-13 14:36:22 +02:00
luccioman
0e976e9030 Added a link to MediaWiki dumps summary in import page for convenience 2018-08-08 08:11:02 +02:00
luccioman
ecd4535eb6 Prevent entering empty OpenSearch URLs in ConfigHeuristics_p.html
This prevents invalid configuration entries from being added to the
heuristicopensearch.conf file, as revealed by issue #209.
2018-08-06 12:07:47 +02:00
luccioman
1ca9cb6bd9 Fixed a NullPointerException case, reported in issue #209 2018-08-06 12:04:44 +02:00
luccioman
8a29551c54 Upgraded the OpenGeoDB dump URL
The status of the library in the DictionaryLoader_p.html page now also
notifies the user that an upgrade can be applied when an older dump is
already loaded.

Upgrade applied as suggested by Niklas Andrus @fapth_gitlab on Gitter
chat.
2018-08-03 18:39:41 +02:00
luccioman
bf4f320b16 Optionally render the response header when using the Solr html writer
Parameters are rendered as html input fields, making it convenient to
modify their values and refresh the results.
2018-07-23 18:36:57 +02:00
luccioman
88e6ce23c9 Consistently render empty facets and facets having only entries at zero 2018-07-17 07:36:39 +02:00
luccioman
534f09e92b Added and updated hint messages about remote crawler status
To help identify why remote crawl results may not be received.
2018-07-06 11:30:30 +02:00
luccioman
c726154a59 Fixed removal of URLs from the delegatedURL remote crawl stack
URLs were removed from the stack using their hash as a bytes array,
whereas the hash is stored in the stack as String instance.
2018-07-05 09:36:36 +02:00
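
The key type mismatch described in the commit above can be reproduced with a plain java.util.Map. This is only a sketch: HashKeyLookupDemo, the sample hash and the map-based "stack" are hypothetical stand-ins, not YaCy classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class HashKeyLookupDemo {
    // Hypothetical stand-in for a stack whose entries are keyed by the URL hash as a String.
    static Map<String, String> newStack() {
        Map<String, String> stack = new HashMap<>();
        stack.put("AbCdEfGhIjKl", "http://example.org/");
        return stack;
    }

    // Compiles because Map.remove accepts any Object, but a byte[] never equals a String key,
    // so nothing is removed.
    static boolean removeByBytes(Map<String, String> stack, byte[] hash) {
        return stack.remove((Object) hash) != null;
    }

    // Converting the bytes to the String form used as key makes the removal effective.
    static boolean removeByString(Map<String, String> stack, byte[] hash) {
        return stack.remove(new String(hash, StandardCharsets.US_ASCII)) != null;
    }

    public static void main(String[] args) {
        byte[] hash = "AbCdEfGhIjKl".getBytes(StandardCharsets.US_ASCII);
        Map<String, String> stack = newStack();
        System.out.println(removeByBytes(stack, hash));  // entry stays in the stack
        System.out.println(removeByString(stack, hash)); // entry is removed
    }
}
```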
luccioman
2bdd71de60 Added server-side column sorting on the Process Scheduler table
For easier usage of large tables in the Table_API_p.html page.
2018-07-04 10:28:32 +02:00
luccioman
f895745e1c Removed more unsafe concurrent accesses to SimpleDateFormat instances.
SimpleDateFormat must not be used by concurrent threads without
synchronization for parsing or formatting dates as it is not thread-safe
(internally holds a calendar instance that is not synchronized).

DateTimeFormatter is now preferred when possible, as it is thread-safe
and causes no performance bottleneck under concurrent access (it does not
internally use synchronization locks).
2018-06-29 15:49:55 +02:00
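
As an illustration of the approach described above, a single DateTimeFormatter constant can be shared across threads without synchronization. This is a minimal sketch: the class name, the date pattern and the UTC zone are assumptions chosen for the example, not taken from the YaCy code.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.stream.IntStream;

class SharedFormatterDemo {
    // DateTimeFormatter is immutable, so one shared instance is safe for all threads.
    // Pattern and zone are arbitrary choices for this example.
    static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    static String format(long epochMillis) {
        return FORMATTER.format(Instant.ofEpochMilli(epochMillis));
    }

    // Formats the same instant from several threads and checks that every result is identical.
    static boolean allThreadsAgree(int calls) {
        String expected = format(0L);
        return IntStream.range(0, calls).parallel().allMatch(i -> expected.equals(format(0L)));
    }

    public static void main(String[] args) {
        System.out.println(format(0L));
        System.out.println(allThreadsAgree(10_000));
    }
}
```

With a shared SimpleDateFormat the same parallel loop would need an external lock or one instance per thread.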
luccioman
5c6c61809a Fixed JavaScript sorting of tables with cells containing an input field 2018-06-29 13:01:05 +02:00
luccioman
3885fd64a0 Fixed Table_API_p.html current table page loss on row editing.
Reset to the first table page only when the search query is modified
2018-06-28 15:34:53 +02:00
luccioman
e97580dfc7 Fixed unsafe concurrent access to generic SimpleDateFormat instances
SimpleDateFormat must not be used by concurrent threads without
synchronization for parsing or formatting dates as it is not thread-safe
(internally holds a calendar instance that is not synchronized).

DateTimeFormatter is now preferred when possible, as it is thread-safe
and causes no performance bottleneck under concurrent access (it does not
internally use synchronization locks).
2018-06-28 14:59:23 +02:00
luccioman
38a3a5e5ad Fixed a NullPointerException case in the suggest api 2018-06-22 10:49:01 +02:00
luccioman
b159564c72 Properly render json string attributes in the crawl profile html editor 2018-06-19 12:46:50 +02:00
luccioman
cced94298a Added a new crawler document filter type using Solr syntax
This makes it possible to set up much more advanced document crawl filters,
by filtering on one or more indexed document fields before inserting in
the index.
2018-06-19 10:12:20 +02:00
Michael Christen
e0dc632020 removed transformer
it was not used any more
2018-06-19 00:42:23 +02:00
luccioman
eb94986f95 Added Italian in available web interface languages list 2018-06-12 14:19:22 +02:00
luccioman
9bc7b6c39d Allow editing of scheduled next execution dates for finer control
Especially useful when scheduling many API calls over a long period of
time, to precisely adjust each scheduled date/time.
2018-06-11 11:38:58 +02:00
luccioman
b5dc1f376f Made outgoing pools max total connections user configurable
For finer control over the maximum number of simultaneously active
outgoing connections.
2018-06-06 09:36:50 +02:00
luccioman
387d646c0e Added gzip compression of responses returned to user-agents accepting it
Enabled as default, but can be disabled using the "Server Access
Settings" admin page.
2018-06-05 13:35:39 +02:00
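
The idea of compressing only for user-agents that accept it can be sketched with the JDK alone. This is an assumption-laden illustration, not the way YaCy or its embedded server implements it: GzipResponseDemo and its methods are hypothetical, and the Accept-Encoding check is simplified (it ignores quality values such as "gzip;q=0").

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipResponseDemo {
    // Simplified check: true when "gzip" is listed in the Accept-Encoding header value.
    static boolean acceptsGzip(String acceptEncoding) {
        if (acceptEncoding == null) return false;
        for (String token : acceptEncoding.split(",")) {
            int semicolon = token.indexOf(';');
            String coding = (semicolon >= 0 ? token.substring(0, semicolon) : token).trim();
            if (coding.equalsIgnoreCase("gzip")) return true;
        }
        return false;
    }

    // Compresses a response body; closing the stream writes the gzip trailer.
    static byte[] gzip(byte[] body) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decompresses, as a client would (InputStream.readAllBytes requires Java 9+).
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] body = "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
        byte[] sent = acceptsGzip("gzip, deflate") ? gzip(body) : body;
        System.out.println(new String(gunzip(sent), StandardCharsets.UTF_8));
    }
}
```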
luccioman
a1990202ab Fixed unresolve-pattern case on old html title 2018-06-02 14:54:05 +02:00
luccioman
35826a3091 Added a search page customization setting to display or not favicons
If you are not interested in displaying favicons in your search results,
notably on a peer with limited resources, disabling them can help save
some CPU and outgoing network connections.
2018-05-25 11:13:43 +02:00
luccioman
79bd9f623a Updated YaCy home page embedded links from http to https scheme 2018-05-22 17:46:12 +02:00
luccioman
1dfd3e9dde Limit the rate of calls to the suggest API when typing in search field 2018-05-22 07:55:09 +02:00
luccioman
4f0ab318ef Fixed snippets statistics displayed "provided by Solr" count 2018-05-14 15:21:21 +02:00
luccioman
e115e57cc7 Reduced text snippet extraction processing time.
By not generating MD5 hashes on all words of indexed texts, processing
time is reduced by 30 to 50% on indexed documents with more than 1 MB
of plain text.
2018-05-11 15:42:53 +02:00
luccioman
ce289ebaf7 Upgraded ConfigNetwork_p html doctype and added language attribute 2018-05-03 08:53:07 +02:00
luccioman
16254fac1e Removed unpaired select closing tag 2018-05-03 08:37:38 +02:00
luccioman
692c1cfdde Added a UI section to configure encryption of peers communications 2018-05-02 08:38:58 +02:00
luccioman
e67df103b5 Removed more remaining uses of deprecated Seed.getIP() function. 2018-04-29 08:26:53 +02:00
luccioman
addd18c993 Removed some remaining uses of deprecated Seed.getIP() 2018-04-26 09:39:30 +02:00
luccioman
c35d0568b6 Support for preferred https in peers communication on more operations 2018-04-24 08:08:24 +02:00
luccioman
0a058ba6af Keep https in result message URL when push_p API is requested over https 2018-04-24 08:05:17 +02:00
luccioman
8bc36506f2 Enforced access controls on basic administration settings pages.
Ensuring http post method is used for operations with server-side
effects (in respect of http semantics), and a valid transaction token is
provided by the user-agent.
2018-04-18 08:10:51 +02:00
luccioman
a3ec7a7a5f Added analysis optional setting to compute statistics on text snippets
Thus producing some basic stats on processing times for snippets
generation and counts on snippets per source type.
2018-04-15 09:55:08 +02:00
luccioman
72808655a5 Added controls on mode switch when attached to remote Solr instance(s)
- to prevent unwanted exposure of index entries about private
local/intranet documents when switching from "Intranet Indexing" mode
while attached to remote Solr instance(s)
- to warn user about remote Solr instance(s) still attached when
switching from modes other than "Intranet Indexing"
2018-04-11 07:56:41 +02:00
luccioman
2af3bf79c7 Improve rendering of remote Solr admin URLs
- properly handle IPv6 loopback address replacement
- replace loopback address or host only when accessing peer remotely
- replace loopback part with the peer hostname as requested rather than
with its seed public IP as this works better for Intranet mode and when
peer is behind a reverse proxy.
2018-04-10 11:15:31 +02:00
luccioman
0d34034f17 Ensure an embedded Solr is available for Solr dump/restore operations
Otherwise, these operations triggered NullPointerException when only an
external Solr index is attached.
2018-04-07 13:42:06 +02:00
luccioman
d92b191942 Ensure no remote Solr is attached before "Shut Down and Re-Start Solr"
Otherwise, once this operation is applied, the remote Solr instance(s)
are disconnected and the embedded Solr is connected even if disabled by
setting "core.service.fulltext".

Also use constants for related default setting values.
2018-04-06 20:34:54 +02:00
luccioman
69690c13a0 Optionally allow external Solr server with self-signed certificate
This is necessary when you want to attach to a dedicated external Solr
server protected with basic http authentication and requested over https
but having only a self-signed certificate.
2018-04-04 18:16:26 +02:00
luccioman
211f3d04ab Added hint message prompting to check account settings on fresh install
When unrestricted access from localhost is set and the accounts config
page has not been visited at all.
2018-04-02 19:48:11 +02:00
luccioman
2fd4d05e2f Added a shared Java constant for setting key server.servlets.called 2018-04-02 15:16:10 +02:00
luccioman
033f7c4c00 Adjusted localhost/qualified account admin access informational texts.
Following remarks from @etam on issue #170
2018-04-02 15:04:56 +02:00
luccioman
05702c2ced Adjusted api table query matching strategies
When inlined (for example in the CrawlProfileEditor_p.html page) :
search only on the comment, as the url is not visible

On regular display : search on comment OR url, instead of comment AND
url. Otherwise searching on comments terms is almost useless as these
terms are not necessarily present in the url.
2018-03-30 11:12:48 +02:00
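
The two matching strategies described above can be sketched as a small predicate. The class and method names are hypothetical, and the case-insensitive substring matching is an assumption made for the example; the actual implementation may match differently.

```java
import java.util.Locale;

class ApiTableMatchDemo {
    // inlined == true  : only the comment is searched, since the url is not visible.
    // inlined == false : a row matches when the comment OR the url contains the query.
    static boolean matches(String query, String comment, String url, boolean inlined) {
        String q = query.toLowerCase(Locale.ROOT);
        boolean inComment = comment.toLowerCase(Locale.ROOT).contains(q);
        if (inlined) {
            return inComment;
        }
        return inComment || url.toLowerCase(Locale.ROOT).contains(q);
    }

    public static void main(String[] args) {
        String comment = "weekly crawl of the wiki";
        String url = "/Crawler_p.html?crawlingURL=http://example.org";
        System.out.println(matches("wiki", comment, url, false));    // found in the comment only
        System.out.println(matches("example", comment, url, false)); // found in the url only
        System.out.println(matches("example", comment, url, true));  // url is ignored when inlined
    }
}
```

With the former AND strategy, the first query would not have matched because "wiki" does not appear in the url.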
luccioman
65451a3d62 Fixed start record on the last api table results page
When the last results page size was lower than maximumRecords, results
from the previous page were displayed again.
2018-03-30 10:53:06 +02:00
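
One way to compute a page window that stays correct on a short last page is sketched below. Only maximumRecords is a name taken from the commit message; the class, the method and the clamping logic are assumptions for illustration, not the actual fix.

```java
class PageWindowDemo {
    // Returns {start, end} with end exclusive. The start record is kept as requested
    // (clamped to the valid range) rather than being derived from the end of the list,
    // so a short last page never re-displays rows from the previous page.
    static int[] pageWindow(int startRecord, int maximumRecords, int total) {
        int start = Math.max(0, Math.min(startRecord, total));
        int end = Math.min(start + maximumRecords, total);
        return new int[] { start, end };
    }

    public static void main(String[] args) {
        // 25 matching rows, 10 per page: the third page holds rows 20 to 24 only.
        int[] window = pageWindow(20, 10, 25);
        System.out.println(window[0] + ".." + window[1]);
    }
}
```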
luccioman
86c902b853 Enable api table page navigation with search query
Applied the same default results page size as when a type filter is
defined for proper and consistent page navigation when combining type
filter and search query.
2018-03-30 10:21:42 +02:00
luccioman
9c7faa04d8 Display the total number of matching items when filtering on table API
Notably for a proper page navigation of the crawl scheduler table
(CrawlProfileEditor_p.html page).
2018-03-29 14:24:25 +02:00
luccioman
311e91ff77 Added hint to clarify results rendered dates and 'Sort by date' switch 2018-03-27 18:05:20 +02:00