5892 Commits

Author SHA1 Message Date
luccioman
7496df93c3 Fixed error 414 (URI Too Long) when manually selecting too many RSS items
Switched form method to HTTP POST to prevent this.
2018-03-23 10:49:39 +01:00
luccioman
fb3032c530 Added a crawl filtering possibility on documents Media Type (MIME) 2018-03-23 10:28:19 +01:00
luccioman
90d4802082 Updated link URL to IANA Media Types with https 2018-03-23 10:23:54 +01:00
luccioman
e45afedee4 Added support for enclosures (media links) to the RSS loader 2018-03-21 08:22:29 +01:00
luccioman
aaefd5219c Reduce log verbosity of RSS loader on feed items with no link 2018-03-20 10:09:17 +01:00
Michael Peter Christen
187075b878 added nav filter 2018-03-10 15:46:53 +01:00
luccioman
07e8628853 Added HTML5 embedded audio for results playing on supporting browsers
Restricted to authenticated or localhost users only to prevent
redistribution license issues.
2018-02-23 11:41:50 +01:00
luccioman
46c9da6428 Allow creation of vocabularies from remote CSV file URLs. 2018-02-21 08:41:13 +01:00
luccioman
348d07a999 Enforced controls on vocabulary editing operations. 2018-02-20 12:22:54 +01:00
luccioman
2532db2ce6 Vocabulary editor : use accessible labels and CSS for elements position 2018-02-20 11:22:34 +01:00
luccioman
ac14437316 Vocabulary_p.html : richer semantics for HTML tables
Also replaced deprecated attributes
2018-02-19 15:15:02 +01:00
luccioman
b67742336e Provide user interface messages on vocabulary creation read/write errors 2018-02-19 11:48:40 +01:00
luccioman
ea57763294 Mark vocabulary name field as required using html instead of JavaScript 2018-02-19 09:35:44 +01:00
luccioman
39ec8cba37 Fixed Vocabulary_p.html HTML validation errors.
Validated with Nu Html Checker 17.11.1.
2018-02-19 08:54:42 +01:00
luccioman
7c644090ff Fixed CrawlStartExpert.html HTML validation errors
Validated with Nu Html Checker 17.11.1
2018-02-16 11:35:15 +01:00
luccioman
519fc9a600 Issue #156 : new option to clean up (or not) search cache on crawl start
Prevent also unnecessary search event cache clean-up on each access to
the crawl monitor page (Crawler_p.html).
2018-02-16 10:19:41 +01:00
luccioman
3e8dd90211 Use https rather than http in links and queries to openstreetmap.org 2018-02-15 19:14:07 +01:00
luccioman
8d7099a081 Handle escaped line breaks and separators in vocabulary import from CSV 2018-02-15 07:29:17 +01:00
luccioman
09f93fed0e Added a line start field for vocabulary import from CSV file
As a convenience to skip any CSV header lines
2018-02-14 10:31:09 +01:00
luccioman
d28d612069 Added option to choose field delimiter in vocabulary import from CSV 2018-02-14 09:29:04 +01:00
luccioman
95f1954c78 Adjusted last blacklist entry example for a more accurate description
As discussed in issue #160 , blacklist entries can indeed currently not
be "complete" regular expressions, but must be structured as a domain
part, a separator character ('/'), and a path part.
2018-02-14 07:51:07 +01:00
luccioman
dbf4c1cd76 Improved blacklist entries editing operations :
- Fixes issue #160 : handle properly syntax exceptions with a user
friendly message
- Fixes loss of information on multiple blacklist entries editions
- Fixes loss of entries when moving entries from one list to another
2018-02-13 18:24:26 +01:00
reger
5df72c1c65 Remove now obsolete html for language-nav and ISO639 jar reference 2018-02-12 01:16:14 +01:00
reger
87077b8fb6 Adjust and move Language Navigator to be member of the navigator plugin
list.
2018-02-12 00:16:34 +01:00
luccioman
eb20589e29 Fixed issue #158 : completed div CSS class ignore in crawl 2018-02-10 11:56:28 +01:00
luccioman
fa65fb1a03 Fixed loss of search modifiers on bookmark, recommend or delete result 2018-02-08 14:31:26 +01:00
luccioman
0cdee4e26a Fixed loss of "meanCount" search param when using facets or page buttons
Then on new search queries, no suggestions at all could be displayed.
2018-02-08 08:07:30 +01:00
luccioman
117a859879 Do not clear all search modifiers when unselecting one modifier.
Previously, when clicking a selected facet in the search results page to
unselect it, all other eventually selected modifiers/facets were also
removed.
2018-02-07 15:54:46 +01:00
luccioman
a9dc0874c0 Remove old query terms from search results suggestions links.
Especially when old terms were misspelled, suggestions links then
provided most of the time empty results.
2018-02-06 15:14:14 +01:00
luccioman
c71b545235 Enable results suggestions (Did you Mean) even when RWI is not enabled.
RWI is no longer necessary for suggestions processing since commit
c40ba51ca6.
Revealed by a question about spell check from ouahpiti on YaCy forum
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=6084 ).
2018-02-06 12:33:44 +01:00
luccioman
9412881230 Added basic support for autotagging microdata annotated item types.
With the appropriate vocabulary settings in Vocabulary_p.html page, this
can produce Vocabulary search facets displaying item types referenced in
html documents by microdata annotation.
Tested notably, but not limited to, vocabulary classes/types defined by
Schema.org and Dublin Core.
2018-02-06 10:25:38 +01:00
luccioman
539925a275 Added a utility to generate/update XLIFF master file from lng files. 2018-01-29 18:34:47 +01:00
luccioman
41a6b052d9 Updated master and French translation for the IndexReIndexMonitor_p page 2018-01-29 16:51:00 +01:00
luccioman
929e0d6eae Replaced improper ByteBuffer.equals() implementation by Arrays.equals()
Renamed also ByteBuffer.equals() to startsWith() as this is the
appropriate function implementation semantics.
2018-01-29 13:38:25 +01:00
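As a hedged illustration of the distinction drawn in commit 929e0d6eae, the Java sketch below contrasts a prefix check with full array equality. The class name `ByteCompareDemo` and its `startsWith` helper are hypothetical and written for illustration only; they are not YaCy's actual ByteBuffer code.

```java
import java.util.Arrays;

public class ByteCompareDemo {

    /** Hypothetical helper: true when buffer begins with every byte of prefix. */
    static boolean startsWith(byte[] buffer, byte[] prefix) {
        if (prefix.length > buffer.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (buffer[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] content = {'y', 'a', 'c', 'y'};
        byte[] prefix = {'y', 'a'};
        // Prefix semantics: only the first prefix.length bytes are compared.
        System.out.println(startsWith(content, prefix));
        // Full equality: lengths and every byte must match.
        System.out.println(Arrays.equals(content, prefix));
    }
}
```

A method that only compares a leading run of bytes is better named after what it does, which is presumably why the commit separates the two operations.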
luccioman
8b572b7337 Commit Solr index before simulating or starting recrawl job.
This ensures up-to-date simulation query results, and recrawl
processing.
2018-01-26 10:31:13 +01:00
luccioman
5e2812c060 Automatically refresh running recrawl report when JavaScript is enabled.
For users who would prefer to keep JavaScript disabled, a manual Refresh
button is still available.
2018-01-19 11:58:52 +01:00
luccioman
0fce264ba4 Set reindex page to html5 and removed presentational only html tables. 2018-01-15 18:32:34 +01:00
luccioman
83df922afc Removed unused duplicated HTML id on header hidden field 2018-01-15 17:16:54 +01:00
luccioman
4e03335625 Added more details to the recrawl job report 2018-01-12 11:47:13 +01:00
luccioman
d95d393a0d Add a query link to local Solr to browse selected recrawl candidates 2018-01-12 10:48:54 +01:00
luccioman
59f7763af6 Display recrawl job report also when job is actively running 2018-01-11 09:53:27 +01:00
luccioman
0c9e0b3566 Record recrawl calls to make them schedulable 2018-01-10 17:05:53 +01:00
luccioman
433e241e4f Added a report info box about eventual last terminated recrawl job
For easier monitoring of recrawls.
2018-01-09 22:33:15 +01:00
luccioman
b2af25b14f Added a stop condition to the Recrawl busy thread 2018-01-09 10:22:26 +01:00
luccioman
421728d25a Made it possible to customize selection query before launching a recrawl 2018-01-08 21:20:46 +01:00
luccioman
fab6e54fec Enforced controls (HTTP method, token) on ReIndex and ReCrawl operations 2018-01-07 15:25:16 +01:00
luccioman
8a4ea1c11e Added UI switch to control content domain constraint per search request 2018-01-02 08:13:14 +01:00
luccioman
36a45b3905 Added UI setting for strictness of content-type checking on media search 2017-12-29 11:32:42 +01:00
luccioman
e6907fdab3 Added optional search parameter/setting to control content domain filter
This allows choosing, at configuration or per search request, whether
or not to extend results beyond the strict content domain filter (image,
video, audio or application).

Related graphical controls to be added to user interface.
2017-12-23 18:56:17 +01:00
luccioman
d42c1773c8 Added UI setting for optional encryption with https on p2p searches 2017-12-22 11:01:02 +01:00
luccioman
09c4ee56a7 Added optional https support for remote crawl and profile operations 2017-12-21 18:41:32 +01:00
luccioman
5db1c9155a Do locale-independent case conversion on hosts, schemes, and file exts.
Required for proper operation when the default system locale is Turkish,
as dotless and dotted i characters have specific case conversion rules
in this language.
2017-12-19 13:52:05 +01:00
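The Turkish locale problem described in commit 5db1c9155a can be reproduced with a minimal Java sketch. The class `LocaleCaseDemo` and the helper `normalizeHost` are hypothetical names chosen for illustration; the sketch assumes the fix amounts to passing an explicit, locale-neutral `Locale.ROOT` to the case conversion, which may differ from the actual YaCy change.

```java
import java.util.Locale;

public class LocaleCaseDemo {

    /** Hypothetical helper: lower-cases a host name independently of the JVM default locale. */
    static String normalizeHost(String host) {
        return host.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        // Under Turkish rules, 'I' lower-cases to dotless 'ı' (U+0131),
        // so the result no longer matches the ASCII host name.
        System.out.println("WIKI.ORG".toLowerCase(turkish));
        // With Locale.ROOT the conversion is the same on every system.
        System.out.println(normalizeHost("WIKI.ORG"));
    }
}
```

Calling `toLowerCase()` without an argument uses the default locale, which is why the behavior only shows up on systems configured for Turkish.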
luccioman
1c4803e40a Enable optional https support for /yacy/transferURL API calls.
Also updated some Javadoc and consistently use Switchboard instance as a
constructor parameter where relevant.
2017-12-19 12:30:49 +01:00
luccioman
79a2ba306a Updated links to Java Regular Expressions documentation to version 8 2017-12-19 11:14:20 +01:00
luccioman
17e004599d Started implementing optional https preference for protocol operations
Introduced through the new configurable setting
network.unit.protocol.https.preferred, defaulting to false for now.

Lets users choose to prefer https, when available on remote peers, to
perform YaCy protocol operations including notably hello or transferRWI.

Not yet implemented for every YaCy protocol operation.
2017-12-15 11:28:46 +01:00
ScRe13
bb3d3fe074 fixed loading of default settings; load was populated with wrong value 2017-12-12 23:25:56 +01:00
reger
20bba135fe Show either the hide or the show public surftip button depending on current config status,
so that only the button switching the status is shown (hiding the button for the current status)
2017-12-10 01:25:20 +01:00
Michael Peter Christen
b907819cb4 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git 2017-12-09 22:29:54 +01:00
Michael Peter Christen
25573bd5ab added a crawl filter based on <div> tag class names
When a crawl is started, a new field to exclude content from scraping is
available. The field can be identified with the class name of div tags.
All text contained in such a div tag where the configured class name(s)
match are not indexed, while the remaining page is indexed.
2017-12-09 22:29:35 +01:00
luccioman
640fed2a9c Removed Java 1.8 version checking, no longer necessary (fixes issue #147)
Java 1.8 is by the way now a prerequisite to run from latest sources.
2017-12-08 15:26:46 +01:00
luccioman
d95b288f19 Removed use of deprecated Jetty IPAccessHandler for client filtering.
Upgraded to InetAccessHandler.
Added InetPathAccessHandler extension to InetAccessHandler to maintain
path patterns capability previously available in IPAccessHandler but
lost in InetAccessHandler.

Filtering on IPv6 addresses is now supported.

Support for deprecated pattern formats such as "192.168." and
"192.168.1.1/path" has been removed, but startup automated migration
should convert such patterns eventually present in serverClient.
2017-12-08 15:12:08 +01:00
Michael Peter Christen
607b39b427 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
Conflicts:
	htroot/yacysearchitem.java
2017-12-07 15:25:41 +01:00
Michael Peter Christen
4355de0f3c (more!) evaluation of XRealIP from nginx reverse proxy 2017-12-07 15:16:11 +01:00
luccioman
f9cba827c0 Made "tld:" modifier case insensitive and IDN compliant.
Thus allowing typing internationalized top-level domains with non ASCII
characters as tld: modifier.
2017-12-04 19:13:16 +01:00
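A minimal Java sketch of what case-insensitive, IDN-aware handling of a top-level domain could look like, as described in commit f9cba827c0. It relies on the standard `java.net.IDN` class; the class `TldModifierDemo` and the helper `normalizeTld` are hypothetical, and the sketch is an assumption about the general technique, not the code YaCy actually uses.

```java
import java.net.IDN;
import java.util.Locale;

public class TldModifierDemo {

    /**
     * Hypothetical helper: converts a user-typed top-level domain to its
     * lower-case ASCII (Punycode) form so that it can be compared with host names.
     */
    static String normalizeTld(String tld) {
        return IDN.toASCII(tld.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Plain ASCII input is only lower-cased.
        System.out.println(normalizeTld("COM"));
        // A Cyrillic top-level domain is converted to its "xn--" ASCII form.
        System.out.println(normalizeTld("\u0420\u0424"));
    }
}
```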
luccioman
c5c3cc1274 Use HTTP Post operation for resetting memory monitoring state.
Fixes issue #145

Also added textual hint on the button, and display it only when it makes
sense, that is to say when the memory state is 'exhausted'.
2017-12-04 08:48:37 +01:00
luccioman
cb10daba92 Renamed Chinese & Greek lng files using ISO639-1 codes.
Previously named with their ISO 3166-1 country code : this way, when
setting language to "Browser" in ConfigBasic.html, it didn't work
properly when browser preferred language was Chinese or Greek as their
respective language codes are "zh" and "el" (not "cn" and "gr" which are
their country codes)
2017-11-04 11:06:05 +01:00
luccioman
4b61edff32 Added a help link to ISO 639-1 language codes list ref 2017-11-03 10:34:36 +01:00
luccioman
a994d439af Added description of spatial restrictions in search options 2017-11-02 08:57:00 +01:00
luccioman
8a48f80909 Added language HTML attribute to the search home page. 2017-10-31 08:19:04 +01:00
luccioman
5ff76fdcb9 Fixed spelling 2017-10-31 07:52:30 +01:00
luccioman
2c3f0ff9e8 Updated search page keyboard shortcuts descriptions. 2017-10-31 07:44:37 +01:00
luccioman
af825e9ffc Use accessible labels for search home page radio buttons. 2017-10-30 08:07:59 +01:00
luccioman
8e732d437c Enable HTTP Digest authentication for non admin users.
Also ensure authentication is not lost by Digest timeout when navigating
between index.html and search results page.

This way, running searches with extended features on a remote peer or a
password protected peer works with a regular user (with "Extended
search" rights). 
When authenticating on the search page with a user without "Extended
search" rights, it appears as authenticated, but has just its usual
access to the public search features.
2017-10-26 07:51:18 +02:00
luccioman
5161451a35 Stay authenticated when going to the search start page.
Otherwise, when authenticated as admin and navigating from search
results or admin pages to the search start page (/index.html), if
nothing is done on that page within HTTP Digest Auth timeout (about
2mn), then search is performed without authentication and so without
extended search features.
2017-10-24 09:54:54 +02:00
luccioman
d0bed78d02 Use the same top nav bar on index.html and search results.
The search start page can thus include the same optional login link/status
as the results page, for the same convenient
login without the need to use the Administration section.
2017-10-24 09:34:03 +02:00
luccioman
f678394ce5 Fixed loss of index page form values on 'more options' link click.
Restores the behavior introduced eleven years ago (see commit 479861a3cf)
and lost by mistake 3 years ago (see commit 617dd9c97b), when the
click handler started referencing a missing HTML id.
2017-10-23 18:28:11 +02:00
luccioman
af198b990b Added an optional login link/status to the search public top nav bar.
Thus allowing a more convenient way (without the need to go to the admin
section) to login when searching on your remote or password protected
peer and benefit from extended search features such as Heuristics,
Bookmarking or JavaScript resorting.

Can be disabled using the ConfigSearchPage_p.html.
2017-10-21 10:57:36 +02:00
luccioman
1de86cf1bf Fixed JPEG snapshot resizing when running on OpenJDK.
Resizing JPEG snapshot images through /api/snapshot.jpg failed when
running on OpenJDK, but rendered successfully with an Oracle JDK.
Details in mantis 772 ( http://mantis.tokeek.de/view.php?id=772 ).

Removing any alpha component (useless in snapshot images) from the
rendered resized image solves the issue.
2017-10-19 09:27:52 +02:00
luccioman
a17a418e78 Fixed NullPointerException cases on snapshot images parsing. 2017-10-18 08:31:18 +02:00
luccioman
285f0d6a39 Consistently encode snapshot image with format requested on the API.
Previously, calling /api/snapshot.png rendered JPEG encoded images.
2017-10-18 07:53:07 +02:00
luccioman
4da15db998 Fixed search result Snapshots link.
Previously rendered as a broken URL containing the absolute file path of
a snapshot on the search server.

Now rendered as a valid URL linking to the /api/snapshot API to provide
available snapshot content. Snapshot format is selected among the
available ones in the following order of preference  : JPG/PNG, PDF, and
XML.
2017-10-17 09:41:58 +02:00
luccioman
fe75f326d8 Fixed ProfilingGraph calculation integer overflows and added test class.
Complementary to fix proposed in PR #128 by @otteresk.
2017-10-16 09:18:12 +02:00
luccioman
8303e15419 Reduced number of search navigators refresh requests in JS resort mode
The SearchEvent listens to changes on each of its navigators, and the
information about their overall state is sent with each fetched search
item (as a "data-nav-generation" attribute). Then the browser can
regularly fetch a fresh version of yacysearchtrailer.html only if
necessary (when that nav-generation value changes).
2017-10-12 07:16:19 +02:00
luccioman
2ac78e2cca Added missing parameters to yacysearchtrailer call on JS resort mode 2017-10-11 07:13:28 +02:00
luccioman
dbff7b14fc Add a configurable limit to tags initially displayed in search results
When the limit is reached, a button allows expanding/collapsing remaining
tags.

When this feature is activated without a limit to the number of
displayed tags, when encountering search results with a very large
number of keywords, the results page can become almost unusable (very
long vertical scrollbar)
2017-10-09 14:13:46 +02:00
reger
f8c7d0265e Adjust tags css style in ConfigSearchPage to equal search page 2017-10-07 06:13:22 +02:00
luccioman
fcea6def72 Added textual hints to language radio buttons labels
As a help and accessible alternative to visual styling marking whether
a language is available in browser preferred lang mode.
2017-10-02 10:05:57 +02:00
luccioman
27ab733685 Ensure private search features are not lost on Digest auth timeout
This is a fix for mantis 766 ( http://mantis.tokeek.de/view.php?id=766 )

Since the upgrade to Digest authentication, access to protected search
features was indeed disabled once the Digest nonce timed out.

After Digest auth timeout the browser no longer sent authentication
information and as the search results page is not private, protected
features were simply hidden without asking the browser again for
authentication.

Adding a supplementary parameter when accessing the search results as
authenticated fixes this.
2017-09-29 19:18:12 +02:00
reger
dd82f85953 Add links to the optional keyword tags of search result
If switched on, clicking the tag link adds the keyword to the search query.
If a keyword navigator is active, the selected keyword adds or replaces
a query keyword: modifier (currently replace was chosen as multiple
keywords are not fully supported yet)
2017-09-28 00:46:49 +02:00
luccioman
fc28c58731 Added missing accessible labels to ConfigSearchPage_p.html 2017-09-26 14:58:30 +02:00
luccioman
8294374c10 Fixed ConfigSearchPage_p HTML validation errors.
Validated with Nu Html Checker 17.9.0
2017-09-26 07:59:44 +02:00
luccioman
57a33aefb0 Removed unnecessary max counts init on empty search navigators. 2017-09-25 15:21:17 +02:00
luccioman
b1e7bd0dd6 Restrict Search Result Layout modification to HTTP POST only. 2017-09-25 14:54:35 +02:00
luccioman
ef8aea7f8d Made the dates navigator max elements number user configurable.
Also used object properties on QueryParams instances, rather than using
mutable class (static) properties.
2017-09-25 09:19:08 +02:00
luccioman
0b0980b364 Improved accessibility of histograms widgets.
Added keyboard navigation support and missing WAI-ARIA attributes.

Tested with NVDA 2017.3 screenreader on recent major browsers.
2017-09-22 11:00:46 +02:00
luccioman
62c7cd9a77 Upgraded JavaScript lib raphael.js from 2.1.3 to 2.2.7 2017-09-20 07:59:20 +02:00
luccioman
cbbc7b43d3 Refresh pagination buttons instead of fully rendering each time.
This prevents the already displayed pagination buttons from being unresponsive
when clicking on them while the rendering JS function is running.
2017-09-18 17:36:07 +02:00
luccioman
18412dca21 Handle JS refreshing of belatedly added search navigators 2017-09-16 10:13:09 +02:00
luccioman
9049a926a5 Restrict JS results resorting to authenticated users.
Until a more efficient DOM refresh model needing less XHR requests per
search is implemented.
2017-09-16 09:26:08 +02:00
luccioman
4ab961fa46 Added HTML ids to search navigators for a more reliable JS refreshing. 2017-09-15 14:23:49 +02:00