luccioman
0c9e0b3566
Record recrawl calls to make them schedulable
2018-01-10 17:05:53 +01:00
luccioman
433e241e4f
Added a report info box about the last terminated recrawl job, if any
...
For easier monitoring of recrawls.
2018-01-09 22:33:15 +01:00
luccioman
b2af25b14f
Added a stop condition to the Recrawl busy thread
2018-01-09 10:22:26 +01:00
luccioman
421728d25a
Made it possible to customize the selection query before launching a recrawl
2018-01-08 21:20:46 +01:00
luccioman
fab6e54fec
Enforced controls (HTTP method, token) on ReIndex and ReCrawl operations
2018-01-07 15:25:16 +01:00
luccioman
36e9b1c5b3
Fixed occasional time-dependent failures of the SegmentTest test case
...
As highlighted by latest automated Travis builds.
2018-01-02 10:21:07 +01:00
luccioman
8a4ea1c11e
Added UI switch to control content domain constraint per search request
2018-01-02 08:13:14 +01:00
luccioman
36a45b3905
Added UI setting for strictness of content-type checking on media search
2017-12-29 11:32:42 +01:00
reger
cedb53be4e
upd to commons-io-2.6
2017-12-28 03:13:42 +01:00
reger
f8071ac8ae
Make TokenizedStringNavigator (used for keyword search facet) active
...
check case-insensitive.
As keywords are compared in lower case, make sure user input keyword:Key
or keyword:key is shown as active in the facet entry key.
2017-12-28 02:51:52 +01:00
reger
270b77074e
upd to httpclient-4.5.4 and httpmime-4.5.4
2017-12-24 01:34:23 +01:00
reger
6db7f5525b
upd to icu4j-60.2
2017-12-24 01:02:18 +01:00
luccioman
e6907fdab3
Added optional search parameter/setting to control content domain filter
...
This allows choosing, in the configuration or per search request, whether
or not to extend results beyond the strict content domain filter (image,
video, audio or application).
Related graphical controls are still to be added to the user interface.
2017-12-23 18:56:17 +01:00
luccioman
f52217c939
Enable full-size image preview for users with extended search rights
2017-12-22 11:39:30 +01:00
luccioman
d42c1773c8
Added UI setting for optional encryption with https on p2p searches
2017-12-22 11:01:02 +01:00
luccioman
09c4ee56a7
Added optional https support for remote crawl and profile operations
2017-12-21 18:41:32 +01:00
luccioman
5db1c9155a
Do locale-independent case conversion on hosts, schemes, and file exts.
...
Required for proper operation when the default system locale is Turkish,
as dotless and dotted i characters have specific case conversion rules
in this language.
2017-12-19 13:52:05 +01:00
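The Turkish locale issue described above can be reproduced with plain JDK calls. The following is a minimal sketch of the pitfall and of a locale-independent conversion, not the actual YaCy change: the class `LocaleSafeCase`, its methods, and the sample host name are hypothetical and exist only for illustration.

```java
import java.util.Locale;

// Hypothetical illustration class, not part of YaCy.
final class LocaleSafeCase {

    /** Lower-cases with Turkish rules: ASCII 'I' becomes dotless i (U+0131). */
    static String turkishLower(String s) {
        return s.toLowerCase(Locale.forLanguageTag("tr-TR"));
    }

    /** Lower-cases a host name the same way on every system locale. */
    static String normalizeHost(String host) {
        return host.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String host = "WIKI.EXAMPLE.ORG"; // assumed sample value
        // With Turkish rules the result no longer equals the ASCII host name.
        System.out.println(turkishLower(host).equals("wiki.example.org"));
        // With Locale.ROOT the result does not depend on the system locale.
        System.out.println(normalizeHost(host).equals("wiki.example.org"));
    }
}
```

Calling `toLowerCase()` without an argument uses the default system locale, which is why the behavior only shows up on systems configured for Turkish.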
luccioman
1c4803e40a
Enable optional https support for /yacy/transferURL API calls.
...
Also updated some Javadoc and consistently use Switchboard instance as a
constructor parameter where relevant.
2017-12-19 12:30:49 +01:00
luccioman
79a2ba306a
Updated links to Java Regular Expressions documentation to version 8
2017-12-19 11:14:20 +01:00
reger
c94bc82f6a
upd to commons-compress-1.15
2017-12-16 00:49:48 +01:00
luccioman
c6e1befbca
Restored peer URL host name stripping removed from previous commit.
...
Still useful for peers with IPv6 addresses.
2017-12-15 17:03:35 +01:00
luccioman
17e004599d
Started implementing optional https preference for protocol operations
...
Introduced through the new configurable setting
network.unit.protocol.https.preferred, defaulting to false for now.
Lets users choose to prefer https when available on remote peers to
perform YaCy protocol operations, notably hello or transferRWI.
Not yet implemented for every YaCy protocol operation.
2017-12-15 11:28:46 +01:00
luccioman
2bc61f5657
Merge pull request #149 from Scre13/bugfix_default_settings
...
Fixed loading default thread load setting in Performance Settings of Queues and Processes.
2017-12-13 07:38:04 +01:00
ScRe13
bb3d3fe074
fixed loading of default settings; load was populated with wrong value
2017-12-12 23:25:56 +01:00
reger
20bba135fe
Show either the hide or the show public surftip button depending on current config status,
...
so that only the button to switch the status is shown (the button for the current status is hidden)
2017-12-10 01:25:20 +01:00
Michael Peter Christen
b907819cb4
Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
2017-12-09 22:29:54 +01:00
Michael Peter Christen
25573bd5ab
added a crawl filter based on <div> tag class names
...
When a crawl is started, a new field to exclude content from scraping is
available. The content to exclude is identified by the class name of div tags.
All text contained in a div tag whose class name matches one of the
configured class names is not indexed, while the rest of the page is indexed.
2017-12-09 22:29:35 +01:00
luccioman
640fed2a9c
Removed Java 1.8 version checking, no longer necessary (fixes issue #147)
...
Java 1.8 is now a prerequisite to run from the latest sources.
2017-12-08 15:26:46 +01:00
luccioman
d95b288f19
Removed use of deprecated Jetty IPAccessHandler for client filtering.
...
Upgraded to InetAccessHandler.
Added InetPathAccessHandler extension to InetAccessHandler to maintain
path patterns capability previously available in IPAccessHandler but
lost in InetAccessHandler.
Filtering on IPv6 addresses is now supported.
Support for deprecated pattern formats such as "192.168." and
"192.168.1.1/path" has been removed, but an automated migration at startup
should convert any such patterns still present in serverClient.
2017-12-08 15:12:08 +01:00
reger
cc7a93e6b6
remove deprecated jetty continuation class from urlproxyservlet
...
(was a long-standing carry-over, while async requests are not supported)
2017-12-08 01:01:07 +01:00
Michael Peter Christen
607b39b427
Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
...
Conflicts:
htroot/yacysearchitem.java
2017-12-07 15:25:41 +01:00
Michael Peter Christen
4355de0f3c
(more!) evaluation of XRealIP from nginx reverse proxy
2017-12-07 15:16:11 +01:00
reger
e5b4799838
upd to Jetty-9.4.8.v20171121
2017-12-07 00:24:33 +01:00
luccioman
f9cba827c0
Made "tld:" modifier case insensitive and IDN compliant.
...
This allows typing internationalized top-level domains with non-ASCII
characters in the tld: modifier.
2017-12-04 19:13:16 +01:00
luccioman
a4494d6e01
Improved support for internationalized domain names on "site:" modifier
...
Allows typing internationalized domain names, including non-ASCII
characters, directly in the search field.
Search is done using the ASCII Compatible Encoding (ACE) representation.
2017-12-04 18:23:26 +01:00
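The commit message above says that the search is done on the ASCII Compatible Encoding of the typed domain name. The following is a minimal sketch of such a conversion, assuming the standard `java.net.IDN` class; the commit does not say which API YaCy actually calls, and `SiteModifierSketch` and `toAceHost` are hypothetical names.

```java
import java.net.IDN;
import java.util.Locale;

// Hypothetical illustration class, not part of YaCy.
final class SiteModifierSketch {

    /**
     * Converts a host name typed after "site:" to its ACE form.
     * Pure ASCII labels are left unchanged by IDN.toASCII, so a
     * locale-independent lower-casing step is applied afterwards.
     */
    static String toAceHost(String typedHost) {
        return IDN.toASCII(typedHost.trim()).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "b\u00fccher.example" is the escaped form of a host with u-umlaut.
        System.out.println(toAceHost("b\u00fccher.example"));
        System.out.println(toAceHost("EXAMPLE.org"));
    }
}
```

`IDN.toASCII` throws `IllegalArgumentException` on input that cannot be converted, so a real query parser would need to decide how to handle such modifiers.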
luccioman
d07006bac4
Do locale-independent case conversion on "filetype:" query modifier.
2017-12-04 14:11:29 +01:00
luccioman
8fbf25d1ed
Made "site:" query modifier case insensitive.
2017-12-04 14:08:34 +01:00
luccioman
867388e05b
Refactored 'site:' query modifier parsing into a dedicated function.
2017-12-04 13:58:15 +01:00
luccioman
c5c3cc1274
Use HTTP Post operation for resetting memory monitoring state.
...
Fixes issue #145
Also added textual hint on the button, and display it only when it makes
sense, that is to say when the memory state is 'exhausted'.
2017-12-04 08:48:37 +01:00
reger
0704b1d644
upd to httpcore-4.4.8
2017-12-04 01:12:50 +01:00
luccioman
bfe753acea
Merge pull request #144 from him2him2/_fic_HTTPS
...
Update HTTP -> HTTPS in README.md
2017-12-02 08:45:42 +01:00
luccioman
c9d80b5b77
Prefer fine URL match over approximate URL mask regex on final filtering
...
Also prevent adding a redundant and CPU costly Solr url mask filter
query when possible
2017-12-01 11:52:52 +01:00
luccioman
0a120787e3
Improved accuracy of URL search filters: protocol, tld, host, file ext
2017-12-01 11:19:31 +01:00
luccioman
d1c7dfd852
Fixed URL parsing with fragment and empty path
2017-12-01 09:48:42 +01:00
luccioman
e07ef1b610
Apply tld query modifier on Solr host_s mandatory field.
...
The filter is thus much more likely to be effective than when applied
on the optional field host_dnc_s.
2017-12-01 08:46:46 +01:00
luccioman
478e92deff
Fixed url mask filter generated when protocol modifier is not null
2017-11-30 20:21:45 +01:00
luccioman
29de4a65d7
Refactored url mask filter build from query modifiers
...
For better readability and easier unit testing.
2017-11-30 09:20:32 +01:00
reger
a1879115dc
upd to Jsoup-1.11.2
2017-11-26 22:01:42 +01:00
reger
d5a75537e4
remove redundant setting of timeout for remoteinstance
...
and replace deprecated updatesolrclient instantiation with recommended builder
2017-11-26 02:53:51 +01:00
luccioman
f01aac31fd
Made it possible to use https for remote search on peers with SSL enabled.
...
Default is still http to prevent any regressions, but a new setting is
available to choose https as the preferred protocol to perform remote
searches.
New configuration setting 'remotesearch.https.preferred' is manually
editable in yacy.conf file or in Advanced Properties page
(/ConfigProperties_p.html).
Should be enabled by default in the future for improved privacy.
Https could possibly also be used for other peer communications.
2017-11-24 14:10:41 +01:00