Commit Graph

5895 Commits

Author SHA1 Message Date
luccioman
18412dca21 Handle JS refreshing of belatedly added search navigators 2017-09-16 10:13:09 +02:00
luccioman
9049a926a5 Restrict JS results resorting to authenticated users.
Until a more efficient DOM refresh model needing fewer XHR requests per
search is implemented.
2017-09-16 09:26:08 +02:00
luccioman
4ab961fa46 Added HTML ids to search navigators for a more reliable JS refreshing. 2017-09-15 14:23:49 +02:00
luccioman
ad61a3afed Results JS resort : properly handle results with same ranking value. 2017-09-15 12:16:24 +02:00
luccioman
57a1007772 Added new graphical setting for browser JS/On demand results resorting. 2017-09-15 11:12:23 +02:00
luccioman
d00a35576c Apply JS resort only when currently relevant : p2p text search 2017-09-15 09:51:34 +02:00
luccioman
4e3c928d31 Do not animate unnecessarily when changing page on JS sorted results. 2017-09-14 09:36:55 +02:00
luccioman
fb6743e8f8 Prevent unnecessary DOM finds in JS resorting functions.
Also removed now unused functions earlierPage() and laterPage().
2017-09-13 19:03:01 +02:00
luccioman
b1b9ffbbc8 Stop updating results with JS resorting on server feeds termination 2017-09-13 09:03:24 +02:00
luccioman
6f5e55c9f0 Updated the JavaScript license information page 2017-09-13 08:35:15 +02:00
luccioman
c7149acb48 Disabled as default verbose browser console logs in yacysort.js 2017-09-13 08:23:19 +02:00
luccioman
b50700c35f Added missing copyright header to the yacysort.js file 2017-09-13 08:16:29 +02:00
luccioman
86d41f0242 Moved the JS resort specific styling to the usual YaCy CSS location 2017-09-13 08:08:38 +02:00
luccioman
9e86d183b8 Disable manual search results resorting when resorting is done with JS
Also added a constant for the js resorting setting key.
2017-09-13 07:58:05 +02:00
luccioman
4ccd38357f Trigger js resorting animations using only CSS classes.
Also added some more descriptive comments.
2017-09-13 07:41:03 +02:00
luccioman
e40a225bc1 Merge branch 'javascript-resort' of https://github.com/Scarfmonster/yacy_search_server into jsResort 2017-09-13 07:29:58 +02:00
Ryszard Goń
2af011243f Javascript re-sorting: Remove potentially breaking display property and reset max-height when animation is finished. 2017-09-11 20:02:19 +02:00
Ryszard Goń
634f52fefc Javascript re-sorting: replace jQuery show() with css animations 2017-09-10 17:20:12 +02:00
luccioman
5d3ceb31b7 Improved search navigators counters accuracy and consistency.
- added some missing increments from RWI results
- decrement relevant navigator counts when solr or RWI results are
evicted because of duplicates detection or constraints checked belatedly
- do not compute facets when unnecessary to avoid unwanted CPU load
- do not increment from facets when already done
- do not rely on facets on remote solr peers requests, as most of the
time only a limited part of their total results is fetched (thus also
preventing unnecessary load on remote peers)
- use a concurrency friendly score map for the dates navigators to
prevent unwanted ConcurrentModificationExceptions

This improves the situation for the most obvious inconsistencies in
search navigators counts, but more has to be done for a true accuracy
(notably when query modifiers constraints are applied belatedly - after
the solr or RWI retrieval request - such as the content domain
constraint)
2017-09-06 16:58:40 +02:00
JeremyRand
ab0e50b941 Javascript re-sorting: optimize the jQuery selectors a little bit. 2017-09-03 18:09:52 +00:00
JeremyRand
86b5094970 Fix numbered page navigation from getting corrupted when statistics() runs. 2017-09-03 18:09:51 +00:00
JeremyRand
a888254769 Add UI for numbered page navigation when Javascript re-sorting is enabled. 2017-09-03 18:09:44 +00:00
JeremyRand
74333c931e Fix the sidebar item "Wiki Name Space" with Javascript re-sorting. 2017-09-03 17:50:17 +00:00
JeremyRand
4a9e64caea (WIP) Add numbered page navigation when Javascript re-sorting is enabled.
TODO: Add UI for selecting the number.
2017-09-03 17:50:17 +00:00
JeremyRand
6ec256dc34 (WIP) Fix the sidebar when Javascript resorting is in use.
TODO: Add some markup so that DOM traversal in the animations is less painful.
2017-09-03 17:50:16 +00:00
JeremyRand
d37df75afa (WIP) Optionally sort HTML search items via Javascript.
TODO: Expose a GUI setting for this.
2017-09-03 17:50:08 +00:00
JeremyRand
61be709a97 Add data-ranking attribute to each HTML search item. 2017-09-03 17:44:19 +00:00
luccioman
a28428047a Fixed count of filtered results from local solr.
Was inadequately modified in my previous related commits (making next
pages buttons unavailable in Search portal mode), as
SearchEvent.local_solr_available did not count the total filtered
results but only the ones within the currently fetched result page(s).
2017-08-31 11:24:59 +02:00
luccioman
30c2f50e0b Use final results counts in progress bar detailed statistics.
Using unfiltered detailed counts (local and remote entries found before
doubles detection and before applying query modifiers) was confusing and
inconsistent with the total count. It could suggest that more results are
to come in the next pages, without explaining why they are not
displayed.
2017-08-31 07:37:24 +02:00
luccioman
8b25b485eb Make result action links visible when focusing them with keyboard. 2017-08-29 08:16:12 +02:00
luccioman
3e933979df Removed duplicate HTML class attribute. 2017-08-29 07:39:12 +02:00
luccioman
ce22076920 Fixed Unresolved_Pattern occurrence on results favicon HTML id. 2017-08-29 07:32:33 +02:00
luccioman
a1a0515312 Added a button to manually refresh sorting of p2p search results.
As a server-side oriented alternative to the JavaScript realtime
resorting feature proposed in PR #104.
The goal is the same as in this PR : having the possibility to compensate
for the network latency of various peers results fetching and obtain, once
possible, a consistently ranked result set.
2017-08-28 19:03:51 +02:00
luccioman
4eba88f2ff Removed some unnecessary uses of java.lang.reflect api.
This improves code browsing and readability, making search by references
or call hierarchy IDE features more accurate.
2017-08-24 18:47:18 +02:00
reger
51a4e03c93 Allow to stop currently running warc import (stop button) 2017-08-20 22:17:27 +02:00
luccioman
3f0446f14b Ensure proper synchronous robots entry retrieval on first check.
Previously, when checking for the first time the robots.txt policy on an
unknown host (not cached in the robots table), the result was always empty
in the /getpageinfo_p.xml api and in the /CrawlCheck_p.html page. Next
calls returned however the correct information.
2017-08-16 09:30:33 +02:00
luccioman
b23a563065 Prevent search result failure on incomplete images information.
Complements the recent modification related to images in commit 7f395ef.

Unfortunately many documents metadata fetched from the freeworld p2p
network have only partial information about embedded images. Without
proper error handling, this made many searches in p2p mode fail
completely.
2017-08-15 10:11:05 +02:00
Michael Peter Christen
7f395ef937 added image link in search results
This should be a help to make a preview of search results.
The image is computed from the list of embedded images, it is
always the first image in that list.
In rss-type results the image is presented like
<media:content medium="image" url="https://abc.xyz/logo.png"/>
as defined in
http://www.rssboard.org/media-rss#media-content
2017-08-14 20:12:09 +02:00
reger
4979439e87 Skip public post of jre version.
Added to determine switch to java8  596b5dfa59
2017-08-06 23:41:53 +02:00
reger
588c6e96fb upd version for typeahead.jquery.js in jslicense.html 2017-07-16 23:35:56 +02:00
luccioman
8100c033a2 URL Viewer : apply crawler size limits when adding to local index.
This allows large files parsing and preview, while preventing unwanted
OutOfMemory errors which are likely to occur when adding to the Solr
Index resources larger than configured crawler limits.
2017-07-16 14:37:06 +02:00
reger
e5cff062b5 Clean up redundant but obsolete jquery.rdfquery-core-1.0.js script lib 2017-07-14 23:41:39 +02:00
reger
23bda133d2 Fix css conflict of YMarks.html to make it viewable.
yacy-ymarks.css sidebar conflicts with bootstraps sidebar (different
overlay settings). Simply renamed it to ymark-sidebar.
2017-07-09 23:08:54 +02:00
reger
a21789d4e7 Fix unresolved pattern in api/share.html by init some display var's 2017-07-08 22:46:15 +02:00
luccioman
bf55f1d6e5 Started support of partial parsing on large streamed resources.
This enables the getpageinfo_p API to return something in a reasonable amount
of time on resources in the megabyte size range and above.
Support added first with the generic XML parser, for other formats
regular crawler limits apply as usual.
2017-07-08 09:04:03 +02:00
luccioman
1b3c169a9c URL Viewer : decode raw text using the eventual response charset.
When provided, or decode as UTF-8 as previously done.
2017-07-03 13:51:14 +02:00
reger
e6e20dab52 upd to Jetty 9.4.6.v20170531
Modify loginservice to the changes in Jetty, partially based on pull
request #101 https://github.com/yacy/yacy_search_server/pull/101 by @automenta
2017-07-01 23:58:28 +02:00
luccioman
e4c730b99f Updated PerformanceQueues_p.xml API with last related servlet changes 2017-06-30 11:41:48 +02:00
luccioman
dcc56318bb Made remote search max system load limits configurable from UI.
As reported by davide on YaCy forums (
http://forum.yacy-websuche.de/viewtopic.php?f=23&t=6004 ) when the
system is on high load, unless reading carefully YaCy configuration
file, it could be difficult to understand why remote search results are
not fetched.
2017-06-30 11:30:54 +02:00
luccioman
4b72b29ea2 Added an informative title on the crawl start robots.txt status icon 2017-06-29 11:36:47 +02:00
luccioman
d08f31c3a8 Crawl start Ajax request : properly handle eventual XML parsing errors
Otherwise on a malformed getpageinfo_p XML response (from the browser
point of view), JavaScript errors were thrown and the ajax status
steering wheel remained displayed indefinitely.
2017-06-29 11:25:27 +02:00
luccioman
8da3174867 Ensure lower case conversion consistency with any default locale.
Especially for Turkish speaking users using "tr" as their system default
locale : strings for technical stuff (URLs, tag names, constants...)
must not be lower cased with the default locale, as 'I' doesn't become
'i' like in other locales such as "en", but becomes 'ı'.
2017-06-27 06:42:33 +02:00
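The locale issue described in commit 8da3174867 can be reproduced with a few lines of Java. This is a minimal sketch, not the code from the commit: the class and method names are hypothetical, and `Locale.ROOT` is used here as one locale-neutral choice for technical strings.

```java
import java.util.Locale;

class LocaleSafeLowerCase {

    /** Lower-cases a technical string (URL, tag name, constant) independently of the system default locale. */
    static String technicalLowerCase(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // With Turkish rules 'I' maps to the dotless 'ı' (U+0131), so the comparison fails.
        System.out.println("TITLE".toLowerCase(turkish).equals("title")); // false
        // With a fixed, neutral locale the result is the same on every system.
        System.out.println(technicalLowerCase("TITLE").equals("title")); // true
    }
}
```

Calling `toLowerCase()` without an argument uses the system default locale, which is why the problem only shows up on some installations.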
luccioman
c41b31dcb3 Cleaned up memory usage page HTML
- fixed validation errors
- removed deprecated attributes
- improved accessibility with richer table semantics (headers and
caption elements) and language declaration
2017-06-20 09:21:55 +02:00
luccioman
0487336ec3 Prevent integer overflow in table statistics and use strong typing 2017-06-19 17:02:11 +02:00
luccioman
0f80c978d6 Limit the number of initially previewed links in crawl start pages.
This prevents rendering a big and inconvenient scrollbar on resources
containing many links.
If really needed, preview of all links is still available with a "Show
all links" button.

Doesn't affect the number of links used once the crawl is effectively
started, as the list is then loaded again server-side.
2017-06-17 09:33:14 +02:00
luccioman
32288a8999 Merge branch 'master' of https://github.com/yacy/yacy_search_server 2017-06-17 08:16:55 +02:00
luccioman
e9b4b29f90 Limit scope of some local JavaScript variables. 2017-06-16 08:50:57 +02:00
Michael Peter Christen
369b8e0e0b added json(p) endpoint for crawl start 2017-06-16 08:44:40 +02:00
luccioman
9dd790087d Added HT Cache basic statistics (hit rate) 2017-06-15 09:50:02 +02:00
luccioman
28b451a0b3 Made Cache compression level and lock timeout user configurable 2017-06-14 19:02:08 +02:00
Michael Peter Christen
6fe735945d migrated Solr 5.5 -> Solr 6.6 and from Java 1.7 -> 1.8
Also: now Version 1.921
2017-06-09 12:25:23 +02:00
luccioman
8399275142 Properly close file output streams even on exceptions scenarios. 2017-06-08 07:19:16 +02:00
reger
632354e2ff Tokenize result entry keywords and add some styling for display 2017-06-04 01:50:40 +02:00
reger
a814f3d885 Introduce keyword query parameter
This enables keyword navigator to filter on keywords. Added search page
output and layout config for keywords, allowing e.g. in Intranet use
to display the keywords. No styling or links applied to the keyword
text (but is desirable possibly in combination with bootstrap-tagsinput
for future/intranet).
2017-06-02 01:00:21 +02:00
luccioman
cbccf97361 Added JavaDoc to the getpageinfo_p API servlet. 2017-05-30 17:38:16 +02:00
luccioman
bd88fd303e Deprecated duplicated and internally unused getpageinfo servlet.
Redirections set for the transition of any eventual external uses:
 - /api/getpageinfo.xml to /api/getpageinfo_p.xml
 - /api/getpageinfo.json to /api/getpageinfo_p.json
2017-05-30 09:29:28 +02:00
luccioman
1be4d32f99 Restored search page default behavior for Tab, Page Up and Down keys
Replaced by shortcuts defined by the HTML "accesskey" attribute which
has the advantage to be advertised by screen readers when focusing the
corresponding buttons, contrary to custom JavaScript key handlers.
Now with Firefox :
 - "Alt + Shift + n" for next page
 - "Alt + Shift + p" for previous page

Following ARIA recommendation : "keyboard shortcuts enhance, not
replace, standard keyboard access." ( see
https://www.w3.org/TR/wai-aria-practices/#kbd_shortcuts_behavior_design)

Fix for mantis 711 (http://mantis.tokeek.de/view.php?id=711)
2017-05-23 07:25:40 +02:00
luccioman
45346c1be8 Added missing accessibility attributes on search results progress bar. 2017-05-16 09:44:13 +02:00
luccioman
91a06bc669 Annotated search result information separators for screen readers. 2017-05-15 13:31:24 +02:00
luccioman
31ad043bb9 Added user interface feedback on results feeding termination status.
Added as an additional icon with title in the search progress bar, to
inform about background search feeder threads terminated or still
running. While giving a bit more information to users about the p2p
search process, this can help choosing whether or not wait a little bit
more time before going to the next page, in order to get results from
various sources sorted as best as possible (see #91 for a discussion
about sorting accuracy and network latency).

Other related modifications included :
 - regular updates to statistics in the progress bar until the
background feeders are completely terminated.
 - removed some uses of unsecure and discouraged JavaScript elements
2017-05-15 13:15:16 +02:00
luccioman
d90b001e1b Improved previous merge "Show ranking in HTML UI".
- added the new setting as configurable in the "Debug/Analysis" settings
page. Debug/analysis is its main purpose for now as there is currently
no nice and "understandable" ranking score info servlet (see forum
discussion http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5884 ) 
- render in the "Search Page Layout" page preview when enabled
- added constants
2017-05-11 18:02:33 +02:00
luccioman
efe1232d90 Merge branch 'html-show-ranking' of
https://github.com/JeremyRand/yacy_search_server

Conflicts:
	defaults/yacy.init
2017-05-11 14:53:57 +02:00
luccioman
4564541b3b Fixed blacklist Regex containing '+' characters rendering.
As reported on YaCy forum by shni
(http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5970) when a
blacklist entry contained both '?' and '+' characters, the '+' chars
were wrongly decoded and rendered as spaces.
2017-05-04 11:12:58 +02:00
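The symptom reported for commit 4564541b3b ('+' rendered as a space) is typical of form-style URL decoding applied to a value that was not encoded first. The sketch below only illustrates that general mechanism with the standard `java.net` classes. The class name and the sample blacklist entry are hypothetical, and it does not claim to show the actual code path fixed in YaCy.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class PlusSignDecoding {

    /** Form-decodes a value as UTF-8; in this encoding '+' stands for a space. */
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Form-encodes a value as UTF-8; a literal '+' becomes %2B. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String entry = "example.org/.+\\?id=[0-9]+"; // hypothetical blacklist entry
        // Decoding a raw value replaces every '+' with a space.
        System.out.println(decode(entry));
        // Encoding before transport and decoding exactly once preserves the entry.
        System.out.println(decode(encode(entry)).equals(entry)); // true
    }
}
```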
luccioman
0612a8f4f2 Fixed the previously added link to scheduled dump operations. 2017-05-04 08:45:30 +02:00
luccioman
a87281b498 Added MediaWiki dump import scheduling feature.
Checking the last modified date by default to prevent unnecessary long
running operations.
2017-05-03 18:53:01 +02:00
luccioman
10c03c6c64 Improved MediaWiki dump import monitoring.
When import thread is terminated :
 - now stop refreshing and stay on the monitoring page to give user a
feedback after a long running import
 - added link to the next monitoring step : results from surrogates
reader
 - added link to new import
 
On the new import page, added a link on the eventual last import report.
2017-05-02 09:38:45 +02:00
luccioman
8d288f5dba Crawl results page : apply table lines number limit.
Take into account the already existing default limit value (especially
useful after a long crawl or surrogates import), or a custom one from
parameter "count".
Added a "Show all" link for convenience.
2017-04-27 18:24:54 +02:00
reger
c77e43a391 Take out mailto collect in internal parsed document
As earlier plans to make use of mailto as separate webgraph entity didn't
materialize (see  http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5726&p=32493&hilit=mailto#p32493)
free the unused handling and resources.
2017-04-20 00:18:18 +02:00
reger
bec34d3546 Add url input field as source for WarcImporter
allowing to import warc from url without prior download.
2017-04-16 04:25:29 +02:00
reger
d3df8a46c4 fix unresolved_pattern on missing post parameter api/message.html 2017-04-14 21:14:26 +02:00
luccioman
f66438442e Extended Mediawiki dump import to remote URLs.
When using a public HTTP URL in /IndexImportMediawiki_p.html, the remote
file now is directly streamed and processed, allowing import of several
GB dumps even with a low memory remote peer, and without need to
manually download the dump file first.
2017-04-14 14:32:44 +02:00
luccioman
7edddd7b0d Improved error reports on various wiki dump prerequisites failure cases.
Also added some JavaDoc.
2017-04-11 08:21:34 +02:00
luccioman
dfe8d4139b Used a text input for wiki dump import file selection.
Using an HTML "file" input was confusing (as reported by promocore on
YaCy forum : http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5965) ,
and it only worked with MS IE/Edge on a local YaCy peer :
 - for security reasons some current major browsers such as Firefox or
Chrome do not allow to send full file path information when using a file
form input
 - the local file system selection popup doesn't make sense when you
want to import a dump on a remote YaCy server
2017-04-11 07:34:17 +02:00
reger
3a71430030 Adjust ConfigSearchPage_p to activated hosts navigator as plugin 2017-04-10 22:58:20 +02:00
reger
7b80189bda Activate hosts navigator plugin. This includes rwi results in the navigator
count.
This might be tangentially related to http://mantis.tokeek.de/view.php?id=736
as the example includes a local index search, while rwi results are not
counted.
2017-04-10 22:42:06 +02:00
reger
05a1b14b4a add missing text from ConfigRobotsTxt_p to master.lng
and link to Translation Editor to Translation News page.
2017-04-09 21:42:05 +02:00
reger
a39c00a93f add servlet to list user in UserDB and made user editor available in
separate servlet for a quick and easy overview of configured user and
selection for edit.
2017-04-09 02:09:32 +02:00
reger
a4498e17c0 fix edit current user form to required post method
introduced with cde237b687
2017-04-08 22:54:57 +02:00
luccioman
665d087d76 Enforced access controls on a few more administration pages.
- ensure use of HTTP POST method when performing server side effect
operations
 - transaction token required to ensure the request has effectively been
requested by user interaction
2017-04-03 12:20:16 +02:00
luccioman
0feded21dd Escaped HTML eventually active content from recorded API call comments. 2017-04-03 11:40:37 +02:00
luccioman
09e72eb0a4 Set Config Portal as a private administration page.
Consistently with its required action from submission credentials, and
because external unauthenticated users do not need to access these
settings.
2017-04-03 11:34:49 +02:00
reger
9339a6a4c5 use css error class for error msg in IndexImportOAIPMH_p.html,
adjust to xhtml <p> usage rule
2017-04-02 20:36:22 +02:00
reger
ba339a2a45 Add servlet to import warc file from filesystem IndexImportWarc_p.html.
Apply Importer interface to WarcImporter
2017-04-02 03:32:21 +02:00
Michael Peter Christen
1d81b8f102 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git 2017-04-01 01:04:27 +02:00
Michael Peter Christen
69081bce00 added export to elasticsearch. The export dump can easily be imported to
elasticsearch using the command
curl -XPOST localhost:9200/collection1/yacy/_bulk --data-binary @yacy_dump_XXX.flatjson
2017-04-01 01:04:17 +02:00
luccioman
5b5b9d5d96 URL Viewer : only display the link to metadata when metadata exists 2017-03-30 16:14:22 +02:00
luccioman
39ffa42a3c Modified RWI settings page radio click event to use HTTP POST 2017-03-30 10:23:47 +02:00
luccioman
af28a07780 Updated API calls recording/replay with recent changes.
- enabled HTTP POST calls with Digest HTTP authentication
 - made API calls compatible with API newly restricted to HTTP POST only
with transaction token validation
 - ensured backward compatibility with older entries recorded as HTTP
GET
2017-03-30 09:22:28 +02:00
luccioman
cde237b687 Enforced access controls on some administrative actions.
- ensure use of HTTP POST method : HTTP GET should only be used for
information retrieval and not to perform server side effect operations
(see HTTP standard https://tools.ietf.org/html/rfc7231#section-4.2.1)
 - a transaction token is now required for these administrative form
submissions to ensure the request can not be included in an external
site and performed silently/by mistake by the user browser
2017-03-26 11:48:00 +02:00
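The two rules listed in commit cde237b687 (POST only, plus a transaction token) can be sketched as follows. This is an assumption-laden illustration: the `TransactionTokens` class, its methods and the in-memory token store are hypothetical and are not taken from the YaCy code base.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class TransactionTokens {
    private final SecureRandom random = new SecureRandom();
    private final Set<String> issued = ConcurrentHashMap.newKeySet();

    /** Creates an unpredictable token to embed as a hidden field in an administration form. */
    String issue() {
        byte[] bytes = new byte[16];
        random.nextBytes(bytes);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        issued.add(token);
        return token;
    }

    /** Accepts a request only if it is a POST carrying a token issued earlier; each token is single-use. */
    boolean isAllowed(String httpMethod, String token) {
        return "POST".equals(httpMethod) && token != null && issued.remove(token);
    }

    public static void main(String[] args) {
        TransactionTokens tokens = new TransactionTokens();
        String token = tokens.issue();
        System.out.println(tokens.isAllowed("GET", token));  // false: side effects require POST
        System.out.println(tokens.isAllowed("POST", token)); // true: valid token, first use
        System.out.println(tokens.isAllowed("POST", token)); // false: token already consumed
    }
}
```

A page on an external site cannot read the token from the form, so a request it triggers in the user's browser fails the check.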
reger
cbf58d5f0a Add hint text to default ServerAcess Port Settings page 2017-03-19 21:45:33 +01:00
reger
f05976c017 Display the local search word statistic in alphabetic order 2017-03-19 07:12:35 +01:00
reger
3dd23c178b Introduce the option to configure a shutdown port.
A port value of -1 will disable this option.

If set to a value greater than 0, YaCy listens on this port on the local loopback
address (127.0.0.1) for a shutdown or restart signal.
E.g. connecting to http://localhost:8005/shutdown will stop the YaCy server.
http://localhost:8005/restart will restart it.
This option allows to stop YaCy locally independently of the web
frontend (which might be configured for password protected remote access).
2017-03-19 02:30:08 +01:00
reger
a2afb4bae0 add switchboardconstants for server ports config keys 2017-03-18 20:02:26 +01:00
reger
038b9cd98e update translation for ConfigNetwork_p.html 2017-03-15 22:36:53 +01:00
luccioman
8e77fe3860 Fixed unresolved pattern case in search results progress bar.
This is a fix for mantis 715 (http://mantis.tokeek.de/view.php?id=715).

A possible scenario that could lead to this case :
 - YaCy is running low in memory
 - a search is requested
 - before the end of search results rendering, the cleanup job runs and
deletes the running search event from the cache because of short memory
 - then yacysearchitem renders with "-UNRESOLVED_PATTERN-" parameter
values passed to the statistics() JavaScript function
2017-03-08 10:27:18 +01:00
luccioman
79df5bb20a Fixed settingsAck_p.html back link for case where referrer is stripped. 2017-03-07 12:27:27 +01:00
luccioman
5b03feb776 Fixed unresolved pattern case on /yacysearchlatestinfo.json api 2017-03-03 13:46:44 +01:00
luccioman
0173b0bc32 Added an advanced settings page for referrer policy settings.
Feedback will be welcome, notably on the descriptive content of this
page.
2017-03-03 12:05:30 +01:00
luccioman
cdcd923375 Privacy enhancement : added settings to control referrer policy.
HTTP "Referer" header sent by the browser when using YaCy can now be
controlled either with the referrer meta tag as a global policy, or only
for search result links by adding the attribute rel="noreferrer".

To improve privacy with the fewest possible regressions, the default is
set as meta tag with value "origin-when-cross-origin" : internal YaCy
links behavior is not affected, but when visiting external websites
referrer url is not empty but stripped from query parameters and path.

Older browsers, Safari, MS IE and Edge do not support the referrer meta
tag, so the standard but less flexible noreferrer link type can also be
enabled as an alternative.

User-friendly settings page to be implemented.
2017-02-28 18:11:54 +01:00
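The effect of the "origin-when-cross-origin" default chosen in commit cdcd923375 can be approximated in a few lines. This is a simplified sketch with hypothetical names: it ignores details a browser also handles, such as default-port normalisation and fragment or credential stripping, and the URLs are made-up examples.

```java
import java.net.URI;

class ReferrerPolicySketch {

    /** Returns scheme://host[:port] for the given URL. */
    static String origin(URI uri) {
        int port = uri.getPort();
        return uri.getScheme() + "://" + uri.getHost() + (port == -1 ? "" : ":" + port);
    }

    /** Approximates the Referer value sent under "origin-when-cross-origin". */
    static String referrerFor(String currentPage, String linkTarget) {
        URI from = URI.create(currentPage);
        URI to = URI.create(linkTarget);
        if (origin(from).equals(origin(to))) {
            return currentPage; // same origin: the full URL is kept
        }
        return origin(from) + "/"; // cross origin: path and query parameters are dropped
    }

    public static void main(String[] args) {
        String searchPage = "http://localhost:8090/yacysearch.html?query=private+terms";
        // Internal link: behaviour unchanged.
        System.out.println(referrerFor(searchPage, "http://localhost:8090/index.html"));
        // External result link: only "http://localhost:8090/" is exposed.
        System.out.println(referrerFor(searchPage, "https://example.org/page"));
    }
}
```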
reger
0aa0dd0b5b fix delta time calculation in PerformanceSearch_p for the 1. entry
(INITIALIZATION displayed absolute date, set delta to 0 for 1. entry)
2017-02-27 01:04:31 +01:00
luccioman
9e626f6b00 Added a hint title for required fields in the Solr Schema editor 2017-02-24 11:09:42 +01:00
reger
7c188ad092 Add extract of queries.log in form of top search word cloud (last 7 days)
to AccessTracker_p.html (Network Access -> Local Search Log page).
It displays top 20 words of search queries.
2017-02-20 23:27:33 +01:00
luccioman
3475d8c1a9 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git 2017-02-20 10:48:44 +01:00
luccioman
c68a8be2d9 Refactored and enforced Solr mandatory fields for proper operation
- Added a new method to check activation of mandatory fields on
Collection Configuration commit, consistently with checks previously
performed in Switchboard startup and with mandatory fields in the
default schema.
- Reorganized default schema and CollectionConfiguration enumeration :
moved no more mandatory fields in a specific section, and moved fields
enabled at startup to the mandatory section. 
- Marked mandatory fields as required and with stronger font in the
IndexSchema_p.html page
2017-02-20 10:48:07 +01:00
reger
334c70c37a correct fromDate init value on missing param in api/timeline_p servlet
revert test modification from last commit in AccessTracker.main
2017-02-20 00:14:14 +01:00
luccioman
6e89d125f2 Added robots.txt support for heuristics federated search.
As noticed by @reger24, abusive use of OpenSearch systems should be
prevented, especially if allowing to parse and reuse HTML results.
robots.txt file is now checked before requesting an external OpenSearch
system to respect the host exclusions and eventual crawl-delay value.
The check is also performed when trying to add a new OpenSearch URL
template through the /ConfigHeuristics_p.html admin page.
2017-02-15 15:04:40 +01:00
reger
a011a97de9 make ConfigParser a protected page, for consistent behavior of locked
menu items.
2017-02-14 02:04:42 +01:00
luccioman
54405577aa Replaced absolute redirection locations by relative ones when possible.
This makes integration of YaCy behind a reverse proxy subfolder easier.
2017-02-09 16:42:21 +01:00
luccioman
1857651988 Added a new Debug/Analysis advanced settings subsection.
As discussed in PR #93 with @JeremyRand and @reger24 this new advanced
settings page includes:
 - a new setting to control remote Solr responses encoding
 - some existing debug settings which could not be set through the admin
user interface
2017-02-09 11:05:06 +01:00
luccioman
94af489f14 Removed deprecated "localMissCount" prop from yacysearchlatestinfo.json.
This property has been deprecated four years ago by commit
d74472f562. For any active search event
id, it was then always filled with "-UNRESOLVED_PATTERN-".
2017-02-03 10:32:31 +01:00
luccioman
f6ad927a14 Refactored the DHT-Trigger section in Performance_p.html page.
This is to be more easily understandable and to reflect more accurately
the current memory strategies implementations that eventually set the
"proper" state not only because of DHT reception.
2017-02-01 18:44:42 +01:00
luccioman
b51fd9467c Fixed unresolved pattern on directory entries in HostBrowser.xml api.
As described in mantis 725 (http://mantis.tokeek.de/view.php?id=725) the
HostBrowser.xml api directory entries had incorrect count attribute
value. 
This was because the HostBrowser html page and backing template servlet
evolved, but modifications were not carried over to the xml api.
2017-01-31 09:20:19 +01:00
reger
f6b08443f0 adjust column layout in Settings_Proxy.inc 2017-01-30 22:44:28 +01:00
luccioman
95b63f5126 Added a CSS class for infobox block.
This will prevent mistakenly hiding a div element not designed to be an
infobox but having a ".info" parent (After having previously added the
possibility for a div - and not only a span element - to be an infobox).
2017-01-28 10:19:39 +01:00
luccioman
68afe900d0 Added user-friendly controls over disk usage configuration settings.
As mentioned in issue #103, control settings over YaCy disk usage
already existed but lacked a user-friendly way to set them.

I added it to the Performance_p.html administration page with a little
refactoring on the "Resource Observer" fieldset for improved
accessibility and HTML standards respect.
Also added the possibility to enable/disable the autoregulation function
from this page.
2017-01-27 15:47:15 +01:00
luccioman
d0182e4797 Improved Index Browser accessibility with semantically richer html tags.
Made use of ol, li, thead, th, tbody, h1 and h2 html tags.
Added aria-label attributes to provide alternative textual information
previously only conveyed by color cue.

Tested behavior with NVDA 2016.4 screen reader.
2017-01-26 01:13:32 +01:00
luccioman
254060bda1 Index Browser : fixed display of "Count colors" for authorized users. 2017-01-24 11:49:15 +01:00
luccioman
c82c8351dd Fixed Index Browser page HTML validation errors and switched to HTML5.
Also removed deprecated HTML attributes uses.

Validation performed with Nu Html Checker 17.1.0.

Cross browser tested with :
 - Debian Jessie : Firefox ESR 45.6.0
 - MS Windows 10 : Firefox 50.1.0, Chrome 55.0.2883.87, MS Edge
2017-01-24 09:40:43 +01:00
luccioman
826e5bbadd Documented /HostBrowser.html related configuration settings 2017-01-23 16:05:51 +01:00
luccioman
9adba36754 Fixed "-UNRESOLVED_PATTERN-" admin parameter in "load & index" links. 2017-01-23 14:54:37 +01:00
luccioman
4e2bc644cb Display Index Browser links requiring auth only when authenticated.
In the /HostBrowser.html page "only hosts with urls pending in the
crawler", "only with load errors" and "Administration Options" all
require administration credentials. But they were displayed even to
unauthenticated users, and clicking them did nothing and returned the
/HostBrowser.html page empty.
2017-01-23 14:49:02 +01:00
reger
e61ee180a7 Group all proxy settings on System Administration by adding settings of
UrlProxyAccss page (moved from deleted AugmentedBrowsing_p), adjust
submenu (remove Augmented Browsing) and translation files.
2017-01-22 23:58:46 +01:00
luccioman
39e081ef38 Fixed display of crawler pending URLs counts in HostBrowser.html page.
As described in mantis 722 (http://mantis.tokeek.de/view.php?id=722)

Also updated some Javadoc.
2017-01-22 12:31:14 +01:00
luccioman
870a5eae26 Removed temporary test main method committed by mistake. 2017-01-22 12:19:43 +01:00
reger
c4017f2e87 upd to commons-compress-1.13.jar
hide external icon on forge logo (was also out of position in IE)
2017-01-20 02:15:11 +01:00
luccioman
e048e74072 Added an optional parameter to webstructure.xml api.
This new "documentStructure" parameter can be set to false to only get
hosts accumulated references on a resource and thus prevent scraping the
specified URL and getting citations references.

Also set WebStructureGraph constants as final and updated the Javadoc
with example api call URLs.
2017-01-19 12:30:44 +01:00
luccioman
17b7c92009 Made sure webstructure.xml API produces valid XML.
Host names should not contain XML special characters such as quotation
mark, but at this stage the WebGraph may have mistakenly recorded a host
name with such characters. What's more the DigestURL constructor does
not prevent this.
By using serverObjects.putXML to encode host names, we ensure here
that the rendered XML is well formed and can be parsed by external tools
even if a structure entry is incorrect.
2017-01-17 15:59:55 +01:00
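Why escaping matters for commit 17b7c92009 can be shown with a generic XML escaping routine. This sketch is not the implementation of serverObjects.putXML: the class, the method and the `<domain host="..."/>` element are hypothetical and only illustrate the technique.

```java
class XmlEscapeSketch {

    /** Replaces the five XML special characters with their predefined entities. */
    static String escapeXml(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String badHost = "exam\"ple.org"; // host name wrongly recorded with a quotation mark
        // Unescaped, the stray quote would end the attribute early and break the document.
        System.out.println("<domain host=\"" + escapeXml(badHost) + "\"/>");
    }
}
```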
luccioman
d9766ca981 Fixed WatchWebStructure_p.html render to include https URLs.
As described in mantis 721 (http://mantis.tokeek.de/view.php?id=721)
WatchWebStructure_p.html failed to include in its structure view https
and other protocols and ports than default http.
2017-01-16 18:41:58 +01:00
luccioman
ed3dd5e31a Fixed webstructure.xml API used with a domain name 'about' parameter.
As described in mantis 720 (http://mantis.tokeek.de/view.php?id=720),
when requesting this API with a domain name instead of a complete URL
only HTTP references on default port were listed.
2017-01-16 16:41:06 +01:00
luccioman
0da1e6ba16 Factored code re-implementing DigestURL.hosthash() method.
This ensures a consistent implementation of the url host hash generation
and makes usages easier to find in source code.

Also added a unit test for this function.
2017-01-16 10:18:42 +01:00
luccioman
f793d97e56 Factored common code with DigestURL.hosthash() 2017-01-13 16:05:46 +01:00
luccioman
9cea7cbb10 Detailed some Javadoc related to /api/webstructure.xml usage. 2017-01-12 17:52:47 +01:00
reger
007e2afa6e Start to rename "Augmented Browsing" to "Web Proxy ..." / "View via Proxy"
The augmented Browsing option was reduced to the web proxy functionality.
Augmented browsing is not available and no known plan exists to reimplement
alteration of result pages with additional information.
2017-01-12 01:36:30 +01:00
luccioman
339f005ced Blacklist import and update performance improvements.
Measurement sample : import from blacklist local file containing about
15000 entries
 - before refactoring : several minutes
 - after refactoring : a few seconds!
2017-01-06 12:24:31 +01:00
luccioman
e3892b0957 Added some JavaDoc. 2017-01-06 11:23:40 +01:00
luccioman
52d05d14c6 Display result favicons only for http or https resources.
Favicon display only makes sense for http(s) websites, being public or
intranet. So I modified the favicon conditional display to verify the
result URL protocol rather than if we are in intranet mode.

Also prevented rendering an img HTML tag with empty src on other results
protocols such as ftp or file.

Fixing this thanks to priest2 report
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5923).
2017-01-06 09:00:28 +01:00
luccioman
b154d3eb87 Added descriptive titles to Crawler_p.html speed settings.
As reported by bubul
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5924) , LF and MH
acronyms meaning were not detailed.
Also added label tags for improved accessibility on these input fields.
2017-01-05 14:54:59 +01:00
reger
68d4dc5cc5 Complete harmonization RequestHeader getCookie with std ServletRequest
to use javax.servlet.http.Cookie parameters.
Deprecate now obsolete getHeaderCookies.
Adjust setting of MaxAge to spec if >= 0 otherwise keep default.
2017-01-02 03:04:21 +01:00
reger
396ed3c769 On negative result vote also delete document from fulltext index
(not only from dht)
2017-01-01 23:58:38 +01:00
reger
f153cc4b5d add/allow to create a bookmark of result viewed via urlproxy.
For this on the header of the viewed result a "add bookmark" button is
available (for authenticated users).
Currently the bookmark is added to a (virtual) bookmark folder "/proxy"
w/o any additional tags etc.
2016-12-23 19:03:44 +01:00
reger
7bf2bcf504 fix and prevent exception on missing required cookie name
skip cookie creation if name is empty.
2016-12-22 19:52:38 +01:00
luccioman
128c8ef8d4 Fixed title rendering having non ASCII chars in QuickCrawlLink_p.html. 2016-12-21 08:19:09 +01:00
luccioman
ee6933c004 Added a title on the previous and next page pagination buttons.
This is to clarify the meaning of these buttons for users who could
think they link respectively to the first and last results page.
2016-12-21 07:22:41 +01:00
reger
8eb6fba59c activate filetype navigator plugin and restrict config (append) of navs
to those not already active.
DHT results are now included in the count; this might overshoot on redundant
DHT and Solr results, while the previous Solr facet based count was always low.
2016-12-21 02:04:13 +01:00
luccioman
c25e48e969 Enabled displaying results after 14th page for local search queries.
Fixes issue #90 for local queries only: Stealth mode, Portal mode or
Intranet mode. 
For P2p mode, the issue would probably be difficult to solve with
reasonable performance. This still needs investigation.

Also switched some InterruptedException catch log messages to warn
level as this is normal behavior when shutting down a peer.

Fixed yacysearch buttons navbar behavior to deal correctly with total
results count or offset over 1000. Also improved the buttons navbar to
be able to navigate over 10th page for local queries.
2016-12-20 14:52:33 +01:00
reger
6be9d62ab4 show earthsearch.png in ConfigSearchPage layout on activated location
navigator (for more realistic impression)
2016-12-20 02:06:43 +01:00
reger
c50e23c495 reduce creation of empty legacy RequestHeader() in situation where null
is acceptable (less for garbage collection).
2016-12-18 02:38:43 +01:00
reger
193b2ab1fc reduce redundant declaration for simple date formatter
using predefined GenericFormatter.SIMPLE_FORMATTER
2016-12-17 23:29:57 +01:00
reger
38d676c7e4 use GenericFormatter SimpleDate for sortable column in table_API
to allow correct chronological sorting (of the date string)
fix for http://mantis.tokeek.de/view.php?id=585
2016-12-17 21:44:09 +01:00
reger
c702eb6786 del dead menu link to /repository
(directory not created in current distribution -> old)
2016-12-17 02:38:52 +01:00
luccioman
467650c042 Hardened system update checks.
When a downloaded archive release is corrupted, empty, or can not be
opened for any reason, the update script must not be launched because it
erases the existing lib/*.jar libraries.
2016-12-16 11:03:09 +01:00
luccioman
00e81fcc15 Check HTTP status when downloading a release, and report any error. 2016-12-15 15:30:36 +01:00
reger
8e2cef5f07 allow protocol navigator to be unselected if only one button is shown
after activating navi/facetfilter
2016-12-15 00:45:08 +01:00
luccioman
437e535e5c Fixed admin navbar rendering at various screen sizes.
Fix mantis 443 (http://mantis.tokeek.de/view.php?id=443).

Tested on :
 - Debian jessie : Firefox ESR 45.5.1
 - MS Windows 10 : Firefox 50.1.0, Chrome 55 and Edge
 - Emulated devices/adaptive views embedded in the previously
mentioned browsers
2016-12-14 12:49:41 +01:00
luccioman
b90730f956 Fixed locations search navbar overlapping issues.
This is similar to the main yacysearch navbar issues described in mantis
708 (http://mantis.tokeek.de/view.php?id=708)
2016-12-13 16:50:24 +01:00
luccioman
0714b06038 Fixed resource switch button overlapping at various screen sizes.
Fixes second part of mantis 708
(http://mantis.tokeek.de/view.php?id=708)

The bootstrap-switch component has some sizing issues with long labels,
which are not likely to be solved soon due to a lack of resources on
that project (see issue
https://github.com/nostalgiaz/bootstrap-switch/issues/419 )

This fix works by applying the following ideas :
 - labels are long, so font-size and padding are reduced on small screen
sizes using a media query
 - use relative percent width values on the component wrappers to
prevent overlapping on the neighbour content
 - disable animation because it relies on absolute pixels width values
2016-12-13 15:33:18 +01:00
luccioman
848bfc240c Fixed YaCy logo (no external mark) for the refactored navbar search.
Thanks to reger24 feedback.
2016-12-12 12:55:20 +01:00
reger
8acdc5443b prepare ConfigSearchPage servlet to append and remove navigator plugins,
keeping order of added nav's.
The search page preview template displays active navs. Therefore a select
and add button has been added below the preview (to keep it close to actual).
This should in future likely be done by drag&drop (html5 feature).
2016-12-12 02:29:15 +01:00
reger
b32bcdf344 list entries in outgoing cookie monitor one per line
for easier readability.
For this adjust outgoingCookies entry to use Cookie[] instead of String[]
2016-12-10 22:08:09 +01:00
luccioman
f37a86e1c6 Fixed yacy search navbar header overlapping at various screen sizes.
- using an icon-only admin button at small and medium screen size
- using an icon-only "Search Interfaces" button at small screen size
- hiding the YaCy brand at extra-small screen size

Fixes the header part of mantis 708
(http://mantis.tokeek.de/view.php?id=708).

Navigator button overlapping is still to be fixed.
2016-12-09 11:25:09 +01:00
reger
77e65016c0 use more available SwitchboardConstants in ProxyIndexingMonitor_p
(to easily find usage)
2016-12-07 00:39:53 +01:00
luccioman
8146b97e9b Added a unit after the vocabulary size value for easier understanding. 2016-12-05 10:58:23 +01:00
reger
de33c7e765 replace one more arbitrary CONNECTION_PROP_CLIENTIP header with std.
getRemoteAddr()
2016-12-05 00:11:03 +01:00
reger
14e73f5b9b use bootstrap button style in MessageSend_p.html
and align buttons with form
2016-12-04 22:26:02 +01:00
reger
65871d28b2 skip comparing "xxxxx" on missing authorization header in Blog servlet 2016-12-04 22:11:22 +01:00
reger
82512613f5 fix unresolved pattern in ConfigLanguage drop down list 2016-12-03 01:13:47 +01:00
luccioman
0b4e7795df Fixed JavaScript error "hs.htmlExpand is not a function".
This error occurs on /ConfigSearchPage_p.html and on search results page
when Metadata links are enabled.

The fix was to remove unnecessary use of hs.htmlExpand() which is now
part of highslide-full.js library file, currently not distributed with
YaCy (only includes highslide.js). The Metadata links work correctly and
the initial dynamic expansion offered by htmlExpand() did not bring much
usability.
2016-11-29 02:56:43 +01:00
luccioman
1f4f0eacc2 Fixed a JS undefined error case, occurring when search field is empty. 2016-11-29 02:11:44 +01:00
luccioman
ceb7588880 Converted "clone" URL links in Table_API_p.html to purely relative ones.
Again for easier YaCy integration when running behind a reverse proxy
subfolder.
2016-11-29 01:34:33 +01:00
luccioman
cca3417b87 Fixed image and favicon viewing for unauthenticated local requests.
As reported by @reger24, image and favicon viewing was broken with
unauthenticated requests on peers configured to require authentication
even from localhost.

So I unified viewing rights check in a single new function on
ImageViewer class.
2016-11-28 22:10:05 +01:00
reger
02092de3d8 remove login cookie generation for static admin in User servlet
cookieAuth is never successful for static admin, leaving the creation and
handling for login cookies for static admin obsolete.
2016-11-26 23:28:30 +01:00
reger
49f19aff75 exclude external link icon in Collage servlet
(icons display not close to image in IE)
2016-11-26 19:53:00 +01:00
reger
a0705c049d include check to prevent adding username identical with static admin
in ConfigAccounts_p
2016-11-26 18:26:14 +01:00
luccioman
89017e17e4 Converted ajax URL to relative and added a check on the response status.
This makes YaCy easier to configure when running behind a reverse Proxy.

The check on status avoids trying to update the page with error text
content when the server returned a 404 or 500 error message for example.
2016-11-25 11:13:16 +01:00
reger
8e3e3ed191 update the older ResponseHeader patch to handle cookies,
to work directly with javax.servlet.http.Cookie (rename headerProps to
cookieStore as is only used for this).
(Re)implement set-cookie in DefaultServlet to make cookieAuthentication
work as designed.
2016-11-25 02:00:20 +01:00
luccioman
aa9ddf3c23 Added control over Robots.txt active threads maximum number.
When starting a crawl from a file containing thousands of links,
configuration setting "crawler.MaxActiveThreads" is effective to prevent
saturating the system with too many outgoing HTTP connections threads
launched by the crawler.
But robots.txt was not affected by this setting and was indefinitely
increasing the number of concurrently loading threads until most of the
connections timed out.

To improve performance control, added a pool of threads for Robots.txt,
consistently used in its ensureExist() and massCrawlCheck() methods.
The Robots.txt threads pool max size can now be configured in the
/PerformanceQueues_p.html page, or with the new
"robots.txt.MaxActiveThreads" setting, initialized with the same default
value as the crawler.
2016-11-23 18:13:05 +01:00
reger
baf6d21cfe ConfigSearchPage, move protocol navi up to better simulate actual design.
Because here btn-group-justified screws up table column width (Explorer
and Firefox) bootstrap btn-group is used.
2016-11-23 01:33:01 +01:00
luccioman
176f7c2aab RWI Ranking Configuration page : fixed missing required dependencies.
Fix for mantis 707 (http://mantis.tokeek.de/view.php?id=707)
2016-11-22 09:40:22 +01:00
reger
08a0acc35d make a YearNavigator available, usable as SearchEvent.navigator plugin.
It can take any Date field of the index and displays a list of year strings
in reverse order by the year (not the score/count).
To allow defining the index field to use, the fieldname (and title) can be
appended to the navi's name "year", e.g. year:load_date_dt:LoadDate
It works also with dates_in_content_dts field (from the graphical date
navigator). Here the query parameters from: to: are used on selection as
Query modifier (for other dates currently no query parameter is available, so
selection won't work to filter search results).
Not included in the UI Searchpage layout config so far (to experiment with
it, a manual change to conf is needed).
2016-11-21 16:52:53 +01:00
reger
3630fcc458 adjust messages_p servlet to standard header.getPathInfo, too
(replacing non standard http header)
2016-11-20 04:02:28 +01:00
reger
bad8f87998 remove old/obsolete clear text "adminAccount" credential entry from init
and setConfig (.,empty) from servlets/code
2016-11-20 00:20:47 +01:00
reger
20c9b0138e let User servlet detect static admin with (newer) md5 encoded pwd
(complete an old todo)
2016-11-19 01:02:05 +01:00
reger
b449b0b660 remove login request directly after logout,
and add logout from servlet container
make logout button red
2016-11-18 02:39:53 +01:00
luccioman
fd22d8c08b Upgraded Bootstrap to 3.3.7 and upgraded its related js dependencies.
Upgraded the following JavaScript libraries dependencies :
 - bootstrap-switch to 3.3.2
 - html5shiv to 3.7.3 and switched to minified version
 - typeahead to 0.10.5
 - jQuery to 1.12.4

Removed unused bootstrap-rtl.css and bootstrap-rtl.min.css.

Tested non regressions on the following systems :
 - Debian Jessie : 
  - Firefox 45.4.0

 - MS Windows 10 :
  - Chrome 54.0.2840.99 
  - Firefox 50.0
  - Edge
  - Emulated IE 11, 10 and 9
2016-11-17 14:17:08 +01:00
luccioman
0806de8fdc Ensure file input streams are closed in both normal and error cases. 2016-11-16 15:13:58 +01:00
luccioman
f72c99474e Network Access servlet : render pure relative HTML links.
For better YaCy integration behind reverse proxy with subfolder.

Tested on Debian Jessie with an apache2 reverse proxy.

See related mantis issues http://mantis.tokeek.de/view.php?id=106 and
http://mantis.tokeek.de/view.php?id=701
2016-11-14 20:17:35 +01:00
reger
d631fbc019 make more use of the new ServletRequest interface methods
getScheme, getServerPort (in QuickCrawlLink_p & YaCyDefaultServlet)
2016-11-14 03:01:15 +01:00
reger
395f2e8946 Make ServletRequest implement the standardized HttpServletRequest interface,
to make all readily available information from the original ServletRequest
available to YaCy servlets (without converting data to internal structures).
The implementation of the common interface allows easier integration of
YaCy servlets with the servlet standard (e.g. shared login service with
the servlet container etc.)
2016-11-14 01:37:16 +01:00
luccioman
62f75417ef Updated Pattern JavaDoc links to current minimum (1.7) JDK version. 2016-11-14 00:18:40 +01:00
luccioman
ca4c38a5ba Updated links to external Java and Solr docs to currently used versions. 2016-11-13 23:34:27 +01:00