Commit Graph

5705 Commits

Author SHA1 Message Date
luccioman
8100c033a2 URL Viewer : apply crawler size limits when adding to local index.
This allows parsing and preview of large files, while preventing unwanted
OutOfMemory errors which are likely to occur when adding to the Solr
Index resources larger than configured crawler limits.
2017-07-16 14:37:06 +02:00
reger
e5cff062b5 Clean up redundant but obsolete jquery.rdfquery-core-1.0.js script lib 2017-07-14 23:41:39 +02:00
reger
23bda133d2 Fix css conflict of YMarks.html to make it viewable.
yacy-ymarks.css sidebar conflicts with bootstraps sidebar (different
overlay settings). Simply renamed it to ymark-sidebar.
2017-07-09 23:08:54 +02:00
reger
a21789d4e7 Fix unresolved pattern in api/share.html by initializing some display vars 2017-07-08 22:46:15 +02:00
luccioman
bf55f1d6e5 Started support of partial parsing on large streamed resources.
This enables the getpageinfo_p API to return something in a reasonable amount
of time on resources in the megabyte size range and above.
Support added first with the generic XML parser, for other formats
regular crawler limits apply as usual.
2017-07-08 09:04:03 +02:00
luccioman
1b3c169a9c URL Viewer : decode raw text using the response charset,
when provided, or decode as UTF-8 as previously done.
2017-07-03 13:51:14 +02:00
reger
e6e20dab52 upd to Jetty 9.4.6.v20170531
Modify loginservice to match the changes in Jetty, partially based on pull
request #101 https://github.com/yacy/yacy_search_server/pull/101 by @automenta
2017-07-01 23:58:28 +02:00
luccioman
e4c730b99f Updated PerformanceQueues_p.xml API with last related servlet changes 2017-06-30 11:41:48 +02:00
luccioman
dcc56318bb Made remote search max system load limits configurable from UI.
As reported by davide on YaCy forums (
http://forum.yacy-websuche.de/viewtopic.php?f=23&t=6004 ), when the
system is under high load, it can be difficult to understand why remote
search results are not fetched without carefully reading the YaCy
configuration file.
2017-06-30 11:30:54 +02:00
luccioman
4b72b29ea2 Added an informative title on the crawl start robots.txt status icon 2017-06-29 11:36:47 +02:00
luccioman
d08f31c3a8 Crawl start Ajax request : properly handle possible XML parsing errors
Otherwise on a malformed getpageinfo_p XML response (from the browser
point of view), JavaScript errors were thrown and the ajax status
steering wheel remained displayed indefinitely.
2017-06-29 11:25:27 +02:00
luccioman
8da3174867 Ensure lower case conversion consistency with any default locale.
Especially for Turkish speaking users using "tr" as their system default
locale : strings for technical stuff (URLs, tag names, constants...)
must not be lower cased with the default locale, as 'I' doesn't become
'i' like in other locales such as "en", but becomes 'ı'.
2017-06-27 06:42:33 +02:00
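
The locale pitfall described in commit 8da3174867 can be reproduced in a few lines of Java. This is only a minimal sketch and not the code from the commit: the class name LowerCaseDemo and the helper technicalLower are hypothetical, and using Locale.ROOT is one common way to get locale-independent lower casing, assumed here for illustration.

```java
import java.util.Locale;

public class LowerCaseDemo {

    /** Lower-cases a technical string independently of the system default locale. */
    static String technicalLower(String s) {
        // Locale.ROOT applies locale-neutral case mapping rules
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // With Turkish rules, 'I' maps to the dotless 'ı' (U+0131)
        System.out.println("TITLE".toLowerCase(turkish));
        // With Locale.ROOT, 'I' maps to 'i' whatever the default locale is
        System.out.println(technicalLower("TITLE"));
    }
}
```

Calling toLowerCase() without an argument uses the JVM default locale, which is why the problem only shows up on systems configured with "tr".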
luccioman
c41b31dcb3 Cleaned up memory usage page HTML
- fixed validation errors
- removed deprecated attributes
- improved accessibility with richer table semantics (headers and
caption elements) and language declaration
2017-06-20 09:21:55 +02:00
luccioman
0487336ec3 Prevent integer overflow in table statistics and use strong typing 2017-06-19 17:02:11 +02:00
luccioman
0f80c978d6 Limit the number of initially previewed links in crawl start pages.
This prevents rendering a big and inconvenient scrollbar on resources
containing many links.
If really needed, preview of all links is still available with a "Show
all links" button.

Doesn't affect the number of links used once the crawl is effectively
started, as the list is then loaded again server-side.
2017-06-17 09:33:14 +02:00
luccioman
32288a8999 Merge branch 'master' of https://github.com/yacy/yacy_search_server 2017-06-17 08:16:55 +02:00
luccioman
e9b4b29f90 Limit scope of some local JavaScript variables. 2017-06-16 08:50:57 +02:00
Michael Peter Christen
369b8e0e0b added json(p) endpoint for crawl start 2017-06-16 08:44:40 +02:00
luccioman
9dd790087d Added HT Cache basic statistics (hit rate) 2017-06-15 09:50:02 +02:00
luccioman
28b451a0b3 Made Cache compression level and lock timeout user configurable 2017-06-14 19:02:08 +02:00
Michael Peter Christen
6fe735945d migrated Solr 5.5 -> Solr 6.6 and from Java 1.7 -> 1.8
Also: now Version 1.921
2017-06-09 12:25:23 +02:00
luccioman
8399275142 Properly close file output streams even in exception scenarios. 2017-06-08 07:19:16 +02:00
reger
632354e2ff Tokenize result entry keywords and add some styling for display 2017-06-04 01:50:40 +02:00
reger
a814f3d885 Introduce keyword query parameter
This enables keyword navigator to filter on keywords. Added search page
output and layout config for keywords, allowing e.g. in Intranet use
to display the keywords. No styling or links applied to the keyword
text (but is desirable possibly in combination with bootstrap-tagsinput
for future/intranet).
2017-06-02 01:00:21 +02:00
luccioman
cbccf97361 Added JavaDoc to the getpageinfo_p API servlet. 2017-05-30 17:38:16 +02:00
luccioman
bd88fd303e Deprecated duplicated and internally unused getpageinfo servlet.
Redirections set for the transition of any possible external uses:
 - /api/getpageinfo.xml to /api/getpageinfo_p.xml
 - /api/getpageinfo.json to /api/getpageinfo_p.json
2017-05-30 09:29:28 +02:00
luccioman
1be4d32f99 Restored search page default behavior for Tab, Page Up and Down keys
Replaced by shortcuts defined by the HTML "accesskey" attribute which
has the advantage of being announced by screen readers when focusing the
corresponding buttons, contrary to custom JavaScript key handlers.
Now with Firefox :
 - "Alt + Shift + n" for next page
 - "Alt + Shift + p" for previous page

Following ARIA recommendation : "keyboard shortcuts enhance, not
replace, standard keyboard access." ( see
https://www.w3.org/TR/wai-aria-practices/#kbd_shortcuts_behavior_design)

Fix for mantis 711 (http://mantis.tokeek.de/view.php?id=711)
2017-05-23 07:25:40 +02:00
luccioman
45346c1be8 Added missing accessibility attributes on search results progress bar. 2017-05-16 09:44:13 +02:00
luccioman
91a06bc669 Annotated search result information separators for screen readers. 2017-05-15 13:31:24 +02:00
luccioman
31ad043bb9 Added user interface feedback on results feeding termination status.
Added as an additional icon with title in the search progress bar, to
inform about background search feeder threads terminated or still
running. While giving a bit more information to users about the p2p
search process, this can help in choosing whether or not to wait a little
longer before going to the next page, in order to get results from
various sources sorted as best as possible (see #91 for a discussion
about sorting accuracy and network latency).

Other related modifications included :
 - regular updates to statistics in the progress bar until the
background feeders are completely terminated.
 - removed some uses of insecure and discouraged JavaScript elements
2017-05-15 13:15:16 +02:00
luccioman
d90b001e1b Improved previous merge "Show ranking in HTML UI".
- added the new setting as configurable in the "Debug/Analysis" settings
page. Debug/analysis is its main purpose for now as there is currently
no nice and "understandable" ranking score info servlet (see forum
discussion http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5884 ) 
- render in the "Search Page Layout" page preview when enabled
- added constants
2017-05-11 18:02:33 +02:00
luccioman
efe1232d90 Merge branch 'html-show-ranking' of
https://github.com/JeremyRand/yacy_search_server

Conflicts:
	defaults/yacy.init
2017-05-11 14:53:57 +02:00
luccioman
4564541b3b Fixed blacklist Regex containing '+' characters rendering.
As reported on YaCy forum by shni
(http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5970) when a
blacklist entry contained both '?' and '+' characters, the '+' chars
were wrongly decoded and rendered as spaces.
2017-05-04 11:12:58 +02:00
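
The '+' rendering issue described in commit 4564541b3b follows from how form-style URL decoding treats that character. The sketch below only illustrates the general behavior with the standard java.net classes and is not the fix from the commit: the class PlusDecodingDemo, its helpers and the sample pattern "a+b?" are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class PlusDecodingDemo {

    /** Form-style decoding: a raw '+' is interpreted as a space. */
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    /** Form-style encoding: '+' becomes %2B, so it survives a later decode. */
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String pattern = "a+b?"; // hypothetical blacklist entry
        // Decoding a value that was never encoded turns '+' into a space
        System.out.println(decode(pattern));
        // Encoding first keeps the '+' intact through the round trip
        System.out.println(decode(encode(pattern)));
    }
}
```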
luccioman
0612a8f4f2 Fixed the previously added link to scheduled dump operations. 2017-05-04 08:45:30 +02:00
luccioman
a87281b498 Added MediaWiki dump import scheduling feature.
Checking the last modified date by default to prevent unnecessary long
running operations.
2017-05-03 18:53:01 +02:00
luccioman
10c03c6c64 Improved MediaWiki dump import monitoring.
When the import thread is terminated :
 - now stop refreshing and stay on the monitoring page to give the user
feedback after a long running import
 - added link to the next monitoring step : results from surrogates
reader
 - added link to new import

On the new import page, added a link to the last import report, if any.
2017-05-02 09:38:45 +02:00
luccioman
8d288f5dba Crawl results page : apply table lines number limit.
Take into account the already existing default limit value (especially
useful after a long crawl or surrogates import), or a custom one from
parameter "count".
Added a "Show all" link for convenience.
2017-04-27 18:24:54 +02:00
reger
c77e43a391 Take out mailto collect in internal parsed document
As earlier plans to make use of mailto as separate webgraph entity didn't
materialize (see  http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5726&p=32493&hilit=mailto#p32493)
free the unused handling and resources.
2017-04-20 00:18:18 +02:00
reger
bec34d3546 Add url input field as source for WarcImporter
allowing to import warc from url without prior download.
2017-04-16 04:25:29 +02:00
reger
d3df8a46c4 fix unresolved_pattern on missing post parameter api/message.html 2017-04-14 21:14:26 +02:00
luccioman
f66438442e Extended Mediawiki dump import to remote URLs.
When using a public HTTP URL in /IndexImportMediawiki_p.html, the remote
file is now directly streamed and processed, allowing import of
multi-GB dumps even on a remote peer with low memory, and without the need
to manually download the dump file first.
2017-04-14 14:32:44 +02:00
luccioman
7edddd7b0d Improved error reports on various wiki dump prerequisites failure cases.
Also added some JavaDoc.
2017-04-11 08:21:34 +02:00
luccioman
dfe8d4139b Used a text input for wiki dump import file selection.
Using an HTML "file" input was confusing (as reported by promocore on
YaCy forum : http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5965) ,
and it only worked with MS IE/Edge on a local YaCy peer :
 - for security reasons some current major browsers such as Firefox or
Chrome do not allow sending full file path information when using a file
form input
 - the local file system selection popup doesn't make sense when you
want to import a dump on a remote YaCy server
2017-04-11 07:34:17 +02:00
reger
3a71430030 Adjust ConfigSearchPage_p to activated hosts navigator as plugin 2017-04-10 22:58:20 +02:00
reger
7b80189bda Activate hosts navigator plugin. This includes rwi results in the navigator
count.
This might be tangentially related to http://mantis.tokeek.de/view.php?id=736
as the example includes a local index search, while rwi results are not
counted.
2017-04-10 22:42:06 +02:00
reger
05a1b14b4a add missing text from ConfigRobotsTxt_p to master.lng
and link to Translation Editor to Translation News page.
2017-04-09 21:42:05 +02:00
reger
a39c00a93f add servlet to list users in UserDB and made user editor available in
separate servlet for a quick and easy overview of configured users and
selection for edit.
2017-04-09 02:09:32 +02:00
reger
a4498e17c0 fix edit current user form to required post method
introduced with cde237b687
2017-04-08 22:54:57 +02:00
luccioman
665d087d76 Enforced access controls on a few more administration pages.
 - ensure use of HTTP POST method when performing operations with
server-side effects
 - transaction token required to ensure the request has effectively been
requested by user interaction
2017-04-03 12:20:16 +02:00
luccioman
0feded21dd Escaped potentially active HTML content from recorded API call comments. 2017-04-03 11:40:37 +02:00