Commit Graph

12746 Commits

Author SHA1 Message Date
luccioman
c7402a2f89 Removed invalid empty form action.
A form action URL must not be empty (see
https://www.w3.org/TR/html/sec-forms.html#element-attrdef-form-action ).
No action attribute has the same effect (relaunching the same GET
action) but is valid HTML.
2016-10-07 10:57:31 +02:00
luccioman
37df2e19fd Removed xmlns attribute which no longer makes sense in HTML5 pages. 2016-10-07 10:46:20 +02:00
luccioman
94924e288f Added some accessibility improvements to the main interface.
Tested with NVDA screen reader.
2016-10-07 10:44:45 +02:00
luccioman
dd86f7c44e Fixed HTML validation errors and grouped radio options in fieldsets 2016-10-07 10:43:06 +02:00
luccioman
fc0c72c84b Switched to the short HTML Doctype
These pages were already no longer XHTML 1.0 because they made use of the HTML5
syntax and elements.
Applied current (2016) HTML standard recommended Doctype declaration
(see https://www.w3.org/TR/html/syntax.html#the-doctype ).
2016-10-07 10:42:23 +02:00
luccioman
b5eb7a9217 Removed unnecessary crawlingDomFilterDepth hidden field.
It had an incorrect "-UNRESOLVED_PATTERN-" value (see the second part of
mantis 691 http://mantis.tokeek.de/view.php?id=691 )

Note : crawlingDomFilterDepth is apparently unused in current (2016)
YaCy code-base. It was also unnecessary because crawlingDomFilterCheck
hidden field is set to "off".
2016-10-05 13:48:22 +02:00
luccioman
f6d7c6ee1f Fixed the beginning of recorded action URLs displayed in /Table_API_p.html
Removed scheme, host and port from URL to avoid dealing with http/https,
external host and port retrieving issues.

What's more, this is consistent with how URLs are displayed in
/Tables_p.html?table=api&count=100&reverse=on&search= or
Tables_p.xml?table=api&count=100&search=

This fixes mantis 691 first part
(http://mantis.tokeek.de/view.php?id=691)
2016-10-05 12:20:37 +02:00
reger
474f0476c6 adjust Tokenizer sentence count on trailing text after last recognized sentence
+ upd test case for rwi multi-word-query  (leaving results known to fail untested)
2016-10-05 05:52:37 +02:00
luccioman
34658ddb9b Merge pull request #76 from luccioman/crawler
Crawl monitoring : refresh running crawls table
2016-10-04 05:06:18 +02:00
luccioman
0065c9b9ea Crawl monitoring : refresh running crawls table
Fix mantis 690 ( http://mantis.tokeek.de/view.php?id=690 ). 
Tested on :
- MS Windows 10 : Edge, Firefox 49, Chrome 53
- Debian Jessie : Firefox ESR 45
2016-10-04 03:56:03 +02:00
luccioman
e1e632ad84 Switched to the short HTML Doctype
This page was already no longer XHTML 1.0 as it makes use of the HTML5
<progress> element.
Applied current HTML standard recommended Doctype declaration (see
https://www.w3.org/TR/html/syntax.html#the-doctype ).
2016-10-04 03:56:02 +02:00
luccioman
4d8611e5e7 Tables accessibility : added missing <thead> sections. 2016-10-04 03:56:02 +02:00
luccioman
9fb3142317 Restricted variables scope to function handleStatus() in Crawler.js
Missing 'var' in declaration was unnecessarily giving global scope to
these variables.
2016-10-04 03:56:02 +02:00
reger
3861ac9293 upd maven dependency-check plugin to reflect changes of https://nvd.nist.gov
+ upd unknown ant script with current lib/jsch version
2016-10-04 03:05:26 +02:00
reger
681a61dafb adjust rwi index result word position handling used for rwi ranking
- correct WordReferenceVars.toRowEntry posintext parameter
to set expected min posintext (the difference is on multi-word queries,
while positions are ordered by search word order).
- modified posofphrase/posinphrase join operation
 - to set min posofphrase
 - and keep posinphrase if not same posofphrase (was set to 0, no differentiation during ranking)
+ fix compiler msg (missing type declaration)
2016-10-04 01:42:18 +02:00
reger
14f7577231 add support for older Word versions (Word6/Word95) to docParser 2016-10-03 01:52:51 +02:00
reger
8794e06721 upd to poi-3.15.jar 2016-10-03 01:48:35 +02:00
reger
e25f2ee88b mention date search parameter in search option help (index.html) 2016-10-02 06:36:34 +02:00
reger
1a79c64495 generalize DateDetection with holiday date rules readily available in icu
to make sure current dates are recognized (was fixed to 2014 - 2016)
+ adjust holiday date parser from pattern.match to pattern.find to deal with leading and trailing text
+ moved relative date recognition (morgen, tomorrow) to parseline (used by query parser only), as not working and problematic for indexing
+ add test case for parseline (used by query parser)
2016-10-02 03:19:12 +02:00
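
The switch from whole-string matching to substring search described in commit 1a79c64495 can be illustrated with the standard java.util.regex API. This is only a minimal sketch: the class name, the method names, and the pattern are hypothetical and are not taken from YaCy's DateDetection code.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; none of these names come from the YaCy code base.
final class FindVsMatchesSketch {
    // Assumed example pattern, not an actual YaCy holiday rule.
    static final Pattern HOLIDAY = Pattern.compile("thanksgiving", Pattern.CASE_INSENSITIVE);

    // matches() succeeds only when the entire input matches the pattern.
    static boolean matchesWhole(String text) {
        return HOLIDAY.matcher(text).matches();
    }

    // find() succeeds when any part of the input matches,
    // so leading and trailing text is tolerated.
    static boolean findsAnywhere(String text) {
        return HOLIDAY.matcher(text).find();
    }
}
```

With an input such as "see you at Thanksgiving dinner", matchesWhole() rejects the text while findsAnywhere() accepts it.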
reger
6f68f08354 correct DateDetection Silvester date
add Thanksgiving
2016-10-01 03:16:27 +02:00
reger
32a2e3a22a have RSSFeed.getChannel return empty message on missing channel element,
a) required b) prevent NPE in rss servlets
+ add test
2016-09-30 21:46:57 +02:00
reger
fedb9f8151 del double entry in master.lng 2016-09-30 21:42:42 +02:00
luccioman
8d57b5b970 Added some javadocs. 2016-09-30 17:12:55 +02:00
luccioman
4585a60d7e Made use of the constant corresponding to the hard-coded value. 2016-09-30 17:12:29 +02:00
luccioman
60df09fff9 Fixed some HTML validation errors : Illegal character in query
Now encode space characters in URLs query part.
2016-09-30 10:54:53 +02:00
luccioman
a76a46a2e9 Removed invalid rel="[count]" from links in tagcloud.
These are not valid link relationships, and do not appear to be used in
scripting or styling.
If necessary, a valid alternative could be to add an attribute such as
data-count="[count]"
2016-09-30 09:43:51 +02:00
reger
862f28eaa6 display number of documents/rss-items for label "docs" in load_rss_p servlet
(as replacement for the rarely used "docs" rss-tag for a url to the rss-specification)
2016-09-29 23:59:10 +02:00
luccioman
5027912f30 Fixed <p> spacers : block elements such as <div> are not allowed inside 2016-09-29 14:24:15 +02:00
luccioman
abe489a0b5 Removed unnecessary ARIA "form" role on native HTML form elements.
This fixes warnings reported by W3C Nu Html Checker
(https://validator.w3.org/nu/).
2016-09-29 13:42:07 +02:00
luccioman
cca4186044 Fixed HTML validation error : "Stray end tag div" 2016-09-29 11:42:59 +02:00
luccioman
dcdea2d02f Fixed shutdown for crawler.MaxActiveThreads value greater than 200
Shutdown was hanging in CrawlQueues.close() at
this.workerQueue.put(POISON_REQUEST) when config value
crawler.MaxActiveThreads was greater than 200.

Revealed by "Collision" Threads dumps in mantis 689
(http://mantis.tokeek.de/view.php?id=689#c1312)

Fixed consistency between this.worker.length and this.workerQueue
capacity, and made the process more reliable using non-blocking offer()
function.
2016-09-29 10:33:11 +02:00
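
The difference between blocking put() and non-blocking offer() mentioned in commit dcdea2d02f can be sketched with a bounded java.util.concurrent queue. This is a sketch under assumptions: the class, the method, and the POISON marker are hypothetical stand-ins, not the actual CrawlQueues implementation.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration; names below are not taken from the YaCy code base.
final class PoisonPillSketch {
    // Assumed stand-in for a shutdown marker such as POISON_REQUEST.
    static final String POISON = "POISON";

    // Tries to enqueue one poison pill per worker without blocking
    // and returns how many pills the queue accepted.
    static int offerPoison(BlockingQueue<String> queue, int workers) {
        int accepted = 0;
        for (int i = 0; i < workers; i++) {
            // offer() returns false immediately when the queue is full,
            // whereas put() would block until space becomes available.
            if (queue.offer(POISON)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // More workers than queue capacity: put() could block here, offer() cannot.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(2);
        System.out.println(offerPoison(queue, 3));
    }
}
```

When the number of workers exceeds the queue capacity and nothing drains the queue, a loop based on put() can wait indefinitely, which is consistent with the hang the commit message describes.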
reger
ada473ced2 fix ConfigBasic servlet parameter name for Japanese _jp->_ja 2016-09-28 16:08:36 +02:00
luccioman
d286ba2c3e Merge branch 'master' of https://github.com/yacy/yacy_search_server.git 2016-09-28 14:53:08 +02:00
luccioman
b8f6458152 Prevent yacy main thread from hanging on browser opening process.
First fix for mantis 689 (http://mantis.tokeek.de/view.php?id=689).

On Debian Linux, with a headless jre and no open browser,
browser.openBrowserClassic() was called and waited forever for the browser
process to end (p.waitFor()). YaCy shutdown was therefore not working until
the browser was closed.

Also modified browser opening command for Unix platform to open the
default browser (with xdg-open util) instead of Firefox.

xdg-open also has the advantage of being asynchronous (not blocking).
2016-09-28 14:52:30 +02:00
reger
cf3a4bdf52 upd to pdfbox-2.0.3 2016-09-27 23:12:10 +02:00
reger
70e1eb30a5 prevent StringIndexOutOfBounds in getLocalFile()
+ tighten patching of DOS path w/o protocol to drive "LETTER":
2016-09-27 22:40:36 +02:00
luccioman
1bb0b135ac Avoid duplication of various MS Windows file URL flavors
Fix for mantis 692 (http://mantis.tokeek.de/view.php?id=692)
2016-09-27 07:53:08 +02:00
luccioman
b9a8476f02 Removed unused import 2016-09-27 07:41:45 +02:00
reger
e73c1eea8c remove unused rootpattern, leftover from commit
9a5ab4e2c1 (diff-d2b184283abed53ae260fc9eabdaef40)
2016-09-26 02:54:58 +02:00
reger
6f8c3ccea4 improve url hash computation for file path with mixed java & windows
file.separator to compute equal hashes (by normalizing path for computation)
+ expand test case to check mixed java / windows file url notation
like e.g. file:///c:/test/file.html vs. file:///c:\test/file.html
- relates partially to http://mantis.tokeek.de/view.php?id=692
2016-09-25 22:08:12 +02:00
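
The idea in commit 6f8c3ccea4 of normalizing a path before computing its hash can be sketched as follows. The class and method names are hypothetical, and String.hashCode() is only a stand-in for YaCy's real URL hash computation, which is not shown in this log.

```java
// Hypothetical illustration; not the YaCy url hash implementation.
final class PathNormalizeSketch {
    // Replaces Windows separators with '/' so that both notations compare equal.
    static String normalize(String path) {
        return path.replace('\\', '/');
    }

    // Stand-in hash: String.hashCode() over the normalized path (an assumption).
    static int stableHash(String path) {
        return normalize(path).hashCode();
    }
}
```

Under this sketch, "/c:/test/file.html" and "/c:\test/file.html" produce the same hash value because both normalize to the same string.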
reger
bac302bfe4 fix NPE in QuickCrawlLink_p if param doesn't contain crawl url 2016-09-24 23:33:21 +02:00
reger
e9b9a7f68f add missing text for Supporter.html to master.lng 2016-09-24 05:02:37 +02:00
reger
efcb6a1e74 fix supported mime XML -> xml for rssParser (mime normalized to lower case for comparison)
+ add mime text/xml as in use for rss in the wild
2016-09-23 23:37:12 +02:00
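
Commit efcb6a1e74 notes that mime types are normalized to lower case before comparison, so the supported entries must themselves be lower case. A minimal sketch of that lookup, with a hypothetical class name and an assumed list of entries (only text/xml is named in the commit message):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration; not the actual rssParser code.
final class MimeMatchSketch {
    // Assumed entries, all lower case because input is lower-cased before lookup.
    static final Set<String> SUPPORTED =
            new HashSet<>(Arrays.asList("application/rss+xml", "text/xml"));

    // Lower-cases the incoming mime type with a fixed locale, then looks it up.
    static boolean isSupported(String mime) {
        return mime != null && SUPPORTED.contains(mime.toLowerCase(Locale.ROOT));
    }
}
```

An entry written with upper-case letters, such as "application/rss+XML", could never match in this scheme, because the incoming value is always lower-cased first.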
luccioman
b5ba8f9f68 Added alternative text and title to HostBrowser.html image links
For better accessibility
2016-09-23 13:27:46 +02:00
luccioman
4aba491156 Fixed HTML validation errors : duplicate ids. 2016-09-22 16:25:47 +02:00
luccioman
1c139d70d4 Fixed W3C validation error : percent encode '[' and ']' chars in hrefs. 2016-09-22 16:20:13 +02:00
luccioman
b3b75b0498 Accessibility : add a customizable alternative text to YaCy logo
Applied W3C recommendations :
https://www.w3.org/TR/html51/semantics-embedded-content.html#a-link-or-button-containing-nothing-but-an-image
and
https://www.w3.org/TR/html51/semantics-embedded-content.html#logos-insignia-flags-or-emblems
2016-09-22 16:08:33 +02:00
luccioman
f2bc1b268d Updated URL fragment validation rules according to current standards
See RFC 3986 (https://tools.ietf.org/html/rfc3986) or URL living
standard (https://url.spec.whatwg.org/)
2016-09-22 11:28:33 +02:00
luccioman
b1b8e69da8 Fixed NullPointerException cases 2016-09-22 11:25:33 +02:00
luccioman
3ee4f56c39 Improved ErrorCache behavior when switching networks
Even after network switch, ErrorCache was still holding a reference to
the previous Solr cores, thus becoming useless until next YaCy restart.

Initial error cache filling with recent errors from the index was also
missing after the switch.
2016-09-22 09:07:07 +02:00