Commit Graph

13029 Commits

Author SHA1 Message Date
luccioman
68afe900d0 Added user-friendly controls over disk usage configuration settings.
As mentioned in issue #103, control settings over YaCy disk usage
already existed but lacked a user-friendly way to set them.

I added it to the Performance_p.html administration page with a little
refactoring on the "Resource Observer" fieldset for improved
accessibility and HTML standards compliance.
Also added the possibility to enable/disable the autoregulation function
from this page.
2017-01-27 15:47:15 +01:00
reger
95d2a28599 adjust the Field-Reindex Thread to verify and update the document id
in case hash (ID) doesn't match document url (sku field).
2017-01-26 23:49:15 +01:00
Michael Christen
e6e4ccaa00 Merge pull request #98 from Velociraptor85/patch-2
LSB Tag
2017-01-26 06:37:29 +01:00
Michael Christen
a7fd47b3aa Merge pull request #105 from ivar/patch-1
Update README.md - removes deprecated URL
2017-01-26 06:29:42 +01:00
Ivar Vasara
cfd21aaa10 Update README.md - removes deprecated URL 2017-01-25 20:36:48 -08:00
luccioman
d0182e4797 Improved Index Browser accessibility with semantically richer html tags.
Made use of ol, li, thead, th, tbody, h1 and h2 html tags.
Added aria-label attributes to provide alternative textual information
previously only conveyed by color cue.

Tested behavior with NVDA 2016.4 screen reader.
2017-01-26 01:13:32 +01:00
luccioman
fc01b69eca Fixed local image search pagination regression.
As reported by @tglman on issue #90, when searching images on the local
index only, pages next to the first were always empty. This was a
regression from commit c25e48e969.
2017-01-25 09:54:39 +01:00
luccioman
54ffd925dc Merge branch 'master' of https://github.com/yacy/yacy_search_server.git 2017-01-24 17:14:49 +01:00
luccioman
4c65321aae Updated master xliff file with missing entries for HostBrowser.html.
Also translated lang="en" html attribute to lang="[targetLang]" on
locale files having translated entries for HostBrowser.html
2017-01-24 17:14:14 +01:00
Michael Peter Christen
02d0b3172c Merge branch 'master' of https://github.com/yacy/yacy_search_server.git 2017-01-24 15:56:37 +01:00
Michael Peter Christen
d4f45cf05e added dc.date.modified and dc.date.created to date parser 2017-01-24 15:56:29 +01:00
luccioman
254060bda1 Index Browser : fixed display of "Count colors" for authorized users. 2017-01-24 11:49:15 +01:00
luccioman
96b7ddcef3 Updated French translation of HostBrowser.html 2017-01-24 11:38:56 +01:00
luccioman
c82c8351dd Fixed Index Browser page HTML validation errors and switched to HTML5.
Also removed deprecated HTML attributes uses.

Validation performed with Nu Html Checker 17.1.0.

Cross browser tested with :
 - Debian Jessie : Firefox ESR 45.6.0
 - MS Windows 10 : Firefox 50.1.0, Chrome 55.0.2883.87, MS Edge
2017-01-24 09:40:43 +01:00
reger
f9180fabc4 assure that RWI Index.Segment IODispatcher is not blocking on shutdown
waiting on a semaphore permit.
see desc. http://mantis.tokeek.de/view.php?id=723
2017-01-24 01:51:28 +01:00
luccioman
826e5bbadd Documented /HostBrowser.html related configuration settings 2017-01-23 16:05:51 +01:00
luccioman
9adba36754 Fixed "-UNRESOLVED_PATTERN-" admin parameter in "load & index" links. 2017-01-23 14:54:37 +01:00
luccioman
4e2bc644cb Display Index Browser links requiring auth only when authenticated.
In the /HostBrowser.html page "only hosts with urls pending in the
crawler", "only with load errors" and "Administration Options" all
require administration credentials. But they were displayed even to
unauthenticated users, and clicking them did nothing and returned the
/HostBrowser.html page empty.
2017-01-23 14:49:02 +01:00
reger
e61ee180a7 Group all proxy settings on System Administration by adding settings of
UrlProxyAccss page (moved from deleted AugmentedBrowsing_p), adjust
submenu (remove Augmented Browsing) and translation files.
2017-01-22 23:58:46 +01:00
luccioman
39e081ef38 Fixed display of crawler pending URLs counts in HostBrowser.html page.
As described in mantis 722 (http://mantis.tokeek.de/view.php?id=722)

Also updated some Javadoc.
2017-01-22 12:31:14 +01:00
luccioman
870a5eae26 Removed temporary test main method committed by mistake. 2017-01-22 12:19:43 +01:00
reger
df80c57842 add ukr and pol to DCEntry.getLanguage ISO639-2 3-char language code
conversion to deliver uk, pl 2-char code
and use if else to return on match
2017-01-22 00:01:18 +01:00
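
The conversion described in commit df80c57842 could look roughly like the sketch below. It is illustrative only and is not the actual DCEntry.getLanguage code: the class name LanguageCodeSketch and the method toTwoLetterCode are hypothetical, only the ukr/pol mappings come from the commit message, and returning unknown codes unchanged is an assumption.

```java
import java.util.Locale;

/** Hypothetical sketch, not the actual DCEntry.getLanguage implementation. */
class LanguageCodeSketch {

    /**
     * Converts an ISO 639-2 three-letter code to its two-letter equivalent,
     * returning as soon as a match is found (if / else if chain).
     * Unknown codes are returned unchanged (assumption of this sketch).
     */
    static String toTwoLetterCode(String code) {
        if (code == null) {
            return null;
        }
        String lower = code.toLowerCase(Locale.ROOT);
        if ("ukr".equals(lower)) {
            return "uk"; // Ukrainian
        } else if ("pol".equals(lower)) {
            return "pl"; // Polish
        }
        return code;
    }

    public static void main(String[] args) {
        System.out.println(toTwoLetterCode("ukr"));
        System.out.println(toTwoLetterCode("pol"));
    }
}
```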
reger
8d790ab783 delete outdated and unmaintained Netbeans project
Netbeans has good built-in maven support which is a supported and
maintained build env, making special and additional NB settings obsolete.
2017-01-21 01:53:43 +01:00
reger
85cd19962f fix the missing solr-5.5.2.jar delete from prev. commit 2017-01-21 00:35:05 +01:00
reger
890c6dcdc6 upd to solr-5.5.3
minor bugfix version
2017-01-21 00:26:04 +01:00
reger
c4017f2e87 upd to commons-compress-1.13.jar
hide external icon on forge logo (was also out of position in IE)
2017-01-20 02:15:11 +01:00
luccioman
e048e74072 Added an optional parameter to webstructure.xml api.
This new "documentStructure" parameter can be set to false to only get
hosts accumulated references on a resource and thus prevent scraping the
specified URL and getting citations references.

Also set WebStructureGraph constants as final and updated the Javadoc
with example api call URLs.
2017-01-19 12:30:44 +01:00
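
A call using the new "documentStructure" parameter from commit e048e74072 might be built as in the sketch below. The peer address http://localhost:8090, the class name WebStructureUrlSketch and the method buildUrl are assumptions made for illustration; the /api/webstructure.xml path and the "about" parameter are named in other commits of this log.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper, not part of YaCy: builds a webstructure.xml request URL. */
class WebStructureUrlSketch {

    /**
     * @param peerBase          base address of the peer (assumed value in main)
     * @param about             URL or domain name the structure is requested for
     * @param documentStructure false to skip scraping the specified URL
     */
    static String buildUrl(String peerBase, String about, boolean documentStructure) {
        try {
            return peerBase + "/api/webstructure.xml?about="
                    + URLEncoder.encode(about, "UTF-8")
                    + "&documentStructure=" + documentStructure;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("http://localhost:8090", "http://example.org/", false));
    }
}
```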
reger
581b00cc20 remove obsolete lastmodified calculation in WebgraphConfig 2017-01-17 23:45:56 +01:00
luccioman
5c8958bcea Updated Javadoc and JUnit tests for the WebStructureGraph class. 2017-01-17 17:01:56 +01:00
luccioman
17b7c92009 Made sure webstructure.xml API produces valid XML.
Host names should not contain XML special characters such as quotation
mark, but at this stage the WebGraph may have mistakenly recorded a host
name with such characters. What's more the DigestURL constructor does
not prevent this.
By using serverObjects.putXML to encode host names, we ensure here that
the rendered XML is well formed and can be parsed by external tools
even if a structure entry is incorrect.
2017-01-17 15:59:55 +01:00
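
The kind of encoding that commit 17b7c92009 relies on can be sketched as follows. This is a generic illustration of escaping XML special characters, not the serverObjects.putXML implementation; the class name XmlEscapeSketch and the method escapeXml are hypothetical.

```java
/** Hypothetical sketch of XML special character escaping, not YaCy code. */
class XmlEscapeSketch {

    /** Replaces the five XML special characters with their predefined entities. */
    static String escapeXml(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A host name mistakenly recorded with a quotation mark
        String host = "exa\"mple.org";
        System.out.println("<domain host=\"" + escapeXml(host) + "\"/>");
    }
}
```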
luccioman
d9766ca981 Fixed WatchWebStructure_p.html render to include https URLs.
As described in mantis 721 (http://mantis.tokeek.de/view.php?id=721)
WatchWebStructure_p.html failed to include in its structure view https
and other protocols and ports than default http.
2017-01-16 18:41:58 +01:00
luccioman
ed3dd5e31a Fixed webstructure.xml API used with a domain name 'about' parameter.
As described in mantis 720 (http://mantis.tokeek.de/view.php?id=720),
when requesting this API with a domain name instead of a complete URL
only HTTP references on default port were listed.
2017-01-16 16:41:06 +01:00
luccioman
0da1e6ba16 Factored code re-implementing DigestURL.hosthash() method.
This ensures consistent implementation of the url host hash generation
and easier usage finding in source code.

Also added a unit test for this function.
2017-01-16 10:18:42 +01:00
luccioman
86adfef30f Added automated unit tests and performance tests for WebStructureGraph class.
Fixed references count when multiple links target the same domain name
in one document.
2017-01-13 16:10:59 +01:00
luccioman
f793d97e56 Factored common code with DigestURL.hosthash() 2017-01-13 16:05:46 +01:00
luccioman
9cea7cbb10 Detailed some Javadoc related to /api/webstructure.xml usage. 2017-01-12 17:52:47 +01:00
reger
007e2afa6e Start to rename "Augmented Browsing" to "Web Proxy ..." / "View via Proxy"
The augmented Browsing option was reduced to the web proxy functionality.
Augmented browsing is not available and no known plan exists to reimplement
alteration of result pages with additional information.
2017-01-12 01:36:30 +01:00
luccioman
c9889991b9 Fixed 2 failing JUnit tests. 2017-01-09 17:59:01 +01:00
luccioman
bdaef80a55 Ignore generated Javadoc with git SCM. 2017-01-09 16:45:31 +01:00
luccioman
6a4d51d8f9 Cleaned up some Javadoc warnings. 2017-01-09 16:44:47 +01:00
luccioman
86dc198698 Fixed some broken Javadoc links. 2017-01-09 09:57:53 +01:00
luccioman
c78e2f3b4b Fixed maven assembly base directory to match last main YaCy binaries. 2017-01-09 09:54:14 +01:00
reger
16beb551ea fix DC.Elements namespace in DublinCore vocabulary class
delete redundant (unused) DCElements.
2017-01-07 18:24:29 +01:00
luccioman
339f005ced Blacklist import and update performance improvements.
Measurement sample : import from blacklist local file containing about
15000 entries
 - before refactoring : several minutes
 - after refactoring : a few seconds!
2017-01-06 12:24:31 +01:00
luccioman
e3892b0957 Added some JavaDoc. 2017-01-06 11:23:40 +01:00
luccioman
52d05d14c6 Display result favicons only for http or https resources.
Favicon display only makes sense for http(s) websites, being public or
intranet. So I modified the favicon conditional display to verify the
result URL protocol rather than if we are in intranet mode.

Also prevented rendering an img HTML tag with empty src on other results
protocols such as ftp or file.

Fixing this thanks to priest2 report
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5923).
2017-01-06 09:00:28 +01:00
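
The protocol check described in commit 52d05d14c6 could be sketched as below. This is a minimal illustration under assumptions, not the code from the commit: the class name FaviconDisplaySketch and the method shouldDisplayFavicon are hypothetical, and treating unparsable URLs as "no favicon" is a choice made for this sketch.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical sketch, not YaCy code: decides whether a favicon img tag is rendered. */
class FaviconDisplaySketch {

    /** Returns true only when the result URL uses the http or https protocol. */
    static boolean shouldDisplayFavicon(String resultUrl) {
        if (resultUrl == null) {
            return false;
        }
        try {
            String scheme = new URI(resultUrl).getScheme();
            return "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
        } catch (URISyntaxException e) {
            // Unparsable URL: render no favicon (assumption of this sketch)
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(shouldDisplayFavicon("https://example.org/page.html"));
        System.out.println(shouldDisplayFavicon("ftp://example.org/file.txt"));
    }
}
```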
reger
4c9be29a55 fix concurrency issue with htmlParser using not current scraper data
resulting in incorrect data for some html index metadata.
Details see http://mantis.tokeek.de/view.php?id=717
2017-01-06 03:01:52 +01:00
luccioman
b154d3eb87 Added descriptive titles to Crawler_p.html speed settings.
As reported by bubul
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5924) , LF and MH
acronyms meaning were not detailed.
Also added label tags for improved accessibility on these input fields.
2017-01-05 14:54:59 +01:00
reger
eedee6eabb fix exception on URIMetadataNote instantiation with corrected id hash on
host_id_s. Use Solr setField instead of addField to prevent
java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.lang.String
	at net.yacy.kelondro.data.meta.URIMetadataNode.hosthash(URIMetadataNode.java:247)
	at net.yacy.search.query.SearchEvent.addNodes(SearchEvent.java:966)
	at net.yacy.peers.Protocol.solrQuery(Protocol.java:1242)
	at net.yacy.peers.RemoteSearch$2.run(RemoteSearch.java:349)
2017-01-05 00:24:37 +01:00
luccioman
b55cf16dad Upgraded jgit build library to version 4.5.0
This is the latest Java 7 compatible jgit release.

Properly support GitHub tags marked as "Pre-release". 
With the previous venerable jgit version 1.1.0, a YaCy repository clone
having such a tag made GitRevTask and GitRevMavenTask crash.
2017-01-04 17:09:37 +01:00