Commit Graph

13784 Commits

Author SHA1 Message Date
luccioman
1b866c6076 Added possibility to hide or show image results with rendering errors
When searching images, thumbnails that could not be rendered (because of
a load error such as HTTP 404, networking issue or an internal error on
the rendering servlet) are now hidden by default, but can be revealed
with a button if desired.

Fix for issue #217
2018-08-24 09:13:12 +02:00
Philipp Hofmann
04c9584326 Docker: merge RUN instructions for fewer layers (-3) 2018-08-23 13:33:31 +02:00
luccioman
d03c098b54 Removed deprecated warning comments about imports and Debian installer
Deprecated by commit be5d3a1066, as the
classpath is now defined in the yacycore.jar Manifest file.
2018-08-22 22:35:00 +02:00
luccioman
5b60b4225f Fixed encoding of '+' character on search pages links
As revealed by issue #216
2018-08-20 18:44:04 +02:00
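The encoding issue behind commit 5b60b4225f can be illustrated with a minimal sketch. It assumes the link carries the search terms in a `query` parameter of a page named `yacysearch.html`; both names, and the `SearchLinkSketch` class, are placeholders for illustration and are not taken from the commit itself. The point is only that a literal '+' must be emitted as `%2B`, because a bare '+' means a space in application/x-www-form-urlencoded data.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class SearchLinkSketch {

    /** Builds a search link whose query value is form-encoded ('+' becomes "%2B", ' ' becomes '+'). */
    static String searchLink(String query) {
        try {
            // Page and parameter names are hypothetical placeholders.
            return "yacysearch.html?query=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints yacysearch.html?query=c%2B%2B+tutorial
        System.out.println(searchLink("c++ tutorial"));
    }
}
```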
Philipp Hofmann
3121825fba Remove executable (x) permission of 2 files
* docker/Dockerfile.alpine
* docker/Readme.md
2018-08-20 16:46:03 +02:00
Philipp Hofmann
42734175c8 Dockerfile: Improve package cache update
* Alpine-Image: If --no-cache is used, apk update is not necessary
* Debian-Image: Remove /var/lib/apt/lists to reduce image size
2018-08-20 16:45:38 +02:00
Philipp Hofmann
3f2a2f7577 Dockerfile: Remove bad whitespaces 2018-08-20 14:43:05 +02:00
luccioman
b726b2b532 Removed unnecessary '+' character URL decoding from search query
Manually replacing '+' character or "%20" by a space character in the
search query parameter was necessary in YaCy a long time ago to properly
decode application/x-www-form-urlencoded format (commit
9842fab6e4 in 2010).
Since the introduction of Jetty as the embedded HTTP server (commit
4b77733e59 in 2013), this is no longer
necessary, as Jetty internals already do this for us in
org.eclipse.jetty.util.UrlEncoded.decodeUtf8To().

So we can now remove this duplicated decoding, as it prevents proper
use of the '+' character in search requests, as reported in issue #216.
2018-08-20 08:10:39 +02:00
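Why the duplicated decoding in commit b726b2b532 was harmful can be shown with a small sketch. It uses `java.net.URLDecoder` as a stand-in for the decoding done by the HTTP server (the commit names Jetty's `UrlEncoded.decodeUtf8To()`, which is not called here), and `legacyExtraDecode` is a hypothetical reconstruction of the removed step, not the original YaCy code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class PlusDecodingSketch {

    /** Form-decodes a raw parameter value once, as the HTTP server is expected to do. */
    static String decodeOnce(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Hypothetical legacy step: replaces '+' and "%20" by spaces on an already decoded value. */
    static String legacyExtraDecode(String decoded) {
        return decoded.replace('+', ' ').replace("%20", " ");
    }

    public static void main(String[] args) {
        String raw = "c%2B%2B+tutorial";      // what the browser sends for "c++ tutorial"
        String decoded = decodeOnce(raw);     // "c++ tutorial"
        System.out.println(decoded);
        // Applying the extra step on top turns the literal '+' characters into spaces.
        System.out.println(legacyExtraDecode(decoded));
    }
}
```

With a single decoding pass the user's '+' survives; the second pass cannot tell a literal '+' from an encoded space, so the search for "c++" silently becomes a search for "c".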
luccioman
baa7154486 Upgraded Apache PDFBox dependency from 2.0.9 to 2.0.11
Release notes at
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310760&version=12343466
and https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310760&version=12342889
2018-08-18 12:39:58 +02:00
luccioman
54fbe166ba Updated pdf cache clear steps consistently with current pdfbox version
- Removed calls to the clearResources functions (on the PDFont
class and its children), which no longer exist since the upgrade to pdfbox 2.n.n
- Removed hacky usage of a protected internal ClassLoader function. This
removes the warnings displayed when running with JDK9 or JDK10:

     [java] WARNING: Illegal reflective access by net.yacy.document.parser.pdfParser$ResourceCleaner (file:<path>) to method java.lang.ClassLoader.findLoadedClass(java.lang.String)
     [java] WARNING: Please consider reporting this to the maintainers of net.yacy.document.parser.pdfParser$ResourceCleaner
     [java] WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
     [java] WARNING: All illegal access operations will be denied in a future release

Crawling thousands of pdf documents from various sources after
applying these modifications revealed no new memory leak related to pdfbox
(measurements done with JVisualVM).
2018-08-16 18:23:42 +02:00
Erik Dominikus
f04580ecfd Add contributor guidelines; closes #214 2018-08-15 23:23:30 +07:00
luccioman
685122363d Added a parser for XZ compressed archives.
As suggested by LA_FORGE on mantis 781
(http://mantis.tokeek.de/view.php?id=781)
2018-08-15 10:07:39 +02:00
luccioman
8ce9c066bf Updated the JRE URL from 8u171 to 8u181 for the MS Windows installer 2018-08-14 08:41:23 +02:00
luccioman
0efc6c89ef Fixed rendering of crawl queues page for URLs with raw IPv6 addresses 2018-08-13 14:36:22 +02:00
luccioman
4ee14ff3c5 Fixed NullPointerException case on malformed crawl queue folder name 2018-08-13 14:35:26 +02:00
luccioman
21ad9435ec Fixed crawl queue folder naming for IPv6 hosts on MS Windows filesystems
As reported by @vikulin in issue #187, crawling websites using a raw
IPv6 address as host name in their URL failed when running on Microsoft
Windows platforms (FAT32 or NTFS filesystems) when the YaCy crawler created
the crawl queue folder, as the ':' character, which is part of an IPv6
address, is forbidden on these filesystems.
2018-08-11 10:02:26 +02:00
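The problem described in commit 21ad9435ec can be sketched as a host-to-folder-name mapping that avoids the ':' character. The class name, the method name and the choice of '_' as replacement are assumptions made for illustration; the commit message does not say how YaCy actually encodes the host.

```java
final class QueueFolderNameSketch {

    /** Maps a URL host to a folder name that contains no ':' (hypothetical scheme). */
    static String toFolderName(String host) {
        String h = host;
        // IPv6 literals appear in URLs wrapped in brackets, e.g. [2001:db8::1]
        if (h.startsWith("[") && h.endsWith("]")) {
            h = h.substring(1, h.length() - 1);
        }
        // ':' is forbidden in FAT32 and NTFS file names, so substitute it.
        return h.replace(':', '_');
    }

    public static void main(String[] args) {
        System.out.println(toFolderName("[2001:db8::1]")); // prints 2001_db8__1
        System.out.println(toFolderName("example.org"));   // prints example.org
    }
}
```

A real implementation would also need the mapping to be reversible, or at least collision-free, when the folder name is read back; this sketch does not address that.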
luccioman
0e976e9030 Added a link to MediaWiki dumps summary in import page for convenience 2018-08-08 08:11:02 +02:00
luccioman
1a91e87b05 Upgraded commons-compress dependency from version 1.16.1 to 1.17 2018-08-08 08:06:24 +02:00
luccioman
f2c479fe88 Cleaned up unused old jar files not removed on previous Solr upgrade 2018-08-08 08:04:33 +02:00
luccioman
ecd4535eb6 Prevent entering empty OpenSearch URLs in ConfigHeuristics_p.html
This prevents invalid configuration entries from being added to the
heuristicopensearch.conf file in the first place, as revealed by issue #209.
2018-08-06 12:07:47 +02:00
luccioman
1ca9cb6bd9 Fixed a NullPointerException case, reported in issue #209 2018-08-06 12:04:44 +02:00
luccioman
8a29551c54 Upgraded the OpenGeoDB dump URL
The status of the library in the DictionaryLoader_p.html page now also
notifies the user that an upgrade can be applied when an older dump is
already loaded.

Upgrade applied as suggested by Niklas Andrus @fapth_gitlab on Gitter
chat.
2018-08-03 18:39:41 +02:00
luccioman
373edf9eac Adjusted yjson Solr writer to support responses from an external Solr
Previously worked only with responses from the YaCy embedded Solr; now able
to render the response when YaCy is configured to use an external Solr
index.
2018-07-31 16:22:21 +02:00
luccioman
87bd17b1cf Slightly simplified the RSS OpenSearch Solr writer 2018-07-31 16:02:50 +02:00
luccioman
dc49ca9c27 Fixed a NPE case on the Solr OpenSearch response writer
Occurred when the omitHeader parameter was set to true
2018-07-29 16:30:37 +02:00
luccioman
f4267ed247 Made Solr OpenSearch RSS writer compatible with external Solr index
Previously worked only with responses from the YaCy embedded Solr; now able
to render the response when YaCy is configured to use an external Solr
index.
2018-07-28 11:03:31 +02:00
luccioman
2bbf070f57 Upgraded icu4j dependency from 61.1 to 62.1 2018-07-26 09:39:54 +02:00
luccioman
ede8ae6697 Fixed a few technical mistakes in updated Chinese translation from PR #188 2018-07-26 08:27:21 +02:00
luccioman
1e09a0284b
Merge pull request #188 from tangdou1/patch-3
small update in zh.lng
2018-07-26 08:22:13 +02:00
luccioman
b1410f593a Fixed stylesheet relative URLs rendering in Solr html writer
Relative URLs to CSS stylesheets were not properly rendered when using
the Solr html response writer and the "/solr/collection1/select" entry
point instead of "/solr/select".
2018-07-25 08:03:25 +02:00
luccioman
89c59814da Improved rendering of the Solr api relative url in the html writer
In order to have a consistent relative url when using either
/solr/select or /solr/collection1/select entry point.
2018-07-24 10:13:55 +02:00
luccioman
bf4f320b16 Optionally render the response header when using the Solr html writer
With params rendered as html input fields for conveniently modifying
params values and refreshing results.
2018-07-23 18:36:57 +02:00
luccioman
313204ae2c Override qf and df Solr params with defaults only when they are not set 2018-07-23 13:50:24 +02:00
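The behavior named in commit 313204ae2c, applying defaults for `qf` and `df` only when the request does not set them, can be sketched with a plain map. The class name, the method signature and the default values used in `main` are hypothetical; the real code operates on Solr request parameters rather than on a `Map`.

```java
import java.util.HashMap;
import java.util.Map;

final class SolrDefaultsSketch {

    /** Returns a copy of params in which qf and df fall back to defaults only if absent. */
    static Map<String, String> withDefaults(Map<String, String> params,
                                            String defaultQf, String defaultDf) {
        Map<String, String> out = new HashMap<>(params);
        // putIfAbsent keeps any value the caller already supplied.
        out.putIfAbsent("qf", defaultQf);
        out.putIfAbsent("df", defaultDf);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> request = new HashMap<>();
        request.put("q", "yacy");
        request.put("df", "title");
        // Default values below are placeholders, not YaCy's actual defaults.
        Map<String, String> effective = withDefaults(request, "text_t^1.0", "text_t");
        System.out.println(effective.get("df")); // prints title (caller's value kept)
        System.out.println(effective.get("qf")); // prints text_t^1.0 (default applied)
    }
}
```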
tangdou1
0ff2ca8f01
small update in zh.lng 2018-07-23 17:04:54 +08:00
luccioman
88e6ce23c9 Consistently render empty facets and facets having only entries at zero 2018-07-17 07:36:39 +02:00
luccioman
6831bdffb5 Fixed minor technical issues on Chinese updated translations 2018-07-17 07:08:30 +02:00
luccioman
e03c464f16
Merge pull request #186 from tangdou1/patch-1
Update zh.lng
2018-07-17 07:03:07 +02:00
tangdou1
0ebb27e5da
Update zh.lng 2018-07-16 20:18:02 +08:00
tangdou1
9c6a99f7ca
Update zh.lng 2018-07-16 19:53:20 +08:00
tangdou1
68b5b48335
Update zh.lng 2018-07-16 18:04:37 +08:00
luccioman
bdafb14336 Removed redundant synchronization lock on network switch function
The lock was useless: it was taken inside an already synchronized block,
the lock object was assigned a new value in that same block, and no other
code requests a lock on that same object.
2018-07-16 09:20:23 +02:00
luccioman
d5f44ea216 Removed unnecessary synchronization lock from serverSwitch constructor
Lock was useless here as it was set on an object instance attribute
while the object itself is not yet constructed and no other threads can
access it.
2018-07-16 09:13:50 +02:00
tangdou1
f19570a797
Update zh.lng 2018-07-14 21:36:03 +08:00
tangdou1
edb431cf8a
Update zh.lng 2018-07-14 17:18:43 +08:00
luccioman
dcee2ee6a6 Use standard Java annotation syntax instead of custom Javadoc tag
For better support by build tools.
As reported by @KnustJohn_twitter, the custom
[@phase](https://maven.apache.org/plugin-tools/maven-plugin-tools-java/index.html)
Javadoc tag made NetBeans fail on Javadoc generation for the
GitRevmavenTask class.
With standard Java 5
[annotations](https://maven.apache.org/plugin-tools/maven-plugin-plugin/examples/using-annotations.html#POM_configuration)
instead, this is no longer an issue.
2018-07-13 07:25:58 +02:00
luccioman
e1bc035496 Ignore Eclipse projects config files derived from maven pom.xml 2018-07-13 07:12:57 +02:00
luccioman
dcad393fe5 Fixed exceeding max size of failreason_s Solr field on large link list
When using 'From Link-List of URL' as a crawl start with lists of a
thousand links or more, the failreason_s Solr field
maximum size (32kb) was exceeded by the string representation of the URL
must-match filter whenever a crawl URL was rejected for not matching it.
2018-07-11 08:13:29 +02:00
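One way to keep a value such as the one in commit dcad393fe5 under a byte limit is to truncate it on a UTF-8 character boundary before indexing. This is only a sketch of that general technique: the commit message does not say how the fix was implemented, and the class name, method name and `MAX_BYTES` constant are assumptions (the constant simply mirrors the "32kb" figure quoted above).

```java
import java.nio.charset.StandardCharsets;

final class FailReasonSketch {

    /** Hypothetical limit mirroring the "32kb" mentioned in the commit message. */
    static final int MAX_BYTES = 32 * 1024;

    /** Truncates text so that its UTF-8 encoding is at most maxBytes long. */
    static String truncateUtf8(String text, int maxBytes) {
        if (text.getBytes(StandardCharsets.UTF_8).length <= maxBytes) {
            return text; // already short enough
        }
        int bytes = 0;
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            // Number of bytes this code point takes in UTF-8.
            int cpBytes = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + cpBytes > maxBytes) {
                break; // adding this character would exceed the limit
            }
            bytes += cpBytes;
            i += Character.charCount(cp);
        }
        return text.substring(0, i);
    }

    public static void main(String[] args) {
        System.out.println(truncateUtf8("abcdef", 3));      // prints abc
        System.out.println(truncateUtf8("h\u00e9llo", 2));  // prints h
    }
}
```

Cutting on code point boundaries avoids ending the stored value with half of a multi-byte character.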
luccioman
f467601561 Properly lock solrInstances for reboot and restoration of embedded Solr
Putting a synchronization lock directly on the solrInstances property
was ineffective as it is assigned a new (unlocked) instance in these
operations.
2018-07-08 08:57:59 +02:00
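The pitfall described in commit f467601561 is a general Java one: `synchronized (field)` locks the object the field currently refers to, so once the field is reassigned, other threads lock a different object. A minimal sketch of the usual remedy, a dedicated lock object that is never reassigned, is shown below. The class and member names are hypothetical, and a `StringBuilder` stands in for the real Solr instances holder.

```java
final class InstanceHolderSketch {

    /** Dedicated lock: final, so every thread always synchronizes on the same object. */
    private final Object lock = new Object();

    /** Stand-in for the solrInstances property, which gets replaced on reboot. */
    private StringBuilder instances = new StringBuilder("v1");

    /** Replaces the held instance while holding the dedicated lock. */
    void reboot(String label) {
        synchronized (lock) {
            // Reassigning the field is safe here because the lock is not the field itself.
            instances = new StringBuilder(label);
        }
    }

    /** Reads the current instance under the same lock. */
    String current() {
        synchronized (lock) {
            return instances.toString();
        }
    }

    public static void main(String[] args) {
        InstanceHolderSketch holder = new InstanceHolderSketch();
        holder.reboot("v2");
        System.out.println(holder.current()); // prints v2
    }
}
```

Had `reboot` used `synchronized (instances)` instead, a thread entering after the reassignment would lock the new object and could run concurrently with a thread still holding the old one.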
luccioman
9630f81306 Fixed small unnecessary lines of code 2018-07-08 08:15:26 +02:00
luccioman
26aa5f7a0f Suppress compilation warning on unit testing intentional failure 2018-07-08 08:14:07 +02:00