Commit Graph

7937 Commits

Author SHA1 Message Date
luc
7aa1a29e33 Return more accurate HTTP status 400 with detail message when some error
occurs on ViewImage :
 - missing required parameters
 - url licence invalid
2016-01-08 23:18:13 +01:00
luc
bd9dc2f32b Corrected NullPointerException cases occurring in YJsonResponseWriter
when no description is available.
2016-01-08 20:46:02 +01:00
luc
0076f9f97d Updated documented sample url 2016-01-08 20:43:49 +01:00
luc
cfdbc2b487 Improved URLLicence reliability for use by concurrent non-authorized
users.
Removed URLLicence generation when unnecessary (authorized users)
2016-01-08 20:42:57 +01:00
reger
c91e712178 further refactor using standard java / (one) utf-8 charset variable
extending initiative of commit 9a25751850
2016-01-07 16:17:37 +01:00
luc
571bc55937 Refactoring : use StandardCharsets constants instead of hard-coded
charset names.
2016-01-05 23:37:05 +01:00
reger
1af0e9ef74 remove workaround for Solr bug regarding multivalued date fields
fixed in 5.4.0
http://issues.apache.org/jira/browse/SOLR-8050
2016-01-03 01:11:27 +01:00
sixcooler
5a35f9383a bump to solr/lucene 5.4.0 2016-01-02 21:07:50 +01:00
reger
a58d34a4e8 check error URL cache before adding errorDoc to index
- del obsolete related switchboardconstant
2016-01-02 05:03:57 +01:00
reger
e9539b1086 reintroduce special handling of file upload multipart/form-data from HTTPDemon.parseMultipart
- add filename to parameter fieldname
- add filecontent to special parameter fieldname$file
(some servlets use this $file parameter)

fix for http://mantis.tokeek.de/view.php?id=542
2015-12-31 03:04:13 +01:00
reger
cd26717ba2 fix low memory status hint (dht-in disabled)
http://mantis.tokeek.de/view.php?id=619
2015-12-29 20:38:45 +01:00
reger
a5faf73afa remove obsolete yacy.init entries interaction.*
(related to removed triplestore)
2015-12-29 15:41:19 +01:00
sixcooler
dce1cb65c4 Merge remote-tracking branch 'choose_remote_name/master' 2015-12-28 23:20:42 +01:00
reger
46ac0867ff fix poison mediawikiimporter output queue also after ExecutionException
in worker thread.
The importer's writer needs a poison marker to close the file. On exception (e.g. OOM)
add a poison marker in the outermost try/catch to ensure the output queue will terminate
in this condition too (and closes+renames the surrogate/in/xxx.prt file)
2015-12-28 02:32:00 +01:00
reger
a7591d3ed0 fix mediawikiimporter number format exception on coordinate parsing
handle incomplete metadata like "NS=43/50//N".
For other {expr ... } type entries a try/catch was added
2015-12-27 01:59:15 +01:00
reger
9da1712a31 increase http header EXPIRES for css and images in DefaultServlet
to increase browser cache hits for not changing content
2015-12-26 17:35:46 +01:00
reger
6d54eb3d36 skip loading document on crawl start for YMark bookmarks
by adding a constructor giving the already loaded document as parameter.
2015-12-26 01:15:07 +01:00
reger
80e2c82249 fix NPE on empty blog importfile parameter 2015-12-24 02:00:45 +01:00
reger
e84d94f8ca fix mime table for ms office / open office documents
(causing wrong parser detect in intranet mode)
2015-12-22 17:48:24 +01:00
reger
45b9bd8403 adjust MultiProtocolURL.protocol detection to handle mailto with "://" in parameters,
and feeding hyperlinks to webgraph processing.
2015-12-21 04:42:26 +01:00
reger
d5fd031449 fix reading of ippattern config array in URLProxy 2015-12-20 15:51:54 +01:00
reger
b7e8358645 make use of header.getContentType where possible (mime is normalized afterwards)
otherwise use header.mime() differentiated in prev. commit.
2015-12-20 15:49:24 +01:00
reger
7a8c077838 fix HeaderFramework.mime() to strip charset parameter.
Differentiate mime() and getContentType() which gives the raw header field.
This improves parser detection if charsets are included in http content-type field.
2015-12-20 06:44:16 +01:00
reger
b4b6910d60 fix (todo): correct doc.id of remote search result if it does not match
the newly calculated doc hash.
Testing showed that in some cases the delivered url doesn't match the locally
calculated hash. In this case replace doc.id (and host_id_s) with the calculation
from the url.
2015-12-20 02:10:49 +01:00
reger
dec3e6ad96 fix: adjust urlstub for mailto links
(skip protocol)
2015-12-19 20:13:33 +01:00
reger
cb83e65f89 drop returning document language "en" if unknown (fix todo)
which also harmonizes handling of query.modifier for rwi and solr results
(the result must match a given language filter)
2015-12-19 01:42:35 +01:00
reger
0c5548a7ff fix (todo) remove redundant holding of email link nameproperty in parser document 2015-12-18 02:35:44 +01:00
reger
71c416f383 show mailto links in ViewFile.html linklist 2015-12-18 01:11:55 +01:00
reger
6b7c10cef8 fix dc:date in mediawikiimporter/document.writexml to use lastmodified 2015-12-17 02:53:10 +01:00
reger
14803d58cd let html scraper accept html5 <link rel="icon"> for favicon links 2015-12-17 00:36:08 +01:00
luc
b4cdacee76 Merge branch 'master' of https://github.com/yacy/yacy_search_server 2015-12-16 03:26:06 +01:00
luc
ba0a293f5c Corrected another case of
"org.apache.lucene.store.AlreadyClosedException" occurring when
SearchEvent.cleanup() was called while committing local solr index.
2015-12-16 03:25:07 +01:00
reger
4d2b934487 prevent mailto links getting into parser result document's in/outbound link collection
by checking mailto scheme early.
- fix upper case mailto protocol assignment
- add test case for getProtocol
2015-12-16 03:01:17 +01:00
luc
8c4ab9c76b Added an option to limit the size of remote solr documents put to
local index. See mantis #626.
2015-12-16 02:20:03 +01:00
luc
a2c08402af Merge branch 'master' of https://github.com/yacy/yacy_search_server 2015-12-15 23:30:30 +01:00
luc
70595d05d0 Modified MemoryControl.main() test to end properly so that results
are displayed better.
2015-12-14 23:49:28 +01:00
sixcooler
1be67d9ab6 CachedSolrConnector was replaced by ConcurrentUpdateSolrConnector years
ago - time to let it go
Commented out unused table of cache-objects
2015-12-14 21:33:27 +01:00
reger
28b8bc290a fix use of NETWORK_SEARCHVERIFY for rwi verification
was not used to set the searchevent parameter (done in SearchEventCache.getEvent)
- remove unused corresponding QueryParams.filterfailurls param.
2015-12-13 20:01:49 +01:00
reger
020630efd8 remove unused network scanner parameter from queryparameter
Search event is not using networkscanner 
(removed filterscannerfail param always init to false)
2015-12-13 02:50:08 +01:00
luc
ad5586f8f6 Merge branch 'master' of https://github.com/yacy/yacy_search_server 2015-12-08 03:35:36 +01:00
luc
8ebefa4233 Fixed MediaWiki import : DCEntry conversion to SolrInputDocument was
failing. Looks like it was broken since Commit
b43811d38c
2015-12-08 03:34:03 +01:00
luc
7736ee5a42 Updated MediawikiImporter main() : display usage in console and stop
properly without calling System.exit
2015-12-08 03:30:51 +01:00
reger
cdb8f3b10d make current ranking score value avail. to search interface / api
Update the result's score field with the result queue ranking value to reflect
the actual calculated/used score,
for rwi & solr stack results.
(calc. etc. is unchanged, it's just that result entry carries the latest val
as api retrieves the number from it)
2015-12-08 03:17:32 +01:00
luc
27d11f8671 Fixed isSolrDump function : PushBackInputStream was not unread when
returning false (for example with a WikiMedia dump).
2015-12-07 21:58:36 +01:00
Michael Peter Christen
135a123a77 less logging in new language detection 2015-12-03 00:39:15 +01:00
Michael Peter Christen
ef8cd80593 fix for npe 2015-12-03 00:33:13 +01:00
reger
0347bfa71f Apply collection query constraint/modifier to rwi result stack.
Collection is not available in pure rwi entries (but in local solr metadata)
But if user wishes to filter by query constraint also rwi shall adhere to this
(even if only rwi entries with parsed or solr received metadata may fit)
2015-12-02 22:57:59 +01:00
luc
2a67d2ba6f Corrected error management for unsupported image formats, parsing
errors, and unavailable resources : avoid logging too many Exceptions as
these errors easily occur when searching images.
2015-12-01 01:06:01 +01:00
Michael Peter Christen
d6e9834040 Merge branch 'master' of
https://github.com/Scarfmonster/yacy_search_server

# Conflicts:
#	.classpath
#	build.xml
2015-11-30 16:54:54 +01:00
Michael Peter Christen
d82d311995 Merge branch 'master' of https://github.com/luccioman/yacy_search_server
# Conflicts:
#	.classpath
2015-11-30 13:34:10 +01:00