sixcooler
b61f91f0d4
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-10-30 18:47:42 +01:00
reger
81f53fc83a
upd readme.mediawiki min java version 1.7
2015-10-26 22:19:20 +01:00
reger
d223cf0ae4
adjust MediaWiki importer geo coordinate calculation
- allow lat/long 0.xxx
- south / west assignment
include test class
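A minimal sketch of the kind of conversion this commit touches, assuming coordinates arrive as degrees, minutes, seconds and a hemisphere letter. The class and method names are hypothetical and not taken from the YaCy source. It only illustrates that values below one degree stay valid and that south and west map to negative numbers.

```java
// Hypothetical helper for illustration; not the YaCy WikiCode implementation.
final class GeoCoordinate {
    /** Converts degrees/minutes/seconds plus a hemisphere letter (N, S, E, W) to signed decimal degrees. */
    static double toDecimal(double degrees, double minutes, double seconds, char hemisphere) {
        // Results below 1.0 (for example 0.5) are legitimate coordinates and are kept.
        double value = degrees + minutes / 60.0 + seconds / 3600.0;
        char h = Character.toUpperCase(hemisphere);
        // South latitudes and west longitudes are negative by convention.
        return (h == 'S' || h == 'W') ? -value : value;
    }
}
```

For example, `GeoCoordinate.toDecimal(0, 30, 0, 'S')` yields a latitude half a degree south of the equator as a negative value.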
2015-10-26 21:19:35 +01:00
sixcooler
7e2723a894
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-10-26 19:23:25 +01:00
reger
2b775d5be6
fix typo in WikiCode coordinate calculation
2015-10-25 19:38:42 +01:00
reger
a2dcf64039
fix IndexImportMediawiki_p servlet's refresh header
add url parameter to make sure no parameters are included in the refresh url,
which could cause an unwanted restart of the import job
see http://mantis.tokeek.de/view.php?id=591 comments
2015-10-25 05:41:25 +01:00
reger
bbe9df2bb3
fix MediawikiImporter for bz2 dump
skip reading the bz2 file magic bytes to identify the bz2 format, as an input stream reset would be required. Commons Compress reads and checks the magic bytes internally and throws an IOException if they are wrong, making the pre-read obsolete.
2015-10-25 03:06:15 +01:00
reger
c6687dd560
fix a system.out to log.fine
in bmpParser
2015-10-25 00:26:45 +02:00
reger
c720b4c249
remove override of dynamicField coordinate_p in solr schema
(coordinate_p is not a mandatory field and as such doesn't need to be declared as schema.field)
2015-10-24 22:44:28 +02:00
reger
e53c6bbd51
fix init of peer flags
(remove hiding of ssl flag)
2015-10-24 19:36:33 +02:00
sixcooler
301ba6131a
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-10-24 13:01:47 +02:00
Michael Peter Christen
ac034db8bc
Merge branch 'master' of https://github.com/luccioman/yacy_search_server
# Conflicts:
# htroot/js/highslide/highslide.js
# source/net/yacy/document/ImageParser.java
2015-10-24 11:22:35 +08:00
luc
8da20718aa
Created a class to test ViewImage rendering against multiple image files.
2015-10-23 15:49:07 +02:00
luc
ec04d27473
Corrected APNG test suite link name.
2015-10-23 14:12:00 +02:00
luc
cbb84ba073
Detailed javadoc.
2015-10-23 13:57:24 +02:00
luc
70111876d2
Filled ViewImageTest.html with all remaining IANA image file formats.
Added some links to test suites and specifications.
2015-10-23 12:27:52 +02:00
sixcooler
bfccb8db1c
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-10-22 20:39:57 +02:00
reger
826f14f37f
fix unnecessary set null of peer flags, causing reread
remove obsolete version flags
2015-10-22 02:35:58 +02:00
luc
a156fd65d0
Patch to manage render or load errors is still needed after highlight.js version upgrade.
Updated patch for better behavior consistency between browsers.
2015-10-22 00:36:34 +02:00
sixcooler
cdbafe340e
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-10-21 08:42:36 +02:00
luc
37e28e0dd3
- Keep aspect ratio of images rendered directly by browser such as gif and svg.
- Corrected square rendering of landscape images with height smaller than maxHeight
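A hedged sketch of aspect-ratio-preserving scaling, assuming positive pixel dimensions and a bounding box given by maximum width and height. The helper name and signature are hypothetical and not taken from the ViewImage servlet. It shows one scale factor applied to both sides, so a landscape image stays landscape.

```java
// Hypothetical helper for illustration; not the ViewImage servlet code.
final class ImageFit {
    /** Returns {width, height} scaled down to fit the box while keeping the aspect ratio; never upscales. */
    static int[] fit(int width, int height, int maxWidth, int maxHeight) {
        // Use the smaller of the two ratios so both sides fit; cap at 1.0 to avoid enlarging.
        double scale = Math.min(1.0, Math.min((double) maxWidth / width, (double) maxHeight / height));
        int w = Math.max(1, (int) Math.round(width * scale));
        int h = Math.max(1, (int) Math.round(height * scale));
        return new int[] { w, h };
    }
}
```

With a 100 x 100 box, a 400 x 200 image comes out as 100 x 50 rather than being squeezed into a square.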
2015-10-21 02:49:51 +02:00
reger
571609c208
upd javascript img viewer to highslide 4.1.13
2015-10-21 02:14:04 +02:00
luc
e2d00585e2
Display full size preview using ViewImage Servlet.
2015-10-20 01:17:37 +02:00
luc
74b0283d57
Added image preview error management.
2015-10-20 01:15:02 +02:00
luc
5902ce032e
Corrected NullPointerException case when ImageIO reader is not found for image format.
2015-10-19 14:11:26 +02:00
reger
f0b5bc93a3
remove obsolete yacy.init entry "secureHttps"
not used anywhere
2015-10-19 03:47:28 +02:00
reger
c4fa6d7bf5
upd to icu4j-56_1
2015-10-19 01:06:51 +02:00
reger
5445f38070
upd to jetty 9.2.13.v20150730
2015-10-19 00:53:10 +02:00
reger
6ca02ad577
upd httpclient-4.5.1, httpmime-4.5.1, httpcore-4.4.3, commons-compress-1.10
2015-10-18 19:53:39 +02:00
reger
c6495a5b62
add a log entry on parsing ajax crawling scheme snapshot
(prev. commit 9252e36aeb)
2015-10-18 06:19:12 +02:00
reger
9252e36aeb
implement ajax crawling scheme for ajax sites which adhere to the proposed use of hash-bangs to provide html content
see the freshly deprecated https://developers.google.com/webmasters/ajax-crawling/
The implementation improves parsing of the homepage (ajax page) which uses the metatag "fragment" in its header: the supplied html snapshot is parsed instead of the mostly empty ajax/scripted page.
The implementation also supports hash-bang urls (urls with an anchor starting with !, like ...path#!hashfragment), but our crawler filters them out
(the use of hash-bangs is controversially discussed and the proposal is deprecated, so it makes no sense to adjust the crawler; but as long as some sites use it, this minor change/improvement in the htmlparser is good for some time).
Quick - how does it work
- if a metatag fragment with content "!" is found
- htmlparser tries to get the content of the html snapshot (using a different url)
- htmlparser returns 2 documents (original url and snapshot content - but using the same original url)
- after parsing, the result documents are joined (and stored to the index, also containing content from the snapshot page... as the original ajax page typically contains no parseable html content)
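The "different url" for the snapshot can be sketched as follows. Under the Google proposal linked above, a crawler requests the snapshot by moving the hash-bang fragment into an `_escaped_fragment_` query parameter; for a page that only carries the "fragment" metatag, the parameter is appended with an empty value. The class below is a hypothetical illustration, not the YaCy htmlparser code, and whether YaCy builds the url exactly this way is an assumption. `URLEncoder` escapes more characters than the proposal strictly requires.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not the YaCy htmlparser code.
final class AjaxSnapshotUrl {
    /** Maps a page url (with or without a #! fragment) to its snapshot url. */
    static String snapshotUrl(String url) {
        String base = url;
        String fragment = "";
        int bang = url.indexOf("#!");
        if (bang >= 0) {
            // Hash-bang url: the part after "#!" becomes the parameter value.
            fragment = url.substring(bang + 2);
            base = url.substring(0, bang);
        } else {
            // Plain anchor (if any) is dropped; metatag case uses an empty value.
            int hash = url.indexOf('#');
            if (hash >= 0) base = url.substring(0, hash);
        }
        String separator = base.indexOf('?') >= 0 ? "&" : "?";
        try {
            return base + separator + "_escaped_fragment_=" + URLEncoder.encode(fragment, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

The document parsed from the snapshot url would then be stored under the original url, as the steps above describe.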
2015-10-18 05:51:01 +02:00
Michael Peter Christen
d1ae999ef9
replaced HashMap with LinkedHashMap to preserve the object order
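A small sketch of why this swap matters: `LinkedHashMap` iterates in insertion order, while `HashMap` makes no ordering guarantee. The class name and keys are made up for illustration and do not come from the changed code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo for illustration; keys are invented.
final class OrderDemo {
    /** Returns the keys in the order the map iterates over them. */
    static List<String> keysInIterationOrder() {
        Map<String, Integer> fields = new LinkedHashMap<String, Integer>();
        fields.put("title", 1);
        fields.put("author", 2);
        fields.put("date", 3);
        // LinkedHashMap guarantees insertion order here; HashMap would not.
        return new ArrayList<String>(fields.keySet());
    }
}
```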
2015-10-16 23:30:51 +02:00
Michael Peter Christen
7d075a1d76
added log lines
2015-10-16 23:30:04 +02:00
Michael Peter Christen
092dac086e
Merge branch 'master' of https://github.com/luccioman/yacy_search_server
2015-10-16 23:22:30 +02:00
Michael Peter Christen
a44cc774d0
Merge branch 'master' of github.com:yacy/yacy_search_server
2015-10-16 23:21:58 +02:00
sixcooler
41c9215174
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-10-16 21:45:23 +02:00
reger
7a64bebb86
init Recrawl job chunk size to max crawl loader during job start, to use some system preferences
and allow injection of recrawl urls before the queue is empty
During recrawl the balancer often hangs on the very last urls, on hosts with huge delay time;
by allowing injection earlier, progress is more balanced. The max number of crawl urls injected by the recrawl job is 2 * max loader.
2015-10-16 03:05:39 +02:00
sixcooler
e7dab60ebd
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-10-15 19:54:42 +02:00
luc
d6522fa4a2
Integrated haraldk/TwelveMonkeys library to first add TIF image format support.
2015-10-15 10:06:51 +02:00
luc
e093fb228d
Created a generic ViewImage performance render test.
2015-10-15 09:18:24 +02:00
Michael Peter Christen
9244694e64
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
2015-10-14 15:17:23 +02:00
Michael Peter Christen
151ccd50a9
fix for image size field values (must be multi-valued)
2015-10-14 15:16:16 +02:00
luc
3ad564e2e4
Created a ViewImage rendering performance measurement test.
2015-10-14 10:17:09 +02:00
luc
62e07a26a0
Refactoring: split into sub-functions to make understanding and performance measurement easier.
2015-10-14 10:15:00 +02:00
luc
b3f044072e
Updated table headers and SVG file url for case sensitive OS.
2015-10-14 10:13:37 +02:00
luc
ff963cbe23
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-10-13 08:55:18 +02:00
reger
c9937973e3
unescape MultiProtocolURL getAttributes() return values.
use getAttributes() to get query parameters as clear text (w/o url encoding)
use getSearchpartMap() to get in internal format (url encoded)
fix for http://mantis.tokeek.de/view.php?id=606
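The difference between url-encoded and clear-text query parameters can be sketched with the standard library. This is not the `MultiProtocolURL` implementation; the class and method below are hypothetical and only show what "unescaped" return values look like compared to the raw query string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not MultiProtocolURL.
final class QueryAttributes {
    /** Splits a raw query string and returns its parameters as clear text (url decoding applied). */
    static Map<String, String> decoded(String query) {
        Map<String, String> result = new LinkedHashMap<String, String>();
        if (query == null || query.isEmpty()) return result;
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) continue;
            int eq = pair.indexOf('=');
            String key = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            try {
                // Decode after splitting so an encoded "&" or "=" stays inside its value.
                result.put(URLDecoder.decode(key, "UTF-8"), URLDecoder.decode(value, "UTF-8"));
            } catch (UnsupportedEncodingException e) {
                // UTF-8 is always available on a conforming JVM.
                throw new IllegalStateException(e);
            }
        }
        return result;
    }
}
```

Decoding only after splitting on `&` and `=` is the design choice that keeps an encoded separator such as `%26` from being mistaken for a parameter boundary.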
2015-10-13 02:43:18 +02:00
sixcooler
6695e5cdd3
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-10-12 21:37:04 +02:00
reger
10b0eb106f
fix link target on iframe list in CrawlProfileEditor
2015-10-11 06:06:40 +02:00
reger
78e8c6f3e5
refactor special handling (static override) of SUPPORTED_EXTENSIONS/MIME_TYPES
not used for genericImageParser
2015-10-11 01:23:52 +02:00