reger
9cfa847c94
upd maven pom (add langdetect)
2015-11-30 18:57:16 +01:00
Michael Peter Christen
5309b99e98
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
2015-11-30 17:01:39 +01:00
Michael Peter Christen
d6e9834040
Merge branch 'master' of
...
https://github.com/Scarfmonster/yacy_search_server
# Conflicts:
# .classpath
# build.xml
2015-11-30 16:54:54 +01:00
Michael Peter Christen
c030b9bc5c
Merge pull request #27 from Stepanov-Sergey/master
...
added Russian synonyms
2015-11-30 13:45:25 +01:00
Michael Peter Christen
cff152f26e
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
2015-11-30 13:44:36 +01:00
Michael Peter Christen
7e785dac8e
urlproxyheader must be in the default package because all classes in the
...
htroot path must be in the default package
2015-11-30 13:35:41 +01:00
Michael Peter Christen
d82d311995
Merge branch 'master' of https://github.com/luccioman/yacy_search_server
...
# Conflicts:
# .classpath
2015-11-30 13:34:10 +01:00
Michael Peter Christen
d3ab43e743
fixed classpath
2015-11-30 13:19:49 +01:00
Michael Peter Christen
06994b5853
Merge pull request #23 from linkerlin/patch-1
...
Create .travis.yml
2015-11-30 13:18:03 +01:00
Sergey Stepanov
de0f3c6ff1
added Russian synonyms
2015-11-30 11:37:47 +03:00
reger
b5371ea8c1
read/init crawl queue in a thread
...
to speed-up YaCy start on large existing crawler queues
2015-11-29 05:19:39 +01:00
reger
f05b34fc35
upd to slf4j-1.7.13
2015-11-29 01:24:46 +01:00
reger
1160b13172
remove unused md5 from ViewFile servlet params
2015-11-28 23:09:15 +01:00
reger
e163ea88f6
fix vsdParser (Visio) parser return statement
...
(final block unnecessary throw)
2015-11-28 02:43:38 +01:00
reger
b2c8bc0ae6
remove md5_s from default index fields
...
it is not assigned a value / not used
Due to above also excluded from transfer protocol.
2015-11-27 02:41:02 +01:00
luc
e40ae0943b
- No max dimensions specified : render raw image data when source and
...
target image format are the same.
- Corrected scaling condition.
2015-11-26 09:30:43 +01:00
luc
4c36b7bd14
Merge branch 'master' of https://github.com/yacy/yacy_search_server
...
Conflicts:
.classpath
2015-11-26 09:28:34 +01:00
reger
90686a75a2
fix flux factor (additional crawl delay by access count) calculation
2015-11-25 01:34:41 +01:00
reger
d79fa7fbeb
upd to Jetty v9.2.14.v20151106
2015-11-24 21:35:58 +01:00
luc
4af27289e5
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-11-23 09:01:25 +01:00
reger
297fdb60d3
throw exception if crawler hostqueue can't create hostpath directory.
...
In rare cases the hostname may not be a valid filesystem directory name,
so the directory can't be created (e.g. a name containing a '*' char).
A MalformedURLException is now thrown to prevent the crawl queue from
looping on this invalid entry.
2015-11-22 21:26:18 +01:00
luc
755efac17d
Use same max file size when loading all resource bytes or opening stream
...
content
2015-11-20 19:35:39 +01:00
luc
5eafce5577
Rendering performance improvement : use EncodedImage constructor with
...
BufferedImage parameter to avoid re-rendering BufferedImage.
2015-11-20 15:02:58 +01:00
luc
bc6c79fc12
Corrected scaling function for non RGB images.
2015-11-20 14:35:36 +01:00
luc
042b0e9658
Corrected IcedTea version. See http://mantis.tokeek.de/view.php?id=615
2015-11-20 10:15:54 +01:00
luc
1565559df8
Refactoring : extracted write InputStream method.
2015-11-20 09:42:24 +01:00
luc
f0478bb14d
BMP and ICO image formats support : integrated /haraldk/TwelveMonkeys
...
imageio-bmp-3.2 library.
- better BMP format flavours support
- handle PNG encoded icons
- handle transparency
Added some javadoc url references to .classpath
2015-11-20 09:38:16 +01:00
luc
b6ba941d33
Eclipse project configuration: added JavaScript nature and validation
2015-11-20 09:32:30 +01:00
luc
7f27683831
Fixed compilation error.
2015-11-20 09:29:02 +01:00
luc
07437986e7
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-11-20 08:15:24 +01:00
reger
97cc03ef6a
start using a template for urlproxy header
...
It is included as iframe /proxmsg/urlproxyheader.html
to allow full servlet functionality and flexibility to display some
index/meta data in the future.
2015-11-20 01:49:56 +01:00
reger
d08e421809
fix link to logo (yacysearch.xsl)
2015-11-19 21:08:00 +01:00
luc
f01d49c37a
Process large or local file images dealing directly with content
...
InputStream.
2015-11-18 10:15:38 +01:00
luc
3c4c77099d
If available, check content length before downloading. Check also
...
content length is not over Integer.MAX_VALUE.
2015-11-18 10:11:38 +01:00
luc
5bbb2e1730
Ensure resource is closed when reading a full file InputStream
2015-11-18 10:08:06 +01:00
luc
6291a57300
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-11-18 08:49:31 +01:00
reger
0d3c5b223e
have psParser cleanup temp file
2015-11-17 23:45:29 +01:00
reger
7d0d19cb8e
avoid File.deleteOnExit() on temp files
...
The JVM registers each file in a list, regardless of whether it has already
been deleted, and never cleans up the list during runtime.
This accumulates to a considerable amount of memory during large crawls
and/or long uptime.
To tackle this, all temp files are now created in a subdir of java.io.tmpdir
and the jvm tmpdir property is set to this subdir, which is deleted by
code on shutdown.
Additionally let pdfParser use this tmp subdir too.
2015-11-17 22:27:07 +01:00
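The approach described in the commit above (a private temp subdirectory that is removed by code on shutdown, instead of File.deleteOnExit()) can be sketched roughly as follows. This is a minimal illustration and not the YaCy implementation: the class name TempSubdirSketch, the methods redirectTmpDir and deleteRecursively, and the prefix argument are all hypothetical. The sketch passes the directory explicitly to File.createTempFile, since a JVM may read java.io.tmpdir only once and ignore a later change to the property.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical sketch; names are illustrative and not taken from YaCy.
class TempSubdirSketch {

    /**
     * Creates a private subdirectory below the current java.io.tmpdir
     * and points the java.io.tmpdir property at it.
     */
    static Path redirectTmpDir(String prefix) throws IOException {
        Path base = new File(System.getProperty("java.io.tmpdir")).toPath();
        Path sub = Files.createTempDirectory(base, prefix);
        System.setProperty("java.io.tmpdir", sub.toString());
        return sub;
    }

    /**
     * Deletes the directory and everything in it. Intended to be called
     * once on shutdown, so no per-file deleteOnExit() registration is needed.
     */
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order visits children before their parent directory.
            walk.sorted(Comparator.reverseOrder())
                .forEach(p -> p.toFile().delete());
        }
    }
}
```

A caller would typically invoke redirectTmpDir once at startup, create temp files with File.createTempFile(prefix, suffix, sub.toFile()), and call deleteRecursively(sub) from its shutdown routine.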
luc
bfe51001e3
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-11-17 08:30:32 +01:00
reger
02e4489a23
set tmpfile.deleteOnExit by default,
...
to make sure files are removed on shutdown.
2015-11-16 21:37:45 +01:00
reger
2985baaa01
Exclude repetitive protocol part in tokenized url
...
used as description if none is avail. from parser.
2015-11-16 01:06:20 +01:00
reger
ca3d26a401
harmonize wordsintitle & CollectionSchema.title_words_val calculation,
...
remove obsolete partial init of wordreference from urimetadata
2015-11-15 06:06:37 +01:00
reger
7bf03856d1
add link to quick select blacklist
...
from title list
2015-11-15 00:39:38 +01:00
reger
440ce6d198
add German translation to re-crawl job
2015-11-15 00:34:22 +01:00
reger
5362a80f1c
upd to httpcore 4.4.4
2015-11-14 21:16:31 +01:00
reger
e90593450c
upd to TwelveMonkeys ImageIO 3.2
2015-11-14 01:46:25 +01:00
reger
b4dbff6a6a
fix yacysearch.json "totalResults"
...
element "totalResults" is included twice (at begin & end),
only the element after performing the search holds number > 0
see http://mantis.tokeek.de/view.php?id=608
2015-11-13 20:10:47 +01:00
reger
52a9040ae6
Sort out double keywords (dc_subject) early in parsed documents
...
- by directly using Set instead of List
- remove unneeded String[] getter
2015-11-13 01:48:28 +01:00
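The idea in the commit above, collecting keywords in a Set so that duplicates never enter the parsed document, might look like the sketch below. The class KeywordSetSketch and the method collectKeywords are hypothetical names used only for illustration; the actual YaCy document class is not shown in this log. The use of LinkedHashSet, which keeps first-seen order, and the trimming of entries are assumptions of this sketch rather than details confirmed by the commit.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; not the YaCy document class.
class KeywordSetSketch {

    /**
     * Collects keywords into a Set, so a repeated keyword is stored once.
     * Null and blank entries are skipped; surrounding whitespace is trimmed.
     */
    static Set<String> collectKeywords(String... raw) {
        Set<String> keywords = new LinkedHashSet<>();
        for (String k : raw) {
            if (k == null) {
                continue;
            }
            String trimmed = k.trim();
            if (!trimmed.isEmpty()) {
                keywords.add(trimmed);
            }
        }
        return keywords;
    }
}
```

Compared with filling a List and removing duplicates afterwards, the Set rejects a repeated keyword at insertion time, so no separate deduplication pass is needed.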
luc
49331dc523
Merge branch 'master' of https://github.com/yacy/yacy_search_server
2015-11-12 08:21:56 +01:00
luc
0de6988604
Added links to more image test suites.
2015-11-12 08:21:37 +01:00