sixcooler
ef6a64b2a4
Fix for upload-connection staying in blocked state.
...
This was caused by reading via GZIP from a close-wait connection and caused
high cpu- and system-loads.
Solved by implementing handling of the ReadListener.
2015-06-09 21:26:10 +02:00
reger
c973f94936
add log entry on release file delete by ResourceObserver
2015-06-08 03:17:12 +02:00
reger
121972752c
implement deleteOldDownloads in ResourceObserver on low diskspace
...
- direct assign sb.observer (skip redundant InitThread)
2015-06-08 02:52:13 +02:00
Michael Peter Christen
0d5ac6e527
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
2015-06-07 22:25:26 +02:00
Michael Peter Christen
9c12555be5
added link to Snapshots in search results if the snapshot exists and
...
option is set in ConfigSearchPage_p
(this is a stub: we also need a visualization of pdf files!)
2015-06-07 20:37:37 +02:00
sixcooler
480e4a6a5c
Update to Jetty-9.2.11 - a bugfix-release that did not solve my
...
Problems, but does not harm anything
2015-06-07 20:09:27 +02:00
reger
72f6a0b0b2
enhance recrawl job
...
- allow to modify the query to select documents to process (after job has started)
- allow to include failed urls (httpstatus <> 200)
2015-06-06 18:45:39 +02:00
Michael Peter Christen
e0a23c56c7
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
2015-06-05 08:32:55 +02:00
Michael Peter Christen
fb9e1dd3f5
servlet for latest commit
2015-06-05 07:22:35 +02:00
reger
5183ad718d
upd to poi-3.12.jar
2015-06-05 03:36:57 +02:00
reger
7478338a40
remove augmented parsing activation from frontend
...
experimental implementation not used and based on error prone experimental rdfaparser
2015-06-05 00:51:00 +02:00
reger
11aa2edfe1
remove RDFa parser activation from frontend
...
reason: experimental implementation of RDFa parser not executed (limited to special urls) but may cause error on normal html parsing due to an inputstream.reset
2015-06-05 00:15:16 +02:00
Michael Peter Christen
ff11ac89f7
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
2015-06-04 23:04:04 +02:00
Michael Peter Christen
5e2d23b7a0
removed the new index export method from the IndexControlURLs_p.html
...
servlet and moved it to a new /IndexExport_p.html servlet. This servlet
is now more prominently linked in the main menu under Production -> Index
Export/Import
2015-06-04 23:03:46 +02:00
reger
64a7b0b140
Merge origin/master
2015-06-04 22:44:46 +02:00
reger
49b79987c9
remove obsolete searchfl work table
...
was used to register urls with not complete words in snippet but is never accessed
2015-06-04 22:44:01 +02:00
sixcooler
4533f392b0
correct the dark themes to show also a dark navbar on searchresults
2015-06-04 22:15:38 +02:00
Michael Peter Christen
d0aff91f23
fix for index import
2015-06-01 01:56:09 +02:00
Michael Peter Christen
34de1e8cbc
gzip compression will perform more efficiently and with better compression
...
level
2015-06-01 01:24:33 +02:00
Michael Peter Christen
98be59ce9c
full solr xml exports will now be automatically compressed during
...
export. That makes it possible to export a solr xml dump even if disc
space is low.
2015-05-30 19:02:54 +02:00
Michael Peter Christen
a1a8edfc0a
wrap HeapReader close() in a catch Throwable block to prevent that an
...
exception during close blocks the whole shutdown process
2015-05-30 17:54:02 +02:00
Michael Peter Christen
b43811d38c
added surrogate import process for exported solr dumps.
...
Just throw your solr dump file into DATA/SURROGATES/in/ and it will be
imported!
2015-05-30 13:19:59 +02:00
Michael Peter Christen
b77537294d
prevent disc usage when showing tray animation
2015-05-30 06:57:15 +02:00
Michael Peter Christen
eec78e1b0c
added intensity option to graphics
2015-05-30 06:31:08 +02:00
Michael Peter Christen
a5007f345e
re-licensing some of my old visualization classes under LGPL 2.1
2015-05-30 06:12:08 +02:00
Michael Peter Christen
c99a665593
adding a 3-pixel font generator made some time ago..
2015-05-30 06:01:52 +02:00
Michael Peter Christen
c7576d6028
added a full solr export to the IndexControlURLs_p.html servlet. The
...
export function is also now the default export option. The export file
format for a full solr export is very similar to a solr search result
xml, only the <lst name="responseHeader"> tag is missing.
The exported xml has a special line termination feature: all documents
will be exported into a single line without any CR in between. That
means that every document is completely inside a single line. While this
is not readable at all for humans, it is very useful for linux line
processing scripts, like grep. Using grep it will be easy to select
single documents which match for a given pattern.
Such dumps shall be importable with the DATA/SURROGATES/in import
function, but that import is not yet adapted to the new file format.
2015-05-29 15:05:52 +02:00
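The one-document-per-line layout described in the commit above lends itself to simple line filters. The following is a minimal sketch in Python, not part of the commit: it assumes each exported document sits on its own line starting with a <doc> element (as in a Solr result xml), and the function name, the sample lines and the "sku" field are hypothetical illustrations.

```python
import re
from typing import Iterable, Iterator


def select_documents(lines: Iterable[str], pattern: str) -> Iterator[str]:
    """Yield each single-line document whose text matches the regex."""
    matcher = re.compile(pattern)
    for line in lines:
        # Assumption: one complete <doc>...</doc> element per line;
        # wrapper lines such as <response> or <result> are skipped.
        if line.lstrip().startswith("<doc") and matcher.search(line):
            yield line.rstrip("\n")


# Hypothetical sample that mimics the described dump layout.
sample_dump = [
    '<?xml version="1.0" encoding="UTF-8"?>\n',
    "<response>\n",
    "<result>\n",
    '<doc><str name="sku">http://example.org/a</str></doc>\n',
    '<doc><str name="sku">http://example.net/b</str></doc>\n',
    "</result>\n",
    "</response>\n",
]

for doc in select_documents(sample_dump, r"example\.org"):
    print(doc)
```

For a real dump the same function can be fed an open file object instead of the sample list, which keeps memory use independent of the dump size.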
Michael Peter Christen
47682bf467
fix for unresolved pattern
2015-05-28 17:43:52 +02:00
Michael Peter Christen
197f7449e5
All entities of crawl profiles are now editable in the crawl profile
...
editor.
2015-05-28 16:07:40 +02:00
reger
1d8e1e4bac
- Image search expand box, adjust javascript hs padtominsize parameter, to make sure expand box doesn't shrink on small images
...
- ensure ImageResult.imagetext has value for the link text (use filename if no alt text given)
2015-05-27 02:31:13 +02:00
reger
8b35656007
remove hard throw exception in makeResultEntry
...
remove not used "share." peername.yacy url rewrite
2015-05-26 23:57:06 +02:00
reger
af57fbefad
use available mime (instead of null) on imageresult from metadatanode
2015-05-26 23:54:04 +02:00
reger
dd7782bac0
revert deletion of BinSearch
...
(accident)
2015-05-26 04:26:26 +02:00
reger
000dde9511
Eliminate duplication of values for search ResultEntry
...
by instantiation from URIMetadataNode, by eliminating differentiation of ResultEntry/URIMetadataNode.
- moved remaining ResultEntry functionality to URIMetadataNode
- for 1:1 functionality added a function makeResultEntry()
- removed ResultEntry
- refactored related code
Main difference is after makeResultEntry the text_t content is removed and alternative title/url strings for display are calculated.
Main difference left is, that
2015-05-26 04:15:00 +02:00
reger
29c4aa3991
fix compiler notification of missing serialID
...
from last commit
2015-05-25 21:51:32 +02:00
reger
3d53da8236
refactor ResultEntry to be based on MetadataNode/SolrDocument
...
to share/reuse common access routines
2015-05-25 21:28:48 +02:00
reger
d882991bc5
Implement sharing of ioDispatcher for term & citation index
...
as proposed in ioDispatcher description
2015-05-25 19:46:26 +02:00
reger
17e820cfd7
use doctype() in ViewFile to choose display routines
...
in preference of getfileExtension()
2015-05-25 00:08:38 +02:00
reger
370ba9da71
On imageSearch prefer mime to sort out non-image documents
...
Generalize the hack to prevent urls with just an img extension being returned
improving http://mantis.tokeek.de/view.php?id=528
2015-05-24 21:48:58 +02:00
reger
cd31633369
improve MultiprotocolURL.getFileExtension()
...
prevent string OOB while querypart contains a dot (return just "")
see log snippet in http://mantis.tokeek.de/view.php?id=533
2015-05-24 19:38:04 +02:00
reger
c60ccdfbcf
Increase IODispatcher dumpQueue size to 2 to reduce risk of concurrent emergency dump,
...
skip concurrent emergency merge
dealing with/see http://mantis.tokeek.de/view.php?id=566
2015-05-24 18:03:27 +02:00
reger
8a9622c31c
fix string OoB on getImagelinks with long alttext
...
in description calculation
2015-05-24 01:59:40 +02:00
reger
aa83931765
Convert content charset for display via CacheResource_p
...
Cached resource charset encoding might not fit to internal handling (using utf-8),
convert resource to utf-8
see http://mantis.tokeek.de/view.php?id=576
2015-05-23 20:31:37 +02:00
reger
3e742d1e34
Init remote crawler on demand
...
If remote crawl option is not activated, skip init of remoteCrawlJob to save the resources of queue and idling thread.
Deployment of the remoteCrawlJob is deferred until the option is activated.
2015-05-23 02:06:39 +02:00
Michael Peter Christen
dbf9e3503d
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
2015-05-22 11:39:00 +02:00
Michael Peter Christen
8b1a30be50
removed a -UNRESOLVED_PATTERN-
2015-05-22 11:22:36 +02:00
Michael Peter Christen
9938c81378
fix for division by zero
2015-05-22 11:15:53 +02:00
reger
13f013f64a
Limit extra sleep of BusyThread on LowMemCycle
2015-05-17 06:21:12 +02:00
reger
cd7c0e0aae
detail optimization of RecrawlThread
2015-05-17 00:13:00 +02:00
reger
ace71a8877
Initial (experimental) implementation of index update/re-crawl job
...
added to IndexReIndexMonitor_p.html
Selects existing documents from index and feeds them to the crawler.
currently only the field fresh_date_dt is used to determine documents for recrawl (fresh_date_dt:[* TO NOW-1DAY])
Documents are added in small chunks (200) to the crawler, only if no other crawl is running.
2015-05-16 01:23:08 +02:00
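The selection described in the commit above can be pictured as a Solr query that is fetched in chunks. The following Python sketch only builds the query string: the field name, the range expression and the chunk size of 200 come from the commit message, while the function name and the use of the standard Solr start/rows parameters for chunking are assumptions and may not reflect how the job is actually implemented.

```python
from urllib.parse import parse_qs, urlencode

CHUNK_SIZE = 200  # chunk size named in the commit message
RECRAWL_QUERY = "fresh_date_dt:[* TO NOW-1DAY]"  # selection from the commit message


def recrawl_query_params(start: int, rows: int = CHUNK_SIZE) -> str:
    """Build a URL-encoded Solr query string for one chunk of documents.

    Hypothetical helper: start/rows are standard Solr paging parameters,
    used here only to illustrate the idea of small chunks.
    """
    return urlencode({"q": RECRAWL_QUERY, "start": start, "rows": rows})


# Build the parameters for the third chunk and decode them again.
encoded = recrawl_query_params(start=2 * CHUNK_SIZE)
decoded = parse_qs(encoded)
print(decoded["q"][0], decoded["start"][0], decoded["rows"][0])
```

The encoded string would be appended to the select URL of whichever Solr endpoint is in use; that URL is deliberately left out here because it depends on the installation.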