yacy_search_server

mirror of https://github.com/yacy/yacy_search_server.git synced 2024-09-21 00:00:13 +02:00

Author	SHA1	Message	Date
sixcooler	ef6a64b2a4	Fix for upload-connection staying in blocked state. This was caused by reading via GZIP from a close-wait connection and caused high CPU and system loads. Solved by implementing handling of the RedListener.	2015-06-09 21:26:10 +02:00
reger	c973f94936	add log entry on release file delete by ResourceObserver	2015-06-08 03:17:12 +02:00
reger	121972752c	implement deleteOldDownloads in ResourceObserver on low diskspace - direct assign sb.observer (skip redundant InitThread)	2015-06-08 02:52:13 +02:00
Michael Peter Christen	0d5ac6e527	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	2015-06-07 22:25:26 +02:00
Michael Peter Christen	9c12555be5	added link to Snapshots in search results if the snapshot exists and option is set in ConfigSearchPage_p (this is a stub: we also need a visualization of pdf files!)	2015-06-07 20:37:37 +02:00
sixcooler	480e4a6a5c	Update to Jetty-9.2.11 - a bugfix-release that did not solve my Problems, but does not harm anything	2015-06-07 20:09:27 +02:00
reger	72f6a0b0b2	enhance recrawl job - allow to modify the query to select documents to process (after job has started) - allow to include failed urls (httpstatus <> 200)	2015-06-06 18:45:39 +02:00
Michael Peter Christen	e0a23c56c7	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	2015-06-05 08:32:55 +02:00
Michael Peter Christen	fb9e1dd3f5	servlet for latest commit	2015-06-05 07:22:35 +02:00
reger	5183ad718d	upd to poi-3.12.jar	2015-06-05 03:36:57 +02:00
reger	7478338a40	remove augmented parsing activation from frontend experimental implementation not used and based on error prone experimental rdfaparser	2015-06-05 00:51:00 +02:00
reger	11aa2edfe1	remove RDFa parser activation from frontend reason: experimental implementation of RDFa parser not executed (limited to special urls) but may cause error on normal html parsing due to an inputstream.reset	2015-06-05 00:15:16 +02:00
Michael Peter Christen	ff11ac89f7	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	2015-06-04 23:04:04 +02:00
Michael Peter Christen	5e2d23b7a0	removed the new index export method from the IndexControlURLs_p.html servlet and moved it to a new /IndexExport_p.html servlet. This servlet is now more prominent linked in the main menu under Production -> Index Export/Import	2015-06-04 23:03:46 +02:00
reger	64a7b0b140	Merge origin/master	2015-06-04 22:44:46 +02:00
reger	49b79987c9	remove obsolete searchfl work table was used to register urls with not complete words in snippet but is never accessed	2015-06-04 22:44:01 +02:00
sixcooler	4533f392b0	correct the dark themes to show also a dark navbar on searchresults	2015-06-04 22:15:38 +02:00
Michael Peter Christen	d0aff91f23	fix for index import	2015-06-01 01:56:09 +02:00
Michael Peter Christen	34de1e8cbc	gzip compression will perform more efficiently and with better compression level	2015-06-01 01:24:33 +02:00
Michael Peter Christen	98be59ce9c	full solr xml exports will now be automatically compressed during export. That makes it possible to export a solr xml dump even if disc space is low.	2015-05-30 19:02:54 +02:00
Michael Peter Christen	a1a8edfc0a	wrap HeapReader close() in a catch Throwable block to prevent an exception during close from blocking the whole shutdown process	2015-05-30 17:54:02 +02:00
Michael Peter Christen	b43811d38c	added surrogate import process for exported solr dumps. Just throw your solr dump file into DATA/SURROGATES/in/ and it will be imported!	2015-05-30 13:19:59 +02:00
Michael Peter Christen	b77537294d	prevent disc usage when showing tray animation	2015-05-30 06:57:15 +02:00
Michael Peter Christen	eec78e1b0c	added intensity option to graphics	2015-05-30 06:31:08 +02:00
Michael Peter Christen	a5007f345e	re-licensing some of my old visualization classes under LGPL 2.1	2015-05-30 06:12:08 +02:00
Michael Peter Christen	c99a665593	adding a 3-pixel font generator made some time ago..	2015-05-30 06:01:52 +02:00
Michael Peter Christen	c7576d6028	added a full solr export to the IndexControlURLs_p.html servlet. The export function is also now the default export option. The export file format for a full solr export is very similar to a solr search result xml, only the <lst name="responseHeader"> tag is missing. The exported xml has a special line termination feature: all documents will be exported into a single line without any CR in between. That means that every document is completely inside a single line. While this is not readable at all for humans, it is very useful for linux line processing scripts, like grep. Using grep it will be easy to select single documents which match a given pattern. Such dumps shall be importable with the DATA/SURROGATES/in import function, but that import is not yet adapted to the new file format.	2015-05-29 15:05:52 +02:00
Michael Peter Christen	47682bf467	fix for unresolved pattern	2015-05-28 17:43:52 +02:00
Michael Peter Christen	197f7449e5	All entities of crawl profiles are now editable in the crawl profile editor.	2015-05-28 16:07:40 +02:00
reger	1d8e1e4bac	- Image search expand box, adjust javascript hs padtominsize parameter, to make sure expand box doesn't shrink on small images - ensure ImageResult.imagetext has value for the link text (use filename if no alt text given)	2015-05-27 02:31:13 +02:00
reger	8b35656007	remove hard throw exception in makeResultEntry remove not used "share." peername.yacy url rewrite	2015-05-26 23:57:06 +02:00
reger	af57fbefad	use available mime (instead null) on imageresult from metadatanode	2015-05-26 23:54:04 +02:00
reger	dd7782bac0	revert deletion of BinSearch (accident)	2015-05-26 04:26:26 +02:00
reger	000dde9511	Eliminate duplication of values for search ResultEntry by instantiation from URIMetadataNode, by eliminating differentiation of ResultEntry/URIMetadataNode. - moved remaining ResultEntry functionality to URIMetadataNode - for 1:1 functionality added a function makeResultEntry() - removed ResultEntry - refactored related code Main difference is after makeResultEntry the text_t content is removed and alternative title/url strings for display are calculated. Main difference left is, that	2015-05-26 04:15:00 +02:00
reger	29c4aa3991	fix compiler notification of missing serialID from last commit	2015-05-25 21:51:32 +02:00
reger	3d53da8236	refactor ResultEntry to be based on MetadataNode/SolrDocument to share/reuse common access routines	2015-05-25 21:28:48 +02:00
reger	d882991bc5	Implement sharing of ioDispatcher for term & citation index as proposed in ioDispatcher description	2015-05-25 19:46:26 +02:00
reger	17e820cfd7	use doctype() in ViewFile to choose display routines in preference of getfileExtension()	2015-05-25 00:08:38 +02:00
reger	370ba9da71	On imageSearch prefer mime to sort out non-image documents Generalize the hack to prevent urls with just a img extension being returned improving http://mantis.tokeek.de/view.php?id=528	2015-05-24 21:48:58 +02:00
reger	cd31633369	improve MultiprotocolURL.getFileExtension() prevent string OOB while querypart contains a dot (return just "") see log snippet in http://mantis.tokeek.de/view.php?id=533	2015-05-24 19:38:04 +02:00
reger	c60ccdfbcf	Increase IODispatcher dumpQueue size to 2 to reduce risk of concurrent emergency dump, skip concurrent emergency merge dealing with/see http://mantis.tokeek.de/view.php?id=566	2015-05-24 18:03:27 +02:00
reger	8a9622c31c	fix string OoB on getImagelinks with long alttext in description calculation	2015-05-24 01:59:40 +02:00
reger	aa83931765	Convert content charset for display via CacheResource_p Cached resource charset encoding might not fit to internal handling (using utf-8), convert resource to utf-8 see http://mantis.tokeek.de/view.php?id=576	2015-05-23 20:31:37 +02:00
reger	3e742d1e34	Init remote crawler on demand If remote crawl option is not activated, skip init of remoteCrawlJob to save the resources of queue and idling thread. Deploy of the remoteCrawlJob deferred on activation of the option.	2015-05-23 02:06:39 +02:00
Michael Peter Christen	dbf9e3503d	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	2015-05-22 11:39:00 +02:00
Michael Peter Christen	8b1a30be50	removed a -UNRESOLVED_PATTERN-	2015-05-22 11:22:36 +02:00
Michael Peter Christen	9938c81378	fix for division by zero	2015-05-22 11:15:53 +02:00
reger	13f013f64a	Limit extra sleep of BusyThread on LowMemCycle	2015-05-17 06:21:12 +02:00
reger	cd7c0e0aae	detail optimization of RecrawlThread	2015-05-17 00:13:00 +02:00
reger	ace71a8877	Initial (experimental) implementation of index update/re-crawl job added to IndexReIndexMonitor_p.html Selects existing documents from index and feeds them to the crawler. currently only the field fresh_date_dt is used to determine documents for recrawl (fresh_date_dt:[* TO NOW-1DAY]) Documents are added in small chunks (200) to the crawler, only if no other crawl is running.	2015-05-16 01:23:08 +02:00
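
Commit c7576d6028 describes the full solr export as holding every document on a single line so that line-oriented tools such as grep can pick out matching documents, and commit 98be59ce9c adds compression of the dump during export. The following is a minimal Python sketch of that kind of line-based selection, not YaCy code. The function names, the `.gz` suffix check, and the assumption that each document line contains a `<doc>` element (as in a solr search result xml) are illustrative assumptions, not taken from the repository.

```python
import gzip
import re
from typing import Iterable, Iterator


def select_documents(lines: Iterable[str], pattern: str) -> Iterator[str]:
    """Yield each line that holds a <doc> element and matches the pattern.

    Assumption: one exported document per line, as the commit message
    for c7576d6028 describes.
    """
    matcher = re.compile(pattern)
    for line in lines:
        if "<doc>" in line and matcher.search(line):
            yield line.rstrip("\n")


def select_from_dump(path: str, pattern: str) -> Iterator[str]:
    """Stream a dump file (hypothetical path) without loading it fully.

    Assumption: compressed dumps carry a .gz suffix; anything else is
    read as plain text.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        yield from select_documents(handle, pattern)
```

Because the dump is read line by line, memory use stays bounded by the longest single document, which is the same property that makes grep practical on such files.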


11810 Commits