Commit Graph

4869 Commits

Author SHA1 Message Date
Michael Peter Christen
36e623d8bf enhanced metadata enrichment for media file type search:
- Web servers may now deliver YaCy-specific http header fields with a
title and keywords. The new http header fields are:
X-YaCy-Media-Title - to be used for media (image, audio, video) titles
X-YaCy-Media-Keywords - to be used for media (image, audio, video)
keywords
- both fields are written to document fields title and keywords and are
searched also during image search.
- to make the usage of arbitrary http header fields (including these new
fields) possible in the /api/push_p.json servlet, a new POST argument is
also introduced to push http header fields. The new POST attribute is
named "responseHeader-X" (where X is the counter). This attribute may be
used several times as a multi-attribute; each occurrence can be filled with
one http header line.
- see /api/push_p.html for examples
2014-06-26 13:02:35 +02:00
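As a rough illustration of the "responseHeader-X" attribute described in the commit above, the following Python sketch maps header name/value pairs to numbered form fields. The counter starting at 0, the "Name: value" line format, and the sample title and keywords are assumptions made for this example; /api/push_p.html remains the reference for the real usage.

```python
def push_header_fields(headers):
    """Map (name, value) header pairs to responseHeader-X form fields.

    Assumptions (not confirmed by the commit message): the counter X
    starts at 0 and each field holds one "Name: value" header line.
    """
    fields = {}
    for counter, (name, value) in enumerate(headers):
        fields[f"responseHeader-{counter}"] = f"{name}: {value}"
    return fields


# Hypothetical media metadata for a pushed image.
fields = push_header_fields([
    ("X-YaCy-Media-Title", "Sunset over the harbour"),
    ("X-YaCy-Media-Keywords", "sunset harbour boats"),
])

for key, line in fields.items():
    print(key, "=", line)
```

The resulting dictionary could be merged into the form data of a POST request to /api/push_p.json; the remaining arguments of that request are not shown here.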
reger
a88ea14e09 harmonize use of style for "delete" button
- apply the mostly used btn-danger class
2014-06-22 23:33:59 +02:00
Michael Peter Christen
8fd72b5e8b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2014-06-20 13:57:06 +02:00
Michael Peter Christen
81d0f01a6f added 'synchronous' and 'commit' flags in push api 2014-06-20 13:56:55 +02:00
reger
5043eff33a move page navigation below results (image search)
force page navigation to be displayed below results in image search for any number of displayed images, instead of to the right of the last image.
2014-06-20 01:02:43 +02:00
Marc Nause
f443cfa32d Improvements and bugfixes for recording actions of blacklist API. 2014-06-17 22:54:47 +02:00
Michael Peter Christen
0ba6b98d5b fix for broken json 2014-06-17 11:36:20 +02:00
orbiter
4177c9cf05 fix for crawl start check 2014-06-15 22:50:04 +02:00
orbiter
0bbb5040b8 Merge branch 'master' of git@gitorious.org:yacy/rc1.git 2014-06-15 12:38:52 +02:00
orbiter
9d5d86cd03 Added filter query options to the ranking servlet /RankingSolr_p.html.
Filter queries are not actually related to ranking, but user requests
have pointed out that specific boost queries to move results to the end
of the result list are not sufficient. Such boost filters may be better
executed as actual filters, and therefore such a filter can now be
statically applied to every search request. A typical use could be the
expression "http_unique_b:true AND www_unique_b:true" which uses the
recently introduced fields http_unique_b and www_unique_b which are true
only for one of the alternatives with/without http(s) and with/without
prefix 'www.' in host names.
2014-06-15 12:38:30 +02:00
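To make the difference between a boost and a filter concrete, here is a small Python sketch that attaches the example expression from the commit above to a Solr request as a filter query. The `q` and `fq` names are standard Solr request parameters; how /RankingSolr_p.html stores and applies the filter internally is not shown in the commit, so the function and its name are illustrative only.

```python
from urllib.parse import parse_qs, urlencode


def solr_params(query, filter_clauses):
    """Build a Solr query string with an optional static filter query.

    Illustrative helper: clauses are joined with AND and sent as a
    single fq parameter, which restricts results without scoring them.
    """
    params = [("q", query)]
    if filter_clauses:
        params.append(("fq", " AND ".join(filter_clauses)))
    return urlencode(params)


encoded = solr_params("www", ["http_unique_b:true", "www_unique_b:true"])
decoded = parse_qs(encoded)
print(decoded["fq"][0])
```

Unlike a boost query, which only moves documents up or down the result list, a filter query removes non-matching documents from the result set altogether.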
Michael Peter Christen
d2151857f1 Added collection navigation:
The collection field (can be filled e.g. in Crawl Start) can be used to
add categories to YaCy index entries. The usage of that field was
restricted to solr searches and post argument filters as implemented in
commit f7571386a3.
This commit extends collections to a full navigation option in the
standard YaCy search interface. The field is not active by default but
can be activated easily in the /ConfigSearchPage_p.html servlet (just
check the 'Collection' facet field). Collections can now be used for (at
least) two purposes:
- to provide search tenants (through post argument collection)
- to provide self-made category navigation
Search requests may now have (independently from switched on or off
collection facet) a "collection:<collection-name>" modifier attached;
furthermore collection names may use disjunctions using the '|' pipe
symbol. For example, this is a valid search request:
www collection:user|proxy
2014-06-15 12:11:23 +02:00
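The modifier syntax from the commit above can be sketched in a few lines of Python. This is not YaCy's query parser; it only shows how a "collection:<collection-name>" token with '|' disjunctions could be separated from the remaining search terms, and the function name is made up for the example.

```python
def split_collection_modifier(query):
    """Separate collection modifiers from the plain search terms.

    Returns a tuple (remaining_query, collection_names). Collection
    names joined by '|' are treated as alternatives (a disjunction).
    """
    terms, collections = [], []
    for token in query.split():
        if token.startswith("collection:"):
            names = token[len("collection:"):].split("|")
            collections.extend(name for name in names if name)
        else:
            terms.append(token)
    return " ".join(terms), collections


remaining, collections = split_collection_modifier("www collection:user|proxy")
print(remaining, collections)
```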
Michael Peter Christen
74c249288a added a push api to make it possible to upload files directly without
crawling to the YaCy indexer. Files are uploaded using POST multipart
requests; multiple file uploads are possible as well. Each file has
the file date and mime type attached; the mime type is used to get the right
parser for the submitted data. A URL is also submitted, which is
assigned to the document.
The CrawlSwitchboard has a new option for default Crawl Profiles which
are assigned dynamically from the new push interface.
2014-06-12 18:10:07 +02:00
reger
c798a9d1bb fix unresolved pattern in yacysearch.rss title
and rss xml error due to html & encoding in url entries
2014-06-07 03:01:26 +02:00
Michael Peter Christen
e64be5dcad in case that the network is switched to any other than freeworld, RWIs
are disabled. This is a temporary fix. There must be a better way to
determine if RWIs are to be switched on or off.
2014-06-04 13:59:37 +02:00
Michael Peter Christen
87f171675b doing index deletions using a get string which makes it easier to
copy-paste deletion examples (see: #EuGH :( )
2014-06-04 12:09:49 +02:00
Michael Peter Christen
a2f800cd8f fix for bad String conversion 2014-06-04 12:07:07 +02:00
Michael Peter Christen
b3b174e2b8 fixed webgraph postprocessing and status display in Crawler_p servlet 2014-06-02 15:06:38 +02:00
reger
7a52a6ba3f add links to port config in status panel
- pom upd to match javadoc location
2014-06-02 02:11:54 +02:00
reger
c3e40c82fe make https port setting changeable via front end somewhere
(chosen Http Networking page /Settings_p.html?page=http )
2014-06-01 03:15:38 +02:00
Michael Peter Christen
698f053658 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2014-06-01 01:02:12 +02:00
Michael Peter Christen
f23c4142e0 added option to configure a custom user agent within allip networks 2014-06-01 01:02:03 +02:00
reger
8e233e2eb4 - fix typo in Message_p (defaultpath)
- use more existing switchboardconstants for getproperties
- replace deprecated call defaultservlet
2014-06-01 00:20:25 +02:00
Michael Peter Christen
8ad41a882c fixed several problems with postprocessing:
- unique-postprocessing was destroying results from other
postprocessings; removed cross-updates as they were not necessary
- unique-postprocessing did not restrict on same protocol
- inefficient concurrent update cache was redesigned completely
- increased limits for concurrent blocking queues to prevent early
time-out
2014-05-29 13:24:24 +02:00
Michael Peter Christen
640b684bb6 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 2014-05-28 19:19:17 +02:00
Michael Peter Christen
2f5477ea59 a try to fix the mixed up terms 'Active' -> 'Senior' and 'Passive' ->
'Junior'
2014-05-28 18:48:54 +02:00
reger
ca5437dd50 fix crawl of file:// , also http://mantis.tokeek.de/view.php?id=149
local files can be crawled (intranet mode); url parsing fixed according to RFC 1738 (for unix and windows)
for win like file:///c:/tmp or file://localhost/c:/tmp
for linux like file:///tmp or file://localhost/tmp
Host is ignored and path must be absolute
2014-05-28 03:01:34 +02:00
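The URL forms listed in the commit above can be illustrated with a short Python sketch that uses the standard library. It follows the stated rules (host ignored, path must be absolute) but is not the YaCy implementation; the function name and the drive-letter check are assumptions made for this example.

```python
import re
from urllib.parse import unquote, urlsplit


def file_url_to_path(url):
    """Turn a file:// URL into a local path, ignoring the host part."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file URL: " + url)
    # parts.netloc (e.g. "localhost") is deliberately ignored.
    path = unquote(parts.path)
    if not path.startswith("/"):
        raise ValueError("path must be absolute: " + url)
    # Windows form: "/c:/tmp" becomes "c:/tmp".
    if re.match(r"^/[A-Za-z]:", path):
        return path[1:]
    return path


for example in ("file:///c:/tmp", "file://localhost/c:/tmp",
                "file:///tmp", "file://localhost/tmp"):
    print(example, "->", file_url_to_path(example))
```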
reger
66f6797f52 make config search page layout closer to actual page appearance 2014-05-25 01:06:39 +02:00
sixcooler
5b1c4ef191 Monitoring and limit connection-count for Jetty 2014-05-22 22:16:39 +02:00
orbiter
ce1dbfeb0f fix appearance of image search thumbnails. 2014-05-22 15:01:58 +02:00
orbiter
6daae59479 switch on core.service.rwi when switching back from portal mode to p2p
mode
2014-05-22 12:55:22 +02:00
Michael Peter Christen
f0db501630 better handling of ranking parameters and new default values for date
navigation which is done using ranking in solr.
2014-05-22 03:01:07 +02:00
Michael Peter Christen
2520590b45 migrated from pdfbox 1.8.4 to 1.8.5. They have a very long bugfix list
for that update:
http://www.apache.org/dist/pdfbox/1.8.5/RELEASE-NOTES.txt
2014-05-21 22:48:41 +02:00
Michael Peter Christen
6634b5b737 debug code for index distribution testing 2014-05-21 18:20:16 +02:00
Michael Peter Christen
89e13fa34e fixed bug in test function 2014-05-21 15:31:47 +02:00
Marc Nause
4723329e29 Improved blacklist XML/JSON API. 2014-05-19 20:51:43 +02:00
reger
f91b2f51ae fix: load_Rss remove feed: too many parameters for get
use form post method
2014-05-18 22:41:09 +02:00
orbiter
c028ae9b09 Merge branch 'master' of git@gitorious.org:yacy/rc1.git 2014-05-18 21:21:17 +02:00
reger
e31493e139 "Use remote proxy for yacy" has no function, remove option and related config item
see/fix bug http://mantis.tokeek.de/view.php?id=23
http://mantis.tokeek.de/view.php?id=189
2014-05-17 23:36:59 +02:00
reger
89e2c5e884 fix: allow enable of CrawlStartExpert.html #file 2014-05-17 22:56:15 +02:00
reger
1b37b12998 fix: CrawlStartExpert.html # From File with missing filename
- crawlName must not be empty
- crawlingFile must not be empty
2014-05-17 21:34:23 +02:00
orbiter
0d8072aa99 removed warnings 2014-05-13 22:29:05 +02:00
orbiter
be7c99dbe8 switched menu position of ConfigPortal.html and ConfigSearchBox.html 2014-05-13 08:14:56 +02:00
Michael Peter Christen
a1ac4c3b76 automatically clear graphics cache 2014-05-12 15:45:25 +02:00
reger
f87ac716f3 improve IndexDeletion by query
transparently adding text_t as pseudo default search field if no fieldname (no ':') is included.
addressing bug report http://mantis.tokeek.de/view.php?id=274
2014-05-12 00:12:05 +02:00
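The rule from the commit above (prefix the query with text_t when it contains no ':') is simple enough to sketch in Python. The function name is hypothetical and the check mirrors only what the commit message states, not the actual servlet code.

```python
def with_default_field(query, default_field="text_t"):
    """Prefix a deletion query with a default field if none is named.

    A query counts as fielded as soon as it contains a ':' character,
    which is the only criterion the commit message mentions.
    """
    query = query.strip()
    if ":" in query:
        return query
    return f"{default_field}:{query}"


print(with_default_field("yacy"))
print(with_default_field("title:yacy"))
```

Multi-word input is passed through unchanged after the prefix, so in this sketch only the first word would be bound to the default field by Solr.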
reger
e9060d31bd update to Jetty 9
besides adjustments in code it makes the servlet settings in web.xml significant.
This applies to solr, gsa and proxy servlet. There is no longer a default setup in code during init (as jetty 9 checks for double definition).
2014-05-11 01:53:11 +02:00
orbiter
b9c1a61814 added a peername=<peername> property in the seedlist API 2014-05-08 07:41:40 +02:00
orbiter
c637955e67 fix for navigation steering / p2p mode
see also:
http://forum.yacy-websuche.de/viewtopic.php?f=5&t=5198&p=29958#p29958
2014-05-06 05:58:51 +02:00
Marc Nause
f98ccf952f Improved Blacklist API:
*) added JSON support
*) fixed Exception in case of missing parameters
*) renamed parameter for items in "add entry" and "delete entry" from
"entry" to "item" to match term in XML
2014-05-05 23:16:01 +02:00
reger
91bd384cf6 fix input-group layout on index.html
see bug http://mantis.tokeek.de/view.php?id=391
2014-05-03 21:55:10 +02:00
Marc Nause
0d88f292dc Key for parameter "blacklist name" is "list" in all servlets now. 2014-05-02 14:18:52 +02:00