Commit Graph

221 Commits

Author SHA1 Message Date
reger
e9e0d63897 Add config option to show HostBrowser link in search result
- ConfigPortal: added checkbox Host Browser
- yacy.init: added search.result.show.hostbrowser as default = on (true)
- fix HostBrowser: broken link to protected WebStructurePicture for public user
2012-12-27 10:01:10 +01:00
Michael Peter Christen
98819ec3d9 use solr boost configuration to select search fields. At this time it is
possible to enter a negative boost value to switch that value off. This
might be different in the future with a better input interface.
2012-12-27 03:17:45 +01:00
Michael Peter Christen
72f165d58b added a Boost class which stores solr query boost values. The class can
be configured using the yacy.init file. The boost information is taken
from the configuration each time when a query to solr is done.
2012-12-02 16:54:29 +01:00
sixcooler
2d972f289a rise commitWithinMs to default-value from SwitchBoard
(result in lower hd-io)

no dots in memory-graph (there are to much of them)
2012-10-26 02:12:45 +02:00
Michael Peter Christen
42e525ca9a enhanced the host browser 2012-10-08 14:00:14 +02:00
sof
5cb244b79b Merge remote branch 'origin/master' 2012-10-05 18:54:39 +02:00
apfelmaennchen
88b062210c Added a parser for audio file tags (e.g. ID3 tags for MP3 files) based
on the jaudiotagger library. The parser is disabled by default as it
needs to store temporary files for non file:// protocols, which might be
disliked. For your local MP3-collection it loads nicely Artist,
Title, Album etc. from the audio files meta data.
2012-10-05 18:54:26 +02:00
Michael Peter Christen
3d33a5bdf6 turned the synonyms_t Text field into a multi-valued String field
synonyms_sxt
2012-10-02 11:13:06 +02:00
orbiter
a55e77a115 added twitter search heuristic 2012-09-13 23:53:53 +02:00
Michael Peter Christen
b2b516cc3e added a collection attribute to crawls and searches:
- a solr field collection_sxt can be used to store a set of crawl tags
- when this field is activated, a crawl tag can be assigned when crawls
are started
- the content of the collection field can be comma-separated, all of
them are assigned to the documents when they are indexed as result of
such a crawl start
- a search result can be drilled down to a specific collection; this is
currently only available in the solr interface and also in the gsa
interface using the 'site' option
- this adds a mandatory field for gsa queries (the google api demands
that field all the time)
2012-09-03 15:26:08 +02:00
cominch
dc468dad01 add content control features for custom filter lists 2012-08-29 09:04:28 +02:00
Michael Peter Christen
af764c106c re-activated audio and video search because they obviously work (!) 2012-08-22 01:56:13 +02:00
Michael Peter Christen
23226676c6 FOR THE BRAVE.. this is a forced migration to solr which is now ready
for production as a replacement of the metadata-db.
This intermediate release 1.041 will switch on the previously optional
solr index and the old metadata-db will still work as it did before.
Solr+metadata are accessed in mixed mode, no migration is done yet.
If this causes not a catastrophe until the end of the weekend, we will
do a YaCy 1.1 main release containing this as default.
2012-08-16 18:17:47 +02:00
cominch
e2119f4e76 augmented browsing: replace htmlparser by jsoup, which is more stable
and reliable
2012-08-14 10:06:12 +02:00
Michael Peter Christen
826967513b changed options in IndexFederated_p to switch on/off parts of the index
individually. The settings are experimental and the values of the
settings will be overwritten when an index migration from urldb to solr
starts.
2012-07-23 16:28:39 +02:00
Michael Peter Christen
0301aba1e9 removed unused method parameters 2012-07-05 10:23:07 +02:00
reger
067728bccc add search result heuristic. adding a crawl job with depth-1 for every displayed search result (crawling every external linked page of displayed search result pages) 2012-07-01 00:12:20 +02:00
Michael Peter Christen
9116013c64 - allow lazy initialization of solr value (if using 'lazy', then no
0-values and no empty strings are written). This may save a lot of
memory (in ram and on disc) if excessive 0-values or empty strings
appear)
- do not allow default boolean values for checkboxes because that does
not make sense: browsers may omit the checkbox attribute name if the box
is not checked. A default value 'true' would not comply with the
semantic of the browsers response.
- add a checkbox in IndexFederated_p for the lazy initialization of solr
fields.
2012-06-27 12:17:58 +02:00
Michael Peter Christen
c03d306afa shorter autocommit time (now: 1 second) to prevent that user cannot see
results in solr the first time they try it out. The value can now be
easily set to a higher number using the IndexFederated_p interface.
2012-06-26 14:53:45 +02:00
Michael Peter Christen
3fd4a01286 added option to record urls that are forwarded to the solr index 2012-06-26 13:54:48 +02:00
Michael Peter Christen
8dd469b9dd added option to configure the autocommit delay time of solr on-the-fly 2012-06-25 14:59:46 +02:00
Michael Peter Christen
b9dfca4b0a - fixed IndexFederated Servlet / a embedded Solr can now be selected
- added code stub for an embedded Solr but generation of Solr store is
still commented out (it works but is not yet ready for usage)
2012-06-25 11:34:38 +02:00
Michael Peter Christen
8738336408 set Xms lower than Xmx 2012-06-19 08:45:49 +02:00
Michael Peter Christen
96f6a5869f more robust OAI-PMH client (large time-out, three re-tries). OAI-PMH
server appeart to be very slow sometimes
2012-06-16 22:30:31 +02:00
Michael Peter Christen
6d17686258 made triplestore persistent by default
added a size display in triplestore servlet
2012-06-15 19:13:07 +02:00
cominch
3c255c025b Show tags in search results (if activated in ConfigPortal_p.html) 2012-06-15 10:43:05 +02:00
Michael Peter Christen
a5cdfb91de - fixed Cache link (below snippet)
- added 'Augmented Proxy' link below snippet
- added configuration options for augmented proxy
2012-06-14 19:55:34 +02:00
Roland 'Quix0r' Haeder
af5a597e47 Scroogle is not comming back, remove dead code
Conflicts:
	source/net/yacy/search/Switchboard.java
2012-06-10 23:38:41 +02:00
cominch
90512640bf Added config switches for custom parser
Conflicts:
	source/net/yacy/document/TextParser.java
2012-06-10 12:49:36 +02:00
cominch
5d20cd324a Add Triplestore and RDF query interface
Conflicts:
	build.xml
	defaults/yacy.init
	source/net/yacy/interaction/AugmentHtmlStream.java
2012-06-10 10:35:59 +02:00
Michael Peter Christen
41c02cb10e - less restrictions for usage of Table RAM copy
- new limit to use the table copy (instead of flag): 400MB available. If
less is available, then a copy is never used. If more is available, then
it can be used if there is a remaining space of at least 200MB
- flush caches more often: flush the Digest cache
2012-06-08 12:48:25 +02:00
Michael Peter Christen
8002fd2578 use less cache space since a large cache would cause more memory usage
in index files.
2012-06-06 14:17:42 +02:00
Michael Peter Christen
5aee19daa4 added show from cache in search results (not yet finished) 2012-06-04 23:44:26 +02:00
Michael Peter Christen
0d32a766ed relax verify attribute for search widget to make it faster:
set to "cacheonly"
2012-05-20 00:50:54 +02:00
Michael Peter Christen
db9d81cb7a ups 2012-05-16 01:04:08 +02:00
Michael Peter Christen
e7e381d110 added configuration to switch off redirection following in crawler 2012-05-15 12:25:46 +02:00
Michael Peter Christen
99c74699de removed scroogle (scroogle is dead) 2012-02-25 12:57:59 +01:00
Michael Peter Christen
4c5edab1ec added option to have exception search result windows 2012-01-26 15:32:30 +01:00
Michael Peter Christen
696ee5fc16 removed pdf from default parser deny list 2012-01-23 17:27:58 +01:00
Lotus
c73af39e54 refactoring of tray icon class,
now uses Java 6 methods natively
2012-01-18 20:47:09 +01:00
Michael Peter Christen
0bcef2d156 added feature as requested in
http://forum.yacy-websuche.de/viewtopic.php?f=18&t=3461
The search can now be configured with a non-display host list.
the search will always exlude the given list of host unless they are
requested directly using the host navigation
2011-12-13 00:16:05 +01:00
Michael Christen
17f962fceb translator updates:
- config string for chinese
- do not copy the language file to DATA/LOCALE any more (and do not use
them there, this is really confusing for new translators)
2011-12-08 10:25:26 +01:00
Michael Christen
c715d19c09 fixes for dependency on svn 2011-12-06 22:05:22 +01:00
Michael Christen
f62e6fb438 less frequent DHT distribution to reduce the load a bit on every peer 2011-12-05 15:45:33 +01:00
Michael Christen
9dbc93613e now that the whole world knows that we actually do p2p and not
metasearch we can support a default look-up to scroogle to gain more
attention to people who say that your search results are incomplete
2011-12-05 11:52:24 +01:00
orbiter
f9216e388c - faster ping to clean up old peers faster
- clean up more news

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8125 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-30 21:21:16 +00:00
orbiter
ac5bda205f - removed lower page navigation (it never looks nice)
- added visibility of metadata and parser in search results since that shows what YaCy can do in a nice way

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8091 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-24 13:30:42 +00:00
orbiter
c659310e89 - removed option to search for audio, video and applications. These things are still experimental and should not be shown to new users since this would cause them to argue that YaCy does not work. The functions are stil available, because:
- added a configuration option in ConfigPortal to swtich the search media types on or off

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8090 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-24 13:07:03 +00:00
orbiter
6cd27473f5 - better default values for caching and cache usage
- set new caching and verification behavior according to use case automatically

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8087 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-24 10:22:02 +00:00
orbiter
5866c73a09 fix for compare search: use scroogle instead of bing and get a default search if configured search engine is not available
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8074 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-23 15:17:46 +00:00