Commit Graph

62 Commits

Author SHA1 Message Date
luc
befb2415f8 Corrected frames preview that could display incorrectly in local
administration mode.
2015-12-16 02:23:58 +01:00
luc
8c4ab9c76b Added an option to limit the size of remote Solr documents put to the
local index. See mantis #626.
2015-12-16 02:20:03 +01:00
reger
7cda48a9d6 add hint to "default max results per page" limit on ConfigPortal
(limit is applied in yacysearch & max. total results by sum result-stack size)
- remove obsolete search.navigation prop (has moved to ConfigSearchPage_p)
2015-12-09 00:49:38 +01:00
reger
a60b1fb6c2 differentiate api call getLocalPort() from getConfigInt() 2015-10-31 23:09:03 +01:00
reger
4eb89d7f15 revert clickservlet
(the default was indeed a mistake)
2015-01-05 09:10:20 +01:00
reger
d44d8996d0 Added a “don't store remote search results” option
This is intended for peers who want to participate in the P2P network but don't wish to fill up their index with the metadata of every received search result.
The DHT transfer is not affected by this option (and will work as usual, so that a peer disabling the new store-to-index switch still receives and holds the metadata according to DHT rules).
The downside for the local peer is that search speed will not improve if search terms are only available remotely or through quick hits in the local index.

To still be able to improve the local index, a Click-Servlet option was also added.
If switched on, all search result links point to this servlet, which forwards the user's browser (by HTML header) to the desired page and feeds the page to the fulltext index.
The servlet accepts a parameter defining the action to perform (see defaults/web.xml: index, crawl, crawllinks).

The option check-boxes are placed in ConfigPortal.html
2015-01-04 11:10:45 +01:00
reger
e177d69387 remove obsolete config footer option (ConfigPortal user.login)
no footer or footer-option in use

remove unused yacy.init item allowUnlimitedReceiveIndexFrom
2014-12-29 03:50:00 +01:00
Michael Peter Christen
bef689d0a2 NPE fix 2014-12-23 00:30:34 +01:00
Michael Peter Christen
0bfc69b29b more ipv6 bugfixes 2014-10-08 12:38:56 +02:00
Marc Nause
1e6e69bc40 Finished implementation of UPNP:
*) will try other ports if YaCy standard ports are not available
*) distinguish between internal and external port (not sure if this
works 100%)

Still to add: property in config to enter own external port (in case of
manually configured NAT)
2014-10-07 13:10:06 +02:00
Michael Peter Christen
d2151857f1 Added collection navigation:
The collection field (can be filled e.g. in Crawl Start) can be used to
add categories to YaCy index entries. The usage of that field was
restricted to solr searches and post argument filters as implemented in
commit f7571386a3.
This commit extends collections to a full navigation option in the
standard YaCy search interface. The field is not active by default but
can be activated easily in the /ConfigSearchPage_p.html servlet (just
check the 'Collection' facet field). Collections can now be used for (at
least) two purposes:
- to provide search tenants (through post argument collection)
- to provide self-made category navigation
Search requests may now have (independently of whether the collection
facet is switched on or off) a "collection:<collection-name>" modifier attached;
furthermore, collection names may use disjunctions using the '|' pipe
symbol. For example, this is a valid search request:
www collection:user|proxy
2014-06-15 12:11:23 +02:00
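The "collection:" modifier described in commit d2151857f1 can be illustrated with a minimal sketch. The class CollectionModifier and its methods are hypothetical names chosen for this example and are not taken from the YaCy code base; the sketch only assumes the syntax given in the commit message (a "collection:" prefix and '|' as the disjunction symbol).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not YaCy's actual query parser.
final class CollectionModifier {
    static final String PREFIX = "collection:";

    // plain search terms, in the order they appeared
    final List<String> terms = new ArrayList<>();
    // collection names given in the modifier; any one of them may match
    final Set<String> collections = new LinkedHashSet<>();

    static CollectionModifier parse(String query) {
        CollectionModifier result = new CollectionModifier();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) continue;
            if (token.startsWith(PREFIX)) {
                // '|' separates alternative collection names
                for (String name : token.substring(PREFIX.length()).split("\\|")) {
                    if (!name.isEmpty()) result.collections.add(name);
                }
            } else {
                result.terms.add(token);
            }
        }
        return result;
    }

    // With no modifier present, every collection is accepted.
    boolean matches(String documentCollection) {
        return collections.isEmpty() || collections.contains(documentCollection);
    }
}
```

Parsing the request "www collection:user|proxy" with this sketch yields the single term "www" and the two accepted collections "user" and "proxy".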
Michael Peter Christen
da86f150ab - added a new Crawler Balancer: HostBalancer and HostQueues:
This organizes all urls to be loaded in separate queues for each host.
Each host separates the crawl depth into its own queue. The primary
rule for urls taken from any queue is that the crawl depth is minimal.
This produces a crawl depth which is identical to the clickdepth.
Furthermore, the crawl is able to create a much better balancing over
all hosts which is fair to all hosts that are in the queue.
This process will create a very large number of files for wide crawls in
the QUEUES folder: for each host a directory, for each crawl depth a
file inside the directory. A crawl with maxdepth = 4 will be able to
create tens of thousands of files. To be able to use that many file readers, it
was necessary to implement a new index data structure which opens the
file only if an access is wanted (OnDemandOpenFileIndex). The usage of
such an on-demand file reader shall prevent the number of open file
pointers from exceeding the system limit, which is usually about 10,000 open
files. Some parts of YaCy had to be adapted to handle the crawl depth
number correctly. The logging and the IndexCreateQueues servlet had to
be adapted to show the crawl queues differently, because the host name
is attached to the port on the host to differentiate between http,
https, and ftp services.
2014-04-16 21:34:28 +02:00
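The queue organisation described in commit da86f150ab can be sketched in memory as follows. This is only an illustration under assumptions: the class name HostQueuesSketch is hypothetical, the real implementation keeps one file per host and crawl depth rather than in-memory queues, and rotating the hosts after each pop is an assumed way of achieving the fairness the message mentions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch; not the YaCy HostBalancer itself.
final class HostQueuesSketch {
    // host -> (crawl depth -> FIFO queue of urls); map order doubles as rotation order
    private final Map<String, TreeMap<Integer, Deque<String>>> hosts = new LinkedHashMap<>();

    void push(String host, int depth, String url) {
        hosts.computeIfAbsent(host, h -> new TreeMap<>())
             .computeIfAbsent(depth, d -> new ArrayDeque<>())
             .add(url);
    }

    // Returns the next url to load, or null if nothing is queued.
    String pop() {
        if (hosts.isEmpty()) return null;
        // primary rule: the crawl depth of the url taken must be minimal
        int minDepth = Integer.MAX_VALUE;
        for (TreeMap<Integer, Deque<String>> depths : hosts.values()) {
            minDepth = Math.min(minDepth, depths.firstKey());
        }
        for (Map.Entry<String, TreeMap<Integer, Deque<String>>> e : hosts.entrySet()) {
            TreeMap<Integer, Deque<String>> depths = e.getValue();
            if (depths.firstKey() != minDepth) continue;
            String host = e.getKey();
            Deque<String> queue = depths.get(minDepth);
            String url = queue.poll();
            if (queue.isEmpty()) depths.remove(minDepth);
            // move the host to the end so that other hosts are served first next time
            hosts.remove(host);
            if (!depths.isEmpty()) hosts.put(host, depths);
            return url;
        }
        return null; // not reached: some host always holds the minimal depth
    }
}
```

In this sketch, urls of depth 0 from all hosts are handed out in turn before any url of depth 1 is returned.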
orbiter
3c8d6e1eee added adminAccount switch to ConfigAccounts_p servlet to switch on
protection of all pages; some refactoring as well
2014-03-20 22:11:49 +01:00
reger
365f77ea8c make internal page links relative to ease any future development for context aware servlets
note also http://bugs.yacy.net/view.php?id=106
2014-02-10 21:40:42 +01:00
Michael Peter Christen
5e31bad711 - the webgraph shall store all links which appear on a web page and not
only the unique links! This made it necessary to adapt a large portion of the
parser and link processing classes to carry a different
type of link collection, which carries the property attributes that are
attached to web anchors.
- introduction of a new URL class, AnchorURL
- the other url classes, DigestURI and MultiProtocolURI, have been renamed
and refactored to fit into a new document package schema, document.id
- cleanup of net.yacy.cora.document package and refactoring
2013-09-15 00:30:23 +02:00
Michael Peter Christen
76afcccaaf fix for default boolean post values: the default value MUST NOT be TRUE,
because it's normal that a boolean value is missing in the post argument
if a checkbox is not selected.
Added also some style enhancements to IndexFederated, removed the Solr
attachment manual and replaced it with a link to the wiki which explains
this in more detail.
2013-07-31 10:49:26 +02:00
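The reasoning in commit 76afcccaaf can be shown with a small sketch. The class PostArgs and its methods are hypothetical names for this illustration and do not claim to match YaCy's servlet API. The sketch relies on standard HTML form behaviour: an unchecked checkbox is simply absent from the submitted arguments, and a checked one without an explicit value is submitted as "on".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a servlet's post arguments; not YaCy's own class.
final class PostArgs {
    private final Map<String, String> values = new HashMap<>();

    PostArgs put(String key, String value) {
        values.put(key, value);
        return this;
    }

    // Generic accessor: a missing key yields the caller's default.
    boolean getBoolean(String key, boolean dflt) {
        String v = values.get(key);
        if (v == null) return dflt;
        return v.equals("on") || v.equals("true") || v.equals("1");
    }

    // Checkbox accessor: a missing key means "not checked", so the default is false.
    boolean getCheckbox(String key) {
        return getBoolean(key, false);
    }
}
```

With a default of true, an unchecked box would be read back as checked, which is the mistake the commit message warns about.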
Michael Peter Christen
4c242f9af9 always use a default value for boolean options to have transparency for
the outcome if the attribute is missing in servlets
2013-07-25 12:17:29 +02:00
Michael Peter Christen
5878c1d599 - refactoring of log to ConcurrentLog:
JDK-based loggers tend to block
at java.util.logging.Logger.log(Logger.java:476) in concurrent
environments. This makes logging a main performance issue. To overcome
this problem, this is an add-on to JDK logging which puts log entries on a
concurrent message queue and logs the messages one by one using a
separate process.
- FTPClient uses the concurrent logging instead of the log4j logger
2013-07-09 14:28:25 +02:00
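The idea behind commit 5878c1d599 can be sketched as a queue in front of java.util.logging. The class QueuedLog is a hypothetical name for this illustration, and using a single daemon thread as the separate worker is an assumption of the sketch rather than a description of ConcurrentLog.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of queued logging; not the YaCy ConcurrentLog class.
final class QueuedLog {
    private static final class Entry {
        final Logger logger;
        final Level level;
        final String message;
        Entry(Logger logger, Level level, String message) {
            this.logger = logger;
            this.level = level;
            this.message = message;
        }
    }

    // marker entry that tells the worker to stop
    private static final Entry POISON = new Entry(null, null, null);

    private final BlockingQueue<Entry> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    QueuedLog() {
        worker = new Thread(() -> {
            try {
                Entry e;
                // only this thread calls Logger.log, one entry at a time
                while ((e = queue.take()) != POISON) {
                    e.logger.log(e.level, e.message);
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }, "queued-log-worker");
        worker.setDaemon(true);
        worker.start();
    }

    // Callers only enqueue; they never wait for the logger itself.
    void log(Logger logger, Level level, String message) {
        queue.add(new Entry(logger, level, message));
    }

    // Writes out everything queued so far, then stops the worker.
    void shutdown() {
        queue.add(POISON);
        try {
            worker.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The trade-off of this design is that log entries appear slightly later than the events they describe, and entries still in the queue are lost if the JVM exits without a call to shutdown().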
Michael Peter Christen
0c5bed7e2c added configuration option for greedy learning function to ConfigPortal
servlet
2013-06-28 15:31:36 +02:00
Michael Peter Christen
8ea6ddf636 removed attributes from ConfigPortal.html which are redundant to
ConfigSearchPage_p.html
2013-06-28 14:17:14 +02:00
Michael Peter Christen
fd1776a3b0 added a new 'Citations' function: each search result item can now be
explored for citations within other documents. A click on the
'Citations' link shows an analysis with all text lines in the document
each with a complete list of documents which contain the same line. A
second section shows the linking documents in ascending order of number
of citations from the original document. Because documents from
different hosts are most interesting here, they are listed at the top of
the page as possible 'copypasta' source.
2013-06-12 15:02:49 +02:00
reger
1fb452174a read defaults from yacy.init for "Set to Defaults" button 2013-01-05 20:47:18 +01:00
reger
e9e0d63897 Add config option to show HostBrowser link in search result
- ConfigPortal: added checkbox Host Browser
- yacy.init: added search.result.show.hostbrowser as default = on (true)
- fix HostBrowser: broken link to protected WebStructurePicture for public user
2012-12-27 10:01:10 +01:00
Michael Peter Christen
00c1c777fa refactoring 2012-09-21 15:48:16 +02:00
Michael Peter Christen
9116013c64 - allow lazy initialization of solr value (if using 'lazy', then no
0-values and no empty strings are written). This may save a lot of
memory (in RAM and on disk) if excessive 0-values or empty strings
appear.
- do not allow default boolean values for checkboxes because that does
not make sense: browsers may omit the checkbox attribute name if the box
is not checked. A default value 'true' would not comply with the
semantics of the browser's response.
- add a checkbox in IndexFederated_p for the lazy initialization of solr
fields.
2012-06-27 12:17:58 +02:00
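The 'lazy' behaviour described in commit 9116013c64 amounts to skipping fields whose value is 0 or an empty string. A minimal sketch follows; the class LazyDocument is a hypothetical name, and the real code writes into Solr documents rather than into a plain map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of lazy field writing; not YaCy's Solr schema code.
final class LazyDocument {
    private final boolean lazy;
    final Map<String, Object> fields = new LinkedHashMap<>();

    LazyDocument(boolean lazy) {
        this.lazy = lazy;
    }

    // In lazy mode, empty (or missing) strings are not written at all.
    void add(String name, String value) {
        if (lazy && (value == null || value.isEmpty())) return;
        fields.put(name, value);
    }

    // In lazy mode, 0-values are not written at all.
    void add(String name, int value) {
        if (lazy && value == 0) return;
        fields.put(name, value);
    }
}
```

A reader of such a document then has to treat a missing field as 0 or as the empty string, which is the price for the smaller index.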
cominch
c63c3a4495 Show additional interaction elements in footer section on each page, if
activated in ConfigPortal.html.
This footer is also visible in augmented browsing proxy mode.
2012-06-20 18:04:23 +02:00
cominch
84a11ec48c Corrected loading of default page settings on ConfigPortal.html 2012-06-20 07:55:28 +02:00
cominch
3c255c025b Show tags in search results (if activated in ConfigPortal_p.html) 2012-06-15 10:43:05 +02:00
Michael Peter Christen
a5cdfb91de - fixed Cache link (below snippet)
- added 'Augmented Proxy' link below snippet
- added configuration options for augmented proxy
2012-06-14 19:55:34 +02:00
Michael Peter Christen
5aee19daa4 added show from cache in search results (not yet finished) 2012-06-04 23:44:26 +02:00
Michael Peter Christen
8b974905ee changed log-in text for all servlets with authentication:
- added hint how to set the password using a shell script
- added a shell script to change the password
2012-05-24 13:24:31 +02:00
Michael Peter Christen
1473e2258e fix for http://bugs.yacy.net/view.php?id=154 2012-05-18 23:56:40 +02:00
Michael Peter Christen
8aba045ba1 if a new pop-up page is set in config portal, then this page applies
also to the default page configuration for the httpd if no path is
given.
2012-02-26 20:53:32 +01:00
Michael Peter Christen
4c5edab1ec added option to have exception search result windows 2012-01-26 15:32:30 +01:00
Michael Peter Christen
0bcef2d156 added feature as requested in
http://forum.yacy-websuche.de/viewtopic.php?f=18&t=3461
The search can now be configured with a non-display host list.
The search will always exclude the given list of hosts unless they are
requested directly using the host navigation.
2011-12-13 00:16:05 +01:00
Michael Christen
d6e6f7715b added "about" box configuration 2011-12-10 02:04:36 +01:00
orbiter
5a55397f99 some last-minute performance hacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8101 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-25 11:23:52 +00:00
orbiter
ac5bda205f - removed lower page navigation (it never looks nice)
- added visibility of metadata and parser in search results since that shows what YaCy can do in a nice way

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8091 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-24 13:30:42 +00:00
orbiter
c659310e89 - removed option to search for audio, video and applications. These things are still experimental and should not be shown to new users since this would cause them to argue that YaCy does not work. The functions are still available, because:
- added a configuration option in ConfigPortal to switch the search media types on or off

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8090 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-24 13:07:03 +00:00
orbiter
6cd27473f5 - better default values for caching and cache usage
- set new caching and verification behavior according to use case automatically

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8087 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-24 10:22:02 +00:00
orbiter
804e48888b smaller bug fixes for search behavior; should produce less unnecessary removals and an exact number of results as shown in counter
should also be a little bit faster

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8057 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-11-18 13:09:07 +00:00
orbiter
ba41a869a7 set default number of search results in ConfigPortal.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8008 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-29 09:22:03 +00:00
orbiter
d260b25457 fix for npe
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8006 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-29 07:28:24 +00:00
orbiter
d2ea250d99 refactoring:
- moved many classes from de.anomic to net.yacy
- made more sub-packages for search classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-25 16:59:06 +00:00
orbiter
ba03ca8620 added more configuration options for search:
- removed configuration button for 'search only for admin' from index.html and added this to ConfigPortal
- added configuration of link verification options (iffresh, cacheonly, nocache, ifexist) to ConfigPortal
- added configuration of navigation options to ConfigPortal
- added an option to switch off automatic index cleaning in case that a link verification method fails


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7613 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-21 07:50:34 +00:00
low012
2861d0888a *) simplified code
*) fixed potential NumberFormatExceptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7600 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 01:03:35 +00:00
orbiter
70ca7cec8c fix for http://forum.yacy-websuche.de/viewtopic.php?p=21763#p21763
and another fix for non-working global search when search options are switched off

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7467 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-03 10:43:09 +00:00
orbiter
fe93caac5a added flags and administration options to show advanced search and to show search result attributes (for each search result)
Administration can be done at ConfigPortal.html

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7466 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-02-02 15:54:13 +00:00
orbiter
88773e4daa changed the default port from 8080 to 8090
see also: http://forum.yacy-websuche.de/viewtopic.php?p=21683#p21683

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7454 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-01-28 10:54:13 +00:00
orbiter
4565b2f2c0 removed the display option from index.html, yacysearch.html and yacyinteractive.html
instead, a setting at ConfigPortal.html can be made to define if the topmenu shall be shown at these pages or if there is no navigation at all.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7366 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-12-08 10:50:23 +00:00