Commit Graph

99 Commits

Author SHA1 Message Date
luccioman
a3ec7a7a5f Added analysis optional setting to compute statistics on text snippets
Thus producing some basic stats on processing times for snippets
generation and counts on snippets per source type.
2018-04-15 09:55:08 +02:00
luccioman
d90b001e1b Improved previous merge "Show ranking in HTML UI".
- added the new setting as configurable in the "Debug/Analysis" settings
page. Debug/analysis is its main purpose for now as there is currently
no nice and "understansable" ranking score info servlet (see forum
discussion http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5884 ) 
- render in the "Search Page Layout" page preview when enabled
- added constants
2017-05-11 18:02:33 +02:00
reger
3dd23c178b Introduce the option to configure a shutdown port.
A port value of -1 will disable this option.

If set to a value greater 0, YaCy listens on this of on the local loopback 
address (127.0.0.1) for a shutdown or restart signal.
E.g. connect to http://localhost:8005/shutdown will stop the YaCy server.
http://localhost:8005/restart will restart it.
This option allows to stop YaCy locally independant from the web web 
frontend (which might be configured for password protected remote access).
2017-03-19 02:30:08 +01:00
luccioman
0173b0bc32 Added an advanced settings page for referrer policy settings.
Feedback will be welcome, notably on the descriptive content of this
page.
2017-03-03 12:05:30 +01:00
luccioman
1857651988 Added a new Debug/Analysis advanced settings subsection.
As discussed in PR #93 with @JeremyRand and @reger24 this new advanced
settings page includes:
 - a new setting to control remote Solr responses encoding
 - some existing debug settings which could not be set through the admin
user interface
2017-02-09 11:05:06 +01:00
reger
e61ee180a7 Group all proxy settings on System Administration by adding settings of
UrlProxyAccss page (moved from deleted AugmentedBrowsing_p), adjust
submenu (remove Augmented Browsing) and translation files.
2017-01-22 23:58:46 +01:00
reger
de33c7e765 replace one more arbitrary CONNECTION_PROP_CLIENTIP header with std.
getRemoteAddr()
2016-12-05 00:11:03 +01:00
reger
d9adc2c255 load handler for Transparent Proxy on startup only if feature is activated
to save the resources and keep handler chain small if the feature is not used.
+add a warning message on settingsack_p page to restart on first activation
2016-03-25 05:26:48 +01:00
reger
a60b1fb6c2 differentiate api call getLocalPort() from getConfigInt() 2015-10-31 23:09:03 +01:00
Michael Peter Christen
84763126e0 added option to make the YaCy proxy act as the cache is never stale. If
set to 'Always Fresh' the cache is always used if the entry in the cache
exist. This is a good way to archive web content and access it without
going online again in case the documents exist.
To do so, open /Settings_p.html?page=ProxyAccess and check the "Always
Fresh" checkbox.
This is set do false which behave as set before.
If you set this to true, then you have your web archive in DATA/HTCACHE.
Copy this to carry around your private copy of the internet!
2014-11-24 20:28:52 +01:00
sixcooler
33b0234454 added a input-field for setting 'fileHost'
Set this to avoid error-messages like 'proxy use not allowed / granted'
on accessing your Peer by its hostname.
2014-11-12 21:32:34 +01:00
Marc Nause
1e6e69bc40 Finished implementation of UPNP:
*) will try other ports if YaCy standard ports are not available
*) distinguish between internal and external port (not sure if this
works 100%)

Still to add: propery in config to enter own external port (in case of
manually configured NAT)
2014-10-07 13:10:06 +02:00
Michael Peter Christen
247e626083 IPv6 host parsing bugfixes 2014-10-01 10:21:03 +02:00
reger
e033e79826 remove old description for proxy port settings (Settings_p.html?page=ProxyAccess)
- The options were not current (only port number accepted, which is part of ConfigBasic.html)
- Deleted options and the port number input field from the proxyaccess page.
- joined both transparent proxy setup pages (Settings_Http.inc & Settings_ProxyAccess.inc) in one page
- adjustments to the related/linked pages
2014-08-20 22:45:36 +02:00
reger
c3e40c82fe make https port setting changeable via front end somewhere
(chosen Http Networking page /Settings_p.html?page=http )
2014-06-01 03:15:38 +02:00
reger
e31493e139 "Use remote proxy for yacy" has no function, remove option and related config item
see/fix bug http://mantis.tokeek.de/view.php?id=23
http://mantis.tokeek.de/view.php?id=189
2014-05-17 23:36:59 +02:00
reger
2dabe2009d - remove unused manual http KeepAlive config
(reducing references to obsolete httpdemon)
- add port info to settings_http
2014-04-18 19:57:35 +02:00
reger
61e51d47a5 fix: unused / incorrect default username parameter
(removed setting)
2014-03-01 23:50:35 +01:00
reger
f681ce15ae remove obsolete HTTPServer input field 2013-12-24 05:11:31 +01:00
reger
f46c723398 allow to choose used http server, YaCy-Anomic or Jetty
- defaults to Jetty (in this branch)
- add server version info & config option -> Admin Console -> Advanced Settings -> Http Networking
2013-10-17 03:34:22 +02:00
reger
fbf84e9ff3 fix SeedUpload setting propery name for include template file 2012-12-24 04:13:38 +01:00
Michael Peter Christen
00c1c777fa refactoring 2012-09-21 15:48:16 +02:00
orbiter
a7df70221e refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7987 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-10-04 09:06:24 +00:00
orbiter
d2ea250d99 refactoring:
- moved many classes from de.anomic to net.yacy
- made more sub-packages for search classes

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-09-25 16:59:06 +00:00
orbiter
1d8b0f74f4 one more fix for SVN 7713
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7716 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-13 15:31:24 +00:00
orbiter
0960261769 fix for svn 7713
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7715 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-13 15:20:57 +00:00
orbiter
5b579e21a3 code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7713 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-05-13 06:21:40 +00:00
low012
2861d0888a *) simplified code\n*) fixed potential NumberFormatExceptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7600 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-03-15 01:03:35 +00:00
orbiter
88773e4daa changed the default port from 8080 to 8090
see also: http://forum.yacy-websuche.de/viewtopic.php?p=21683#p21683

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7454 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-01-28 10:54:13 +00:00
orbiter
6c35b68f17 - removed 'peerName' property from the yacy settings file because this information is stored in the yacy seed file
- the own seed file gets the lead for storage of the peer name
- exchanged default peer name generation method with one that does not use the local ip
- default peer names are now strings starting with '_anon'
- added another switch to suppress forwarding to ConfigBasic if the name was already changed
- replaced all usages of the yacy.conf peerName with access to the local seed
- changes to the peer name are now applied directly and not after the next peer ping


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7453 6c8d7289-2bf4-0310-a012-ef5d649a1542
2011-01-28 10:12:17 +00:00
orbiter
155d556568 - better memory protection
- more logging
- little bit of refactoring

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7278 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-10-27 13:21:18 +00:00
orbiter
3197ca42ed preparations to move the HTCache into cora:
- move the header framework classes to cora
- move the ARC caching classes to cora
- refactoring of code to call these classes from cora

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7068 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-08-23 12:32:02 +00:00
orbiter
11639aef35 - added new protocol loader for 'file'-type URLs
- it is now possible to crawl the local file system with an intranet peer
- redesign of URL handling
- refactoring: created LGPLed package cora: 'content retrieval api' which may be used externally by other applications without yacy core elements because it has no dependencies to other parts of yacy

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6902 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-05-25 12:54:57 +00:00
orbiter
9623d9e6d2 added a smb loader component for the YaCy crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6737 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-10 08:55:29 +00:00
orbiter
72f00dee59 removed never-used server access account function
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6731 6c8d7289-2bf4-0310-a012-ef5d649a1542
2010-03-08 22:30:45 +00:00
orbiter
5841ee83d3 refactoring
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6400 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-10-11 21:29:18 +00:00
orbiter
1d8d51075c refactoring:
- removed the plasma package. The name of that package came from a very early pre-version of YaCy, even before YaCy was named AnomicHTTPProxy. The Proxy project introduced search for cache contents using class files that had been developed during the plasma project. Information from 2002 about plasma can be found here:
http://web.archive.org/web/20020802110827/http://anomic.de/AnomicPlasma/index.html
We stil have one class that comes mostly unchanged from the plasma project, the Condenser class. But this is now part of the document package and all other classes in the plasma package can be assigned to other packages.
- cleaned up the http package: better structure of that class and clean isolation of server and client classes. The old HTCache becomes part of the client sub-package of http.
- because the plasmaSwitchboard is now part of the search package all servlets had to be touched to declare a different package source.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6232 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-19 20:37:44 +00:00
orbiter
5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
- The indexing queue was a historic data structure that was introduced at the very beginning at the project as a part of the switchboard organisation object structure. Without the indexing queue the switchboard queue becomes also superfluous. It has been removed as well.
- Removing the switchboard queue requires that all servlets are called without a opaque generic ('<?>'). That caused that all serlets had to be modified.
- Many servlets displayed the indexing queue or the size of that queue. In the past months the indexer was so fast that mostly the indexing queue appeared empty, so there was no use of it any more. Because the queue has been removed, the display in the servlets had also to be removed.
- The surrogate work task had been a part of the indexing queue control structure. Without the indexing queue the surrogates needed its own task management. That has been integrated here.
- Because the indexing queue had a special queue entry object and properties attached to this object, the propterties had to be moved to the queue entry object which is part of the new indexing queue withing the blocking queue, the Response Object. That object has now also the new properties of the removed indexing queue entry object.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6225 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-17 13:59:21 +00:00
orbiter
7d493cf8cc moved parser configuration in separate servelet
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6207 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-14 06:57:13 +00:00
f1ori
f814e0fa81 enable warnings and fix most of it
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6196 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-11 21:01:27 +00:00
orbiter
4b74ad0a46 fixed setting of parser configuration servlets
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6191 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-10 15:02:34 +00:00
orbiter
57a88d435b redesign of parser mime type detection and parser steering
There is now a mime-blacklist instead of a mime-whitelist

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6190 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-10 14:22:17 +00:00
orbiter
21b8704fb4 refactoring of the ParserDispatcher and ParserConfig: resulted into Idiom, Parser and Classification classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6188 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-09 22:25:31 +00:00
orbiter
8ca1f5d400 - some work to integrate the html parser the same way as the other parsers are integrated (not finished)
- added migration of code of settings pages (hmm.. does not work correctly yet, sorry)
- more refactoring
- removed more unused code

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6187 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-09 20:56:30 +00:00
orbiter
dafffd0153 refactoring of parsers and document processing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6182 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-07-08 21:48:08 +00:00
orbiter
154bbc3364 code cleanup: call of static methods directly to the class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6155 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-06-30 13:01:35 +00:00
orbiter
99bf0b8e41 refactoring of plasmaWordIndex:
divided that class into three parts:
- the peers object is now hosted by the plasmaSwitchboard
- the crawler elements are now in a new class, crawler.CrawlerSwitchboard
- the index elements are core of the new segment data structure, which is a bundle of different indexes for the full text and (in the future) navigation indexes and the metadata store. The new class is now in kelondro.text.Segment

The refactoring is inspired by the roadmap to create index segments, the option to host different indexes on one peer.

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5990 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-05-28 14:26:05 +00:00
orbiter
14a1c33823 refactoring of wordIndex class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5709 6c8d7289-2bf4-0310-a012-ef5d649a1542
2009-03-13 10:34:51 +00:00
orbiter
0edec2b760 FULL redesign of algorithms in htmlTools to encode/decode strings from/to unicode and html.
The old process used a not really efficient way to detect html encoding strings in texts.
All calling methods had been adoped to call the new class in an enhanced way with less parameters.

Many classes in interfaces used a XML encoding only (instead of full html conversion from unicode to html); this behavior was not changed with this commit but should be controlled again since it points out possible XSS leaks

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5295 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-10-22 18:59:04 +00:00
lotus
029e16b653 replaced some put(String, String) by putHTML(String, String) on serverObjects respond
in htroot/ root
didn't touch htroot/xml/
this should solve potential xss issues

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5184 6c8d7289-2bf4-0310-a012-ef5d649a1542
2008-09-19 11:45:11 +00:00