Commit Graph

12518 Commits

Author SHA1 Message Date
luccioman
744c9a2615 Opensearch desc : handle https protocol url with default port (443)
This completes modifications made for mantis 669
(http://mantis.tokeek.de/view.php?id=669)
2016-08-12 12:18:26 +02:00
Michael Peter Christen
103a8348b3 fix for NPE and small performance enhancement 2016-08-10 06:48:08 +02:00
reger
2910fe35c1 add missing scheduler calc of next exec_date (call of calculateAPIScheduler)
- after last_exec_date is altered, next_exec_date should be recalculated
- makes the recalculation of next_exec in advance (without api call surely made) in Switchboard.schedulerJob() obsolete
Slightly modify next_exec calc. on missed event to now+schedule_time (from fixed 10min)
2016-08-09 03:03:04 +02:00
reger
70d47ae38a keep scheduler selection by repeat entry from 07311020d4
to allow exec schedule on actual exec event.
Iterate on exec date (of advantage after interruption/shutdown) to schedule
older or missed events first.
2016-08-08 02:19:48 +02:00
reger
7c3f932e5d revert due to conflict with double count recording by scheduler / servlet by the commit under normal operation (no shutdown) 2016-08-08 01:57:31 +02:00
reger
07311020d4 postpone apicall exec date init until actual call
fix for http://mantis.tokeek.de/view.php?id=677
The difference shows when a large number of RSS feeds are scheduled and
loading has not finished before YaCy shuts down. The change makes sure that
RSS feeds not yet loaded will be loaded by the scheduler on next startup.
2016-08-07 05:08:55 +02:00
reger
5e335b32da fix Blacklist.contains() matching path pattern to string
similar to 5e9e871192
+ add proof testcase
2016-08-04 01:12:49 +02:00
reger
5e9e871192 fix Blacklist.remove by using pattern.toString to find pattern to remove,
as the parameter String path never equaled a Pattern.
+ delete unused removeAll, as it does not persist changes after restart
2016-08-03 02:13:26 +02:00
reger
1843ea7e69 on Blacklist.add pattern to source file also update internal entry maps
as in Blacklist.add(blacklistType) to make entry effective w/o restart
fix for http://mantis.tokeek.de/view.php?id=676
2016-08-02 02:41:03 +02:00
reger
bf6ce33da3 Correct use of _htDocsPath config in YaCyDefaultServlet to use servlet config variable
+ add some javadoc and remove a not useful static declaration
2016-07-31 23:16:24 +02:00
reger
fcad2d0744 add uses of config constant INDEX_RECEIVE_ALLOW 2016-07-27 02:16:20 +02:00
reger
07fbb737df upd to commons-compress-1.12 2016-07-23 20:34:47 +02:00
reger
226f81cfcf declare poison pill url MultiProtocolURL() as protected to make sure not
used from outside.
After double checking use of poison url revert path init from commit
f8632ad292
2016-07-23 20:03:13 +02:00
reger
f8632ad292 prevent string index out of bounds MultiProtocolURL.getPaths
as path may be an empty string
+ init path to "" also in init for poison url (to guarantee success for 
all existing uses of path w/o check for null)
2016-07-23 19:18:23 +02:00
reger
35a7d57260 update lucenematchversion to current (5.2.0 -> 5.5.0)
there should be no need for reindex by the update
2016-07-23 18:36:43 +02:00
Marc Nause
1f7013a1e3 removed unused properties in default config (CGI capabilities of YaCy's
HTTPd have been removed many moons ago)
2016-07-21 21:36:00 +02:00
reger
9b07bbf955 deprecate newurl(), not used and already replaced
instead of making it handle all the supported protocols
2016-07-21 02:14:35 +02:00
reger
774b3906a9 fix GenericFormatter.parse ("time","timeoffset")
change: UTC offset internally expected in minutes
2016-07-19 02:57:41 +02:00
reger
27163af0e1 improve detection of referenced links by taking http and https link protocol
into account
+ correct query start detection of commit f89d4eb51d
2016-07-17 23:42:25 +02:00
reger
aed44e31ca fix fr.lng typo in submenuRanking.template 2016-07-17 22:16:08 +02:00
reger
f89d4eb51d fix MultiProtocolURL init (assign of host) for urls with '/' in query part
+ add to test case
2016-07-17 04:17:01 +02:00
reger
163d0cc3cf correct renamed ConfigAdvanced_p.html -> ConfigProperties_p.html in fr and sk.lng 2016-07-16 20:34:42 +02:00
Burkhard
929147b16f Merge pull request #62 from luccioman/french_translation
Updated french translation for some pages
2016-07-16 02:17:58 +02:00
reger
87fcfc6d78 Adjusted hash computation and toNormalform for file:// protocol to deliver
the same hash for the same file on Windows filesystem paths with forward slashes or backslashes.
Background see http://mantis.tokeek.de/view.php?id=671
+Test case
2016-07-16 01:59:09 +02:00
luccioman
31941f1a5e Updated french translation for some pages
Used Translator_p.html editor and updated french translation for the 8
first files displayed in the combobox.
2016-07-15 14:44:58 +02:00
reger
f0f38a4a94 put Autocrawl_p.html to master.lng.xlf 2016-07-15 00:27:41 +02:00
reger
41d845285d add missing text for ConfigBasic.html to master.lng.xlf 2016-07-11 04:08:24 +02:00
reger
a952787712 adjust opensearchdescription to return url with protocol it was called on
fix http://mantis.tokeek.de/view.php?id=669
2016-07-11 02:33:12 +02:00
reger
360b38d9b6 fix CookieTest_p parameter from ResponseHeader to RequestHeader 2016-07-10 05:44:56 +02:00
reger
3811184abd fix GSA servlet clientIP retrieval 2016-07-09 23:39:43 +02:00
reger
7ab41d4ff1 use directory's original lastmodified date in file- & smbloader in response 2016-07-09 19:55:47 +02:00
reger
708bcbb042 one more replacement to use cached hosthash vs. calculated 2016-07-07 02:50:57 +02:00
reger
7b226afc33 fix HostQueueTest - changed open parameter 2016-07-06 23:52:02 +02:00
reger
22db449f2a to prevent crawler from concurrently accessing and altering the same crawl queue
after restart, put hosthash in queue's filename (which is used as primary
key for crawl queue). Hint: initial hosthash from url and recalculated hosthash
from just hostname:port are not the same.
fixes http://mantis.tokeek.de/view.php?id=668 (partially)
2016-07-05 23:22:35 +02:00
reger
2cc4e56010 upd to Solr 5.5.2 2016-07-04 22:23:54 +02:00
Orbiter
50c5ddf1a1 Merge pull request #56 from luccioman/LibreJS
LibreJS compliance : YaCy JavaScript license information
2016-07-04 21:07:11 +02:00
Orbiter
c9ec0d0311 Merge pull request #55 from luccioman/docker
Improve Docker image security, size and reliability
2016-07-04 21:06:54 +02:00
Orbiter
82f40aefb5 Merge pull request #47 from reelsense/patch-2
Distilled sentence - ConfigNetwork_p
2016-07-04 21:05:59 +02:00
Michael Peter Christen
9175e92cd4 new development cycle 2016-07-04 20:59:37 +02:00
Michael Peter Christen
bb8d03bdac release 1.90 2016-07-04 12:01:41 +02:00
Michael Peter Christen
634e48309b another peer list update 2016-07-04 11:02:36 +02:00
Michael Peter Christen
7466d390b2 small refactoring + do not accept too old peers during bootstrap 2016-07-04 11:02:15 +02:00
reger
fcc29c36f0 test case for HostBalancer issue in intranet mode
with file:// protocol, 2 hostqueues accessing same cache file concurrently
http://mantis.tokeek.de/view.php?id=668
Reason seems to be diff. hosthash key of hostqueues on reopen. 
Internal queue key and external representation (directoryname currently hostname.port) must be adjusted to fix it (not done yet).
2016-07-04 02:44:58 +02:00
Michael Peter Christen
16420e5507 added another principal peer 2016-07-03 22:50:50 +02:00
luccioman
fc958230c4 Added instructions for log control and upgrade 2016-07-03 17:28:47 +02:00
luccioman
adc657004d Merge remote-tracking branch 'origin/master' into docker 2016-07-03 16:53:05 +02:00
reger
8d58a48029 remove wrong log line in CrawlSwitchboard
+ don't allow CrawlSwitchboard to exit application
making network param unused
2016-07-02 20:33:23 +02:00
reger
7bac756720 prevent dealing with -UNRESOLVED_PATTERN- eventID parameter in html includes
on first landing on search page
2016-07-01 00:02:10 +02:00
reger
900ec17d1a add de hint translation for CrawlStartScanner_p
rem missing translation line in other lng
2016-06-29 23:27:59 +02:00
reger
5aaa057c65 ignore empty input lines in FileUtils.getListArray() to poka-yoke blacklist read.
equalizes behavior with getListString()
improves: case where blacklist file contained an undesired empty line, not
fixed by blacklist-cleaner.
2016-06-28 23:44:28 +02:00