Commit Graph

54 Commits

Author SHA1 Message Date
josejuanmartinez
c83a31b907 Adds information about how to run with 1 Elastic Search and multiple machines. 2019-11-08 22:16:38 +01:00
josejuanmartinez
12865d9f75 Prints settings when launched 2019-11-08 22:07:29 +01:00
josejuanmartinez
2ff897622c ElasticSearch host and port are taken from ENV_VAR, so that we can run ElasticSearch on a remote machine 2019-11-08 21:53:26 +01:00
Jose J. Martinez
ea91a6a0e0 Merge pull request #1 from josejuanmartinez/add-license-1: Create LICENSE 2019-09-28 11:34:53 +02:00
Jose J. Martinez
011d6270c6 Create LICENSE 2019-09-28 11:34:42 +02:00
josejuanmartinez
a8f2fc2ed6 Fixes typo in README.md 2019-09-28 11:33:56 +02:00
josejuanmartinez
fd272f3cd8 Disambiguates 'cadastro' and 'catastro'. Adds cron information to README.md 2019-09-28 11:33:14 +02:00
josejuanmartinez
6a36266886 Adds final documentation of most of functions and methods. 2019-09-28 11:20:40 +02:00
josejuanmartinez
6c6da34adf Adds documentation of most of functions and methods. 2019-09-26 16:52:53 +02:00
J
5ea0da9449 Adds picture to README.md 2019-09-25 18:32:39 +02:00
J
0900a4c53d Updates title in README.md 2019-09-25 18:25:04 +02:00
J
645a294074 Updates README.md 2019-09-25 18:23:37 +02:00
J
1807c052df Removes unnecessary logger.debug info 2019-09-23 13:53:56 +02:00
J
69f905e996 Removes prints 2019-09-23 13:51:22 +02:00
J
062cce6e03 Fixes problem that was sending prov_num and city_num instead of prov_name and city_name 2019-09-23 13:51:00 +02:00
J
af53ec5b4e Adds gitignore 2019-09-23 13:18:21 +02:00
J
8c0b16b9e8 Removes debug print 2019-09-23 13:17:37 +02:00
J
55f9e400d3 Fixes check if address is present in ES (province-city error) 2019-09-23 13:14:21 +02:00
J
89e4f57373 Fixes check if address is present in ES 2019-09-23 13:12:43 +02:00
J
4c252e991d Temporary print for debug purposes 2019-09-23 13:03:29 +02:00
J
d38f0905ee Checks if address already present in ElasticSearch and skips it. Adds ENV var to docker-compose 2019-09-23 13:01:05 +02:00
J
c8ec760ef2 Adds message to console when connection is rejected (sleeping...) 2019-09-22 16:08:26 +02:00
J
2606fc95f0 Refactoring of tests. Added health check and some minor changes. 2019-09-22 14:48:41 +02:00
J
4cb916b67b Fixes elasticsearch index 2019-09-22 00:58:57 +02:00
J
9dee33288b Fixes elasticsearch index 2019-09-22 00:58:10 +02:00
J
4ce6442836 Adds nested properties (constructions) for correct elastic search visualization 2019-09-22 00:53:42 +02:00
J
e6a23c3b59 Makes Elastic Search data persistent 2019-09-21 23:09:32 +02:00
J
e5c71a2676 Changes to a more standard origin volume route for logstash 2019-09-21 22:45:42 +02:00
J
48b2895fd4 Adds logstash to dockercompose and adds new tests 2019-09-21 22:41:57 +02:00
J
137ce65ee0 Fixes error after refactoring in HTML scraping by provinces 2019-09-21 15:27:03 +02:00
J
7cf208a4c2 Refactors and separates scraping and parsing into different classes for maintainability 2019-09-21 15:11:32 +02:00
J
fef84a9f95 Fixes multiparcelas in addresses not being taken into account 2019-09-20 22:32:49 +02:00
J
40f35f13cc Adds UTF-8 headers to all files 2019-09-20 20:00:36 +02:00
J
f186186477 Adds list provinces and list cities to main 2019-09-20 19:52:24 +02:00
J
d5b280f6eb Adds XML multiparcela. Fixes several bugs. 2019-09-20 19:15:32 +02:00
J
ee90545bb6 Pictures added. Tests fixed. README.md added. 2019-09-20 00:53:33 +02:00
J
4a28d67a4e Adds sleep where multiparcela or multiple entries may occur, to avoid DoS 2019-09-19 14:53:23 +02:00
J
b81988333c Refactors and adds parameters to argv to allow scraping HTML/XML by either provinces or coords (polygon). 2019-09-18 22:09:00 +02:00
J
7438e0f5a7 Adds sleep as an argument 2019-09-18 19:14:03 +02:00
J
50d4ad6e93 Manually closes ElasticSearch socket. Fixes / Updates tests. 2019-09-18 19:11:04 +02:00
J
9f7d5fda51 Fixes XML scraping for processing optional arguments. Removes bounding boxes to be eventually changed to polygons. Adds parameters to process by province. 2019-09-18 18:24:53 +02:00
J
0478146b27 Fixes parsing of a badly designed part of the XML from Catastro (sometimes returns a list, sometimes only 1 entry but not within a list) 2019-09-18 01:37:04 +02:00
J
06bb139e63 Updates requirements.txt with xmltodict 2019-09-18 01:07:26 +02:00
J
b583814fb8 Adds XML Webservices Scraping by all addresses in Spain, instead of by coordinates 2019-09-18 01:04:32 +02:00
J
8ffdf7faed Adds smaller coordinates regions to avoid lots of sea 2019-09-17 13:48:50 +02:00
J
4071b66a65 Undoes the sleep(2) since Catastro closes connection 2019-09-17 11:00:39 +02:00
J
daba3becaf Changes sleep to 2sec 2019-09-17 10:51:05 +02:00
J
832f0e6239 Added option to execute the script with a specific json coordinate file. 2019-09-17 10:48:16 +02:00
J
89b3cb5994 Changes coordinate system, now uses Kibana Geo Point json format. Different regions provided to avoid a big suboptimal square with lots of sea points 2019-09-16 21:22:59 +02:00
J
c29de7faf2 Adds initialize_elasticsearch script to configure the ES index 2019-09-16 17:45:24 +02:00