Adds information about how to run with 1 Elastic Search and multiple machines.
parent
12865d9f75
commit
c83a31b907
21
README.md
|
@ -48,7 +48,8 @@ to send to sleep at 23:00 and restart processing at 09:00 everyday
|
||||||
|
|
||||||
Having Docker and Docker-compose installed, run first:
|
Having Docker and Docker-compose installed, run first:
|
||||||
```
|
```
|
||||||
docker-compose up -d
|
docker-compose up -d
|
||||||
|
pip install -r requirements.txt
|
||||||
```
|
```
|
||||||
|
|
||||||
Then configure ElasticSearch index:
|
Then configure ElasticSearch index:
|
||||||
|
@ -95,4 +96,20 @@ Taking into account that there are restrictions that prevents a crapping faster
|
||||||
scraping can take a very long time, so:
|
scraping can take a very long time, so:
|
||||||
1) Go directly to the provinces / cities you need the most. Leave the rest for later.
|
1) Go directly to the provinces / cities you need the most. Leave the rest for later.
|
||||||
2) Use different IP addresses and query in parallel.
|
2) Use different IP addresses and query in parallel.
|
||||||
3) Write me an email to jjmcarrascosa@gmail.com to get the full DB.
|
3) Write me an email to jjmcarrascosa@gmail.com to get the full DB.
|
||||||
|
|
||||||
|
**Using additional machines (parallel extraction)**
|
||||||
|
You won't need to repeat the previous steps, because all the machines will share a single ElasticSearch instance.
|
||||||
|
For additional machines, do the following:
|
||||||
|
1) Make sure you have successfully run all the previous steps and ElasticSearch is running on one machine;
|
||||||
|
2) Copy the public IP address of that machine
|
||||||
|
3) In a new machine, clone this repository and do the following:
|
||||||
|
```
|
||||||
|
pip install -r requirements.txt
|
||||||
|
export ES_HOST="{IP OR HOST OF THE MACHINE RUNNING ELASTICSEARCH}"
|
||||||
|
export ES_PORT="{PORT OF THE MACHINE RUNNING ELASTICSEARCH. USUALLY 9200}"
|
||||||
|
```
|
||||||
|
And finally, run libreCatastro:
|
||||||
|
```
|
||||||
|
python libreCatastro.py [....]
|
||||||
|
```
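As a rough illustration of how the `ES_HOST` and `ES_PORT` variables exported above could be picked up on the additional machine, here is a minimal sketch. The helper names `elasticsearch_endpoint` and `elasticsearch_url`, the `localhost` / `9200` fallbacks and the plain `http` scheme are assumptions made for this example; they are not taken from the libreCatastro code base, which may read its configuration differently.

```python
import os


def elasticsearch_endpoint(env=None):
    """Return (host, port) of the shared ElasticSearch node.

    Hypothetical helper: reads ES_HOST / ES_PORT and falls back to
    localhost:9200 (assumed defaults) when they are not set.
    """
    env = os.environ if env is None else env
    host = env.get("ES_HOST", "localhost")
    port = int(env.get("ES_PORT", "9200"))
    return host, port


def elasticsearch_url(env=None):
    """Build a base URL for the node (plain http is an assumption)."""
    host, port = elasticsearch_endpoint(env)
    return "http://{}:{}".format(host, port)


if __name__ == "__main__":
    # Print the endpoint this machine would talk to.
    print(elasticsearch_url())
```

Running this on the new machine before starting libreCatastro is a quick way to check that both variables were exported in the current shell.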
|
|
@ -57,7 +57,7 @@ if __name__ == "__main__":
|
||||||
|
|
||||||
pp = pprint.PrettyPrinter(indent=4)
|
pp = pprint.PrettyPrinter(indent=4)
|
||||||
pp.pprint(config)
|
pp.pprint(config)
|
||||||
|
|
||||||
''' Scrapping / Parsing core functionality'''
|
''' Scrapping / Parsing core functionality'''
|
||||||
parser = ParserHTML if args.html else ParserXML
|
parser = ParserHTML if args.html else ParserXML
|
||||||
|
|
||||||
|
|