11 KiB
DATA GENERATION TOOLS
This folder contains tools to generate and import test data.
OVERVIEW
Three sets of tools are available, sorted by respective subdirectories.
- eth: Import using sovereign wallets.
- cic_eth: Import using the
cic_eth
custodial engine. - cic_ussd: Import using the
cic_ussd
interface (backed bycic_eth
)
Each of the modules include two main scripts:
- import_users.py: Registers all created accounts in the network
- import_balance.py: Transfer an opening balance using an external keystore wallet
The balance script will sync with the blockchain, processing transactions and triggering actions when it finds. In its current version it does not keep track of any other state, so it will run indefinitly and needs You the Human to decide when it has done what it needs to do.
In addition the following common tools are available:
- create_import_users.py: User creation script
- verify.py: Import verification script
- cic_meta: Metadata imports
REQUIREMENTS
A virtual environment for the python scripts is recommended. We know it works with python 3.8.x
. Let us know if you run it successfully with other minor versions.
python3 -m venv .venv
source .venv/bin/activate
Install all requirements from the requirements.txt
file:
pip install --extra-index-url https://pip.grassrootseconomics.net:8433 -r requirements.txt
If you are importing metadata, also do ye olde:
npm install
HOW TO USE
Step 1 - Data creation
Before running any of the imports, the user data to import has to be generated and saved to disk.
The script does not need any services to run.
Vanilla version:
python create_import_users.py [--dir <datadir>] <number_of_users>
If you want to use a import_balance.py
script to add to the user's balance from an external address, use:
python create_import_users.py --gift-threshold <max_units_to_send> [--dir <datadir>] <number_of_users>
Step 2 - Services
Unless you know what you are doing, start with a clean slate, and execute (in the repository root):
docker-compose down -v
Then go through, in sequence:
Base requirements
If you are importing using eth
and not importing metadata, then the only service you need running in the cluster is:
- eth
In all other cases you will also need:
- postgres
- redis
EVM provisions
This step is needed in all cases.
RUN_MASK=1 docker-compose up contract-migration
After this step is run, you can find top-level ethereum addresses (like the cic registry address, which you will need below) in <repository_root>/service-configs/.env
Custodial provisions
response_data = send_ussd_request(address, self.data_dir)
state = response_data[:3]
out = response_data[4:]
m = '{} {}'.format(state, out[:7])
if m != 'CON Welcome':
raise VerifierError(response_data, 'ussd')
This step is only needed if you are importing using cic_eth
or cic_ussd
RUN_MASK=2 docker-compose up contract-migration
Custodial services
If importing using cic_eth
or cic_ussd
also run:
- cic-eth-tasker
- cic-eth-dispatcher
- cic-eth-tracker
- cic-eth-retrier
If importing using cic_ussd
also run:
- cic-user-tasker
- cic-user-ussd-server
- cic-notify-tasker
If metadata is to be imported, also run:
- cic-meta-server
Step 3 - User imports
If you did not change the docker-compose setup, your eth_provider
the you need for the commands below will be http://localhost:63545
.
Only run one of the alternatives.
The keystore file used for transferring external opening balances tracker is relative to the directory you found this README in. Of course you can use a different wallet, but then you will have to provide it with tokens yourself (hint: ../reset.sh
)
All external balance transactions are saved in raw wire format in <datadir>/txs
, with transaction hash as file name.
If the contract migrations have been executed with the default "giftable" token contract, then the token_symbol
in the import_balance
scripts should be set to GFT
.
Alternative 1 - Sovereign wallet import - eth
First, make a note of the block height before running anything.
To import, run to completion:
python eth/import_users.py -v -c config -p <eth_provider> -r <cic_registry_address> -y ../contract-migration/keystore/UTC--2021-01-08T17-18-44.521011372Z--eb3907ecad74a0013c259d5874ae7f22dcbcc95c <datadir>
After the script completes, keystore files for all generated accouts will be found in <datadir>/keystore
, all with foo
as password (would set it empty, but believe it or not some interfaces out there won't work unless you have one).
Then run:
python eth/import_balance.py -v -c config -r <cic_registry_address> -p <eth_provider> --token-symbol <token_symbol> --offset <block_height_at_start> -y ../keystore/UTC--2021-01-08T17-18-44.521011372Z--eb3907ecad74a0013c259d5874ae7f22dcbcc95c <datadir>
Alternative 2 - Custodial engine import - cic_eth
Run in sequence, in first terminal:
python cic_eth/import_balance.py -v -c config -p <eth_provider> -r <cic_registry_address> --token-symbol <token_symbol> -y ../keystore/UTC--2021-01-08T17-18-44.521011372Z--eb3907ecad74a0013c259d5874ae7f22dcbcc95c --head out
In another terminal:
python cic_eth/import_users.py -v -c config --redis-host-callback <redis_hostname_in_docker> out
The redis_hostname_in_docker
value is the hostname required to reach the redis server from within the docker cluster, and should be redis
if you left the docker-compose unchanged. The import_users
script will receive the address of each newly created custodial account on a redis subscription fed by a callback task in the cic_eth
account creation task chain.
Alternative 3 - USSD import - cic_ussd
If you have previously run the cic_ussd
import incompletely, it could be a good idea to purge the queue. If you have left docker-compose unchanged, redis_url
should be redis://localhost:63379
.
celery -A cic_ussd.import_task purge -Q cic-import-ussd --broker redis://localhost:63379
Then, in sequence, run in first terminal:
python cic_eth/import_balance.py -v -c config -p <eth_provider> -r <cic_registry_address> --token-symbol <token_symbol> -y ../keystore/UTC--2021-01-08T17-18-44.521011372Z--eb3907ecad74a0013c259d5874ae7f22dcbcc95c out
In second terminal:
python cic_ussd/import_users.py -v -c config out
Step 4 - Metadata import (optional)
The metadata import scripts can be run at any time after step 1 has been completed.
Importing user metadata
To import the main user metadata structs, run:
node cic_meta/import_meta.js <datadir> <number_of_users>
Monitors a folder for output from the import_users.py
script, adding the metadata found to the cic-meta
service.
If number of users is omitted the script will run until manually interrupted.
Importing phone pointer
node cic_meta/import_meta_phone.js <datadir> <number_of_users>
If you imported using cic_ussd
, the phone pointer is already added and this script will do nothing.
Importing pins and ussd data (optional)
Once the user imports are complete the next step should be importing the user's pins and auxiliary ussd data. This can be done in 3 steps:
In one terminal run:
python create_import_pins.py -c config -v --userdir <path to the users export dir tree> pinsdir <path to pin export dir tree>
This script will recursively walk through all the directories defining user data in the users export directory and generate a csv file containing phone numbers and password hashes generated using fernet in a manner reflecting the nature of said hashes in the old system. This csv file will be stored in the pins export dir defined as the positional argument.
Once the creation of the pins file is complete, proceed to import the pins and ussd data as follows:
- To import the pins:
python cic_ussd/import_pins.py -c config -v pinsdir <path to pin export dir tree>
- To import ussd data:
python cic_ussd/import_ussd_data.py -c config -v userdir <path to the users export dir tree>
The balance script is a celery task worker, and will not exit by itself in its current version. However, after it's done doing its job, you will find "reached nonce ... exiting" among the last lines of the log.
The connection parameters for the cic-ussd-server
is currently hardcoded in the import_users.py
script file.
Step 5 - Verify
python verify.py -v -c config -r <cic_registry_address> -p <eth_provider> <datadir>
Included checks:
- Private key is in cic-eth keystore
- Address is in accounts index
- Address has gas balance
- Address has triggered the token faucet
- Address has token balance matching the gift threshold
- Personal metadata can be retrieved and has exact match
- Phone pointer metadata can be retrieved and matches address
- USSD menu response is initial state after registration
Checks can be selectively included and excluded. See --help
for details.
Will output one line for each check, with name of check and number of errors found per check.
Should exit with code 0 if all input data is found in the respective services.
KNOWN ISSUES
-
If the faucet disbursement is set to a non-zero amount, the balances will be off. The verify script needs to be improved to check the faucet amount.
-
When the account callback in
cic_eth
fails, thecic_eth/import_users.py
script will exit with a cryptic complaint concerning aNone
value. -
Sovereign import scripts use the same keystore, and running them simultaneously will mess up the transaction nonce sequence. Better would be to use two different keystore wallets so balance and users scripts can be run simultaneously.
-
pycrypto
andpycryptodome
have to be installed in that order. If you get errors concerningCrypto.KDF
then uninstall both and re-install in that order. Make sure you use the versions listed inrequirements.txt
.pycryptodome
is a legacy dependency and will be removed as soon as possible. -
Sovereign import script is very slow because it's scrypt'ing keystore files for the accounts that it creates. An improvement would be optional and/or asynchronous keyfile generation.
-
Running the balance script should be optional in all cases, but is currently required in the case of
cic_ussd
because it is needed to generate the metadata. An improvement would be moving the task toimport_users.py
, for a different queue than the balance tx handler. -
cic_ussd
imports is poorly implemented, and consumes a lot of resources. Therefore it takes a long time to complete. Reducing the amount of polls for the phone pointer would go a long way to improve it. -
A strict constraint is maintained insistin the use of postgresql-12.