Dropout Graph Convolutional Network (dGCN)

This is the official code and data repository for the paper "Active Learning of Ternary Alloy Structures and Energies". Binary (relaxed) and ternary (unrelaxed and relaxed) alloy structures are provided as .cif files under "repo". The initial binary dataset and the datasets with added ternary compositions/clusters are provided under "datasets". The plots from the paper can be reproduced with the Jupyter notebooks under "notebooks".

Setting up

To set up the repository, please follow the steps below:

Clone the repository

git clone https://github.itap.purdue.edu/GreeleyGroup/dgcn.git

Install Anaconda
Create the required conda environment from environment.yml (CPU) or environment_gpu.yml (GPU).

conda env create --file environment.yml

OR

conda env create --file environment_gpu.yml

Activate the environment

conda activate dgcn_env
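
As an optional sanity check after activation, a short Python sketch can confirm that the interpreter is running inside the expected environment. It assumes that conda sets the CONDA_DEFAULT_ENV variable on activation; the helper name in_expected_env is illustrative and is not part of this repository.

```python
import os


def in_expected_env(expected="dgcn_env", environ=None):
    """Return True if the active conda environment is named `expected`."""
    environ = os.environ if environ is None else environ
    active = environ.get("CONDA_DEFAULT_ENV", "")
    # If the environment was activated by path, the variable holds that path,
    # so the last path component is compared as well.
    return active == expected or os.path.basename(active) == expected


if __name__ == "__main__":
    print("dgcn_env active:", in_expected_env())
```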

Model training

To train a dGCN model and make predictions:

Open train_and_predict_dgcn.py.
Set the method (composition or cluster) and the iteration (1-6). Optionally, adjust the dGCN hyperparameters if you are familiar with their effects.
Run the script.

python train_and_predict_dgcn.py

Go to the appropriate subdirectory under "active_results" and open df_configs.csv to review the predictions. You can use this CSV file for further analyses, such as calculating error metrics.
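
One possible follow-up analysis is to compute the mean absolute error (MAE) and root-mean-square error (RMSE) between reference and predicted energies. The sketch below uses only the standard library. The column layout of df_configs.csv is not documented here, so the column names are passed in as arguments; any names you supply, and the helper names themselves, are assumptions rather than part of this repository.

```python
import csv
import math


def error_metrics(rows, true_col, pred_col):
    """Return (MAE, RMSE) for two numeric columns in an iterable of dict rows."""
    errors = [float(row[pred_col]) - float(row[true_col]) for row in rows]
    if not errors:
        raise ValueError("no rows to evaluate")
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


def metrics_from_csv(csv_path, true_col, pred_col):
    """Read a CSV file with a header row and compute (MAE, RMSE)."""
    with open(csv_path, newline="") as handle:
        return error_metrics(csv.DictReader(handle), true_col, pred_col)
```

Call metrics_from_csv with the path to your df_configs.csv and the two column names as they appear in its header row.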

You can also train the dGCN on custom datasets. Use the create_dataset.py script to create new datasets composed of different combinations of alloy compositions. Note that the high-fidelity dataset covers only a limited number of alloy compositions (listed below), so a custom dataset can only include compositions from this list.

Pd	Pt	Sn
10	2	4
11	1	4
11	2	3
1	11	4
1	12	3
1	3	12
1	4	11
2	11	3
2	12	2
2	1	13
2	2	12
2	3	11
2	9	5
3	10	3
3	11	2
3	12	1
3	1	12
3	2	11
3	9	4
4	10	2
4	11	1
4	7	5
4	8	4
5	5	6
5	6	5
5	8	3
5	9	2
6	4	6
6	5	5
6	9	1
7	3	6
7	4	5
8	3	5
8	4	4
9	2	5
9	3	4
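
Before building a custom dataset, it may help to check that every requested composition appears in the list above. The sketch below is independent of create_dataset.py, whose interface is not described here. The set AVAILABLE is copied from the table, and the helper names are illustrative assumptions.

```python
# (Pd, Pt, Sn) rows copied from the table of available compositions.
AVAILABLE = {
    (10, 2, 4), (11, 1, 4), (11, 2, 3), (1, 11, 4), (1, 12, 3), (1, 3, 12),
    (1, 4, 11), (2, 11, 3), (2, 12, 2), (2, 1, 13), (2, 2, 12), (2, 3, 11),
    (2, 9, 5), (3, 10, 3), (3, 11, 2), (3, 12, 1), (3, 1, 12), (3, 2, 11),
    (3, 9, 4), (4, 10, 2), (4, 11, 1), (4, 7, 5), (4, 8, 4), (5, 5, 6),
    (5, 6, 5), (5, 8, 3), (5, 9, 2), (6, 4, 6), (6, 5, 5), (6, 9, 1),
    (7, 3, 6), (7, 4, 5), (8, 3, 5), (8, 4, 4), (9, 2, 5), (9, 3, 4),
}


def is_available(pd_count, pt_count, sn_count):
    """Return True if the (Pd, Pt, Sn) composition is in the table."""
    return (pd_count, pt_count, sn_count) in AVAILABLE


def unavailable(requested):
    """Return the requested compositions that are not in the table."""
    return [tuple(c) for c in requested if tuple(c) not in AVAILABLE]


if __name__ == "__main__":
    wanted = [(5, 5, 6), (8, 8, 0)]
    print("missing:", unavailable(wanted))
```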

Making predictions in the ternary space

It is possible to generate ~400,000 ternary structures covering all ternary compositions and make predictions on them to calculate free energies (as done in the paper). A CUDA-enabled build of PyTorch and a GPU are strongly recommended for this, because prediction on this many structures is very slow on CPUs. Assuming you have set this up, do the following:

Run init_ternary_space.py. This generates the structures and dataset.
In train_and_predict_dgcn.py, change predict_ternary_space to True and change cuda to True in dgcn_parameters.
Run train_and_predict_dgcn.py.
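
Before setting cuda to True, it may be worth confirming that PyTorch can actually see a GPU. The following is a minimal sketch that uses torch.cuda.is_available() and torch.cuda.get_device_name(); the function name cuda_status is illustrative and is not part of this repository.

```python
import importlib.util


def cuda_status():
    """Return a short string describing whether PyTorch can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch  # imported lazily so the check also runs without PyTorch

    if torch.cuda.is_available():
        return "cuda available: " + torch.cuda.get_device_name(0)
    return "cuda not available"


if __name__ == "__main__":
    print(cuda_status())
```

If the result is anything other than "cuda available: ...", the GPU environment from environment_gpu.yml is probably not active or no compatible driver is visible.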

Troubleshooting

If you see the error "ImportError: dynamic module does not define module export function (PyInit_lzma)", check your PYTHONPATH by running echo $PYTHONPATH. You may need to remove references to Python installations other than the one in dgcn_env. This can usually be done by redefining PYTHONPATH with export PYTHONPATH="/path/to/pythonpath/site-packages". To determine this path for dgcn_env, first activate the environment and then run the following in a terminal: python -c "import site;print(site.getsitepackages())"
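
The checks above can also be gathered into one small diagnostic script, run with the dgcn_env interpreter. This is a sketch only: the function name diagnose and the report keys are illustrative assumptions, and the script reports the state of the environment without changing PYTHONPATH.

```python
import os
import site
import sys


def diagnose():
    """Collect interpreter, PYTHONPATH, and lzma import information."""
    report = {
        "executable": sys.executable,
        "pythonpath": os.environ.get("PYTHONPATH", ""),
        # Some older virtual environments lack getsitepackages().
        "site_packages": list(getattr(site, "getsitepackages", lambda: [])()),
    }
    try:
        import lzma  # noqa: F401  (the import itself is the test)
        report["lzma_ok"] = True
    except ImportError as exc:
        report["lzma_ok"] = False
        report["lzma_error"] = str(exc)
    return report


if __name__ == "__main__":
    for key, value in diagnose().items():
        print(key + ":", value)
```

If lzma_ok is False and pythonpath lists directories from another Python installation, that mismatch is a likely cause of the error.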
