Dropout Graph Convolutional Network (dGCN)
This is the official code and data repository for the paper "Active Learning of Ternary Alloy Structures and Energies". Binary (relaxed) and ternary (unrelaxed and relaxed) alloy structures are provided as .cif files under "repo". The initial binary dataset and the datasets with added ternary compositions/clusters are provided under "datasets". The plots in the paper can be reproduced with the Jupyter notebooks under "notebooks".
Setting up
To set up the repository, please follow the steps below:
- Clone the repository
git clone https://github.itap.purdue.edu/GreeleyGroup/dgcn.git
- Install Anaconda
- Create the required conda environment from environment.yml
conda env create --file environment.yml
- Activate the environment
conda activate dgcn_env
Model training
To train a dGCN model and make predictions:
- Open train_and_predict_dgcn.py.
- Enter a method (composition or cluster) and an iteration (1-6). Optionally adjust the dGCN hyperparameters if you are familiar with their effects.
- Run the script.
python train_and_predict_dgcn.py
- Go to the appropriate subdirectory under "active_results" and open df_configs.csv to review the predictions. You can perform further analyses, such as calculating metrics, using this CSV file.
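As one example of further analysis, the sketch below computes the mean absolute error (MAE) and root-mean-square error (RMSE) between two columns of a predictions CSV using only the standard library. The function name and the column names you pass in are placeholders; the actual headers of df_configs.csv are not documented here, so check the file and substitute the real ones.

```python
import csv
import math


def regression_metrics(csv_lines, true_col, pred_col):
    """Return (MAE, RMSE) between two numeric columns.

    csv_lines: an open CSV file, or any iterable of CSV text lines.
    true_col, pred_col: header names (hypothetical; use the real ones).
    """
    errors = [
        float(row[pred_col]) - float(row[true_col])
        for row in csv.DictReader(csv_lines)
    ]
    if not errors:
        raise ValueError("no data rows found")
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse
```

To use it, open the df_configs.csv file from the relevant subdirectory and pass the file object together with the two column names.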
You can also create custom datasets to train the dGCN on. Use the create_dataset.py script to create new datasets composed of different combinations of alloy compositions. Note that the high-fidelity dataset covers only a limited number of alloy compositions (listed below), and only these can be included in a custom dataset.
Pd | Pt | Sn |
---|---|---|
10 | 2 | 4 |
11 | 1 | 4 |
11 | 2 | 3 |
1 | 11 | 4 |
1 | 12 | 3 |
1 | 3 | 12 |
1 | 4 | 11 |
2 | 11 | 3 |
2 | 12 | 2 |
2 | 1 | 13 |
2 | 2 | 12 |
2 | 3 | 11 |
2 | 9 | 5 |
3 | 10 | 3 |
3 | 11 | 2 |
3 | 12 | 1 |
3 | 1 | 12 |
3 | 2 | 11 |
3 | 9 | 4 |
4 | 10 | 2 |
4 | 11 | 1 |
4 | 7 | 5 |
4 | 8 | 4 |
5 | 5 | 6 |
5 | 6 | 5 |
5 | 8 | 3 |
5 | 9 | 2 |
6 | 4 | 6 |
6 | 5 | 5 |
6 | 9 | 1 |
7 | 3 | 6 |
7 | 4 | 5 |
8 | 3 | 5 |
8 | 4 | 4 |
9 | 2 | 5 |
9 | 3 | 4 |
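When assembling a custom dataset, it may be convenient to check the compositions you want against the table above before running create_dataset.py. The sketch below hard-codes the table as (Pd, Pt, Sn) atom counts. The constant and helper names are illustrative and are not part of the repository.

```python
# (Pd, Pt, Sn) atom counts copied from the table of high-fidelity compositions.
HIGH_FIDELITY_COMPOSITIONS = {
    (10, 2, 4), (11, 1, 4), (11, 2, 3), (1, 11, 4), (1, 12, 3), (1, 3, 12),
    (1, 4, 11), (2, 11, 3), (2, 12, 2), (2, 1, 13), (2, 2, 12), (2, 3, 11),
    (2, 9, 5), (3, 10, 3), (3, 11, 2), (3, 12, 1), (3, 1, 12), (3, 2, 11),
    (3, 9, 4), (4, 10, 2), (4, 11, 1), (4, 7, 5), (4, 8, 4), (5, 5, 6),
    (5, 6, 5), (5, 8, 3), (5, 9, 2), (6, 4, 6), (6, 5, 5), (6, 9, 1),
    (7, 3, 6), (7, 4, 5), (8, 3, 5), (8, 4, 4), (9, 2, 5), (9, 3, 4),
}


def unavailable_compositions(requested):
    """Return the requested (Pd, Pt, Sn) tuples that are absent from the table."""
    return [c for c in requested if tuple(c) not in HIGH_FIDELITY_COMPOSITIONS]
```

An empty list means every requested composition appears in the table. Every row in the table sums to 16 atoms, which is a quick sanity check on any composition you enter by hand.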
Making predictions in the ternary space
It is possible to generate ~400,000 ternary structures covering all ternary compositions and make predictions on them to calculate free energies (as done in the paper). However, it is recommended that you use PyTorch built with CUDA support and run on a GPU, because prediction at this scale is impractically slow on CPUs. Assuming you have done that, do the following:
- Run init_ternary_space.py. This generates the structures and dataset.
- In train_and_predict_dgcn.py, change predict_ternary_space to True and change cuda to True in dgcn_parameters.
- Run train_and_predict_dgcn.py.
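Before setting cuda to True, it may be worth confirming that the installed PyTorch build can actually see a GPU. The following is a minimal check based on torch.cuda.is_available(). The function name pick_device is illustrative and does not come from the repository.

```python
def pick_device():
    """Return "cuda" if PyTorch is installed and a GPU is visible, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints cpu, PyTorch cannot use a GPU in the current environment, and the installation should be checked before attempting the full ternary-space prediction.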