Skip to content

ealnabat/MarkovFit

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
MRF
 
 
 
 
 
 
 
 
 
 
 
 

MarkovFit

MarkovFit is A Protein Structural Modeling Method for Electron Microscopy Maps Using Markov Random Field.

Copyright (C) 2022 Eman Alnabati, Genki Terashi, Juan Esquivel-Rodriguez, Daisuke Kihara, and Purdue University.

License: GPL v3 for academic use. (For commercial use, please contact us for different licensing.)

Contact: Daisuke Kihara (dkihara@purdue.edu)

Cite:

Outline

  • Pre-required software
    • Compile Source Code
  • Project Steps
    • Generate simulated maps of subunits using EMAN2 package
    • Run FFT Search
    • Handle and Cluster FFT Search Results
    • Compute Pairwise Scores of Pairs of Subunits
    • Generate MRF graph and Apply Belief Propagation
    • Extract top10 Final Structures using MaxHeap Tree
  • Run an example

Pre-required software

Compile Source Code:

  • FFT_Search Folder: make EMVEC_FIT_PowerFit

  • Handle Folder: g++ handle_result.cc -o handle

  • Pairwise_Scores Folder: make pairwise_scores_mpi

Project Steps:

Generate simulated maps of subunits using EMAN2 package:

e2pdb2mrc.py --apix=voxel_spacing --res=resolution input_pdb_file output_mrc_file

For experimental map fitting: use voxel spacing and resoultion of that map.

Run FFT Search:

Command:
./FFT_Search/EMVEC_FIT_PowerFit -a main_map -b subunit1_mrc_map -t main_map_contour_level -T subunit_map_contour_level -c no_processes -P true -M 2 -s voxel_space -p map_type > output_file
Input:
-a:          Main map
-b:          Subunit map
-t [float]:  Threshold of density main_map def=0.000
-T [float]:  Threshold of density sub_map def=0.000
-c [int  ]:  Number of cores for threads def=2
-g [float]:  Bandwidth of the gaussian filter
             def=16.0, sigma = 0.5*[float]
-s [float]:  Sampling grid space def=7.0
-M [int]:    Sampling Angle interval Mode 1-3 def=2
             1: 20.83 degree,   648 samples
             2: 10.07 degree, 7,416 samples
             3: 4.71 degree, 70,728 samples
-C:          Cross Correlation Coefficient and Overlap Mode 
             Using normalized density value by Gaussian Filter
-P:          Pearson Correlation Coefficient and Overlap Mode 
             Using normalized density value by Gaussian Filter and average density
-p:          Map type: 1 for experimental, 2 for simulated def=1 
Output:
output_file contains the different transformations applied to subunit map along with goodness-of-fit scores. 

Handle and Cluster FFT Search Results:

Command:
./Handle/handle --dist-threshold max_dist_to_show --min-dist cluster_dist_thrshold --correct-x center_x --correct-y center_y --correct-z center_z --i input_file --o output_file
Input:
--min-dist:         Distance used for clustering search results 8 def=8
--i:                Input file containing FFT search results
--o:                Output file to store results after sorting and clustering
optional:
To print search results and their distance to reference structur after sorting and clustering
--correct-x:        X value of center of reference/native structure
--correct-y:        Y value of center of reference/native structure 
--correct-z:        Z value of center of reference/native structure 
--dist-threshold:   Show only results with distance < threshold to native structure
Output:
output_file containing sorted and clustered results 
output_file_b4_clstering containing sorted results before clustering

Compute Pairwise Scores of Pairs of Subunits

Command:
 mpirun -np no_processes ./Pairwise_Scores/pairwise_scores_mpi --input-pdb subunit1_pdb ... --input-pdb subunitN_pdb --labels A,B,C --transforms-file sub1_processed_search_result_file --transforms-file subN_processed_search_result_file  --calpha main_pdb --transforms-num no_results_per_subunit
Input:
-np:                    No of processes
--input-pdb:            Subunit pdb files
--labels:               List of subunits IDs separated by comma
--transforms-file:      Subunit processed search result file 
--calpha main_pdb:      Native structure pdb file
--output-prefix:        Prefix for output files
--transforms-num:       No results to be considered for each subunit
Output:
One .mrf file for each subunit containing RMSD and goodness-of-fit scores for each transformation
One .mrf file for each pair of subunits containing RMSD and pairwise scores for each of their transformations

Generate MRF graph and Apply Belief Propagation

Command:
R --slave --vanilla --file=./MRF/mrf.R --args "operation='map'" "singletons=c('sub1.mrf',...,'subN.mrf')" "pairwise=c('sub1-sub2.mrf','sub1-sub3.mrf',...,'sub(N-1)-subN.mrf')" "weights=potential.collection.weights(CC=0.5,Overlap=0.9,PhysicsScore=1,no_clashes=0.8)" "output.prefix='mrf'"
Input:
singletons:         List of .mrf files of all subunits
pairwise:           List of .mrf files of all pairs of subunits
weights:            weights to be used for the goodness-of-fit scores and pairwise scores  
output.prefix:      Prefix of the output file
Output:
prefix_top100.txt file containing the transformations of subunits and pairs of subunits sorted bt their final beliefs computed by belief propagation algorithm.

Extract top10 Final Structures using MaxHeap Tree:

Command:
python3 ./MaxHeap/max-heap.py --mrf-file prefix_top100.txt --clash-threshold no_clashes -dir pdb_dir
Input:
--mrf-file:             File containing the results of MRF and belief propagation 
--clash-threshold:      Maximum number of acceptable clashes between pairs of subunits
-dir:                   Directory of the native structure pdb files to generate the transformed structures   
Output:
PDB files of the top 10 structures of each subuint (subID_decoy_MaxHeap_structure#.pdb)
File for each subunit containing subunit transformations of the top 10 structures (subID_clashesThreshold_MaxHeap.txt)

Run an example:

Command:
 ./ run.sh

run.sh file contains all the commands to geenrate top 10 refined structures of the target in the output folder.

Expected_Output folder contains all output files of the example target.

  • subunitID_decoy_MaxHeap_#.pdb: PDB structures.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published