Tutorial: Homology Modeling

  • Tools for detecting homology (from simple to complex)
    • Use the BLOSUM62 substitution matrix
      • BLASTP (protein BLAST) - builds alignments from short "word" matches.
      • phmmer - aligns using dynamic programming
    • Use sequence profile information (PSSM or HMM)
      • HHsearch (aka HHpred) - HMM + secondary structure
      • RaptorX - PSSM + secondary structure + solvent accessible surface area
      • SPARKS-X - PSSM + secondary structure + backbone torsion angles + solvent accessible surface area
  • ROBETTA uses HHsearch + RaptorX + SPARKS-X and recombines the results using the Hybridize mover in RosettaScripts
  • Hybridize mover does the following:
    1. Takes a list of input templates
    2. Randomly picks one template (you can bias toward certain templates being picked using an assigned weight).
    3. Uses TMalign to superimpose the remaining templates onto the picked template. If the TM-score is < 0.5, the template is discarded.
    4. Cuts each template at the loops, creating "chunks"
    5. Stage 1: torsion space fragment insertion (in regions not covered by templates) and Cartesian space chunk recombination
    6. Stage 2: loop closure
    7. Stage 3: relax (full-atom sidechain and backbone refinement)
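
    Steps 2 and 3 above can be sketched in a few lines of shell. This is only an illustration of weighted picking and TM-score filtering, not Rosetta's implementation: the function names pick_template and filter_by_tm are hypothetical, and the TM-scores are assumed to have been computed elsewhere (for example by running TMalign yourself).

```shell
#!/usr/bin/env bash
# Illustrative sketch only; Hybridize does all of this internally.

# pick_template SEED  (stdin: lines of "name weight")
# Prints one template name, chosen with probability proportional to its weight.
pick_template() {
  awk -v seed="$1" '
    BEGIN { srand(seed); n = 0; total = 0 }
    NF >= 2 { name[n] = $1; w[n] = $2 + 0; total += w[n]; n++ }
    END {
      if (n == 0 || total <= 0) exit 1
      r = rand() * total            # r is uniform in [0, total)
      for (i = 0; i < n; i++) {
        r -= w[i]
        if (r < 0) { print name[i]; exit 0 }
      }
      print name[n - 1]             # guard against floating-point round-off
    }'
}

# filter_by_tm CUTOFF  (stdin: lines of "name tm_score")
# Prints the names whose TM-score to the picked template is >= CUTOFF.
filter_by_tm() {
  awk -v cutoff="$1" 'NF >= 2 && ($2 + 0) >= (cutoff + 0) { print $1 }'
}
```

    For example, printf '3tfzA 0.71\n2d4rA 0.43\n' | filter_by_tm 0.5 keeps only 3tfzA; the scores here are made up for the example.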
  1. Create a working directory
    mkdir rattata
    cd rattata
  2. Go to the UniProt website and find the sequence of "Ribosome association toxin RatA" from E. coli.
  3. Select "FORMAT" -> "FASTA" and save the file to your working directory as "rata.fasta"
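
    If you prefer the command line, the download and a quick sanity check might look like the sketch below. It assumes that P0AGL5 (the accession that appears in the fragment file names of step 8) is the RatA entry and that UniProt serves FASTA at the REST URL shown; the helper names fetch_rata_fasta and fasta_summary are hypothetical.

```shell
#!/usr/bin/env bash

# Assumption: P0AGL5 is the UniProt accession for E. coli RatA,
# and the UniProt REST service returns FASTA at this URL.
fetch_rata_fasta() {
  wget -O rata.fasta https://rest.uniprot.org/uniprotkb/P0AGL5.fasta
}

# fasta_summary FILE
# Prints "<number of records> <total residues>" so you can confirm the
# file holds exactly one sequence of a plausible length.
fasta_summary() {
  awk '
    /^>/ { n++; next }                        # header line starts a new record
    { gsub(/[ \t\r]/, ""); len += length($0) }  # count residues, ignoring whitespace
    END { print n + 0, len + 0 }
  ' "$1"
}
```

    Running fasta_summary rata.fasta should report a single record; anything else suggests the wrong file was saved.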
  4. Go to the HHpred website and submit the sequence. Enable the "Realign with MAC" option.
    I've already submitted here: https://toolkit.tuebingen.mpg.de/hhpred/results/RATA_ECOLI
  5. Hit "save" on the HHpred results page and copy the *.hhr file to your working directory.
  6. Convert the top 5 HHpred hits to Grishin format.
    wget -O hhs2partial_thread.pl https://gremlin2.bakerlab.org/rosninja2016/rattata/hhs2partial_thread.pl
    perl hhs2partial_thread.pl -hhs *.hhr -fas rata.fasta -name rata -limit 5
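
    To see which templates the top hits correspond to, you can list their IDs straight from the .hhr file. The sketch below assumes the usual HHsearch result layout, in which each alignment block starts with a line "No <n>" followed by a ">" line naming the template; top_hits is a hypothetical helper, not part of HH-suite or Rosetta.

```shell
#!/usr/bin/env bash

# top_hits FILE [LIMIT]
# Prints the template IDs of the first LIMIT alignment blocks (default 5).
top_hits() {
  awk -v limit="${2:-5}" '
    /^No [0-9]+/ { want = 1; next }     # start of an alignment block
    want && /^>/ {
      print substr($1, 2)               # drop the leading ">"
      want = 0
      if (++n >= limit) exit
    }
  ' "$1"
}
```

    For example, top_hits my_results.hhr 5 prints one template ID per line (my_results.hhr stands in for whatever your saved file is called).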
  7. Run the Rosetta partial_thread app to convert the alignments into "partial thread" pdb files. (Note: if you are not using VirtualBox, make sure the path to Rosetta inside the process_rata file is correct.)
    bash process_rata
  8. Generate fragments (since we learned how to do this yesterday, we'll skip this step and just download pre-made ones):
    wget https://gremlin2.bakerlab.org/rosninja2016/rattata/P0AGL5.fas.200.3mers.gz
    wget https://gremlin2.bakerlab.org/rosninja2016/rattata/P0AGL5.fas.200.9mers.gz
  9. Create a "flags" file:
    -frag_weight_aligned 0 # change to 0.1 to allow fragment insertion everywhere
    -sog_upper_bound 10
    -beta # this flag enables the latest Rosetta score function
    -in:file:fasta rata.fasta
    -parser:protocol xml # RosettaScripts protocol file, named "xml" (created in step 10)
    -relax:jump_move true
    -default_max_cycles 200
    -relax:min_type lbfgs_armijo_nonmonotone
    -hybridize:stage1_probability 1.0
    -hybridize:stage1_4_cycles 400
    -nstruct 1
  10. Create "xml" file:

    <ROSETTASCRIPTS>
      <SCOREFXNS>
        <stage1 weights="stage1.wts" symmetric=0>
          <Reweight scoretype=atom_pair_constraint weight=1.0/>
        </stage1>
        <stage2 weights="stage2.wts" symmetric=0>
          <Reweight scoretype=atom_pair_constraint weight=0.5/>
        </stage2>
        <fullatom weights="beta_cart.wts" symmetric=0>
          <Reweight scoretype=atom_pair_constraint weight=0.5/>
        </fullatom>
      </SCOREFXNS>
      <MOVERS>
        <Hybridize name=hybridize stage1_scorefxn=stage1 stage2_scorefxn=stage2 fa_scorefxn=fullatom batch=1 stage1_increase_cycles=2.0 stage2_increase_cycles=1.0 linmin_only=0 skip_long_min=1>
          <Fragments 3mers="P0AGL5.fas.200.3mers.gz" 9mers="P0AGL5.fas.200.9mers.gz"/>
          <Template pdb="1t17A_0001_099_rata.pdb" weight="1" cst_file="AUTO"/>
          <Template pdb="3tfzA_0002_099_rata.pdb" weight="1" cst_file="AUTO"/>
          <Template pdb="2d4rA_0003_099_rata.pdb" weight="1" cst_file="AUTO"/>
          <Template pdb="3tvqA_0004_099_rata.pdb" weight="1" cst_file="AUTO"/>
          <Template pdb="4xrwA_0005_099_rata.pdb" weight="1" cst_file="AUTO"/>
        </Hybridize>
      </MOVERS>
      <PROTOCOLS>
        <Add mover=hybridize/>
      </PROTOCOLS>
      <OUTPUT scorefxn=fullatom/>
    </ROSETTASCRIPTS>
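
    The template and fragment file names in the xml must match the files produced in steps 7 and 8, so it can help to check them before launching a run. A minimal sketch, assuming the names are given as quoted pdb=, 3mers= and 9mers= attributes as above; missing_inputs is a hypothetical helper.

```shell
#!/usr/bin/env bash

# missing_inputs XML_FILE [DIR]
# Prints every file named in a pdb=, 3mers= or 9mers= attribute of XML_FILE
# that does not exist in DIR (default: current directory).
missing_inputs() {
  local xml=$1 dir=${2:-.} f
  grep -oE '(pdb|3mers|9mers)="[^"]+"' "$xml" \
    | cut -d'"' -f2 \
    | while read -r f; do
        [ -e "$dir/$f" ] || echo "$f"
      done
}
```

    An empty result from missing_inputs xml means every referenced input file was found.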

  11. Download stage weight files:
    wget https://gremlin2.bakerlab.org/rosninja2016/rattata/stage1.wts
    wget https://gremlin2.bakerlab.org/rosninja2016/rattata/stage2.wts
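  12. Run the protocol with the rosetta_scripts app (the executable name depends on your platform and build):
    rosetta_scripts.default.linuxgccrelease @flags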
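
    With -nstruct 1 the run produces a single model; in practice you would raise -nstruct (or launch several runs) and keep the lowest-scoring models. The sketch below ranks models by total_score, assuming Rosetta wrote its default score file (typically score.sc) with "SCORE:" lines that include total_score and description columns; rank_models is a hypothetical helper.

```shell
#!/usr/bin/env bash

# rank_models SCORE_FILE
# Prints "<total_score> <description>" for each model, lowest score first.
rank_models() {
  awk '
    $1 == "SCORE:" && !have_header {
      # Locate the columns by name, since their positions vary between runs.
      for (i = 2; i <= NF; i++) {
        if ($i == "total_score") s = i
        if ($i == "description") d = i
      }
      have_header = 1
      next
    }
    # Skip repeated header lines that appear when runs append to the same file.
    $1 == "SCORE:" && s && d && $s != "total_score" { print $s, $d }
  ' "$1" | sort -k1,1n
}
```

    For example, rank_models score.sc | head -n 5 lists the five lowest-energy models.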