Latest revision as of 14:59, 24 October 2025

B-cell epitope prediction — Why do some flu vaccines miss?

Influenza A hemagglutinin (HA) has two very different antibody targets: the variable head and the conserved stem. Head-binding antibodies often lose potency as the virus drifts; stem-binding broadly neutralizing antibodies (bnAbs) can resist drift. Can standard sequence-based and structure-based B-cell epitope predictors tell these stories apart?

Part A. Get your bearings in PyMOL (find the ground-truth epitope)

Go to the PDB homepage and look up the PDB entries 4FQI and 7K39. Each contains influenza hemagglutinin complexed with a different antibody. Fetch and inspect the structures. You can give each chain its own color.

  • Open 4FQI in PyMOL (fetch 4fqi) and identify chains for HA (HA1/HA2) and Fab (H/L). For illustrative purposes we will call HA1 the head and HA2 the stem of hemagglutinin.
  • Open 7K39 (fetch 7k39) and similarly identify HA vs antibody chains.

Hint: you can find this information on the structure web page or in the PDB file itself (open it with a text editor and search for COMPND). For example:

COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: HEMAGGLUTININ HA1;
COMPND   3 CHAIN: A;
COMPND   4 FRAGMENT: UNP RESIDUES 17-346;
COMPND   5 SYNONYM: HEMAGGLUTININ RECEPTOR BINDING SUBUNIT HA1;
COMPND   6 ENGINEERED: YES;
COMPND   7 MOL_ID: 2;
COMPND   8 MOLECULE: HEMAGGLUTININ HA2;
COMPND   9 CHAIN: B;
COMPND  10 FRAGMENT: UNP RESIDUES 347-520;
COMPND  11 SYNONYM: HEMAGGLUTININ MEMBRANE FUSION SUBUNIT HA2;
COMPND  12 ENGINEERED: YES;
COMPND  13 MOL_ID: 3;
COMPND  14 MOLECULE: ANTIBODY CR9114 HEAVY CHAIN;
COMPND  15 CHAIN: H;
COMPND  16 FRAGMENT: FAB;
COMPND  17 ENGINEERED: YES;
COMPND  18 MOL_ID: 4;
COMPND  19 MOLECULE: ANTIBODY CR9114 LIGHT CHAIN;
COMPND  20 CHAIN: L;
COMPND  21 FRAGMENT: FAB LAMBDA;
COMPND  22 ENGINEERED: YES
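If you prefer not to read the COMPND records by eye, the chain assignment can be pulled out with a few lines of Python. This is only a minimal sketch: the function name chains_by_molecule is made up for this exercise, and the parsing assumes simple MOLECULE/CHAIN pairs like those shown above.

```python
import re

def chains_by_molecule(pdb_text):
    """Map each MOLECULE name in the COMPND records to its list of chain IDs."""
    parts = []
    for line in pdb_text.splitlines():
        if line.startswith("COMPND"):
            # Drop the record name and the optional continuation number.
            parts.append(re.sub(r"^COMPND\s+(\d+\s+)?", "", line).strip())
    mapping, molecule = {}, None
    # COMPND fields are "KEY: value" pairs separated by semicolons.
    for token in " ".join(parts).split(";"):
        key, _, value = token.strip().partition(":")
        if key == "MOLECULE":
            molecule = value.strip()
        elif key == "CHAIN" and molecule is not None:
            mapping[molecule] = [c.strip() for c in value.split(",")]
    return mapping

# A shortened copy of the COMPND block shown above.
sample = """COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: HEMAGGLUTININ HA1;
COMPND   3 CHAIN: A;
COMPND   7 MOL_ID: 2;
COMPND   8 MOLECULE: HEMAGGLUTININ HA2;
COMPND   9 CHAIN: B;
COMPND  13 MOL_ID: 3;
COMPND  14 MOLECULE: ANTIBODY CR9114 HEAVY CHAIN;
COMPND  15 CHAIN: H;
COMPND  18 MOL_ID: 4;
COMPND  19 MOLECULE: ANTIBODY CR9114 LIGHT CHAIN;
COMPND  20 CHAIN: L;"""

chain_map = chains_by_molecule(sample)
for molecule, chains in chain_map.items():
    print(molecule, "->", ",".join(chains))
```

For a real entry you would pass the text of the downloaded PDB file instead of the sample string. An entry with several copies of the complex lists several chain IDs per molecule, separated by commas.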

Q1. List chain IDs for antigen vs antibody in each complex.

Define the real epitope. For each complex, remove waters (remove solvent). Create an epitope selection on HA as all residues within 5 Å of any antibody atom and color it in red.

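PyMOL hint: first define the antibody selection (adjust the chain IDs to the complex you are working on), then select whole antigen residues within 5 Å of it:

select antibody, chain H+L
select epitope, byres (antibody around 5)

Hint 2: In one of the complexes there are several copies of the antibody H and L chains. You might want to work with one antibody only!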
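The 5 Å rule can also be checked outside PyMOL. The sketch below is an illustration under assumptions, not the course's reference solution: it reads fixed-column ATOM records, and the helper atom_line, the toy coordinates and the chain IDs A and H are invented for the example.

```python
import math

def atom_line(serial, name, resname, chain, resnum, x, y, z):
    """Format one fixed-column PDB ATOM record (helper for the toy example only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resnum:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}")

def parse_atoms(pdb_text):
    """Return (chain, residue_number, (x, y, z)) for every ATOM record."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((line[21], int(line[22:26]), xyz))
    return atoms

def epitope_residues(atoms, antigen_chains, antibody_chains, cutoff=5.0):
    """Antigen residues with at least one atom within `cutoff` Å of an antibody atom."""
    antibody_xyz = [xyz for chain, _, xyz in atoms if chain in antibody_chains]
    hits = set()
    for chain, resnum, xyz in atoms:
        if chain in antigen_chains and (chain, resnum) not in hits:
            if any(math.dist(xyz, p) <= cutoff for p in antibody_xyz):
                hits.add((chain, resnum))
    return sorted(hits)

# Toy complex: three antigen residues (chain A) and one antibody atom (chain H).
toy_pdb = "\n".join([
    atom_line(1, "CA", "GLY", "A", 10, 0.0, 0.0, 0.0),   # 4.0 Å from the antibody atom
    atom_line(2, "CA", "SER", "A", 11, 2.0, 0.0, 0.0),   # about 4.5 Å away
    atom_line(3, "CA", "LYS", "A", 50, 30.0, 0.0, 0.0),  # far away
    atom_line(4, "CA", "TYR", "H", 100, 0.0, 4.0, 0.0),
])

epitope = epitope_residues(parse_atoms(toy_pdb), antigen_chains={"A"}, antibody_chains={"H"})
print(epitope)
```

The brute-force loop compares every antigen atom with every antibody atom, which is fine for a single Fab complex. Residues with insertion codes are not distinguished in this sketch.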

Q2. Provide screenshots of the epitopes for the different protein complexes and the sequence panel with chain letters visible. How many epitope residues are there in head vs stem complexes? Are they mostly loops, helices, or strands (qualitatively)?


Q3. Can you identify which residues are part of the epitope? Is the epitope linear? Does it have a linear core? Are the two complexes targeting the same chain?
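One way to reason about "linear" versus "conformational" is to group the epitope residue numbers into runs of consecutive positions. A minimal sketch follows; the function name contiguous_segments and the residue numbers are made up and are not the answer for either complex.

```python
def contiguous_segments(residue_numbers, max_gap=1):
    """Group residue numbers into (start, end) runs; a step larger than max_gap starts a new run."""
    segments = []
    for n in sorted(set(residue_numbers)):
        if segments and n - segments[-1][1] <= max_gap:
            segments[-1][1] = n          # extend the current run
        else:
            segments.append([n, n])      # start a new run
    return [tuple(s) for s in segments]

# Hypothetical epitope residue numbers from a single chain.
example = [18, 19, 20, 38, 40, 41, 42, 291]
segments = contiguous_segments(example)
print(segments)
```

An epitope made of one long run is close to linear. Many short runs scattered along the sequence point to a conformational epitope, and the longest run is a candidate linear core. Run the function per chain, since residue numbers from different chains are not comparable.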

Make antigen-only files

Using PyMOL, identify the antibody in the structure, select it and remove it (action -> remove atoms). Save the resulting molecule as antigen.pdb. If the structure contains several antigen chains that are not part of the epitope, you can remove them.

You can save through the GUI (remember to select PDB as the format) or by typing "save antigen.pdb" on the command line.

You can also save the fasta sequence of the antigen with the following command:

save /path/to/folder/XXXX.fasta, all, format=fasta

(Remember to change the path and replace XXXX with the name of the PDB complex.)
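The antigen-only PDB file can also be produced without the GUI by filtering the file on the chain column. This is a rough sketch under assumptions: keep_chains is a hypothetical helper, the toy records are truncated so that only the chain column matters, and keeping only ATOM records also drops waters and other HETATM groups such as glycans.

```python
def keep_chains(pdb_text, chains):
    """Keep only ATOM records whose chain ID (column 22) is in `chains`."""
    kept = [line for line in pdb_text.splitlines()
            if line.startswith("ATOM") and line[21] in chains]
    return "\n".join(kept + ["END"]) + "\n"

# Truncated toy records: antigen chains A and B, antibody chains H and L, one water.
toy_complex = "\n".join([
    "ATOM      1  CA  GLY A  10",
    "ATOM      2  CA  SER B  20",
    "ATOM      3  CA  TYR H 100",
    "ATOM      4  CA  ASP L  50",
    "HETATM    5  O   HOH A 201",
])

antigen_text = keep_chains(toy_complex, chains={"A", "B"})
print(antigen_text)
```

With a real entry, read the downloaded file into a string, call keep_chains with the antigen chain IDs you found in Q1, and write the result to antigen.pdb.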

Part B. Sequence-based prediction (BepiPred-2.0 & 3.0)


Go to the BepiPred-2.0 website and upload the sequences of the head and the stem of hemagglutinin. Leave the default parameters and submit the job. Don't close the window because we will need it again!

Q4. Do the BepiPred predictions overlap with the actual epitopes?

Be careful: PDB file numbering does not always start from 1! Compare the numbering of the PDB file with the one returned by BepiPred.
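To make the numbering issue concrete, here is a small sketch that converts sequence positions (counted from 1, as a sequence-based predictor reports them) into PDB residue numbers and then measures the overlap with the observed epitope. It assumes the submitted sequence was exported from the same PDB chain, so that the n-th sequence position is the n-th residue present in the file. All names and numbers are invented for illustration.

```python
def to_pdb_numbering(predicted_positions, pdb_residue_numbers):
    """Convert 1-based sequence positions into PDB residue numbers."""
    return {pdb_residue_numbers[i - 1] for i in predicted_positions}

def overlap_stats(predicted, observed):
    """Shared residues plus the fractions of predicted and of observed residues they cover."""
    predicted, observed = set(predicted), set(observed)
    shared = predicted & observed
    precision = len(shared) / len(predicted) if predicted else 0.0
    recall = len(shared) / len(observed) if observed else 0.0
    return {"shared": sorted(shared), "precision": precision, "recall": recall}

# Hypothetical chain that starts at residue 11 and has no residue 17 in the structure.
pdb_numbers = [11, 12, 13, 14, 15, 16, 18, 19]
predicted_seq_positions = [1, 2, 7]       # positions as reported on the sequence
observed_epitope = {12, 18, 19}           # residue numbers picked in PyMOL

predicted_pdb = to_pdb_numbering(predicted_seq_positions, pdb_numbers)
stats = overlap_stats(predicted_pdb, observed_epitope)
print(sorted(predicted_pdb), stats)
```

Precision answers "how much of the prediction is real epitope", recall answers "how much of the real epitope was found". Reporting both avoids rewarding a predictor that simply labels most of the surface as epitope.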

Now repeat the prediction using BepiPred-3.0.

Q5. Does the prediction overlap with our epitope now?

Q6. What is different in BepiPred-2.0 compared to BepiPred-3.0?


Part C. Structure-based prediction (DiscoTope-2.0 & 3.0)


Load the antigen.pdb file and submit the prediction in DiscoTope-2.0. Save the output as discotope2.pdb.

Once again, the prediction is stored in the B-factor field (but you can download it in other formats as well). Load it in PyMOL and colour according to the B-factor.
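Because the score sits in the B-factor column, it can be read back per residue with a few lines of Python. The sketch below assumes every atom of a residue carries the same score, so the first atom is used. The helper atom_line and the toy scores are invented and are not real predictor output.

```python
def atom_line(serial, name, resname, chain, resnum, b):
    """Format one fixed-column PDB ATOM record with B-factor b (toy helper only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resnum:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b:6.2f}")

def residue_scores(pdb_text):
    """Map (chain, residue_number) to the B-factor of the residue's first atom."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            key = (line[21], int(line[22:26]))
            scores.setdefault(key, float(line[60:66]))   # columns 61-66 hold the B-factor
    return scores

def top_residues(scores, n):
    """The n residues with the highest score, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

toy_output = "\n".join([
    atom_line(1, "N", "GLY", "A", 10, 0.90),
    atom_line(2, "CA", "GLY", "A", 10, 0.90),
    atom_line(3, "CA", "SER", "A", 11, 0.10),
    atom_line(4, "CA", "LYS", "A", 12, 0.55),
])

scores = residue_scores(toy_output)
best = top_residues(scores, 2)
print(best)
```

Inside PyMOL, the command "spectrum b" colours the loaded structure by the same column, which gives the visual counterpart of this ranking.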

Q7. Does the prediction overlap with our epitope?

Q8. Which prediction best matches our epitope?

Q9. Use DiscoTope to predict the epitopes in the simian HIV GP120 homolog (PDB code 3FUS). If you want to save time, use this pre-cleaned PDB. Compare the results and try to explain the differences. In your opinion, is the result meaningful? Please note that some bugs might occur, depending on which file you use. If strange results appear, report them in your answer :)

Q10. Compare the results obtained with DiscoTope-2.0 and DiscoTope-3.0.

Part D. Antigenic drift vs conservation

Compare another HA

Fetch an HA from a different subtype/strain (e.g., another H5N1, H1N1, or the C05-bound head example). For convenience, you can use 4FQI as reference.

Align HA chains (align) and map your epitope residues from the first antigen onto the second by position.
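Once the two HA sequences are aligned, conservation at the interface reduces to counting identical residues at the epitope positions. The sketch below assumes you have the pairwise alignment as two equal-length gapped strings and the epitope as 1-based positions in the ungapped reference sequence. The function name and the toy sequences are invented.

```python
def epitope_identity(aligned_ref, aligned_other, epitope_positions):
    """Fraction of epitope positions (1-based, ungapped reference) that are identical."""
    wanted = set(epitope_positions)
    position = matches = compared = 0
    for ref_aa, other_aa in zip(aligned_ref, aligned_other):
        if ref_aa == "-":
            continue                      # insertion in the other sequence: no reference position
        position += 1
        if position in wanted:
            compared += 1
            if ref_aa == other_aa:        # a gap in the other sequence counts as a mismatch
                matches += 1
    return matches / compared if compared else 0.0

# Toy alignment: the reference has a gap where the other sequence has an extra K.
reference = "ACDE-FGH"
other     = "ACNEKFGH"
identity = epitope_identity(reference, other, epitope_positions=[2, 3, 5])
print(identity)
```

Computing this once for the head epitope and once for the stem epitope gives two numbers that can be compared directly for Q11. Remember to convert PDB residue numbers to sequence positions first if the chain does not start at 1.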

Q11. Which epitope class (head vs stem) is more conserved by sequence identity at the interface?

Part E. Suppose we have found a good neutralising antibody against influenza, isolated it, and obtained its sequence

Q12. Are there any tools that can help us predict where the antibody will bind on the antigen?