B-cell epitope prediction
B-cell epitope prediction — Why do some flu vaccines miss?
Influenza A hemagglutinin (HA) has two very different antibody targets: the variable head and the conserved stem. Head-binding antibodies often lose potency as the virus drifts; stem-binding broadly neutralizing antibodies (bnAbs) can resist drift. Can standard sequence-based and structure-based B-cell epitope predictors tell these stories apart?
Part A. Get your bearings in PyMOL (ground-truth epitope)
Go to the PDB homepage and look up the entries 4FQI and 7K39. They are structures of influenza hemagglutinin in complex with two different antibodies. Fetch and inspect the structures. Coloring each chain in a different color makes them easier to tell apart.
- Open 4FQI in PyMOL (fetch 4fqi) and identify the chains for HA (HA1/HA2) and the Fab (H/L). For illustrative purposes we will call HA1 the head and HA2 the stem of hemagglutinin.
- Open 7K39 (fetch 7k39) and similarly identify HA vs antibody chains.
Hint: you can find this information on the structure's web page or in the PDB file itself (open it with a text editor and search for COMPND).
COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: HEMAGGLUTININ HA1;
COMPND   3 CHAIN: A;
COMPND   4 FRAGMENT: UNP RESIDUES 17-346;
COMPND   5 SYNONYM: HEMAGGLUTININ RECEPTOR BINDING SUBUNIT HA1;
COMPND   6 ENGINEERED: YES;
COMPND   7 MOL_ID: 2;
COMPND   8 MOLECULE: HEMAGGLUTININ HA2;
COMPND   9 CHAIN: B;
COMPND  10 FRAGMENT: UNP RESIDUES 347-520;
COMPND  11 SYNONYM: HEMAGGLUTININ MEMBRANE FUSION SUBUNIT HA2;
COMPND  12 ENGINEERED: YES;
COMPND  13 MOL_ID: 3;
COMPND  14 MOLECULE: ANTIBODY CR9114 HEAVY CHAIN;
COMPND  15 CHAIN: H;
COMPND  16 FRAGMENT: FAB;
COMPND  17 ENGINEERED: YES;
COMPND  18 MOL_ID: 4;
COMPND  19 MOLECULE: ANTIBODY CR9114 LIGHT CHAIN;
COMPND  20 CHAIN: L;
COMPND  21 FRAGMENT: FAB LAMBDA;
COMPND  22 ENGINEERED: YES
Q1. List chain IDs for antigen vs antibody in each complex.
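If you prefer to read the COMPND records programmatically rather than by eye, the following is a minimal Python sketch. It assumes the standard fixed-column PDB layout (record text in columns 11-80) and that molecule names do not contain semicolons; the function name chains_by_molecule is made up for this illustration.

```python
def chains_by_molecule(pdb_lines):
    """Map each MOLECULE name to its chain IDs, based on COMPND records.

    Sketch only: assumes the text of each record sits in columns 11-80
    and that tokens are separated by ';' as in the example above.
    """
    # Join the text part of all COMPND lines, skipping the
    # continuation counter in columns 8-10.
    text = " ".join(
        line[10:80].strip() for line in pdb_lines if line.startswith("COMPND")
    )
    result = {}
    molecule = None
    for token in text.split(";"):
        key, _, value = token.strip().partition(":")
        key, value = key.strip(), value.strip()
        if key == "MOLECULE":
            molecule = value
        elif key == "CHAIN" and molecule is not None:
            # A molecule can have several chains, e.g. "CHAIN: A, C, E"
            result[molecule] = [c.strip() for c in value.split(",")]
    return result
```

To use it, read the PDB file into a list of lines (for example with open("4fqi.pdb").read().splitlines(), where the file name is whatever you saved the entry as) and pass that list to the function.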
Define the real epitope. For each complex, remove waters (remove solvent). Create an epitope selection on HA as all residues within 5 Å of any antibody atom and color it in red.
PyMOL hint: first create a selection named antibody (for example, select antibody, chain H+L), then run: select epitope, br. (antibody around 5)
Hint 2: in one of the complexes there are several antibody H and L chains. You might want to work with one antibody only!
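To cross-check the PyMOL selection outside the GUI, the same 5 Å definition can be sketched in plain Python. This is an illustration under stated assumptions, not the course's reference solution: it reads only ATOM records (so waters and other HETATM entries are skipped), ignores insertion codes, and uses a brute-force distance search. The names parse_atoms and epitope_residues are hypothetical.

```python
import math

def parse_atoms(pdb_lines):
    """Yield (chain, residue number, residue name, xyz) for each ATOM record."""
    for line in pdb_lines:
        if line.startswith("ATOM"):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            yield line[21], int(line[22:26]), line[17:20].strip(), xyz

def epitope_residues(pdb_lines, antigen_chains, antibody_chains, cutoff=5.0):
    """Antigen residues with at least one atom within `cutoff` Å of the antibody."""
    atoms = list(parse_atoms(pdb_lines))
    antibody_xyz = [xyz for chain, _, _, xyz in atoms if chain in antibody_chains]
    epitope = set()
    for chain, resseq, resname, xyz in atoms:
        residue = (chain, resseq, resname)
        if chain not in antigen_chains or residue in epitope:
            continue
        # Brute force: compare this atom with every antibody atom.
        if any(math.dist(xyz, ab) <= cutoff for ab in antibody_xyz):
            epitope.add(residue)
    return sorted(epitope)
```

For 4FQI, a call could look like epitope_residues(lines, {"A", "B"}, {"H", "L"}), using the chain IDs from the COMPND records. Splitting the result by chain gives the head versus stem counts asked for in Q2.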
Q2. Provide screenshots of the epitopes for the different protein complexes and the sequence panel with chain letters visible. How many epitope residues are there in head vs stem complexes? Are they mostly loops, helices, or strands (qualitatively)?
Q3. Can you identify which residues are part of the epitope? Is the epitope linear? Does it have a linear core? Are the two complexes targeting the same chain?
Make antigen-only files
Using PyMOL, identify the antibody in the structure, select it and remove it (Action -> remove atoms). Save the resulting molecule as antigen.pdb. If the structure contains several antigen chains that are not part of the epitope, you can remove them as well.
You can save from the GUI (remember to select PDB as the format) or by typing "save antigen.pdb" on the command line.
You can also save the fasta sequence of the antigen with the following command:
save /path/to/folder/XXXX.fasta, all, format=fasta (remember to change the path and to replace XXXX with the name of the PDB complex)
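As an alternative to the PyMOL export, the per-chain sequences can be read from the ATOM records of antigen.pdb. The sketch below assumes only the 20 standard amino acids (anything else becomes X) and, like any coordinate-derived sequence, it contains only residues that are resolved in the structure. The helper names chain_sequences and to_fasta are made up for this example.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

def chain_sequences(pdb_lines):
    """Return {chain ID: one-letter sequence} from the ATOM records."""
    sequences = {}
    seen = set()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        chain = line[21]
        # Residue number plus insertion code identifies a residue in a chain.
        key = (chain, line[22:27])
        if key in seen:
            continue  # later atoms (or alternate locations) of the same residue
        seen.add(key)
        sequences[chain] = sequences.get(chain, "") + THREE_TO_ONE.get(line[17:20], "X")
    return sequences

def to_fasta(sequences, name):
    """Format the sequences as FASTA text, one record per chain."""
    return "".join(f">{name}_{chain}\n{seq}\n" for chain, seq in sequences.items())
```

Writing the output of to_fasta to a file gives one record per chain, which is convenient when the head (HA1) and stem (HA2) have to be submitted separately.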
Part B. Sequence-based prediction (BepiPred-2.0 & 3.0)
Go to the BepiPred-2.0 website and upload the sequences of the stem and the head of hemagglutinin. Leave the default parameters and submit the job. Don't close the window because we will need it again!
Q4. Do the BepiPred predictions overlap with the actual epitopes?
Be careful: PDB residue numbering does not always start from 1! Compare the numbering in the PDB file with the one returned by BepiPred.
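One way to handle the numbering mismatch is to build a lookup from the predictor's sequence position to the PDB residue label. The sketch below assumes that the predictor counts positions from 1 and that the submitted sequence is exactly the one derived from the given chain of the PDB file, in file order; both function names are hypothetical.

```python
def pdb_numbering(pdb_lines, chain):
    """Residue labels (number plus insertion code) of one chain, in file order.

    Index 0 of the returned list corresponds to sequence position 1.
    """
    labels = []
    for line in pdb_lines:
        if line.startswith("ATOM") and line[21] == chain:
            label = line[22:27].strip()
            # Atoms of one residue are consecutive, so compare with the last label.
            if not labels or labels[-1] != label:
                labels.append(label)
    return labels

def to_pdb_numbers(predicted_positions, labels):
    """Translate 1-based sequence positions into PDB residue labels."""
    return [labels[position - 1] for position in predicted_positions]
```

With the labels in hand, the predicted positions can be compared directly with the residue numbers shown in the PyMOL sequence panel.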
Repeat the prediction, now using BepiPred-3.0.
Q5. Does the prediction overlap with our epitope now?
Q6. What is different in BepiPred-2.0 compared to BepiPred-3.0?
Part C. Structure-based prediction (DiscoTope-2.0 & 3.0)
Upload the antigen.pdb file to DiscoTope-2.0 and submit the prediction. Save the output as discotope2.pdb.
Once again, the prediction is stored in the B-factor field (but you can download it in other formats as well). Load it in PyMOL and colour it according to the B-factor (for example with spectrum b).
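The visual comparison can be backed up with numbers. The sketch below reads the per-residue score from the B-factor column of the output file and measures how well a set of predicted residues matches the epitope residues from Part A. It assumes every atom of a residue carries the same score, so the first atom is used; the score cutoff for calling a residue "predicted" is up to you and is not taken from the tool. The function names are illustrative.

```python
def residue_scores(pdb_lines, chain):
    """Return {residue label: B-factor value} for one chain.

    Assumes the predictor wrote one score per residue into the
    B-factor column (columns 61-66); the first atom of each residue is used.
    """
    scores = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[21] == chain:
            scores.setdefault(line[22:27].strip(), float(line[60:66]))
    return scores

def overlap(predicted, epitope):
    """Return (shared residues, precision, recall) for two residue collections."""
    predicted, epitope = set(predicted), set(epitope)
    hits = predicted & epitope
    precision = len(hits) / len(predicted) if predicted else 0.0
    recall = len(hits) / len(epitope) if epitope else 0.0
    return sorted(hits), precision, recall
```

A typical use would be to collect the residues whose score is above a chosen cutoff, pass them to overlap together with the Part A epitope labels for the same chain, and repeat for each predictor to support the answers to Q7 and Q8.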
Q7. Does the prediction overlap with our epitope?
Q8. Which prediction best matches our epitope?
Q9. Use DiscoTope to predict the epitopes in the simian HIV gp120 homolog (PDB code 3FUS). If you want to save time, use this pre-cleaned PDB. Compare the results and try to explain the differences. In your opinion, is it a meaningful result or not? Please note that some bugs might occur, depending on which file you use. If some strange results appear, report them in your answer :)
Q10. Compare the results obtained with DiscoTope-2.0 and DiscoTope-3.0.
Part D. Antigenic drift vs conservation
Compare another HA. Fetch an HA from a different subtype/strain (e.g., another H5N1, an H1N1, or the C05-bound head example). For convenience, you can use 4FQI as the reference.
Align HA chains (align) and map your epitope residues from the first antigen onto the second by position.
Q11. Which epitope class (head vs stem) is more conserved by sequence identity at the interface?
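To put a number on the conservation, the identity at the epitope positions can be computed from a pairwise sequence alignment of the two HA chains. The sketch below assumes you already have the two aligned sequences as equal-length strings with "-" for gaps (from any alignment tool of your choice) and that the epitope positions are 1-based indices into the ungapped reference sequence. The function name epitope_identity is made up for this example.

```python
def epitope_identity(aligned_ref, aligned_other, epitope_positions):
    """Fraction of reference epitope positions that are identical in the other HA.

    aligned_ref, aligned_other: aligned sequences of equal length, '-' for gaps.
    epitope_positions: 1-based positions in the ungapped reference sequence.
    A gap in the other sequence counts as a mismatch.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have the same length")
    wanted = set(epitope_positions)
    position = 0  # running position in the ungapped reference
    identical = 0
    counted = 0
    for ref_residue, other_residue in zip(aligned_ref, aligned_other):
        if ref_residue == "-":
            continue  # insertion in the other sequence, no reference position
        position += 1
        if position in wanted:
            counted += 1
            if ref_residue == other_residue:
                identical += 1
    return identical / counted if counted else 0.0
```

Running it once with the head epitope positions and once with the stem epitope positions gives two fractions that can be compared directly.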
Part E. Now suppose that we have found a good neutralising antibody against influenza, isolated it, and obtained its sequence.
Q12. Are there any tools to help us predict where the antibody will bind on the antigen?