Protein Structure

Latest revision as of 10:33, 7 October 2025

By Carolina Barra Quaglia

Overview

In this exercise you will learn how to

  • Search different protein databases to obtain protein structures.
  • Critically choose the best structure when more than one is available.
  • Visualize a protein structure using PyMOL.
  • Highlight features of interest and perform a basic alignment.

This exercise is written a bit differently. Instead of a stepwise exercise, we are going to try a different method called inquiry-based learning, in which students explore real-world problems and questions, taking ownership of their learning by asking questions, conducting research, and developing their own solutions and understanding.

Therefore, today we will start with a biological question we want to answer:

Why do only plant profilins trigger an allergic response despite their high conservation across species? Why are we not allergic to mammalian profilins, for example?

A bit of background

The question of today’s exercise is about allergens.

Allergens are fascinating because they aren’t random proteins; they cluster into certain protein families that repeatedly trigger allergic sensitization in humans.

One particularly interesting allergen protein family is the profilins. Here is why:

They are highly conserved across species.

Profilins are small actin-binding proteins found in almost all eukaryotic cells (plants, animals, fungi).

Despite their ubiquity and structural conservation, humans typically only become allergic to plant profilins (e.g., from birch pollen, grass pollen, the rubber tree, or certain fruits/vegetables).

⇒ Today we seek to understand: why do only plant profilins trigger an allergic response despite their high conservation across species?

Figure 1. The allergen response.

Getting started

We know that protein function is closely tied to 3D structure, so in order to understand what differs between profilins from plants and mammals, we are going to gather some experimentally determined structures of plant and mammalian proteins.

⇒ First, go to the Protein Data Bank (PDB) at https://www.rcsb.org/


Figure 2. The Protein Data Bank.


We are looking for an antibody of class E (IgE, the class involved in allergy) in complex with a plant allergen (in this case from Hevea brasiliensis, the rubber tree).

⇒ Search for IgE AND Hev b 8

We are looking for a protein complex, so dismiss all search results that do not contain a complex including both the IgE and Hev b 8.


Question 1. Look at the resolution (in Å) reported for the X-ray diffraction method. Which of the complexes is of better quality, and why?
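
When several candidate entries come up, it can help to tabulate the reported resolution for each; for X-ray structures, a smaller value in Å means that finer detail was resolved. The following is only a minimal Python sketch of such a comparison. The entry IDs and resolution values are made-up placeholders, not real search results, and the function name is hypothetical.

```python
# Hypothetical search results: IDs and values are placeholders,
# not real PDB entries.
candidates = [
    {"pdb_id": "XXX1", "resolution_A": 2.8},
    {"pdb_id": "XXX2", "resolution_A": 1.9},
    {"pdb_id": "XXX3", "resolution_A": 3.2},
]


def rank_by_resolution(entries):
    """Return entries sorted from the smallest Å value to the largest."""
    return sorted(entries, key=lambda e: e["resolution_A"])


ranked = rank_by_resolution(candidates)
best = ranked[0]  # entry with the smallest resolution value
print(best["pdb_id"], best["resolution_A"])
```

Resolution is only one quality indicator, so treat a ranking like this as a starting point for your justification rather than the full answer.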


Visualization & PyMOL

You can visualize the structure directly at the PDB website using a browser-based viewer (buttons are found below the structure image), but we will use the viewer PyMOL for our purposes. It is an excellent viewer that can also be used to prepare publication-quality images of protein structures, and it is a very valuable tool when working with protein structures.

⇒ If you have not already done so, download PyMOL from its website and install it on your computer.

The program has three panels:

  • The "viewer panel" where the molecule will be displayed
  • The "objects panel" on the right, with a list of all your objects and with the pull-down menus to show (S) or hide (H) elements
  • The "command-line panel" at the bottom, where you can type commands

Figure 3. The PyMOL Panels.


Important Note: In this exercise, you will type commands at the command line and get output in PyMOL. Please remember to include your commands in your answers along with the screenshots.

⇒ Now select the best complex and load it into PyMOL using the fetch command.

(If you cannot solve Question 1, use this command to continue the exercise: fetch 7SBG)

You can toggle the object on and off by clicking on its name. Try this. To the right of the object name, there are five buttons: A(ction), S(how), H(ide), L(abel) and C(olor).

⇒ Remove the water molecules and cofactors (hetatm) used for making the crystal using the command remove (these are not part of the protein complex). If you show the sequence, you will notice that each chain has a name: H and L for the antibody (heavy and light chains) and C for the allergen.
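
To see what removing hetatm means at the file level, here is a small Python sketch that drops every HETATM record from PDB-format text and keeps the ATOM records. It only illustrates the idea and is not how PyMOL implements remove; the four records are toy lines with invented coordinates, and the function name is hypothetical.

```python
# Toy PDB-format records; coordinates and residue numbers are invented.
pdb_text = """\
ATOM      1  N   SER C   2       8.646  26.448  43.030  1.00 20.04           N
ATOM      2  CA  SER C   2       8.423  27.866  43.346  1.00 18.87           C
HETATM    3  O   HOH C 201      12.000  30.000  40.000  1.00 30.00           O
HETATM    4  S   SO4 C 301      15.000  22.000  41.000  1.00 25.00           S
"""


def strip_hetatm(text):
    """Keep every line except HETATM records (waters, ions, ligands)."""
    return [line for line in text.splitlines() if not line.startswith("HETATM")]


protein_only = strip_hetatm(pdb_text)
for line in protein_only:
    print(line)
```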

⇒ Colour the different chains of the antibody (IgE) in yellow and blue and the allergen (Hev b 8) in green.


Hint: To colour chain A red, use the color command on the command line: color red, chain A


Question 2. Provide a screenshot of the complex.


⇒ Let’s look at the interaction surface now. Using the select command, create a selection called “antibody” that contains both antibody chains (chain H+L), and show it as a surface.


Question 3. Provide a screenshot of the interaction surface. Which secondary structure elements of the allergen is the antibody interacting with: alpha helices, beta sheets or loops?


⇒ Now we want to colour the specific residues of the allergen that are in close contact with the antibody (within a distance of 4 Å). We will call this selection the “epitope”. To define the distance, we can use the selection operator around.

To select the full residues, and not just the allergen atoms that are in direct contact, we have to use a trick and select by residue using the operator br. (short for byres).

Hint: select ZZ, br.(XX around 6) selects all residues within 6 Å of XX and names the new selection ZZ.
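
The two-step logic behind the hint (first find atoms near a selection, then expand them to whole residues) can be sketched in plain Python. This is a conceptual illustration under stated assumptions, not PyMOL's implementation: the atoms and coordinates are invented, and the helper names around and by_residue are hypothetical stand-ins for the PyMOL operators. As in PyMOL, the around step here leaves out the atoms of the source selection itself.

```python
import math

# Toy atoms: (chain, residue number, atom name, x, y, z).
# All values are invented for illustration.
atoms = [
    ("H", 50, "CA", 0.0, 0.0, 0.0),
    ("H", 50, "CB", 1.5, 0.0, 0.0),
    ("C", 10, "CA", 3.0, 0.0, 0.0),   # close to the antibody
    ("C", 10, "CB", 9.0, 0.0, 0.0),   # same residue, but farther away
    ("C", 11, "CA", 20.0, 0.0, 0.0),  # far away
]


def around(all_atoms, source, cutoff):
    """Atoms not in `source` that lie within `cutoff` Å of any source atom."""
    return [
        a for a in all_atoms
        if a not in source
        and any(math.dist(a[3:], s[3:]) <= cutoff for s in source)
    ]


def by_residue(all_atoms, picked):
    """Expand picked atoms to all atoms of the residues they belong to."""
    residues = {(a[0], a[1]) for a in picked}
    return [a for a in all_atoms if (a[0], a[1]) in residues]


antibody = [a for a in atoms if a[0] == "H"]
near_atoms = around(atoms, antibody, 4.0)      # only the CA of residue 10
near_residues = by_residue(atoms, near_atoms)  # CA and CB of residue 10
print(near_atoms)
print(near_residues)
```

Note how the distance step alone picks up a single atom, while the by-residue step brings in the rest of that residue, which is what makes whole residues light up when you colour the selection.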


Question 4. Now colour the epitope in red and provide a screenshot. How many residues can you count that are in close contact with the IgE antibody?

Structure comparisons

Now that we have a clear idea of where the epitope of the allergen is, we would like to compare the allergen with a closely related mammalian profilin, for example from cow, to see what the differences are.


Question 5. For that purpose, we need to search for a sequence homologous to the one in our crystal structure. Can you guess which software we will use for that?


Yes, you have guessed correctly! To save you time on the web server, we have performed the search for you and found that the UniProt ID of the cow profilin is Q2NKT1.

⇒ Go to the UniProt site and check the available structures.

You will notice that no experimental structures have been determined for this cow profilin; however, there are some predictions and models.


Question 6. Can you explain the main differences between crystal structures, the predicted structure and the structure modelled with SWISS-MODEL? Which one do you think is more reliable? You are encouraged to use a chatbot, such as ChatGPT, to ask for hints, but you will have to justify your choice.


⇒ Pick one and download it.

⇒ Now open the structure in PyMOL, in the same window as the previous complex. You can use the fetch command on the command line or go to File > Open... and select the profilin structure you have just downloaded.

You will see that the new profilin appears in a random position on the screen.

Under the "objects panel" on the right you will see some quick buttons, including "3-Button Viewing", a magic wand, SEQ, and a camera to record videos. Make sure the SEQ button is activated to show the sequence.

Figure 4. The PyMOL Structure of 9NUE. A subsection of Chain A is displayed in grey.


⇒ Scroll to where you can see the sequence name. What name does the new profilin have in PyMOL?

We will now align the two profilins to see where they differ, which may explain why the cow profilin does not cause allergy.


To align the two chains, we can use the command align. The align command takes the two selections to align, separated by a comma; the first is moved onto the second.
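
Once two sequences are aligned, finding where they differ is a column-by-column comparison. The Python sketch below shows that idea on two made-up aligned fragments, then checks which differing columns fall in a set of epitope positions. The fragments and the epitope columns are hypothetical placeholders, not real profilin data, and the function name is an assumption for illustration.

```python
def differing_positions(seq_a, seq_b):
    """Return 1-based columns where two pre-aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1) if a != b]


# Made-up aligned fragments ('-' marks a gap); not real profilin sequences.
plant_fragment = "ACDEFGHIK-LM"
cow_fragment = "ACNEFGH-KQLM"

# Hypothetical epitope columns, standing in for what you find in Question 4.
epitope_columns = {3, 4, 10}

diffs = differing_positions(plant_fragment, cow_fragment)
in_epitope = [p for p in diffs if p in epitope_columns]
print(diffs)
print(in_epitope)
```

Differences that land inside the epitope are the natural candidates to look at when you compare the two structures in PyMOL.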


Question 7. Provide a screenshot of the alignment. Can you see the differences between the two profilins?


Question 8. Propose a hypothesis for why the cow profilin does not cross-react with the IgE antibody that causes allergy to Hev b 8.


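PyMOL useful links

  • PyMOL home: http://www.pymol.org
  • A cool PyMOL user guide: https://www.compchems.com/pymol-selection-tool/
  • PyMOL manual: http://pymol.sourceforge.net/newman/userman.pdf
  • PyMOL Wiki: http://www.pymolwiki.org/index.php/Main_Page
  • PyMOL settings (documented): http://pymolwiki.org/index.php/Settings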