Protein Drug Deimmunization

Protein Drug Deimmunization - Exercise

Over the past few decades, biologics have gone from being experimental treatment options to standard care. They are very targeted and useful molecules in health care, but they come with their own immunological challenges. If a biologic is recognized as foreign by the immune system, Anti-Drug Antibodies can be formed that neutralize the biologic and render it useless. Deimmunization provides a potential solution to this challenge.

The general strategy of deimmunization is to remove T-cell epitopes by mutation, thereby reducing the immunogenicity of the biologic. Epitopes can be identified experimentally, but in silico prediction makes the process more economically tractable.

Here we will go through a short exercise on in silico deimmunization of human Erythropoietin (EPO). EPO is a protein that signals for the production of red blood cells in response to hypoxia. It is used in the treatment of anemia, but is perhaps better known for its use in blood doping. We will use the IEDB to get epitope data on EPO, UniProt to get its sequence, and the IEDB again for its CD4 epitope prediction and deimmunization tools.

The goal of today’s exercise is to get you acquainted with existing CD4 epitope prediction and deimmunization tools. You will handle and process data, and question some technical aspects of the tools and results.

Data Gathering

Start by going to the IEDB homepage and finding T cell epitope results for human erythropoietin. If nothing comes up, try reformulating your query. The search should return 19 epitopes from a single reference. In this publication, EPO was tested for immunogenicity using 15-mers that overlap by 10 residues and cover the entire sequence. Under the antigen tab, navigate to the immunome browser to get an overview of the results.

Q1: Looking at the T-cell epitope response frequency profile of EPO, in what region/regions of the protein is immunogenicity mostly found?

Q2: Again, looking at the response frequency profile, an irregular pattern is observed for the first 27 amino acids. The table of results below the profile shows that no epitope assays were performed on this region. Why do you think that is? Hint: EPO is an extracellular protein.

Export the immunome browser results to a .csv file.

The answer to question 2 can be found on UniProt, where we will also get the sequence for human EPO. Go to UniProt and search for Erythropoietin. Select the human entry and download the .fasta file with the EPO sequence (click the 'format' tab at the top of the page and select fasta). On the UniProt page for human EPO, you can learn that EPO is a glycoprotein.

Q3: Can you find any other information on the EPO UniProt page that could explain the irregular response frequency pattern at the N-terminus of EPO (as seen in Question 2)?

This N-terminal sequence will interfere with our prediction, so remove the first 27 residues from your EPO fasta and save it as e.g. EPO-processed.fasta.
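The trimming step can also be scripted. The following is a minimal sketch using only the Python standard library; the function names (`read_fasta`, `trim_n_terminus`, `to_fasta`) are hypothetical helpers written for this exercise, and the sequence in the demo is a made-up toy string, not the real EPO sequence.

```python
def read_fasta(text):
    """Return (header, sequence) for the first record in FASTA-formatted text."""
    header = None
    seq_parts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # only the first record is needed here
            header = line[1:]
        elif header is not None:
            seq_parts.append(line)
    return header, "".join(seq_parts)


def trim_n_terminus(sequence, n_residues=27):
    """Drop the first n_residues residues (27 in this exercise)."""
    if n_residues > len(sequence):
        raise ValueError("cannot trim more residues than the sequence has")
    return sequence[n_residues:]


def to_fasta(header, sequence, width=60):
    """Format a record as FASTA text with fixed-width sequence lines."""
    lines = [">" + header]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"


# Toy demonstration (made-up sequence, NOT the real EPO sequence).
toy_text = ">toy_protein\n" + "A" * 27 + "\nCDEFGHIK\n"
toy_header, toy_seq = read_fasta(toy_text)
toy_processed = trim_n_terminus(toy_seq, 27)
print(to_fasta(toy_header + " processed", toy_processed))
```

To use it on the real file, read the downloaded fasta with `open(...).read()`, pass the text through the three functions, and write the result to EPO-processed.fasta.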

CD4 Epitope Prediction

To deimmunize the EPO protein we will use two prediction tools available on the IEDB: "CD4 T-cell immunogenicity prediction" and "Deimmunization". On the IEDB home page, under "Analysis Resource" select "T Cell Epitope Prediction". You should see links to the two tools with a paragraph explaining how they work. Read this carefully!

To gain confidence in the validity of the predictors, let us plot the predicted immunogenicity together with the experimentally measured response frequency.

Using the "CD4 T-cell immunogenicity prediction" tool, upload your processed EPO fasta file and run the prediction with the following parameters: "Prediction method": "IEDB recommended (combined)" and "Select maximum percentile rank threshold": "Show all peptides". This will start a prediction on 15-mers, overlapping by 10 residues, from the EPO protein and output results from two prediction methods (and their combined score). The prediction takes a few seconds; once it is done, save the resulting .csv file.
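To make the peptide layout concrete, here is a small sketch of how 15-mers overlapping by 10 residues (i.e. a step of 5) can be enumerated. It is an illustration only, not the IEDB implementation; the function name `overlapping_peptides` is hypothetical, and how the actual tool treats residues left over at the C-terminus is not assumed here.

```python
def overlapping_peptides(sequence, length=15, overlap=10):
    """Return (start, end, peptide) tuples with 1-based, inclusive positions."""
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than peptide length")
    peptides = []
    # Only full-length windows are produced; a shorter C-terminal remainder is skipped.
    for start in range(0, len(sequence) - length + 1, step):
        peptides.append((start + 1, start + length, sequence[start:start + length]))
    return peptides


# Toy 25-residue sequence (made up for illustration).
toy_sequence = "ACDEFGHIKLMNPQRSTVWYACDEF"
toy_peptides = overlapping_peptides(toy_sequence)
for start, end, peptide in toy_peptides:
    print(start, end, peptide)
```

Each window starts 5 residues after the previous one, so consecutive peptides share 10 residues.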

Now we have two .csv files to compare, one with experimental data and the other with predictions. Import these with your program of choice (Python, R, Excel,...) so that you can manipulate, compare and plot the data.

Q4: Compare the "Mapped Position" column in the experimental result table to the "Start" and "End" columns in the prediction result table. Why are they different? Create a new "Start_mod" column in the prediction table so that it aligns with the first value in the "Mapped Position" column.

Q5: The two tables have a different number of peptides. Why do you think that is? Hint: look at the newly created "Start_mod" column in the prediction result table.

Remove the rows with peptides that are not shared between the tables.
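The alignment and filtering steps can be sketched with the standard library as follows. Several things here are assumptions: that the offset between the two numbering schemes equals the 27 residues removed earlier, that "Mapped Position" holds values of the form "start-end", and that the exported files use exactly the column names quoted in this exercise. The helper names and the toy tables are made up; check them against your own .csv files.

```python
import csv
import io

TRIMMED_RESIDUES = 27  # residues removed from the fasta earlier (assumed offset)


def add_start_mod(pred_rows, offset=TRIMMED_RESIDUES):
    """Return copies of the prediction rows with an integer 'Start_mod' column."""
    out = []
    for row in pred_rows:
        new_row = dict(row)
        new_row["Start_mod"] = int(row["Start"]) + offset
        out.append(new_row)
    return out


def mapped_start(exp_row):
    """First value of 'Mapped Position'. ASSUMPTION: the format is 'start-end'."""
    return int(exp_row["Mapped Position"].split("-")[0])


def shared_rows(exp_rows, pred_rows):
    """Keep only peptides present in both tables, merged into one row each."""
    exp_by_start = {mapped_start(r): r for r in exp_rows}
    merged = []
    for pred in pred_rows:
        exp = exp_by_start.get(pred["Start_mod"])
        if exp is not None:
            merged.append({**exp, **pred})
    return merged


# Toy tables with made-up values, standing in for the two exported .csv files.
exp_csv = "Mapped Position,Response Freq.\n28-42,0.10\n33-47,0.00\n"
pred_csv = "Start,End,Combined score\n1,15,40.0\n6,20,75.0\n11,25,20.0\n"

exp_rows = list(csv.DictReader(io.StringIO(exp_csv)))
pred_rows = list(csv.DictReader(io.StringIO(pred_csv)))
merged = shared_rows(exp_rows, add_start_mod(pred_rows))
for row in merged:
    print(row["Start_mod"], row["Response Freq."], row["Combined score"])
```

With real data, replace the `io.StringIO(...)` objects with opened files, e.g. `open("predictions.csv", newline="")` (a hypothetical file name).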

Now, the prediction scores and the response frequency are on different scales, so we must process them to make them comparable. In the prediction score table, the "Combined score" column shows the weighted score of two models. It is expressed as a rank, which can be interpreted as "where does the prediction score rank among the prediction scores of random peptides", i.e. it is the percentage of random peptides with a lower prediction score. That means that a high rank indicates low predicted immunogenicity, which is the reverse of the response frequency. Create a new column (e.g. "Combined score mod") in which the "Combined score" becomes comparable to the response frequency.

Q6: What mapping function would you use to make the "Combined score" fall in the same range (and directionality) as the response frequency?
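As a starting point for your own answer, one possible mapping is a simple linear flip. The sketch below assumes that the rank runs from 0 to 100 and that the response frequency runs from 0 to 1; verify both ranges in your tables before using it. The function name is hypothetical.

```python
def rank_to_frequency_scale(rank, max_rank=100.0):
    """Map a percentile rank (0 = most immunogenic) onto a 0-1 scale (1 = most immunogenic)."""
    if not 0.0 <= rank <= max_rank:
        raise ValueError("rank outside the expected range")
    return 1.0 - rank / max_rank


# Made-up ranks for illustration.
toy_ranks = [0.0, 25.0, 100.0]
toy_mod = [rank_to_frequency_scale(r) for r in toy_ranks]
print(toy_mod)
```

Other monotonically decreasing mappings are possible; a linear one keeps the relative spacing of the ranks unchanged.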

Now create a line plot and a scatter plot comparing the "Combined score mod" and "Response Freq." columns.

Q7: Does the predicted immunogenicity align with the measured epitopes?

Q8: What is the Pearson correlation coefficient between "Combined score mod" and "Response Freq."?
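If your program of choice has no built-in correlation function, the Pearson coefficient can be computed directly from its definition. This is a minimal standard-library sketch; the function name `pearson_r` and the numbers in the demo are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two data points are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up values standing in for "Combined score mod" and "Response Freq.".
toy_pred = [0.2, 0.5, 0.9, 0.4]
toy_freq = [0.0, 0.1, 0.3, 0.1]
print(round(pearson_r(toy_pred, toy_freq), 3))
```

Pass in the two columns of your merged table as lists of floats, in the same peptide order.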

Deimmunization

Now that we have some confidence in the prediction tools, let us deimmunize EPO.

Open the "Deimmunization" tool on the IEDB and input your processed EPO .fasta file. Run the program with default parameters. This will predict and present the most immunogenic peptides in the input sequence. Select the top 5 peptides, enter a job name and your e-mail to receive your results when they are ready. Start running the job.

This takes quite a lot longer than performing a CD4 epitope prediction of the full EPO sequence, despite the same CD4 epitope prediction tools being used.

Q9: Why does a deimmunization prediction for just a few epitopes take longer than a CD4 epitope prediction for the full EPO sequence?

Because the job takes quite a while, I have uploaded the results of the deimmunization. Download the results and inspect them. Sort the table by drop in immunogenicity to find the most influential mutations. Have a look at the suggested amino acid substitutions.

Q10: Using the BLOSUM substitution matrix as a guide, do you expect these top suggested mutations to be structurally/functionally conservative?

Select an influential mutation (perhaps even several) and create a new EPO .fasta file where you substitute the wild type variant with the less immunogenic mutant. Input this fasta into the CD4 epitope prediction tool and run the prediction with the same parameters as in the "CD4 Epitope Prediction" section. Download the results when ready.
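Editing a sequence by hand is error-prone, so a small helper that checks the wild type residue before substituting can be useful. This is a sketch with a hypothetical function name; it assumes 1-based positions that refer to the sequence you pass in (here the processed sequence, not the full precursor), so confirm which numbering the deimmunization results use. The demo sequence and mutation are made up.

```python
def apply_substitution(sequence, position, wild_type, mutant):
    """Replace the residue at a 1-based position, verifying the wild type first."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise IndexError("position lies outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            "expected %s at position %d but found %s"
            % (wild_type, position, sequence[index])
        )
    return sequence[:index] + mutant + sequence[index + 1:]


# Toy example (made-up sequence and mutation, not an EPO result).
toy_wild_type = "ACDEFGHIK"
toy_mutant = apply_substitution(toy_wild_type, 3, "D", "E")
print(toy_mutant)
```

Several mutations can be combined by calling the function repeatedly on the returned sequence, since substitutions do not shift positions.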

Now, after going through the same processing steps as in the "CD4 Epitope Prediction" section, plot the predicted immunogenicity for your mutant together with the predicted immunogenicity of the wild type.

Q11: Do you notice a drop in predicted immunogenicity?

You now have tools to improve biologics!

Done!
