Ancient DNA exercise answers

Revision as of 16:36, 19 March 2024 by WikiSysop

Q1

The read length is about 100 bp, but the actual insert size is unknown.

Q2

Very low, less than 1%.


Q3

About 40bp.

Q4

About 25%.

Q5

As and Gs

Q6

The sample indeed looks ancient. If we did not see DNA fragmentation or damage, it could indicate present-day human contamination.

Q7

wc -l world.fam
wc -l world.bim

297 samples and 587772 SNPs.

Q8

cut -f2 world.sampleInfo.txt | tail -n +2 | sort | uniq -c | sort -rn
     70 Yoruba
     33 Han
     29 Basque
     27 Sardinian
     25 French
     20 Hungarian
     20 Greek
     19 Bedouin2
     17 Adygei
     10 Lithuanian
     10 Armenian
      8 Tuscan
      1 UstIshim
      1 Stuttgart
      1 Samara
      1 NE1
      1 MA1
      1 Loschbour
      1 Karelia
      1 Iceman
      1 Brana

Q9

You should get the same counts as in Q7:

plink --bfile world --missing --out world
587772 variants loaded from .bim file.
297 people (0 males, 0 females, 297 ambiguous) loaded from .fam.


Q10

grep -i ice world.imiss
     Iceman          Iceman          Y    11873   587772   0.0202

The last column (F_MISS) is the fraction of missing genotypes, so about 2%.
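
The missing rate can also be recomputed by hand from the two counts on that line. This is a minimal sketch, assuming the standard PLINK 1.9 .imiss column order (FID, IID, MISS_PHENO, N_MISS, N_GENO, F_MISS); the `line` variable simply holds the Iceman row shown above.

```shell
# Iceman row copied from the world.imiss output above.
line='Iceman Iceman Y 11873 587772 0.0202'

# Assumed column order: FID IID MISS_PHENO N_MISS N_GENO F_MISS.
# Recompute the missing fraction as N_MISS / N_GENO, then as a percentage.
rate=$(echo "$line" | LC_ALL=C awk '{ printf "%.4f", $4 / $5 }')
percent=$(echo "$line" | LC_ALL=C awk '{ printf "%.1f", 100 * $4 / $5 }')

echo "F_MISS = $rate ($percent%)"
```

The recomputed fraction matches the F_MISS value that PLINK reports for the Iceman.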

Q11

zcat RISE507.pileup.gz | wc -l
102014

Q12

Using:

 plink --bfile world --bmerge RISE507 --out RISE507.merge

should result in:

Error: 253 variants with 3+ alleles present.

This error is normally caused by tri-allelic sites, which should be very rare. In our case there are many (253), most likely because DNA damage creates spurious alleles.
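
To illustrate what "3+ alleles" means, here is a rough sketch on toy data. It is not PLINK's actual merge logic: it ignores strand flips, and the two .bim-style files, their SNP ids and their alleles are all made up for the example. It only assumes the usual six .bim columns (chromosome, SNP id, cM, bp, allele 1, allele 2) and treats `0` as a missing allele.

```shell
# Toy .bim-style files; every SNP id and allele here is hypothetical.
tmpdir=$(mktemp -d)
printf '1\trs1\t0\t100\tA\tG\n1\trs2\t0\t200\tC\tA\n1\trs3\t0\t300\tC\tT\n' > "$tmpdir/ref.bim"
printf '1\trs1\t0\t100\tA\tG\n1\trs2\t0\t200\tC\tT\n1\trs3\t0\t300\t0\tT\n' > "$tmpdir/anc.bim"

# For each SNP id present in both files, count the distinct non-missing
# alleles across the two files. More than two distinct alleles means the
# merged site would be tri-allelic.
triallelic=$(awk '
    NR == FNR { a1[$2] = $5; a2[$2] = $6; next }   # first file: store alleles
    $2 in a1 {
        split("", seen); n = 0                      # reset per-SNP state
        m = split(a1[$2] " " a2[$2] " " $5 " " $6, alleles, " ")
        for (i = 1; i <= m; i++)
            if (alleles[i] != "0" && !(alleles[i] in seen)) {
                seen[alleles[i]] = 1
                n++
            }
        if (n > 2) print $2
    }' "$tmpdir/ref.bim" "$tmpdir/anc.bim")

echo "$triallelic"
rm -r "$tmpdir"
```

In this toy example only rs2 is reported: the reference panel has C/A while the second file has C/T, so the merged site would carry three alleles (C, A, T). A spurious T of this kind is what a damaged C can look like in an ancient sample.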

Q13

The Yoruba.

Q14

The Han.

Q15

The Adygei.

Q16

The Sardinians.

Q17

Ust-Ishim and the Mal'ta–Buret' boy (MA1).

Several explanations are possible:

  1. The ancient individuals fall completely outside the range of genomic diversity of present-day humans, i.e. they belonged to isolated populations that potentially died off.
  2. These individuals had mixed ancestry.
  3. Their genotypes contain numerous errors due to damage.


Q18

The RISE507 sample from the Afanasievo culture.

Q19

Our individual is now a bit outside of the Hungarian / French cluster.

Q20

Allentoft et al. 2015 found that individuals from the Afanasievo culture were genetically indistinguishable from those of the Yamnaya culture, which is closely related to the Western Steppe Herders, one of the major genetic contributors to present-day Europeans.