Tuesday 27 September 2022

Haplogroups of the Swat Indians

 Here's a list of all the haplogroups found in the SPGT samples, date and culture specific. 


Now, keep in mind that R1a, I2a and Q1a are the likely Steppe haplogroups here. Out of a total of 69 samples, we have 7 R1a, 2 I2a and 1 Q1a. R1a is the 4th most common lineage. However, a lot of this R1a is from the medieval period, so if we remove those samples we get 4/63 R1a + 2/63 I2a + 1/63 Q1a. That makes 7 Steppe haplogroups out of 63, so about 11% of the SPGT Y-DNA likely comes from Steppe MLBA. The presence of I2a is interesting, since this is a European hunter-gatherer haplogroup, which almost guarantees that SPGT was mixed with Steppe_MLBA.

The most common lineages are L1a (13), R2a (10) and E1b1b1b2 (8) for a total of 31/63 or 49% of the SPGT samples. The most common lineage was L1a/L1 for a total of 14/63 samples or 22% of Y-DNA. 
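The percentages above are simple shares of the 63 pre-medieval samples. A minimal sketch of that arithmetic, using only the counts quoted in this post (the variable and function names are mine, purely for illustration):

```python
# Counts quoted in the post for the pre-medieval SPGT males (n = 63).
TOTAL = 63
steppe_counts = {"R1a": 4, "I2a": 2, "Q1a": 1}
top3_counts = {"L1a": 13, "R2a": 10, "E1b1b1b2": 8}


def share(count, n=TOTAL):
    """Percentage of n, rounded to one decimal place."""
    return round(100 * count / n, 1)


steppe_share = share(sum(steppe_counts.values()))  # 7 of 63
top3_share = share(sum(top3_counts.values()))      # 31 of 63

print("Steppe-linked lineages:", steppe_share, "%")
print("Top three lineages:", top3_share, "%")
```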

L1a, R2a and E1b1b are all likely Indus Valley lineages.

An Iranian from 1800 BCE

While scouring the Narasimhan supplement, I found a sample that caught my eye. It clustered on a PCA with modern Iranians and on G25 distances, it had the lowest distances to modern Iranians; however, it was from the post-BMAC site of Sappali Tepe in Uzbekistan and dated by context to ~1800 BCE.

The supplement describes it as follows.

BMAC_o2 (n=1): This individual has a significantly elevated proportion of ancestry related to Central_Steppe_MLBA. We call this individual by the BMAC_o2 analysis label and the Sappali_Tepe_BA_o split label.

• UZ-ST-010, Sappali Tepe (ST) 1975, 6, Grave 02-05 (I7493): Context date of 2000-1600 BCE. Genetically male.


Distance to: UZB_Sappali_Tepe_BA_o

0.03315719 Iranian_Zoroastrian
0.03528786 Iranian_Fars
0.04027134 Iranian_Persian_Shiraz

0.04254455 Azerbaijani_Dagestan
0.04405072 Ezid
0.04435116 Iranian_Mazandarani
0.04586352 Iranian_Lor
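For anyone new to these lists: as far as I know, a G25 "distance" is just the Euclidean distance between two sets of scaled PCA coordinates, and the list is the reference populations sorted by that distance. A minimal sketch of the idea, with made-up three-dimensional coordinates and hypothetical population names (real G25 rows have 25 values):

```python
import math


def g25_distance(a, b):
    """Euclidean distance between two coordinate vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("coordinate vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(target, references):
    """Return (distance, name) pairs sorted from closest to farthest."""
    return sorted((g25_distance(target, coords), name)
                  for name, coords in references.items())


# Made-up coordinates, for illustration only.
target = [0.10, 0.05, -0.02]
references = {
    "pop_B": [0.10, 0.09, -0.02],
    "pop_A": [0.11, 0.05, -0.02],
    "pop_C": [0.13, 0.01, -0.02],
}

ranking = rank_by_distance(target, references)
for dist, name in ranking:
    print(f"{dist:.8f} {name}")
```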

On a PCA, the Sappali Tepe outlier clusters with other Iranians as well, shifted a bit toward the higher-Steppe end of the West Iranian cline. The rest of the Sappali Tepe site is regular post-BMAC and accordingly clusters with other BMAC (Gonur1) and post-BMAC (Bustan BA) sites. So it's obvious this outlier has BMAC + Steppe ancestry. Not that out of the ordinary, right? Even TKM_IA has this mixture.

Target: UZB_Sappali_Tepe_BA_o:I7493
Distance: 4.9056% / 0.04905551
65.2 UZB_Sappali_Tepe_BA
34.8 RUS_Sintashta_MLBA

Poor distance, but where it gets interesting is when you add West Iranian farmer ancestry.  

Target: UZB_Sappali_Tepe_BA_o:I7493
Distance: 3.0157% / 0.03015733
48.2 IRN_Seh_Gabi_C
33.2 RUS_Sintashta_MLBA
18.6 UZB_Sappali_Tepe_BA

This is great, the distance improves a lot. This must mean this individual is a mix of West Iranian type farmers + early East Iranian (TKM_IA) type ancestry. Now, interestingly enough, this must mirror the actual process by which modern West Iranians came to be. Steppe pastoralists mix with Turan_BA (BMAC) type people and form a hybrid 50:50 ish population in the region surrounding modern Tajikistan. This population eventually comes down to the Iranian plateau and mixes with the native Elamite farmers to form modern West Iranians. 

So, let's model it with TKM_IA too, but first add in Indus Valley. 

Target: UZB_Sappali_Tepe_BA_o:I7493
Distance: 2.3048% / 0.02304809
57.2 IRN_Seh_Gabi_C
32.8 RUS_Sintashta_MLBA
10.0 IRN_Shahr_I_Sokhta_BA2

Modern Iranians likely received an additional wave of Levant_C or Anatolia_C ancestry that mixed with their Seh_Gabi-type population, so we might as well try it with that too.

Target: UZB_Sappali_Tepe_BA_o
Distance: 2.1811% / 0.02181055
52.2 TKM_IA
26.6 IRN_Seh_Gabi_C
12.4 Levant_ISR_C
8.8 Indus(Sis2+Gonur)

Yep, way better fits. TKM_IA likely also absorbs nearby WSHG/ANE type ancestry.
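A note on what the "Distance" in these fits measures: the tool searches for source weights (non-negative, summing to 100%) whose weighted average of coordinates lands as close as possible to the target, and reports the leftover Euclidean distance. I don't know the exact optimiser the G25 tools use, so the sketch below is only an illustration for the two-source case, where the best weight has a closed form. The coordinates are made up.

```python
import math


def fit_two_way(target, src_a, src_b):
    """Best mix of two sources: returns (weight_a, distance).

    Minimises the Euclidean distance between `target` and
    w * src_a + (1 - w) * src_b, with w clamped to [0, 1].
    """
    diff = [a - b for a, b in zip(src_a, src_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("the two sources have identical coordinates")
    # Project the target onto the line through the two sources.
    w = sum((t - b) * d for t, b, d in zip(target, src_b, diff)) / denom
    w = min(1.0, max(0.0, w))
    mix = [w * a + (1 - w) * b for a, b in zip(src_a, src_b)]
    dist = math.sqrt(sum((t - m) ** 2 for t, m in zip(target, mix)))
    return w, dist


# Made-up two-dimensional coordinates, for illustration only.
weight_a, distance = fit_two_way(
    target=[0.35, 0.10], src_a=[0.0, 0.0], src_b=[1.0, 0.0]
)
print(f"{100 * weight_a:.1f} source_a, {100 * (1 - weight_a):.1f} source_b")
print(f"Distance: {distance:.8f}")
```

Adding a source can never make the best achievable distance worse, which is why a drop in distance on its own is only weak evidence; the size of the drop is what matters.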

Can we replicate this on qpAdm? Let's try. 

Uzbekistan_SappaliTepe_BA_o
Turkmenistan_IA.SG 49.2% ± 9.4
Iran_C_SehGabi 19.8% ± 9.5
Israel_C 20.1% ± 5.6
Indus 10.9% ± 7.3
p-value:  0.300

(Outgroups: Mbuti, EEHG, WEHG (Iron Gates), WSHG, ESHG, Han, Tianyuan, TurkeyN, IranN, Levant PPN, Iberomaurusian/Natufian (optional), CHG, MA1 (optional), Kostenki14 (optional))

This is pretty close to our G25 results. If TKM_IA is itself roughly half Steppe, this would give Sappali Tepe around 25% Sintashta ancestry, making it a good source to model modern Iranians with (who have quite a bit less Sintashta).
By the way, I tried Anatolia_C and Rus_Maykop_Nosovko as sources too. Former gives quite decent fits on qpAdm and works equally well as a source, but drops TKM_IA to 40% ish. Latter is a poor source, but I decided to try it anyways as some outliers in Bronze Age Turan carry Caucasus ancestry (like the Dzharkutan outliers from Uzbekistan which are Caucasus + Steppe). 
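One rough way to read the qpAdm output above is to divide each coefficient by its standard error; a ratio of about 2 or more is a common rule of thumb for a component being distinguishable from zero. This is only a sketch using the numbers printed above, and the 50% Steppe share for Turkmenistan_IA is the assumption used in this post, not a qpAdm result:

```python
# (estimate %, standard error %) as printed in the qpAdm output above.
qpadm = {
    "Turkmenistan_IA.SG": (49.2, 9.4),
    "Iran_C_SehGabi": (19.8, 9.5),
    "Israel_C": (20.1, 5.6),
    "Indus": (10.9, 7.3),
}

# The coefficients should sum to 100%.
total = sum(est for est, _ in qpadm.values())

# Ratio of each estimate to its standard error.
ratios = {name: round(est / se, 2) for name, (est, se) in qpadm.items()}

# Assumption from the post: Turkmenistan_IA is about half Steppe.
TKM_IA_STEPPE_FRACTION = 0.5
steppe_estimate = round(qpadm["Turkmenistan_IA.SG"][0] * TKM_IA_STEPPE_FRACTION, 1)

print("Sum of coefficients:", round(total, 1))
for name, ratio in ratios.items():
    print(f"{name}: estimate/SE = {ratio}")
print("Implied Steppe %:", steppe_estimate)
```

On these numbers the Indus component is the least well constrained of the four, so the 10-11% figure should be taken loosely.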

I also tried other IranN-rich profiles such as Sarazm, Geoksyur, Parkhai and Namazga. They give much poorer fits on G25 and qpAdm. Anyway, let's just do a D-stats run to see whether the outlier really shares alleles with these western Chalcolithic sources, and to what extent.

result:  Chimp.REF Uzbekistan_SappaliTepe_BA_o Anatolia_C   Israel_C     -0.0235    -4.133     -    7424   7781 153314 

Significant negative Z score, so it looks like Anatolia_C is actually preferred as the Chalcolithic source over Israel_C.  G25 shows the same, the fit improves when we use Anatolia_C. 
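The D value in that result line can be checked by hand. I'm assuming the two counts before the SNP total are BABA and ABBA in that order, which is how I read qpDstat output; the arithmetic reproduces the reported D, so the assumption looks right. The implied standard error is just D divided by Z.

```python
def d_statistic(baba, abba):
    """D = (BABA - ABBA) / (BABA + ABBA)."""
    return (baba - abba) / (baba + abba)


# Counts and Z-score copied from the result line above.
baba, abba = 7424, 7781
z_score = -4.133

d_value = d_statistic(baba, abba)
implied_se = d_value / z_score

print("D:", round(d_value, 4))
print("Implied standard error:", round(implied_se, 4))
```

With Chimp as the outgroup, more ABBA than BABA sites means the outlier shares more derived alleles with the third population (Anatolia_C) than with the fourth (Israel_C), which is why the negative sign favours Anatolia_C.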

Regardless of the exact Chalco source, let's try and model Modern Iranians with the Sappali outlier.

Target                  Distance    IRN_Seh_Gabi_C  IRN_Shahr_I_Sokhta_BA2  Levant_ISR_C  MNG_Slab_Grave_EIA_1  UZB_Sappali_Tepe_BA_o  Yoruba
Iranian_Bandari         0.02025044  26.6            20.8                    0.8           0.0                   48.8                   3.0
Iranian_Fars            0.01639102  27.2            0.8                     7.0           2.2                   62.8                   0.0
Parsi_India             0.01486118  10.8            29.2                    13.4          0.8                   45.8                   0.0
Parsi_Pakistan          0.01436086  9.6             27.4                    12.8          0.8                   49.4                   0.0
Iranian_Lor             0.01120472  35.8            0.0                     10.8          1.2                   52.2                   0.0
Iranian_Mazandarani     0.01877939  44.8            3.4                     0.0           0.0                   51.8                   0.0
Iranian_Persian_Shiraz  0.01724956  29.0            3.6                     9.4           2.6                   55.0                   0.4
Iranian_Zoroastrian     0.01704363  22.4            0.0                     6.6           0.0                   71.0                   0.0
Average                 0.01626760  25.8            10.7                    7.6           0.9                   54.6                   0.4


Very good fits. If Sappali is around 24-27% Steppe, then the 71% Sappali share gives Iranian Zoroastrians around 17-19% Steppe, which sounds just about right. So this individual from 1800 BCE Uzbekistan is indeed quite close to a modern West Iranian.
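The Steppe figure here is a simple product: the Sappali share of each modern population times the assumed 24-27% Steppe in the Sappali outlier. A minimal sketch using the UZB_Sappali_Tepe_BA_o shares from the fits above; it assumes all Steppe ancestry in these models arrives through the Sappali component, and multiplying out gives roughly 17 to 19% for Zoroastrians:

```python
# UZB_Sappali_Tepe_BA_o share (%) of each modern population in the fits above.
sappali_share = {
    "Iranian_Bandari": 48.8,
    "Iranian_Fars": 62.8,
    "Iranian_Lor": 52.2,
    "Iranian_Mazandarani": 51.8,
    "Iranian_Persian_Shiraz": 55.0,
    "Iranian_Zoroastrian": 71.0,
    "Average": 54.6,
}


def steppe_range(share_pct, low=0.24, high=0.27):
    """Implied Steppe % if the Sappali outlier is 24-27% Steppe."""
    return round(share_pct * low, 1), round(share_pct * high, 1)


implied = {name: steppe_range(pct) for name, pct in sappali_share.items()}
for name, (low, high) in implied.items():
    print(f"{name}: {low}-{high}% Steppe")
```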

His haplogroup is Q1b2, which is an India specific branch of Q1 (found in the Swat Valley Loebanr samples) and this kinda gives more evidence for the 10-11% Indus Valley ancestry in him. 

This individual is remarkably similar to another outlier from 1500 BCE Uzbekistan, who I discussed in my substack here. However, this outlier clusters with modern Indo-Aryans and contains the 3 way Steppe + Indus + Onge admix that characterizes modern Indians. The Sappali Tepe outlier contains the 3 way mix that characterizes modern Iranians: Steppe + BMAC + Iranian chalcolithic farmer. 

Now, we cannot possibly be sure what language this outlier spoke, but my bet is that it was some form of early Iranian, which would mean the gene pools that characterize modern Indians and Iranians had already formed in Central Asia by 1800-1600 BCE.


Saturday 24 September 2022

The Genetic Echo of the Tarim Mummies in Modern Central Asians

Several Bronze Age Xinjiang individuals (e.g., Xinj_BA3, Xinj_BA4, Dzungaria_EBA1, and Dzungaria_EBA2) cluster closely with the four Tajik populations (fig. 2B), which may reflect a high genetic similarity between Tajik and Bronze Age Xinjiang populations. The Iron Age and Historical Era Central Asian and Steppe individuals are clearly separated from the four Tajik populations, except for the sporadic individuals from Xinjiang and South Asia (e.g., JEZK_IA3_oBMAC, LSH_IA2_oSte, Xinj_HE1, and Pakistan_RajaGira; fig. 2C and D).





Especially, compared with other modern Central Asian populations, the three Tajik from the Pamirs (i.e., Sarikoli Tajik, Wakhi Tajik, and Pamiri Tajik) share a higher level of genetic drift with Tarim_EMBA1 (supplementary fig. S13, Supplementary Material online). The Kyrgyz populations share great genetic drift with Neolithic, Bronze, and Iron Age populations from northern China and Mongolia (e.g., China_Wuzhuangguoliang_LN.EC, China_WLR_BA_o, China_AR_Xianbei_IA, and Mongolia_EIA_8), The Sarikoli Tajik and Pamiri Tajik can be modeled as a mixture of Russia_Andronovo, BMAC, Tarim_EMBA1, and Mongolia_Xiongnu_o1 (supplementary table S5, Supplementary Material online and fig. 4A). The results are supported by the qpWave (Reich et al. 2012) analysis that at least three separate sources are present in the Tajik populations (supplementary table S6, Supplementary Material online). The Tarim_EMBA1 is required under the four-way models in the admixture inference for the Sarikoli Tajik and Pamiri Tajik. When removing the Tarim_EMBA1, the admixture modeling failed.  For the Sarikoli Tajik and Pamiri Tajik, the admixture models unanimously failed when using any group of Russia_MLBA_Sintashta, Central_Steppe_MLBA, Russia_Afanasievo, and Russia_Samara_EBA_Yamnaya as a Steppe source. For three highland Tajik (i.e., Sarikoli Tajik, Wakhi Tajik, and Pamiri Tajik), when we replaced Russia_Andronovo or Russia_Andronovo and BMAC with Turkmenistan_IA, the models worked well (supplementary table S5, Supplementary Material online), as Turkmenistan_IA was an admixture of BMAC and Andronovo (Guarino-Vignon et al. 2022). The Dushanbe Tajik can be modeled as a mixture of Turkmenistan_IA and Mongolia_Xiongnu_o1. 
So, Pamiri Tajiks need Tarim_EMBA1 to model properly? Dushanbe seems to do fine as just TKM_IA + Xiongnu. I notice they didn't use any Indian sources at all, though we know all Tajiks have around 6-8% Andaman-type ancestry, barring Yaghnobis. (If Andaman is 6-8%, then actual Indian admixture must be a bit higher.) This admixture entered the Tajik gene pool around 1000-1200 years ago, matching the Ghaznavid slave raids into India that carried Hindu slaves to be sold in the slave bazaars of Samarkand and Tashkent. So it's interesting they've modelled Tajiks without any Indian source.

The major ancestry of Kyrgyz and Kazakh populations is from Xinj_HE3 (44.8–58.9%) and Mongolia_Xiongnu_o1 (41.1–55.2%; supplementary table S5, Supplementary Material online and fig. 4A). The Uyghur, Uzbek, and Turkmen populations are modeled as a mixture of Turkmenistan_IA (48.8–65.1%) and Mongolia_Xiongnu_o1 (34.9–51.2%). The most prevalent paternal lineage in the Kyrgyz (26/44) and Tajik (16/33) populations was haplogroup R1a1 (supplementary table S8, Supplementary Material online), which has been reported in Steppe-related populations, such as Corded Ware, Andronovo, and Sintashta. The mtDNA haplogroup C4 characterized in Tarim_EMBA1 (Zhang et al. 2021) is also found in the Kyrgyz and Tajik

Ok, so the Kyrgyz (26/44, ~59%) and Tajiks (16/33, ~48%) are both roughly half R1a1 (presumably R1a-Z93), and mtDNA related to the Tarim mummies is found in both.

Among modern Central Asian populations, the Tarim_EMBA1 was only detectable in the Sarikoli Tajik, Wakhi Tajik, and Pamiri Tajik from the Pamirs neighboring Tarim Basin (fig. 1). In the Dushanbe Tajik west of the Pamirs, as well as other Turkic-speaking populations, we failed to detect the signature of Tarim_EMBA1 ancestry. It is expected that western Tajiks in Uzbekistan from the previous study (Guarino-Vignon et al. 2022) may have no Tarim_EMBA1 ancestry. 

So, only Pamiri Tajiks (this includes Sarikoli and Wakhi, as the mountain range is the Pamirs) have detectable Tarim_EMBA-type ancestry. This argues against any such ancestry in non-Pamiris, and probably in Afghan Tajiks too. I wonder if this ancestry was partially responsible for inflating the Steppe Andronovo of Pamiri Tajiks.








Friday 23 September 2022

The genetic legacy of Zoroastrianism in Iran and India

 

Zoroastrianism is one of the oldest extant religions in the world, originating in Persia (present-day Iran) during the second millennium BCE. Historical records indicate that migrants from Persia brought Zoroastrianism to India, but there is debate over the timing of these migrations. Here we present novel genome-wide autosomal, Y-chromosome and mitochondrial data from Iranian and Indian Zoroastrians and neighbouring modern-day Indian and Iranian populations to conduct the first genome-wide genetic analysis in these groups. Using powerful haplotype-based techniques, we show that Zoroastrians in Iran and India show increased genetic homogeneity relative to other sampled groups in their respective countries, consistent with their current practices of endogamy. Despite this, we show that Indian Zoroastrians (Parsis) intermixed with local groups sometime after their arrival in India, dating this mixture to 690-1390 CE and providing strong evidence that the migrating group was largely comprised of Zoroastrian males. By exploiting the rich information in DNA from ancient human remains, we also highlight admixture in the ancestors of Iranian Zoroastrians dated to 570 BCE-746 CE, older than admixture seen in any other sampled Iranian group, consistent with a long-standing isolation of Zoroastrians from outside groups. Finally, we report genomic regions showing signatures of positive selection in present-day Zoroastrians that might correlate to the prevalence of particular diseases amongst these communities.


https://www.biorxiv.org/content/10.1101/128272v1 


Analysis of mtDNA and NRY variation using data from the Human Origins array showed that the modal NRY haplogroup in all Iranians and Parsis was J, with maximum frequency observed among the Parsis (freq=0.67; Figure 3a, Table S4). This is consistent with previous NRY haplogroup frequencies observed in Iranian Zoroastrian and non-Zoroastrian groups [68]. In particular, 8 of the 12 Iranian Zoroastrians from the city of Yazd belonged to NRY haplogroup J.


We infer the most probable Iranian lay Zoroastrian contribution to the lay Parsis Y-chromosomes to be 96% (median = 86%, mean = 82%, 95% CI = 41% to 99%), whereas the most probable Iranian lay Zoroastrian contribution to Parsis mtDNA is 8% (Figure 3c; median = 26%, mean = 32%, 95% CI = 1% to 88%).  These data are consistent with the majority of Parsi priests being patrilineal descendants of two male founders in the relatively recent past. 


 Consistent with this, we infer the autosomal, sex-averaged contribution to be 61-76% using a variety of modern and ancient Iranian surrogate groups (Figure 2, Figure S7, Tables S9-S11). This supports Zoroastrianism being brought from Iran to India by a group of males, and/or that gene-flow into the Parsi community from the neighbouring Indian population was mainly female-mediated.  


This study mostly confirms what we've known. Parsis are roughly 3/4 Iranian and 1/4 Indian (the paper infers 61-76% Iranian), and most of the Zoroastrian migrants were men, so the Indian share of their mtDNA is much higher. Interesting to note is that Iranian Zoroastrians do not really have any higher frequency of haplogroup R either.

It's Indian Parsis with higher R. 




 

The Scythians were half Asian

 A few people have been criticizing me for my latest Twitter post on Scythians (post anything on these people and you get a bunch of wind; I guess people feel very strongly about being related to them?)

Anyways, the criticism is that early Scythians were supposedly 100% Andronovo, had a Caucasoid/Europoid phenotype and lacked any distinctive East Asian ancestry or the associated phenotype. The people saying so make one important mistake: they conflate Andronovo with Scythian. While it's true that the latter originally came from the former, it's also true that they're widely separated in space and time, with true Scythian cultures only emerging in the 1st millennium BCE. It must also be kept in mind that the Andronovo horizon was massive, spanning most of Central Asia and reaching the boundaries of modern-day Mongolia in the east. Given that, it's impossible for the populations in the horizon to be homogeneous in ancestry.

Scythians can certainly be Andronovo descendants, but not all Andronovo descendants are Scythian, nor did all even survive till the 1st millennium BCE (late Iron Age onwards). So we must ask ourselves where the Scythian homeland is, and what the genetics of the people found nearby in the Iron Age were.

Animal style art is a distinctive feature of Scythian and Scythian-like cultures (Cimmerians, Sarmatians). The earliest such art is from the Arzhan kurgans (Arzhan 0, Arzhan 1) in modern-day Russia, right next to Mongolia and Kazakhstan.

Here is where the earliest Scythian kurgan mounds are from, by the way, nestled well on the easternmost edges of the Eurasian steppes. The Arzhan 0 kurgan is in fact a princely tomb, with identical construction to Arzhan 1 and of course filled with animal style decorative art, and dated to the 9th century BCE by accelerator mass spectrometry. Read more about it here


The Arzhan culture is succeeded by the Pazyryk culture, another early Scythian one dated to the 6th-3rd centuries BCE. Needless to say, these are some of the oldest sites where the material culture archaeologists consider "Scythian" arose, and they're deep inside Asia, right next to a bunch of Siberian people from whom the Andronovo descendants could've picked up a ton of EA ancestry.

Kuzmina discusses this in her magnum opus in a bit of detail, though keep in mind she lacked any genetic data and relied purely on archaeolinguistics and anthropology. I'll quote the relevant part.

He suggested that the impulse from Central Asia was of crucial importance at the end of the Bronze Age. This impulse brought to Scythia the arrows, daggers, knives, and horse-gear of the Karasuk culture. The reason in support of this hypothesis was offered by M. P. Gryaznov’s (1980) discovery of the Arzhan barrow in Tuva. There, early articles of horse equipment along with the objects of Scythian art have been discovered. It has been suggested that the complex dated to the 8th century BC which is a considerably early date. Scholars from St. Petersburg assume that the Scythian culture formed in Central Asia in the 8th or perhaps in the 9th century BC. They even admit the probability of the influence of Chinese art on the formation of the animal style. The Central-Asian hypothesis, which has gained wide acceptance, is represented in the works of N. A. Bokovenko (Moshkova 1992)

 Kuzmina and 20th century Russian archaeologists did not have the power of aDNA, but we do. So let's see how actual Scythian men and women from these earliest sites plot on a PCA. 



Huh, what a surprise! Every single early Scythian sample clusters with modern-day Bashkirs. (Click the image so it magnifies in your window, otherwise it will be blurry.)

The oldest Scythian samples are Zevakino_IA and Aldy_Bel_IA (this one is from the famous Arzhan kurgans, by the way). How about checking them out on G25?

Distance to: RUS_Tuva_Aldy_Bel_IA
0.04341559 Bashkir
0.04764879 Tatar_Siberian
0.07581902 Tatar_Siberian_Zabolotniye

0.08711644 Nogai
0.09365923 Uzbek
0.10087071 Yukagir_Forest
0.10372093 Hazara_Afghanistan
0.10404151 Tubalar
0.10465838 Karakalpak

As expected. 

Distance to: RUS_Zevakino_Chilikta_IA
0.04804137 Bashkir
0.05933492 Uzbek
0.06077576 Tatar_Siberian

0.07615525 Nogai
0.07823705 Tatar_Crimean_steppe
0.07973288 Hazara_Afghanistan
0.08347611 Uygur
0.08489468 Turkmen_Uzbekistan
0.08967604 Hazara
0.09028372 Turkmen

Barely surprising to anyone who's been following the latest studies on this. How about checking out one of the later descendant cultures, such as Pazyryk? 

Distance to: KAZ_Pazyryk_IA
0.04310245 Tatar_Siberian
0.05177868 Bashkir
0.07341034 Nogai

0.07795478 Tatar_Siberian_Zabolotniye
0.08562940 Uzbek
0.08595528 Tubalar

Here's how the Pazyryk Scythians depicted themselves.




One last thing, how much East Asian do they have exactly?

Target: RUS_Zevakino_Chilikta_IA
Distance: 2.7621% / 0.02762117
45.6 MNG_Khovsgol_BA
39.2 RUS_Krasnoyarsk_MLBA
15.2 TKM_Gonur1_BA

Target: KAZ_Pazyryk_IA
Distance: 1.4174% / 0.01417422
53.2 MNG_Khovsgol_BA
39.4 RUS_Krasnoyarsk_MLBA
7.4 TKM_Gonur1_BA

Target: RUS_Tuva_Aldy_Bel_IA
Distance: 2.6900% / 0.02690034
50.4 MNG_Khovsgol_BA
47.4 RUS_Krasnoyarsk_MLBA
2.2 TKM_Gonur1_BA 

Yep, I'm hardly surprised. They were basically ancient hapas, tongue-in-cheek. What do Bashkirs look like? This model will probably be a poorer fit on them, as we need more proximal sources, but let us see.

Target: Bashkir
Distance: 3.7975% / 0.03797549
47.8 RUS_Krasnoyarsk_MLBA
46.8 MNG_Khovsgol_BA
5.4 TKM_Gonur1_BA

Very similar. Some Turanian (BMAC) and the rest being Andronovo + Khovsgol (East Asian proxy). This was also the conclusion Ruscone et al. came to. 

Our findings shed new light onto the debate about the origins of the Scythian cultures. We do not find support for a western Pontic-Caspian steppe origin, which is, in fact, highly questioned by more recent historical/ archeological work (1, 2). The Kazakh Steppe origin hypothesis finds instead a better correspondence with our results

So that's settled. The Scythians were a nearly 50:50 product of Steppe Iranics and East Asians with some additional Turanian thrown in.
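The "nearly 50:50" figure can be sanity-checked by averaging the three Scythian fits above (Zevakino-Chilikta, Pazyryk, Aldy Bel). This sketch just averages the percentages printed in this post; treating MNG_Khovsgol_BA as the East Asian proxy and RUS_Krasnoyarsk_MLBA as the Andronovo proxy follows the post, not anything the code verifies.

```python
# Percentages copied from the three G25 fits above.
fits = {
    "RUS_Zevakino_Chilikta_IA": {"MNG_Khovsgol_BA": 45.6, "RUS_Krasnoyarsk_MLBA": 39.2, "TKM_Gonur1_BA": 15.2},
    "KAZ_Pazyryk_IA": {"MNG_Khovsgol_BA": 53.2, "RUS_Krasnoyarsk_MLBA": 39.4, "TKM_Gonur1_BA": 7.4},
    "RUS_Tuva_Aldy_Bel_IA": {"MNG_Khovsgol_BA": 50.4, "RUS_Krasnoyarsk_MLBA": 47.4, "TKM_Gonur1_BA": 2.2},
}

sources = ["MNG_Khovsgol_BA", "RUS_Krasnoyarsk_MLBA", "TKM_Gonur1_BA"]

# Mean share of each source across the three targets.
means = {
    src: round(sum(fit[src] for fit in fits.values()) / len(fits), 1)
    for src in sources
}

for src, mean in means.items():
    print(f"{src}: {mean}% on average")
```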

Important papers on the Scythians

 Going to start off this blog by collating some important papers on the Scythians and Xiongnu-Huna. 

 

1) https://link.springer.com/article/10.1007/s00439-019-02002-y

2) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7997506/

3) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7664836/

4) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5337992/

5) https://link.springer.com/article/10.1007/s00439-020-02209-4

Maratha & Chitpavans

Marathas seem to have a lot of variation in their Andronovo and AASI ranges. Perhaps this is a confirmation of the fact the modern Maratha c...