Genetic Evidence — Y-DNA Q1b (L275) and the Steppe Connection

Modern population genetics testing of the Varaha lineage

Modern genetics offers an independent line of evidence for the migration history sketched in the textual record. The book reports the author’s own Y-chromosome haplogroup, Q1b (Q-L275), and models the same sample’s autosomal ancestry against published reference panels using the qpAdm framework. The results are consistent with the textual story: a steppe-rooted lineage with significant Mongolia North Neolithic admixture, overlaid by later Iranian and South Asian components.

Why genetic evidence matters here

Three things make Y-chromosome haplogroup analysis particularly useful in cases like this one. First, Y-chromosome lineages descend strictly through the male line and do not recombine, so a haplogroup signature passes from father to son, altered only by rare mutations, for hundreds of generations. Second, Q-L275 is geographically distinctive: its modern centre of gravity lies in north-eastern Iran, Afghanistan, and Central Asia, not the Indo-Gangetic plain, so its presence in a north-Indian Rajput community demands a historical explanation. Third, ancient-DNA work over the past decade has sequenced enough Xiongnu, Hephthalite, and Iron Age steppe samples that we can model whole-genome admixture, not just terminal SNP affiliation.

The qpAdm framework

qpAdm, published by the David Reich lab and refined by collaborators including Choongwon Jeong, is a tool for testing whether a target population can be modelled as a mixture of “source” populations against a panel of “reference” outgroups. Where simpler tools only report admixture proportions, qpAdm explicitly tests model fit (a p-value above the conventional 0.05 threshold means the model is not rejected), which lets it rule out plausible alternative ancestries. A qpAdm model that is not rejected, especially one that survives rotation against multiple reference panels, has passed a serious test of a migration hypothesis.
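The rotation and acceptance logic described above can be sketched in a few lines of Python. This is an illustration only, not the book's pipeline. The pool reuses the component labels reported on this page as stand-ins for real panel populations, the function names are hypothetical, and the p-values and weights passed to `is_plausible` would have to come from an actual qpAdm run (for example with ADMIXTOOLS), which this sketch does not perform.

```python
from itertools import combinations

# Hypothetical candidate pool: the component labels used on this page,
# standing in for the populations of a real reference panel.
POOL = [
    "EHG",
    "West_Eurasian_Steppe",
    "Mongolia_North_N",
    "Iran_GanjDareh_N",
    "AASI",
]


def rotating_models(pool, n_sources):
    """Yield (sources, references) pairs.

    Each model draws n_sources populations from the pool as sources and
    moves every remaining population into the reference set.
    """
    for sources in combinations(pool, n_sources):
        references = tuple(p for p in pool if p not in sources)
        yield sources, references


def is_plausible(p_value, weights, alpha=0.05):
    """Keep a model only if it is not rejected (p > alpha) and every
    estimated proportion is a valid mixture weight between 0 and 1."""
    return p_value > alpha and all(0.0 <= w <= 1.0 for w in weights)


# Enumerate every two-source model that the pool allows.
models = list(rotating_models(POOL, 2))
for sources, references in models:
    print(" + ".join(sources), "| references:", ", ".join(references))
```

A model that fails either check is set aside; a hypothesis earns more confidence when the same sources keep passing as the reference set changes.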

Key components

The qpAdm models reported in the book identify five recurring ancestral components in the Varaha sample:

  • Eastern Hunter-Gatherer (EHG): the Mesolithic hunter-gatherer component of eastern Europe, carried into the steppe.
  • West Eurasian Steppe: Yamnaya / Andronovo / Sintashta-derived ancestry.
  • East Asian / Siberian (Mongolia North Neolithic): the eastern component that distinguishes the Xiongnu and the Hephthalites from southern Iranian populations.
  • Iranian Neolithic (Iran_GanjDareh_N): the substrate of the Iranian plateau.
  • South Asian (AASI / Indus Periphery): the substrate that the migration encountered after crossing the Hindu Kush.

Q1b and the Xiongnu connection

Y-haplogroup Q1b (Q-L275) is the marker that anchors the lineage. Its phylogeography is published in Balanovsky et al. (2017) in BMC Evolutionary Biology, where the haplogroup is labelled Q3-L275; its modern frequency peaks in north-eastern Iran, Afghanistan, and parts of central Asia, with secondary clusters among Pashtun and Punjabi populations and a distinctive presence in the Khorasanian Sayyid lineages. Mascarenhas et al. (2015), in BioMed Research International, reconstructed an “ancient lineage” of Q1b crossing from inner Asia into north-west India in the late first millennium CE, a date and direction consistent with the Hephthalite-Alkhan migration documented in the textual record.

Medieval-era admixture

The genetic evidence is not static. Three later layers add complexity:

  • Turkic migrations (6th–11th c. CE): overlap with the Turk Shahi period and add a small east-Eurasian component.
  • Mongol Empire (13th–14th c. CE): adds a further east-Asian signal, particularly in lineages along the Silk Road corridor.
  • Silk Road and medieval Iran: continuous gene flow with the Iranian plateau through the entire period, which adds to the Iranian-related ancestry that the model captures with the Iran_GanjDareh_N proxy.

What the data does not say

The qpAdm models reported in the book do not include a direct Y-DNA “source” — qpAdm is an autosomal admixture tool, not a Y-haplogroup classifier. The Q-L275 finding is reported separately, on the basis of commercial Y-chromosome SNP testing and the Balanovsky et al. published phylogeography. Both lines of evidence — the autosomal admixture model and the Y-haplogroup affiliation — point in the same direction: a steppe-rooted lineage with later Iranian and South Asian admixture. They are independently consistent with the textual story; they do not, on their own, prove it.

For the textual record on the same migration see Origins — The Altai Steppe and Migration West. For the published references see the Sources page.

Autosomal admixture — G25 model fit

A four-source admixture model fitted to the Global 25 PCA coordinates of a Lower Himachal Rajput sample

In addition to the qpAdm framework discussed above, the same lineage has been tested independently against the Global 25 (G25) coordinate system — a 25-dimensional principal-component space built from a few thousand ancient and modern reference genomes. The G25 vector for the Lower Himachal Rajput sample was modelled as a four-source mixture; the best fit returned the four ancient populations below, with a fit distance of 1.86% (under 2% is considered a solid fit, under 1% is excellent).

  • Sample: Lower Himachal Rajput (anonymised)
  • Sources used: 4
  • Fit distance (lower = better): 1.86%
  • G25 dimensions: 25
  • Iran (Shahr-i-Sokhta BA3): 62.6%
  • Steppe (Srubnaya-Alakul): 18.8%
  • BMAC (Dzharkutan, Uzb.): 14.0%
  • Tibetan (Chokhopani, Nepal): 4.6%
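For readers who want to see what a fit of this kind involves, here is a minimal Python sketch. It assumes the fit distance is the Euclidean distance between the target vector and the weighted average of the source vectors, reported times 100, with weights constrained to be non-negative and to sum to one. The solver (projected gradient descent) is a choice made for this sketch, not necessarily what any G25 tool uses. The coordinates are synthetic random numbers, not real G25 data; the target is built from the reported shares only to check that the routine recovers a known mixture.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fit_mixture(target, sources, n_iter=5000):
    """Minimise ||sources.T @ w - target|| over the simplex.

    sources has shape (n_sources, n_dims); returns (weights, distance).
    """
    A = sources.T                            # (n_dims, n_sources)
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    w = np.full(A.shape[1], 1.0 / A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - target)
        w = project_to_simplex(w - step * grad)
    distance = np.linalg.norm(A @ w - target)
    return w, distance


# Synthetic stand-ins for four 25-dimensional source vectors (NOT real G25 data).
rng = np.random.default_rng(0)
sources = rng.normal(scale=0.05, size=(4, 25))

# Build a target from known weights, then check that the fit recovers them.
true_w = np.array([0.626, 0.188, 0.140, 0.046])
target = true_w @ sources

w, distance = fit_mixture(target, sources)
print("weights:", np.round(w, 3))
print(f"fit distance: {100 * distance:.2f}%")
```

With real data the target is never an exact mixture of the chosen sources, so the distance stays above zero; that residual is what the 1.86% figure reports.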

What each source represents

  • IRN_Shahr_I_Sokhta_BA3 · 62.6%: one of the standard Iran-Bronze-Age proxies; late-3rd-millennium-BCE samples from the Helmand-basin urban site of Shahr-i-Sokhta in south-eastern Iran. Captures the broad “Iranian-plateau farmer + early eastern-Eurasian admixture” component that dominates South Asian autosomal ancestry.
  • RUS_Srubnaya_Alakul_MLBA · 18.8%: Middle-to-Late-Bronze-Age steppe pastoralists of the Andronovo / Sintashta cultural horizon, c. 1900–1500 BCE, on the Volga-Ural steppe. The canonical Indo-Iranian / steppe signal that arrives in South Asia in the Late Bronze and Iron Age. 18.8% is in the typical band for North-Indian Brahmin/Rajput populations and slightly above the all-India average.
  • UZB_Dzharkutan1_BA · 14.0%: BMAC (Bactria-Margiana Archaeological Complex) at the Dzharkutan type-site in southern Uzbekistan; the urban Oxus civilization, c. 2200–1500 BCE. The “Bactrian agricultural / Hephthalite hinterland” component.
  • NPL_Chokhopani_2700BP · 4.6%: an Iron-Age Tibet-related sample from Mustang, Nepal. The small East-Asian-via-Himalayas signal that turns up in Himachal/Punjab samples and reflects the Shivalik-foothill geography of the present-day settlement.

How this lines up with the Varaha narrative

The textual story sketched on this site runs Altai origins → Hephthalite Sogdiana / Bactria → Hindu Kush → Punjab → Shivalik. The G25 model translates that into autosomal terms as roughly 77% combined Iran-BA + BMAC ancestry (the “Hephthalite-Bactria-Sogdiana” hinterland) plus about 19% steppe-Indo-Iranian ancestry, the “Altai / Andronovo cousin-line” signal. The 4.6% Tibet-related trace reflects the Shivalik-foothill geography. None of the four sources is post-Hunnic per se, since G25 has no dedicated Hephthalite reference panel, but Srubnaya-Alakul is a steppe-pastoralist cousin-clade within the same Bronze-Age Andronovo horizon to which the narrative traces the Xiongnu, the Hephthalites and the Q-L275 carriers. The combined 33% steppe + BMAC component, which counts BMAC a second time, is consistent with what a Hephthalite-derived North-Indian Rajput line would be expected to show.
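The combined figures quoted above follow directly from the four reported shares. The short Python check below makes the arithmetic explicit; the variable names are illustrative and the only inputs are the percentages from the model output.

```python
# Reported G25 shares, in percent, keyed by source label.
shares = {
    "IRN_Shahr_I_Sokhta_BA3": 62.6,
    "RUS_Srubnaya_Alakul_MLBA": 18.8,
    "UZB_Dzharkutan1_BA": 14.0,
    "NPL_Chokhopani_2700BP": 4.6,
}

# "Hinterland" grouping: Iran Bronze Age plus BMAC.
iran_bmac = shares["IRN_Shahr_I_Sokhta_BA3"] + shares["UZB_Dzharkutan1_BA"]

# Alternative grouping: steppe plus BMAC.
steppe_bmac = shares["RUS_Srubnaya_Alakul_MLBA"] + shares["UZB_Dzharkutan1_BA"]

# The four shares together should account for the whole genome.
total = sum(shares.values())

print(f"Iran-BA + BMAC: {iran_bmac:.1f}%")
print(f"Steppe + BMAC:  {steppe_bmac:.1f}%")
print(f"All sources:    {total:.1f}%")
```

The two groupings come to 76.6% and 32.8%. Because the 14.0% BMAC share appears in both, they are two ways of slicing the same result and should not be added together.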

Caveats

The 1.86% fit distance is good but not perfect. Under 1% would be excellent; above 3% is suspect. The model is also under-determined: with only four sources you can sometimes get reasonable fits even when the true mixture is more complex. To strengthen the result, a follow-up run that substitutes Iron-Age Hunnic (e.g. Hun_Asian_Sarmatian) or Xiongnu proxies for the Srubnaya-Alakul slot would tell us whether the steppe signal is specifically Hunnic or generic Bronze-Age Andronovo. The G25 numbers are presented here as complementary evidence, alongside the Y-haplogroup Q1b (Q-L275) finding and the qpAdm autosomal model from the book; readers should treat all three lines together rather than relying on any single test.

In this era

  • Y-haplogroup affiliation: Q1b (Q-L275)
  • qpAdm autosomal admixture model — Mongolia North Neolithic + EHG + Iran_GanjDareh_N + AASI + West Eurasian Steppe
  • Primary references — Balanovsky et al. (2017), Mascarenhas et al. (2015), Schmidt & Seguchi (2016), Habu et al. (2018)