Monologue: Race - genetics, history and sociology

title Monologue: Race - genetics, history and sociology

description You can find the complete monologue here: https://www.razibkhan.com/p/monologue-race-genetics-history-and

This is where you will find all the podcasts from Razib Khan's Substack and original video content.

On this episode, Razib talks about race, and how to think about this touchy subject.

pubDate Thu, 23 Apr 2026 15:01:00 GMT

author Razib Khan

duration 2055000

transcript

Speaker 1:
[00:00] This podcast is brought to you by the Albany Public Library Main Branch and the generosity of listeners like you. What is a podcast? God, daddy, these people talk as much as you do. Razib Khan's Unsupervised Learning.

Speaker 2:
[00:29] Thanks for listening to the ungated version of the Unsupervised Learning Podcast. If you want to read some essays on some of these topics, please check out razib.substack.com. Again, that's razib.substack.com. Thank you. All right, I am doing another monologue. This is on a topic that people have asked me to do before, because they think I know a little bit about this. So basically, I'm gonna talk about race. The social aspect of it, the biological aspect of it, its history, its genetics, all these things I'm gonna try to touch on. So I guess, I've written about these topics in kind of an implicit way for a long time in a lot of the posts. I don't write about this stuff very explicitly anymore. It's, you know, candidly, it's kind of done, not really interested in it. But, you know, it's interested, I mean, people are interested in it, and I'm not. And I guess you could say race is interested in us, even if we're not interested in race. So I figure I should talk about it. So, you know, you may have heard that race is socially constructed. It's a fiction. It's a myth. You know, I think it is obviously socially constructed. But, you know, a lot of the things that we conceive of as real are socially constructed. So, a table is socially constructed. It's mostly empty space. It's just, you know, various nuclear forces, electromagnetic forces are keeping the matter in place and preventing your hand from going through the table, right? Just we see it as matter, even though it's mostly space, right? It's a social construction. It's a label. It's defining something real, even though it's literally quite different than what we actually kind of perceive with our eyes. Like we see solidity, but it's not really solid. It's mostly space. It's mostly empty. The solidity is just, you know, the light electromagnetic, you know, particles not getting through and whatnot. Okay. What does that have to do with race? 
What it has to do with race is, you know, race is a way to classify human beings and classifications are not like quarks or electrons. They're more like a chair. It's a way to categorize something that makes sense to humans and is intelligible only in terms of human categories. In fact, species is quite like that as well. You might have, you know, heard or read that species exist in terms of populations, breeding populations that, you know, when they interbreed, there are problems, there's hybrid breakdown and stuff like that. It's a lot more complicated than that. You know, I did a very, very early podcast with John Wilkins, my friend John Wilkins, philosopher of science in Australia, on unsupervised learning about different species concepts and there's, I mean, there's dozens, common ones, but there's probably hundreds. And the species concepts, most scientists, they're interested in them for instrumental values in terms of they're an instrument to answering some other question. And that's how a lot of geneticists look at biological species concept and species concepts in general, geneticists and evolutionary biologists. You know, what can a species tell us about population history, population trajectory, population dynamics, species themselves are not really interesting. You know, the whole classification and accounting, you know, the bug collecting, so to speak, is not really interesting to a lot of scientists. But, you know, it is to some, you know. There are cladists and, you know, taxonomists and, you know, who are really interested. But my point here is race, obviously, is a less clear and distinct category than species. But like species, it is instrumentally useful. So, you know, the idea that race is a myth, it's false, really emerged in the last generation. 
And I think the deep insight, well, from genetics, from biology, what emerged was in the 1970s, looking at allozymes, which is a type of protein variation that's really easy to assay, easy to detect, you know, before genetics could get into genomes and whatnot. This is the 1970s; allozymes actually date back to the 1960s. But in humans in the 1970s, the late Richard Lewontin, who was a prominent left-wing, famously Marxist geneticist, showed that the vast majority of genetic variation in humans can be found within a racial grouping, like the racial groupings that we know. So for example, Africans, Europeans, Asians, East Asians, South Asians, these sorts of things. These are common-sense groupings that we use from phenotype. And what Richard Lewontin showed is that about 85, 80 to 90 percent of the variation is within a population. Only 10 to, say Africans versus non-Africans, 10 to 20 percent is between populations, whereas it'd be like 5 to 10 percent maybe when it came to within Eurasians, because they're a little closer together. And this was surprising to a lot of people, that most of the variation is actually within the populations. And it was kind of taken by the general public and certain popularizers to mean that race is not important, racial differences are not important, and that you cannot actually classify people very easily. Now, allozymes are very, very primitive. So, they're proteins, there aren't that many of them. So, it's like you have a couple, you have, you know, like a handful of variables to figure out what's going on. And so, if you're looking at a single gene, a single genetic locus, and 90% of the variation is within a population, within any given population, or pooled across the human, you know, pooled across the human race. That's an easy way to explain it. That gene is not going to be very important or useful in identifying populations, right?
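To make the single-locus point concrete, here is a minimal sketch of how a "share of variation between populations" number can be computed for one marker. It is an illustration only: the allele frequencies (0.35 and 0.65) are made up, the function names are hypothetical, and it uses expected heterozygosity as the diversity measure, which is one common choice and not necessarily the measure used in the original allozyme work.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a two-allele locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def between_population_share(freqs):
    """Fraction of total diversity that lies between populations: (H_T - H_S) / H_T.

    freqs: allele frequency of the same allele in each population.
    Assumes the populations are weighted equally (an assumption of this sketch).
    """
    mean_p = sum(freqs) / len(freqs)
    h_total = expected_heterozygosity(mean_p)  # diversity after pooling everyone
    h_within = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    if h_total == 0:
        return 0.0  # locus is not variable at all, so there is nothing to apportion
    return (h_total - h_within) / h_total


# Hypothetical marker: the allele is at 35% in one group and 65% in the other.
share = between_population_share([0.35, 0.65])
print(f"between-population share: {share:.2f}")
print(f"within-population share:  {1 - share:.2f}")
```

With these made-up frequencies, roughly 9 percent of the diversity at the locus is between the two groups and roughly 91 percent is within them. If you guessed someone's group from this one marker alone you would be right only about 65 percent of the time, which is the sense in which a single locus like this is not very useful for identifying populations.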
But what happened in the early 2000s is genetic methods, modern-day genetic methods, like this is pre-genomic, pre-SNP array, so SNP arrays are like, you know, hundreds of thousands of markers. This was what we used even before that. You know, geneticists started to use microsatellites and other types of markers that are highly variable, and there were, you know, say, on the order of a couple hundred of them that were widely used about 26 years ago, around the year 2000. And they started using these to construct phylogenetic trees, but also classifiers. So, there's an emeritus professor at Stanford, Neil Risch. He showed in a paper in 2002 that, you know, just using a couple of hundred well-curated markers that are highly variable, you can easily classify with very, very, very high accuracy and confidence, someone as, for example, whether they're African-American, white American, Latino, etc., etc. So, what you had about 25 years ago is a situation where people are saying that race is a myth, it's a biological fiction, it's not real. On the other hand, you are developing these methods that can classify people by race very, very easily. So, if race is a myth, if it's not real, how can you classify people so easily? So, this generated a counter-narrative kind of against the mainstream orthodoxy, so to speak. And the mainstream orthodoxy is, you know, a lot of scientists will parrot it and non-scientists will repeat it. It's the dominant orthodoxy: that race is biologically meaningless, etc., etc. But there are some scientists who disagree. A. W. F. Edwards, I think in 2003, he wrote a paper called Lewontin's Fallacy. And basically, Lewontin's Fallacy meant, I mean, Edwards was just showing, look, just because most of the population variation is pooled across all the populations, does not mean that the between-population variation is not informative of phylogenetic structure.
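The argument attributed to Edwards here can be illustrated with a small simulation. This is a hedged sketch, not a reconstruction of any published analysis: the two groups, the allele frequencies (0.35 versus 0.65 at every marker), the assumption that markers are independent, and the function name `simulate_accuracy` are all invented for illustration. Each simulated marker on its own is only weakly informative, and the classifier simply adds up the evidence across markers.

```python
import math
import random


def simulate_accuracy(n_loci, n_individuals=2000, seed=0):
    """Classify simulated individuals into one of two groups from n_loci markers.

    Assumptions of this sketch (not from the source): markers are independent,
    each individual carries one allele per marker, and at every marker the two
    groups have allele frequencies of 0.35 and 0.65 (direction chosen at random).
    """
    rng = random.Random(seed)
    freqs = []
    for _ in range(n_loci):
        if rng.random() < 0.5:
            freqs.append((0.35, 0.65))
        else:
            freqs.append((0.65, 0.35))

    correct = 0
    for _ in range(n_individuals):
        true_group = rng.randrange(2)
        # Running sum of log P(allele | group 1) - log P(allele | group 0).
        log_ratio = 0.0
        for p0, p1 in freqs:
            p_true = p1 if true_group == 1 else p0
            has_allele = rng.random() < p_true
            if has_allele:
                log_ratio += math.log(p1 / p0)
            else:
                log_ratio += math.log((1 - p1) / (1 - p0))
        guess = 1 if log_ratio > 0 else 0
        if guess == true_group:
            correct += 1
    return correct / n_individuals


for n in (1, 10, 200):
    print(n, "markers -> accuracy", simulate_accuracy(n))
```

Under these assumptions, accuracy with one marker sits not far above a coin flip, and it climbs toward near-perfect as markers are added, even though the overlap between groups at every individual marker never changes. Real markers are correlated with each other and real populations are admixed, so actual performance depends on the data.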
Basically, you can easily distinguish different populations because the between-population variation is highly, highly informative. So think about the fact that today, 23andMe or Ancestry, these genetic platforms, and also most of the studies, use hundreds and hundreds of thousands of variables. So these are SNP arrays. These are single-nucleotide polymorphisms. So out of your 3 billion base pairs, the vast, so any given human has about 5 million that are different from the human reference. These are polymorphisms, and the SNP arrays are usually ascertained. So they're curated in a way to maximize variation. And so these are highly variable regions. And so with these highly variable regions, even if most of the variation is not differentiating different populations, you still have an incredible amount of between-population variation. And that variation is correlated together, like it goes together, because, you know, it reflects the history of the human populations. The human populations, you know, today we know that it's less of a tree than a graph, so there's gene flow and admixture. But the point is, there is a history to human populations, and the genetic structure is a reflection of that human variation. So, you know, genetic structure obviously exists. So, geneticists will tell you population structure is real, you know, it obviously exists, but they can also say, well, I mean, race is not real. You know, race is a fiction. And what do they mean by that? I think that what's going on here is there is a sociological, historically contingent classification of race. Cognitive psychologists like Scott Atran, who is an evolutionary psychologist, wrote a really good book on, you know, the cognitive psychology of religion, In Gods We Trust. It's about 23 years old, a very influential book. I read it multiple times. But he also wrote about things like, something called folk biology. So, we have intuitions about biology.
And so, humans classify people, you know, into different races. Now, they don't use a SNP array, okay? What humans do is they use coarse external features. And coarse external features, and when I mean coarse, I mean, just like, you know, your hair form, your nose shape, your skin color, the shape of your eyes, the thickness of your lips, the texture of your hair, all of these things. They use them to categorize into different populations. And, you know, obviously people that are, say, native to the Amazon invariably look different from someone from Sweden, you know? So, when you say, oh, well, most of the population variation is within these different groups, well, I mean, you can still tell them apart. Why can you tell them apart? There's a couple of reasons. But I think, you know, the primary one is that some of these external characteristics, you know, are due to adaptation in local regions, like skin color, for example, or nasal form. And so that's reflecting shared history. It could be sexual selection as well. That's another hypothesis. But the point is, some of these external characteristics are not arbitrary. They're not random. And so they are reflecting local adaptation, they're reflecting history, which allows you to differentiate people pretty easily. So, you know, in a common-sense way, you know, race is obviously real in terms of social construction. But, you know, humans starting, you know, 3,000 years ago, really 2,500 years ago with the Greeks, you know, they started developing these taxonomies that were quite typological. And what I mean by typological is they had a category, you know, so you have the white race or the yellow race or, you know, Ethiopians as the Greeks would say. And then everyone would be put into these pigeonholes. This is quite different from the way biologists, modern biologists and geneticists, I say modern, think about race and populations.
They tend to use what's called population thinking, so it's like a distribution. There's a lot of characteristics, populations mix with each other, they blend, you know, and where one population ends and another begins, it can be a little bit arbitrary. There are no clean lines in this world, especially in terms of races slash subspecies, you know, geographically partitioned populations. There are rather, you know, continuities, arguably. And so it's much different than the typological conception that people were thinking about in the past. But the thing is, we do tend to think typologically, we do tend to think dichotomously, we don't think statistically. And so race, you know, statistical genetics obviously is statistical. It's about distributions, it's about probabilities. It's about categories that blend into each other quite often, and our verbal descriptions don't map well onto it. But those descriptions are ancient, they've been around for a long time. And, you know, sometimes people say, so let's talk about the history of this. Sometimes people say, well, you know, the Romans didn't have an understanding of race. Well, that's not true. They clearly understood that there were different races. And before the Romans, the Greeks, like Aristotle and some of the other thinkers, they had theories of the different races and why they were the way they were. A lot of it was quite Lamarckian in terms of environment affecting people's appearance and dispositions. So for example, the Greeks, you know, distinguished between Ethiopians of the East and Ethiopians of the West. So the Ethiopians of the West are basically what we call Sub-Saharan Africans, and the Ethiopians of the East are South Indians. And they called them Ethiopians because they were burnt by the sun, was their hypothesis, and that's not totally wrong, in terms of what we know about melanin having a protective effect on the skin.
And it might be less about skin cancer and sunburn and more about neural tube defects, if you read Nina Jablonski's work. But in any case, so they distinguished between these two types of Ethiopians: well, the Ethiopians of the East often have straight hair, whereas the Ethiopians of the West have woolly hair. So the Greeks were not dumb. They weren't just making things up based on nothing. They were making observations. Also, some of the Greek thinkers observed, some of the Greek philosophers observed, that the people of India in the north, in the Indian subcontinent, their complexion was lighter than those in the south, and it was quite like that of Egyptians. Okay, I mean, it kind of makes sense. You know, it kind of makes sense. So they were able to typologically classify people. You know, your northern Europeans had red hair or, you know, blonde hair, blue eyes. They were quite distinct. Whereas the people of the Mediterranean look, you know, kind of similar. Egyptians similarly also classified different races. Now, part of the problem here is you have to kind of understand that they're also stylized. So Egyptians depicted and described people of West Asia, people of the Levant, as yellow. Well, they're not literally yellow, but, you know, olive complexion, maybe somewhat lighter than the typical Egyptian, is what's going on here. Egyptians depicted the people of Nubia in their stelae as black-skinned, while they themselves had red skin. Did the Egyptians literally have red skin? No, they probably had a brownish complexion, a light brownish complexion. It's not coincidental that the Nubians are depicted as black, though; the Nubians are dark-skinned. They are towards the black skin. So, my point here is the sociological descriptions can be a little imperfect. You know, they have their nuances, they have their idiosyncrasies, but they're not arbitrary. And I think that's a problem that we have when we describe them as if they're just random and arbitrary.
It's not random and arbitrary. They knew what was going on. When Europeans arrived in the waters of East Asia, they observed that the people of Japan and China were white like them. Now, they knew that they were not of the same race, but they noticed that the skin color was white. And similarly, the Chinese and the Japanese were like, oh, these people are white-skinned like us, they're ruddy. But eventually, the salient differences between the two populations became so extreme in terms of when they're interacting with each other, in terms of how they ate, how they behaved, you know, religions, that, you know, a new racial classification emerged where it was, oh, hey, the people of the East, East Asia, are yellow, and people of Europe are white. But the underlying reality is still the same. You know, just giving a different label doesn't change anything. And they always knew that they were somewhat different. They knew that they were somewhat different. They were a somewhat different population, just like the Greeks knew that the people of the Indian subcontinent that were very dark-skinned were a different race than the people of Africa that were very dark-skinned. By the way, Arabs, Muslim thinkers, and intellectuals, like Al-Biruni, a famous geographer, about, I don't know, 1000 AD, worked with Mahmud of Ghazni, I think. They would refer to Indians as black, as blacks, and themselves as white, and it's because people of Iran are quite light-skinned compared to the people of the Indian subcontinent. But they still knew that the people of the Indian subcontinent were a different race than the people of Africa, who also had black skin, were also blacks. So, you know, my point here is, you know, the ancients were sophisticated. They used their common sense, and they used their eyes to classify human beings into these social categories that were quite useful.
They were not racist in the way that moderns are racist, partly because they did not often interact with extremely different people in the same way, you know, as modern people. And in the United States, a particular system developed because you had Native Americans, white Europeans, and black Africans together in the same region. These are extremely different populations. And so America developed its own system of racial classification and racial identity. It's somewhat different than the Old World, where, for example, within Europe, obviously, they kind of knew they were white or European, racially distinct, but they really differentiated themselves by religion and later by nationality, by ethno-linguistic categories. So the people of Northern Spain knew that they did not have Moorish blood. They were lighter-skinned. The term blue blood comes from the fact that the nobles of the North, presumably of Visigothic ancestry, that's what they claimed, you could see their veins through their skin, they were paler. But when you describe the battles in Moorish Spain between Muslims and Christians, basically the way they distinguished each other was mostly by dress. And sometimes if they spoke the other group's language very well, they could change dress and go behind enemy lines. This shows that racially they were not very distinct. There's on average some differences and we know this from genetics. People that are Muslim tend to have more Moorish, North African ancestry, a little bit more Sub-Saharan African ancestry, but they overlapped. And that's because of gene flow and mixture. I think Abd al-Rahman III, one of the most powerful Caliphs of the Umayyad Al-Andalus dynasty, at the end of the 10th century, he was overwhelmingly European in terms of Northern Spanish, actually mostly, and some Frankish even, ancestry. And so he dyed his hair, which was apparently light. He had light hair and he had blue eyes.
So he dyed his hair black just so that he would look a little bit more like an Arab, but that's how much they overlapped. In any case, so what has happened though over the last, say, 20 years is the rise of direct-to-consumer testing, where people classify their races, basically they classify their race and ethnicity in a very fine-grained way, has occurred at the same time that this massive cultural campaign emerged to say that race is a myth. And so the reason that 23andMe and Ancestry and all these companies can classify you is, they have access to a lot of data, and the data is just clear, like populations have different histories, and these different histories lead to the ability to classify them differently. As Lewontin, no, as A. W. F. Edwards said in 2005, and actually Armand Leroy, the author of the book Mutants, wrote in the New York Times about how race is a useful classifier. This was, this is still a minority position. It seems like it's declined, in my opinion, even at the same time as, you know, 23andMe's racial classifiers become fodder for jokes on South Park. I don't know. But the genomics has advanced a lot, and we know a lot about the different human populations. You know about this from my other podcasts, from my writings, right? And what we know is a lot of details about population history. So, for example, the vast majority of our ancestry does seem to come from an African population that probably lived in Africa on the order of 200,000 years ago. This includes both Africans, even including the most genetically distinct Africans, like the Khoisan of the South, and Eurasians, and Australians, and obviously people in the New World. This means around, when you look at our genetics, about 90% when you look at the whole genome, around 90% of the polymorphisms, they, you know, date back to this period in the ancestral population, right? So we overlap genetically so much because we were a single ancestral population on the order of about 200,000 years ago.
And then since then, what's happened is some populations have split, some populations have mixed with other populations, and that's introduced variation that partitions the populations, right? So, Eurasians, the ancestor, well, the ancestor of all non-Africans, because there are the Australians, the New World people as well. What seems to have happened is out of this ancestral population, about 200,000 years ago, the first split is a group of people that seem to have gone southward within Africa and eventually kind of settled in South Africa, Botswana, that region, a bigger region actually in the past. They are the ancestors of the Khoisan, the Bushmen people, the Khoi people. There were many, many other types of people like that, but most of them were absorbed or disappeared. They split off first from all other human beings, or all other human populations, right? So they're the first split. So on the first order, you can think of there being a Khoisan and a non-Khoisan race, okay? And the non-Khoisan race includes other Africans. And so this is one of the issues when people say race is a myth, as people point out. Well, if you look at genetic variation, the difference between Khoisan and non-Khoisan Africans is greater than the difference between East Asians, between Chinese and Europeans, and even greater than between a Yoruba African and a non-African, right? That's fair. But the point here is that modern classifiers are coarse, and they're not telling us all the details of the history of the populations. Just because the categories that we have don't perfectly reflect what we're figuring out from paleogenetics and population genetics doesn't mean the categories themselves are totally useless. They're not. So after the Khoisan, you have Africans, you have all the other Africans, all the other Africans, let's call them all the other Africans and non-Africans, right? All the other Africans are East Africans, West Africans.
These are mostly agriculturalists, although I think the split of the Mbuti Pygmies occurs earlier. So the foragers have deep, deep splits, more than 100,000 years ago. And then it looks like something like 70,000, 80,000 years ago, the split occurs. And these splits are like imperfect splits. They're probably splits that came back together, and there's admixture and weird things going on. But all these agriculturalist Africans, like the ancestors of the Bantu, and West African agriculturalists, like Yoruba, split from non-Africans around 70,000 years ago, something like that. And then after this period, say around 60,000 years ago, and here there's another ancient group, ancient North Africans. I've talked about it, written about it. That splits, and then the non-Africans split into Basal Eurasian and everybody else. Basal Eurasian doesn't mix with Neanderthals; they're probably in Egypt, North Africa, Saudi Arabia, whatever, somewhere like that. And then all the other non-Africans probably go to the Zagros Mountains, into West Asia, the highlands of Anatolia, Western Iran, and there they mix with Neanderthals, and they actually spread across the rest of the world. So when you talk about racial classifications, all races, you know, the white race, the yellow race, the brown race, the red race, whatever, Australians, all of those races descend from a common population on the order of 1,000 to 10,000 people that was isolated. We don't know why, and we don't know the details, for, you know, thousands of years after 60,000 years ago. And what happened around 50,000 years ago is the Initial Upper Paleolithic, and, you know, I've talked about this in podcasts, you guys know, spread all across the world, mixed with some Neanderthals, mixed with some Denisovans, and eventually they became isolated from each other, and genetic differences began to emerge. This is why the genetic differences between non-Africans are quite low.
Well, when you do the calculation, the calculation is basically you look at the amount of variation, the total variation, and then that's the denominator, and the numerator, you look at the variation between the populations, and, you know, you get like a value of like, say, 10%. So, I think like Chinese, if you pool Chinese with Northern Europeans, and then you look at the variation between the two, it's like 10% of the genetic variation. Everything else is shared between the populations, right? So, this all happened in the last 60,000 years. Within Africa, it's more complicated and deeper because there are deeper splits, like between the Pygmies and the non-Pygmies, and then obviously the Khoisan first, and even within the West African, even within the agriculturalist African populations, there's a distinction between East Africans, the Nilotic people, and the West Africans, like the people of Nigeria and the Bantu, who are a branch of the Eastern Nigerians from 4,000 to 5,000 years ago, right? In any case, so you have this complicated system where you have hierarchical races, and there's a lot of population structure, and it doesn't map onto all of our categories. But if you're a direct-to-consumer, and I've worked on some of these, so I know. If you're a direct-to-consumer person, you can take the modern categories, and some of them don't work as well, but a lot of them still work. And instead of worrying about the hierarchy of which population is related to which population and what their distances are and stuff like that, you just get a bunch of categories, like 100 categories, and then you just create classifiers because there's so much genetic variation, and the classifiers will work. So why do the classifiers work?
Classifiers work because they're looking at a lot of data from SNP arrays, and the SNPs have history, have tens of thousands of years of history, hundreds of thousands of years of history of separation and genetic mutation that builds up differences between the groups. So for example, Australian Aborigines, there's a lot of evidence that they may have been pretty isolated from about 45,000 years ago on. So over the last 45,000 years, they're isolated, they start separating, they start diverging from the Eurasian populations to their north. 40,000 years is a long time. And population bottlenecks and small populations can result in kind of an inflation of that difference. So one stylized fact that you may have heard, and I'm sure you have heard this and you have read this: the genetic distance between European foragers and European farmers 6,000 years ago, maybe closer to 7,000 years ago, like 5,000 BC, something like that, 6,500 years ago in Germany, was basically about the same as the genetic distance between Chinese and modern Germans. Now, it's kind of an apples to oranges comparison, but the distance is the proportion of between-population variation, which is about 10 percent, like on the order of almost 10 percent. And one of the reasons the difference is like this is because, first of all, they were very different populations. The early European farmers had a lot of, had not only a lot of Basal Eurasian ancestry, but also the foragers went through a lot of bottlenecks, and bottlenecks tend to make you very distinct. So they're very distinct, you know? So these distinct populations, they eventually merged into each other and created the precursors of modern Northern Europeans, that also had a lot of steppe ancestry that came later. So all this history shows up in population structure. So if you do admixture analysis and all these things, you see that Northern Europeans are distinct from Southern Europeans. Well, why is that? Well, there's a couple of things going on here.
There's a couple of things historically that are going on here. First, Southern Europeans have more early farmer ancestry. So over 50 percent of their ancestry, well over 50 percent of their ancestry is early farmer. Less than 50 percent, like closer to, depending on where, 20 to 40 percent of their ancestry is from the Eurasian steppe, is from the Yamnaya. Okay, I'll put it that way. Closer to 20 percent, maybe not even 20 percent, maybe 30, 25, in Sicily and parts of Southern, not even in Greece, because Greece has a lot, it's really Sicily and parts of Southern Italy. And then in the rest of Southern Europe, it's closer to 40 percent. So in Spain, it's closer to 40 percent of steppe ancestry. In Northern Europe, steppe ancestry is 50 to 60. So that steppe ancestry means that there's genetic variation between these different groups. That genetic variation is picked up in clustering algorithms, whether it's principal component analysis that kind of maps it out on a 2D plot, or admixture analysis that takes a model and generates clustering output from the model. What those methods are detecting, what 23andMe is detecting, what Ancestry is detecting, is a geographical partitioning based on a history of separation and admixture that's variable, right? It's reflecting real history and real events that make the populations different. That's all I'm saying here. So when people are saying like race is a myth, what they're really saying, I mean, first of all, they want to say that there's no difference between populations and to justify egalitarianism, candidly. Okay, but set that aside and I'll get to that later. What they're really saying is that all the categories that we use are very, very kludgy and imperfect, and they don't reflect the underlying complexity and detail, and it's kind of relevant. A lot of people in Australia can tell the difference between Africans and Melanesians, but a lot of people in the rest of the world cannot.
But Africans and Melanesians are actually quite genetically distinct. Melanesians are just as distant from Africans as, well, actually, they're a little bit more distant because of Denisovans. But the point is Melanesians are just as different from Africans as Swedes are different from Africans. They're both just from the Out of Africa migration. It's just they look really, they look very different due to differential adaptations and stuff like that, right? But Africans in Africa, even though they have dark skin just like the Melanesians, and they have woolly hair just like the Melanesians, it doesn't mean that they're actually. Thank you for listening. To hear the rest of the monologue, please go to razib.substack.com and subscribe.

Speaker 1:
[33:35] This podcast for kids. This is my favorite podcast.