Quick Take
- Narration: Eric Jason Martin reads Wiggins and Jones’s scholarly history with an academic steadiness that serves the book’s three-part structure, though some listeners may want more vocal range in the denser historical sections.
- Themes: The long history of data as power, the politics of statistical knowledge, and the relationship between data science and social control
- Mood: Scholarly and revelatory, structured like a university course that keeps surprising you
- Verdict: The most historically rigorous audiobook on data’s political dimensions, structured in three clear arcs that reward cover-to-cover listening in a way that few academic-origin books achieve.
I came across How Data Happened because a friend forwarded me a review that described it as the book she wished existed when she was trying to understand where facial recognition technology came from. That’s an accurate description of what the book provides, and also an indication of why it’s harder to summarize than most data science books. Wiggins and Jones aren’t explaining how data works. They’re explaining how data came to have the power it has, and those are very different projects.
Chris Wiggins is the Chief Data Scientist at the New York Times and a faculty member at Columbia, where he and Matthew Jones created the course on which this book is based. That pedagogical origin shapes the book’s structure productively: it moves in three clear arcs, from the birth of statistics as a discipline through the development of machine learning and into the current moment of algorithmic governance. Each section builds on the previous one, and the progression feels earned rather than arbitrary.
The Victorian Origins That Explain the Present
The book’s first section, on the history of statistics, is the part reviewer James Muncy found most pleasantly surprising. Wiggins and Jones don’t begin with Silicon Valley or even with computers. They begin with the census, with Victorian-era debates about how to measure population, and with the development of eugenics as the first systematic attempt to use statistical data to argue for what was socially true and who deserved social resources. This is not a comfortable history, and the authors don’t make it comfortable. The relationship between statistical methodology and social control is not an accident of bad actors misusing good tools; it is built into the origins of the discipline.
That founding argument gives the book its political seriousness. When Wiggins and Jones get to facial recognition and predictive bail algorithms in the later chapters, you understand those systems not as unexpected intrusions of bias into neutral technology but as the latest iteration of a project that has always been about which truths get counted and who gets to do the counting.
How the Three-Part Structure Works in Audio
Reviewer Jerry, who bought the book twice in different formats to engage with it fully, identifies something real: this is a book where the specific references and academic citations matter enough that print has genuine advantages. Eric Jason Martin’s narration is competent and clear, but passages that involve technical arguments about statistical inference or the genealogy of specific algorithmic approaches benefit from the ability to slow down, reread, and follow citation trails.
That said, Martin’s reading keeps the argumentative thread intact across ten and a half hours. The second section, on big data and the rise of machine learning, is where the book is densest and where the narration does the most work. Martin doesn’t dramatize the material, which is right for a book of this register. He keeps the ideas moving, and the book’s own intelligence carries the weight.
Data as Weapon, Not Just Tool
The phrase the authors use, that data has been used as a tool and a weapon in arguing for what is true, is the book’s most important formulation, and it runs beneath every chapter even when it’s not stated explicitly. The census figures used to argue for immigration restrictions in the early twentieth century, the data collected by surveillance capitalism to argue for what consumers want, the risk scores used to argue for who deserves bail: these are not separate phenomena. They are instances of the same underlying dynamic, the use of numerical authority to produce social facts that benefit particular interests.
This is not a paranoid argument. Wiggins and Jones are careful to distinguish between the claim that data has been used for social control and the claim that data is inherently a tool of control. The book’s concluding argument, that understanding this history enables us to bend data toward collectively chosen ends, is genuinely hopeful rather than defeatist. It requires, however, that we stop thinking of data as neutral and start thinking of it as historical.
Who Should Listen, Who Should Skip
Listen if you want the most historically grounded account available of how data came to shape political power and social decision-making. The Columbia course origins give it a clarity of structure that most books in this space lack.
Skip if you want practical guidance on data science methods or current policy proposals for algorithmic regulation. This is intellectual history, and it operates at that register throughout.
Frequently Asked Questions
Is this book accessible to listeners without a data science background?
Yes, with effort in the denser sections. Wiggins and Jones write for intelligent general readers, not specialists, and they explain technical concepts carefully. The historical and political arguments are fully accessible without technical training, though some of the methodology discussions in the second section reward patience.
How does the book handle eugenics and the darker history of statistics?
Directly and without euphemism. The Victorian origins of statistical methodology are traced through their connection to eugenics with scholarly honesty. This is one of the book’s most important contributions, but listeners should know it engages seriously with disturbing material.
Does the book offer any remedies or policy prescriptions?
In general terms. The concluding argument is that historical understanding enables more intentional choices about how data is used collectively, but Wiggins and Jones don’t offer a specific policy agenda. The book frames understanding as a prerequisite for action rather than providing the action plan itself.
Is this better in print or audio format?
Reviewer Jerry’s experience of buying both formats speaks to a real tradeoff. The audio narration serves the book’s argumentative progression well, but the academic apparatus, citations, and specific historical references benefit from the ability to pause and look things up. Serious students of the topic may want both.