
Causal Inference - Paul Rosenbaum ***

The whole business of how we can use statistics to decide if something is caused by something else is crucially important to science, whether it's about the impact of a vaccine or deciding whether or not a spray of particles in the Large Hadron Collider has been caused by the decay of a Higgs boson. 'Correlation is not causality' is a mantra of science, because it's so easy to mistakenly infer a causal link from things that happen close together in space and time. As a result I was delighted with the idea of what the cover describes as a 'nontechnical guide to the basic ideas of modern causal inference'.

Paul Rosenbaum starts with a driving factor - deducing the effects of medical treatments - and goes on to bring in the significance of randomised experiments versus the problems of purely observational studies, digs into covariates and ways to bring experiment-like features into observational studies, brings up issues of replication and finishes with the impact of uncertainty and complexity. These are mostly exactly the kind of topics that should be covered in such a guide, and as such it hits the spot. But, unfortunately, while it is indeed an effective introductory guide for scientists who aren't mathematicians, Rosenbaum fails to make this accessible to a nontechnical audience.

Rosenbaum quotes mathematician George Pólya as saying that we need a notation that is 'unambiguous, pregnant, easy to remember…' I would have been happier with this book if Rosenbaum had explained how a mathematical notation could possibly be pregnant. (He doesn't.) But, more importantly, the notation used is simply not easy to remember for a nontechnical audience. Within one page of it starting to be used, I had to keep looking back to see what the different parts meant. 

We are told that a causal effect is 'a comparison of outcomes' and in the first example given this is rTw - rCw. Bits of this are relatively clear. T and C are treatment and control. W is George Washington (as the example is about his being treated, then dying soon after). I'm guessing 'r' refers to result, though that term isn't used in the text, but most importantly it's not obvious why the 'causal effect' is those two variables, set to arbitrary values, with one subtracted from the other. I'm pretty familiar with algebra and statistics, but I rapidly found the symbolic representations used hard to follow - there has to be a better way if you are writing for a general audience: it appears the author doesn't know how to do this. 

The irritating thing is that Rosenbaum doesn't then make use of this representation - he's lost half the readership for no reason. The rest of the book is more descriptive, but time after time the examples are described in a way that is going to put people off, bringing in unnecessary jargon and simply reading like a textbook without the detail. Take the opening of the jauntily headed section 'Matching for Covariates as a Method of Adjustment': 'In figure 4 [which is several pages back in a different chapter], we saw more extensive periodontal disease amongst smokers, but we were not convinced that we were witnessing an effect caused by smoking. The figure compared the periodontal disease outcomes of treated individuals and controls who were not comparable. In figures 2-3 we saw that the smokers and nonsmokers were not comparable. The simplest solution is to compare individuals who are comparable, or at least comparable in ways we can see.'

This is a classic example of the importance of being aware of who the audience is and what the book is supposed to do. To reach that target nontechnical audience, the book would have to have been far less of a textbook light, rethinking the way the material is put across. The content is fine for a technical audience who aren't mathematicians - so this is still a useful book - but the content certainly isn't well-presented for the general public.

Review by Brian Clegg
