
Causal Inference - Paul Rosenbaum ***

The whole business of how we can use statistics to decide if something is caused by something else is crucially important to science, whether it's about the impact of a vaccine or deciding whether or not a spray of particles in the Large Hadron Collider has been caused by the decay of a Higgs boson. 'Correlation is not causality' is a mantra of science, because it's so easy to wrongly infer a causal link between things that happen close together in space and time. As a result I was delighted with the idea of what the cover describes as a 'nontechnical guide to the basic ideas of modern causal inference'.

Paul Rosenbaum starts with a driving factor - deducing the effects of medical treatments - and goes on to bring in the significance of randomised experiments versus the problems of purely observational studies, digs into covariates and ways to bring experiment-like features into observational studies, raises issues of replication and finishes with the impact of uncertainty and complexity. These are mostly exactly the kinds of topics that should be covered in such a guide, and as such it hits the spot. But, unfortunately, while it is indeed an effective introductory guide for scientists who aren't mathematicians, Rosenbaum fails to make this accessible to a nontechnical audience.

Rosenbaum quotes mathematician George Pólya as saying that we need a notation that is 'unambiguous, pregnant, easy to remember…' I would have been happier with this book if Rosenbaum had explained how a mathematical notation could possibly be pregnant. (He doesn't.) But, more importantly, the notation used is simply not easy to remember for a nontechnical audience. Within a page of the notation first being used, I had to keep looking back to see what the different parts meant.

We are told that a causal effect is 'a comparison of outcomes', and in the first example given this is r_TW - r_CW. Bits of this are relatively clear: T and C are treatment and control, and W is George Washington (as the example is about his being treated, then dying soon after). I'm guessing 'r' refers to result, though that term isn't used in the text. But most importantly, it's not obvious why the 'causal effect' should be those two quantities, with one subtracted from the other. I'm pretty familiar with algebra and statistics, but I rapidly found the symbolic representations used hard to follow - there has to be a better way if you are writing for a general audience: it appears the author doesn't know how to do this.
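For readers trying to follow along, the notation appears to be the standard 'potential outcomes' setup used in Rosenbaum's field - this is a sketch of the general idea rather than the book's own wording. Here r_TW is the outcome Washington would have experienced had he been treated, r_CW the outcome he would have experienced under the control, and the causal effect of the treatment on Washington is the comparison

\[ \text{causal effect on Washington} \;=\; r_{TW} - r_{CW} \]

with the awkward catch that only one of the two outcomes can ever actually be observed for the same person - which is what makes causal inference hard in the first place.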

The irritating thing is that Rosenbaum doesn't then make use of this representation - he's lost half the readership for no reason. The rest of the book is more descriptive, but time after time the examples are described in a way that is going to put people off, bringing in unnecessary jargon and reading more like a textbook, just without the detail. Take the opening of the jauntily headed section 'Matching for Covariates as a Method of Adjustment': 'In figure 4 [which is several pages back in a different chapter], we saw more extensive periodontal disease amongst smokers, but we were not convinced that we were witnessing an effect caused by smoking. The figure compared the periodontal disease outcomes of treated individuals and controls who were not comparable. In figures 2-3 we saw that the smokers and nonsmokers were not comparable. The simplest solution is to compare individuals who are comparable, or at least comparable in ways we can see.'

This is a classic example of the importance of being aware of who the audience is and what the book is supposed to do. To reach that target nontechnical audience, the book would have to have been far less of a textbook lite, rethinking the way the material is put across. The content is fine for a technical audience who aren't mathematicians - so this is still a useful book - but it certainly isn't well presented for the general public.

Review by Brian Clegg - See all Brian's online articles or subscribe to a weekly email free here
