
Weapons of Math Destruction - Cathy O'Neil ****

As a poacher-turned-gamekeeper of the big data world, Cathy O'Neil is ideally placed to take us on a voyage of horrible discovery into the world of systems that make decisions based on big data and can have a negative influence on lives - what she refers to as 'Weapons of Math Destruction' or WMDs. After working as a 'quant' in a hedge fund and on big data crunching systems for startups, she has developed a horror of the misuse of the technology and sets out to show us how unfair it can be.
It's not that O'Neil is against big data per se. She points out examples where it can be useful and effective - but this requires the systems to be transparent and to be capable of learning from their mistakes. In the examples we discover, from systems that rate school teachers to those that decide whether or not to issue a payday loan, the system is opaque, secretive and based on a set of rules that aren't tested against reality and regularly updated to produce a fair outcome.
The teacher grading system is probably the most dramatically inaccurate example, where the system is trying to measure how well a teacher has performed, based on data that only has a very vague link to actual outcomes - so, for instance, O'Neil tells of a teacher who scored 6% one year and 96% the next year for doing the same job. The factors being measured are almost entirely outside the teacher's control with no linkage to performance and the interpretation of the data is simply garbage.
Other systems, such as those used to rank universities, are ruthlessly gamed by the participants, so that the rankings say far more about how good an organisation is at coming up with the right answers to the metrics than about the quality of that organisation. And all of us will come across targeted advertising and social media messages or search results prioritised according to secret algorithms that we know nothing about and that attempt to control our behaviour.
For O'Neil, the worst aspects of big data misuse are where a system - perhaps with the best intentions - ends up penalising people for being poor or being from certain ethnic backgrounds. This is often a result of an indirect piece of data - for instance, the place they live might have implications for their financial state or ethnicity. She vividly portrays the way that systems dealing with everything from police presence in an area to fixing insurance premiums can produce a downward spiral of negative feedback.
Although the book is often very effective, it is heavily US-oriented, which is a shame when many of these issues are as significant, say, in Europe, as they are in the US. There is probably also not enough nuance in the author's binary good/bad opinion of systems. For example, she tells us that someone shouldn't be penalised by having to pay more for insurance because they live in a high risk neighbourhood - but doesn't think about the contrary aspect that if insurance companies don't do this, those of us who live in low risk neighbourhoods are being penalised by paying much higher premiums than we need to in order to cover our insurance. 

O'Neil makes a simplistic linkage between high risk = poor, low risk = rich - yet those of us, for instance, who live in the country are often in quite poor areas that are nonetheless low risk. For O'Neil, fairness means everyone pays the same. But is that truly fair? Here in Europe, we've had car insurance for young female drivers doubled in cost to make it the same as young males - even though the young males are far more likely to have accidents. This is fair by O'Neil's standards, because it doesn't discriminate on gender, but is not fair in the real world away from labels.
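To put rough numbers on the cross-subsidy argument above, here is a minimal sketch in Python. The group sizes and expected claim costs are invented purely for illustration (they are not from the book or from any insurer), and the function name pooled_premium is hypothetical. It compares a single flat premium with premiums set to each group's own expected cost.

```python
def pooled_premium(groups):
    """Flat premium for everyone: total expected claims / total policyholders."""
    total_claims = sum(g["members"] * g["expected_claim"] for g in groups)
    total_members = sum(g["members"] for g in groups)
    return total_claims / total_members


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
groups = [
    {"name": "low risk", "members": 800, "expected_claim": 100.0},
    {"name": "high risk", "members": 200, "expected_claim": 500.0},
]

flat = pooled_premium(groups)

# Positive difference: the group pays more than its own expected cost.
# Negative difference: the group is subsidised by the others.
differences = {g["name"]: flat - g["expected_claim"] for g in groups}

for name, diff in differences.items():
    print(f"{name}: flat premium {flat:.2f}, difference {diff:+.2f}")
```

With these made-up figures the flat premium is 180, so the low risk group pays 80 more than its expected cost while the high risk group pays 320 less. The sketch ignores insurer costs and profit, and it says nothing about which arrangement is fairer - it only shows that the subsidy has to come from somewhere.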
There's a lot here that we should be picking up on, and even if you don't agree with all of O'Neil's assessments, it certainly makes you think about the rights and wrongs of decisions based on automated assessment of indirect data.


Review by Brian Clegg

Comments

  1. This book is basically a discussion of the negative effects of 'unfair' models when applied to people, with copious examples given. The author has been working as a data scientist, a role that I split into two types - those who can explain their statistical models and those who can't explain their machine learning (ML) models. Note that sometimes a statistical model can be derived from an ML model. Most of the models are described as black boxes - presumably ML models.

    The author confesses to being an SJW although that doesn't mean that at times her ire isn't justified - thankfully the book doesn't read as a polemic. She's an American and thinks that it was a good idea that the British NHS stopped rejecting doctors for poor English language skills. Unfortunately, according to the local news, this has led to patients dying. She worked on Wall Street during the crash of 2007, which she is rightly critical of. Interestingly she is a fan of Barack Obama even though the then Senator didn't support the 2005 Bill that would have stopped Fannie Mae and Freddie Mac from making those subprime loans.

    There is very little technical analysis beyond discussion of the inputs and weightings. A/B testing is presented early on in a very negative light and near the end mentioned again in a better light. In the conclusion it is suggested that the amount of data could be reduced to sacrifice accuracy for fairness with no mention that it could instead increase the quality of predictions by eliminating over-fitting. So not really my sort of book, I prefer less data and more technical details.

