Can Computers write Science Books? - Brian Clegg

The German academic publisher Springer has for some time been using automated editing software (with mixed results), but has recently brought out a whole book written by a piece of AI software called Beta Writer. The book, Lithium-Ion Batteries: A Machine-Generated Summary of Current Research, can be downloaded free of charge as a PDF. But is this a serious challenge for science writers?

It's certainly interesting. If I'm honest, this is hardly a book at all - it's more the output of an automated abstract generator pulled together in book form, where frankly this information would be far better just as a web page. However, there's no doubt that there is some interesting work going on here, particularly in the introduction and conclusion sections of the 'book'.

The whole thing starts with a (human-written) preface explaining the technology - by far the most readable part of the text. We then get four 'chapters' of machine-generated content, each with the format introduction / set of abstracts / conclusion. Obviously it's the introductions and conclusions that provide the most interest.

I'll focus on the first introduction, though the same criticisms apply throughout. The first test of a piece of scientific writing meant to be readable is to take a step back and get an overview of a chunk of text - does it look like English or is it dominated by acronyms and numbers? A chunk out of the first page shows that this is very dense technical text, extremely low on readability:



The other two significant indicators of readability are whether the text is a collection of fact statements or is written using connectives and summary to give flow, and whether or not overall there is a structure that takes the reader by the hand and leads them through a communication process. On both tests, the book falls down in a big way. Pretty well every sentence is a standalone fact statement that could be a bullet point: there is no flow whatsoever. And although some attempt has been made to group these statements effectively, there is no sense of a thought-through structure. In the interminable-seeming introductions - the first one runs to 22 dense pages - there is no sense that we are going anywhere, just that we are experiencing randomly thrown together bits of data.
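The first two of these tests are subjective, but a rough version of each can be put into code. What follows is only a sketch under my own assumptions: the function names, the tiny connective list and the rule that a token counts as 'dense' if it contains a digit or at least two capital letters are all illustrative, not anything used by Springer or Beta Writer, and no substitute for actually reading the text.

```python
import re

# Hypothetical, deliberately short list of connectives (an assumption, not from the book).
CONNECTIVES = {"however", "therefore", "because", "although", "but",
               "thus", "moreover", "whereas", "so", "while"}


def dense_token_ratio(text: str) -> float:
    """Fraction of tokens that look like numbers, formulae or acronyms."""
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    if not tokens:
        return 0.0
    dense = sum(
        1
        for t in tokens
        # 'Dense' here means: contains a digit, or has two or more capitals.
        if any(c.isdigit() for c in t) or sum(c.isupper() for c in t) >= 2
    )
    return dense / len(tokens)


def connective_sentence_ratio(text: str) -> float:
    """Fraction of sentences containing at least one listed connective."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    hits = sum(
        1
        for s in sentences
        if CONNECTIVES & set(re.findall(r"[a-z]+", s.lower()))
    )
    return hits / len(sentences)


if __name__ == "__main__":
    sample = "LIBs with LiFePO4 reach 611 units. However, the cell failed because it overheated."
    print(dense_token_ratio(sample))
    print(connective_sentence_ratio(sample))
```

A high first number combined with a low second one would flag the sort of text described above, though neither figure says anything about whether there is an overall structure leading the reader through.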

Inevitably, an automated process will produce some sentences that don't quite work, so one essential here is to see whether these have been captured and fixed. A reasonably high percentage of the content does make grammatical sense, but there are regular hiccups - for example, we get:

  • 'That sort of research's principal aim...' - 'principal' is the right word here, but the clumsy possessive would read better as 'The principal aim of that sort of research...'.
  • 'Materials, a number of metal oxides with high theoretical capacity have aroused more and more attention including...' - that 'Materials,' start makes no sense.
  • 'Through Tang and others, mesoporous nanosheet is synthesized...' - sounds painful.
  • 'It is still maintained the huge capacity of 611 mAg-1... when utilized as an anode.' - doesn't make any sense.
  • 'Apart from, few-layer nanosheets enhance a fast insertion...' - apart from what?
  • And so on for many, many more examples.

Going on comments I've had from some Springer authors, the level of uncaught or automatic-editing-generated errors is fairly high in their human-authored publications - these books tend not to be heavily edited - but because they are starting with far more readable text, this is less of an issue.

So, should science writers be worried? Obviously, as a professional writer myself, I'm biased, but I would say 'No' - at least, not yet. The text in the introductions and conclusions is nowhere near the readability of a decent technical science book, let alone the far higher writing quality required for a good popular science book. And the outcome also emphasises that even if, long-term, automated writing becomes more common, it is always likely to need a look-over by a human editor to avoid errors creeping in. However, this is a fascinating experiment and Springer should be congratulated for getting this far.
