Can Computers write Science Books? - Brian Clegg

The German academic publisher Springer has for some time been using automated editing software (with mixed results) - but recently has brought out a whole book written by a piece of AI software called Beta Writer. The book, Lithium-Ion Batteries: a machine generated summary of current research, can be downloaded free of charge as a PDF. But is this a serious challenge for science writers?

It's certainly interesting. If I'm honest, this is hardly a book at all - it's more the output of an automated abstract generator pulled together in book form, where frankly this information would be far better just as a web page. However, there's no doubt that there is some interesting work going on here, particularly in the introduction and conclusion sections of the 'book'.

The whole thing starts with a (human-written) preface explaining the technology - by far the most readable part of the text. We then get four 'chapters' of machine-generated content, each with the format introduction / set of abstracts / conclusion. Obviously it's the introduction and conclusion that provide the most interest.

I'll focus on the first introduction, though the same criticisms apply throughout. The first test of a piece of scientific writing meant to be readable is to take a step back and get an overview of a chunk of text - does it look like English or is it dominated by acronyms and numbers? A chunk out of the first page shows that this is very dense technical text, extremely low on readability:



The other two significant indicators of readability are whether the text is a collection of fact statements or is written using connectives and summary to give flow, and whether or not overall there is a structure that takes the reader by the hand and leads them through a communication process. On both tests, the book falls down in a big way. Pretty well every sentence is a standalone fact statement that could be a bullet point: there is no flow whatsoever. And although some attempt has been made to group these statements effectively, there is no sense of a thought-through structure. In the interminable-seeming introductions - the first one runs to 22 dense pages - there is no sense that we are going anywhere, just that we are experiencing randomly thrown together bits of data.
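For anyone who wants to try the 'step back' test and the connectives test on a chunk of text, here is a minimal sketch in Python. It is my own rough heuristic, not anything Beta Writer or Springer uses: the function name `readability_signals`, the short `CONNECTIVES` list and the rule for what counts as a 'technical' token are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately short list of connectives (an assumption, not a standard list).
CONNECTIVES = {"however", "because", "therefore", "although", "so",
               "but", "thus", "instead", "meanwhile"}

# A token is a run of letters/digits, optionally joined by '.' or '-' (e.g. 'g-1', '3.5').
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:[.\-][A-Za-z0-9]+)*")


def readability_signals(text):
    """Return rough proportions of technical tokens and connective words."""
    tokens = TOKEN_PATTERN.findall(text)
    if not tokens:
        return {"technical_ratio": 0.0, "connective_ratio": 0.0}

    # Assumption: a token is 'technical' if it contains a digit
    # or is an all-capitals acronym of two or more characters.
    technical = [
        t for t in tokens
        if any(c.isdigit() for c in t) or (len(t) >= 2 and t.isupper())
    ]
    connectives = [t for t in tokens if t.lower() in CONNECTIVES]

    return {
        "technical_ratio": len(technical) / len(tokens),
        "connective_ratio": len(connectives) / len(tokens),
    }


if __name__ == "__main__":
    dense = "The LIB anode reached 611 mAh g-1 after 100 cycles."
    flowing = "However, the battery degraded because the anode cracked."
    print(readability_signals(dense))
    print(readability_signals(flowing))
```

A high technical ratio combined with a connective ratio near zero is the pattern described above: a string of standalone fact statements. It says nothing about the third test, overall structure, which still needs a human reader.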

Inevitably, an automated process will produce some sentences that don't quite work, so one essential here is to see whether these have been captured and fixed. A reasonably high percentage of the content does make grammatical sense, but there are regular hiccups - for example we get: 

  • 'That sort of research's principal aim...' - the possessive is clumsy, though 'principal' is the correct word here, not 'principle'.
  • 'Materials, a number of metal oxides with high theoretical capacity have aroused more and more attention including...' - that 'Materials,' start makes no sense.
  • 'Through Tang and others, mesoporous nanosheet is synthesized...' - sounds painful.
  • 'It is still maintained the huge capacity of 611 mAg-1... when utilized as an anode.' - doesn't make any sense.
  • 'Apart from, few-layer nanosheets enhance a fast insertion...' - apart from what?
  • And so on for many, many more examples.

Going on comments I've had from some Springer authors, the level of uncaught or automatic-editing-generated errors is fairly high in their human-authored publications - these books tend not to be heavily edited - but because they are starting with far more readable text, this is less of an issue.

So, should science writers be worried? Obviously, as a professional writer myself I'm biased, but I would say 'No' - at least, not yet. The text in the introductions and conclusions is nowhere near the readability of a decent technical science book, let alone the far higher writing quality required for a good popular science book. And the outcome also emphasises that even if, long-term, automated writing becomes more common, it is always likely to need a look over by a human editor to avoid errors creeping in. However, this is a fascinating experiment and Springer should be congratulated for getting this far.
