
Can Computers write Science Books? - Brian Clegg

The German academic publisher Springer has for some time been using automated editing software (with mixed results) - but recently has brought out a whole book written by a piece of AI software called Beta Writer. The book, Lithium-Ion Batteries: a machine generated summary of current research, can be downloaded free of charge as a PDF. But is this a serious challenge for science writers?

It's certainly interesting. If I'm honest, this is hardly a book at all - it's more the output of an automated abstract generator pulled together in book form, where frankly this information would be far better just as a web page. However, there's no doubt that there is some interesting work going on here, particularly in the introduction and conclusion sections of the 'book'.

The whole thing starts with a (human-written) preface explaining the technology - by far the most readable part of the text. We then get four 'chapters' of machine-generated content, each of which has the format introduction / set of abstracts / conclusion. Obviously it's the introduction and conclusion that provide the most interest.

I'll focus on the first introduction, though the same criticisms apply throughout. The first test of a piece of scientific writing meant to be readable is to take a step back and get an overview of a chunk of text - does it look like English or is it dominated by acronyms and numbers? A chunk out of the first page shows that this is very dense technical text, extremely low on readability:



The other two significant indicators of readability are whether the text is a collection of fact statements or is written using connectives and summary to give flow, and whether or not overall there is a structure that takes the reader by the hand and leads them through a communication process. On both tests, the book falls down in a big way. Pretty well every sentence is a standalone fact statement that could be a bullet point: there is no flow whatsoever. And although some attempt has been made to group these statements effectively, there is no sense of a thought-through structure. In the interminable-seeming introductions - the first one runs to 22 dense pages - there is no sense that we are going anywhere, just that we are experiencing randomly thrown together bits of data.

Inevitably, an automated process will produce some sentences that don't quite work, so one essential here is to see whether these have been captured and fixed. A reasonably high percentage of the content does make grammatical sense, but there are regular hiccups - for example we get: 

  • 'That sort of research's principal aim...' - a clumsy possessive; 'The principal aim of that sort of research...' would read far better.
  • 'Materials, a number of metal oxides with high theoretical capacity have aroused more and more attention including...' - that 'Materials,' start makes no sense.
  • 'Through Tang and others, mesoporous nanosheet is synthesized...' - sounds painful.
  • 'It is still maintained the huge capacity of 611 mAg-1... when utilized as an anode.' - doesn't make any sense.
  • 'Apart from, few-layer nanosheets enhance a fast insertion...' - apart from what?
  • And so on for many, many more examples.

Going on comments I've had from some Springer authors, the level of uncaught or automatic-editing-generated errors is fairly high in their human-authored publications - these books tend not to be heavily edited - but because they are starting with far more readable text, this is less of an issue.

So, should science writers be worried? Obviously, as a professional writer myself I'm biased, but I would say 'No' - at least, not yet. The text in the introductions and conclusions is nowhere near the readability of a decent technical science book, let alone the far higher writing quality required for a good popular science book. And the outcome also emphasises that even if, long-term, automated writing becomes more common, it is always likely to need a look-over by a human editor to avoid errors creeping in. However, this is a fascinating experiment and Springer should be congratulated for getting this far.
