
Can Computers Write Science Books? - Brian Clegg

The German academic publisher Springer has for some time been using automated editing software (with mixed results) - but it has recently brought out a whole book written by a piece of AI software called Beta Writer. The book, Lithium-Ion Batteries: A Machine-Generated Summary of Current Research, can be downloaded free of charge as a PDF. But is this a serious challenge for science writers?

It's certainly interesting. If I'm honest, this is hardly a book at all - it's more the output of an automated abstract generator pulled together in book form, and frankly the information would work far better as a web page. However, there's no doubt that some worthwhile work is going on here, particularly in the introduction and conclusion sections of the 'book'.

The whole thing starts with a (human-written) preface explaining the technology - by far the most readable part of the text. We then get four 'chapters' of machine-generated content, each with the format introduction / set of abstracts / conclusion. Obviously it's the introductions and conclusions that provide the most interest.

I'll focus on the first introduction, though the same criticisms apply throughout. The first test of a piece of scientific writing meant to be readable is to take a step back and get an overview of a chunk of text - does it look like English or is it dominated by acronyms and numbers? A chunk out of the first page shows that this is very dense technical text, extremely low on readability:



The other two significant indicators of readability are whether the text is a collection of fact statements or is written using connectives and summary to give flow, and whether or not overall there is a structure that takes the reader by the hand and leads them through a communication process. On both tests, the book falls down in a big way. Pretty well every sentence is a standalone fact statement that could be a bullet point: there is no flow whatsoever. And although some attempt has been made to group these statements effectively, there is no sense of a thought-through structure. In the interminable-seeming introductions - the first one runs to 22 dense pages - there is no sense that we are going anywhere, just that we are experiencing randomly thrown together bits of data.

Inevitably, an automated process will produce some sentences that don't quite work, so one essential here is to see whether these have been captured and fixed. A reasonably high percentage of the content does make grammatical sense, but there are regular hiccups - for example we get: 

  • 'That sort of research's principal aim...' - it should be 'principle' not 'principal'. 
  • 'Materials, a number of metal oxides with high theoretical capacity have aroused more and more attention including...' - that 'Materials,' start makes no sense.
  • 'Through Tang and others, mesoporous nanosheet is synthesized...' - sounds painful.
  • 'It is still maintained the huge capacity of 611 mAg-1... when utilized as an anode.' - doesn't make any sense.
  • 'Apart from, few-layer nanosheets enhance a fast insertion...' - apart from what?
  • And so on for many, many more examples.

Going on comments I've had from some Springer authors, the level of uncaught or automatic-editing-generated errors is fairly high in their human-authored publications - these books tend not to be heavily edited - but because they are starting with far more readable text, this is less of an issue.

So, should science writers be worried? Obviously, as a professional writer myself I'm biased, but I would say 'No' - at least, not yet. The text in the introductions and conclusions is nowhere near the readability of a decent technical science book, let alone the far higher writing quality required for a good popular science book. And the outcome also emphasises that even if, long-term, automated writing becomes more common, it is always likely to need a look over by a human editor to avoid errors creeping in. However, this is a fascinating experiment and Springer should be congratulated for getting this far.
