Can Computers write Science Books? - Brian Clegg

The German academic publisher Springer has for some time been using automated editing software (with mixed results) - but recently it has brought out a whole book written by a piece of AI software called Beta Writer. The book, Lithium-Ion Batteries: a machine-generated summary of current research, can be downloaded free of charge as a PDF. But is this a serious challenge for science writers?

It's certainly interesting. If I'm honest, this is hardly a book at all - it's more the output of an automated abstract generator pulled together in book form, where frankly this information would be far better just as a web page. However, there's no doubt that there is some interesting work going on here, particularly in the introduction and conclusion sections of the 'book'.

The whole thing starts with a (human-written) preface explaining the technology - by far the most readable part of the text. We then get four 'chapters' of machine-generated content, each of which has the format introduction / set of abstracts / conclusion. Obviously it's the introduction and conclusion that provide the most interest.

I'll focus on the first introduction, though the same criticisms apply throughout. The first test of a piece of scientific writing meant to be readable is to take a step back and get an overview of a chunk of text - does it look like English or is it dominated by acronyms and numbers? A chunk out of the first page shows that this is very dense technical text, extremely low on readability:



The other two significant indicators of readability are whether the text is a collection of fact statements or is written using connectives and summary to give flow, and whether or not overall there is a structure that takes the reader by the hand and leads them through a communication process. On both tests, the book falls down in a big way. Pretty well every sentence is a standalone fact statement that could be a bullet point: there is no flow whatsoever. And although some attempt has been made to group these statements effectively, there is no sense of a thought-through structure. In the interminable-seeming introductions - the first one runs to 22 dense pages - there is no sense that we are going anywhere, just that we are experiencing randomly thrown together bits of data.
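The first test above, stepping back to see whether a chunk of text is dominated by acronyms and numbers, can be roughly approximated in code. The following is only a minimal sketch under stated assumptions: the function name `dense_token_share` is hypothetical, and treating any all-capitals token of two or more characters, or any token containing a digit, as 'dense' is a crude heuristic of my own choosing, not an established readability measure and not anything Springer or Beta Writer uses.

```python
import re


def dense_token_share(text: str) -> float:
    """Return the fraction of tokens that look like acronyms or numbers.

    Assumption (illustrative only): a token counts as 'dense' if it
    contains a digit, or is at least two characters long and all capitals.
    """
    # Split into word-like tokens made of letters, digits and hyphens.
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", text)
    if not tokens:
        return 0.0

    def is_dense(token: str) -> bool:
        has_digit = any(ch.isdigit() for ch in token)
        is_acronym = len(token) >= 2 and token.isupper()
        return has_digit or is_acronym

    return sum(is_dense(t) for t in tokens) / len(tokens)


if __name__ == "__main__":
    plain = "The battery stores energy and releases it slowly over time."
    technical = "LIB SEI 611 mAh LiFePO4 at 2C"
    print(dense_token_share(plain))
    print(dense_token_share(technical))
```

A higher share suggests text that will look less like English at a glance. It says nothing about flow or structure, the other two tests, which still need a human reader.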

Inevitably, an automated process will produce some sentences that don't quite work, so one essential here is to see whether these have been captured and fixed. A reasonably high percentage of the content does make grammatical sense, but there are regular hiccups - for example we get: 

  • 'That sort of research's principal aim...' - 'principal' is the right word, but the possessive is clumsy: 'The principal aim of that sort of research...' would read better.
  • 'Materials, a number of metal oxides with high theoretical capacity have aroused more and more attention including...' - that 'Materials,' start makes no sense.
  • 'Through Tang and others, mesoporous nanosheet is synthesized...' - sounds painful.
  • 'It is still maintained the huge capacity of 611 mAg-1... when utilized as an anode.' - doesn't make any sense.
  • 'Apart from, few-layer nanosheets enhance a fast insertion...' - apart from what?
  • And so on for many, many more examples.

Going on comments I've had from some Springer authors, the level of uncaught or automatic-editing-generated errors is fairly high in their human-authored publications - these books tend not to be heavily edited - but because they are starting with far more readable text, this is less of an issue.

So, should science writers be worried? Obviously, as a professional writer myself I'm biased, but I would say 'No' - at least, not yet. The text in the introductions and conclusions is nowhere near the readability of a decent technical science book, let alone the far higher writing quality required for a good popular science book. And the outcome also emphasises that even if, long-term, automated writing becomes more common, it is always likely to need a look over by a human editor to avoid errors creeping in. However, this is a fascinating experiment and Springer should be congratulated for getting this far.
