Can Computers write Science Books? - Brian Clegg

The German academic publisher Springer has for some time been using automated editing software (with mixed results) - but recently has brought out a whole book written by a piece of AI software called Beta Writer. The book, Lithium-Ion Batteries: a machine generated summary of current research, can be downloaded free of charge as a PDF. But is this a serious challenge for science writers?

It's certainly interesting. If I'm honest, this is hardly a book at all - it's more the output of an automated abstract generator pulled together in book form, where frankly this information would be far better just as a web page. However, there's no doubt that there is some interesting work going on here, particularly in the introduction and conclusion sections of the 'book'.

The whole thing starts with a (human-written) preface explaining the technology, which is by far the most readable part of the text. We then get four 'chapters' of machine-generated content, each of which has the format introduction / set of abstracts / conclusion. Obviously it's the introduction and conclusion that provide the most interest.

I'll focus on the first introduction, though the same criticisms apply throughout. The first test of a piece of scientific writing meant to be readable is to take a step back and get an overview of a chunk of text - does it look like English or is it dominated by acronyms and numbers? A chunk out of the first page shows that this is very dense technical text, extremely low on readability.

The other two significant indicators of readability are whether the text is a collection of fact statements or is written using connectives and summary to give flow, and whether or not there is an overall structure that takes the reader by the hand and leads them through a communication process. On both tests, the book falls down in a big way. Pretty well every sentence is a standalone fact statement that could be a bullet point: there is no flow whatsoever. And although some attempt has been made to group these statements effectively, there is no sense of a thought-through structure. In the interminable-seeming introductions - the first one runs to 22 dense pages - there is no sense that we are going anywhere, just that we are experiencing randomly thrown-together bits of data.
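For anyone who wants to try the first two checks on a chunk of text, here is a rough Python sketch. It is only an illustration: the function names, the short connective list and the regular expressions are all assumptions made for this example, not anything used in this review or by Springer, and it makes no attempt at the third test of overall structure.

```python
import re

# Hypothetical, deliberately short list of connectives (an assumption for
# illustration; a serious check would need a much fuller list).
CONNECTIVES = {
    "however", "therefore", "because", "although", "so",
    "but", "thus", "moreover", "whereas", "consequently",
}


def technical_density(text: str) -> float:
    """Fraction of tokens that are acronyms (2+ capitals) or contain a digit."""
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", text)
    if not tokens:
        return 0.0
    dense = [
        t for t in tokens
        if re.search(r"\d", t) or re.fullmatch(r"[A-Z]{2,}[a-z]?", t)
    ]
    return len(dense) / len(tokens)


def connective_rate(text: str) -> float:
    """Fraction of sentences that contain at least one connective word."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    hits = sum(
        1 for s in sentences
        if CONNECTIVES & set(re.findall(r"[a-z]+", s.lower()))
    )
    return hits / len(sentences)


if __name__ == "__main__":
    sample = "The cell failed. However, it recovered because it cooled."
    print(technical_density(sample), connective_rate(sample))
```

A high `technical_density` and a low `connective_rate` would hint at the kind of dense, bullet-point-like text described above, though neither number is a substitute for actually reading the page.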

Inevitably, an automated process will produce some sentences that don't quite work, so one essential here is to see whether these have been captured and fixed. A reasonably high percentage of the content does make grammatical sense, but there are regular hiccups. For example, we get:

  • 'That sort of research's principal aim...' - a clumsy possessive; 'The principal aim of that sort of research...' would read better.
  • 'Materials, a number of metal oxides with high theoretical capacity have aroused more and more attention including...' - that 'Materials,' start makes no sense.
  • 'Through Tang and others, mesoporous nanosheet is synthesized...' - sounds painful.
  • 'It is still maintained the huge capacity of 611 mAg-1... when utilized as an anode.' - doesn't make any sense.
  • 'Apart from, few-layer nanosheets enhance a fast insertion...' - apart from what?
  • And so on for many, many more examples.

Going on comments I've had from some Springer authors, the level of uncaught or automatic-editing-generated errors is fairly high in their human-authored publications - these books tend not to be heavily edited - but because they are starting with far more readable text, this is less of an issue.

So, should science writers be worried? Obviously, as a professional writer myself I'm biased, but I would say 'No' - at least, not yet. The text in the introductions and conclusions is nowhere near the readability of a decent technical science book, let alone the far higher writing quality required for a good popular science book. And the outcome also emphasises that even if, long-term, automated writing becomes more common, it is always likely to need a look-over by a human editor to stop errors creeping in. However, this is a fascinating experiment and Springer should be congratulated for getting this far.
