Skip to main content

Can Computers write Science Books? - Brian Clegg

The German academic publisher Springer has for some time been using automated editing software (with mixed results) - but recently has brought out a whole book written by a piece of AI software called Beta Writer. The book, Lithium-Ion Batteries: a machine generated summary of current research, can be downloaded free of charge as a PDF. But is this a serious challenge for science writers?

It's certainly interesting. If I'm honest, this is hardly a book at all - it's more the output of an automated abstract generator pulled together in book form, where frankly this information would be far better just as a web page. However, there's no doubt that there is some interesting work going on here, particularly in the introduction and conclusion sections of the 'book'.

The whole thing starts with a (human written) preface explaining the technology - by far the most readable part of the text. We then get four 'chapters' of machine-generated content, which each have the format introduction/ set of abstracts / conclusion. Obviously it's the introduction and conclusion that provide the most interest.

I'll focus on the first introduction, though the same criticisms apply throughout. The first test of a piece of scientific writing meant to be readable is to take a step back and get an overview of a chunk of text - does it look like English or is it dominated by acronyms and numbers? A chunk out of the first page shows that this is very dense technical text, extremely low on readability:



The other two significant indicators of readability are whether the text is a collection of fact statements or is written using connectives and summary to give flow, and whether or not overall there is a structure that takes the reader by the hand and leads them through a communication process. On both tests, the book falls down in a big way. Pretty well every sentence is a standalone fact statement that could be a bullet point: there is no flow whatsoever. And although some attempt has been made to group these statements effectively, there is no sense of a thought-through structure. In the interminable-seeming introductions - the first one runs to 22 dense pages - there is no sense that we are going anywhere, just that we are experiencing randomly thrown together bits of data.

Inevitably, an automated process will produce some sentences that don't quite work, so one essential here is to see whether these have been captured and fixed. A reasonably high percentage of the content does make grammatical sense, but there are regular hiccups - for example we get: 

  • 'That sort of research's principal aim...' - it should be 'principle' not 'principal'. 
  • 'Materials, a number of metal oxides with high theoretical capacity have aroused more and more attention including...' - that 'Materials,' start makes no sense.
  • 'Through Tang and others, mesoporous nanosheet is synthesized...' - sounds painful.
  • 'It is still maintained the huge capacity of 611 mAg-1... when utilized as an anode.' - doesn't make any sense.
  • 'Apart from, few-layer nanosheets enhance a fast insertion...' - apart from what?
  • And so on for many, many more examples.

Going on comments I've had from some Springer authors, the level of uncaught or automatic-editing-generated errors is fairly high in their human-authored publications - these books tend not to be heavily edited - but because they are starting with far more readable text, this is less of an issue.

So, should science writers be worried? Obviously, as a professional writer myself I'm biassed, but I would say 'No' - at least, not yet. The text in the introductions and conclusions is nowhere near the readability of a decent technical science book, let alone the far higher writing quality required for a good popular science book. And the outcome also emphasises that even if, long-term, automated writing becomes more common, it is always likely to need a look over by a human editor to avoid errors creeping in. However, this is a fascinating experiment and Springer should be congratulated for getting this far.

Comments

Popular posts from this blog

It's On You - Nick Chater and George Loewenstein *****

Going on the cover you might think this was a political polemic - and admittedly there's an element of that - but the reason it's so good is quite different. It shows how behavioural economics and social psychology have led us astray by putting the focus way too much on individuals. A particular target is the concept of nudges which (as described in Brainjacking ) have been hugely over-rated. But overall the key problem ties to another psychological concept: framing. Huge kudos to both Nick Chater and George Loewenstein - a behavioural scientist and an economics and psychology professor - for having the guts to take on the flaws in their own earlier work and that of colleagues, because they make clear just how limited and potentially dangerous is the belief that individuals changing their behaviour can solve large-scale problems. The main thesis of the book is that there are two ways to approach the major problems we face - an 'i-frame' where we focus on the individual ...

Introducing Artificial Intelligence – Henry Brighton & Howard Selina ****

It is almost impossible to rate these relentlessly hip books – they are pure marmite*. The huge  Introducing  … series (a vast range of books covering everything from Quantum Theory to Islam), previously known as …  for Beginners , puts across the message in a style that owes as much to Terry Gilliam and pop art as it does to popular science. Pretty well every page features large graphics with speech bubbles that are supposed to emphasise the point. Funnily,  Introducing Artificial Intelligence  is both a good and bad example of the series. Let’s get the bad bits out of the way first. The illustrators of these books are very variable, and I didn’t particularly like the pictures here. They did add something – the illustrations in these books always have a lot of information content, rather than being window dressing – but they seemed more detached from the text and rather lacking in the oomph the best versions have. The other real problem is that...

The Laws of Thought - Tom Griffiths *****

In giving us a history of attempts to explain our thinking abilities, Tom Griffiths demonstrates an excellent ability to pitch information just right for the informed general reader.  We begin with Aristotelian logic and the way Boole and others transformed it into a kind of arithmetic before a first introduction of computing and theories of language. Griffiths covers a surprising amount of ground - we don't just get, for instance, the obvious figures of Turing, von Neumann and Shannon, but the interaction between the computing pioneers and those concerned with trying to understand the way we think - for example in the work of Jerome Bruner, of whom I confess I'd never heard.  This would prove to be the case with a whole host of people who have made interesting contributions to the understanding of human thought processes. Sometimes their theories were contradictory - this isn't an easy field to successfully observe - but always they were interesting. But for me, at least, ...