Skip to main content

Can Computers write Science Books? - Brian Clegg

The German academic publisher Springer has for some time been using automated editing software (with mixed results) - but recently has brought out a whole book written by a piece of AI software called Beta Writer. The book, Lithium-Ion Batteries: a machine generated summary of current research, can be downloaded free of charge as a PDF. But is this a serious challenge for science writers?

It's certainly interesting. If I'm honest, this is hardly a book at all - it's more the output of an automated abstract generator pulled together in book form, where frankly this information would be far better just as a web page. However, there's no doubt that there is some interesting work going on here, particularly in the introduction and conclusion sections of the 'book'.

The whole thing starts with a (human written) preface explaining the technology - by far the most readable part of the text. We then get four 'chapters' of machine-generated content, which each have the format introduction/ set of abstracts / conclusion. Obviously it's the introduction and conclusion that provide the most interest.

I'll focus on the first introduction, though the same criticisms apply throughout. The first test of a piece of scientific writing meant to be readable is to take a step back and get an overview of a chunk of text - does it look like English or is it dominated by acronyms and numbers? A chunk out of the first page shows that this is very dense technical text, extremely low on readability:



The other two significant indicators of readability are whether the text is a collection of fact statements or is written using connectives and summary to give flow, and whether or not overall there is a structure that takes the reader by the hand and leads them through a communication process. On both tests, the book falls down in a big way. Pretty well every sentence is a standalone fact statement that could be a bullet point: there is no flow whatsoever. And although some attempt has been made to group these statements effectively, there is no sense of a thought-through structure. In the interminable-seeming introductions - the first one runs to 22 dense pages - there is no sense that we are going anywhere, just that we are experiencing randomly thrown together bits of data.

Inevitably, an automated process will produce some sentences that don't quite work, so one essential here is to see whether these have been captured and fixed. A reasonably high percentage of the content does make grammatical sense, but there are regular hiccups - for example we get: 

  • 'That sort of research's principal aim...' - it should be 'principle' not 'principal'. 
  • 'Materials, a number of metal oxides with high theoretical capacity have aroused more and more attention including...' - that 'Materials,' start makes no sense.
  • 'Through Tang and others, mesoporous nanosheet is synthesized...' - sounds painful.
  • 'It is still maintained the huge capacity of 611 mAg-1... when utilized as an anode.' - doesn't make any sense.
  • 'Apart from, few-layer nanosheets enhance a fast insertion...' - apart from what?
  • And so on for many, many more examples.

Going on comments I've had from some Springer authors, the level of uncaught or automatic-editing-generated errors is fairly high in their human-authored publications - these books tend not to be heavily edited - but because they are starting with far more readable text, this is less of an issue.

So, should science writers be worried? Obviously, as a professional writer myself I'm biassed, but I would say 'No' - at least, not yet. The text in the introductions and conclusions is nowhere near the readability of a decent technical science book, let alone the far higher writing quality required for a good popular science book. And the outcome also emphasises that even if, long-term, automated writing becomes more common, it is always likely to need a look over by a human editor to avoid errors creeping in. However, this is a fascinating experiment and Springer should be congratulated for getting this far.

Comments

Popular posts from this blog

The Infinity Machine - Sebastian Mallaby ****

It's very quickly clear that Sebastian Mallaby is a huge Demis Hassabis fan - writing about the only child prodigy and teen genius ever who was also a nice, rounded personality. After a few chapters, though, things settle down (I'm reminded of Douglas Adams' description of the Hitchhiker's Guide to the Galaxy ) and we get a good, solid trip through the journey that gave us DeepMind, their AlphaGo and AlphaFold programs, the sudden explosion of competition on the AI front and thoughts on artificial general intelligence. Although Mallaby does occasionally still go into fan mode - reading this you would think that AlphaFold had successfully perfectly predicted the structure of every protein, where it is usually not sufficiently accurate for its results to have direct practical application - we get a real feel for the way this relatively unusual company was swiftly and successfully developed away from Silicon Valley. It's readable and gives an important understanding of...

Beyond Belief - Helen Pearson *****

Apparently it comes as a surprise to many that medicine was not particularly scientific until the end of the twentieth century (to be honest, it's no surprise to me - we had a GP who used homeopathy in the 90s). Instead it was based on anecdotal guidance - the kind of thing that appeared to work. Evidence-based medicine has since improved the field, trying where possible to base decisions on evidence, ideally based on randomised controlled trials. The first part of Helen Pearson's book covers this well - though I think it's by far the least interesting part of what we discover. Instead what's truly fascinating is the rest of it, looking at a wide range of other fields where evidence was rarely properly used and that are only now starting to dip a toe in the water. These include social policy, policing, conservation, business and education. The main part of the book gives us examples of how bad these areas have been in terms of basing decisions on what's always been ...

In Seach of Sea Dragons - Matthew Myerscough ****

It's common advice to would-be authors of narrative non-fiction to open with something dramatic - Matthew Myerscough certainly does this with the story of his being trapped under an avalanche on Snowdon (while his girlfriend, also carried away remains on top of the snow unhurt). It certainly is dramatic, but seemed entirely disconnected from the reason I got the book, which was to read about fossil collecting.  Luckily, though, in the second chapter we get into a more conventional 'how I got interested in fossils as a boy'. Having recently reviewed Patrick Moore's autobiography and noting that astronomy was one of the few sciences where amateurs can still make a contribution, it came to mind that palaeontology is another - Myerscough is a civil engineer by trade, but just as amateur astronomers can find new details in the skies, so amateur fossil hunters have been searching for these relics for centuries. When I give talks in junior schools, the two topics that guarant...