
Can Computers write Science Books? - Brian Clegg

The German academic publisher Springer has for some time been using automated editing software (with mixed results) - but it has recently brought out a whole book written by a piece of AI software called Beta Writer. The book, Lithium-Ion Batteries: A Machine-Generated Summary of Current Research, can be downloaded free of charge as a PDF. But is this a serious challenge for science writers?

It's certainly interesting. If I'm honest, this is hardly a book at all - it's more the output of an automated abstract generator pulled together in book form, and frankly the information would work far better as a web page. However, there's no doubt that there is some interesting work going on here, particularly in the introduction and conclusion sections of the 'book'.

The whole thing starts with a (human-written) preface explaining the technology - by far the most readable part of the text. We then get four 'chapters' of machine-generated content, each with the format introduction / set of abstracts / conclusion. Obviously it's the introductions and conclusions that provide the most interest.

I'll focus on the first introduction, though the same criticisms apply throughout. The first test of a piece of scientific writing meant to be readable is to take a step back and get an overview of a chunk of text - does it look like English or is it dominated by acronyms and numbers? A chunk out of the first page shows that this is very dense technical text, extremely low on readability:



The other two significant indicators of readability are whether the text is a collection of fact statements or is written using connectives and summary to give flow, and whether or not overall there is a structure that takes the reader by the hand and leads them through a communication process. On both tests, the book falls down in a big way. Pretty well every sentence is a standalone fact statement that could be a bullet point: there is no flow whatsoever. And although some attempt has been made to group these statements effectively, there is no sense of a thought-through structure. In the interminable-seeming introductions - the first one runs to 22 dense pages - there is no sense that we are going anywhere, just that we are experiencing randomly thrown together bits of data.

Inevitably, an automated process will produce some sentences that don't quite work, so one essential here is to see whether these have been captured and fixed. A reasonably high percentage of the content does make grammatical sense, but there are regular hiccups - for example we get: 

  • 'That sort of research's principal aim...' - a clumsy possessive; 'The principal aim of that research' would read far better.
  • 'Materials, a number of metal oxides with high theoretical capacity have aroused more and more attention including...' - that 'Materials,' start makes no sense.
  • 'Through Tang and others, mesoporous nanosheet is synthesized...' - sounds painful.
  • 'It is still maintained the huge capacity of 611 mAg-1... when utilized as an anode.' - doesn't make any sense.
  • 'Apart from, few-layer nanosheets enhance a fast insertion...' - apart from what?
  • And so on for many, many more examples.

Going on comments I've had from some Springer authors, the level of uncaught or automatic-editing-generated errors is fairly high in their human-authored publications - these books tend not to be heavily edited - but because they are starting with far more readable text, this is less of an issue.

So, should science writers be worried? Obviously, as a professional writer myself I'm biased, but I would say 'No' - at least, not yet. The text in the introductions and conclusions is nowhere near the readability of a decent technical science book, let alone the far higher writing quality required for a good popular science book. And the outcome also emphasises that even if, long-term, automated writing becomes more common, it is always likely to need a look-over by a human editor to stop errors creeping in. However, this is a fascinating experiment and Springer should be congratulated for getting this far.
