
Can Computers write Science Books? - Brian Clegg

The German academic publisher Springer has for some time been using automated editing software (with mixed results) - but it has recently brought out a whole book written by a piece of AI software called Beta Writer. The book, Lithium-Ion Batteries: A Machine-Generated Summary of Current Research, can be downloaded free of charge as a PDF. But is this a serious challenge for science writers?

It's certainly interesting. If I'm honest, this is hardly a book at all - it's more the output of an automated abstract generator pulled together in book form, where frankly this information would be far better just as a web page. However, there's no doubt that there is some interesting work going on here, particularly in the introduction and conclusion sections of the 'book'.

The whole thing starts with a (human-written) preface explaining the technology - by far the most readable part of the text. We then get four 'chapters' of machine-generated content, each with the format introduction / set of abstracts / conclusion. Obviously it's the introduction and conclusion that provide the most interest.

I'll focus on the first introduction, though the same criticisms apply throughout. The first test of a piece of scientific writing meant to be readable is to take a step back and get an overview of a chunk of text - does it look like English or is it dominated by acronyms and numbers? A chunk out of the first page shows that this is very dense technical text, extremely low on readability:



The other two significant indicators of readability are whether the text is a collection of fact statements or is written using connectives and summary to give flow, and whether or not overall there is a structure that takes the reader by the hand and leads them through a communication process. On both tests, the book falls down in a big way. Pretty well every sentence is a standalone fact statement that could be a bullet point: there is no flow whatsoever. And although some attempt has been made to group these statements effectively, there is no sense of a thought-through structure. In the interminable-seeming introductions - the first one runs to 22 dense pages - there is no sense that we are going anywhere, just that we are experiencing randomly thrown together bits of data.
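The first two checks described above (how much of a passage is acronyms and numbers, and how often sentences are tied together with connectives) can be roughly approximated in code. The Python sketch below is only an illustration under loud assumptions: the function names, the `CONNECTIVES` word list and the tokenising rules are all hypothetical, and a crude count like this is no substitute for actually reading the text.

```python
import re

# Hypothetical, deliberately short list of connectives (an assumption, not exhaustive).
CONNECTIVES = {"however", "therefore", "because", "although", "thus", "but", "so", "instead"}


def dense_token_fraction(text):
    """Fraction of tokens that contain a digit or look like an acronym (all caps, 2+ chars)."""
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", text)
    if not tokens:
        return 0.0

    def is_dense(token):
        has_digit = any(ch.isdigit() for ch in token)
        is_acronym = len(token) >= 2 and token.isupper()
        return has_digit or is_acronym

    return sum(is_dense(t) for t in tokens) / len(tokens)


def connective_sentence_fraction(text):
    """Fraction of sentences containing at least one word from CONNECTIVES."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    linked = 0
    for sentence in sentences:
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        if words & CONNECTIVES:
            linked += 1
    return linked / len(sentences)
```

A high value from `dense_token_fraction` alongside a low value from `connective_sentence_fraction` would be consistent with the "standalone fact statements" impression, though the third test, overall structure, is not something a simple count like this can capture.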

Inevitably, an automated process will produce some sentences that don't quite work, so one essential here is to see whether these have been captured and fixed. A reasonably high percentage of the content does make grammatical sense, but there are regular hiccups - for example we get: 

  • 'That sort of research's principal aim...' - the possessive is clumsy; 'The principal aim of that sort of research' would read more naturally.
  • 'Materials, a number of metal oxides with high theoretical capacity have aroused more and more attention including...' - that 'Materials,' start makes no sense.
  • 'Through Tang and others, mesoporous nanosheet is synthesized...' - sounds painful.
  • 'It is still maintained the huge capacity of 611 mAg-1... when utilized as an anode.' - doesn't make any sense.
  • 'Apart from, few-layer nanosheets enhance a fast insertion...' - apart from what?
  • And so on for many, many more examples.

Going on comments I've had from some Springer authors, the level of uncaught or automatic-editing-generated errors is fairly high in their human-authored publications - these books tend not to be heavily edited - but because they are starting with far more readable text, this is less of an issue.

So, should science writers be worried? Obviously, as a professional writer myself I'm biased, but I would say 'No' - at least, not yet. The text in the introductions and conclusions is nowhere near the readability of a decent technical science book, let alone the far higher writing quality required for a good popular science book. And the outcome also emphasises that even if, long-term, automated writing becomes more common, it is always likely to need a look over by a human editor to catch errors. However, this is a fascinating experiment and Springer should be congratulated for getting this far.
