Skip to main content

Can Computers write Science Books? - Brian Clegg

The German academic publisher Springer has for some time been using automated editing software (with mixed results) - but recently has brought out a whole book written by a piece of AI software called Beta Writer. The book, Lithium-Ion Batteries: a machine generated summary of current research, can be downloaded free of charge as a PDF. But is this a serious challenge for science writers?

It's certainly interesting. If I'm honest, this is hardly a book at all - it's more the output of an automated abstract generator pulled together in book form, where frankly this information would be far better just as a web page. However, there's no doubt that there is some interesting work going on here, particularly in the introduction and conclusion sections of the 'book'.

The whole thing starts with a (human written) preface explaining the technology - by far the most readable part of the text. We then get four 'chapters' of machine-generated content, which each have the format introduction/ set of abstracts / conclusion. Obviously it's the introduction and conclusion that provide the most interest.

I'll focus on the first introduction, though the same criticisms apply throughout. The first test of a piece of scientific writing meant to be readable is to take a step back and get an overview of a chunk of text - does it look like English or is it dominated by acronyms and numbers? A chunk out of the first page shows that this is very dense technical text, extremely low on readability:



The other two significant indicators of readability are whether the text is a collection of fact statements or is written using connectives and summary to give flow, and whether or not overall there is a structure that takes the reader by the hand and leads them through a communication process. On both tests, the book falls down in a big way. Pretty well every sentence is a standalone fact statement that could be a bullet point: there is no flow whatsoever. And although some attempt has been made to group these statements effectively, there is no sense of a thought-through structure. In the interminable-seeming introductions - the first one runs to 22 dense pages - there is no sense that we are going anywhere, just that we are experiencing randomly thrown together bits of data.

Inevitably, an automated process will produce some sentences that don't quite work, so one essential here is to see whether these have been captured and fixed. A reasonably high percentage of the content does make grammatical sense, but there are regular hiccups - for example we get: 

  • 'That sort of research's principal aim...' - it should be 'principle' not 'principal'. 
  • 'Materials, a number of metal oxides with high theoretical capacity have aroused more and more attention including...' - that 'Materials,' start makes no sense.
  • 'Through Tang and others, mesoporous nanosheet is synthesized...' - sounds painful.
  • 'It is still maintained the huge capacity of 611 mAg-1... when utilized as an anode.' - doesn't make any sense.
  • 'Apart from, few-layer nanosheets enhance a fast insertion...' - apart from what?
  • And so on for many, many more examples.

Going on comments I've had from some Springer authors, the level of uncaught or automatic-editing-generated errors is fairly high in their human-authored publications - these books tend not to be heavily edited - but because they are starting with far more readable text, this is less of an issue.

So, should science writers be worried? Obviously, as a professional writer myself I'm biassed, but I would say 'No' - at least, not yet. The text in the introductions and conclusions is nowhere near the readability of a decent technical science book, let alone the far higher writing quality required for a good popular science book. And the outcome also emphasises that even if, long-term, automated writing becomes more common, it is always likely to need a look over by a human editor to avoid errors creeping in. However, this is a fascinating experiment and Springer should be congratulated for getting this far.

Comments

Popular posts from this blog

Why Nobody Understands Quantum Physics - Frank Verstraete and Céline Broeckaert **

It's with a heavy heart that I have to say that I could not get on with this book. The structure is all over the place, while the content veers from childish remarks to unexplained jargon. Frank Versraete is a highly regarded physicist and knows what he’s talking about - but unfortunately, physics professors are not always the best people to explain physics to a general audience and, possibly contributed to by this being a translation, I thought this book simply doesn’t work. A small issue is that there are few historical inaccuracies, but that’s often the case when scientists write history of science, and that’s not the main part of the book so I would have overlooked it. As an example, we are told that Newton's apple story originated with Voltaire. Yet Newton himself mentioned the apple story to William Stukeley in 1726. He may have made it up - but he certainly originated it, not Voltaire. We are also told that â€˜Galileo discovered the counterintuitive law behind a swinging o...

Ctrl+Alt+Chaos - Joe Tidy ****

Anyone like me with a background in programming is likely to be fascinated (if horrified) by books that present stories of hacking and other destructive work mostly by young males, some of whom have remarkable abilities with code, but use it for unpleasant purposes. I remember reading Clifford Stoll's 1990 book The Cuckoo's Egg about the first ever network worm (the 1988 ARPANet worm, which accidentally did more damage than was intended) - the book is so engraved in my mind I could still remember who the author was decades later. This is very much in the same vein,  but brings the story into the true internet age. Joe Tidy gives us real insights into the often-teen hacking gangs, many with members from the US and UK, who have caused online chaos and real harm. These attacks seem to have mostly started as pranks, but have moved into financial extortion and attempts to destroy others' lives through doxing, swatting (sending false messages to the police resulting in a SWAT te...

Battle of the Big Bang - Niayesh Afshordi and Phil Harper *****

It's popular science Jim, but not as we know it. There have been plenty of popular science books about the big bang and the origins of the universe (including my own Before the Big Bang ) but this is unique. In part this is because it's bang up to date (so to speak), but more so because rather than present the theories in an approachable fashion, the book dives into the (sometimes extremely heated) disputed debates between theoreticians. It's still popular science as there's no maths, but it gives a real insight into the alternative viewpoints and depth of feeling. We begin with a rapid dash through the history of cosmological ideas, passing rapidly through the steady state/big bang debate (though not covering Hoyle's modified steady state that dealt with the 'early universe' issues), then slow down as we get into the various possibilities that would emerge once inflation arrived on the scene (including, of course, the theories that do away with inflation). ...