of montreal

In Of Montreal, Robert Everett-Green writes weekly about the people, places and events that make Montreal a distinctive cultural capital.

"I doubt that a recipe for a bestseller could be as easy to follow as a recipe for a cake." So said novelist Marie-Hélène Poitras in Le Devoir recently, after agreeing to try a recipe for bestselling American fiction developed by McGill University's .txtLAB, a research unit that applies computer analysis to literary texts.

Over the past several Saturdays, the French-language Montreal daily has conducted a competitive experiment in using digital text analysis to change how writers write. The paper asked five established Quebec novelists to compose a story of about 1,200 words using guidelines produced by .txtLAB from a study of 200 titles from the New York Times bestselling fiction list.

Common features of American bestsellers, according to .txtLAB director Andrew Piper, are short sentences (11 words on average), simple actions relayed with active verbs, frequent descriptions of facial expressions, and characters who are into technology and have a mystery or violent crime to solve. These books avoid complex emotions, uncertainty and nature description, he says, as well as tea, rats, giants and bears.
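
For readers curious how a figure such as "11 words on average" might be arrived at, here is a minimal sketch in Python. It is not .txtLAB's method, which the lab has not described here; the function name and the crude rule for splitting sentences are assumptions made purely for illustration.

```python
import re


def average_sentence_length(text: str) -> float:
    """Return the mean number of words per sentence in `text`.

    Assumption: a sentence ends at '.', '!' or '?' followed by whitespace,
    and a word is any run of non-space characters. Real text analysis
    would need a more careful tokenizer (abbreviations, dialogue, etc.).
    """
    # Split after sentence-ending punctuation, keeping the punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    # Count whitespace-separated words in each sentence.
    word_counts = [len(s.split()) for s in sentences]
    return sum(word_counts) / len(sentences)


if __name__ == "__main__":
    # Hypothetical sample text, invented for this example.
    sample = "She ran fast. He hid."
    print(average_sentence_length(sample))
```

A story could be checked against the guideline by comparing the returned value with 11, though a single average says nothing about how much the sentences vary.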

Most writers of literary fiction regard bestsellers with a mixture of envy for the numbers involved and disgust for the kind of writing that often racks them up. Le Devoir lightened its assignment by presenting it as a game, with a reader poll to decide the winner.

Most of the four francophone writers whose responses have appeared so far treated the task as a joke, conspicuously ticking off items on Piper's list, importing characters from Star Wars (in Monique Proulx's On ne rit pas) or going all meta on the brief (Daniel Grenier, whose Annie courait features a Meta-Troll). Only Stéphane Dompierre's Millionaire fauché took the challenge more or less seriously. It was a gothic mystery tale written in flat, simple sentences that mimicked the dull music of some bestsellers while propelling me to the last line – a page-turner just one page long.

The .txtLAB unit is part of McGill's investment in the blooming field of digital humanities, which, over the past five years, has attached itself to faltering humanities departments, including literary studies. Digital humanities involves many things, but in this instance, it's what you get when you stop reading a text and start counting and sorting its working parts – a practice known as quantitative analysis.

Some quantitative analyses of literary texts can produce real insights about evolutions in style and vocabulary. Others rely on what Piper himself calls "admittedly blunt tools" to make broad generalizations. In a December article in the New Republic, he and colleague Richard Jean So claim to chart the rise and fall of sentimentality in fiction – at a peak in Victorian times, they say, and declining ever since. That conclusion seems more sensible than the way they got there: by measuring the frequency of "sentimental words" such as "abominable" and "rapturous." An arbitrary and archaic word list is too coarse a net to catch the sentimentality in a sentence such as, "Tell me if Tiny Tim will live."
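
The word-list approach is simple enough to sketch in a few lines of Python. What follows is an illustration only, not Piper and So's code: the lexicon contains just the two words quoted above, and the function name and tokenizing rule are assumptions.

```python
import re

# Assumption: a stand-in lexicon holding only the two words quoted in the
# article. The researchers' actual list is not reproduced here.
SENTIMENTAL_WORDS = {"abominable", "rapturous"}


def sentimental_rate(text: str, lexicon=SENTIMENTAL_WORDS) -> float:
    """Return the share of words in `text` that appear in `lexicon`."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)


if __name__ == "__main__":
    print(sentimental_rate("What an abominable, rapturous day"))
    print(sentimental_rate("Tell me if Tiny Tim will live."))
```

With this stand-in list, the Tiny Tim sentence scores zero because none of its words are in the lexicon, which is the weakness of the method in miniature.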

A quantitative analysis of 40,000 novels in 2014, by a team at the Stony Brook campus of the State University of New York, claimed an 84-per-cent rate of predicting literary success. But its main criterion for "success" was the number of times a book had been sourced from Project Gutenberg, a free text-sharing site that includes nothing new or under copyright. In this kind of analysis, "prediction" often means studying a text "blind," then seeing whether your results tally with what actually happened to the book after publication, whether in 1978 or 1850.

The Stony Brook study's focus on texts that were many decades old produced no solid criteria for predicting the future success of a new book, contrary to many hopeful media headlines. A lot of bestselling fiction fades into obscurity over time, as recent Giller Prize-winner André Alexis noted in a discussion with Piper published at the start of Le Devoir's series.

Piper claims that .txtLAB has a 75-per-cent prediction rate for bestsellers, but that, too, is retrospective. It doesn't mean that any unpublished book has become a bestseller on the basis of a .txtLAB recommendation. But it does mean, Piper says, that he has information that could usefully affect what is created in the future. The point of Le Devoir's experiment, in his view, was not to get Quebec authors to play at being clones of bestselling Americans, but to see how they might appropriate aspects of blockbuster fiction into their own voice and style.

In any case, Piper says, "bestsellers are much more diverse than the word lets on." A profile of the average bestselling novel may be like a computer-generated sketch of an average face that resembles no one in particular.

It's easy to find writing by bestselling authors that breaks .txtLAB's rules. "As it slowly sinks behind the mountains, the sun sprays light so warmly coloured and so mordant that, where touched, the darkening lands appear to be wet with it and dyed forever." That's from the first page of Intensity, a bestselling novel by Dean Koontz, whose books have sold more than 450 million copies. Note the length of the sentence and the nature description, both barred by Piper's criteria. Note also the botched poetry of "the sun sprays light," so close to Bill Manhoff's satirical "the sun spits morning" in The Owl and the Pussycat, a play about a failed novelist. I doubt that any set of computer-generated rules could measure the licence given to very popular storytellers to write badly.

The odd thing about Le Devoir's experiment in recipe fiction is that there has been no allusion to the single biggest barrier to any francophone writer aiming to produce "an American bestseller." Surely, the task should have been defined as writing a text that a translator such as Sheila Fischman might convert into a bestseller in the United States, which is a very hard job indeed. According to figures from the University of Rochester's translation program, only 3 per cent of books published in the U.S. are translations, and most of those aren't fiction.

"Despite the quality of these books," says a post on the program's Three Percent webpage, "most translations go virtually unnoticed [in the U.S.] and never find their audience." Best to treat the whole thing as a game, as Le Devoir did, and leave it to Shakespeare to deploy the ultimate active verb in this story: Exit, pursued by a bear.