Want to write a best-seller? Scientists claim this algorithm will tell you how

'Statistical stylometry' looks at vast amounts of data in order to sift out the stylistic tropes that define a popular novel

Sophie Murray-Morris
Friday 10 January 2014 15:31 GMT
Encyclopædia Britannica, Eleventh Edition (Stewart Butterfield / Creative Commons)

Ever wondered what makes a novel a success? Computer scientists in the US think they might have found the answer.

The new technique, which has an accuracy rate of 84%, can tell aspiring writers whether their book will shoot to fame or flop, even before it is published.

Researchers at New York-based Stony Brook University analysed over 40,000 books from a broad range of genres, as well as film scripts, to collate the findings. Notable titles included A Tale of Two Cities by Charles Dickens and The Lost Symbol by Dan Brown.

The technique, called statistical stylometry, distinguishes highly successful literature from less successful works by using vast amounts of data to define variations in literary style between one writer or genre and another.

The researchers defined a book’s success by looking at its download figures and Amazon sales records.

A high percentage of verbs, adverbs and foreign words could be the reason why some books fail, according to the research. Less successful books may also rely on verbs that explicitly describe actions and emotions, such as “wanted”, “took”, “promised”, “cried” and “cheered”. They may also depend on overused, clichéd words such as “love”, and on commonplace geographical settings.

In contrast, more successful books use more conjunctions, such as “and”, “but” and “or”. They also include more thought-processing verbs, such as “recognised” and “remembered”, the research revealed.
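
For readers curious about what this kind of counting looks like in practice, here is a minimal Python sketch. It is not the researchers' method: it simply measures what share of a text's words fall into the example word groups quoted in this article. The function name `style_profile` and the three word lists are assumptions made for illustration, and a real stylometric system would draw on far richer features.

```python
import re
from collections import Counter

# Word lists copied from the examples quoted in the article. They are
# illustrative assumptions, not the feature set used in the study.
CONJUNCTIONS = {"and", "but", "or"}
ACTION_EMOTION_VERBS = {"wanted", "took", "promised", "cried", "cheered"}
THOUGHT_VERBS = {"recognised", "remembered"}


def style_profile(text):
    """Return the share of tokens that fall into each illustrative word group."""
    # Crude tokenisation: lower-case runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        # Avoid dividing by zero on empty input.
        return {
            "conjunctions": 0.0,
            "action_emotion_verbs": 0.0,
            "thought_verbs": 0.0,
        }

    def share(words):
        return sum(counts[w] for w in words) / total

    return {
        "conjunctions": share(CONJUNCTIONS),
        "action_emotion_verbs": share(ACTION_EMOTION_VERBS),
        "thought_verbs": share(THOUGHT_VERBS),
    }


sample = "She remembered the house and the garden, but he wanted to leave."
profile = style_profile(sample)
print(profile)
```

Proportions like these could, in principle, be fed into a classifier trained on books already labelled as more or less successful, though the sketch stops short of that step.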

Yejin Choi, assistant professor at Stony Brook University, said: “Predicting the success of literary works poses a massive dilemma for publishers and aspiring writers alike.”

She added: “Based on novels across different genres, we investigated the predictive power of statistical stylometry in discriminating successful literary works, and identified the stylistic elements that are more prominent in successful writings.”

“Our work is the first that provides quantitative insights into the connection between the writing style and the success of literary works.”
