Letters and numbers
15 May 2014
The director of the Centre for Literary and Linguistic Computing at the University of Newcastle knows it's not every English scholar's cup of Twinings, but he does love his stats.
As enamoured as Professor Hugh Craig, of the School of Humanities and Social Science, is of English Renaissance literature, he is more likely to dissect Shakespeare or Jonson through sophisticated computer software, or to devour Bacon on his Kindle, than he is to perch himself in a wingback in the corner of the reading room, hardcover in one hand, cup of tea in the other.
The Centre for Literary and Linguistic Computing was established in 1989 by then vice-chancellor Keith Morgan so that the retiring professor of English, John Burrows, could continue his groundbreaking research in computer-assisted analysis.
Hugh, who had known John Burrows since first-year university in Sydney, has been involved with the centre from the outset.
Hugh explains that one of the first significant areas of computer-assisted "text mining" was in the exploration of the use and frequency of the most common words in the English language such as "and" and "the".
They found that, from author to author and period to period, the frequency of these words "did vary in revealing ways".
"The frequency of common words is a very rich source of data. It could help us understand styles, and it could help us work out who wrote what," Hugh says.
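For readers curious what counting common words looks like in practice, here is a minimal sketch in Python. It is an illustration only, not the centre's software or method: the short word list, the function names `function_word_profile` and `profile_distance`, and the simple distance measure are all assumptions made for the example.

```python
import re
from collections import Counter

# Illustrative list only; real studies use far longer lists of common words.
FUNCTION_WORDS = ("the", "and", "of", "to", "a", "in")


def function_word_profile(text, words=FUNCTION_WORDS):
    """Return each common word's share of all word tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {w: 0.0 for w in words}
    counts = Counter(tokens)
    return {w: counts[w] / total for w in words}


def profile_distance(p, q):
    """Sum of absolute differences between two profiles (hypothetical measure)."""
    return sum(abs(p[w] - q[w]) for w in p)


if __name__ == "__main__":
    text_a = "The cat and the dog sat in the shade of a tree."
    text_b = "To be, or not to be, that is the question."
    prof_a = function_word_profile(text_a)
    prof_b = function_word_profile(text_b)
    print(prof_a)
    print(prof_b)
    print(profile_distance(prof_a, prof_b))
```

Two texts with similar profiles would score a small distance and dissimilar ones a larger distance. That is the rough intuition behind asking "who wrote what", though genuine attribution work relies on much larger samples and more careful statistics.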
Such forensic investigation continues to develop at the centre, regarded as a pioneer and world leader in its field.
Computer analysis, for example, has proven "William Shakespeare had a vocabulary very similar in size to those of other writers of his time".
The expertise of "big data" devotee Pablo Moscato, head of the University of Newcastle's bioinformatics program, proved invaluable.
"There is this myth of Shakespeare that his [extensive] vocabulary was what made him great. It is a mistaken basis," Hugh says, adding that many a sleepless night has failed to reveal the true answer to his genius.
Since the CLLC's beginnings there have been major developments in the computer software, now known as the Intelligent Archive, where a multitude of texts are stored and analysed.
"It's like using a power tool as opposed to a hand tool," Hugh says, marvelling at its speed and accuracy.
Researchers to have benefitted from the centre include PhD students Louisa Connors, who has just completed her thesis on the difference between plays that were written to be performed and those written to be read, and Jack Elliott, who is investigating how bound to formula Mills & Boon writers are; and post-doctoral scholar Alexis Antonia, whose work explores the writing styles of journalists in Victorian magazines.
One of the current projects of the CLLC involves a collaboration with speech pathology researchers, exploring whether people's language changes as they age.
Written responses by women in their 20s through to those in their 80s sourced from the Australian Longitudinal Study on Women's Health have been fed into the computer and analysed.
Among the findings are that women in their 20s tend to challenge the system and are interested in societal and gender issues; women in their 80s are preoccupied with physical frailty; women of mature age use the word "blessed" a lot; and women in their late 40s to early 50s change very rapidly from a younger mindset to that of a senior citizen.
"The underlying theory is we all speak in a different way," Hugh says.
"The interesting aspect of language is that we share it to communicate, but when we use it we make a special dialect out of it. We individualise it.
"Computational stylistics explores this dimension of human culture, of our being."
And so the quest continues.
"The whole fibre of our being as English scholars is to trust instincts rather than numbers. But the challenge is on to produce results that interest people. I am forever looking out for things that people might say that I could test," Hugh smiles.