by Mic Wright
If there is a ghost in the machine, it’s increasingly a grammarian and a petulant pedant in equal measure. Spellcheckers are old hat, but a new generation of language tools from tech giants like Google and start-ups like Grammarly is expanding far beyond error correction. Increasingly, they’re veering into attempts at analysing tone and flagging up when they believe you’re using the “wrong” terms for people and things.
The first spellcheck arrived long before personal computing became common, in the age of mainframes. Les Earnest, a research scientist and lecturer in computer science at MIT, put together a list of 10,000 words to create a spelling checker sub-routine for a project he was working on back in 1961. In 1965, Earnest moved institutions to help set up Stanford’s pioneering Artificial Intelligence Lab (SAIL) and, two years later, put graduate students to work designing new spellchecking programmes based on the word list he put together.
One product of the exercise was Ralph Gorin’s SPELL, a programme that could interactively suggest possible correct spellings for unidentified words. Gorin made his work publicly available and SPELL spread to other academic institutions, inspiring further teams to produce their own spell checkers.
It wasn’t until 1980 that the first spellcheckers for personal computers started to appear. At first, they were standalone applications but were quickly built into word processing packages with delightfully ’80s-style names like WordStar and WordPerfect. As they moved beyond simply correcting English, spellcheckers had to become far more sophisticated to deal with the intricacies of agglutinative languages like Hungarian, Turkish and Finnish, which often express concepts in complex words made up of many elements.
“How fast will new Word 6.0 fix typos? How fast can you make them?” a Microsoft advert boasted in 1993, promoting the arrival of an innovation many damn daily – AutoCorrect. Which is either a great addition to life or a ducking [sic] irritant, depending on your perspective.
While the original AutoCorrect could be sold as quite magical, it didn’t actually use a dictionary to spot typos. Instead, it checked each word against a table of everyday mistakes (so “teh” became “the”, for instance). By the late ’90s, though, AutoCorrect had become more sophisticated, able to check the words you typed against a dictionary and find the closest matches for any unknown strings of characters.
If there was ambiguity, AutoCorrect left the word alone. The approach remains largely the same to this day and Google’s spellchecker is scanning these very words as I type them into the Google doc that will become the article you’re reading.
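For the curious, the two stages described above (a fixed table of common slips, then a dictionary search that backs off when the answer is ambiguous) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word lists, the `autocorrect` function and the use of Levenshtein edit distance as the measure of “closest” are all hypothetical, and none of it is a description of how Word or Google Docs is actually built.

```python
# Hypothetical table of everyday slips and their fixes (the early approach).
COMMON_TYPOS = {"teh": "the", "adn": "and", "recieve": "receive"}

# Hypothetical, deliberately tiny dictionary (the later approach).
DICTIONARY = {"the", "and", "receive", "cat", "cot", "word", "typing", "spell"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: single-character insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def autocorrect(word: str, max_distance: int = 1) -> str:
    """Return a correction, or the word unchanged when unsure."""
    lower = word.lower()
    if lower in DICTIONARY:
        return word  # already a known word
    if lower in COMMON_TYPOS:
        return COMMON_TYPOS[lower]  # stage one: table lookup
    # Stage two: score every dictionary entry and keep the nearest ones.
    distances = {entry: edit_distance(lower, entry) for entry in DICTIONARY}
    best = min(distances.values())
    if best > max_distance:
        return word  # nothing close enough
    closest = [entry for entry, d in distances.items() if d == best]
    # If two or more entries tie, the result is ambiguous: leave the word alone.
    return closest[0] if len(closest) == 1 else word


print(autocorrect("teh"))    # fixed by the table
print(autocorrect("typng"))  # fixed by the dictionary search
print(autocorrect("cbt"))    # "cat" or "cot"? Left untouched.
```

The final check is the important design choice: when “cat” and “cot” are equally near, the sketch declines to guess rather than silently replacing what was typed.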
The trouble is that increasingly interventionist algorithms casting a virtual eye over our words can force language to evolve according to the decisions of a small number of giant companies. Speaking to The New York Times magazine in 2014, Thierry Fontenelle, a linguist who managed Microsoft’s natural language processing team from 2001 to 2009, noted that Word’s spellcheck offered to correct Barack Obama’s surname to “Osama” when he was first coming to national attention and warned that AutoCorrect could raise the stakes: “Now I’m not even going to bother suggesting something, I’m going to replace it automatically. That’s when things start becoming dangerous.”
Which brings me to the changes Google has announced to how its products will suggest words and phrases. It says it wants to encourage people who use its products to choose non-gendered language – suggesting replacements for words such as “chairman” and “postman” – and to avoid “offensive language”, though it didn’t define what would and wouldn’t count as “offensive”.
The issue is not with encouraging inclusive language, but that Google is in a position to decide which language is inclusive and offensive. Just as it is deliberately opaque about the factors that go into organising and presenting the results produced by its search engine, it’s far from open about how it makes decisions about the treatment of language in its products.
Features driven by algorithms are not free from bias; they simply reflect the bias of their creators and the societies those creators live in. In 2016, Google had to make changes to the autocomplete feature in its search engine after it began suggesting “…evil?” when people typed queries that started with “Are Jews…”. A similar bias came to light in 2018, when Google altered Gmail’s Smart Compose feature to stop it using gender-based pronouns, as the risk of incorrectly predicting someone’s sex or gender was too high. The company’s engineers had discovered that bias was built into the system around words such as job titles. When someone typed, “I am meeting an investor next week,” Smart Compose tended to suggest “Do you want to meet him?” as a follow-up question.
SpellCheck, and its more presumptuous younger sibling AutoCorrect, reflect languages as we speak them, largely moulded and evolved by choices we make as writers. But Google’s language-shifting suggestions and features like Grammarly’s attempts at sentiment analysis try to nudge us into changing our intentions. From there, it’s not a great leap to feel that allowing a handful of huge corporations to tell us what we actually mean to say is a bit sinister.
Arthur Koestler, who popularised the phrase “ghost in the machine” – which was originally coined by the Oxford philosopher Gilbert Ryle to describe the Cartesian duality of mind and body – wrote in his book of the same title: “Aberrations of the human mind are to a large extent due to the obsessional pursuit of some part-truth, treated as if it were a whole truth.” Allowing algorithms to tell us what we really meant to type in real time, as we write, runs the risk of turning everything we say into a series of part-truths. And that can lead to the whole “truth” resting in the hands of executives you’ll probably never meet, working within corporations who have agendas that are often very different to your own.
Mic Wright is a freelance writer and journalist based in London. He writes about technology, culture and politics.