'Dangerous' AI generates words that don't exist

The algorithm was developed by an ex-Instagram engineer who helped build the app's recommendation algorithm

Adam Smith
Thursday 14 May 2020 16:26 BST
Language has always evolved naturally over time and these are some of the latest mutations (Rex)

A new AI has been created to generate words that do not exist.

The one-shot website develops new, artificially-generated definitions for the non-existent words.

ThisWordDoesNotExist.com generates new words such as “wacamole” (a single serving of waffle batter made with a sweet cornmeal mixture), “pileset” (form a mass of, or make a shape about, something), or “prayman” (the principal or leading men in a society or enterprise).

Users click a button on the site, and a new word is made.

The website was developed by San Francisco-based developer Thomas Dimson, an engineer who used to work for the Facebook-owned Instagram developing its recommendations algorithm.

The artificial intelligence that creates the new words is based on the Transformer, a neural network architecture for natural language processing, and on GPT-2, a language model built on it. GPT-2 can be fed a piece of text and use that information to predict the words that come next, creating writing that can be near-indistinguishable from that written by a human.
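
The idea of predicting the next word can be illustrated with a far simpler stand-in than GPT-2. The Python sketch below is not how GPT-2 works internally: it only counts which word follows which in a tiny sample text and picks the most frequent continuation. The `corpus` text and the `build_table` and `predict_next` helpers are hypothetical names invented for this illustration.

```python
from collections import Counter, defaultdict

# Tiny illustrative text (an assumption); a real model sees vastly more data.
corpus = "the cat sat on the mat and the cat slept on the sofa"


def build_table(text):
    """Count, for each word, how often every other word follows it."""
    words = text.split()
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table


def predict_next(table, word):
    """Return the most frequent follower of `word`, or None if none was seen."""
    followers = table.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


table = build_table(corpus)
print(predict_next(table, "the"))
```

In this sample, "the" is followed by "cat" twice and by "mat" and "sofa" once each, so the sketch predicts "cat". A model such as GPT-2 makes the same kind of guess, but from learned patterns across much longer stretches of text rather than from a simple count of neighbouring words.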

GPT-2 gained notoriety for being “too dangerous to release” but the researchers have since made it available for use.

The underlying model was trained on a dataset of eight million webpages, drawn from links in the most upvoted content on the social media site Reddit. The algorithm detects which words tend to appear next to one another and, by repeating that process enough times, can use the information to generate new words and sentences.
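
A rough sense of how repeated "what comes next" choices can produce a word that does not exist is given by the sketch below. It is a toy letter-by-letter generator, not the site's actual method, and it assumes a made-up list of seed words (`WORDS`) in place of any real training data. The function names are likewise hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical seed vocabulary (an assumption), not the site's training data.
WORDS = ["waffle", "guacamole", "pile", "set", "prayer", "man", "batter", "cornmeal"]

START, END = "^", "$"  # markers for the beginning and end of a word


def build_transitions(words):
    """Record which character has been seen directly after each character."""
    transitions = defaultdict(list)
    for word in words:
        chars = [START] + list(word) + [END]
        for current, following in zip(chars, chars[1:]):
            transitions[current].append(following)
    return transitions


def generate_word(transitions, rng, max_len=12):
    """Pick one character at a time until END is drawn or max_len is reached."""
    out = []
    current = START
    while len(out) < max_len:
        current = rng.choice(transitions[current])
        if current == END:
            break
        out.append(current)
    return "".join(out)


rng = random.Random(0)  # fixed seed so repeated runs give the same words
transitions = build_transitions(WORDS)
for _ in range(5):
    print(generate_word(transitions, rng))
```

Each generated word only ever uses letter pairs that occur somewhere in the seed list, which is why the results tend to look pronounceable even when they are not real words.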

Like every other artificial intelligence, the system is not perfect. A small disclaimer at the bottom of the site says that “words are not reviewed and may reflect bias in the training set”.

Artificial intelligence systems have often been criticised when they are unregulated or used for warfare, but they also have many benefits.

Data-driven platforms have been promoted as ways of helping disrupt illegal wildlife trades or translating thoughts into text directly from your brain.
