To build artificial intelligence, one must fill it with existing data. The data that A.I. gets to “read” controls what that intelligent machine believes to be true. In his recent New York Times article, “The Great A.I. Awakening,” Gideon Lewis-Kraus digs into this concept. What Lewis-Kraus doesn’t touch on, however, is that sometimes the data that computers get to “read” is biased. And biased data creates biased robots.
Recently, an artificial intelligence was given the job of judging a beauty contest. The result was a bunch of white faces chosen as “beautiful.” The company behind the program, Beauty.ai, claims the A.I. picked white women as the most beautiful because the majority of the pageant’s applicants were white, but that explanation doesn’t hold up.
For a computer to decide who is beautiful, it must have a definition of beauty. Although I don’t work at Beauty.ai, I can imagine how I’d define such a training set. I might take pictures of every model at every global talent agency. I might take the average symmetry of the faces of celebrities. I might just take every picture of the top 500 celebrities of all time and define them as archetypal beauty for the computer. But here’s the rub: a lot of those people are white.
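To make that concrete, here is a minimal, hypothetical sketch (this is not Beauty.ai’s actual system, and the attributes are invented for illustration): a toy “beauty” model trained on a skewed sample simply memorizes the majority attributes of that sample, then scores new faces by how closely they match.

```python
from collections import Counter

def train_beauty_model(training_faces):
    """Toy 'model': learn the most common value of each attribute
    in the training set and treat that as the 'ideal'."""
    majority = {}
    for attr in training_faces[0].keys():
        counts = Counter(face[attr] for face in training_faces)
        majority[attr] = counts.most_common(1)[0][0]
    return majority

def beauty_score(model, face):
    """Fraction of attributes matching the learned 'ideal'."""
    return sum(face[a] == v for a, v in model.items()) / len(model)

# A training set drawn mostly from one demographic...
training = [{"skin_tone": "light", "symmetry": "high"}] * 80 \
         + [{"skin_tone": "dark", "symmetry": "high"}] * 20

model = train_beauty_model(training)
# ...yields an 'ideal' that reproduces the sampling bias.
print(model)  # {'skin_tone': 'light', 'symmetry': 'high'}
print(beauty_score(model, {"skin_tone": "dark", "symmetry": "high"}))  # 0.5
```

The point of the sketch is that nothing in the code mentions race; the skew comes entirely from who was in the training data.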
That’s not to say those people aren’t beautiful, but it’s shortsighted not to acknowledge that global notions of beauty have been influenced by a dominant white culture. That makes it nearly impossible to define a “beauty” training set for an A.I. without including the inherent cultural bias of its programmers.
Microsoft provided perhaps the best example of what biased data can do to A.I. The company released TayTweets in March with a bit of fanfare about how it would be the first A.I. to learn how to speak from the tweets of fellow Twitter users. Well, a few individuals decided to light Microsoft’s A.I. dreams on fire and fed TayTweets nonstop racist, sexist, and horrific statements. Within 24 hours, Tay was saying the worst things on the internet. It literally went from “humans are super cool” to “I hate everybody” and “Hitler was right” in a matter of hours.
Tay — the A.I. — is not racist, of course. Tay is a system of analysis that decides what it should “think” based on what the predominant thoughts are among all the individuals communicating with it. It just so happened that the people who decided to talk to Tay were overwhelmingly horrific.
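Tay’s actual architecture isn’t public, so here is a deliberately crude stand-in for “think whatever the predominant input thinks”: a bot whose only opinion is the most common sentiment among the recent messages it has heard.

```python
from collections import Counter, deque

class EchoBot:
    """Toy stand-in for a Tay-like learner: its 'opinion' is simply
    the most common sentiment label among recent messages."""

    def __init__(self, window=100):
        # Only the last `window` messages influence the bot.
        self.recent = deque(maxlen=window)

    def hear(self, sentiment):
        self.recent.append(sentiment)

    def opinion(self):
        if not self.recent:
            return "neutral"
        return Counter(self.recent).most_common(1)[0][0]

bot = EchoBot()
for _ in range(10):
    bot.hear("friendly")
print(bot.opinion())  # friendly

# A coordinated flood of hostile input flips the 'opinion' in hours.
for _ in range(50):
    bot.hear("hostile")
print(bot.opinion())  # hostile
```

Nothing in the bot is hostile; its output is a mirror of whoever shouts loudest at it, which is the dynamic Tay’s attackers exploited.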
I had my own experience with biased data in a voice-activated car. The person I rode with spoke English, but with a slight accent. While the car responded to me instantly, it couldn’t understand a single word from my co-passenger. The car was not making a judgment. It did not subjectively devalue that person’s way of speaking; rather, it didn’t understand the words at all because of the data it was fed. Unfortunately, nobody thought to teach it that some people speak English differently from others.
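Real speech recognizers are statistical, but the failure mode can be sketched with a hypothetical lookup-based recognizer (the pronunciation spellings here are invented for illustration): it maps only the pronunciations it was trained on to words, so an accent it never heard during training produces not a bad guess but nothing at all.

```python
def make_recognizer(training_pronunciations):
    """Toy recognizer: maps only the exact pronunciations seen in
    training to words. `training_pronunciations` is a dict of
    {pronunciation: word}; anything unseen is simply not understood."""
    def recognize(pronunciation):
        return training_pronunciations.get(pronunciation)  # None if unseen
    return recognize

# Trained only on one accent's pronunciations...
recognize = make_recognizer({"wah-ter": "water", "kahr": "car"})

print(recognize("wah-ter"))  # water
# The same word in another accent was never in the training data.
print(recognize("waw-tah"))  # None — not a judgment, just missing data
```

The car in the anecdote behaves like this: the gap is in the training set, not in any evaluation of the speaker.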
As discouraging as the thought of racist, sexist, xenophobic robots may be, it’s also encouraging, because the phenomenon shines a light on the way bias operates. Indeed, A.I. is currently being built on neural networks that mimic the way the human brain functions. If it’s possible to identify the bad information that informs, say, Tay or Beauty.ai, it might also be possible for human beings to identify it in ourselves.
If we are to accept that a stream of data powers our own intelligence, just as it powers these robots, we must also ask ourselves which parts of our data are biased, and which elements of the information we use to form our perspectives on the world carry historical and cultural baggage.