Researchers have created a machine learning system that they claim can determine a person’s political party, with reasonable accuracy, based only on their face. The study, from a group that also showed that sexual preference can seemingly be inferred this way, candidly addresses and carefully avoids the pitfalls of “modern phrenology,” leading to the uncomfortable conclusion that our appearance may express more personal information than we think.
The study, which appeared this week in the Nature journal Scientific Reports, was conducted by Stanford University’s Michal Kosinski. Kosinski made headlines in 2017 with work that found that a person’s sexual preference could be predicted from facial data.
That study drew criticism not so much for its methods as for the very idea that something notionally non-physical could be detected this way. But Kosinski’s work, as he explained then and afterwards, was done specifically to challenge those assumptions and was as surprising and disturbing to him as it was to others. The idea was not to build a kind of AI gaydar; quite the opposite, in fact. As the team wrote at the time, it was necessary to publish in order to warn others that such a thing may be built by people whose interests go beyond the academic:
“We were really disturbed by these results and spent much time considering whether they should be made public at all. We did not want to enable the very risks that we are warning against. The ability to control when and to whom to reveal one’s sexual orientation is crucial not only for one’s well-being, but also for one’s safety.
“We felt that there is an urgent need to make policymakers and LGBTQ communities aware of the risks that they are facing. We did not create a privacy-invading tool, but rather showed that basic and widely used methods pose serious privacy threats.”
Similar warnings may be sounded here, for while political affiliation, at least in the U.S. (and at least at present), is not as sensitive or personal an element as sexual preference, it is still sensitive and personal. Hardly a week passes without news of some political or religious “dissident” being arrested or killed. If oppressive regimes could obtain what passes for probable cause by saying “the algorithm flagged you as a possible extremist,” instead of, for example, intercepting messages, it would make this sort of practice that much easier and more scalable.
The algorithm itself is not some hyper-advanced technology. Kosinski’s paper describes a fairly ordinary process of feeding a machine learning system images of more than a million faces, collected from dating sites in the U.S., Canada and the U.K., as well as American Facebook users. The people whose faces were used identified as politically conservative or liberal as part of the site’s questionnaire.
The algorithm was based on open-source facial recognition software. After basic processing to crop each image to just the face (so that no background items creep in as factors), the faces were reduced to 2,048 scores representing various features. As with other face recognition algorithms, these aren’t necessarily intuitive things like “eyebrow color” and “nose type” but more computer-native concepts.
The system was given political affiliation data sourced from the people themselves, and with this it diligently began to study the differences between the facial stats of people identifying as conservatives and those identifying as liberals. Because, it turns out, there are differences.
Of course it’s not as simple as “conservatives have bushier eyebrows” or “liberals frown more.” Nor does it come down to demographics, which would make things too easy. After all, if political party identification correlates with both age and skin color, that makes for a simple prediction algorithm right there. But although the software mechanisms used by Kosinski are quite standard, he was careful to cover his bases so that this study, like the last one, can’t be dismissed as pseudoscience.
The most obvious way to rule out demographics as the explanation is to have the system guess the political party of people of the same age, gender and ethnicity. In the test, the system was presented with two faces, one from each party, and had to guess which was which. Chance accuracy is, obviously, 50%. Humans aren’t very good at this task, performing only slightly above chance at about 55% accuracy.
The algorithm reached as high as 71% accuracy when predicting political party between two like individuals, and 73% when presented with two individuals of any age, ethnicity or gender (but still guaranteed to be one conservative, one liberal).
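To make the two-face test concrete, here is a minimal Python sketch of how such a pairwise accuracy could be computed, with and without demographic matching. It is an illustration under stated assumptions, not the paper's code: the function name `pairwise_accuracy`, the record fields, and the tiny `people` dataset are all hypothetical, and each person is assumed to already have a single classifier score where a higher value means "more likely conservative."

```python
from collections import defaultdict
from itertools import product


def pairwise_accuracy(people, match_on=()):
    """Fraction of (conservative, liberal) pairs that the scores rank correctly.

    people: list of dicts with a 'score' (higher = more likely conservative),
            a 'label' ('conservative' or 'liberal'), and any demographic fields.
    match_on: demographic fields that both faces in a pair must share.
              An empty tuple compares every conservative with every liberal.
    """
    # Bucket scores by demographic group, then by self-reported label.
    groups = defaultdict(lambda: {"conservative": [], "liberal": []})
    for person in people:
        key = tuple(person[field] for field in match_on)
        groups[key][person["label"]].append(person["score"])

    correct, total = 0.0, 0
    for group in groups.values():
        for cons, lib in product(group["conservative"], group["liberal"]):
            total += 1
            if cons > lib:
                correct += 1.0   # the conservative face was scored higher
            elif cons == lib:
                correct += 0.5   # a tie counts as a coin flip
    return correct / total if total else float("nan")


# Hypothetical toy data: the scores partly track age, not only politics.
people = [
    {"label": "conservative", "age": 60, "score": 0.80},
    {"label": "conservative", "age": 60, "score": 0.70},
    {"label": "conservative", "age": 30, "score": 0.30},
    {"label": "liberal", "age": 60, "score": 0.75},
    {"label": "liberal", "age": 30, "score": 0.20},
    {"label": "liberal", "age": 30, "score": 0.25},
]

unmatched = pairwise_accuracy(people)                   # any two individuals
matched = pairwise_accuracy(people, match_on=("age",))  # same age only
```

In this toy data the unmatched figure comes out slightly higher than the matched one, because some of the score simply tracks age. That is the same direction as the gap between the 73% and 71% figures reported above, though the toy numbers themselves mean nothing.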