October 5, 2021
Above: Andrew Blumenfeld, co-founder of Telepath, and Lily Adelstein, creative project manager, discuss Facebook's meaningful social interaction metric.
It’s been a bad few weeks for Facebook, starting with a massive trove of leaked internal documents and culminating most recently in a multi-hour service outage, a 60 Minutes interview, and US Senate testimony by a company whistleblower. The whistleblower, Frances Haugen, was a product manager at Facebook and has provided documents and other evidence to journalists and Congress that paint an incredibly unflattering picture of the company.
Haugen argues that Facebook is well aware of the ways in which user safety sometimes conflicts with company profits and growth, and that it consistently favors the latter over the former. She believes the company has acted fraudulently by keeping this information from shareholders, and that it needs stricter government regulation to better protect users.
One interesting new phrase that emerged in some of the leaked documents is “MSI.” MSI stands for Meaningful Social Interaction and is a concept Facebook has used to measure the extent to which content on its platforms is driving activity. This is important to Facebook because they want to help promote content that is likely to receive engagement from users, so that those users will keep coming back to the platform and posting their own engaging content. The more active users on the platform, the more advertising revenue Facebook can generate.
It seems as if the earliest iterations of MSI were largely rules-based, relying not on machine learning or artificial intelligence but on a pre-programmed points system. A post that received a like, for instance, might earn one point, while each comment might be worth two. Another dimension of this early MSI was the closeness of the people engaging with the content to the person posting it. A post that received a lot of engagement from people with whom you had a strong connection would end up with a higher MSI score than one that received the same amount of engagement from looser acquaintances or friends-of-friends.
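To make the idea concrete, here is a minimal sketch of such a points system in Python. The point values (one per like, two per comment) come from the example above, but the 0-to-1 closeness weights, the function name, and the sample data are illustrative assumptions, not anything taken from the leaked documents.

```python
# A minimal sketch of a rules-based, closeness-weighted points system.
# Point values and closeness weights are illustrative assumptions.

LIKE_POINTS = 1
COMMENT_POINTS = 2

def msi_score(interactions):
    """Score a post from a list of (interaction_type, closeness) pairs.

    interaction_type: "like" or "comment"
    closeness: a 0.0-1.0 weight for how close the engaging user is
               to the poster (1.0 = close friend, 0.2 = loose acquaintance).
    """
    points = {"like": LIKE_POINTS, "comment": COMMENT_POINTS}
    return sum(points.get(kind, 0) * closeness for kind, closeness in interactions)

# Two posts with identical engagement counts but different closeness:
close_friends = [("like", 1.0), ("comment", 1.0), ("comment", 1.0)]
acquaintances = [("like", 0.2), ("comment", 0.2), ("comment", 0.2)]

print(msi_score(close_friends))   # 5.0
print(msi_score(acquaintances))   # 1.0
```

The same raw engagement produces very different scores depending on who is doing the engaging, which is exactly the property that the later, prediction-driven version of MSI de-emphasized.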
Over time, however, MSI started to evolve and eventually a new target metric emerged: Downstream MSI. Downstream MSI was added as a component of the overall score, and it was driven by a machine learning model that predicted whether a piece of content was going to receive a lot of engagement. This component became increasingly central, while the closeness of the people engaging became less significant to a post’s MSI score. Content that the model predicted would receive a lot of engagement was given a lot of exposure by Facebook.
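As a rough illustration of what “predict engagement, then rank by the prediction” can look like, here is a toy sketch using a simple logistic regression. Everything here, the feature names, the training data, and the use of scikit-learn, is an assumption for illustration; it is not Facebook’s actual model or pipeline.

```python
# A toy sketch (not Facebook's system) of the "downstream MSI" idea:
# train a model to predict whether a post will attract heavy engagement,
# then rank the feed by that prediction rather than by who has engaged.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical per-post features: [early_comments, early_shares, early_reactions]
X_train = np.array([
    [0, 0, 2],
    [1, 0, 5],
    [8, 3, 40],
    [12, 6, 90],
    [2, 1, 10],
    [15, 9, 120],
])
# Label: did the post go on to receive a lot of engagement? (1 = yes)
y_train = np.array([0, 0, 1, 1, 0, 1])

model = LogisticRegression().fit(X_train, y_train)

# Rank candidate posts for a feed by predicted engagement probability.
candidates = np.array([
    [1, 0, 4],    # quiet post
    [10, 4, 60],  # already heating up
])
scores = model.predict_proba(candidates)[:, 1]
ranking = np.argsort(-scores)
print(scores, ranking)  # the "hotter" post is ranked first
```

The key design shift is that the ranking signal is now a prediction about future engagement rather than a tally of who has already engaged and how close they are to the poster.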
The problem now facing Facebook is the revelation that its own research showed this algorithmic prioritization of any and all engagement was indeed creating more user activity, but it was also degrading user happiness, increasing divisiveness, spreading misinformation, and causing all sorts of other toxic effects. The research found that the content generating the most engagement was content that evoked strong feelings, especially anger. This raises some very interesting questions about the use of machine learning in content feeds like those of Instagram and Facebook.
In many ways, machine learning makes a lot of sense in this context. Human moderation alone is not feasible at the scale of these platforms, and rules-based systems are almost certainly inadequate given the wide variety of content and other variables. But in this case, machine learning has also created a feed that seems to prioritize toxicity. Is machine learning to blame? Not exactly.
As is typically the case, a machine learning model learns to understand the world based on what humans teach it to value. In this case, Facebook was almost singularly motivated by the desire to increase active usership on the platform. It was perfectly capable of measuring other outcomes; we know this because the leaked documents show that it does measure other outcomes, such as user joy and feelings of well-being. But rather than train a machine learning model to increase those outcomes, it pursued a different one, and one that it would come to learn actually came at the expense of user safety.
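One way to see how much hinges on that choice is to train the same kind of model on the same features but with a different target label. The sketch below does exactly that with invented data: the feature names, the “engagement” and “well-being” labels, and all the numbers are hypothetical stand-ins, not Facebook’s measurements.

```python
# A hedged illustration: the learning algorithm is identical, and only the
# choice of training target changes what the system ends up promoting.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical per-post features: [outrage_signal, informativeness, friend_closeness]
X = np.array([
    [0.9, 0.1, 0.2],
    [0.8, 0.2, 0.3],
    [0.1, 0.9, 0.8],
    [0.2, 0.8, 0.7],
    [0.7, 0.3, 0.1],
    [0.1, 0.7, 0.9],
])

# Two different outcomes a platform could measure (invented labels):
engagement = np.array([1, 1, 0, 0, 1, 0])   # drove lots of activity
wellbeing  = np.array([0, 0, 1, 1, 0, 1])   # users reported feeling good

engagement_model = LogisticRegression().fit(X, engagement)
wellbeing_model  = LogisticRegression().fit(X, wellbeing)

post = np.array([[0.9, 0.2, 0.2]])  # an outrage-heavy post
print(engagement_model.predict_proba(post)[:, 1])  # scored high under one objective
print(wellbeing_model.predict_proba(post)[:, 1])   # scored low under the other
```

The algorithm is not choosing the objective; the people who pick the training target are.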
In her Senate testimony, Frances Haugen said she understood that platforms like Facebook cannot control all of the content their users produce, but that Facebook has 100% control over the algorithms it uses to decide whether and how to distribute that content. This is true even when artificial intelligence is making nearly all of these day-to-day decisions. It’s possible that the use of machine learning to tackle big problems will be offered as an excuse for bad outcomes. But this episode should serve as an early warning and a reminder that humans cannot absolve themselves of responsibility by outsourcing ruthless decision-making to machines.