The intense media coverage this past week of the so-called “Facebook killer” drew attention once again to the horrific ways in which social media platforms can provide a global audience to people who wish to do themselves or others grievous harm. It also raises the question of whether, in the absence of such instant fame, at least some of these acts might have been prevented.
In the immediate aftermath of the Steve Stephens video, Facebook issued a statement condemning the video and noting that it had been removed from the platform, but only after it had garnered global attention.
Perhaps the greatest challenge to stopping violent imagery from finding an audience on social media is simply the sheer volume of material being posted every second. No army of human moderators could hope to manually review every uploaded image or video in a reasonable amount of time, and the rise of live streaming brings new time pressures: a video may begin and end before a human moderator even knows it has aired. While one could argue that social media companies bear an ethical or moral responsibility to hire the staff needed to review content far more thoroughly, one could also argue that, as public companies, they owe a fiduciary duty to their shareholders to minimize expenditures and accept the calculated risk that a certain volume of violent imagery will be immortalized on their networks.
Yet automated technologies, while far from perfect, offer a powerful and compelling opportunity both to flag the most egregious content and to prevent banned content from being reposted to the network in a giant game of whack-a-mole.
After processing more than a quarter billion global news images through Google’s Cloud Vision API last year, I’ve found that Google’s deep learning algorithms are extraordinarily adept at identifying violence in a myriad of contexts, spotting even situations that a typical human would likely miss unless they looked very carefully. From a person holding a gun to another’s face, to blood pooling on the pavement, to any number of other situations, Google’s API has been able to recognize an incredible diversity of imagery that one might characterize as depicting violence of some fashion. In short, deep learning has reached a point where it can recognize many classes of “violence” simply by looking at a photograph and understanding the objects and activities it depicts – all within a fraction of a second and at essentially unlimited scale.
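To make this concrete, here is a minimal sketch of what such a check might look like using the Cloud Vision API’s Python client and its SafeSearch “violence” likelihood; the likelihood threshold and file handling are illustrative assumptions, not anything Facebook actually runs.

```python
# Minimal sketch: checking a single image for likely violence with Google's
# Cloud Vision API (SafeSearch annotation). The threshold choice and the
# file path are illustrative assumptions.
from google.cloud import vision

client = vision.ImageAnnotatorClient()

def flag_if_violent(image_bytes: bytes) -> bool:
    """Return True if Cloud Vision rates the image LIKELY or VERY_LIKELY violent."""
    image = vision.Image(content=image_bytes)
    response = client.safe_search_detection(image=image)
    violence = response.safe_search_annotation.violence
    return violence in (vision.Likelihood.LIKELY, vision.Likelihood.VERY_LIKELY)

if __name__ == "__main__":
    with open("upload.jpg", "rb") as f:  # hypothetical uploaded file
        print(flag_if_violent(f.read()))
```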
Imagine taking such a tool and using it to filter every single image uploaded to Facebook in real time. If the algorithm flags an image as potentially violent, the user would receive a warning message and be asked to provide a textual explanation of why the image should be permitted, such as that the warning is an error or that the image legitimately depicts violence but its publication serves the public good (such as documenting police violence). A simple linguistic AI algorithm would evaluate the explanation for detail and linguistic complexity and ask the user to provide additional detail as needed. The final report would then be provided to a human reviewer for a final decision.
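A rough sketch of that upload flow might look something like the following; the helper functions are hypothetical stand-ins for a real classifier, user prompt and review queue, not any existing Facebook system.

```python
# Sketch of the proposed upload flow. The helpers are hypothetical stubs
# standing in for a deep learning prescreen and review tooling.
def classifier_flags_violence(image_bytes: bytes) -> bool:
    return False  # stand-in for an image classifier such as Cloud Vision

def explanation_is_detailed_enough(text: str) -> bool:
    return len(text.split()) >= 20  # stand-in for a simple linguistic check

def handle_upload(image_bytes: bytes, ask_user) -> str:
    """ask_user(prompt) -> str; returns 'published' or 'queued_for_review'."""
    if not classifier_flags_violence(image_bytes):
        return "published"
    explanation = ask_user("This image appears violent. Why should it be allowed?")
    while not explanation_is_detailed_enough(explanation):
        explanation = ask_user("Please provide more detail about the context.")
    # Nothing is published until a human reviewer sees the image plus explanation.
    return "queued_for_review"
```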
No algorithm is perfect and such an approach would still allow some level of violent imagery onto Facebook (which could still be flagged through the traditional review process), but it would at the very least filter out the most egregious and graphic violence. False positives would result only in a slight delay as a human reviewer confirms whether the image was in fact violent or overrides the algorithm in the case of a mistake.
Moreover, every time the algorithm misses an image or yields a false positive, all of that information can be fed back on an ongoing basis to retrain the algorithm, meaning the system will get more and more accurate over time.
Such a system would completely invert Facebook’s current model of allowing violent images to be posted and then removing them only after a complaint (and after the subject of the image has been revictimized by the publicity and other users have been traumatized by seeing it); instead, it would prevent violent images from being uploaded in the first place.
While this would eliminate the most obvious and graphic images, image analysis alone can’t always discern the context of a scene well enough to understand the implications of the actions it depicts. For example, a single frame of a video may not readily identify it as a sexual assault, so incorporating speech-to-text transcription, with flags such as a person screaming in pain or anguish or someone forcefully yelling “no,” could provide additional indicators that a video contains questionable content.
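As one sketch of how such an audio signal could work, the following uses Google’s Cloud Speech-to-Text Python client (an assumed choice; no particular vendor is required) to transcribe a clip and scan for distress phrases; the phrase list and audio settings are illustrative, and detecting non-verbal screaming would require a separate audio-event model.

```python
# Sketch of an audio signal: transcribe a clip with Google Cloud
# Speech-to-Text and flag phrases suggesting distress. The phrase list is
# an assumed example; screaming without words needs a different model.
from google.cloud import speech

DISTRESS_PHRASES = ("no", "stop", "help", "please don't")  # assumed examples

def transcript_raises_flag(audio_bytes: bytes) -> bool:
    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    )
    response = client.recognize(
        config=config, audio=speech.RecognitionAudio(content=audio_bytes)
    )
    text = " ".join(r.alternatives[0].transcript.lower() for r in response.results)
    return any(phrase in text for phrase in DISTRESS_PHRASES)
```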
While the Facebook killer’s video was not live streamed, one could imagine a similar live streaming scenario in which an individual films themselves getting out of a car and walking toward someone on the street. Once a gun is raised into the frame, obscenities are screamed at the other person, gunshots are heard, blood is seen or any other indicator is detected, the stream would immediately go dark and a human reviewer would be asked to provide an immediate, prioritized assessment of the video and whether it should be allowed to continue. If the video contains obvious violence but appears to be within the scope of permissible content, the reviewer could open a live chat with the person streaming the video to discuss its context and then determine, within policy, whether the video should be permitted to continue (documenting a major public protest that has turned violent, for example) or whether to intervene (in the case of a live stream of self-harm or harm against another).
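The moderation loop for such a stream might be sketched as follows, with the signal checks and the human review step passed in as hypothetical stand-ins rather than any platform’s actual tooling.

```python
# Illustrative moderation loop for a live stream: sample periodic frame and
# audio signals and cut the feed for prioritized human review as soon as any
# signal fires. All callables here are hypothetical stand-ins.
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple

@dataclass
class ReviewDecision:
    permit: bool          # e.g. newsworthy protest footage
    reason: str = ""

def moderate_stream(
    samples: Iterable[Tuple[bytes, bytes]],          # (frame, audio) every few seconds
    frame_flags_violence: Callable[[bytes], bool],   # e.g. the image check sketched above
    audio_flags_distress: Callable[[bytes], bool],   # e.g. the speech-to-text check above
    request_priority_review: Callable[[], ReviewDecision],
) -> str:
    for frame, audio in samples:
        if frame_flags_violence(frame) or audio_flags_distress(audio):
            # The stream "goes dark" here while a human makes an immediate call.
            decision = request_priority_review()
            return "resumed" if decision.permit else "terminated"
    return "completed"
```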
Here again the idea is to use technology to augment human reviewers rather than replace them. Automated tools can scan 100% of the content posted to a platform like Facebook, including all of its live streamed video in real time, and perform prescreening, leaving more complex questions, such as whether a violent image is permissible, to the human reviewers. They can also learn from those human reviewers over time, continually improving the quality of their reviews.
Facebook has discussed the notion of allowing violent imagery to remain on its platform if it is particularly “newsworthy” and bears on current events. Even here, algorithms can help human reviewers understand the context of an image as it evolves. Google’s Cloud Vision API recently added the ability to conduct what amounts to a reverse Google Images search for any image, returning a list of webpages across the open internet on which Google has seen that same image or a cropped or slightly altered version of it. In my own research I’ve been using this capability to explore how an image spreads through the media ecosystem. A social network could use the same capability to check violent uploads against global news outlets: if the image has already been published by a news organization, that could serve as a signal that it bears sufficiently on a matter of public interest to be shared despite its violent content (essentially deferring to the editorial judgment of professional news editors).
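As an illustration, the Cloud Vision web detection feature can be queried for pages that already carry a matching image; the short list of news domains below is purely an assumed example of how a newsworthiness signal might be derived from the results.

```python
# Sketch of a "reverse image search" newsworthiness signal using Cloud Vision
# web detection: if the image already appears on established news sites,
# treat that as a hint of public interest. The domain list is assumed.
from urllib.parse import urlparse
from google.cloud import vision

NEWS_DOMAINS = {"nytimes.com", "bbc.co.uk", "reuters.com"}  # assumed examples

def appears_in_news_coverage(image_bytes: bytes) -> bool:
    client = vision.ImageAnnotatorClient()
    response = client.web_detection(image=vision.Image(content=image_bytes))
    for page in response.web_detection.pages_with_matching_images:
        domain = urlparse(page.url).netloc.lower().removeprefix("www.")
        if domain in NEWS_DOMAINS:
            return True
    return False
```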
Of course, once a human reviewer has determined that a given image or video violates Facebook’s terms of use and manually bans the content from the platform, today that exact same image can simply be reuploaded immediately, only to be re-flagged as offensive, re-reviewed by a human auditor and re-removed. In short, even after an image is banned from Facebook, it never actually disappears, as myriad users repost it to their own accounts.
Here too technology offers significant opportunity for improvement. Image fingerprinting software computes a compact digital signature of an image that can be used to instantly identify reposts of that image, even if they have been slightly altered. It has long been used by internet companies to prevent users from uploading illegal content like child pornography to their systems.
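A minimal sketch of that matching step, using the open-source imagehash library as a stand-in for whatever fingerprinting system a platform actually deploys, might look like this; the distance threshold is an illustrative assumption.

```python
# Sketch of fingerprint matching with the open-source imagehash library:
# perceptual hashes of a banned image and a re-upload stay close even after
# light edits, so a small Hamming distance blocks the repost. The threshold
# value is an illustrative assumption.
from PIL import Image
import imagehash

BANNED_HASHES = set()  # fingerprints of images human reviewers have removed

def register_banned(path: str) -> None:
    BANNED_HASHES.add(imagehash.phash(Image.open(path)))

def is_repost_of_banned(path: str, max_distance: int = 6) -> bool:
    candidate = imagehash.phash(Image.open(path))
    return any(candidate - banned <= max_distance for banned in BANNED_HASHES)
```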
This past December Facebook committed to using fingerprinting for terrorist imagery, and just this month it announced that it was finally rolling out image fingerprinting for so-called “revenge porn” to prevent banned images from being reposted, though a company spokesperson clarified that this applies only to the narrow area of intimate imagery shared without consent. When asked why the company does not simply roll the technology out across all content areas to permanently prevent any banned content from being reposted, the company did not respond, nor did it respond as to why it took so long to adopt fingerprinting technology at all.
All technological solutions come with their own attendant issues, and both deep learning image categorization systems and image fingerprinting algorithms have limitations. Yet even with those limitations, the current approach of waiting for users to flag an image as violent or abusive and then crossing one’s fingers that a human reviewer, hours or days later, will make a good judgment about that image in the few seconds they have to look at it is simply not working. Allowing algorithms to identify potentially violent images before they are posted, coupling this with a streamlined human review process, feeding reviewer decisions back into the algorithms and preventing banned images from being reposted offers a powerful potential solution that could transform how social media platforms deal with violent content.
Putting this all together, no solution is perfect. But based on the more than a quarter billion global images I’ve processed using Google’s deep learning image tools over the past year, image recognition algorithms have become sufficiently advanced at recognizing violent imagery that we can do far better than we currently are at preventing our online platforms from being used to empower and promote violence.