After Yahoo’s high-profile purchase of Tumblr, when Yahoo CEO Marissa Mayer said she would “promise not to screw it up,” this is probably not what she had in mind. Devoted users of Tumblr have been watching closely, worried that the cool, Web 2.0 image blogging tool would be tamed by the nearly two-decade-old search giant. One population of Tumblr users, in particular, worried a great deal: those who used Tumblr to collect and share their favorite porn. This is a distinctly large part of the Tumblr crowd: according to one analysis, somewhere near or above 10% of Tumblr is “adult fare.”
Now that group is angry. And the new Tumblr policies that made them so angry are a bit of a mess. Two paragraphs from now, I’m going to say that the real story is not the Tumblr/Yahoo incident, or how it was handled, or even why it’s happening. But first, the quick run-down, which is confusing if you’re not a regular Tumblr user. Tumblr had a self-rating system: blogs with “occasional” nudity should self-rate as “NSFW.” Blogs with “substantial” nudity should rate themselves as “adult.” About two months ago, some Tumblr users noticed that blogs rated “adult” were no longer being listed with the major search engines. Then in June, Tumblr began taking both “NSFW” and “adult” blogs out of its internal search results, meaning that if you search in Tumblr for posts tagged with a particular word, sexual or otherwise, the dirty stuff won’t come up. Unless the searcher already follows your blog, in which case the “NSFW” posts will appear, but not the “adult” ones. Ack, here, this is how Tumblr tried to explain it:
What this meant is that a blog’s existing followers could largely still see its “NSFW” posts, but it would be very difficult for anyone new to find the blog. David Karp, founder and CEO of Tumblr, dodged questions about it on the Colbert Report, saying only that Tumblr doesn’t want to be responsible for drawing the lines between artistic nudity, casual nudity, and hardcore porn.
Then a new outrage emerged when some users discovered that, in the mobile version of Tumblr, some tag searches turn up no results, dirty or otherwise, and not just for obvious porn terms, like “porn,” but also for broader terms, like “gay.” Tumblr issued a quasi-explanation on their blog, which some commentators and users found frustratingly vague and unapologetic.
Ok. The real story is not the Tumblr/Yahoo incident, or how it was handled, or even why it’s happening. Certainly, Tumblr could have been more transparent about the details of their original policy, or the move in May or earlier to de-list adult Tumblr blogs in major search engines, or the decision to block certain tag results. Certainly, there’ve been some delicate conversations going on at Yahoo/Tumblr headquarters, for some time now, on how to “let Tumblr be Tumblr” (Mayer’s words) and also deal with all this NSFW blogging “even though it may not be as brand safe as what’s on our site” (also Mayer). Tumblr puts ads in its Dashboard, where only logged-in users see them, so arguably the ads are never “with” the porn — but maybe Yahoo is looking to change that, so that the “two companies will also work together to create advertising opportunities that are seamless and enhance the user experience.”
What’s ironic is that, I suspect, Tumblr and Yahoo are actually trying to find ways to remain permissive when it comes to NSFW content. They are certainly (so far) more permissive than some of their competitors, including Instagram, Blogger, Vine, and Pinterest, all of whom have moved in the last year to remove adult content, make it systematically less visible to their users, or prevent users from pairing advertising with it. The problem here is their tactics.
Media companies, be they broadcast or social, have fundamentally two ways to handle content that some but not all of their users find inappropriate.
First, they can remove some of it, either by editorial fiat or at the behest of the community. This means writing up policies that draw those tricky lines in the sand (no nudity? what kind of nudity? what was meant by the nudity?), and then either taking on the mantle (and sometimes the flak) of making those judgments themselves, or having to decide which users to listen to on which occasions for which reasons.
Second, and this is what Tumblr is trying, is what I’ll call the “checkpoint” approach. It’s by no means exclusive to new media: putting the X-rated movies in the back room at the video store, putting the magazines on the shelf behind the counter, wrapped in brown paper, scheduling the softcore stuff on Cinemax after bedtime, or scrambling the adult cable channel, all depend on the same logic. Somehow the provider needs to keep some content from some people and deliver it to others. (All the while, of course, they need to maintain their reputation as defender of free expression, and not appear to be “full of porn,” and keep their advertisers happy. Tricky.)
To run such a checkpoint requires (1) knowing something about the content, (2) knowing something about the people, and (3) having a defensible line between them.
First, the content. That difficult decision, about what is artistic nudity, what’s casual nudity, and what’s pornographic? It doesn’t go away, but the provider can shift the burden of making that decision to someone else: not just to get it off their shoulders, but sometimes to hand it to someone more capable of making it. Adult movie producers or magazine publishers can self-rate their content as pornographic. An MPAA-sponsored board can rate films. There are problems, of course: either the “who are these people?” problem, as in the mysterious MPAA ratings board, or the “these people are self-interested” problem, as when TV production houses rate their own programs. Still, this self-interest can often be congruent with the interests of the provider: X-rated movie producers know that their options may be the back room or not at all, and gain little in pretending that they’re something they’re not.
Next, the people. It may seem like a simple thing, just keeping the dirty stuff on the top shelf and carding people who want to buy it. Any bodega shopkeep can manage to do it. But it is simple only because it depends on a massive knowledge architecture, the driver’s license, that the shop didn’t have to generate itself. This is a government-sponsored, institutional mechanism that, in part, happens to be engaged in age verification. It requires a massive infrastructure for record keeping, offices throughout the country, staff, bureaucracy, printing services, government authorization, and legal consequences for cases of fraud. All that so that someone can show a card and prove they’re of a certain age. (That kind of certified, high-quality data is otherwise hard to come by, as we’ll see in a moment.)
Finally, a defensible line. The bodega has two: the upper shelf and the cash register. The kids can’t reach, and even the tall ones can’t slip away uncarded, unless they’re also interested in theft. Cable services use encryption: the signal is scrambled unless the cable company authorizes it to be unscrambled. This line is in fact not simple to defend: the descrambler used to be in the box itself, which was in the home and, with the right tools and expertise, openable by those who might want to solder the right tab and get that channel unscrambled. This meant there had to be laws against tampering, another external apparatus necessary to make this tactic stick.
Tumblr? Well. All of this changes a bit when we bring it into the world of digital, networked, and social media. The challenges are much the same, but once we notice that the necessary components of the checkpoint are now data, we can see why the checkpoint begins to take on the shape that it does.
The content? Tumblr asked its users to self-rate, marking their blog as “NSFW” or “adult.” Smart, given that bloggers sharing porn may share some of Tumblr’s interest in putting it behind the checkpoint: many would rather flag their site as pornographic and get to stay on Tumblr than be forbidden to put it up at all. Even flagged, they still get what they need from Tumblr: a platform on which to collect content, and a way to gain and keep interested viewers. The categories are a little ambiguous: where is the line between “occasional” and “substantial” nudity to be drawn? Why is the criterion only about amount, rather than degree (hardcore vs. softcore), category (posed nudity vs. sexual act), or intent (artistic vs. unseemly)? But then again, these categories are always ambiguous, and must always privilege some criteria over others.
The people? Here it gets trickier. Tumblr is not imposing an age barrier; it’s imposing a checkpoint based on desire, dividing those who want adult content from those who don’t. This is not the kind of data that’s kept on a card in your wallet, backed by the government, subject to laws of perjury. Instead, Tumblr has two ways to try to know what a user wants: their search settings, and what they search for. If users have managed to correctly classify themselves into “Safe Mode,” indicating in the settings that they do not want to see anything flagged as adult, and people posting content have correctly marked their content as adult or not, this should be an easy algorithmic equation: a “safe” searcher is never shown “NSFW” content. The only problems would be user error: searchers who do not set their search settings correctly, and posters who do not flag their adult content correctly. Reasonable problems, and the kind of leakage that any system of regulation inevitably faces. Flagging at the blog level (as opposed to flagging each post as adult or not) is a bit of a dull instrument: all posts from my “NSFW” blog are withheld from safe searchers, even the ones that have no questionable content, despite the fact that by Tumblr’s own definition an “NSFW” blog has only “occasional” nudity. Still, getting people to rate every post is a major barrier: few will do so diligently, and it doesn’t fit into simple “web button” interfaces.
Defending the dividing line? Since the content is digital, and the information about content and users is data, it should not be surprising that the line here is algorithmic. Unlike the top shelf or the back room, the adult content on Tumblr lives amidst the rest of the archive. And there’s no cash register, which means that there’s no unavoidable point at which use can be checked. There is the login, which explains why non-logged-in users are treated as only wanting “safe” content. But, theoretically, an “algorithmic checkpoint” should work based on search settings and blog ratings. As a search happens, compare the searcher’s setting with the content’s rating, and don’t deliver the dirty to the safe.
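To make that comparison concrete, here is a minimal sketch of such an algorithmic checkpoint, in Python. The field names, rating values, and functions are hypothetical stand-ins for illustration; this is not Tumblr’s actual data model or code.

```python
# A minimal sketch of the "algorithmic checkpoint" described above: at query
# time, compare the searcher's setting with the blog's self-rating, and never
# deliver flagged content to a "safe" searcher. All names and values here are
# hypothetical; this is not Tumblr's actual implementation.

def passes_checkpoint(searcher_safe_mode, blog_rating):
    """Return True if a post from a blog with this self-rating may be shown."""
    if searcher_safe_mode:
        return blog_rating == "safe"   # "safe" searchers never see "NSFW" or "adult"
    return True                        # everyone else sees everything

def tag_search(query, posts, searcher_safe_mode):
    """Filter tag matches through the checkpoint before returning them."""
    return [post for post in posts
            if query in post["tags"]
            and passes_checkpoint(searcher_safe_mode, post["blog_rating"])]

# A logged-out or "safe" searcher gets only the unflagged post.
posts = [
    {"tags": ["sunset", "photography"], "blog_rating": "safe"},
    {"tags": ["sunset"],                "blog_rating": "adult"},
]
print(tag_search("sunset", posts, searcher_safe_mode=True))  # only the first post
```

The whole decision rests on two pieces of self-reported data, the searcher’s setting and the blogger’s flag; when either is missing or wrong, the checkpoint leaks.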
But here’s where Tumblr took two additional steps, the ones that I think raise the biggest problem for the checkpoint approach in the digital context.
Tumblr wanted to extend the checkpoint past the customer who walks into the store and brings adult content to the cash register, out to the person walking by the shop window. And those passersby aren’t always logged in; they come to Tumblr in any number of ways. Because here’s the rub with the checkpoint approach: it does, inevitably, remind the population of possible users that you do allow the dirty stuff. The new customer who walks into the video store and sees that there is a back room, even if they never go in, may reject your establishment for even offering it. Can the checkpoint be extended, to decide whether to even reveal to someone that there’s porn available inside? If not in the physical world, maybe in the digital?
When Tumblr delisted its adult blogs from the major search engines, it wanted to keep Google users from seeing that Tumblr has porn. This, of course, runs counter to the fundamental promise of Tumblr as a publishing platform, the promise that Tumblr users (NSFW and otherwise) count on. And users fumed: “Removal from search in every way possible is the closest thing Tumblr could do to deleting the blogs altogether, without actually removing 10% of its user base.” Here is where we may see the fundamental tension in the Yahoo/Tumblr partnership: they may want to allow porn, but do they want to be known for allowing porn?
Tumblr also apparently wanted to extend the checkpoint in the mobile environment, or perhaps was required to by Apple. Many services, especially those spurred or required by Apple to do so, aim to prevent the “accidental porn” situation: if I’m searching for something innocuous, can they prevent a blast of unexpected porn in response to my query? To some degree, the “NSFW” rating and the “safe” setting should handle this, but of course content that a blogger failed (or refused) to flag still slips through. So Tumblr (and other sites) institutes a second checkpoint: if the search term might bring back adult content, block all the results for that term. In Tumblr, this is based on tags: bloggers add tags that describe what they’ve posted, and search queries seek matches in those tags.
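To illustrate how blunt that second checkpoint is, here is a sketch of term-level blocking. The blocklist and function names are invented for illustration; they are not Tumblr’s actual list or code.

```python
# A sketch of the second, term-level checkpoint: if the query itself looks
# risky, return nothing at all, regardless of how the individual posts are
# rated. The blocklist is an invented example, not Tumblr's actual list.

BLOCKED_QUERY_TERMS = {"porn", "sex", "gay"}   # hypothetical

def mobile_tag_search(query, posts):
    """Return tag matches, unless the query term itself is blocked outright."""
    if query.lower() in BLOCKED_QUERY_TERMS:
        return []                              # every result vanishes, adult or not
    return [post for post in posts if query.lower() in post["tags"]]
```

Because the filter operates on the word rather than the content, every post tagged with a blocked term disappears, innocuous or not.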
When you try to choreograph users based on search terms and tags, you’ve doubled your problem. This is not clean, assured data like a self-rating of adult content or the age on a driver’s license. You’re ascertaining what the producer meant when they tagged a post with a certain term, and what the searcher meant when they used the same term as a search query. If I search for the word “gay,” I may be looking for a gay couple celebrating the recent DOMA decision on the steps of the Supreme Court, or “celebrating” bent over the arm of the couch. Very hard for Tumblr to know which I wanted, until I click or complain.
Sometimes these terms line up quite well, either by accident or on purpose: for instance, when users of Instagram indicated pornographic images by tagging them “pornstagram,” a made-up word that would likely mean nothing else. (This search term no longer returns any results on Instagram, although, whoa, it does on Tumblr!) But in just as many cases, when you use the word “gay” to indicate a photo of your two best friends in a loving embrace, and I use the word “gay” in my search query to find X-rated pornography, it becomes extremely difficult for the search algorithm to understand what to do about all of those meanings converging on a single word.
Blocking all results to the query “gay,” or “sex,” or even “porn” may seem, from one vantage point (Yahoo’s?), to solve the NSFW problem. Tumblr is not alone in this regard: Vine and Instagram return no results to the search term “sex,” though that does not mean that no one’s using it as a tag; and though Instagram returns millions of results for “gay,” Vine, like Tumblr, returns none. Pinterest goes further, using the search for “porn” as a teaching moment: it pops up a reminder that nudity is not permitted on the site, then returns results which, because of the policy, are not pornographic. By blocking search terms/tags, no porn accidentally makes it to the mobile platform or to the eyes of its gentle user. But this approach fails miserably at getting adult content to those who want it, and more importantly, in Tumblr’s case, it relegates a broadly used and politically vital term like “gay” to the smut pile.
Tumblr’s semi-apology has begun to make amends. The two categories, “NSFW” and “adult,” are now just “NSFW,” and blogs marked as such are once again available in Tumblr’s internal search and in the major search engines. Tumblr has promised to work on a more intelligent filtering system. But any checkpoint that depends on data that’s expressive rather than systemic (what we say, as opposed to what we say we are) is going to step clumsily both on the sharing of adult content and on the ability to talk about subjects that have some sexual connotations, and could architect the spirit and promise out of Tumblr’s publishing platform.