YouTube Essay: Footnotes

This post is a short addition to my previous post, “Disruptive YouTube” – two footnotes that I think merit their own post.

The first is that it took 20 years for email to have enough impact on society that the US Postal Service is in trouble. Although it may take that long for YouTube to be visibly disruptive at a large scale, YouTube is already disrupting television to some extent, with programmers painfully aware of the competition that YouTube represents, particularly to their younger viewers. YouTube itself is a fairly compelling “channel”, with a search engine interface (which itself is more powerful and compelling to savvy users than simple channel number buttons). Also, it may take considerably less time for YouTube to be disruptive, since its audience is already internet savvy to some degree (particularly the commenters). YouTube itself has gone “viral” at this point, since YouTube as a platform is carried as a meme on virtually every viral video, so millions of views on one video may also mean millions of new and converted YouTube users. And many will notice the “upload” button, create an account, and contribute something of their own, becoming a participating content provider rather than an audience member.

The second footnote is that I’m noticing more and more how beloved some videos seem to be (note 50% in this context means 50% of the votes on the video are Likes). Some, like Ray Charles’ “Hit the Road, Jack!”, with 99% (over 120,000 likes!), may be expected to be likeable. “Something Stupid – Frank & Nancy Sinatra” has an incredible 99.7%, with over two thousand likes, and dislikes in the single digits. More surprising are David Bowie’s “Starman”, with 98% and Boy George, with 98.1%. “Guns N’ Roses – Paradise City” has 99.2%, with over 70,000 votes, and over 20 million views (this is no mean sample). The controversial Sinead O’Connor got 98.9%. Apparently now that the controversy has died down, kids like her music, all the same (At this point, I should come out and say that I believe that teenagers are by far YouTube’s largest consistent demographic – exploring/surfing YouTube instead of just getting a video, watching it, and closing it). Some of these acts seem to be liked more now than when they first came out.

What is going on? How is this level of unanimity on the internet, often with tens of thousands of participants (in Sinead O’Connor’s case, almost 65,000 votes), remotely possible? Surely not everyone watching a particular video got there completely intentionally. Sometimes YouTube makes suggestions to people that seem like a stretch based on the current video (“Don’t touch that dial! Something’s coming right up!”). So we have to consider that a portion of the audience got there randomly. In addition, one would expect a population of trolls to vote any video down, even the most likeable ones, just because they can. Perhaps clicking “Dislike” is not gratifying enough for a troll, who clicks “Like” and proceeds to write something obnoxious. In any case, this mystery makes the relative unanimity even more impressive.

I should clarify that this unanimity is not a given, by any means. Good old “Charlie bit my finger”, which has as of this writing received 107,496 Dislike votes, still has a score of 88.3% likes. A budget speech by President Obama given on April 13, 2011 stands at 80.4% – not nearly as popular as the music videos, but considerably higher than his current approval rating. And poor Rebecca Black’s Friday received only 22.7% (with 65,427 likes and 222,923 dislikes). Her second video did slightly better with 36.9% (of close to a million votes). This lack of unanimity seems to be the exception that proves the rule.
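For anyone who wants to check these percentages, the score is simply Likes divided by total votes. Here is a minimal sketch in Ruby; the method name is my own and not anything YouTube exposes.

```ruby
# Share of Like votes among all votes on a video, as a percentage.
def like_percentage(likes, dislikes)
  total = likes + dislikes
  return nil if total.zero? # no votes yet, so there is no meaningful score

  (100.0 * likes / total).round(1)
end

# Vote counts for Rebecca Black's "Friday" as quoted above.
puts like_percentage(65_427, 222_923) # 22.7
```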

This data is interesting and valuable, and thankfully, public. I predict that YouTube will continue to be an excellent way of judging the zeitgeist for some time to come. More importantly, it is increasingly clear that there is some sort of agreed-upon shared cultural heritage that is being celebrated (“Liked”) on YouTube. If a music video is capable of receiving tens of thousands of positive votes ten, twenty, forty, or even eighty years after it first came out, it seems safe to say that the video has staying power. Some of it may be nostalgia, and some of it is the continuing novelty of YouTube. But there is a crowd-sourced curation that is happening on YouTube, and it merits our attention.

UPDATE: My favorite example of negative feedback on YouTube may be “Fox Business blasts The Muppets for brainwashing America’s kids with anti-corporate, liberal agenda“, a video with over 222,000 views and 3,300 votes, 94% of them negative. It seems that the YouTube community is almost unanimous in despising an attack on the beloved Muppets.

–Daniel Tsadok

YouTube as Conversational Space

As a conversational space, to use NYU Professor Clay Shirky’s coinage, YouTube is often a disaster. Shirky himself has used the example of “Charlie bit my finger – again !” to demonstrate this disaster in action. “Charlie”, which has received an astonishing (and growing) 380 million views (the final episode of MASH, by contrast, got a paltry 50 million), has a virtually unreadable comments section; of the 600,000+ comments, what isn’t spam is either inane (“Charlie is so cute!”) or generally unintelligible vulgarity (“this video sucks”).

Based on “Charlie”, one could be forgiven for seeing YouTube as a hopeless space for facilitating meaningful conversation. Fortunately, “Charlie” is in fact an exceptional case: few videos get that many views, or even close to it, and while the video is entertaining, there is not much intellectual space for users to contribute in the comments. “Charlie” the video is too simple and “Charlie” the space is too crowded to maintain any sort of interesting momentum in the comments below.

Even so, the YouTube comment system is turning out to be something new that cannot be modeled as a traditional conversation. It may be more fruitful to study it as a feedback system instead. Most of the time, conversation on YouTube is not happening, in the sense of a coherent written back-and-forth among a group of commenters. YouTube’s specific commenting features, considered below, actually make conversation in this sense virtually impossible, so that even a small group of well-meaning, mature, and amiable commenters will have trouble carrying a real conversation. But as we’ll see, what emerges instead can be fascinating.

The YouTube commenting interface is straightforward: each video gets its own page, with the video itself dead center. Around the video is basic metadata (title, description, uploader handle). Below, towards the bottom of the page, are user comments on the video. There is a single thread in the comments: the video. YouTube has a basic crowd-sourced feedback system: users can vote up or down (i.e. Like or Dislike) both videos and comments. If there are enough positive votes on a comment, it becomes a “Top Comment”, and appears prominently above all other comments (even above the new comment form). Uploader comments have an especially elevated status: they always appear at the top. With enough negative votes, a comment is suppressed: it is hidden by default, and requires an additional click to read.
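The display rules described above can be modeled roughly as follows. This is only a sketch of the behavior as I observe it: the two thresholds are hypothetical numbers I made up, since YouTube does not publish the real cutoffs, and the names are mine.

```ruby
# Hypothetical thresholds; the real cutoffs are not public.
TOP_COMMENT_MIN_SCORE = 10
HIDDEN_MAX_SCORE = -5

Comment = Struct.new(:author, :text, :likes, :dislikes, :by_uploader) do
  # Net votes: Likes minus Dislikes.
  def score
    likes - dislikes
  end
end

# Classify a comment by how the page would display it.
def display_status(comment)
  return :uploader_top if comment.by_uploader
  return :top_comment if comment.score >= TOP_COMMENT_MIN_SCORE
  return :hidden if comment.score <= HIDDEN_MAX_SCORE

  :normal
end

witty = Comment.new("viewer_a", "A sharp one-liner", 42, 3, false)
puts display_status(witty) # top_comment
```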

In addition to the interface itself, YouTube’s commenting system has some unique features that distinguish it as a forum.

1) YouTube is an anonymous forum. True, usernames are probably tracked by Google, but in the context of an actual discussion, the identities of the commenters are by default unknown to one another. The only real hints to most commenters’ identities are the videos they choose to watch and what they write, which can be fairly substantive.

2) YouTube comments are uncensored. There seem to be no designated moderators (at least not public ones) appointed by YouTube/Google. Uploaders themselves have only one moderation option: Off. Any and all comments are allowed, or no comments at all. Other than that, comment moderation is completely user-driven, based on the simple Like/Dislike voting system, where each user gets one vote.

3) YouTube comments are always organized around a specific video. There is no other comment taxonomy, such as forums or threads. Comments are always supposed to be about The Video, and they generally are, or are quickly voted down by other users. This structure, together with the anonymous nature of the forum, keeps the video itself as the focus of conversation.

4) The ability to vote on individual comments creates a secondary feedback system (feedback on the feedback), and has a marked effect on how comments are constructed. As mentioned above, with enough Likes, a comment can be raised up to “Top Comment” status, making it much more prominent than those left at the bottom. This usually has the immediate layout effect of ripping a popular comment out of context and putting it onto a pedestal (a transfer that can be highly disruptive to what would have been a conversation).

It also creates a new incentive structure. Comments want to be read, and since being promoted to Top Comment results in many more readings, comments on YouTube want to be Liked as a result. Instead of the “rat race” of actual conversation, the commenter aims for the Top by trying to one-up other commenters by writing the funniest, wittiest, or most insightful comment – a pressure cooker of artificial selection. The end result is that Top Comments are typically pithy and quotable, like sound bites, constructed to solicit immediate positive feedback from other anonymous visitors. At its best, what replaces conversation is a feedback space full of sharp one-liners annotating the video: a pure meritocracy.

5) The scale of the YouTube landscape is vast, and growing rapidly. According to YouTube’s official blog, “more than 48 hours (two days worth) of video are uploaded to the site every minute” (over 4 million minutes of video per day). A related, and even more important, number is the quantity of videos: the diversity of subjects, styles, and qualities, each with the potential to foster a unique discussion. That number was not released, but assuming a long-ish 5 minutes per video, about 830,000 videos are uploaded every day. YouTube also has a massive audience up to the task of watching all this stuff: the site receives 3 billion video views daily (ibid). That means that, over a period of a year, YouTube handles over 1 trillion views. The upload rate and the daily view count have increased 100% and 50% since last year, respectively, and are likely to continue to increase.
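The arithmetic behind the scale figures is easy to check. A minimal sketch, taking the 48 hours per minute and 3 billion daily views from the blog quote, and treating the 5-minute average video length as the same rough assumption made above; the constant names are mine.

```ruby
# Figures quoted from YouTube's official blog.
HOURS_UPLOADED_PER_MINUTE = 48
VIEWS_PER_DAY = 3_000_000_000

# Assumption from the text: an average video runs 5 minutes.
ASSUMED_MINUTES_PER_VIDEO = 5

MINUTES_PER_DAY = 24 * 60

VIDEO_MINUTES_PER_DAY = HOURS_UPLOADED_PER_MINUTE * 60 * MINUTES_PER_DAY
VIDEOS_PER_DAY = VIDEO_MINUTES_PER_DAY / ASSUMED_MINUTES_PER_VIDEO
VIEWS_PER_YEAR = VIEWS_PER_DAY * 365

puts VIDEO_MINUTES_PER_DAY # 4147200
puts VIDEOS_PER_DAY        # 829440
puts VIEWS_PER_YEAR        # 1095000000000
```

So the daily upload count is on the order of 830,000 videos, and a year of views comes to a little over a trillion.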

Taking these features together, we begin to see each YouTube video as a bit like an island with a port, full of strangers, surrounded by open sea. Islands are of many different sizes. Commenters and Likers are drifters and vagrants: sailors who travel from port to port, briefly visiting and moving on, rather than settling down. At each port, The Video is the island’s shrine which centers all discussion. Some islands are friendlier than others (usually based on the tone of the shrine). When an island becomes hostile (or boring), a visitor can simply raise anchor and sail away, perhaps to visit another day. There are, from the sailor’s perspective, endless islands to see next.

The key question then is, how active is YouTube’s feedback space? How many viewers are also participants, leaving some sort of feedback? The video “Hit the road Jack!”, featuring Ray Charles performing live, is well-known, well-liked, and mainstream, making it a reasonable benchmark. “Hit the road Jack!” has been viewed over 30 million times, and Liked by users over 100,000 times [1]. So in this case, 0.333% of YouTube viewers had accounts, were logged in, and voted on the video. In lieu of hard numbers, a reasonable low-ball estimate is that 0.3% of viewers participate by giving their feedback (this turns out to be a fairly consistent ratio of votes to views).

Well, 0.3% of 1 trillion views is 3 billion votes per year. “Hit the Road, Jack!” also has over 20,000 comments. Visitors may post multiple comments (but not multiple votes), so let’s assume 10,000 actual participants. That is one tenth of the voter population, or about 0.03% of the viewer population. This much smaller number still implies 300 million comments per year on YouTube. To put this in perspective, Amazon claims 10 million product reviews, total [source].
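Here is the same estimate as a worked calculation. It is a sketch only: the benchmark counts are the rounded figures for “Hit the road Jack!” quoted above, the 10,000 commenters is my assumption, and the yearly total is 3 billion daily views rounded down to a trillion per year.

```ruby
# Rounded benchmark counts for "Hit the road Jack!".
BENCHMARK_VIEWS = 30_000_000
BENCHMARK_LIKES = 100_000
BENCHMARK_COMMENTERS = 10_000 # assumed: 20,000 comments from about 10,000 people

# About 3 billion views a day for a year, rounded down.
ESTIMATED_VIEWS_PER_YEAR = 1_000_000_000_000

def percent_of_views(count, views)
  (100.0 * count / views).round(3)
end

VOTER_PERCENT = percent_of_views(BENCHMARK_LIKES, BENCHMARK_VIEWS)
COMMENTER_PERCENT = percent_of_views(BENCHMARK_COMMENTERS, BENCHMARK_VIEWS)

# Low-ball rates (0.3% and 0.03%), kept as exact fractions.
LOW_BALL_VOTE_RATE = Rational(3, 1000)
LOW_BALL_COMMENT_RATE = Rational(3, 10_000)

VOTES_PER_YEAR = (ESTIMATED_VIEWS_PER_YEAR * LOW_BALL_VOTE_RATE).to_i
COMMENTS_PER_YEAR = (ESTIMATED_VIEWS_PER_YEAR * LOW_BALL_COMMENT_RATE).to_i

puts VOTER_PERCENT     # 0.333
puts COMMENTER_PERCENT # 0.033
puts VOTES_PER_YEAR    # 3000000000
puts COMMENTS_PER_YEAR # 300000000
```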

Possibly the most significant feature of YouTube’s numbers is the statistical distribution of views, which has not yet been released by Google. The most watched video in YouTube history is “Justin Bieber – Baby ft. Ludacris”, with over 630 million views and rising. This may seem like a large number, but it is dwarfed by the 1 trillion views distributed across less popular videos. In fact, “Baby” has received less than 0.1% of the total YouTube view share, demonstrating that on YouTube, big ratings and tiny share are not mutually exclusive, a feature that is characteristic of a Long Tail system.
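To make the view-share claim concrete, here is a rough sketch. It compares the all-time view count of “Baby” against a single year of site-wide views (about a trillion, from 3 billion views a day), which if anything overstates the video's share; the names are mine.

```ruby
BABY_VIEWS = 630_000_000
SITE_VIEWS_PER_YEAR = 1_000_000_000_000 # about 3 billion a day, rounded down

# Percentage of all views that went to one video.
def view_share_percent(video_views, total_views)
  (100.0 * video_views / total_views).round(3)
end

puts view_share_percent(BABY_VIEWS, SITE_VIEWS_PER_YEAR) # 0.063
```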

The relative handful of behemoths like “Charlie” and “Baby” cannot compete with the sum total of thousands of videos with a million views each, or tens of thousands of videos with 100 thousand views each, etc*. Those videos comprise the smaller scale multitude; the nooks and crannies of YouTube; the shadows. And participation has a much better chance in the shadows. “Charlie” and “Baby” are tough places to have a discussion, but with “only” 190,000 views, a geology animation called “650 Million Years in 1 Min and 20 Sec” might not be. The video itself has 430+ Likes (~ 0.2% participation rate) and, with 700+ comments, hosts a fairly lively and on-topic discussion of tectonic plates, the future of earth, and yes, religion. Since the video predicts where the continents will be in 150 million years, there is some humor, like “ZOMG, we’ll all be neighbors. 8’D” (by iDinoroars).

As YouTube, one of the fastest and most successfully scaling websites in history, continues to inflate, the number of participants represented by the 0.3% will grow with it. And as the viewer base becomes savvier, the 0.3% participation rate has a good chance of growing as well, perhaps to 3%, perhaps to 93%.

Meanwhile, the seemingly endless parade of both videos and participants is creating a rich and diverse feedback space, constantly growing and evolving, especially in the shadows.

UPDATE: added on 11/6/2011

I would be remiss if I did not address the so-called problem of trolls on YouTube. The answer is that the YouTube forums are completely driven, and managed, by the user base of YouTube, which includes, well, anyone who feels like commenting, since there are few barriers to entry. In any case, trolls are treated in various ways: they may be isolated with down votes on their comments, ignored, or actually, in not-so-rare cases, voted up by others, if the comment is witty or amusing enough.

A Curtis Mayfield music video has 3,365 likes and 44 dislikes. That’s a 98.7% Like voting rate, virtually unanimous by internet standards. The 44 dislikes may be genuine dislikes, or merely trolling (ensuring things are never quite unanimous). Either way, even if every one of the dislikes were trolls (which I doubt), that leaves 1.3% of the vote. Not much. My reasoning is that even the worst trolls probably don’t click “Like” on videos, and if they do, that could be because they actually like the video.

Anyway, 98.7% is nothing (Sorry, Curtis). Neither is 98.52% (“Andy Williams – Moon River 1960’s performance”), or 98.99% (“Bee Gees – Stayin’ Alive [Version 1] (Video)”). And forget about “Tallest Man – Guinness World Record”, with a measly 94.36% vote up rate.

“Bill Withers – Ain’t No Sunshine” has 87,506 likes, and 679 dislikes. That’s 99.23% of voters who agree that they like this video (I’m one of them). That 0.77% almost seems like noise – the video seems impossible to dislike, just from the numbers. Interestingly, the comments seem surprisingly trollish, perhaps a reaction to the near unanimous approval the video has. There are videos with 100% Like rates, but typically they need to stay under a certain view threshold.

One other related observation: reading the comments, there’s often a “NN dislikes” meme. That is, many of the comments are responses to how many dislikes there are. It usually is a variation on the video itself. A contrived example might be “42 people think the world is flat.” This may be of interest since it seems to be almost a kind of double vote. “Not only do I like this video, I’m going to call out people who didn’t like it!” The “NN Dislikes” meme seems to be popular, based on the number of likes the comment usually gets.

–Daniel Tsadok

* I strongly suspect that YouTube videos follow a Power Law distribution for number of views, but I don’t have numbers to back that up (yet).

UPDATE: I changed the title from “On the Feedback Shadows of Youtube” on October 23, 2011.
UPDATE: I changed the title from “YouTube, Disruptive YouTube” (ugh) on November 27, 2011.

Google Analytics is Even More Dangerous

I recently wrote about the potential dangers of Google’s Chrome browser to user privacy. However, Google has another product that is far more dangerous in the short term to privacy – Google Analytics. What is Google Analytics? I will let Google answer that one:

Google Analytics is the enterprise-class web analytics solution that gives you rich insights into your website traffic and marketing effectiveness. Powerful, flexible and easy-to-use features now let you see and analyze your traffic data in an entirely new way. With Google Analytics, you’re more prepared to write better-targeted ads, strengthen your marketing initiatives and create higher converting websites.

In other words, websites can use Google Analytics to analyze their web traffic to see who is visiting what pages, for how long, and in what order. Websites can also see how visitors made it to their site (whether via an ad, or a search, or a link from another site). Obviously this is invaluable information to most websites, and like so many Google products, is a great service. And Google Analytics is free for lower traffic websites, which means smaller websites are encouraged to use the service as well.

I should also say at this point that I don’t really have a problem with any individual website monitoring and analyzing its traffic. It’s pretty much a necessity these days if you are doing business on the web to get information about what is going on with your site. So my problem is not with the website operators who depend on this tool.

Which is what makes Google Analytics so dangerous. It is a ubiquitous service. Go to just about any website and you’ll see the Google Analytics tracking code at the bottom of the page source. And every time a website uses Google Analytics, it is sharing its traffic data with Google. Since you are giving your IP address to both the website and Google, Google is able to cross-reference your visit across websites.

In other words, Google knows just about every website you visit, whether the website is affiliated with Google or not, since chances are that websites that you frequent depend on Google Analytics. You don’t have to ever use any of Google’s services for Google to spy on you. All you have to do is surf.

Fortunately, there are steps you can take to protect yourself. You can block Google Analytics unilaterally via your HOSTS file, for example (see here for more information). There are also several Firefox plugins, such as NoScript and RequestPolicy, that will block Google Analytics. But the most important step right now is to raise awareness, and to let websites know that you are concerned about your privacy. The fact is that any third-party service that does analytics (like Chartbeat, a competitor in that sphere) will become dangerous if a critical mass of websites uses it. So the best suggestion would be to use in-house analytics tools to analyze the website’s server logs, which is exactly what people did before Google Analytics came along. That way websites can still see what is happening with their traffic without sharing their data, as well as their users’, with a third party.
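To give a sense of how little is needed for basic in-house analytics, here is a minimal sketch that counts successful page requests from a web server access log. It assumes the log uses the Common Log Format; the sample lines, the method name, and the `access.log` filename are hypothetical.

```ruby
# Matches a Common Log Format line such as:
# 203.0.113.7 - - [10/Oct/2011:13:55:36 -0400] "GET /about HTTP/1.1" 200 2326
LOG_LINE = %r{\A(?<ip>\S+) \S+ \S+ \[(?<time>[^\]]+)\] "(?<method>\S+) (?<path>\S+) [^"]*" (?<status>\d{3}) \S+}

# Count successful (status 200) requests per path, most requested first.
def page_hits(lines)
  hits = Hash.new(0)
  lines.each do |line|
    match = LOG_LINE.match(line)
    next unless match && match[:status] == "200"

    hits[match[:path]] += 1
  end
  hits.sort_by { |path, count| [-count, path] }.to_h
end

# Hypothetical sample; in practice, pass File.foreach("access.log").
SAMPLE_LOG_LINES = [
  '203.0.113.7 - - [10/Oct/2011:13:55:36 -0400] "GET / HTTP/1.1" 200 512',
  '203.0.113.7 - - [10/Oct/2011:13:55:41 -0400] "GET /about HTTP/1.1" 200 2326',
  '198.51.100.23 - - [10/Oct/2011:13:56:02 -0400] "GET /about HTTP/1.1" 200 2326',
  '198.51.100.23 - - [10/Oct/2011:13:56:09 -0400] "GET /missing HTTP/1.1" 404 209'
].freeze

page_hits(SAMPLE_LOG_LINES).each { |path, count| puts "#{count} #{path}" }
```

Everything here stays on the website's own server, so no visitor data is shared with anyone.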

It’s All Middleware

Jon Crosby has an excellent talk from Mountain West Ruby Conference 2009 about how middleware is taking over web development. His article “A World of Middleware” is a bit difficult to parse if you don’t know Rack, so I’ll try to simplify what he’s saying a bit…

Rails 2.3 is built on Rack, which is a simple specification to connect web frameworks like Rails and Merb with web servers like Mongrel and Thin.  Rails 2.3 also lets you define middleware code to be run before Rails is invoked, again using Rack.  In Rails terms, middlewares are like before_filters that are run before Rails is ever even called.  Another way of looking at a middleware is as a miniature application that can connect, via Rack, to another middleware.  Each middleware in the stack handles certain requests (for example, certain URLs can be handled by specific middlewares), or all requests.  If the middleware returns 404, the next middleware is called.  Rails 2.3 only handles requests that fall to the bottom of the middleware stack.
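Here is a rough sketch of that fall-through idea in plain Ruby, with no gems required.  It relies only on the Rack convention that an application is any object responding to call(env) and returning [status, headers, body].  The class and app names are hypothetical, and this is not how Rails 2.3 wires things up internally; Rack itself ships a similar helper, Rack::Cascade.

```ruby
NOT_FOUND = [404, { "Content-Type" => "text/plain" }, ["Not Found"]].freeze

# Hypothetical mini-app that only answers /status.
StatusApp = lambda do |env|
  if env["PATH_INFO"] == "/status"
    [200, { "Content-Type" => "text/plain" }, ["OK"]]
  else
    NOT_FOUND
  end
end

# Stand-in for the main application at the bottom of the stack.
MainApp = lambda do |env|
  [200, { "Content-Type" => "text/plain" }, ["Main app handled #{env['PATH_INFO']}"]]
end

# Tries each app in order and returns the first response that is not a 404.
class FallThroughStack
  def initialize(*apps)
    @apps = apps
  end

  def call(env)
    response = NOT_FOUND
    @apps.each do |app|
      response = app.call(env)
      break unless response[0] == 404
    end
    response
  end
end

STACK = FallThroughStack.new(StatusApp, MainApp)

puts STACK.call("PATH_INFO" => "/status")[2].first # OK
puts STACK.call("PATH_INFO" => "/posts")[2].first  # Main app handled /posts
```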

Crosby’s point is, why bother with a chain of middlewares leading up to an application?  All you need is a chain of middlewares that are the application.  His example in the talk is an authentication middleware that sits on top of everything else, instead of the authentication layer being part of a monolithic application.
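To illustrate the authentication example, here is a minimal sketch of a middleware that wraps whatever sits below it.  This is my own toy version, not Crosby's code: the X-Auth-Token header, the shared token, and the class names are all hypothetical, and a real layer would check a session or credentials instead.

```ruby
# Middleware that refuses requests lacking the expected token.
class RequireToken
  def initialize(app, token)
    @app = app
    @token = token
  end

  def call(env)
    # Rack exposes the X-Auth-Token request header as HTTP_X_AUTH_TOKEN.
    if env["HTTP_X_AUTH_TOKEN"] == @token
      @app.call(env) # authorized: hand the request to the next layer
    else
      [401, { "Content-Type" => "text/plain" }, ["Unauthorized"]]
    end
  end
end

# The layer underneath knows nothing about authentication.
Greeter = lambda do |_env|
  [200, { "Content-Type" => "text/plain" }, ["Hello"]]
end

PROTECTED_APP = RequireToken.new(Greeter, "s3cret")

puts PROTECTED_APP.call("HTTP_X_AUTH_TOKEN" => "s3cret")[0] # 200
puts PROTECTED_APP.call({})[0]                              # 401
```

In a config.ru file the same stack would be written as `use RequireToken, "s3cret"` followed by `run Greeter`.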

The best part is that middlewares can be created as black boxes, and simply included in your Rack application (or used as a middleware in your Rails 2.3 application).  And it looks like the best tool for creating simple middlewares is Sinatra.  This stuff isn’t even on the main Sinatra site yet, but you can read about it on the Sinatra blog.  To quote: “multiple Sinatra applications can now be run in isolation and co-exist peacefully with other Rack based frameworks.”  This is awesome – it means that a Sinatra application can be used as middleware for a Rails application (or other Sinatra applications).  So it’s easy to build a big application by chaining together smaller Sinatra middlewares.

Since Ruby’s open source community is so strong, there’s no doubt we will be seeing pluggable middlewares that fit easily inside any Rack application.  This is exciting stuff, and I look forward to building an app this way!

The Rise (and Approaching Fall) of the Duopoly

I wish I had a nickel for every time I heard someone say they couldn’t “get on the internet”.  Whether it’s via cable, DSL, 3G, or whatever, people are always having problems logging on.  It’s a baffling problem for me, because the internet evolved from ARPAnet, which was commissioned by the US Department of Defense forty years ago to be a completely reliable network.  So why, decades later, is “the internet” so “unreliable”?  The answer is that it’s not.  Individual connections (like mine and yours) may not be reliable, but the internet as a whole is incredibly reliable.  Any device connected to the internet can connect to billions of computers all over the world, seamlessly.

Ok, so why is your connection so unreliable? Don’t be so hasty to blame your service provider, although they certainly deserve it.  The real blame lies collectively with every consumer who currently pays for internet access.

One of the things that saddens me about the evolution of the internet is how it went from being a participatory network to a provider-consumer one.  Originally, the internet was a community of computers. Later, but before the broadband revolution, there were dozens of companies that offered internet via dial-up, which has a maximum speed of 56K, paltry by today’s standards. Broadband made dial-up obsolete, and the dial-up business crumbled.  Unfortunately, broadband required a direct connection to the home, and the only companies that could make that connection were the phone and cable companies.  Startup broadband just didn’t work.

Which brings us to today, where your only options for wired broadband in most of New York City are Time Warner and Verizon (there are some small DSL companies that still exist out there, but they usually require you to have Verizon phone service already).  In other words, New York City residents face a duopoly when it comes to broadband internet service – they must choose between two mediocre, unreliable, bloated bureaucracies.

Considering how long the technology behind the internet has been around, surely there are other possibilities?  I believe there are, and in the coming years, I believe those possibilities will crystalize.

More on that in another post…