Twitter's Garbage Problem

One person’s garbage is another person’s gold. Twitter has no shortage of both.

“A startup is a human institution designed to deliver a new product or service under conditions of extreme uncertainty.” PM May 10th via ShareFeed


I just ousted Owen S. as the mayor of Samovar Tea Lounge on @foursquare! AM May 5th via foursquare

But which is which? Information content is subjective. When someone I follow tweets about what they ate for breakfast, it’s just noise to me. However, that person might have other followers who find breakfast updates genuinely interesting. Who knows? It’s not for me to say.

Unfortunately, there’s not much I can do with my preferences on Twitter since it’s an all-or-nothing platform. I follow person X, or I don’t. I get all of X’s tweets, or none. There’s no easy way to follow X and only receive the tweets that have value to me without also plowing through tons of noise.

Human flexibility. We rock the flock & swarm like birds & bees… but a solo human is also v. formidable. We’re ants AND we’re lions. 9:20 AM May 15th via Tweetie


Here in Seattle they put cream cheese on their hot dogs. W-E-I-R-D. Alaska’s got nothing on this. 4:08 PM May 13th via Tweetie

A core problem with Twitter is that signal and noise are inseparable in a single person’s tweet stream.

Being an entrepreneur is not the same as starting a company. PM May 13th via TweetDeck


Run: 3mi down, 3mi to go. Feeling #awesome! 4:42 PM May 14th via Tweetie

(Disclaimer: all of the above Twitterers are awesome and you should follow them.)


The solution to this noise problem is to filter out tweets on the client side.

A first approach is to simply filter out tweets by keyword. I think of this as anti-search: specify a keyword, and never see any updates containing that word.

Keyword filtering alone could probably solve 1/3 of the Twitter garbage problem by tossing out tweets containing location check-in URLs. Filter out “lunch”, “beer”, and other daily-trivia keywords (along with “iPad” or other topics that become oversaturated) and the signal-to-noise ratio could likely be raised even higher. Hashtags could be useful as filtering keywords, too.
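Here is a minimal sketch of what anti-search might look like on the client side. It assumes tweets arrive as plain strings; the names `make_keyword_filter` and `filter_stream` are made up for illustration and do not come from any real Twitter client.

```python
import re


def make_keyword_filter(blocked_keywords):
    """Build a predicate that returns True when a tweet should be hidden."""
    blocked = {keyword.lower() for keyword in blocked_keywords}

    def is_noise(text):
        # Break the tweet into lowercase word-like tokens. The optional
        # leading '#' keeps hashtags intact so they can be blocked as written.
        tokens = re.findall(r"#?\w+", text.lower())
        return any(token in blocked for token in tokens)

    return is_noise


def filter_stream(tweets, is_noise):
    """Return only the tweets that the predicate does not flag as noise."""
    return [tweet for tweet in tweets if not is_noise(tweet)]


if __name__ == "__main__":
    stream = [
        "Grabbing lunch downtown",
        "Being an entrepreneur is not the same as starting a company.",
        "Run: 3mi down, 3mi to go. Feeling #awesome!",
    ]
    hide = make_keyword_filter(["lunch", "beer", "iPad", "#awesome"])
    for tweet in filter_stream(stream, hide):
        print(tweet)
```

Matching on whole tokens rather than substrings is a deliberate choice here: blocking “beer” should not also hide a tweet about “Oktoberfest beers” unless I ask for that, and it avoids accidental hits inside longer words.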

A second approach is to filter out tweets by regular expression. This is basically the same idea as keyword filtering, but a regular expression would allow more subtle patterns to be recognized.
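As a rough sketch of the regular-expression idea, the patterns below target two kinds of tweets quoted earlier: foursquare mayorship announcements and running-distance updates. The patterns and the name `matches_noise_pattern` are my own illustrative guesses, not rules taken from any existing client.

```python
import re

# Hypothetical noise patterns; a real list would be user-configurable.
NOISE_PATTERNS = [
    # "... as the mayor of <place> on @foursquare!"
    re.compile(r"\bmayor of\b.*@foursquare", re.IGNORECASE),
    # Distances such as "3mi", "3 mi", or "4.5mi"
    re.compile(r"\b\d+(?:\.\d+)?\s?mi\b", re.IGNORECASE),
]


def matches_noise_pattern(text):
    """Return True if any of the noise patterns occurs in the tweet."""
    return any(pattern.search(text) for pattern in NOISE_PATTERNS)


if __name__ == "__main__":
    samples = [
        "I just ousted Owen S. as the mayor of Samovar Tea Lounge on @foursquare!",
        "Run: 3mi down, 3mi to go. Feeling #awesome!",
        "Being an entrepreneur is not the same as starting a company.",
    ]
    for tweet in samples:
        print(matches_noise_pattern(tweet), tweet)
```

A plain keyword list could not express “a number followed by mi”, which is the sort of subtle pattern a regular expression handles easily.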

Adding a layer of boolean logic on top of these filtering operations would make them even more useful by allowing constructs like “remove tweets containing A but not B”.
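One way to picture the boolean layer is as small functions that combine simpler filters. This is only a sketch; `contains`, `both`, `either`, and `negate` are hypothetical helper names, and the rule at the end is just an example of “A but not B”.

```python
def contains(word):
    """Predicate: the tweet contains the given word (case-insensitive substring)."""
    needle = word.lower()
    return lambda text: needle in text.lower()


def both(p, q):
    """Logical AND of two predicates."""
    return lambda text: p(text) and q(text)


def either(p, q):
    """Logical OR of two predicates."""
    return lambda text: p(text) or q(text)


def negate(p):
    """Logical NOT of a predicate."""
    return lambda text: not p(text)


# "Remove tweets containing A but not B": hide iPad chatter,
# but keep anything that is actually a review.
remove_rule = both(contains("ipad"), negate(contains("review")))

if __name__ == "__main__":
    for tweet in ["Just bought an iPad!", "My in-depth iPad review is up"]:
        print(remove_rule(tweet), tweet)
```

Because every filter is just a function from tweet text to True or False, keyword and regular-expression filters could be plugged into the same combinators without any changes.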

Filtering with annotations

A third approach is to filter out tweets by metadata. Twitter’s Annotations will allow extra information (metadata) to be embedded inside a tweet without modifying the contents of the tweet itself.

Imagine if authors annotated each tweet with a subject category. Then, a perfect filter would be as simple as saying “don’t show me tweets with category = lunch”. Or “don’t show me tweets with category = checkin”. And so on.
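To make that concrete, here is a sketch of a category filter. Everything about the data shape is an assumption: I am pretending each tweet is a dictionary with an optional `annotations` entry that holds a `category` string. That is not the actual Annotations format, just a stand-in for illustration.

```python
# Hypothetical categories I never want to see.
BLOCKED_CATEGORIES = {"lunch", "checkin"}


def visible_tweets(tweets, blocked=BLOCKED_CATEGORIES):
    """Keep tweets whose (assumed) category annotation is not blocked.

    Tweets with no annotation at all are kept, since there is no
    metadata to judge them by.
    """
    kept = []
    for tweet in tweets:
        category = tweet.get("annotations", {}).get("category")
        if category not in blocked:
            kept.append(tweet)
    return kept


if __name__ == "__main__":
    stream = [
        {"text": "Cream cheese on hot dogs?!", "annotations": {"category": "lunch"}},
        {"text": "I'm at Samovar Tea Lounge", "annotations": {"category": "checkin"}},
        {"text": "Startups and uncertainty", "annotations": {"category": "startups"}},
        {"text": "No annotation on this one"},
    ]
    for tweet in visible_tweets(stream):
        print(tweet["text"])
```

Unlike keyword or pattern matching, this never inspects the tweet text, so it cannot be fooled by unusual wording. It only works if authors (or their clients) actually supply the category.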

The actual process of annotating a tweet would have to be painless and require zero extra work. Otherwise, no one will do it. I hope some clever client app will solve this annotation problem.

More information with less noise

If, say, 50% of my Twitter stream is noise, then filtering all of it out means I could follow 2x as many people and get 2x as much information for the same time spent. That’s a great ROI in my book.
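The same arithmetic generalizes to any noise level. The sketch below assumes a perfect filter, and that everyone I follow tweets at about the same rate with about the same share of noise; `follow_multiplier` is just a name I picked for the calculation.

```python
def follow_multiplier(noise_fraction):
    """How many times more people I could follow for the same reading time.

    If a fraction n of the stream is noise and all of it is filtered out,
    each person costs only (1 - n) of the original reading time, so the
    same time budget covers 1 / (1 - n) times as many people.
    """
    if not 0 <= noise_fraction < 1:
        raise ValueError("noise_fraction must be in the range [0, 1)")
    return 1 / (1 - noise_fraction)


if __name__ == "__main__":
    for noise in (0.0, 0.5, 0.75):
        print(noise, follow_multiplier(noise))
```

At 50% noise the multiplier is 2, matching the estimate above; at 75% noise it would be 4.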

The ability to automatically filter individual tweet streams is a killer feature that would make Twitter many times more useful.

It’s surprising to me that this feature is not commonplace.

In a future post I’ll review the few Twitter clients that have filtering-out capabilities. Among the options I’ve found so far are Mixero and Filttr. If you have other suggestions, please leave them in the comments.