Things that ... make you go hmm: technology, music, video, art, news, reviews and muse on the web

December 3, 2006

David Berlind, here is the dark side of URL shortening services

developers, spam — by TDavid @ 12:30 pm PST

In one of the most bizarre exclamations I’ve read in a long time, ZDNet’s David Berlind attempts to make a case for the short URL service tinyurl.com being the next YouTube. No, really, he’s serious.

Berlind writes:

TinyURL is the next YouTube. In fact. It’s better. It’s a dream come true for the Madison avenue types whose Holy Grail has always been how to serve people with an advertisement at their moment of greatest need.

Yes, he’s talking about the same tinyurl shortening service which, in addition to being useful to Berlind and others for shortening long relevant URLs, is also a haven for spammers wanting to mask links to sites with illegal activity and worse (malware, anyone?). After I read the article I looked in the comment moderation bin for this blog, and what did I see? Yup, you already saw it leading off this post. Really, that was the very first comment waiting for me to approve. Berlind should get with Matt Mullenweg and see how many comment spams of this type are sullying the Akismet bins every hour of every day.

My apologies for not masking most of the NSFW keywords in that screenshot; they are there to reinforce my point that this happens frequently with short URL services. All legitimate adult and mainstream affiliate programs have strict rules against spamming. Using a third party service to intentionally mask the URL for spamming purposes is clearly a major TOS violation. These affiliates would have their accounts banned for this activity. Are TinyURL or any of the competing short URL services working to report the people who use their services for spamming?

An owner’s perspective
Before we get too far: I speak on the subject of short URL services from direct experience. Since 2004 I’ve owned and operated a similar service called tdurl.com (9 characters), which is actually shorter than tinyurl.com (11 characters). I didn’t start the service to compete against the other URL shortening services, but primarily to use for myself. I didn’t like the idea of links that could easily be expired at some point in the future, and I knew that running my own service was the only way to guarantee that short links I used wouldn’t be killed later or perhaps, as Berlind hints, someday become a second page behind an advertisement. So I wrote my own code and deployed my own service, intending to be the primary beneficiary myself. The entire codebase that makes up our tdurl.com short URL service is fewer than 300 lines in a single PHP file. The click counts for each short URL are kept in a database.
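
For readers wondering how little code a service like this needs, here is a rough sketch. It is not the tdurl.com code (that is PHP and isn't shown in this post); it is a Python illustration that assumes an SQLite table, base-36 codes derived from the row id, and made-up names such as `Shortener`, `shorten`, and `resolve`.

```python
import sqlite3
import string

# Assumption: short codes are the row id written in base 36 (0-9, a-z).
ALPHABET = string.digits + string.ascii_lowercase


def encode(n):
    """Turn a positive row id into a short base-36 code."""
    code = ""
    while n > 0:
        n, rem = divmod(n, len(ALPHABET))
        code = ALPHABET[rem] + code
    return code or "0"


def decode(code):
    """Turn a code back into a row id, or None if it is not valid base 36."""
    try:
        return int(code, 36)
    except ValueError:
        return None


class Shortener:
    """Hypothetical minimal shortener that keeps a click count per link."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS links ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "long_url TEXT UNIQUE NOT NULL, "
            "clicks INTEGER NOT NULL DEFAULT 0)"
        )

    def shorten(self, long_url):
        """Return the code for long_url, reusing the existing row if any."""
        row = self.db.execute(
            "SELECT id FROM links WHERE long_url = ?", (long_url,)
        ).fetchone()
        if row is not None:
            return encode(row[0])
        cur = self.db.execute(
            "INSERT INTO links (long_url) VALUES (?)", (long_url,)
        )
        self.db.commit()
        return encode(cur.lastrowid)

    def resolve(self, code):
        """Return the long URL for a code and count the click, else None."""
        link_id = decode(code)
        if link_id is None:
            return None
        row = self.db.execute(
            "SELECT long_url FROM links WHERE id = ?", (link_id,)
        ).fetchone()
        if row is None:
            return None
        self.db.execute(
            "UPDATE links SET clicks = clicks + 1 WHERE id = ?", (link_id,)
        )
        self.db.commit()
        return row[0]

    def clicks(self, code):
        """Return how many times a code has been resolved."""
        link_id = decode(code)
        row = self.db.execute(
            "SELECT clicks FROM links WHERE id = ?", (link_id,)
        ).fetchone()
        return row[0] if row else 0
```

In a real deployment `resolve` would sit behind a web handler that answers with an HTTP redirect; the sketch leaves that part out. On a fresh database the first link shortened gets the code `1`.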

I told a few friends and our radio show audience of my intentions back in 2004. I even offered the code to my hosting company and encouraged them to start their own URL shortening service. IMO, I still believe every company with multiple websites should have their own URL shortening service rather than use any third party service (including mine).

With that in mind, hopefully this post isn’t seen as an attempt to bash a competing service or to disrespect TinyURL’s creator, Kevin “Gilby” Gilbertson of Minnesota (although I’m sure some will see it that way). Instead, I’d like to point out, from a short URL owner’s perspective, the inherent problems with running a short URL service that outweigh its usefulness and value for third party use. I have almost two years of direct experience to draw from, and that should be worth something to add to the discussion.

As a quick aside, I thought it was kind of funny how Gilby seemed a bit perplexed by David Berlind’s overexuberance:

ZDNet: Every time that I go to TinyURL, the thing that crosses my mind is this is the next YouTube…
Gilby: You think so?
ZDNet: I think so, I don’t know if you’re worth 1.6 billion dollars, but I think well, this kind of traffic, this kind of utility, this kind of simple idea, lot’s of incredibly useful data coming through that would be useful to advertisers on the Internet…

In a few words, hosting short URLs for people I don’t know has overall been a miserable experience. Most of the third party activity has been similar to what you see in the screenshot above and/or has come from individuals trying to mask other affiliate URLs and spam newsgroups, message boards, email, MySpace, blogs, you name it.

Running one of these short URL services, and keeping it clean, requires even more aggressive filtering and monitoring than checking your email box. I lamented the challenges of keeping these services spammer-free back in July, in a post here called War of the Short URL Worlds. The activity had gotten progressively worse as spammers tried to hide their activities behind the short URL service, which culminated in that post. For a while I was getting daily emails alerting me that my short URL service was being used for spamming. Fortunately, with the changes made over the past few months, it has improved slightly. I also had to ban the use of other URL shortening services like TinyURL, which were being used for multi-level spam redirection. Sound like fun?

In every case we remove the links, and in most cases we add custom filtering rules. I’m filtering out many porn, viagra, mortgage and other affiliate programs. I’m also blocking many country domains outside the US, Canada, UK and Europe. Still, the service can be, and is, used by spammers. An ongoing battle.
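
To make the filtering idea concrete, here is a rough Python sketch of a submission check. It is not the actual tdurl.com rule set: the keyword list, the banned shortener hosts, and especially the top-level-domain allowlist are illustrative assumptions (the allowlist is a stricter stand-in for "blocking many country domains"), and `rejection_reason` is a made-up name.

```python
from urllib.parse import urlsplit

# All three sets are illustrative assumptions, not a real rule set.
BANNED_KEYWORDS = {"viagra", "mortgage", "porn"}
BANNED_HOSTS = {"tinyurl.com"}  # other shorteners used for chained redirects
ALLOWED_TLDS = {"com", "net", "org", "us", "ca", "uk", "de", "fr"}


def rejection_reason(url):
    """Return a short reason for rejecting a submitted URL, or None if it passes."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        return "malformed URL"
    if parts.scheme not in ("http", "https") or not host:
        return "not an http(s) URL"
    # Reject the banned shortener domains and any of their subdomains.
    if any(host == h or host.endswith("." + h) for h in BANNED_HOSTS):
        return "another URL shortener"
    # The last dot-separated label of the host is the top-level domain.
    if host.rsplit(".", 1)[-1] not in ALLOWED_TLDS:
        return "blocked top-level domain"
    # Keyword match over the whole URL, host and path alike.
    lowered = url.lower()
    for word in sorted(BANNED_KEYWORDS):
        if word in lowered:
            return "banned keyword: " + word
    return None
```

A plain substring match like this is crude and will produce false positives, so treat it as a first-pass gate in front of manual review. It does nothing against a spammer who registers an innocent-looking domain.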

And remember, this comes from somebody who isn’t out actively promoting their short URL service for others to use. Spammers are still finding the service and trying to subvert it for TOS-violating and in some cases illegal activity. I can only imagine how much BS Mr. “Gilby” Gilbertson and others who do actively promote their services for public use have to face.

Google doesn’t need the misery
Now why would somebody like Google ever want to buy a short URL service and suddenly become the proxy for spammers? They already receive enough bad karma from owning arguably the world’s worst splog haven in blogspot.com. Instead, they’d be wiser to use their assets to buy a one digit domain and build their own. These short URL services are easy to build and deploy, but require much more work to keep clean of trash. They could then start fresh and build their own anti-spam defense.

Also, and perhaps more importantly, Google has gone on record as saying that they don’t like intrusive advertising. People who use, and have used, URL shortening services didn’t intend, agree to, or want somebody to slam an ad in front of the target link the way [cough, cough] a number of mainstream media sites do. That type of intrusion would no doubt be deemed evil by Google’s standards, eyeballs be damned.

To play devil’s advocate, there was a lot of negativity surrounding YouTube — particularly with copyright violations — and that didn’t stop the big G from closing the deal. But this is different in the sense that there are dozens of short URL services and the thing that separates them is how good they are about filtering (or accepting) the trash.

So then who would be a suitor for an existing URL shortening service? Somebody who doesn’t mind taking the PR hit, perhaps. Somebody who would love to take that database of links and subvert them, just as the comment spammers are using them while you are reading this. What does Gilby think about this?

The answer is in the podcast.

Podcast with Gilby, TinyURL’s creator
I listened to Berlind’s podcast with trepidation, primarily hoping to absorb some logic behind why he thinks tinyurl.com and the dozens of copycats (my own service admittedly being one of them) will ever be worth much to non-spammers, or to companies that are willing to absorb the headaches associated with running this type of service.

Ironically, the podcast starts with “a word from the sponsor.” Then Berlind prefaces the interview with some background stats on TinyURL, which listeners learn has been running since January 2002. Over 28 million long URLs have been shortened using TinyURL, and the homepage claims to be receiving over 675 million clickthrus a month.

In the podcast we learn, however, that TinyURL doesn’t count clicks on individual long URLs, so that number must be an aggregate figure. Per-URL click counts are a major feature for alerting the owner of a short URL service that a shortened long URL could be spam, and I’m surprised TinyURL doesn’t include this feature yet. Many others, my own service included, do.
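
As a small illustration of how per-URL counts can serve as an alert, the sketch below lists the short codes whose click totals cross a review threshold, busiest first. The function name `links_to_review` and the default threshold of 1000 are assumptions for the example. A high count only means "look at this one"; it can't tell a spam run from a link that is simply popular.

```python
from typing import Dict, List


def links_to_review(click_counts: Dict[str, int], threshold: int = 1000) -> List[str]:
    """Return short codes with at least `threshold` clicks, busiest first."""
    flagged = [code for code, clicks in click_counts.items() if clicks >= threshold]
    # Sort by click count descending, then by code so ties come out in a stable order.
    return sorted(flagged, key=lambda code: (-click_counts[code], code))
```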

Berlind realizes later in the interview that the 675 million clickthrus aren’t actual webpage visits to TinyURL. How did he not realize this is how it works if he had been using this for a while?

Gilby sounds like a good guy. The type you might go ice fishing with in Minnesota. I graduated from high school in Wisconsin and he seems like the type of friendly Midwesterner I went to school with. Gilby created a useful service, but it’s too bad that Berlind doesn’t dig into the dark side of these services. He sort of glosses over the abuse factor, even when mentioned by Gilby, in some sort of misguided effort to support his crazy notion that this could be the next YouTube.

Disappointing interview.

Curiously enough, Gilby is concerned about the service being ruined by advertising, perhaps in the very way that Berlind is thinking. I like web developers like Gilby who are more concerned about keeping the purity of the service than letting it be completely violated by advertising. I’m glad to hear he’s able to make a living from shorter URLs, but I don’t envy his position in any way, shape or form. When he mentioned “abuse” I knew exactly what he was talking about.

I’ve read stuff from David Berlind before and he seems like a pretty smart chap, but I don’t understand where he’s coming from at all on this one. Maybe he’ll be kind enough to ping me or stop by below and explain more fully in a follow-up. Honestly, I’m surprised that CNET/ZDNet doesn’t start their own URL shortening service. They could have cnurl.com or zdurl.com, if those are available. If they happen to have a coveted one or two digit domain with perhaps .us behind it, even better. The whole point is to use the fewest characters possible, and the newer the service, the shorter the resulting URL.

In the podcast Berlind tells Gilby multiple times: “I think you are sitting on a gold mine here.”

Spammers joyfully subverting these services most certainly agree. Not sure any suitors with big wallets will agree.


17 Comments

  1. TinyURL A Big Utility…but the Next YouTube??

    Have you ever come across an extraordinarily long URL and wondered: why does it have to be that long and is there a way to shorten it so it could become more user-friendly in an e-mail or blog post? (Wow, that’s a long first sentence could be shortene…

    Trackback by Mark Evans — December 4, 2006 @ 7:44 am PST

  2. Well said. Providing a obfuscated redirect service in a way that isn’t abused is very very difficult. The redirect site’s abuse desk doesn’t see all the email and IM spam containing its link — they only get the clicks. This creates a very murky mess — Is a volume redirect click abusive (the screenshot) or just popular (the mentos diet coke fad)? Similarly, how does a free, no-registration required site do effective rate-limiting of submissions? I’d guess that they’d have to use url blacklists like SURBL and URIBL to shut down links, but, those services only find spam after the fact. Probably worst though is that those URI blacklists don’t have ways of listing specific redirect sites — they can blacklist the entire domain (causing massive false positives), but not tinyurl.com/badone.

    Comment by Miles — December 4, 2006 @ 2:10 pm PST

  3. LOL, in your screen it looks like one of your links was a visited one.

    Comment by Robyn Tippins — December 5, 2006 @ 6:15 pm PST

  4. Nope Robyn, just highlighted ;) Oddly enough the entire URL isn’t highlighted though with the complete URL so it does look like a visited URL.

    Comment by TDavid — December 5, 2006 @ 6:29 pm PST

  5. […] Most Hmm-worthy posts from December 2, 2006 - December 31, 2006 - David Berlind, here is the dark side of URL shortening services (4) [dec 3] “Well said. Providing a obfuscated redirect service in a way that isn’t abused is very very difficult. The redirect site’s abuse desk doesn’t see all the email and IM spam containing its link — they only get the clicks. This creates a very murky mess — Is a volume redirect click abusive (the screenshot) or just popular (the mentos diet coke fad)?” – Comment by Miles — December 4, 2006 @ 2:10 pm […]

    Pingback by Hmmcast #30: Unplugged » Make You Go Hmm — January 1, 2007 @ 11:34 am PST

  6. Hey I just started a URL redirection site called spinlinks.net And a whole network of them which can be seen from network.sponlinks.net

    Got a complaint from a blogger the third day the system was up.

    People really don’t understand how this stuff works. If you are not concerned with the bandwidth, a spammer is just another user for a redirection service. My service at the spinlinks network cannot produce spam, and it cannot be used ‘to hide the target of a link’. Spammers can name their domains just as creatively as any other user. The redirector is not used to advertise any product or service, and does not place links on other sites.

    In fact, these sites (the spinlinks redirectors) provide a way for surfers to look and see what sort of scripts will be run, and what information will be placed on their computer when they visit a site. We also make available a complete listing of the redirects encountered when visiting a link in our system.

    Can you explain to me why you felt obligated, and/or feel that others should be obligated to monitor and censor the links placed in a redirector system?

    Thanks

    Comment by Jake — January 2, 2007 @ 12:20 pm PST

  7. Hi Jake - since you admitted you just started running your own URL redirection site, I’d ask you to get back to me on that question in 6-12 months. I’m not trying to be flippant with that comment, but with all due respect I don’t think you have enough experience actually running the service to see all dimensions of what the service entails.

    If your links get banned by server admins and end up in spam blacklists that impacts your overall service. Like when people want to use them in forums and the forum admins don’t allow them, etc. Also there is the case of your name being associated with spamming and linking to illegal activity. Obviously the links aren’t yours but anybody who runs a service that is open to the public has legal responsibilities.

    Check back with me in a year and let me know how you’re handling these issues. If you are going completely unmoderated that’s going to get you in trouble sooner or later. That’s your call if you want to take that risk. People do it every day on the web.

    Not how we run any of our businesses though. To each his own :)

    Comment by TDavid — January 2, 2007 @ 12:52 pm PST

  8. Wow, thanks for a quick response!

    I understand about the issues of association, and I think I’m prepared to address that.

    I am, however, very interested in a comment like:

    Obviously the links aren’t yours but anybody who runs a service that is open to the public has legal responsibilities.

    What sort of legal responsibilities are you aware of regarding the use of public systems? Are you discussing US law, or another country? It is not as if we’re providing physical facilities, subject to building code or some such.

    Are you honestly aware of legal obligations regarding this sort of service?

    I mean, I understand that there are issues. If I was not able to afford much bandwidth I would not have appreciated the extra 12,000 or so hits to the redirector from the one spammer I’ve had so far.

    I’ve already noticed that since my redirectors correctly give 301 codes on redirect, the only backlinks or SE results I get are for incorrectly formed links, and mostly obvious spam links. The valid links pass all reference, page rank, and SE queries on to the target page. This makes for a bit of marginal publicity, at best.

    I may look into implementing SURBL functionality, if the traffic gets to be a problem. Admins who block URL shorteners have their own reasons, and make their own decisions regarding what they allow on their site.

    But the worst problem so far comes from a knee-jerk user reaction. He sent a mail to the abuse addresses at the site, my registrar, and my hosting service which appeared to be accusing me of spamming his blog. Since the redirector cannot create spam or post to a blog, it is a good thing I have an intelligent hosting provider.

    In fact, what I have learned about SURBL lists today will get mentioned on my page at robertalansoloway.com at the next update!

    Thanks again for the feedback, too. Are there other methods you can recommend besides the SURBL for checking such things without user input? I’m REALLY interested in a phishing URL check.

    Comment by Jake — January 2, 2007 @ 1:57 pm PST

  9. IMO, I still believe every company with multiple websites should have their own URL shortening service rather than use any third party service (including mine).

    You are 100% right! I know nothing about programming but figured out a workaround a few years ago by using subdomains

    psk.funDiva.com
    funDiva.blogspot.com/2005/08/princess-samantha-kitten-and-notorious.html

    But trying to explain HOW to use subdomains to my computer-challenged friends proved way too hard, and that’s if their registrar even offered subdomains.

    And I always liked the look of funDiva.com/psk better

    I can accomplish this with cpanel easy but I would LOVE to know how to set this up on a web page for my friends. I have no interest in making it a public service, just an easy way for them to use their own shortened domain redirects.

    Your radio show on the subject was never archived, anyplace else I can get the info?

    Thanks in advance,
    Christy

    Comment by Christy — March 17, 2007 @ 7:03 am PST

  10. Christy - tdurl.com is run off my own custom code. Another reader informed me that there is a PHP-based script out there called phurl that will let you run a URL shortening service for yourself, which you can find here: http://www.hido.net/projects/phurl/

    Comment by TDavid — March 17, 2007 @ 9:26 am PST

  11. I decided to start my own URL shortening service after finding that many lack features like path forwarding and masking, and that their domains were not short enough… Well, I think most people want to shorten their URLs as much as possible, right?

    Comment by Rino — June 12, 2007 @ 10:41 am PST

  12. One solution would be to have some kind of keyword filtering for submitted URLs. Many of the spammer’s URLs contain the name of a drug or some disgusting act, so they could simply reject any URL containing a banned word before it even enters the system. For other URLs that pass, it could use some kind of blacklist and periodically remove any undesirable links.

    Comment by Mike — November 19, 2007 @ 10:06 am PST

  13. I have my own URL shortening system too (http://urlb.at), which also counts actual hits to the shorter URL. I did it for my own sake, after getting an idea to give it a go.

    Today I removed over 3000 links from it which were clearly spam - the url was easy to spot.

    Maybe I’ll give some thought on how to create a kind of ‘Akismet’ like ‘blacklist’ to help stop these even getting submitted into the database.

    I think that approach could be interesting for all of us who build these things ;)

    cheers!
    Kosso

    ps: what’s the shortest url/domain you own? Mine is shw.ag ;)

    Comment by Kosso — November 19, 2007 @ 11:24 am PST

  14. Haven’t bought domains outside of .com and .net. And with .com we have several five digit dot coms, Kosso. The counter is a good thing, I’m using that for metrics. Dual purpose! :)

    Comment by TDavid — November 19, 2007 @ 11:33 am PST

  15. Kosso: What percent of those 3000 were listed on SURBL? URIBL? (and why not just use/publish to them instead of trying to rebuild one?)

    Comment by miles — November 19, 2007 @ 11:35 am PST

  16. miles: they were ip based urls all pointing to some (now) broken page (on ‘FastBox’? I think it said)

    I try to avoid dupes in the url database by looking it up first. It looks like this user had used my simple REST API and simply added a random number to the end of each request. Cunning little swines! :)

    If I could be bothered, I’d have the system go and grab the head of the page too and make some kind of comparison too.

    S’funny - I just saw also that someone had used urlb.at to shorten a tinyurl.com url. hehehe. some people, eh? :))

    Comment by Kosso — November 19, 2007 @ 11:43 am PST

  17. Might then try adding spamhaus to the lookup list; and think about banning IP addresses period — seems like very very few reasons to link directly to IP. Btw, it’s trivial for the bad guys to use server rewrite rules to make any random #s/words/characters yield the same page..

    Comment by miles — November 19, 2007 @ 11:50 am PST





Copyright 2003-2008 KMR Enterprises All Rights Reserved