Things that ... make you go hmm: technology, music, video, art, news, reviews and muse on the web

June 13, 2006

War of the short URL worlds

developers, spam — by TDavid @ 9:14 am PST

In 2004 I decided to create my own URL shortening service. The plan was that rather than use somebody else’s short URL service (there are plenty of them), I’d use my own in email correspondence, forum posts, anywhere that I needed to shorten a long URL. I registered the domains tdurl.com and the alternate tduri.com, created the database and the small script that powers the program, and began using it with little fanfare. I never even posted about the launch here, although I have mentioned or used it in a half dozen posts according to the search.

Clearly this hasn’t been something I’ve promoted for others to use, although you are more than welcome to use it for legitimate purposes. It was really something I intended mostly for myself, friends and anybody who came along after following the shortened URLs. I’ve even thought about making the code behind it completely open source so that others could create their own short URL services. Unfortunately, lately it’s been getting pounded by spammers.

At first I thought it was just a couple of spam URLs, but it was worse than that.

Recently I’ve been getting email notifying me of how severe the problem is in some other places, and I have had to put on my cop’s hat and spend about half as much time as it took to write the program cleaning it up and further securing it against use by spammers. Fortunately, I set up click count tracking from the beginning so I could see which redirects are getting the most activity and follow them. A couple of database queries and I had all the information needed to jump into anti-spam mode.
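For illustration, here is a minimal sketch of click count tracking and the kind of query that surfaces the busiest redirects. It uses Python with an in-memory SQLite database. The table name, column names and sample rows are all assumptions made up for the example, not the actual schema behind the service.

```python
import sqlite3

# Hypothetical schema: one row per short URL with a running click counter.
SCHEMA = """
CREATE TABLE redirects (
    code     TEXT PRIMARY KEY,
    long_url TEXT NOT NULL,
    clicks   INTEGER NOT NULL DEFAULT 0
)
"""

def record_click(conn, code):
    """Bump the counter each time a short URL is followed."""
    conn.execute(
        "UPDATE redirects SET clicks = clicks + 1 WHERE code = ?", (code,)
    )

def top_redirects(conn, limit=10):
    """Return (code, long_url, clicks) rows, busiest first."""
    cur = conn.execute(
        "SELECT code, long_url, clicks FROM redirects "
        "ORDER BY clicks DESC, code LIMIT ?",
        (limit,),
    )
    return cur.fetchall()

# Made-up sample data, loosely echoing the numbers in the post.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO redirects (code, long_url, clicks) VALUES (?, ?, ?)",
    [
        ("a1", "http://example.com/article", 12),
        ("b2", "http://affiliate.example/?id=999", 140000),
        ("c3", "http://example.org/photos", 3),
    ],
)
record_click(conn, "a1")
busiest = top_redirects(conn, limit=2)
```

A redirect that suddenly dwarfs everything else in this list is the first place to look when hunting for abuse.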

Redirecting the redirect
During my investigation I noticed spammers taking affiliate links, running them through my short URL service and then spamming the shortened URL everywhere. Several of these URLs were posted on craigslist, with just one redirect being clicked well over 140,000 times (heaven knows how many people saw that link; that’s just how many clicked it). I also found some 127.0.0.1 links which, for those not familiar, point back at the clicker’s own machine (localhost). Clearly, some were at least attempting to use my service for nefarious purposes.

My first line of defense was to cut off the money source by redirecting these spammy redirects. When/if the spammers realized that their links would be changed, using my service in the future would be futile because I’d just redirect their redirects back home. The hope was that they would go off and find some other short URL service that would be less diligent. The problem cleaned itself up almost overnight, literally and figuratively, but I’ve had a couple of stragglers still banging away trying to figure out how to poke holes in the program so they can use it to spam.
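The idea of "redirecting the redirect" can be sketched in a few lines of Python. This is only an illustration under assumptions: the homepage URL, the dictionary of redirects and the set of flagged codes are hypothetical stand-ins for whatever the real script and database use.

```python
# Hypothetical homepage; flagged short codes are sent here instead.
HOME_URL = "http://shortener.example/"

def resolve(code, redirects, flagged):
    """Return the URL a short code should redirect to.

    Codes flagged as spam (and unknown codes) go to the service
    homepage, so the spammer's affiliate link never gets the click.
    """
    if code in flagged or code not in redirects:
        return HOME_URL
    return redirects[code]

# Made-up sample data.
redirects = {
    "a1": "http://example.com/article",
    "b2": "http://affiliate.example/?id=999",
}
flagged = {"b2"}

clean_target = resolve("a1", redirects, flagged)
spam_target = resolve("b2", redirects, flagged)
```

Keeping the flagged code alive rather than deleting it means every copy of the spammed link already out there lands on the homepage, which is what takes the money out of it.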

Where to send the clickers?
I thought about creating a blackhole directory like the one that exists here at Hmm (not linked) and is intended for malicious bots, but then I decided instead to just redirect the people clicking to the service homepage, which has in bold red text “please do not spam” as one of the instructions for use. My gut reaction isn’t to be so nice with spammers and give them more of a reaming, but it’s possible that people don’t realize that masking an affiliate URL and then putting it in places where other sites have strict rules against spam is still spamming.

There is no advertising or any money being made whatsoever from my free URL shortening service. Never was and currently isn’t as of this writing. It wasn’t done to make money and now in some ways I’m starting to regret sharing it with some others for free. I have to wonder why there are any free URL shortening services out there if others are experiencing what I have without promotion and advertising.

The ironic hidden value experience
The flip side is that the spammers have made the service better. They’ve forced me to turn the code into being more anti-spam than it was when it was first put out there. I can see how creating, executing and maintaining a free URL shortening service could be a good learning lesson for somebody researching and learning about anti-spam techniques and technology. That wasn’t one of my initial goals, but lately anyway, that’s become one of the realities.

While my service isn’t as full featured as some other URL shortening services, folks can subscribe to an RSS feed of the most recently clicked URLs, which at least at the time I created the service was somewhat progressive. If I really wanted to get serious about it, I would add even more features, including some way to group and remember the short URLs users have created. I’m starting to ponder that a little more seriously now that I’ve been spending more time back in the database and code.

What you gonna do when they come for you? Bad boys, bad boys …
When I started removing some of these spammed URLs and techniques, as well as blocking the ability to sign up anew, I started noticing the spammers using multiple layers of redirects. Sneaky. They would use a service like tinyurl to mask their URL from my service, so next I had to block all the other short URL services out there too. What legitimate reason would somebody have to run one short URL service through another short URL service anyway?
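A check like this might look something like the following Python sketch, which refuses long URLs whose host belongs to another shortening service. The blocklist here holds a single entry for illustration; a real list would be much longer and is an assumption on my part, as are the function and variable names.

```python
from urllib.parse import urlsplit

# Illustrative blocklist only; a real one would name many more services.
BLOCKED_SHORTENERS = {"tinyurl.com"}

def is_shortener_url(long_url):
    """Return True if the URL's host is a blocked shortener or a subdomain of one."""
    if "://" not in long_url:
        # Treat scheme-less input like "tinyurl.com/abc" as http.
        long_url = "http://" + long_url
    host = (urlsplit(long_url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain)
        for domain in BLOCKED_SHORTENERS
    )
```

Matching on the parsed host rather than searching the whole string avoids false positives on URLs that merely mention a shortener in their path.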

Next I had to block URLs from a couple of foreign countries, as I noticed a high percentage of these spam URLs coming from certain countries. I also decided to block IP address URLs. I realize this eliminated some legitimate long URLs that don’t have domain names and only use their IP address, but the investigation revealed that the vast majority of the long URLs that didn’t have domain names were spam.
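As a rough sketch of those two rules in Python: the function below rejects long URLs whose host is a bare IP address (which also covers the 127.0.0.1 links mentioned earlier) and URLs under a blocked top-level domain. Blocking by top-level domain is only one possible reading of "blocking countries" and is an assumption here. The post doesn't say which countries were blocked, so the reserved `test` TLD stands in as a placeholder, and the function name is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit

# Placeholder only: "test" is a reserved TLD, used here because the
# post does not name the country domains that were actually blocked.
BLOCKED_TLDS = {"test"}

def rejection_reason(long_url):
    """Return a short reason string if the URL should be refused, else None."""
    host = urlsplit(long_url).hostname
    if not host:
        return "no host"
    try:
        # Succeeds for IPv4 and IPv6 literals such as 127.0.0.1 or ::1.
        ipaddress.ip_address(host)
    except ValueError:
        pass  # Not an IP literal; carry on with the domain checks.
    else:
        return "ip address"
    if host == "localhost":
        return "localhost"
    tld = host.rsplit(".", 1)[-1]
    if tld in BLOCKED_TLDS:
        return "blocked tld"
    return None
```

Returning a reason rather than a bare True/False makes it easier to log why a submission was refused and to see later which rule is doing most of the work.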

Free web services have hidden costs
If you are developing a free, web-based service, expect to spend more time policing it than if it were a commercial service. If I’d made the service free for me and $1/month for everybody else, I’m curious what the difference in spammer usage would have been.

Something to add to the cost of all these new web-only startups because you just know some of them are spam magnets too. It might be free to netizens, but the labor and resources in keeping these services running is a whole other story.

16 Comments

  1. Maybe the spammers would pay the dollar. Everyone knows they’re rolling in the money anyways. They PayPal you the payment, and you’d never know the difference unless you were browsing the logs regularly for the shortening service.

    Comment by darkmoon — June 13, 2006 @ 9:49 am PST

  2. I’m reminded of what Nick Bradbury said: “Any new Web 2.0 company that hasn’t considered the spam problem automatically isn’t worth my time.”

    Any free public service will be cased by spammers for every advantage they can take of it. Even a WordPress blog with Akismet enabled requires some spam monitoring (although it’s mostly the gleeful “got em!”). Definitely any startup needs to build the handling of the spam problem into their business plan.

    Comment by Sterling Camden — June 13, 2006 @ 1:11 pm PST

  3. darkmoon - a lot of what spammers do is automated trash, as I’m sure you know. I could have made it more difficult by adding CAPTCHA, but then that ruins the accessibility. Add a dollar figure, even a small one, and you do cut off a percentage of spammers who can find easier prey for their automated spamming.

    I agree, Sterling.

    Although my plans weren’t to provide a service to the masses really (nor especially anything web 2.0ish) which is something that didn’t use to be as big a deal as it is now (I guess that’s the bigger thing that made me go hmm with this particular experiment). Those who say the spam problem has gotten better are just not seeing the same internet that I’ve been seeing. More a statement on those building free services for themselves primarily to also be on the lookout.

    If we put up any page, anything public really online these days we have to build some degree of anti-spam protection into it or risk being abused.

    Comment by TDavid — June 13, 2006 @ 2:27 pm PST

  4. TDavid: So you’re saying that they’d rather just get it free instead of paying? I’m just wondering if some spammers would actually pony up the money and then make you an accomplice. $1 per url isn’t bad. Unless I misread what you said about the $1. :)

    Anyone that thinks spam is getting better has never looked at my logs. For the most part, it’s just my blog on it with some extraneous test things I do on it, and it gets hammered hard lately by trackback spam. Even with the software catching it, it’s still useless bandwidth wasted for no reason. I make it a point to block ranges of IPs, but that’s not exactly the way to go if you’re a business or trying to set up a service.

    Comment by darkmoon — June 13, 2006 @ 2:43 pm PST

  5. Yes, I believe adding a price tag makes it less desirable to spammers, but it’s not just the cost aspect, it’s the data that goes along with paying that is perhaps a bigger deterrent.

    I think spammers love low hanging fruit. Some of the worst spammed services out there are targeted by them. Check out blogspot. Part of the reason they use that for splogs is because it is so easy to set up an account anonymously. I wonder if the fact that I didn’t advertise the service, didn’t push or promote it made it actually more spammer desirable because they figured (rightly so) that they might catch me off guard. Catch a lightly moderated and monitored offering. Those are the most susceptible projects to abuse.

    As part of the $1 idea, the concept, I guess, is that the more information gathered about the spammer, the more can be used against them. I wasn’t thinking micropayment or PayPal; I was thinking of gathering full information on users. Obviously most people wouldn’t want to use or pay for such a service because they can get it for free, so I’m saying going commercial could be a form of spam deterrent in itself.

    Sure, the spammer could use stolen information to sign up but again, this would make it less desirable.

    Not something I’m seriously considering here, just thinking aloud. Maybe it’s a really wrong theory. What do you think?

    Comment by TDavid — June 13, 2006 @ 2:53 pm PST

  6. Just thinking out loud, but back in the day when I was doing “underground” stuff because it was fun and cool, there were many levels of what the government would consider these days as “cyber terrorists”. But there was an ethical code that certain people crossed, and others did not.

    In the example you put above, I’m making the assumption that the worst case scenario is deterred (those that use the information gathered for identity theft and thus buying services not using their money). But what about legitimate spam operations (go with me here, cuz we all hate spam)? These are the ones that don’t cross the boundaries of ID theft, but walk the fine line of “mass marketing”. The darker side is you won’t ever stop the botnets, but what if they’re on the up and up and give you the $1 and their business information?

    I’m going out the way here, so if it gets too unrealistic, just stop me.

    Comment by darkmoon — June 13, 2006 @ 3:01 pm PST

  7. Good conversation, darkmoon. I think it’s important to flush out the really bad spammers first. Borderline spammers need to be educated and hopefully reformed, but the really bad ones have to be the first target. I suspect borderline ones for at least the service mentioned in this thread will go away after they find their work has been ruined (the link that was supposed to earn them money now redirects to my site). Cut off the money source and it takes the fun out of things in a hurry for these guys.

    That is the ones without mass scripts running. Scripts don’t care. They can run all day and night and they have no morals or ethics unless programmed by us to have them.

    If these spammers are giving me their information and then use the service to spam, they are breaking the agreement and then it boils down into how we deal with that from a legal standpoint. That gets more into the part where I’d turn that information over to the attorneys and let them advise us as to how to proceed.

    My problem with existing spam is that unless there is some sort of feet to the fire concern, they will just keep on doing it. I’m not sure there are enough deterrents out there with the current internet climate. It would be good to have it be a much more treacherous environment for spammers to operate on the web and thus create a greater risk proposition for them. Until the web operates that way, spammers will continue to proliferate and we’ll just keep complaining about the problem in a neverending cycle.

    Here’s my concern: I set up a service and put my name on it, and then it gets used and abused with my name on it and ends up putting my name in places that damage the brand. Spammers like their anonymity. If they feel this anonymity could be breached when it’s really unnecessary (other easier, less risky targets), I think that’s a risk not as many (I didn’t say all of them) are willing to take. You are correct it wouldn’t stop all of them, but that’s not my immediate goal; I’m thinking of stopping most of them. If you can get the noise level down to a small amount, it’s manageable.

    That’s the hypothesis, anyway, I’m not sure how sound this actually is in reality. One thing I do know is that things have gotten worse, not better.

    Comment by TDavid — June 13, 2006 @ 3:37 pm PST

  8. If you ask me, if you can detect waves of spam coming (like I do with my blog server), then I don’t see how Tier 1 routers couldn’t also detect traffic like that from traffic analysis. If I had anything to do with it, I’d turn off every single compromised IP address until the issue was fixed and the owner of that computer was taught how to use the device correctly.

    Unfortunately, Tier 1 along with even Tier 2/3 ISPs don’t give a rat’s butt about spam. I contacted University of Minnesota’s IT department just recently about one of their computers being compromised and trackback spamming my server to hell and back. The IT staff was like: we care about spam, please send us the message with headers.

    UHHH…..

    Currently, I have a whole bunch of IP ranges that are blocked due to spam. Blackholed them, but doesn’t protect me from a DDOS.

    If vendors like AT&T are truly “filtering” our networks anyways with net neutrality out the door, then I fully expect them to do this anyways. If they’re going to QoS things, it doesn’t take much to place a couple more rules to do some spam detection on the packets that are going through. Of course, that makes us no better than the Great Firewall of China, but obviously no one cares about that one. :p

    Comment by darkmoon — June 13, 2006 @ 4:10 pm PST

  9. Actually now that I think about it… this goes back to my ideas during my college years in specializing in embedded wireless networking and security. I think that ultimately, the best way to detect is at the router itself. If you ran a distributed network of routers that all talked with one another and compared the traffic patterns, then you could definitely catch on to the scripted spamming. No matter how sophisticated the spamming technology is driven, there is no way to truly randomize traffic. Since the Tier 1 routers route pretty much all of the traffic, they would have the largest “data pool” to analyze from and eventually detect even the most minute of patterns for spam headers.

    I know I know, I’m starting to sound like the government with their terrorist detection methodologies, but it’s not terribly different. I just find that if you apply a mathematical model to human beings, it doesn’t quite work the same way as pattern detection with e-mail. Along with this, combine it with current known IPs or compromised spam originations and you could have yourself a pretty powerful anti-spam bottleneck.

    Well, I can dream at least. :p

    Comment by darkmoon — June 13, 2006 @ 4:16 pm PST

  10. Have you implemented SURBL?

    Comment by Kevin — June 13, 2006 @ 7:16 pm PST

  11. Actually, that link doesn’t work without the www.

    http://www.surbl.org/

    An Open Letter To Operators Of Redirection Sites

    http://www.surbl.org/redirect.html

    Comment by Kevin — June 13, 2006 @ 7:30 pm PST

  12. No Kevin, hadn’t even heard of it until you shared. Thanks, will check it out.

    Comment by TDavid — June 13, 2006 @ 7:38 pm PST

  13. You may want to try an approach similar to gostubby.com; they are excellent at fighting spam.

    Urls that redirect are not allowed.
    Urls containing common spam words are not allowed. etc…

    There are plenty of ways to code this, but they seem to be pushing the bar when it comes to fighting spam with URL forwarding ;)

    Comment by big j — July 11, 2006 @ 11:12 pm PST

  14. […] Running one of these short URL services — and keeping it clean — requires even more aggressive filtering and monitoring than checking your email box. I lamented these challenges trying to keep these services spammer free back in a post here July called War of the Short URL Worlds. The activity had gotten progressively worse as spammers tried to hide their activities behind the short URL service which culminated in that post. For awhile I was getting daily emails alerting me that my short URL service was being used for spamming. Fortunately with the changes made over the past few months it has improved slightly. I also had to ban uses of other shorten URL services like TinyURL which were being used as multiple level spam redirection. Sound like fun? […]

    Pingback by David Berlind, here is the dark side to URL shortening services » Make You Go Hmm — December 3, 2006 @ 12:30 pm PST

  15. I invite them to try to use our service for spam. The filters will catch them, and if they don’t, it gives us all the more reason to make them even more intelligent. The key to avoiding not all but most spam when running a service like this is not allowing redirects, not allowing anything except the HTTP 200 code, and spidering the page for stop words. url-z.com has accomplished this.

    Comment by Steve — December 19, 2006 @ 2:16 pm PST

  16. It’s just hard to keep spammers away from our sites.

    Comment by Rino — June 12, 2007 @ 11:51 am PST


TrackBack URI: http://www.makeyougohmm.com/20060613/3437/trackback/


Copyright 2003-2008 KMR Enterprises All Rights Reserved