Everyone has been very supportive in helping me deal with the comment spam issue. Thanks everyone.

We've installed the MT-Blacklist plug-in for now. I'm generally against blacklist-type filters, but it looks like the best solution for now. I will wait for MT Pro to deal with it in a more elegant way.

I thought my troubles were over when I got two comments just now with "interesting..." and "page-rank?" on my last two posts. The links were to a casino site. These comments were probably not machine scripted like the other comment spam, but they added no value to the comments and the casino site URL made me feel that they had posted the comments for the purpose of trying to steal page rank on Google. I have a feeling some bloggers also post comments on my blog just to get links to their sites.

My current policy on this issue is, if you post something on my blog that clearly adds no value to the conversation and if your URL is a gambling site, a porn site, a pharmaceutical site or some other obviously spam-friendly commerce site, I will delete the post and add you to the blacklist. I will discourage bloggers from posting opinion-less or off-topic posts just to get links. I continue to encourage people to post their opinions whether they are supportive or critical and of course I will not delete critical comments.

My policy may change, but this is it for now.

19 Comments

James Seng from Singapore just made an MT plugin to block spam using a Bayesian algorithm. I will install it and see how it works.

I just think that a simple challenge/response type of confirmation (for instance, showing a number as an image and asking the commenter to type that number in) when your comment is posted may work well against automated comment spammers.
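A minimal sketch of the challenge/response idea described above, in Python. Everything here is an assumption for illustration: the names `make_challenge`, `check_response` and `SECRET_KEY` are hypothetical, and drawing the number as an image is left out. The comment form would show the number and carry the token in a hidden field.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; it must never be sent to the browser.
SECRET_KEY = b"replace-with-a-private-key"


def make_challenge(digits=5):
    """Return (number, token). The number would be shown to the commenter
    as an image; the token travels with the form."""
    number = "".join(secrets.choice("0123456789") for _ in range(digits))
    token = hmac.new(SECRET_KEY, number.encode(), hashlib.sha256).hexdigest()
    return number, token


def check_response(typed, token):
    """True if what the commenter typed matches the number behind the token."""
    expected = hmac.new(SECRET_KEY, typed.strip().encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token)
```

As written, a solved number and its token could be replayed, so a real form would probably also need an expiry time or a one-time token.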

Suggest you nip this in the bud right now through use of robots.txt to tell search engine crawlers not to index your comments. We did that with our referer pages and it worked like a charm. The spammers went elsewhere.
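A rough sketch of what such a robots.txt rule could look like, checked with Python's standard `urllib.robotparser`. The `/comments/` path, the `ExampleBot` agent name and the example URLs are assumptions, not taken from any real blog, and the rule only helps if comments are served from their own path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask all crawlers to stay out of /comments/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /comments/
"""


def crawler_may_index(url, robots_txt=ROBOTS_TXT, agent="ExampleBot"):
    """Return True if a well-behaved crawler is allowed to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(crawler_may_index("http://example.com/comments/1078.html"))
print(crawler_may_index("http://example.com/archives/entry.html"))
```

Note that robots.txt is only a request: it removes the search-ranking reward for well-behaved crawlers but does not stop anyone from posting.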

I used a .htaccess solution (mmm, .htaccess) on Vancouver Webloggers. So far no comment spam.

http://eliot.landrum.cx/archives/2003/05/27/more_complete_blockage.php

Dave, that might work if everyone does it (if only a few do it the spammers will still spam indiscriminately) but then you're also penalizing legitimate commenters. First rule, do no harm.

I am testing both MT-Blacklist and James' equally excellent MT-Bayesian on my blog. I am in contact with both developers and both seem pretty gung-ho about making their offering solid (40+ hour coding marathons anybody?).

Mr. Winer's suggestion doesn't work on this (and many) MovableType and TypePad blogs since comments are archived on the same "pages" as the entries themselves. Hehe, seems the "elsewhere" they went is us. ;)

I'd like to say also that this is not a problem that can be "fixed". We are in an ecosystem and a new predator has been introduced. We must adapt and evolve and learn to live with the threat. So far we are doing quite well: we've identified the threat, identified its current methods and developed defences against it. Spammers will naturally escalate. And so it begins.

Trackbacks are next. I predict within a week or two.

So if I comment about really wanting to find power puff girl panties on a post about wikis and FireWire will I be blacklisted?
just trying to be clear babee :P

Dave, yeah, this is actually something I've talked to Howard Rheingold about too. I do like my comments indexed because it makes this place more of a community. On the other hand, it does attract the spammers. This is a classic community scalability issue I think.

Chey, you will never be banned from my blog. You're the punctuation in the discussion. ;-)

I've never been a punctuation before it kind of sounds groovy though.

the fact that I will never be banned from you is way more groovier indeed :P

*mwah*

If there was a way of verifying urls of posters by seeing if they were listed in something like Technorati it would be one step of filtering.

Do I need to be listed as a blogger on technorati to voice an opinion on this site? I think that's an awful idea.

Robot exclusion mechanism is overdue for an update. While many of us bloggers are enjoying pagerank prominence, we need to fix this ASAP.

Luckily I seem to be escaping this plague at the moment, so I haven't been following it as closely as I should. I *think* I read somewhere the other day that one of the big issues was people using the default MT directories, and that modifying those cut down a lot of the problems. Any validity to that?

Joi, I agree with you that it will take a major mod in Movable Type to really come up with more effective solutions (i.e. more sophisticated comment management).

I dislike blacklisting with a passion and wrote about this at http://weblogging.forpoets.org. And since this link is on topic and doesn't mention underwear -- I don't think -- I should be okay with your comment policy ;-)

Boris, I really like what you had to say -- this type of problem is a fact of life of open systems. Which means we need to keep this in perspective and not slap blacklisting over all the openings. For instance, I hope no one is using IP banning.

If we value an open system, then we can't necessarily opt for quick, global tech fixes; because I'd rather let in all the spam, than exclude a genuine voice with something important to say. Matt's first rule -- do no harm.

As for using robots.txt -- talked about this too in the posting I linked. However, I like my 'good' commenters to get nice Google buzz, so it's not something I'm partial to.

There might be another mechanism to add for fighting comment spam; I was recently thinking of introducing Bayesian Filtering. It has been discussed at Paul Graham's site. To sum up: it is about content-based filtering. The mechanism can be trained with a basket of good and one of evil content.

As the implementation is not yet finished in PHP and plugged into the blog, I cannot yet prove that it is efficient. But there are other implementations (Perl, C etc.) - might like to have a look in the Wiki - to give others a chance to prove the concept maybe ;-)
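To make the "two baskets" idea concrete, here is a small naive Bayes sketch in Python. It is a simplified textbook version with add-one smoothing, not Paul Graham's exact formula and not any existing plugin's code; the class name `CommentFilter` and its methods are hypothetical.

```python
import math
import re
from collections import Counter


class CommentFilter:
    """Toy content-based filter trained on 'good' and 'spam' comments."""

    def __init__(self):
        self.words = {"good": Counter(), "spam": Counter()}
        self.docs = {"good": 0, "spam": 0}

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Add one comment to the 'good' or 'spam' basket."""
        self.words[label].update(self.tokenize(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        """Estimate the probability (0..1) that the text is spam."""
        vocab = set(self.words["good"]) | set(self.words["spam"])
        total_docs = sum(self.docs.values())
        log_score = {}
        for label in ("good", "spam"):
            # Prior from basket sizes, with add-one smoothing.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            total = sum(self.words[label].values())
            for word in self.tokenize(text):
                count = self.words[label][word]
                score += math.log((count + 1) / (total + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(min(log_score["good"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

A blog would train it on comments the owner has already kept or deleted, then hold back new comments that score above some threshold the owner picks.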

Terrific discussion here on the comments spam issue.

IMO we need to also voice our concerns to the search engines and directories. They want to stay on top of the latest tricks and they'll outright ban sites that use tactics to attempt to outwit the search engines.

Over the last several years people have tried all kinds of things to boost search engine rankings, as I'm sure you know. Comment spam is just one of the most recent ploys. It seems more invasive to me, though, since they also leave their garbage at our websites, not just their own.

I thought I saw an open letter to the search engines and directories posted at someone's weblog but I can't find it tonight... rats. And I thought I'd bookmarked it, too. If/when I find it I'll come back here and post the URL in case anyone's interested.

"There might be another mechanism to add for fighting comment spam; I was recently thinking of introducing Bayesian Filtering."

Yes, it has been done. See http://james.seng.cc/archives/000152.html

Boris, I did not spend 40hrs doing this. It is a quick hack which took an estimated 15hrs spread across 3 busy days. Don't make Bayesian look so hard to do...it isn't.

There are various solutions, but the most scalable, useful methods for dealing with this are a combination of things:

1. Moderated comments. (I've never seen this mentioned!) This would probably work for 90-95% of users. Takes a few minutes a day to scan the latest comments, allow the ones you want, remove the rest.

1B. Going along with item 1, an easy link to ban the IP Addresses of the spam links.

2. Challenge Response with an image.

3. User registration.

All 3 need to be part of the next major release of MT and which solution the user decides to turn on or off is an individual preference decision. I doubt the blacklist will last/work long-term. And I doubt that any other revolutionary, useful solutions will come up.
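Items 1 and 1B above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Movable Type works; `Comment`, `ModerationQueue` and their fields are made-up names, and the IP addresses in the usage lines come from the documentation-only range.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author: str
    ip: str
    text: str


@dataclass
class ModerationQueue:
    pending: list = field(default_factory=list)
    published: list = field(default_factory=list)
    banned_ips: set = field(default_factory=set)

    def submit(self, comment):
        """Hold a new comment for review; refuse it if the IP is banned."""
        if comment.ip in self.banned_ips:
            return False
        self.pending.append(comment)
        return True

    def approve(self, comment):
        """Move a reviewed comment onto the public page."""
        self.pending.remove(comment)
        self.published.append(comment)

    def reject(self, comment, ban_ip=False):
        """Drop a comment and optionally ban its IP in the same step."""
        self.pending.remove(comment)
        if ban_ip:
            self.banned_ips.add(comment.ip)


queue = ModerationQueue()
ok = Comment("Reader", "192.0.2.10", "An on-topic opinion.")
junk = Comment("x", "192.0.2.66", "interesting...")
queue.submit(ok)
queue.submit(junk)
queue.approve(ok)
queue.reject(junk, ban_ip=True)
```

Nothing shows up on the blog until the owner approves it, which is the whole point of item 1; the ban in 1B is a convenience on top.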

Forums have been around for a long time and these are the major tools of the trade. When I first installed MT, I was astonished that at the very least moderated comments weren't an option. It was so obvious this would happen, I'm surprised it took so long. And of course spam isn't the only reason for moderated comments. I would love to be able to remove the swear words before they ever show up on the blog.

I'll be really interested to see how you go with this - it's an issue that is starting to get out of hand at my blog.

Why is it so impossible to make everything?

10 TrackBacks

Listed below are links to blogs that reference this entry: Policy on comment spam.

TrackBack URL for this entry: http://joi.ito.com/MT-4.35-en/mt-tb.cgi/1078

from Sync A World You Want To Explore
October 16, 2003 12:30 PM

[Remarks] This entry is written in Japanese describing what's the Bayesian Spam Comment Filter for MovableType is and how to use it. Read More

3 Things from Jason in the City of Fallen Angels
October 16, 2003 1:22 PM

1. The moving of the West Wing tonight was just lame. 2. The new AOL 9.0 commercial that mocks movie trailers is pretty damn funny. AOL still sucks though. 3. Comment spam is finally getting its day in the light.... Read More

Jay Allen's MT-Blacklist has been made available to help combat comments spam. There's plenty of praise and comments at his post, MT-Blacklist: Stop Spam Now, too. Since I'd already implemented Jay's comment spam modules I've seen firsthand how incredi... Read More

My first comment spam from Jamie Jamison On Technology
October 23, 2003 10:40 PM

Well, I got my first bit of comment spam today. I deleted the comments and added the person to my ban list. I have read a lot of posts on other blogs about comment spam - it is a phenomenon Read More

I'm sorry to see that many others are having problems with so-called comment spam. This is where some advertiser places... Read More

Comment Spam from Dawning Awareness
October 26, 2003 11:25 AM

I had a milestone in my blogging life today. I got my first comment spam. NOW, I understand what all the hubbub is about. I feel so violated! I'll have to find time next week to investigate the MT blacklist plugin.... Read More

My contribution to warding off comment spam: reduce its value to the spammers by breaking their URLs. The blog owner (and trusted friends) can keep their URLs intact by adding a password to their comments. This doesn't stop someone from... Read More

First posted here. Everyone has been very supportive in helping me deal with the comment spam issue. Thanks everyone. We've installed MT-Blacklist plug in for now. I'm generally against blacklist type filters, but it looks like the best solution for... Read More
