Thursday, February 28, 2008

Thoughts on CAPTCHA

I really dislike those little CAPTCHA things that make you read messed-up letters and numbers and type them into a box in order to post something. I know, some people see them here, but I didn't write the blog software.

I imagine that few people actually enjoy filling out CAPTCHA boxes, but the alternative is horrible. No one wants to wade through lots of unrelated comments that are really just attempts to either game the search engines or lure you to some shopping website that will never actually ship you any product if you are foolish enough to buy something.

So, while looking around for some CAPTCHA alternatives, I had these thoughts:

  1. Most of the alternatives seem to expose character-based data. If they ever get popular at all, the bots will be all over them, just like they were with some of the early CAPTCHAs.
  2. How easy would it be to defeat a comment submission page written in Flex/Flash? It seems like that might be hard to script a bot for, but it could probably be defeated with a mouse/keystroke recorder of some sort if there were no challenge on the page.
  3. This one is my favorite. Why not do away with CAPTCHA altogether and run your comment moderation through your email spam/Bayesian filters? Comments get deleted if they aren't approved within a month, and you could set up a rule to automatically accept any comment that makes it through the spam filter. The spam filter on my Gmail account works terrifically. I think I might actually be able to set that up here. Maybe I'll give it a whirl if I run out of things to do.
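
To make idea 3 a little more concrete, here is a rough Python sketch of a filter that judges a comment purely on its words. It is a plain naive Bayes classifier and only an illustration: the names `CommentFilter` and `moderate` and the 0.9 threshold are made up for this example, and it says nothing about how Gmail or Blogger actually work inside.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class CommentFilter:
    """Toy naive Bayes filter trained on comments labeled 'spam' or 'ham'."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        """Record the words of one hand-labeled comment."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | words) with add-one (Laplace) smoothing."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        words = tokenize(text)
        log_scores = {}
        for label in self.LABELS:
            # Smoothed prior, so an untrained label never yields log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Turn the two log scores into a probability without underflow.
        top = max(log_scores.values())
        spam = math.exp(log_scores["spam"] - top)
        ham = math.exp(log_scores["ham"] - top)
        return spam / (spam + ham)

    def moderate(self, text, threshold=0.9):
        """Return 'hold' for likely spam, otherwise 'approve'."""
        return "hold" if self.spam_probability(text) >= threshold else "approve"
```

You would train it on comments you have already approved or rejected by hand, then call `moderate()` on each new comment and only look at the ones it holds. With just a handful of training comments the guesses will be shaky, which is exactly the mis-categorization risk that comes with judging on content alone.
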
Any thoughts?


  1. I like that idea, but... take Gmail, for example. I thought the majority of their filters are based on existing statistical data: where emails come from, whether the sender is spoofed, whether the sending mail server is an open relay, those types of things. But when working with comments, those types of things are non-existent... I'd love to see how it might work.

  2. You're right Don, I really don't know how it would work at this point. I'm sort of describing the outside of a pocket calculator. I have no idea what's going on inside the box here.

    On Blogger, if you turn on comment moderation (like I have here), a series of emails gets fired off to moderate the comments. The roadblock for this, of course, is that the spam filter would have to work almost entirely by parsing the content with Bayesian filters. There could be a higher incidence of miscategorized comments, and that would be a bit of a problem. The other issue is that I have to click at least once on a link in the message to approve a comment. It would be cool if I could set up a mail filter that would automatically approve messages somehow if they made it to my inbox. Maybe a catcher for a forwarded message or something like that could be set up, but it doesn't exist here now as far as I can tell.

  3. There's a third-party service called Akismet that does comment spam checks. There is a plugin for WordPress, but I know nothing about Blogger myself, so I don't know if it can be integrated.