The Captcha dilemma

Today, the internet is full of bots that flood our communication tools. They spam us with ads or links in order to give some websites more visibility. All this spam consumes a lot of resources all over the world: the programmers who develop the bots and those who fight against them, electricity, network resources, memory and storage.

Some people had the idea of fighting those bots with a Turing test, in order to tell human beings and computers apart. The problem is that a captcha must be hard enough for computers but easy for humans, which is not an easy balance to find. After some time, Google Inc. did a good job at it and let anyone use its captcha to fight bots. But today, we all know that this tool is no longer neutral. It is used by a company to improve its A.I. for free, collecting a lot of information about the people solving the captcha, and doing so for bad purposes like war and money.

This is why I tried to get something out of the current situation...

What is a CAPTCHA for?

CAPTCHA is an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart".

It's a test that should be able to tell whether the solver is a human or a computer, by posing a problem that is very hard for a computer to solve but easy for a human. As computers evolved, the problems had to evolve too, because today many AIs are able to solve such tests.

At first, computer "vision" was very poor and computers were unable to "read" text in images, so the first captchas were simply some text in an image. Over time, programs became able to read such text (this is called OCR: Optical Character Recognition), so captchas started to add some "graphical noise" around the text, and then the text was distorted in order to defeat the computers' algorithms.

Finally, such texts were modified so much that even humans had difficulty reading them.

Then came Google and its exciting idea: a captcha can help an AI by giving it good answers to a problem the machine can't solve. As a result, the captcha is now used for two distinct goals:

  • Service providers don't want bots on their instances consuming resources
  • Google wants its AI to improve for free

At the beginning, it started with scanned text from books, then it was house numbers from the street, later on road signs and cars, and today it's claimed that the AI can detect whether you are a human or a bot!

It could all be good if it weren't a way to make users work for free in order to gain more money and data, and to improve AI that kills and gets people killed, while masquerading as a good-tech company.

Important points

There are a lot of possible ways to build a captcha. I'm going to discuss many of them (if not all; I'm open to new ideas ;o) ).

Which problem to use?

In order to "detect" bots, you can also "detect" humans. The problem displayed to users has to be hard for computers and easy for humans. But as time passes, computers become more and more effective and can solve many problems. Fortunately, there are still some areas where they have more difficulty, like vision and understanding.

Vision problem

A vision problem can be of multiple types. Currently, Microsoft, Google and Wikimedia have tried some kind of visual recognition, which is quite a difficult problem for computers.

Microsoft's Asirra captcha displays pictures of cats and dogs and asks users to pick all the pictures of one of the two categories.

Google reCaptcha first tries to detect whether the user is a human; if it is not convinced, it displays pictures from its Google Maps and Google Street View projects, asking users to select only cars or road signs. Again, the visual recognition is done by humans.

Wikimedia tried a captcha technology that displays a picture and asks users to select the corresponding tag.


Pros:

  • Visual recognition is a very difficult problem for computers
  • Visual recognition is a simple problem for humans
  • Even a child can recognize an animal or a car

Cons:

  • Visual recognition can block users with visual deficiencies (blindness, color blindness, etc.)
  • You need a lot of different pictures in order to have some kind of randomness
  • The computer generating the problem has to know the solution, so you need to classify your pictures first
  • Screen readers can't present such a problem to their users

Mathematical or spelling problem

A mathematical or spelling problem is usually displayed as text only. Its purpose is to detect humans by asking them simple questions:

    Type the third letter from the end of the word: "perspicacity"
    What is the result of this operation: 1 + 2 =
    What is the first word of this web site?


Pros:

  • Infinite number of mathematical problems
  • Easy generation

Cons:

  • A lot of translation work is needed
  • What about non-Western alphabets?
  • Some children can't solve certain problems
  • Cognitive deficiencies can block users (dyslexia, etc.)
  • Often too simple for computers
  • There can't be many different questions about the visited website
Reading problem

Distorted text can be displayed. The goal is to read the letters and type the same string. By using random letters and digits rather than a real word, bots can't use dictionaries for help.


Pros:

  • Infinite number of problems
  • Easy generation

Cons:

  • What about non-Western alphabets and keyboards?
  • Cognitive deficiencies can block users (dyslexia, etc.)
  • Today, too simple for computers
  • Visual deficiencies can block users (blindness, visual impairment, etc.)
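
The idea of using random letters and digits instead of a dictionary word could be sketched like this in Python. The alphabet (with look-alike characters removed) and the default length are assumptions of this example; rendering and distorting the text as an image is not covered here.

```python
import secrets
import string

# Assumed alphabet: uppercase letters and digits, without the look-alike
# characters 0/O and 1/I, so that humans are not tripped up by them.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "0O1I"
)


def random_challenge_text(length: int = 6) -> str:
    """Return a random string that is not taken from any word list."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Since every character is drawn independently, a dictionary gives a bot no help in guessing the string; it says nothing about how well the bot can read the distorted image.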

Audio problem

The aim of this problem is to have the user listen to a sound and type the word heard.


Pros:

  • Visually impaired people can solve this problem

Cons:

  • Not easy to generate, unless you have a large pool of native speakers
  • Hearing deficiencies can block users (deafness, etc.)

The problem generation

In order to generate problems, it is more efficient if a computer can do it. But if a computer can generate a problem, can a computer also solve it? Well, not necessarily.

The aim of a Captcha is to have a big pool of different problems, so that a bot can't simply store each problem together with its solution in order to defeat the test by brute force. The perfect Captcha would never present the same problem twice.
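
One small part of this, making sure that an issued problem can be answered only once, could look like the following Python sketch. The ChallengeStore class and its in-memory dictionary are hypothetical; a real service would need persistent storage and expiry.

```python
import secrets


class ChallengeStore:
    """In-memory store where each issued challenge can be answered once."""

    def __init__(self) -> None:
        self._answers: dict[str, str] = {}

    def issue(self, answer: str) -> str:
        """Remember the expected answer and return a random token for it."""
        token = secrets.token_urlsafe(16)
        self._answers[token] = answer
        return token

    def verify(self, token: str, submitted: str) -> bool:
        """Check an answer; the token is consumed whatever the result."""
        # pop() removes the entry, so replaying the same token always fails.
        expected = self._answers.pop(token, None)
        if expected is None:
            return False
        return secrets.compare_digest(expected.encode(), submitted.encode())
```

This only prevents replaying a given challenge. Avoiding repeated problems altogether still depends on how large the pool of problems is.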

Plus, generating a problem can be tricky because the generator needs to know the solution. There are two possibilities for this:

  • Some humans tell the solution to the computer
  • The computer can create the problem easily but not solve it

The first possibility is nearly impossible, unless you have a lot of people working day and night for free. The second possibility can rely on algorithms like those used in cryptography, such as a hash, or more simply on any process that produces a result from which you can't get back to the starting point.
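
As an illustration of such a one-way process, here is a minimal Python sketch using a salted SHA-256 hash from the standard library. The function names are made up, and applying the hash to the expected answer is only my reading of the idea above.

```python
import hashlib
import hmac
import secrets


def seal_answer(answer: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); the answer cannot be read back from them."""
    salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + answer.encode("utf-8")).digest()
    return salt, digest


def matches(salt: bytes, digest: bytes, submitted: str) -> bool:
    """Hash the submitted answer the same way and compare the digests."""
    candidate = hashlib.sha256(salt + submitted.encode("utf-8")).digest()
    return hmac.compare_digest(candidate, digest)
```

Going forward (answer to digest) is cheap, while going backward is not practical for long inputs. Captcha answers are short, however, so a hash alone would not stop someone who can try every possible answer.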


Some ideas came to my mind after these thoughts...

Problem generation

Visual problem generation

Regarding visual problem generation, a solution could be to take a picture and cut it up into parts at random (always the same size, but not the same origin point). With this process, and using N parts of the picture, we are creating not one problem, but N of them!
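
The random cutting described above can be sketched in a few lines of Python. Only the coordinates of the parts are computed here; the function name and the box layout are assumptions of this example.

```python
import secrets


def random_crop_boxes(
    width: int, height: int, part_w: int, part_h: int, n: int
) -> list[tuple[int, int, int, int]]:
    """Return n (left, top, right, bottom) boxes of identical size."""
    if part_w > width or part_h > height:
        raise ValueError("a part cannot be larger than the picture")
    boxes = []
    for _ in range(n):
        # The origin point is random, but the part always fits in the picture.
        left = secrets.randbelow(width - part_w + 1)
        top = secrets.randbelow(height - part_h + 1)
        boxes.append((left, top, left + part_w, top + part_h))
    return boxes
```

The boxes follow the (left, upper, right, lower) order that Pillow's Image.crop accepts, so they could be passed to it directly if that library is used for the actual cutting.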

Another important point is to slightly modify the parts of the picture in order to defeat a simple recognition algorithm, while a human would still be able to solve the problem easily. A Gaussian blur filter with a random radius looks well suited for this, as the blur can't easily be undone, even more so if you don't know the parameters that were used!
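
To make the blur step concrete, here is a pure-Python sketch of a Gaussian blur on a grayscale picture stored as a list of rows. The sigma parameter (the standard deviation) plays the role of the "radius", and its random range is an arbitrary assumption. A real implementation would more likely use an imaging library.

```python
import math
import random


def gaussian_kernel(sigma: float) -> list[float]:
    """Return normalized 1D Gaussian weights covering about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [
        math.exp(-(i * i) / (2 * sigma * sigma))
        for i in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def blur_row(row: list[float], kernel: list[float]) -> list[float]:
    """Convolve one row with the kernel, repeating the edge pixels."""
    radius = len(kernel) // 2
    last = len(row) - 1
    out = []
    for x in range(len(row)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            src = min(max(x + k - radius, 0), last)  # clamp at the borders
            acc += weight * row[src]
        out.append(acc)
    return out


def gaussian_blur(pixels: list[list[float]], sigma: float) -> list[list[float]]:
    """Blur a grayscale picture with a horizontal then a vertical pass."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_row(row, kernel) for row in pixels]
    cols = [blur_row(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to rows


def blur_with_random_sigma(
    pixels: list[list[float]], low: float = 1.0, high: float = 3.0
) -> list[list[float]]:
    """Blur with a sigma drawn at random and never shown to the client."""
    sigma = random.SystemRandom().uniform(low, high)
    return gaussian_blur(pixels, sigma)
```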

Audio problem generation


Having a big pool of different problems

Using Wikimedia Commons can be very efficient. This website hosts a large stock of media of different types.

Plus, using the special random page, we can access completely random files without having to download every file.
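
As an alternative to the special page, random files could also be requested through the MediaWiki Action API. The Python sketch below only builds the query URL and reads titles from a response; the choice of the API and the helper names are assumptions of this example, not something the project has settled on.

```python
from urllib.parse import urlencode

# Assumed endpoint: the standard MediaWiki Action API of Wikimedia Commons.
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def random_file_query_url(limit: int = 1) -> str:
    """Build a query URL asking for random pages in the File namespace."""
    params = {
        "action": "query",
        "list": "random",
        "rnnamespace": 6,  # namespace 6 is "File:" on MediaWiki sites
        "rnlimit": limit,
        "format": "json",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


def titles_from_response(payload: dict) -> list[str]:
    """Extract the page titles from a decoded list=random response."""
    return [entry["title"] for entry in payload.get("query", {}).get("random", [])]
```

The URL would then be fetched with any HTTP client, and the JSON body decoded before calling titles_from_response.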

Fair-use and license

Media uploaded to Wikimedia Commons are not all usable without conditions. Some are under a specific license.

Regarding media, we can show the author and license on the complete file, but we shouldn't do that on the modified parts! This can be considered fair use of the media, since the original file will be displayed with all its related information and each modified version is only a part of the original.


Hotlinking can be interesting if the Captcha ends up as a Wikimedia Foundation project. Reusing existing data is better than copying it all over the internet. But a file could be modified later on and no longer correspond to its parts.

Plus, the parts of the media have to be stored in order to display problems. So the first solution will be to download and store all the media used by the Captcha. Once the last part of a file has been used, the file will be deleted.
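
The "delete once the last part has been used" rule could be tracked as in this Python sketch. The MediaParts class and the delete_media callback are hypothetical names; the actual storage layer is left out.

```python
from typing import Callable


class MediaParts:
    """Track the unused parts of one downloaded media file."""

    def __init__(
        self,
        media_id: str,
        part_ids: list[str],
        delete_media: Callable[[str], None],
    ) -> None:
        self.media_id = media_id
        self._unused = set(part_ids)
        self._delete_media = delete_media  # hypothetical storage callback

    def take_part(self) -> str:
        """Hand out one unused part; delete the media after the last one."""
        if not self._unused:
            raise LookupError("no unused part left for this media")
        part = self._unused.pop()
        if not self._unused:
            # Every part has now been used, so the original can go.
            self._delete_media(self.media_id)
        return part
```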

Other idea(s) not chosen

Visual test: small video game

Some modern JavaScript libraries make it possible to create small video games with a simple goal.

It could be picking up an object and dropping it at a specific place, or just moving something out of a small labyrinth.

Since this technology is far more complicated, this idea has been abandoned for now.

Offering such a service


As the service should be sustainable, it has to cost as little money as possible. At the same time, the service should be as fast as possible in order not to block users, and simple to use for developers.

I found one such service billed at $7 per month for unlimited use on 1 website, $20 for up to 10 websites and $40 for up to 50 websites. Savings are possible if you buy more than one month at a time.

Creating some kind of foundation or association should be the goal. Joining the Wikimedia Foundation projects could be a wonderful lever to obtain funds, a community, and experienced developers to help.


As things stand, some people make a business of solving Captchas, so even a strong Captcha is not sufficient to block all bots. But using such a Captcha would at least block the text-reading solvers, and maybe reduce the amount of personal data that other Captchas (like reCaptcha) are collecting every minute.

A demo?

Finally, I made a demo in order to show that this solution is feasible and that the first step is not so difficult...