CAPTCHAs

What is a CAPTCHA ?

Definition:

  • Completely Automated Public Turing test to tell Computers and Humans Apart
  • commonly, third-party software installed on web pages
  • /kæp.tʃə/

A bit of history:

What are CAPTCHAs for ?

They filter out non-humans !

What is a non-human ?

Why are CAPTCHAs needed ?

Source: Imperva

some CAPTCHA examples

some exotic CAPTCHA examples

Possible attacks on CAPTCHAs ?

Quite difficult and costly:

Alternatives to CAPTCHAs ?

Not much:

Drawbacks ?

  • Annoying
  • Accessibility
  • Privacy

🎉 Thank you for your attention 🎉

Welcome, dear fellow humans, to our scientific presentation on CAPTCHAs

Say lots of bonus things by clicking live on the links (in blue) in the slides

Or by goofing around on the CAPTCHA tests

Keep the doubt hanging throughout as to whether Clément is really a human

So first of all, what is a CAPTCHA?

By definition, CAPTCHAs are a completely automated...

So they are simply a tool for categorizing humans and non-humans.

Turing was a brilliant and famous mathematician of the last century, well known as one of the founders of modern computing (the Turing machine...).

CAPTCHAs nowadays are mostly present in your web browser (pretty much the only place where you encounter them).

They are what's called third-party software, meaning that 99% of the time they are not developed by the owner of the site but by another organisation. This is due to the requirements that such a tool has. We'll talk a bit more about that in a few seconds!

And they are pronounced /kæp.tʃə/.

Let's see where CAPTCHAs come from.

They were introduced by AltaVista, a web search engine company, when they wanted to prevent unwanted additions to their search engine by nefarious users. At the time, if you wanted your website to be referenced in a search engine so that it could be found easily, you had to add it manually to their system.

At the time, this preventive system was unnamed. The term CAPTCHA was coined by four mathematicians / computer scientists in 2003, namely Luis...

It's based on a reverse Turing test! First of all, a Turing test is a method for determining whether a computer is capable of human-like thinking. So a reverse Turing test is a method for testing whether or not something is a human.

They are conceived so that they are practically impossible for current computers to decipher, while remaining easy enough for real humans to do.

So CAPTCHAs filter out non-humans. This includes:

bots, software applications that run automated tasks (scripts), usually with the intent to emulate human activity. They are fairly easy to code and generally astonishingly cheap. Precisely who we want to restrict.

crawlers, internet bots that browse the World Wide Web for the purpose of web indexing. They are mostly used by search engines to improve their search results, and they mostly look at the metadata of pages (title, date, author, thumbnail, description, language, icons...), but they can also be used for more nefarious reasons, combined with scrapers for example.
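
To make "looking at the metadata of pages" concrete, here is a minimal sketch of what a crawler might pull out of a single page, using only Python's standard library. The names `MetadataParser` and `extract_metadata` are made up for this example, and real crawlers are of course far more elaborate.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects the title, the page language and <meta name=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "html" and attrs.get("lang"):
            self.metadata["language"] = attrs["lang"]
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.metadata[attrs["name"]] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text found between <title> and </title> is the page title.
        if self._in_title:
            self.metadata["title"] = self.metadata.get("title", "") + data


def extract_metadata(html: str) -> dict:
    """Return the metadata found in one HTML document (hypothetical helper)."""
    parser = MetadataParser()
    parser.feed(html)
    return parser.metadata


page = (
    '<html lang="en"><head><title>My page</title>'
    '<meta name="author" content="Ada">'
    '<meta name="description" content="A demo page">'
    "</head><body>Hello</body></html>"
)
print(extract_metadata(page))
```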

scrapers, the automated extraction of data from websites via bots and crawlers. Not just metadata anymore: they are designed to gather a lot more data, such as phone numbers, emails, passwords (?), addresses, any precious info. They are generally frowned upon since they tend to cause a lot of traffic on sites.

Dogs/cats KEKW

spammers, you don't want your contact form to be unprotected, or you'll soon receive emails for special pills...

hackers, they actually are humans, but they generally use all the tools from above (except cats/dogs) and you want to at least slow them down.

Clément? 😳

Why all the trouble, are bots really that common? Yes.

A 2020 study from Imperva estimates human traffic to be only about 60%; some other studies are even more aggressive (sometimes less than 45%).

good bots: search engines, monitoring bots, commercial crawlers, feed fetchers...

bad bots: every tool that we saw before, hackers, state spies...

You may understand why one may want to protect some areas of one's website.

In a way, this type of challenge is relatively easy for computers to do nowadays. The difficulty of this CAPTCHA comes from the fact that attackers don't have the dataset that Google has. (If you didn't know, these come from Google Street View.)

The dataset comes from companies or individuals that need data to be classified: if you pay the CAPTCHA provider and give them 100 million images, they will classify them for you (at a price).

Simpler test, can still be effective, but will be surpassed very easily.

Same, simpler test.

These types of CAPTCHAs are generally uncommon, but are generally insanely effective at stopping bots. They are cheap to create and manage/evolve.

Though they aren't well suited to any platform other than a desktop computer. I don't want to solve that using my phone.

Audio is interesting for blind people.

Human farms: sounds like The Matrix... but you can actually pay people in developing countries to solve your CAPTCHAs.

Flying under the radar: you could try to optimize your techniques to be as unsuspicious as possible, and you'll get a bit further.

praying ?

It's an arms race: people are building deep learning models to try and solve these CAPTCHAs.

MITM: simply infecting the devices of ordinary people and making internet requests on their behalf, basically a botnet.

Honeypot: not a real alternative, but more of a mindset. You want to trick bots into doing useless stuff.
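
One common way to apply the honeypot mindset to a web form (an illustration, not something shown in the slides) is a hidden input that humans never see and therefore leave empty, while naive bots fill in every field they find. A minimal server-side sketch, where the field name `website` and the function `is_probably_bot` are hypothetical:

```python
def is_probably_bot(form: dict, trap_field: str = "website") -> bool:
    """Return True when the hidden trap field was filled in.

    Assumption: the form contains an input named `trap_field` that is
    hidden from humans (for example with CSS), so only bots fill it.
    """
    return bool(form.get(trap_field, "").strip())


human_submission = {"email": "alice@example.com", "message": "Hi!", "website": ""}
bot_submission = {"email": "x@spam.example", "message": "Pills!", "website": "http://spam.example"}

print(is_probably_bot(human_submission))
print(is_probably_bot(bot_submission))
```

A bot that fills the trap can be silently ignored instead of rejected, so that it keeps wasting its time.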

Two-factor authentication: your bank, for example, doesn't want you to be a robot.

Centralized sign-on: the famous "connect with Google/Facebook/FranceConnect" button. This way you don't actually do the process yourself, but trust a third party to filter out the bots for you. (Spoiler: not that effective.)

Force human interaction: for example, proxy votes during the presidential elections.

Motion tracking: CAPTCHAs are actually observing you even when you are not actively solving them. They look at your mouse movements and your keystrokes, and categorize you. For example, when you initially click the "I'm not a robot" box, the algorithm will observe this click and compare it to previous clicks to detect whether there is a pattern (did you click perfectly in the center each time?).
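
The "did you click perfectly in the center each time?" check can be sketched as a toy heuristic: if the click positions barely vary from one click to the next, they look scripted. This is only an illustration under simple assumptions. The function name `clicks_look_scripted` and the 1-pixel threshold are invented for the example, and real CAPTCHA providers do not publish their actual signals.

```python
from statistics import pstdev


def clicks_look_scripted(clicks, min_spread_px=1.0):
    """Flag a series of (x, y) click positions that are suspiciously identical.

    Assumption: humans never hit exactly the same pixel every time, so a
    spread below `min_spread_px` on both axes suggests automation.
    """
    if len(clicks) < 3:
        return False  # not enough history to see a pattern
    xs = [x for x, _ in clicks]
    ys = [y for _, y in clicks]
    return pstdev(xs) < min_spread_px and pstdev(ys) < min_spread_px


bot_clicks = [(150, 20)] * 5  # always the exact center of the box
human_clicks = [(143, 18), (156, 25), (149, 12), (161, 22)]

print(clicks_look_scripted(bot_clicks))
print(clicks_look_scripted(human_clicks))
```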