
Hi folks,

 

I was just wondering: some CAPTCHA images can be read by OCR image analysers, which pick the characters out of the image and guess them from their width, shape, etc.

 

I've noticed that most CAPTCHA images use a single font. Would it be harder for the bot/analyser to read the characters if every character in the string used a different font (and size), or would it make no difference?

 

Sam


It depends on the bot/analyzer.

 

Some might check each letter against many different fonts; some may not.

 

So far I have never had any spam with a CAPTCHA though.

 

A new type of CAPTCHA I came across was kind of cool...

It asked a question, such as:

"A dog has __ legs"

Then there is a text box where you fill in the blank with the answer.
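
A rough sketch of how a fill-in-the-blank check like this could work, written in Python for brevity. The question bank, the second question, and the names `pick_question` and `check_answer` are made up for illustration; nothing here is taken from the site that used this CAPTCHA.

```python
import random

# Hypothetical question bank: (prompt, set of accepted answers).
QUESTIONS = [
    ("A dog has __ legs", {"4", "four"}),
    ("A week has __ days", {"7", "seven"}),
]

def pick_question(rng=random):
    """Return (index, prompt). Keep the index server-side, e.g. in the session."""
    index = rng.randrange(len(QUESTIONS))
    return index, QUESTIONS[index][0]

def check_answer(index, user_input):
    """Compare the trimmed, lower-cased input with the accepted answers."""
    return user_input.strip().lower() in QUESTIONS[index][1]
```

The answer never goes to the browser, only the prompt does. A small fixed question bank is easy for a spammer to collect by hand, so this probably only helps against untargeted bots.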

OCR basically identifies end points of lines or points along a curve and then connects the dots. It does not really matter if different fonts or different-sized characters are used in one image. Doing so might make the image look more confusing to a human, but not to a computer.

Using weird fonts may help confuse a computer, as may putting a crazy transparent foreground in front of the text.

 

This makes it harder for a human to read, but it can stay readable if the foreground is carefully created.

 

Here is an example CAPTCHA that I created:

http://phpsnips.com/examples/CAPTCHA/form2.php

 

If you want the source code:

http://phpsnips.com/snippet.php?id=43

What I have come to realize with CAPTCHAs is that there are 3 levels of them:

 

Level 1 is a super basic background plus colored, straight Times Roman-style text that is easy to read and just forces you to type it. A very simple OCR can read it, but at least your users aren't mad that they can't read it.

 

Level 2 is a broad range between level 1 and insanely hard. It slows down the OCR but is still crackable.

 

Level 3 is what Google uses, and I hate it because I can never get them right.

 

Personally, if you have additional filters on, you are fine with level 1 to a low level 2. The point is that it needs to be typed, and that slows down 99% of the user base.

What I did to make my CAPTCHA was use an obscure font with 2 codes written on the image (plus about 17 other codes in the background that humans can't see). The random code is kept as a string and split into 1-character chunks, and there is also a random number. Depending on the random number, say 1 - 3, the top code in the image is the one to use; for 4 - 6 (not the exact numbers), the bottom code is the one to use.

 

Also (and this is the reason for the 1-character chunks), if the number is 1 the user is asked to enter characters 1, 3, 6 from the top code; if the number is 4, characters 6, 2, 5 from the lower code, in that order (not the exact order, but it's the idea that counts), etc...

 

(It's easier to use than it is to explain :P)
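
For anyone who finds code easier to follow than the description, here is a minimal sketch of the idea in Python. It is not the actual implementation: the character set, the 6-character code length, the names `make_challenge` and `verify`, and the way the code and positions are picked directly (instead of through one random number) are all assumptions. The hidden background codes and the image drawing are left out.

```python
import secrets

ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"  # assumed character set
CODE_LENGTH = 6  # assumed; the description mentions positions up to 6
PICKS = 3        # how many characters the user is asked to type

def make_challenge():
    """Build two codes, choose one of them, and choose ordered positions."""
    codes = [
        "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))
        for _ in range(2)
    ]
    which = secrets.randbelow(2)  # 0 = top code, 1 = bottom code
    positions = []                # ordered, non-repeating, 1-based, e.g. [6, 2, 5]
    while len(positions) < PICKS:
        p = secrets.randbelow(CODE_LENGTH) + 1
        if p not in positions:
            positions.append(p)
    expected = "".join(codes[which][p - 1] for p in positions)
    return {"codes": codes, "which": which,
            "positions": positions, "expected": expected}

def verify(challenge, user_input):
    """Case-insensitive comparison against the stored answer."""
    return user_input.strip().upper() == challenge["expected"]
```

In this sketch `expected` would stay on the server (for example in the session); only the image with both codes and the instruction ("characters 6, 2, 5 from the lower code") would be shown to the user.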

 

Would that help to stop an OCR bot?

 

This one (as described):

 

captcha.gif

 

or this one:

 

captcha.php Again the user has to enter the characters in a certain order, which means that an OCR bot can read the code but then has to guess which characters to use and in what order. Assuming a 6-character code where positions can repeat, there is around a 1 in 216 (6^3) chance for the bot to get 3 characters right. If I ask for 4 characters that goes to 1 in 1,296 (6^4), and if 5 characters are needed it is 1 in 7,776 (6^5), if my maths is right. Therefore the bot isn't all that likely to guess it easily.

 

If the same goes for the top image, the number of possibilities doubles since there are two codes: 1 in 432 for 3 characters, 1 in 2,592 for 4 and 1 in 15,552 for 5.
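
A quick way to check this kind of arithmetic is to count the ordered guesses directly. The sketch below assumes a 6-character code (`CODE_LENGTH`, an assumption based on the positions 1 to 6 mentioned earlier in the thread) and a bot that has read the image perfectly but has to guess the code and positions blindly. `guess_space` is a made-up helper name.

```python
from math import perm

CODE_LENGTH = 6  # assumed length of each code

def guess_space(picks, codes=1, repeats=True):
    """Number of equally likely ordered guesses for `picks` characters.

    repeats=True  -> a position may be asked for more than once (6 ** picks)
    repeats=False -> every position is distinct (6 * 5 * 4 * ...)
    """
    per_code = CODE_LENGTH ** picks if repeats else perm(CODE_LENGTH, picks)
    return per_code * codes

for picks in (3, 4, 5):
    print(picks,
          guess_space(picks),                 # one code, repeats allowed
          guess_space(picks, codes=2),        # two codes, repeats allowed
          guess_space(picks, repeats=False))  # one code, distinct positions
```

With repeats allowed this gives 216, 1,296 and 7,776 for one code and twice that for two codes. If the positions are always distinct, as in the "1, 3, 6" and "6, 2, 5" examples, the counts drop to 120, 360 and 720. None of this applies if the bot can also read the instruction telling the user which characters to enter.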

 

So would that be hard for a bot, since once it has read the image it's all guesswork?

 

Sam

 
