
Audio Prompt for verifyimg.php



hmag
18-Feb-2007, 09:12 AM
Hi all,
I've been given the task of creating an accessible formmail script that blind users can also use. On numerous Captcha image verifiers I have noticed an audio prompt button sitting next to the "request another image" button.

Is there some way to add this kind of prompt to the Tectite image verifier? If so, that would be awesome. If anyone can lead me to a resource to do this, or has such a modification, please let me know - my email address is hmag@ozemail.com.au or just let me know through this forum - many thanks, Terry

russellr
18-Feb-2007, 09:51 PM
Hi,

I've been looking for a solution for this myself, because I'd like this option to be available for our Image Verification feature.

At this stage, I found some audio applications for Linux systems which I'm hoping might be a way forward.

My idea is to create a small .wav file for each character that the CAPTCHA can display, and then the verifyimg.php script will concatenate these .wav files together to produce one larger .wav file.

The problem is that most Linux servers will *not* include the audio applications. So, we might need to provide this as a service from our server.
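
To make the concatenation idea concrete, here is a minimal sketch in Python rather than PHP, using only the standard `wave` module so no external audio application is needed. It assumes every per-character clip is uncompressed PCM with the same channel count, sample width and sample rate. The names `make_placeholder_clip`, `concat_wavs` and `speak_code` are hypothetical, and the placeholder clips are silence standing in for real recordings; this is not how verifyimg.php works today.

```python
import io
import wave


def make_placeholder_clip(n_frames, rate=8000):
    """Hypothetical stand-in for a recorded character: 8-bit mono silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(1)
        w.setframerate(rate)
        w.writeframes(b"\x80" * n_frames)  # 0x80 is the midpoint (silence) in 8-bit PCM
    return buf.getvalue()


def concat_wavs(clips):
    """Join WAV clips (as bytes) that share one audio format into a single WAV."""
    if not clips:
        raise ValueError("need at least one clip")
    out = io.BytesIO()
    expected = None
    with wave.open(out, "wb") as writer:
        for clip in clips:
            with wave.open(io.BytesIO(clip), "rb") as reader:
                fmt = (reader.getnchannels(), reader.getsampwidth(), reader.getframerate())
                if expected is None:
                    # The first clip decides the output format.
                    expected = fmt
                    writer.setnchannels(fmt[0])
                    writer.setsampwidth(fmt[1])
                    writer.setframerate(fmt[2])
                elif fmt != expected:
                    raise ValueError("all clips must share channels, sample width and rate")
                # Append the raw audio frames; the header is fixed up on close.
                writer.writeframes(reader.readframes(reader.getnframes()))
    return out.getvalue()


def speak_code(code, clip_for_char):
    """Build one WAV for a CAPTCHA string from a per-character clip lookup."""
    return concat_wavs([clip_for_char[ch] for ch in code])


if __name__ == "__main__":
    clips = {"a": make_placeholder_clip(100), "b": make_placeholder_clip(250)}
    audio = speak_code("aba", clips)
    print(len(audio), "bytes of WAV data")
```

Because the clips are joined as raw frames, they all have to be recorded in the same format up front; mixing formats would need a resampling step that this sketch does not attempt.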

hmag
18-Feb-2007, 10:30 PM
Hi again,
Thanks for the reply, Russell. It's not an easy task even finding a Captcha that is available for general use, let alone an audio prompt.

The servers I use run OSX, so they will be similar to the Linux machines I administer. A couple of the scripts I have seen in use seem to have this coded into the PHP script somehow rather than generating an audio file, which is rather interesting, but I am at a loss as to how to get something like this working.

I did find this one: http://skimpygimpy.sourceforge.net/ but that's only a quick search so far. Thanks for your help so far

russellr
19-Feb-2007, 12:18 AM
Hi,

http://skimpygimpy.sourceforge.net/ seems to do what we want, but I found several problems with it (in increasing order of importance):

The graphic image is almost unreadable. This isn't so important because our verifyimg.php could just use the audio feature.
It takes a very long time to generate the wave file on a fast computer (over 2 seconds for 5 characters).
The audio output is quite terrible... lots of pops and crackles.
The voice used isn't good enough. For example, "m" sounds like "n".

These are all solvable problems, and the way he's implemented the audio generation confirms exactly the way I was intending to do it.

hmag
19-Feb-2007, 01:32 AM
Hi again,
Agreed, the voice is indeed quite ordinary, but my feeling is that it's been done deliberately to prevent spammers from doing audio recognition.

I can't get the image to load apart from the sample in their site, so I can't comment on legibility - I have to say that some of the Tectite images are a little hard to read in the default security setting - I did read about the ability to 'defuse' the noise surrounding the characters anyhow.

If I come across any other possibilities, I'll let you know.

hmag
19-Feb-2007, 01:54 AM
Hi again,
In further research for this, I came across this page:

http://www.ejeliot.com/pages/2

It appears to have a lot of info regarding generating the appropriate files, so I hope that helps.

Also found this one:

http://www.nswardh.com/shout/

hmag
19-Feb-2007, 09:42 AM
Hi again,
I think the audiovisual captcha found at http://www.nswardh.com/shout/ is quite promising. The legibility is quite good; the only problem I found was that I needed to correct the path to the font, on line 17, from:

$font = 'trebuc.ttf'; // Font-type...

to read:

$font = './trebuc.ttf'; // Font-type...

The font could not be found because of the path error. Hopefully this can help you a little; I like the idea of the pregenerated audio files.

russellr
19-Feb-2007, 06:16 PM
Hi,

I think it's worth looking at this site: http://ocr-research.org.ua/

russellr
19-Feb-2007, 06:50 PM
Hi,


hmag wrote: "Agreed, the voice is indeed quite ordinary, but my feeling is that it's been done deliberately to prevent spammers from doing audio recognition."

Yes, I re-read the page and he actually states that's why he's making it noisy.

I've been doing some more research and these issues seem prominent:

Bad News

Researchers and other clever people can break visual CAPTCHA, in most cases.
It's very difficult to create a solid visual CAPTCHA that can also be read by humans 100% of the time.
Audio CAPTCHA can probably be broken by simple audio recognition software.
Like its visual counterpart, audio CAPTCHA that's difficult to recognize automatically may be difficult for humans to understand.
The W3C doesn't like CAPTCHA (http://www.w3.org/TR/turingtest/) but also hasn't done anything much to give us a serious alternative.
Spammers only need an automated break of CAPTCHA with around 30% accuracy or more to make it worthwhile for them to use.
We need services like http://ocr-research.org.ua/ and someone on the audio side to seriously test our CAPTCHA algorithms.

Good News

There is a cost to breaking CAPTCHA whichever method they use (including using humans to break it) and that can make spamming uneconomic. For now, this means that just about any CAPTCHA may kill spam almost completely.
There are other techniques we can use in addition to CAPTCHA, such as limiting the number of emails sent from specific IP addresses in a given time. This achieves two things:
makes spamming from a given server unviable
allows automated detection of suspicious activity
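
As a rough illustration of the per-IP limit described above, here is a minimal sliding-window sketch in Python. The class name `IpRateLimiter`, its parameters and the `suspicious` set are all hypothetical, and the counts live in memory only; a real form handler would need to keep them in a file or database between requests. It is not a feature of FormMail as it stands.

```python
import time
from collections import defaultdict, deque


class IpRateLimiter:
    """Allow at most max_sends emails per IP address within window_seconds."""

    def __init__(self, max_sends, window_seconds):
        self.max_sends = max_sends
        self.window_seconds = window_seconds
        self._sent = defaultdict(deque)  # ip -> timestamps of accepted sends
        self.suspicious = set()          # IPs that have hit the limit

    def allow(self, ip, now=None):
        """Return True and record the send if this IP is under its limit."""
        now = time.time() if now is None else now
        recent = self._sent[ip]
        # Forget sends that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_sends:
            # Over the limit: refuse the send and flag the address for review.
            self.suspicious.add(ip)
            return False
        recent.append(now)
        return True


if __name__ == "__main__":
    limiter = IpRateLimiter(max_sends=2, window_seconds=60)
    for second in (0, 10, 20, 61):
        print(second, limiter.allow("203.0.113.5", now=second))
```

The same structure covers both points in the list: refused sends make bulk mailing from one address unviable, and the flagged addresses give a starting point for spotting suspicious activity.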

hmag
20-Feb-2007, 06:34 AM
Hi again,
That was why I thought the other URL I posted was good - it utilised a format of speech which didn't just tell you the letters, it gave them to you in sets, so automated breaking of the Captcha was almost a non-event.

I know your time is probably at a premium, but do you have any idea when something like that might be usable with your Formmail script?

Also, though I haven't read right through the script, is there any way to stop Formmail responding with the details of what script is being used if called directly? Just an idea for security.

russellr
20-Feb-2007, 08:53 AM
Hi,


hmag wrote: "I know your time is probably at a premium, but do you have any idea when something like that might be usable with your Formmail script?"

If someone pays for it, then it could be within a month. Otherwise, it will happen sometime this year.


hmag wrote: "Also, though I haven't read right through the script, is there any way to stop Formmail responding with the details of what script is being used if called directly? Just an idea for security."

Obfuscation is not security.

Also, even if it didn't output its name, you could conclude what it was by its behaviour (this is true of almost any program).

Obfuscation just makes things hard for everyone *except* bad guys.

Sure, if a FormMail is insecure you'd want to go to great lengths to hide any details about it.

Instead, our FormMail has a history of being extremely secure. If an attacker sees that you have Tectite FormMail, they'll leave your website alone and go looking for a website with one of the lesser FormMails around. :D

Telling the world you use Tectite FormMail is the most secure thing you can do.

Quite frankly, if you don't believe a script is secure, you shouldn't be using it on your site.

hmag
21-Feb-2007, 02:01 AM
Hi again,
The project I'm working on is completely free of charge for a charity. What sort of cost would it be to get that to work? Not being a programmer, I have no idea how long this might take.

I agree on the other front, but presenting some script kiddie with a blank page often means they have no idea what to try next.

russellr
21-Feb-2007, 02:13 AM
Hi,

What's the charity?

Also, if you need to install some software (e.g. audio executables, python) is that OK for you to do?

Is the server Linux or something else?

hmag
21-Feb-2007, 09:09 PM
Hi again,
The charity is a local chapter of a Landcare organisation - they help out a lot of disabled & handicapped people who do volunteer work for them & this project is so that some of those people can do some stuff from their houses. Normally I'd have suggested that they just did an Intranet setup, but when these people are going to do some things from home, it needs to be accessible from outside, so it needs good spam protection.

As for extra software, it's not a problem. The server is OSX in this case, which already has Python etc. built in, but at a later date they might run their own server, which I would say will be Linux.

Thanks for the off-list reply too - will check it out.