56

We all hate CAPTCHAs, but to some applications they're a necessary evil. Today I wondered if there's a better alternative we just haven't thought of yet. I considered the dilemma: how do you create something that is indecipherable to a computer, but readable to a human?

Then I remembered an email doing the rounds years ago along the lines of:

I cdn'uolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg: the phaonmneel pweor of the hmuan mnid. Aoccdrnig to a rseearch taem at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Scuh a cdonition is arppoiatrely cllaed Typoglycemia.

In case you can't read the above:

I couldn't believe that I could actually understand what I was reading: the phenomenal power of the human mind. According to a research team at Cambridge University, it doesn't matter in what order the letters in a word are, the only important thing is that the first and last letter be in the right place. The rest can be a total mess and you can still read it without a problem. This is because the human mind does not read every letter by itself, but the word as a whole. Such a condition is appropriately called Typoglycemia.

This is called Typoglycemia, and although it wasn't actually researched at Cambridge, there is an element of truth in that people find it surprisingly easy to read.

Could Typocaptcha be the future? Read these three questions:

  • Wihch anmial is bgeigr - a fox or an eplthneat?
  • Waht aianml is siad to nverr freogt?
  • Waht tpye of aimnal was Wlat Dsi'enys Dmbuo?

In case you haven't guessed, the answer to all three questions is the same:

elephant

There are millions of possible combinations of questions, but before getting into the 'how', it all boils down to user experience.

Would Typocaptcha result in a better or worse user experience when compared to CAPTCHA?

P.S. I am aware that this would not be very accessible to visually impaired users, much like CAPTCHAs aren't.

rybo111
  • 52
    How would this work for dyslexic users and people whose first language isn't English? – Wander May 27 '15 at 20:15
  • 1
    @Wander Regarding dyslexia, the same could be said for CAPTCHA. And the language of the question/answer would match the site's. – rybo111 May 27 '15 at 20:19
  • 45
    The UX answer to "is there a better CAPTCHA?" is "it doesn't matter, there are better solutions to the problems CAPTCHAs are usually designed to solve." – DA01 May 27 '15 at 20:43
  • 3
    All three of your questions are difficult enough for a computer without changing the text. Muddling the letters makes it harder for humans, without any change in difficulty for computers. –  May 28 '15 at 00:15
  • 13
    Your non-native-English-speaking users would have a real problem with this. And, in the end, a computer program could just look for "all words that contain the letters in question" and build meaning out of it. A better option (as suggested below) is the Google experience that uses what Google already knows about the users to determine yes/no. – Jon Watte May 28 '15 at 03:43
  • Strange that the best source I could find to debunk a chain email is Fox News, but here it is: http://www.foxnews.com/story/2009/03/31/if-can-raed-tihs-msut-be-raelly-smrat.html – Patrick M May 28 '15 at 05:43
  • 6
    Spambots are a technical problem users should not have to solve. Try using a technical solution like a honeypot. – uxfelix May 28 '15 at 07:03
  • 2
    @uxfelix Whilst I love the honeypot concept, it is easily hacked. – rybo111 May 28 '15 at 07:05
  • 1
    @rybo111 the problem of CAPTCHAs is a problem of increasing the cost of very-large-scale attacks to make spam economically non-viable. The only good solution in that respect is to force all users to waste craploads of bandwidth, and that is already a horror UX-wise. As DA01 pointed out, there are better solutions to spam prevention than punishing human users. Consider that CAPTCHAs are now solved at 99% efficiency in milliseconds by well-trained computers. There was an initial investment (building the ML model) but then the task becomes easier than it is for humans. Same with your approach. – Steve Dodier-Lazaro May 28 '15 at 11:20
  • 1
    I vote for emotion based image CAPTCHAs! And if someone is not able to match the image to the equivalent emotion... heartless =D – Gustav May 28 '15 at 14:09
  • @DA01 I have heard this several times now, but never seen what the superior alternatives are. Would you mind informing me? Been curious for a while now (I am relatively ignorant on this topic). – HC_ May 28 '15 at 16:20
  • 1
    @HC_ it depends on context. But suggestions can include honeypots (where you have hidden form fields), email verification, SMS verification, or google's "are you a robot?" non-CAPTCHA http://googleonlinesecurity.blogspot.com/2014/12/are-you-robot-introducing-no-captcha.html – DA01 May 28 '15 at 18:11
  • 13
  • 5
    For what it's worth, I'm dyslexic and don't have a problem with this. In fact, it resembles how things look to me when I read anyway. It resembles even more what I write when I rush and don't have a spellchecker. I agree, however, that it would be a barrier to non-English speakers. But then, so is a site written in French and no one objects to that, do they? – Nagora May 29 '15 at 07:31
  • 2
    @Nagora, exactly my point (regarding language). A reply means more than an anonymous upvote, so thanks for that - I wish more people would comment positively! – rybo111 May 29 '15 at 07:39
  • 1
    This is an interesting idea, I am sure it has potential. But : 1) Blind people hearing the spoken sentences would probably have big problems. 2) Google is very good now at auto-correcting phrases full of typos. – Nicolas Barbulesco May 29 '15 at 10:17
  • 1
    This is not only language-dependent, but also culture-dependent. I live outside the US and although I have no problem with understanding (even scrambled) English, I had absolutely no idea that "elephants never forget" and I am sure that most of my parents' generation have never heard of Dumbo. – el.pescado - нет войне May 30 '15 at 16:13
  • @el.pescado The questions could be culture-dependent as well as language-dependent. Also, they would need to be written in such a way that all ages could answer. My questions are just an example - they would obviously need a lot of thought in the real world. – rybo111 May 30 '15 at 16:35
  • 2
    This is an example where humans' pattern recognition almost matches that of algorithms. What you want is something that humans are good at, but machines aren't, like emotional response, planning, creativity. Better still would be to use pattern recognition not against your potential users, but to analyse submissions on your end to determine which come from humans and which are filled in by computers. – kontur May 30 '15 at 17:55
  • 1
    English is my second language but I had no problems reading the scrambled sentences. – Alexia Luna May 31 '15 at 20:45
  • i'd've expected the correct answer to be more like "elpheant" – Octopus Jun 01 '15 at 22:58
  • You can't beat a computer at playing Scrabble when it's serious. – Lie Ryan Jun 02 '15 at 01:22
  • @LieRyan In Scrabble, the task is to find multiple words, and pick the best score. But what if the task was choosing a particular word based on context, and the computer had one guess? Suddenly, the task is more difficult. – rybo111 Jun 02 '15 at 06:46
  • The 'emotional response' suggestion is interesting: I wonder how good computers are at understanding facial expressions ? – PhillipW Jun 02 '15 at 08:43
  • In addition to ESL and dyslexic users, you would also run into some serious issues with visually impaired users who are using screen readers . . . – talemyn Jun 02 '15 at 14:09
  • @talemyn "P.S. I am aware that this would not be very accessible to visually impaired users, much like CAPTCHAs aren't." Also, dyslexic users have replied saying they don't struggle to read it. – rybo111 Jun 02 '15 at 14:11
  • @rybo111 - Well, my point was that, in current CAPTCHAs (many, at least), there is an option to have the word spoken to the user as an alternative to reading it themselves. I'm not sure what the alternative would be in this case, when speaking the words would result in nonsense. – talemyn Jun 02 '15 at 15:04

11 Answers

72

Ironically, I could not work out by myself what bgeigr meant, but almighty Google helped me out.

So this captcha is quite easy for computers to guess, yet may be hard for humans.

And bear in mind that Google is using an error model for common typos (letters replaced by those adjacent on the keyboard etc.) If you program your computer to only consider anagrams, it will easily reach 99-ish percent accuracy.

Dmitry Grigoryev
  • 17
    You're correct. This is an example where people are trying to "outthink" computers on their own turf. – Mayo May 28 '15 at 13:27
  • Also, the fact that Google can answer the suggested question makes me think that some people might really enter such questions into Google because they don't know the answer (maybe they have never watched Dumbo). So in principle a human user who wants to use a web service to answer a question he cannot answer is asked by that same web service another question he cannot answer ... – Hagen von Eitzen May 29 '15 at 06:04
  • 1
    Regarding the bgeigr anagram: Perhaps double letters could stay together in words. Example: beiggr. – rybo111 May 29 '15 at 16:17
  • 1
    Apparently I wasn't the only one with problems with typocaptcha ☺, I couldn't decode bgeigr and freogt, and would thus fail it. The most similar word I came up with was _beige_… – Ángel May 29 '15 at 16:26
63

Why would this be indecipherable to a computer? Since each word has the correct letters, only scrambled, it seems to me that it would be very easy for a computer to work out the correct order of the letters by comparing each word to known words. That defeats the whole point of having this extra barrier.

Secondly, how would this affect folks with dyslexia or other reading disabilities? I would think this would be even harder on them than captchas (someone with actual dyslexia please chime in!). It's true that captchas affect vision-impaired folks, but this could also affect users with other disabilities (obviously that is my initial thought and user research would be needed).

I totally agree that captchas should be improved, but I am not sure if this would accomplish that purpose.

Rachel9494
  • 1
    In my defence, I didn't say it would be indecipherable to a computer - I said that is the dilemma. I also think "very easy" is an exaggeration, considering the computer would need to work out the context to understand the words. Your point about dyslexia is more what I'm interested in -- how does it compare to CAPTCHA? – rybo111 May 27 '15 at 20:23
  • 6
    To follow-up on my point: consider the words could and cloud. How would the computer know which word to use? – rybo111 May 27 '15 at 20:32
  • 7
    @rybo111 I think initially it wouldn't, but bot makers are constantly evolving bots; in your example, it could know which is a noun and which isn't, so it could compute based on English sentence structure knowledge which of the two words is more likely to be the correct one. And, if its guess was wrong, it could try again just like humans need to try again with captchas. – Rachel9494 May 27 '15 at 20:44
  • If bots are constantly evolving, validation should be too. – rybo111 May 27 '15 at 20:52
  • 5
    I've worked in a company that made a bot for moderating online comments - newspaper comments, forum messages. Scrambling the letters in each word is child's play for a bot - you don't have that many combinations. Marketing aside, most of the filters they show in this video were really working OK three years ago. – mgarciaisaia May 27 '15 at 23:23
  • @mgarciaisaia but what about the answers? Are they also easy to get? Taking as an example "Which animal is bigger, a fox or an elephant?", are bots capable of extracting relationships, in this case "bigger" as a comparison and fox and elephant as the elements to compare, and then just searching online for numbers near words like height/weight to do the comparison? PS: See you next year in Operativos. – Alejandro Veltri May 28 '15 at 12:23
  • @rewobs: They already know how to do that, sort of. TextCaptcha's home page says "Do text CAPTCHAs actually work? Yes, and No.". You can ask complex questions, but there are lots that are already easy to answer. Anyway, scrambling the words doesn't add much complexity for the bot, but can for humans. PS: #miedo (fear). Don't forget to say hi! – mgarciaisaia May 28 '15 at 12:37
  • There are lots of types of dyslexia, as it's just an umbrella term for atypically troubled reading. For my particular dyslexia, the scrambled text is only slightly more difficult to read than normal text. If I'm skimming it (and I pretty much skim everything because otherwise my reading speed is awful) there's little difference. If, however, I think I've missed something and have to slow down, then it turns indecipherable. – Rick May 29 '15 at 12:31
  • Dyslexia causes the brain to mix up the order of words or letters. Jumbled spellings actually have less impact on dyslexic users than standard users, because dyslexic users are used to reading words that look like that anyway. – Brian Jun 01 '15 at 19:17
43

This is not effective for keeping out a targeted attack by someone who uses a word list, such as /usr/share/dict/words, to solve your anagrams. A task like "unscramble the words in standard input, assuming the first and last letters are correct, given a word list file for the language" is probably so straightforward that it'd make a good puzzle for our Code Golf site. Sorting out words that are already anagrams, such as could and cloud, could be done with an n-gram database derived from the Project Gutenberg corpus. Then the attacker sees each clue, makes a database of correct responses with the help of Mechanical Turk, and gains the technical ability to spam your site.
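The disambiguation step for pairs like could and cloud might look roughly like this. It is only a sketch, assuming Python 3: the toy `CORPUS` string and the names `bigram_counts` and `pick_candidate` are made up for illustration, and a real attacker would count word pairs over a large corpus such as the Project Gutenberg texts rather than two sentences.

```python
import collections

# Toy corpus standing in for a large n-gram database (an assumption for illustration).
CORPUS = (
    "i could not believe that i could read it . "
    "a dark cloud hid the sun . i could see the cloud ."
)

def bigram_counts(text):
    """Count adjacent word pairs in a whitespace-separated text."""
    tokens = text.split()
    return collections.Counter(zip(tokens, tokens[1:]))

def pick_candidate(previous_word, candidates, counts):
    """Pick the candidate seen most often after previous_word.

    Candidates are sorted first, so ties are broken alphabetically.
    """
    return max(sorted(candidates), key=lambda w: counts[(previous_word, w)])

counts = bigram_counts(CORPUS)
print(pick_candidate("i", {"could", "cloud"}, counts))     # "i could" occurs in the corpus
print(pick_candidate("dark", {"could", "cloud"}, counts))  # "dark cloud" occurs in the corpus
```

With only the preceding word as context this is crude, but it shows why an ambiguous anagram costs the attacker very little extra work.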

If an English proficiency test like this is effective for anything, it'd be for shutting out human users who happen to live in the wrong country. If you have a license to offer your service only to customers in (say) the United States, then someone coming in through a VPN who's not a native English speaker is less likely to actually be a U.S. resident. So it might be useful for the sign-up page of an Internet music or video streaming service, which are markets that are still heavily balkanized by decades-long exclusive territorial distribution contracts. In fact this technique has been seen in the wild: Two levels of the first WarioWare game for Game Boy Advance were typo tests in Japanese, which made it hard for people who downloaded an infringing copy to play through until Nintendo released the English version of the game to the North American market later.

Damian Yerrick
  • 5
    A fantastic answer that not only lists the drawbacks, but sees the positives. And a real-world example of a similar concept for good measure. – rybo111 May 30 '15 at 20:10
  • In regards to the questions, they could be built on-the-fly, such as in the animal sizes comparison. – rybo111 May 30 '15 at 20:15
  • "is probably so straightforward that it'd make a good puzzle for our Code Golf site" - done in my answer, although I haven't tried to golf it. – Steve Jessop May 30 '15 at 23:02
  • Your 'positive' part only works when the majority of the world does not speak a language (like Japanese), which is definitely not the case with English - I think most people can get it right one out of three tries, which should be enough if your UI allows for human error. – Sanchises May 31 '15 at 17:22
  • @sanchises - most people absolutely do not speak English. – Davor Jun 01 '15 at 12:20
  • Also, in regards to, "shutting out human users who happen to live in the wrong country" -- sadly, there are overseas markets where people are paid about 25 cents an hour to solve CAPTCHAs. – rybo111 Jun 02 '15 at 07:02
36

tl;dr

A good captcha would ideally offer the best possible protection (difficult for a computer to solve) and ease of use (easy for a human to solve). But captchas aren't good at this, and "typoCaptchas" don't seem to improve on them. The questions can be rearranged quite easily, and if a question is easy enough for people, it is probably easy enough for Google.

Captchas are still difficult for humans

Taken from How Good are Humans at Solving CAPTCHAs? A Large Scale Evaluation. Stanford University :

Overall, we found that captchas are often harder than they ought to be, with image captchas having an average solving time of 9.8 seconds and three-person agreement of 71.0%, and audio captchas being much harder, with an average solving time of 28.4 seconds, and three-person agreement of 31.2%.[...] Using the data collected from Amazon’s Mechanical Turk, we identified a number of demographic factors that have some influence on the difficulty of a captcha to a user. Non-native speakers of English were slower, though they were generally just as accurate unless the captcha required recognition of English words. We also saw small trends indicating that older users were slower but more accurate.

I'm sure everyone can remember at least one time being sure that what they typed matched the captcha perfectly, only for it to turn out to be wrong. So first it gets in users' way, and then it makes them feel stupid, which of course they don't like.

Here is what Google says about the security of classic CAPTCHAs:

While the new reCAPTCHA API may sound simple, there is a high degree of sophistication behind that modest checkbox. CAPTCHAs have long relied on the inability of robots to solve distorted text. However, our research recently showed that today’s Artificial Intelligence technology can solve even the most difficult variant of distorted text at 99.8% accuracy. Thus distorted text, on its own, is no longer a dependable test.

One advantage of reCAPTCHAs is that they have been helping to digitize books.

Classic Captchas vs TypoCaptchas

I find similar problems in both, and I don't think typoCaptchas offer a solid advantage:

  1. You'd still have to read and interpret a question (instead of one/two words), which doesn't fix the problem for the visually impaired.
  2. You'd additionally have to know/recall the answer (here you are introducing a new variable and more cognitive load). You have to deal with internationalization too.
  3. You'd have to type it correctly. This could be better in the sense that you could let the user make typos (as @rybo111 mentions in the comments), e.g. elefant instead of elephant.
  4. You can't use them as a means of digitizing books.
  5. They don't seem to be hard for bots in the first place.
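The typo tolerance in point 3 could be sketched with a plain edit-distance check, assuming Python 3. The function names and the threshold of two edits are assumptions chosen for illustration, not taken from any CAPTCHA library.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions or substitutions that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]

def accept_answer(given, expected, max_edits=2):
    """Accept an answer within max_edits of the expected one (hypothetical threshold)."""
    return edit_distance(given.strip().lower(), expected.lower()) <= max_edits

print(accept_answer("elefant", "elephant"))
print(accept_answer("fox", "elephant"))
```

A looser threshold is kinder to humans, but it also accepts more of a bot's near misses, so it does not change the overall comparison.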

There's a new option


Google has recently developed a new reCAPTCHA that makes the check easier whenever possible:

However, CAPTCHAs aren't going away just yet. In cases when the risk analysis engine can't confidently predict whether a user is a human or an abusive agent, it will prompt a CAPTCHA to elicit more cues, increasing the number of security checkpoints to confirm the user is valid.


Alejandro Veltri
  • In terms of typing it correctly, it could allow for typos (ironically). For instance elefant could be an accepted answer -- it's more the context and understanding that is important. Personally, it would be more natural for me to type 'elephant' than enter random characters in the correct order, since you would need to look at the screen every so often. – rybo111 May 27 '15 at 21:14
  • 2
    Also see the match the images captcha alternative in the same link http://1.bp.blogspot.com/-pA1LDaLSnWg/VH5MKbOJYxI/AAAAAAAAAC0/2jNt5wF-cJA/s1600/turkey_captcha.png – Danny Varod May 27 '15 at 22:36
  • 1
    Using a "match-the-image" challenge is a really bad idea, see this, this, or that. – Liam Marshall May 29 '15 at 02:47
  • 2
    I actually find those “identify what is this image“ recaptchas harder than the «insert the address number» it usually provides. There's often some food you aren't sure what it is (probably those are the images Google is most interested in getting humans to identify, so they can use them as the basis for their own image learning algorithms). – Ángel May 29 '15 at 16:33
18

I couldn't believe that I could Macaulay uesdnatnrd what I was radioing: the phenomenal power of the human mind. According to a research team at Cambridge Nerviness, it doesn’t matter in what order the letters in a word are, the only iprmoatnt thing is that the first and slat letter be in the right pilau. The rest can be a tootle mess and you can still read it outhit a problem. This is useable the human mind does not read nervy letter by itself, but the word as a whole. Such a condition is appropriately called Hypoglycemia.

The above is how a spell checker sees your example paragraph. The key thing for a CAPTCHA is that it is easy for humans, but difficult for computers. It appears that this "typocaptcha" is reasonably easy for computers to deal with, which means that the difficulty for humans is irrelevant.

Mark
  • The spell checker has at least one incorrect word in each sentence -- how could a computer logically answer a question based on this? – rybo111 May 27 '15 at 22:34
  • 13
    This was just a simple test -- run it through the spell checker and take the first suggestion. With the exceptions of "uesdnatnrd", "iprmoatnt", and "Typoglycemia", the correct word was somewhere in the suggestion list. If you add in a grammar checker or a Markov-based "what is the likely next word" routine, the success rate should go up considerably. – Mark May 28 '15 at 00:28
  • 5
    Additionally, an anagram calculator can also decipher "uesdnatnrd" and "iprmoatnt" as the first answer (longest single word). "Typoglycemia" could not be solved, since it was obviously not a dictionary word. – March Ho May 28 '15 at 10:49
  • The spell checker isn't built as a CAPTCHA bypass. – Daniel Zahra May 28 '15 at 14:19
  • @Mark I assume Daniel is not criticizing your answer but answering Rybo's question (i.e., since it wasn't purpose-built, it's not perfect, but it's close enough even now). – Joe May 28 '15 at 19:43
  • @Mark, sorry I didn't include the @ for rybo as Joe mentioned :). – Daniel Zahra May 29 '15 at 07:46
9

No. I do not think this is a good alternative to captchas. There are several goals that I think need to be achieved for something to be considered an alternative.


As Seamless As Possible

When designing a captcha or a captcha alternative, it's very easy to lose sight of this goal. You have to keep in mind that we're purposely blocking users from doing what they wanted to do. We're creating a wall. This wall should be as low as possible while still achieving the other goals. If the process for getting through is too much of a hassle, then the user will move on.


Easy To Complete

One of the defining characteristics of captchas is that very little prior knowledge is needed. Even people who know little English can complete them. Asking the user to complete puzzles or answer trivia makes a larger wall for the user to climb. This almost invariably means that some users are going to leave. What's worse is that users who do want to get in might not know the trivia, no matter how easy you think it is. Not to mention that it won't work for many disabilities or disorders such as dyslexia. Popular captchas have a button that offers an audio captcha as an alternative, and I'm not sure how this alternative captcha would be made to complement that.


Blocks Non-Humans

This is the entire reason why captchas exist. It's important to not only consider what computers are able to do today but to also consider what will be possible in a year, five years, and even ten years. While the captcha alternative you are suggesting would have probably worked five to ten years ago or more, that's just not true anymore. This is a problem that computers are not only able to solve, but that they're pretty good at.

While I don't know of any programs built to handle this particular idea, there are many programs available that handle many parts. We have spell check, predictive text, search engines (that will likely be able to answer the trivia more often than humans), and translation software. As a programmer, I don't see this alternative stopping me.


In short, I think this is harder for humans and easier for robots.

MiniRagnarok
8

Would Typocaptcha result in a better or worse user experience when compared to CAPTCHA?

It doesn't really matter. CAPTCHAs are a hurdle for humans. Which is the intent...they are meant to be a hurdle...hopefully one a human can jump but not a computer. But regardless, hurdles are a bad user experience so if there's a way to do it without the hurdle, don't redesign the hurdle, but just get rid of it.

Would your idea of a CAPTCHA be better than the scrambled letters? Or picture versions? Or any of the other types of CAPTCHAs? For some, probably. For others, maybe not (say, anyone who has dyslexia or speaks a different native language).

Plus, it likely is less effective at preventing computers from getting by it as, after all, this is basically what a spell/grammar checker is.

DA01
7

This would be incredibly easy for a computer to determine. The thing is to not go head-to-head with a computer on its own turf. A hacker can go through millions of permutations per second. Any form of l337 speak or other variation would be decoded in a heartbeat. Literally. (If that).

EDIT:

A program will check to see if the word is in a dictionary. If not it will do the typical substitutions. There are 30,000 words in a decent dictionary. If there are 20 variations of each word (much less than that) there would be 600,000 "words" to check. At a million+ guesses a second it would take how long to go through a paragraph? 1 second?
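Working through that estimate with the same rough figures (all three numbers are this answer's assumptions, not measurements), in Python 3:

```python
# Figures taken from the estimate above; all three are rough assumptions.
DICTIONARY_WORDS = 30000
VARIATIONS_PER_WORD = 20
GUESSES_PER_SECOND = 1000000

candidates = DICTIONARY_WORDS * VARIATIONS_PER_WORD
seconds = candidates / GUESSES_PER_SECOND

print(candidates)  # 600000
print(seconds)     # 0.6
```

So even checking every variation of every dictionary word takes well under a second at that rate.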

If questions like "What type of animal was Walt Disney's Dumbo?" are asked, then the human attackers will have to put the answers into the database. Bear in mind that there are millions of people born outside the US and using your website who could not answer that question. That is a major usability issue.

The question "which one is bigger a fox or an ekdjior" would be trivial for any algorithm.

  • The phrase "which one is bigger" is determined to be a comparative.
  • Check dictionaries.
    • A fox is in the dictionary.
    • An ekdjior is not in the dictionary.
  • Therefore a fox is bigger.
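The steps above can be sketched as follows, assuming Python 3. The tiny `DICTIONARY` set and the function name `answer_comparison` are made up for illustration, and the sketch only handles questions of the form "a X or an Y".

```python
import re

# Tiny stand-in for a real word list (an assumption for illustration).
DICTIONARY = {"fox", "elephant", "mouse", "whale"}

def answer_comparison(question):
    """Guess the answer to 'which ... a X or an Y' by elimination.

    Returns the only option found in the dictionary, or None when
    elimination does not single out exactly one option.
    """
    match = re.search(r"\ban? (\w+) or an? (\w+)", question.lower())
    if match is None:
        return None
    known = [word for word in match.groups() if word in DICTIONARY]
    return known[0] if len(known) == 1 else None

print(answer_comparison("which one is bigger a fox or an ekdjior"))  # fox
```

When both options are real words the function returns None, and the bot would need actual knowledge about sizes, or a second attempt.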

Your opponent is not some dumb computer. It's a tool. Your opponent is a team of very smart people who are using a tool to get to a place they want to go.

Mayo
  • A hacker could go through millions of permutations per second, but ultimately they would have one "best guess" submission. – rybo111 May 27 '15 at 22:56
  • @rybo111 And then, if it fails, the bot will try again since a good user interface will allow for human error. As a rule of thumb, image processing is always harder for a computer than text processing. – Sanchises May 29 '15 at 09:48
  • Then show the question as an image. And perhaps have just a few intentional typos. – rybo111 May 29 '15 at 09:49
  • You can also throw in the edit distance. That would take care of the typo problem – Wayne Werner May 29 '15 at 19:09
2

Just to expand on the effect of using a word list, the following Python code prints a summary of how many possibilities there are for each of the words in your example text:

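# Python 2 (uses the print statement and two-argument str.translate).
import collections

# Load the word list, one word per line.
with open('words.txt', 'rb') as infile:
    allwords = set(line.strip() for line in infile)

def key(word):
    # Lowercase the word and strip common punctuation.
    word = word.lower().translate(None, '.,:;-?')
    if not word:
        return None
    # Words match if they share length, first letter, last letter,
    # and the same multiset of inner letters.
    return len(word), word[0], word[-1], ''.join(sorted(word[1:-1]))

# Index every dictionary word by its key.
words_by_letters = collections.defaultdict(set)
for word in allwords:
    words_by_letters[key(word)].add(word)

def findword(word):
    # All dictionary words that the scrambled word could stand for.
    return words_by_letters[key(word)]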

if __name__ == '__main__':
    message = """
    I cdn'uolt blveiee taht I cluod aulaclty uesdnatnrd waht I was rdanieg: the phaonmneel pweor of the hmuan mnid. Aoccdrnig to a rseearch taem at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht the frist and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. Scuh a cdonition is arppoiatrely cllaed Typoglycemia.
    """
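    # Histogram: (number of candidates, how many words had that many).
    print sorted(collections.Counter(len(findword(word)) for word in message.split()).items())
    # List every word that was not resolved to exactly one candidate.
    for word in message.split():
        result = findword(word)
        if len(result) != 1:
            print word, len(result), result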

result (using some word list I found by searching):

[(0, 2), (1, 90), (2, 5)]
cluod 2 set(['could', 'cloud'])
uesdnatnrd 2 set(['unstranded', 'understand'])
mttaer 2 set(['matter', 'mettar'])
frist 2 set(['frist', 'first'])
bcuseae 2 set(['besauce', 'because'])
arppoiatrely 0 set([])
Typoglycemia. 0 set([])

That is to say, only 2 words were not deciphered at all, and one of those is because of a mistake in the original message ("appropriately" contains 3 Ps). 5 words have multiple solutions but probably would not be that difficult to guess with high confidence based on context and the frequencies in English of the alternatives. 90 words (some of them duplicates) were unambiguously identified.

Your three questions put together:

[(0, 4), (1, 21)]
eplthneat? 0 set([])
nverr 0 set([])
Dsi'enys 0 set([])
Dmbuo? 0 set([])

So it's failed on two more typos and two proper nouns.

As others say, the anagrams are considerably harder for non-native English speakers to read. Even a few minutes work makes them not all that much harder for robots to read. Even if it's a good test of English proficiency in humans, it's not a very promising CAPTCHA technique. Answering natural-language questions is way harder than dealing with this amount of obfuscation, so the hard part of solving your CAPTCHAs isn't the novel part ;-)

Of course, you can do more to obfuscate the English (for example by introducing intentional typos of the kind my trivial code failed to solve). But firstly I haven't even tried to use a spellchecker, so a robot can do much better than I have so far, and secondly it's not clear at this point how much of that you can do before even native-English humans fail to solve the puzzle too.

Steve Jessop
  • Introducing incorrect letters might slow the spell-checker down a little, but I think it would get the correct word in most cases. Perhaps a better solution is to create a list of anagrams that can have multiple words (e.g. could / cloud) and ensure they are used in each question, thus confusing the bot somewhat. – rybo111 Jun 02 '15 at 09:19
  • Yeah, spell-checkers are designed to deal with incorrect letters, and they do so better than humans, otherwise they wouldn't exist. – Dmitry Grigoryev Jun 03 '15 at 09:05
2

As someone who speaks English as a second language and lives in an English-speaking country, I couldn't read the text. Plenty of users will be in the same situation, and if a bank used this captcha, they could no longer use their bank accounts. So I think it's more difficult for people to comprehend than for computers.

Amitis
1

http://www.anagram-solver.org/?letters=eplthneat
http://wordsolver.net/solve#!q=eplthneat

Anagrams are no threat to AI.
But if you were to replace, add or remove some of the letters, it would no longer be an anagram, and so would only be decipherable by complex guessing algorithms. The same would be true for humans, though.
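For what it's worth, here is a rough sketch of how such a guess could work when letters have been added or removed, assuming Python 3. It assumes the first and last letters are still intact, and the small `WORDS` list, the function names and the `max_distance` threshold are all made up for illustration.

```python
import collections

# Tiny stand-in for a real word list (an assumption for illustration).
WORDS = ["elephant", "telepath", "element", "eloquent", "fox"]

def letter_distance(a, b):
    """Number of letters that must be added or removed to turn the
    multiset of letters in a into the multiset of letters in b."""
    ca, cb = collections.Counter(a), collections.Counter(b)
    return sum(((ca - cb) + (cb - ca)).values())

def closest_words(scrambled, words, max_distance=3):
    """Rank words sharing the first and last letter of scrambled
    by letter distance, closest first."""
    scored = [
        (letter_distance(scrambled, w), w)
        for w in words
        if w[0] == scrambled[0] and w[-1] == scrambled[-1]
    ]
    return [w for d, w in sorted(scored) if d <= max_distance]

print(closest_words("eplthneat", WORDS))  # ['elephant']
```

Here "eplthneat" has one letter too many to be a true anagram of "elephant", yet it is still the only close match. If the first or last letter is corrupted as well, this particular filter no longer applies and more candidates tie.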

Erquint
  • 11
  • 2
  • 2
    Which is bigger, a fox or a telepath? I have no idea. – rybo111 May 30 '15 at 16:36
  • @rybo111 Good one, but it's just a matter of making two attempts or discarding incomparable guess. – Erquint May 30 '15 at 17:10
  • That 'complex guessing algorithm' is called 'dictionary search' and it's not that complex. Google search bar is quite good at guessing, even if it only has a few letters to go with. – Dmitry Grigoryev Jun 03 '15 at 09:09
  • @DmitryGrigoryev, you misunderstood me. Change one or more letters in an anagram itself. After the word has already been jumbled. For example, "aplthneat" instead of "eplthneat". And yes, again I have to remark, this would probably be too difficult for the human users as well. Show me one example where a corrupted anagram is recognized by an algorithm. – Erquint Sep 10 '16 at 20:46