7

Wanting to avoid bots registering on a website, I might use reCAPTCHA by Google. Since the algorithm behind reCAPTCHA is not open source, I'm wondering what user information they actually collect about the person registering on the website.

In my research, I found that they probably look at:

  • IP
  • Loaded resources
  • Whether you have a google account or not
  • Behavior on the page
  • Past history and cookies

That is already a long list. But is it also possible that they look at the information the user entered in the form?

schroeder
Raka
  • Note that Google is not the only entity in the world able to provide CAPTCHA systems. There are many alternatives you can use, some of which are less untrustworthy. You can also use an open source library and host it yourself. – SandRock Sep 08 '23 at 07:47

2 Answers

2

Google's documentation walks you through the process:

Front end: the client loads the JavaScript, which executes (looking at whatever data is available locally) and then sends a token to you.

Server side: you send the user's response token to Google. You can optionally send the user's IP as well.
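As a rough illustration of that server-side step, here is a minimal Python sketch. It assumes the `siteverify` endpoint and the `secret`, `response`, and `remoteip` field names from Google's reCAPTCHA documentation; the helper names `build_siteverify_body` and `verify_token` are hypothetical, and error handling is left out.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Endpoint as given in Google's reCAPTCHA docs (assumption: still current).
SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_siteverify_body(secret: str, token: str,
                          remote_ip: Optional[str] = None) -> bytes:
    """Form-encode the fields your server sends to Google (hypothetical helper)."""
    fields = {"secret": secret, "response": token}
    if remote_ip is not None:
        # Optional field: the user's IP is shared only if you pass it.
        fields["remoteip"] = remote_ip
    return urllib.parse.urlencode(fields).encode("ascii")


def verify_token(secret: str, token: str,
                 remote_ip: Optional[str] = None,
                 timeout: float = 5.0) -> bool:
    """POST the token to Google and return the 'success' flag of the JSON reply."""
    body = build_siteverify_body(secret, token, remote_ip)
    request = urllib.request.Request(SITEVERIFY_URL, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        result = json.loads(reply.read().decode("utf-8"))
    return bool(result.get("success"))
```

Note that in this sketch the form contents never appear in the server-side request: only the secret, the token, and (if you choose) the IP are sent. It says nothing about what the client-side script itself transmits.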

From the documents, nothing is sent to Google itself from the front-end. Please make sure you read the documents from the source.

schroeder
  • "From the documents, nothing is sent to Google itself from the front-end": that is false. I observed an AJAX request with a huge payload (7 KiB) going straight to a Google domain. The payload is obfuscated and it's not easy to find what's in there. – SandRock Aug 30 '23 at 14:59
  • So, is the document false, or are you assuming that the AJAX payload is from the reCAPTCHA? I'm not sure how to interpret your comment or what we are to learn from it. – schroeder Aug 30 '23 at 15:53
  • The document does not contain a description of network calls. But there are network calls involved, obviously. A bad documentation page should not be considered as "the truth". – SandRock Sep 08 '23 at 07:31
  • I think you've misunderstood my statements and missed how you can be helpful. The documentation is the documentation. I never said anything was "the truth". If you have documentation that says something different, then this answer needs to be updated. But I'm not going to change my answer based on documentation to replace it with your undefined, anecdotal "I saw something in an AJAX request". Provide actual details, and I'll update. Just saying "yoUr wRonG!!1!" isn't helping anyone. – schroeder Sep 08 '23 at 07:45
1

They can definitely look at form contents on the page. In fact, keyboard and mouse events are among the signals they use to decide whether the user is a robot or a human.

What they do with this information is up to them. It all depends on how much you trust a company not to snoop when its whole business model is based on snooping.

André Borie