The first question that comes to mind when you hear about RPA and bots is: how do they manage to solve captchas?
MSPs who work in automation ask this question all the time.
There is probably no technology more polarizing: it was created to make life less stressful and companies more efficient, yet many people are ready to boycott its benefits.
What is a captcha?
A captcha is a tool that helps distinguish a human user from a software bot online. It is a challenge-response system that asks end users to perform a task that a software bot cannot do. If the user completes the task correctly, the service accepts that the user is a human being and not a spam bot, and lets the user continue.
Unfortunately, captchas have no way to tell good bots from malicious ones. Because of that, good bots will have to keep dealing with them for a long time.
In case you didn't know, the word captcha is an acronym for ‘Completely Automated Public Turing test to tell Computers and Humans Apart’.
Why do companies use captchas?
Captchas are often recommended to protect sites from malicious bot activity. Here are six common reasons to use them:
- To protect the integrity of online polls by stopping hackers from using robots to send in repeated false responses.
- To stop brute force attacks on online accounts in which hackers repeatedly try to log in using hundreds of different passwords.
- To prevent hackers from signing up for multiple email accounts that they’ll then go on to use for nefarious purposes.
- To stop cyber criminals spamming blogs or news content pages with dodgy comments and links to other websites.
- To prevent ticket hoarders from using robots to bulk buy tickets for shows and gigs.
- To make online shopping more secure.
Different types of captchas
The most common type of captcha is the text captcha, which shows the user an image of distorted text, usually a string of alphanumeric characters, and asks them to type the characters into an attached form.
Another typical captcha uses picture recognition by asking users to identify a subset of images within a larger set of images. For instance, the user may be given a set of pictures and asked to click on all the ones with cars, buses or street signs.
Less common captchas are based on math operations or on question and answer. The first generates a random addition, subtraction, or multiplication equation for the user to solve, and the second asks a question that the site owner configures.
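To make the math-based variety concrete, here is a minimal sketch of how a site might generate and check such a captcha. The function names, the wording of the question, and the 1 to 9 operand range are assumptions made for illustration, not the behavior of any particular captcha product.

```python
import operator
import random

# The three operations mentioned in the text.
OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def make_math_captcha(rng=random):
    """Return (question_text, expected_answer) for a random equation."""
    symbol = rng.choice(sorted(OPERATIONS))
    left, right = rng.randint(1, 9), rng.randint(1, 9)
    return f"What is {left} {symbol} {right}?", OPERATIONS[symbol](left, right)


def check_answer(expected, submitted):
    """Accept the submission only if it parses to the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False  # not a number at all
```

The site would keep the expected answer on the server and show only the question text to the user.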
Text captchas are being replaced by more sophisticated captchas like reCAPTCHA and hCaptcha.
reCAPTCHA is free and provided by Google. There are three types:
- v2 Checkbox reCAPTCHA
In the classic version of reCAPTCHA, the user must select a checkbox labeled “I’m not a robot” to confirm that they’re human. In some cases, v2 Checkbox reCAPTCHA will ask users to answer image-based questions.
- v2 Invisible reCAPTCHA
- v3 reCAPTCHA
Unlike the v2 versions, v3 reCAPTCHA never displays image-based questions. Instead, it runs entirely in the background.
To avoid asking for user interaction, Google monitors the user’s behavior on your site to look for what it considers suspicious activity. reCAPTCHA then assigns the user a score, and the site owner sets the minimum score a user needs to submit a form.
If a user’s reCAPTCHA score does not meet the requirements, they won’t be able to continue with the next steps.
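On the website's side, the score check could look roughly like the sketch below. It assumes the server has already sent the user's token to Google's verification endpoint and parsed the JSON reply into a dictionary with `success` and `score` fields. The function name `passes_recaptcha` and the 0.5 threshold are illustrative choices, not fixed values.

```python
def passes_recaptcha(verify_reply, min_score=0.5):
    """Decide whether a parsed reCAPTCHA v3 verification reply is acceptable.

    verify_reply: dict parsed from the verification JSON (assumed to carry
    "success" and "score"). min_score: threshold chosen by the site owner.
    """
    if not verify_reply.get("success", False):
        return False  # the token itself was rejected
    # Scores run from 0.0 (likely a bot) to 1.0 (likely a human).
    return verify_reply.get("score", 0.0) >= min_score
```

A stricter site would raise `min_score`, at the cost of turning away more legitimate users and good bots.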
hCaptcha is a free reCAPTCHA alternative that is very similar to reCAPTCHA v2: it offers a checkbox-style captcha where users must check a box labeled “I am human” to prove that they’re legitimate.
Based on the user’s activity and your difficulty settings, hCaptcha may also sometimes ask users to answer image-based questions to confirm that they aren’t spambots.
How to deal with captchas?
So now that we understand what captchas are and their essential role in preventing hackers from attacking with bots, let's see how we can solve them so that the “good” bots can do their job.
There are several options and strategies to solve captchas. We’ll look at a couple of them that will cover most captcha cases seen in Latin America.
For text-based captchas that are simple enough, we can use an OCR (Optical Character Recognition) approach. It is not perfect, but it can solve the captcha within a couple of attempts. You can use regular OCR engines like Google Cloud or Microsoft Cloud OCR, or you can use Python code with purpose-built libraries that combine OCR and machine learning to recognize the text in the image.
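Since OCR may need more than one try, the bot's logic can be sketched as a small retry loop. The three callables below (`fetch_image`, `recognize`, `submit`) are hypothetical hooks that you would write for your own site and OCR engine. For example, `recognize` could wrap a call such as `pytesseract.image_to_string`, assuming that library is installed.

```python
import re


def clean_ocr_text(raw):
    """Keep only letters and digits; OCR output often has stray spaces or symbols."""
    return re.sub(r"[^A-Za-z0-9]", "", raw)


def solve_text_captcha(fetch_image, recognize, submit, max_attempts=3):
    """Try OCR on a fresh captcha image until one guess is accepted.

    fetch_image, recognize and submit are hypothetical callables supplied by
    the caller: fetch_image() returns the image, recognize(image) returns the
    raw OCR text, submit(guess) returns True when the site accepts the guess.
    Returns the accepted guess, or None if every attempt fails.
    """
    for _ in range(max_attempts):
        guess = clean_ocr_text(recognize(fetch_image()))
        if guess and submit(guess):  # skip empty reads without submitting
            return guess
    return None
```

Requesting a fresh image on every attempt matters because most sites invalidate a captcha after a wrong answer.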
For more complex captchas, like reCaptcha and hCaptcha, the best way is to use a third-party Captcha solver service capable of solving several types of captchas. In some cases, solving captchas using AI and Machine Learning is so complex and expensive that this kind of service often uses humans to solve them. Yes, you got it right! Humans are working for bots! It sounds crazy, but it is real.
Several Captcha Solver services are available on the web, like 2captcha.com or deathbycaptcha.com. These services can solve different types of captchas and work through API calls.
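Because a human or a model needs some time to solve each captcha, working with such a service usually means submitting a task and then polling for the answer. The sketch below shows that pattern under stated assumptions: `submit_task` and `fetch_result` are hypothetical wrappers around the HTTP calls of whichever service you choose, and the interval and timeout values are placeholders. Check the service's own API documentation for the real endpoints and parameters.

```python
import time


def solve_with_service(submit_task, fetch_result, poll_interval=5.0,
                       timeout=120.0, sleep=time.sleep, clock=time.monotonic):
    """Submit a captcha to a solver service and poll until a solution arrives.

    submit_task() and fetch_result(task_id) are hypothetical wrappers around
    the HTTP calls of whichever service is used: submit_task() returns a task
    id, fetch_result(task_id) returns the solution or None while pending.
    """
    task_id = submit_task()
    deadline = clock() + timeout
    while clock() < deadline:
        solution = fetch_result(task_id)
        if solution is not None:
            return solution
        sleep(poll_interval)  # give the service time to solve the captcha
    raise TimeoutError(f"captcha task {task_id} not solved within {timeout}s")
```

The bot then places the returned solution into the page, for example into the form field the captcha expects, and continues with its task.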
If you are already an ElectroNeek partner, ElectroNeek provides an Anti-Captcha feature that will solve reCaptcha v2 automatically when it appears on the web.
Captchas are meant to protect websites from malicious activity done by programs or bots. There are different types of captchas, and they are constantly evolving and becoming more sophisticated.
ElectroNeek offers a variety of strategies for handling captchas: the native Anti-Captcha feature for reCAPTCHA v2, the OCR approach, code execution, and third-party services called through an API.