Text-based CAPTCHA Vulnerability Assessment using a Deep Learning-based Solver

Abstract:

This work tests the security offered by text-based CAPTCHAs. We present different types of CAPTCHAs, together with a preprocessing and segmentation process that removes noise from CAPTCHA images and crops each digit or character into its own image. We present a convolutional neural network architecture trained under several hyperparameter configurations, comparing multiple models with different batch sizes, numbers of epochs, and optimizers. We confirm that text-based CAPTCHAs are no longer a secure protection mechanism, because they can be broken with simple computer vision techniques and current machine learning algorithms. Our model achieved 90.49% accuracy when trained on a mix of four datasets and up to 97.10% when trained on a single dataset, which is enough to consider these schemes insecure in practice.
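
As an illustration of the kind of preprocessing and segmentation step described above, the following is a minimal sketch and not the pipeline used in this work. It assumes dark characters on a light background, characters that do not touch each other, and noise made of isolated pixels. All function names and the threshold value are hypothetical choices made for this example.

```python
from typing import List, Tuple

# Grayscale image as rows of pixel values: 0 = black ink, 255 = white background.
Image = List[List[int]]


def binarize(image: Image, threshold: int = 128) -> Image:
    """Map each pixel to 1 (ink) if darker than the threshold, else 0."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def remove_isolated_pixels(binary: Image) -> Image:
    """Clear ink pixels with no ink among their 8 neighbours (speckle noise)."""
    h, w = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 0:
                continue
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned


def segment_columns(binary: Image) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, of runs containing ink."""
    w = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(w)]
    spans: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x  # a new character begins at this column
        elif not ink and start is not None:
            spans.append((start, x))  # a blank column ends the character
            start = None
    if start is not None:
        spans.append((start, w))
    return spans


def crop_characters(image: Image) -> List[Image]:
    """Binarize, denoise, and crop one sub-image per detected character."""
    binary = remove_isolated_pixels(binarize(image))
    return [[row[a:b] for row in binary] for a, b in segment_columns(binary)]
```

Each cropped sub-image could then be resized to a fixed shape and passed to a character classifier such as a convolutional neural network. Column-projection segmentation of this kind fails when characters overlap or are joined by noise lines, which is why real CAPTCHA schemes usually require more elaborate cleaning.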