Research on a CAPTCHA Recognition System Based on Machine Learning
Abstract: A CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a widely used technique for distinguishing human users from automated programs. As networks have developed rapidly, CAPTCHAs have come to play an important role in protecting network security, effectively preventing bots from launching large-scale malicious attacks on web pages. CAPTCHA recognition technology, in turn, helps improve the security of CAPTCHAs. Traditional CAPTCHA recognition removes noise from the image, segments it into individual characters, recognizes each character, and then combines the results. This thesis instead uses a convolutional neural network (CNN) to recognize the CAPTCHA end to end without segmenting the image, which is more efficient and copes better with complex CAPTCHAs. A large number of character CAPTCHA images are generated with Python's captcha library as the data set, and a CNN with three convolutional layers, three pooling layers, and one fully connected layer is trained on them. Two models are trained, one using the sigmoid activation function and one using the softmax activation function. Their training results are compared, the model trained with the better-performing activation function is chosen as the final CAPTCHA recognition model, and its performance is then tested.
Keywords: Python3; CAPTCHA; TensorFlow; NumPy; CNN
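The comparison between the sigmoid and softmax output activations described in the abstract can be illustrated with a small sketch. It is not the thesis's TensorFlow code: the character set (digits only), the CAPTCHA length (4), and all function names are assumptions made for illustration, and the sketch uses only the Python standard library so that it runs on its own. It shows one common way to encode a multi-character label as a flat one-hot vector and to decode network outputs position by position under either activation.

```python
import math

# Assumed for illustration only; this section does not state the
# character set or CAPTCHA length used in the thesis.
CHARSET = "0123456789"
CAPTCHA_LEN = 4


def encode_label(text):
    """Flatten a CAPTCHA string into a CAPTCHA_LEN * len(CHARSET) one-hot vector."""
    vec = [0.0] * (CAPTCHA_LEN * len(CHARSET))
    for pos, ch in enumerate(text):
        vec[pos * len(CHARSET) + CHARSET.index(ch)] = 1.0
    return vec


def sigmoid(logits):
    """Element-wise sigmoid: each output is scored independently in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def softmax(logits):
    """Softmax over one group of logits: the outputs sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def split_positions(flat):
    """Split a flat vector into one group of len(CHARSET) values per character."""
    n = len(CHARSET)
    return [flat[i * n:(i + 1) * n] for i in range(CAPTCHA_LEN)]


def decode(logits, activation):
    """Apply the activation to each character position and pick the top class."""
    chars = []
    for group in split_positions(logits):
        scores = activation(group)
        chars.append(CHARSET[scores.index(max(scores))])
    return "".join(chars)


# Stand-in for network outputs: large logits where the label is 1, small elsewhere.
logits = [8.0 * v - 4.0 for v in encode_label("3071")]
decoded_sigmoid = decode(logits, sigmoid)
decoded_softmax = decode(logits, softmax)
```

Both activations are monotonic within a position, so they select the same character from the same logits. The practical difference lies in training: sigmoid treats every output unit as an independent yes/no decision, whereas softmax makes the classes at one position compete with each other.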
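Contents
1 Introduction
2 Related Work
2.1 History of CAPTCHA Development
2.2 Breaking CAPTCHAs
2.3 Development of Artificial Neural Networks
2.4 Summary
3 CAPTCHA Recognition Based on a Convolutional Neural Network (CNN)
3.1 Theoretical Basis
3.2 Preparing the CAPTCHA Image Data Set for Training
3.3 Image Preprocessing
3.4 Defining the Convolutional Neural Network
3.4.1 First Convolution and Pooling Layer
3.4.2 Second Convolution and Pooling Layer
3.4.3 Third Convolution and Pooling Layer
3.4.4 Fully Connected Layer
3.5 Model Training
3.5.1 Sigmoid Activation Function
3.5.2 Softmax Activation Function
3.6 Model Testing and CAPTCHA Recognition
3.7 Extensible CAPTCHA Recognition
3.8 Summary
4 Experimental Results
4.1 Training and Test Results
4.1.1 Training with the Sigmoid Activation Function
4.1.2 Training with the Softmax Activation Function
4.2 CAPTCHA Recognition Results
4.3 Summary
5 Conclusion
6 Discussion
References
Acknowledgments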