Abstract:
Character recognition on the digital displays of coal mine equipment is a research hotspot in the intelligent construction of coal mines. However, the field still faces two significant problems: first, recognition performance is poor owing to interference factors such as the small effective recognition area, complex underground lighting conditions, and low image quality; second, the constraints of the underground working environment limit the collection of sample data, leaving models with insufficient generalization ability. To address these problems, a character recognition algorithm for coal mine digital displays based on transfer learning with PP-OCRv3 (Practical Ultra-lightweight Optical Character Recognition, version 3) is proposed. First, PP-OCRv3 is adopted as the pre-trained model to strengthen the representation of general text features and enhance the accuracy of character detection and recognition in the complex coal mine environment. Second, driven in turn by a public text recognition dataset, a self-made digital display character dataset, and real and simulated coal mine digital display character datasets, the PP-OCRv3 model is transferred in multiple stages, adaptively shifting from general scenes to the special coal mine scene and thereby improving cross-scene generalization. Experimental verification shows that in the anti-interference test, the average accuracy of the transfer-optimized model reaches 78.83% (an increase of 17.29%); the improvement is especially significant in the interference-occlusion scenario, reaching 79.73% (an increase of 29.32%). The real-time evaluation shows that the average inference frame rate increases by 27.295 fps, with a gain of up to 57.67 fps in the blurred scenario.
After multi-stage transfer, the PP-OCRv3 model effectively reduces dependence on labeled data samples while achieving higher recognition accuracy and faster recognition speed than the comparison models.