PaddleOCR Training Model Reference

This article covers training on NVIDIA GPUs. CPU training is mentioned for reference only. Where anything differs, consult the official website.

Last updated 3/30/2022 11:00 AM
Dream.Machine
Category: .NET
Tags: .NET, C#, Pattern, Training

Prerequisites

  • Python 3.9: Initial tests with 3.10 kept running into problems, so the author switched to 3.9. If you need 3.10, verify it yourself: https://www.python.org/

  • python: This is the command used to run scripts. It must be reachable through your environment variables (PATH). Many of the components below also need environment variables configured; refer to online resources for details.

  • pip: The author is not a Python developer and understands pip as Python's package installer, used to install third‑party libraries. If pip3 does not work, try pip instead; the author did not determine why this helps.

  • pip network issues: If downloads are slow or fail, use the -i option to point pip at a mirror, e.g., https://pypi.tuna.tsinghua.edu.cn/simple

For example, pip install -i https://pypi.tuna.tsinghua.edu.cn/simple pyspider installs the pyspider library from the Tsinghua mirror.
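
One way to sidestep the pip versus pip3 ambiguity is to call pip through the interpreter you intend to use (python -m pip), which is standard pip usage. The sketch below builds such a command with the mirror from the example above. The helper name pip_install_command is hypothetical, and nothing is installed until the command is passed to subprocess.run.

```python
import sys

# Mirror URL taken from the example above.
TSINGHUA_MIRROR = "https://pypi.tuna.tsinghua.edu.cn/simple"


def pip_install_command(package, index_url=None):
    """Build a pip command bound to the running interpreter (hypothetical helper)."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if index_url:
        # -i selects the package index (mirror) to download from.
        cmd += ["-i", index_url]
    cmd.append(package)
    return cmd


if __name__ == "__main__":
    cmd = pip_install_command("pyspider", TSINGHUA_MIRROR)
    print(" ".join(cmd))
    # To actually install: import subprocess; subprocess.run(cmd, check=True)
```

Because the command starts with sys.executable, the package lands in the same Python installation that runs the script, which matters when several Python versions are installed side by side.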

  • CUDA

https://developer.nvidia.com/cuda-downloads

The author installed version 10.2.

  • cuDNN

https://developer.nvidia.com/cudnn

After downloading, extract the archive and copy its files into the CUDA installation directory.
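
Since several of the steps above depend on environment variables being set correctly, a quick check can save time before moving on. This is a minimal sketch using only the Python standard library. The tool list is an assumption (nvcc ships with the CUDA toolkit), and CUDA_PATH is the variable the Windows CUDA installer normally sets; adjust both to your setup.

```python
import os
import shutil

# Commands this article relies on; nvcc ships with the CUDA toolkit.
REQUIRED_TOOLS = ["python", "pip", "nvcc"]


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(REQUIRED_TOOLS).items():
        print(f"{name:8} {path or 'NOT FOUND on PATH'}")
    # Assumption: a default Windows CUDA install sets CUDA_PATH.
    print("CUDA_PATH =", os.environ.get("CUDA_PATH", "<not set>"))
```

Any entry reported as not found points to a PATH entry that still needs to be added.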

  • PaddleOCR

https://github.com/PaddlePaddle/PaddleOCR

Clone the project locally.

  • Install the Python libraries required by PaddleOCR

cd PaddleOCR
pip3 install -r requirements.txt

  • PPOCRLabel

This is an annotation tool for creating training data – not mandatory, but very convenient.

cd ./PPOCRLabel # Switch to the PPOCRLabel folder
pip install pyqt5 # Install Qt5 runtime environment
pip3 install -r requirements.txt
python PPOCRLabel.py --lang ch # Launch the tool; if nothing happens, the environment is incomplete.
  • ch_ppocr_mobile_v2.0_rec

Pre‑trained model (other models can be found at: models_list.md)

ch_ppocr_mobile_v2.0_rec_pre.tar

  • Training parameter documentation

config.md

Local config file path: PaddleOCR-release-2.4\configs\rec\ch_ppocr_v2.0\rec_chinese_lite_train_v2.0.yml

  • Values to modify:
epoch_num: 1000 # Number of epochs
data_dir: ./train_data/ # Training data directory
label_file_list: ["./train_data/train_list.txt"] # Label file for training data
batch_size_per_card: 128 # Batch size per GPU card (reduce if it fails to start)
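
As an alternative to editing the file, the same values can be passed on the command line with -o, the mechanism the training command in this article already uses for Global.pretrained_model. The sketch below only builds the argument list. The key nesting (Global, Train.dataset, Train.loader) is an assumption based on the stock config layout, so confirm it against your copy of the file. The helper names are hypothetical.

```python
import json

CONFIG = "configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml"

# Assumed key nesting; confirm against your copy of the config file.
OVERRIDES = {
    "Global.epoch_num": 1000,
    "Train.dataset.data_dir": "./train_data/",
    "Train.dataset.label_file_list": ["./train_data/train_list.txt"],
    "Train.loader.batch_size_per_card": 128,
}


def format_override(key, value):
    """Render one key=value pair; lists are written in JSON form, which is valid YAML."""
    if isinstance(value, (list, dict)):
        value = json.dumps(value)
    return f"{key}={value}"


def train_command(config, overrides):
    """Return the tools/train.py argument list with -o overrides appended."""
    cmd = ["python", "tools/train.py", "-c", config, "-o"]
    cmd += [format_override(k, v) for k, v in overrides.items()]
    return cmd


if __name__ == "__main__":
    # Printed for inspection only; the list value would need quoting in a shell.
    print(" ".join(train_command(CONFIG, OVERRIDES)))
```

Passing the list to subprocess.run from the PaddleOCR root avoids shell quoting problems with the bracketed label_file_list value.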

  • Explanation of training directories
PaddleOCR-release-2.4\train_data
PaddleOCR-release-2.4\train_data\crop_img # Place cropped images generated by the annotation tool here
PaddleOCR-release-2.4\train_data\train_list.txt # Training label file
PaddleOCR-release-2.4\train_data\val_list.txt   # Validation label file (the author currently reuses the content of the training file)

PaddleOCR-release-2.4\pretrain_models  # Place pre‑trained models downloaded from the official site here
PaddleOCR-release-2.4\output # Training output directory
PaddleOCR-release-2.4\output\inference # Final exported model
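
In PaddleOCR's recognition data format, as far as the author of this sketch understands it, each label line holds an image path relative to data_dir, a tab character, and the text in the image. Reusing the training file for validation works, but it cannot reveal overfitting. The sketch below shows one way to hold out part of the labels instead. The sample file names, the 20% ratio, and the helper names are all hypothetical.

```python
import random


def split_labels(lines, val_ratio=0.1, seed=0):
    """Shuffle label lines and split them into (train, val) lists."""
    lines = [ln for ln in lines if ln.strip()]
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = lines[:]
    rng.shuffle(shuffled)
    # Hold out at least one line whenever there are two or more.
    n_val = max(1, int(len(shuffled) * val_ratio)) if len(shuffled) > 1 else 0
    return shuffled[n_val:], shuffled[:n_val]


def write_list(path, lines):
    """Write one label per line in UTF-8 so non-ASCII labels survive."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")


if __name__ == "__main__":
    # Hypothetical entries: "<image path relative to data_dir>\t<text>"
    labels = [f"crop_img/sample_{i}.jpg\ttext{i}" for i in range(10)]
    train, val = split_labels(labels, val_ratio=0.2)
    print(len(train), "training lines,", len(val), "validation lines")
    # write_list("train_data/train_list.txt", train)
    # write_list("train_data/val_list.txt", val)
```

The write_list calls are commented out so that running the sketch does not overwrite existing label files.
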
  • Training scripts
# Train the model
python tools/train.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.pretrained_model=./pretrain_models/best_accuracy
# Export the model
python tools/export_model.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.checkpoints=output/rec_chinese_lite_v2.0/latest Global.save_inference_dir=output/inference
# Predict using the trained model (folder of images)
python tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.checkpoints=output/rec_chinese_lite_v2.0/latest Global.load_static_weights=false Global.infer_img=trainTest/
# Predict using the trained model (single image)
python tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_lite_train_v2.0.yml -o Global.checkpoints=output/rec_chinese_lite_v2.0/latest Global.load_static_weights=false Global.infer_img=trainTest/1000.jpg
# Predict using the exported model
python tools/infer/predict_rec.py --image_dir="./trainTest/" --det_model_dir="./ch_PP-OCRv2_det_infer/" --rec_model_dir="./output/inference/" --cls_model_dir="./ch_ppocr_mobile_v2.0_cls_infer/"

Author: Dream.Machine

Website: www.dmskin.com
