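[PaddlePaddle/PaddleOCR] When evaluating table structure and cell coordinates, the HTML structure prediction acc is 0.999, so why are the precision, recall and other metrics for the detected boxes all 0? After some digging, it looks like the GT bboxes are not being read, which leaves every bbox metric at 0 during eval. How should this be fixed?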

2024-05-10 277 views
Please provide the following information to quickly locate the problem

  • System Environment:
  • Version: Paddle: PaddleOCR: Related components:
  • Command Code:
  • Complete Error Message:
  • My config file:

Global:
  use_gpu: False
  epoch_num: 10
  log_smooth_window: 20
  print_batch_step: 20
  save_model_dir: /Users/pengkang01/Desktop/txt转matrix/PaddleOCR/output/SLANet_ch
  save_epoch_step: 400
  # evaluation is run every 331 iterations after the 0th iteration
  eval_batch_step: [0, 331]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir: ./output/SLANet_ch/infer
  use_visualdl: False
  infer_img: /Users/pengkang01/Desktop/txt转matrix/PaddleOCR/500_table
  # for data or label process
  character_dict_path: /Users/pengkang01/Desktop/txt转matrix/PaddleOCR/ppocr/utils/dict/table_structure_dict_ch.txt
  character_type: en
  max_text_length: &max_text_length 500
  box_format: &box_format xyxyxyxy # 'xywh', 'xyxy', 'xyxyxyxy'
  infer_mode: False
  use_sync_bn: True
  use_sync_bn: False
  save_res_path: output/infer

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  clip_norm: 5.0
  lr:
    learning_rate: 0.001
  regularizer:
    name: 'L2'
    factor: 0.00000

Architecture:
  model_type: table
  algorithm: SLANet
  Backbone:
    name: PPLCNet
    scale: 1.0
    pretrained: True
    use_ssld: True
  Neck:
    name: CSPPAN
    out_channels: 96
  Head:
    name: SLAHead
    hidden_size: 256
    max_text_length: *max_text_length
    loc_reg_num: &loc_reg_num 8

Loss:
  name: SLALoss
  structure_weight: 1.0
  loc_weight: 2.0
  loc_loss: smooth_l1

PostProcess:
  name: TableLabelDecode
  merge_no_span_structure: &merge_no_span_structure True

Metric:
  name: TableMetric
  main_indicator: acc
  compute_bbox_metric: True
  loc_reg_num: *loc_reg_num
  box_format: *box_format
  del_thead_tbody: True

Train:
  dataset:
    name: PubTabDataSet
    data_dir: 500_table/
    label_file_list: [500_table/train.txt]
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: False
      - TableLabelEncode:
          learn_empty_box: True
          merge_no_span_structure: *merge_no_span_structure
          replace_empty_cell_token: False
          loc_reg_num: *loc_reg_num
          max_text_length: *max_text_length
      - TableBoxEncode:
          in_box_format: *box_format
          out_box_format: *box_format
      - ResizeTableImage:
          max_len: 488
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - PaddingTableImage:
          size: [488, 488]
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'image', 'structure', 'bboxes', 'bbox_masks', 'shape' ]
  loader:
    shuffle: True
    batch_size_per_card: 1
    batch_size_per_card: 48
    drop_last: True
    num_workers: 1
    num_workers: 0

Eval:
  dataset:
    name: PubTabDataSet
    data_dir: 500_table/
    label_file_list: [500_table/val.txt]
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: False
      - TableLabelEncode:
          learn_empty_box: True
          merge_no_span_structure: *merge_no_span_structure
          replace_empty_cell_token: False
          loc_reg_num: *loc_reg_num
          max_text_length: *max_text_length
      - TableBoxEncode:
          in_box_format: *box_format
          out_box_format: *box_format
      - ResizeTableImage:
          max_len: 488
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - PaddingTableImage:
          size: [488, 488]
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'image', 'structure', 'bboxes', 'bbox_masks', 'shape' ]
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 48
    num_workers: 1
    batch_size_per_card: 1
    num_workers: 0

Eval result:
[2024/04/29 15:55:18] ppocr INFO: metric eval ***
[2024/04/29 15:55:18] ppocr INFO: acc:0.9999990000010001
[2024/04/29 15:55:18] ppocr INFO: bbox_metric_precision:0.0
[2024/04/29 15:55:18] ppocr INFO: bbox_metric_recall:0
[2024/04/29 15:55:18] ppocr INFO: bbox_metric_hmean:0
[2024/04/29 15:55:18] ppocr INFO: fps:1.2853043121990564

Answers

Check whether the annotation format is correct.
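For reference, a minimal sketch of what such a format check could look like. It assumes PubTabNet-style label files (one JSON object per line, with cell boxes stored under `html.cells[].bbox`) and assumes that `box_format: xyxyxyxy` means 8 coordinates per box. The helper name `check_label_line` and the example line are made up for illustration and are not part of PaddleOCR.

```python
import json


# Hypothetical helper, not part of PaddleOCR: counts the cells in one
# PubTabNet-style label line and how many of them carry a bbox of the
# expected length.
def check_label_line(line, coords_per_box=8):
    info = json.loads(line)
    cells = info["html"]["cells"]
    with_bbox = [c for c in cells if "bbox" in c]
    wrong_length = [c for c in with_bbox if len(c["bbox"]) != coords_per_box]
    return {
        "cells": len(cells),
        "with_bbox": len(with_bbox),
        "wrong_length": len(wrong_length),
    }


# Made-up label line: two cells, one annotated with 8 coordinates, one empty.
example = json.dumps({
    "filename": "demo.jpg",
    "html": {
        "structure": {"tokens": ["<tr>", "<td>", "</td>", "<td>", "</td>", "</tr>"]},
        "cells": [
            {"tokens": ["a"], "bbox": [0, 0, 10, 0, 10, 10, 0, 10]},
            {"tokens": []},
        ],
    },
})
print(check_label_line(example))
```

Running this over every line of the val label file would show whether any cells have a missing bbox or an unexpected number of coordinates.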

The annotation format is fine.

Hi, you could debug the data-loading process to see whether the bboxes are actually being read correctly.
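A rough sketch of that kind of check, under some assumptions: it treats one decoded sample as a dict holding the `bboxes` and `bbox_masks` entries named in `keep_keys`, each given as a list of rows (one row per structure token). The helper `summarize_sample` and the sample values are hypothetical; in a real debugging session the rows would come from the dataset object.

```python
# Hypothetical helper: summarizes the ground-truth boxes of one sample.
# Assumes `sample` is a dict with 'bboxes' and 'bbox_masks', each a list
# of rows (one row per structure token).
def summarize_sample(sample):
    nonzero_boxes = sum(
        1 for box in sample["bboxes"] if any(v != 0 for v in box)
    )
    active_masks = sum(
        1 for mask in sample["bbox_masks"] if any(m != 0 for m in mask)
    )
    return {
        "rows": len(sample["bboxes"]),
        "nonzero_boxes": nonzero_boxes,
        "active_masks": active_masks,
    }


# Made-up sample: three token slots, only the first one has a real box.
sample = {
    "bboxes": [[0.1, 0.1, 0.4, 0.1, 0.4, 0.3, 0.1, 0.3], [0.0] * 8, [0.0] * 8],
    "bbox_masks": [[1.0], [0.0], [0.0]],
}
print(summarize_sample(sample))
```

If `nonzero_boxes` or `active_masks` came out as 0 for every sample, the ground-truth boxes would be getting lost before they reach the metric.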

Hi, they are read correctly, but during eval the predicted bbox coordinates are being matched against bbox_mask for the computation.
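To illustrate why the pairing matters, here is a simplified sketch of box precision, recall and hmean. It is not PaddleOCR's implementation: it assumes axis-aligned `(x1, y1, x2, y2)` boxes, a greedy one-to-one match at an IoU threshold of 0.5, and a per-box mask that decides which ground-truth boxes count. The names `iou` and `bbox_scores` are made up for this example.

```python
def iou(a, b):
    # Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def bbox_scores(pred_boxes, gt_boxes, gt_masks, iou_thresh=0.5):
    # Only ground-truth boxes whose mask is non-zero take part in matching.
    gts = [g for g, m in zip(gt_boxes, gt_masks) if m]
    matched = set()
    hits = 0
    for p in pred_boxes:
        for i, g in enumerate(gts):
            if i not in matched and iou(p, g) >= iou_thresh:
                matched.add(i)
                hits += 1
                break
    precision = hits / len(pred_boxes) if pred_boxes else 0.0
    recall = hits / len(gts) if gts else 0.0
    total = precision + recall
    hmean = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, hmean


preds = [(0, 0, 10, 10), (20, 20, 30, 30)]
gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
print(bbox_scores(preds, gts, [1, 1]))  # every ground-truth box is used
print(bbox_scores(preds, gts, [0, 0]))  # every ground-truth box is masked out
```

In this toy setup, perfect predictions still score zero on all three metrics once no ground-truth box survives the filtering step. That is one way the numbers can all be 0; whether it is what happens in the actual eval code would need to be confirmed in a debugger.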

Has the problem been solved?

No...

Hi, could you share the command you ran? I'll look into it.

What are your paddle and paddleocr versions?

Hi, paddleocr-2.7.4, paddle-2.5.1. Config file:

`Global:
  use_gpu: False
  epoch_num: 300
  log_smooth_window: 20
  print_batch_step: 20
  save_model_dir: ./output/SLANet_ch/613_no_xuanzhuan_padding_LCPAN
  save_epoch_step: 400
  # evaluation is run every 331 iterations after the 0th iteration
  eval_batch_step: [0, 331]
  cal_metric_during_train: True
  pretrained_model:
  checkpoints:
  save_inference_dir: ./output/SLANet_ch/613_no_xuanzhuan/infer/
  use_visualdl: False
  infer_img: ./500_table/
  # for data or label process
  character_dict_path: ppocr/utils/dict/table_structure_dict_ch.txt
  character_type: en
  max_text_length: &max_text_length 500
  box_format: &box_format xyxyxyxy # 'xywh', 'xyxy', 'xyxyxyxy'
  infer_mode: False
  use_sync_bn: True
  use_sync_bn: False
  save_res_path: output/infer

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  clip_norm: 5.0
  lr:
    learning_rate: 0.001
  regularizer:
    name: 'L2'
    factor: 0.00000

Architecture:
  model_type: table
  algorithm: SLANet
  Backbone:
    name: PPLCNet
    scale: 1.0
    pretrained: True
    use_ssld: True
  Neck:
    name: LCPAN
    out_channels: 96
  Head:
    name: SLAHead
    hidden_size: 256
    max_text_length: *max_text_length
    loc_reg_num: &loc_reg_num 8

Loss:
  name: SLALoss
  structure_weight: 1.0
  loc_weight: 2.0
  loc_loss: smooth_l1

PostProcess:
  name: TableLabelDecode
  merge_no_span_structure: &merge_no_span_structure True

Metric:
  name: TableMetric
  main_indicator: acc
  compute_bbox_metric: True
  loc_reg_num: *loc_reg_num
  box_format: *box_format
  del_thead_tbody: True

Train:
  dataset:
    name: PubTabDataSet
    data_dir: 500_table_no_xuanzhuan
    label_file_list: [500_table_no_xuanzhuan_padding/train.txt]
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: False
      - TableLabelEncode:
          learn_empty_box: True
          merge_no_span_structure: *merge_no_span_structure
          replace_empty_cell_token: False
          loc_reg_num: *loc_reg_num
          max_text_length: *max_text_length
      - TableBoxEncode:
          in_box_format: *box_format
          out_box_format: *box_format
      - ResizeTableImage:
          max_len: 488
      - NormalizeImage:
          scale: 1./255.
          mean: [0.93135516, 0.93246497, 0.93411841] #[0.485, 0.456, 0.406]
          std: [0.1713343, 0.17117019, 0.17039258] #[0.229, 0.224, 0.225]
          order: 'hwc'
      - PaddingTableImage:
          size: [488, 488]
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'image', 'structure', 'bboxes', 'bbox_masks', 'shape' ]
  loader:
    shuffle: True
    batch_size_per_card: 4
    batch_size_per_card: 48
    drop_last: True
    num_workers: 1
    num_workers: 0

Eval:
  dataset:
    name: PubTabDataSet
    data_dir: 500_table_no_xuanzhuan/
    label_file_list: [500_table_no_xuanzhuan_padding/val.txt]
    transforms:
      - DecodeImage:
          img_mode: BGR
          channel_first: False
      - TableLabelEncode:
          learn_empty_box: True
          merge_no_span_structure: *merge_no_span_structure
          replace_empty_cell_token: False
          loc_reg_num: *loc_reg_num
          max_text_length: *max_text_length
      - TableBoxEncode:
          in_box_format: *box_format
          out_box_format: *box_format
      - ResizeTableImage:
          max_len: 488
      - NormalizeImage:
          scale: 1./255.
          mean: [0.93135516, 0.93246497, 0.93411841] #[0.485, 0.456, 0.406]
          std: [0.1713343, 0.17117019, 0.17039258] #[0.229, 0.224, 0.225]
          order: 'hwc'
      - PaddingTableImage:
          size: [488, 488]
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'image', 'structure', 'bboxes', 'bbox_masks', 'shape' ]
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 48
    num_workers: 1
    batch_size_per_card: 4
    num_workers: 0 `

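Hi, you could run tools/infer_table.py to visualize the inference results and check whether the model output looks normal. After that we can look into where the box evaluation goes wrong.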

infer_table works fine apart from being inaccurate. The only problem is the three table box metrics in eval, which all come out as 0.

Try switching to the 2.7 branch. If that still doesn't work, I'll try to reproduce it on my side.

Just tried 2.7, and it doesn't work either.

OK, I'll try to reproduce it on my side.