[PaddlePaddle/PaddleOCR] Problem using custom models for table recognition

2024-07-08 696 views
5
  • System Environment: Ubuntu 16

  • Version: Paddle 2.3.2-GPU, PaddleOCR 2.6. Related components:

  • Command Code:

  • !python ppstructure/table/predict_table.py \
        --det_model_dir=/data/lcpang/lc/project_table/model/infer_model/det_sast_ic15 \
        --rec_model_dir=/data/lcpang/lc/project_table/model/infer_model/rec_r31_sar_infer \
        --table_model_dir=/data/lcpang/lc/project_table/model/infer_model/table_master \
        --rec_char_dict_path=ppocr/utils/dict90.txt \
        --table_char_dict_path=ppocr/utils/dict/table_master_structure_dict.txt \
        --image_dir=/data/lcpang/lc/project_table/dataset/pubtabnet_icdar_data/imgs/val/PMC1501036_002_00.png \
        --output=output/table

  • Complete Error Message:

    [2022/09/26 21:57:33] ppocr INFO: [0/1] /data/lcpang/lc/project_table/dataset/pubtabnet_icdar_data/imgs/val/PMC1501036_002_00.png
    Traceback (most recent call last):
      File "ppstructure/table/predict_table.py", line 254, in <module>
        main(args)
      File "ppstructure/table/predict_table.py", line 206, in main
        pred_res, _ = table_sys(img)
      File "ppstructure/table/predict_table.py", line 103, in __call__
        structure_res, elapse = self._structure(copy.deepcopy(img))
      File "ppstructure/table/predict_table.py", line 132, in _structure
        structure_res, elapse = self.table_structurer(copy.deepcopy(img))
      File "/data/lcpang/lc/PaddleOCR-2.6.0/ppstructure/table/predict_structure.py", line 103, in __call__
        self.predictor.run()
    ValueError: In user code:

    File "tools/export_model.py", line 255, in <module>
      main()
    File "tools/export_model.py", line 251, in main
      model, arch_config, save_path, logger, input_shape=input_shape)
    File "tools/export_model.py", line 171, in export_single_model
      paddle.jit.save(model, save_path)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/jit.py", line 631, in wrapper
      func(layer, path, input_spec, **configs)
    File "/data/lcpang/.local/lib/python3.7/site-packages/decorator.py", line 232, in fun
      return caller(func, *(extras + args), **kw)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
      return wrapped_func(*args, **kwargs)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/base.py", line 51, in __impl__
      return func(*args, **kwargs)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/jit.py", line 861, in save
      inner_input_spec, with_hook=with_hook)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 528, in concrete_program_specify_input_spec
      *desired_input_spec, with_hook=with_hook)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 436, in get_concrete_program
      concrete_program, partial_program_layer = self._program_cache[cache_key]
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 801, in __getitem__
      self._caches[item_id] = self._build_once(item)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 790, in _build_once
      **cache_key.kwargs)
    File "/data/lcpang/.local/lib/python3.7/site-packages/decorator.py", line 232, in fun
      return caller(func, *(extras + args), **kw)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
      return wrapped_func(*args, **kwargs)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/base.py", line 51, in __impl__
      return func(*args, **kwargs)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/program_translator.py", line 733, in from_func_spec
      outputs = static_func(*inputs)
    File "/tmp/tmpsjo5nkzi.py", line 28, in forward
      false_fn_1, (x,), (x,), (x,))
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 211, in convert_ifelse
      out = _run_py_ifelse(pred, true_fn, false_fn, true_args, false_args)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 257, in _run_py_ifelse
      return true_fn(*true_args) if pred else false_fn(*false_args)
    File "/data/lcpang/lc/PaddleOCR-2.6.0/ppocr/modeling/architectures/base_model.py", line 86, in forward
      x = self.backbone(x)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
      return self._dygraph_call_func(*inputs, **kwargs)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
      outputs = self.forward(*inputs, **kwargs)
    File "/data/lcpang/lc/PaddleOCR-2.6.0/ppocr/modeling/backbones/table_master_resnet.py", line 214, in forward
      x = self.layer2(x)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
      return self._dygraph_call_func(*inputs, **kwargs)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
      outputs = self.forward(*inputs, **kwargs)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/container.py", line 98, in forward
      input = layer(input)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
      return self._dygraph_call_func(*inputs, **kwargs)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
      outputs = self.forward(*inputs, **kwargs)
    File "/tmp/tmprh5qbiop.py", line 21, in forward
      true_fn_11, false_fn_11, (out,), (out,), (out,))
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 211, in convert_ifelse
      out = _run_py_ifelse(pred, true_fn, false_fn, true_args, false_args)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 257, in _run_py_ifelse
      return true_fn(*true_args) if pred else false_fn(*false_args)
    File "/data/lcpang/lc/PaddleOCR-2.6.0/ppocr/modeling/backbones/table_master_resnet.py", line 73, in forward
      out = self.context_block(out)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 930, in __call__
      return self._dygraph_call_func(*inputs, **kwargs)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 915, in _dygraph_call_func
      outputs = self.forward(*inputs, **kwargs)
    File "/data/lcpang/lc/PaddleOCR-2.6.0/ppocr/modeling/backbones/table_master_resnet.py", line 343, in forward
      context = self.spatial_pool(x)
    File "/tmp/tmp7fpv9x8x.py", line 59, in spatial_pool
      ,), (context, x))
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 211, in convert_ifelse
      out = _run_py_ifelse(pred, true_fn, false_fn, true_args, false_args)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/dygraph/dygraph_to_static/convert_operators.py", line 257, in _run_py_ifelse
      return true_fn(*true_args) if pred else false_fn(*false_args)
    File "/data/lcpang/lc/PaddleOCR-2.6.0/ppocr/modeling/backbones/table_master_resnet.py", line 300, in spatial_pool
      batch * self.headers, self.single_header_inplanes, height, width
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/tensor/manipulation.py", line 2139, in reshape
      return paddle.fluid.layers.reshape(x=x, shape=shape, name=name)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/layers/nn.py", line 6450, in reshape
      "XShape": x_shape})
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 44, in append_op
      return self.main_program.current_block().append_op(*args, **kwargs)
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3621, in append_op
      attrs=kwargs.get("attrs", None))
    File "/data/lcpang/anaconda3/envs/ppstr/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2635, in __init__
      for frame in traceback.extract_stack():

    InvalidArgumentError: The 'shape' attribute in ReshapeOp is invalid. The input tensor X'size must be divisible by known capacity of 'shape'. But received X's shape = [1, 256, 122, 122], X's size = 3810304, 'shape' is [-1, 256, 120, 120], known capacity of 'shape' is -3686400. [Hint: Expected output_shape[unk_dim_idx] capacity == -in_size, but received output_shape[unk_dim_idx] capacity:-3686400 != -in_size:-3810304.] (at /paddle/paddle/fluid/operators/reshape_op.cc:190) [operator < reshape2 > error]
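
The numbers in this error can be checked by hand: a reshape with one -1 dimension only works when the input's element count is divisible by the product of the known dimensions. A minimal sketch in plain Python follows; the helper name `infer_unknown_dim` is hypothetical and only mirrors the divisibility rule quoted in the message, not Paddle's implementation.

```python
from math import prod


def infer_unknown_dim(in_shape, target_shape):
    """Return the size the single -1 in target_shape would take, or None.

    Hypothetical helper for illustration only (not a Paddle API).
    """
    if target_shape.count(-1) != 1:
        raise ValueError("expected exactly one -1 in target_shape")
    in_size = prod(in_shape)
    known = prod(d for d in target_shape if d != -1)
    if in_size % known != 0:
        return None  # reshape is impossible, as in the error above
    return in_size // known


in_shape = [1, 256, 122, 122]
print(prod(in_shape))                                    # 3810304
print(infer_unknown_dim(in_shape, [-1, 256, 120, 120]))  # None
print(infer_unknown_dim(in_shape, [-1, 256, 122, 122]))  # 1
```

So the exported graph has a 120 x 120 feature map baked in, while the image fed at inference time produced 122 x 122. If the feature map at this layer is 1/4 of the input side (an assumption about the backbone), 120 corresponds to a 480-pixel input and 122 to a 488-pixel one.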

Answers

3

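The detection, recognition, and structure models recommended on the official site (en_ppocr_mobile_v2.0_table_det_infer, en_ppocr_mobile_v2.0_table_rec_infer, en_ppstructure_mobile_v2.0_SLANet_infer) work together for inference and evaluation on the pubtabnet table images; no problem there. My custom detection, recognition, and structure models can each be trained, exported to an inference model, and run successfully on a single image on their own. The trouble starts when I try to combine them to output one complete HTML string with cell contents, which is what predict_table in the command above does. This has been bothering me for several days, and I would appreciate any help. Thanks!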

2

What size is your input image? I see you are using TableMaster; that model requires 480*480 images.
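
One common way to reach a fixed 480*480 input without distorting the table is to scale the longest side to 480 and pad the remainder. Whether PaddleOCR's own preprocessing does exactly this is not confirmed in this thread, so treat the following as a sketch; `resize_and_pad_plan` is a hypothetical helper that only computes sizes, using the 106 x 503 image reported later in the thread as input.

```python
def resize_and_pad_plan(height, width, max_len=480):
    """Plan a longest-side resize to max_len, then padding to a square.

    Hypothetical helper: it computes sizes only and touches no pixels.
    Integer arithmetic avoids float rounding surprises.
    """
    longest = max(height, width)
    new_h = height * max_len // longest
    new_w = width * max_len // longest
    return {
        "resized": (new_h, new_w),
        "pad_bottom": max_len - new_h,
        "pad_right": max_len - new_w,
    }


plan = resize_and_pad_plan(106, 503)
print(plan)  # {'resized': (101, 480), 'pad_bottom': 379, 'pad_right': 0}
```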

5

The inference image has shape (106, 503, 3). I don't know where to set the size. If I follow the standalone TableMaster structure prediction and add --table_max_len=480, it still fails:

    [2022/09/26 23:02:36] ppocr INFO: [0/1] /data/lcpang/lc/project_table/dataset/pubtabnet_icdar_data/imgs/val/PMC1501036_002_00.png
    Traceback (most recent call last):
      File "ppstructure/table/predict_table.py", line 254, in <module>
        main(args)
      File "ppstructure/table/predict_table.py", line 206, in main
        pred_res, _ = table_sys(img)
      File "ppstructure/table/predict_table.py", line 103, in __call__
        structure_res, elapse = self._structure(copy.deepcopy(img))
      File "ppstructure/table/predict_table.py", line 132, in _structure
        structure_res, elapse = self.table_structurer(copy.deepcopy(img))
      File "/data/lcpang/lc/PaddleOCR-2.6.0/ppstructure/table/predict_structure.py", line 114, in __call__
        post_result = self.postprocess_op(preds, [shape_list])
      File "/data/lcpang/lc/PaddleOCR-2.6.0/ppocr/postprocess/table_postprocess.py", line 56, in __call__
        result = self.decode(structure_probs, bbox_preds, shape_list)
      File "/data/lcpang/lc/PaddleOCR-2.6.0/ppocr/postprocess/table_postprocess.py", line 85, in decode
        text = self.character[char_idx]
    IndexError: list index out of range
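
An IndexError at `text = self.character[char_idx]` means the model predicted a class index larger than the token list the decoder holds, which usually points to a mismatch between the model and the dictionary or decoder used at inference. A minimal sketch of a guard that makes the mismatch explicit; `decode_indices` and the toy dictionary are hypothetical and are not PaddleOCR code.

```python
def decode_indices(char_indices, characters):
    """Map predicted class indices to tokens, failing with a clear message.

    Hypothetical stand-in for the lookup that raised the IndexError above.
    """
    worst = max(char_indices, default=-1)
    if worst >= len(characters):
        raise ValueError(
            f"model predicted class {worst}, but the dictionary has only "
            f"{len(characters)} entries; dictionary or decoder mismatch"
        )
    return [characters[i] for i in char_indices]


# Toy dictionary; a real one would be read from --table_char_dict_path.
toy_dict = ["<thead>", "<tr>", "<td>", "</td>", "</tr>"]
print(decode_indices([1, 2, 3, 4], toy_dict))  # ['<tr>', '<td>', '</td>', '</tr>']
```

One thing worth checking, as an assumption not confirmed in this thread: the TableMaster structure-prediction example in the PaddleOCR docs also passes --table_algorithm=TableMaster, which the predict_table command above omits. Without it, a different decoder and dictionary layout may be in use.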

4

Take a look at the TableMaster documentation.

7

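I have read it many times and still can't find which parameters need to be adjusted for predict_table; the TableMaster doc only gives reference parameters for predict_structure. The table recognition doc has a model_list for reference ("PP-Structure currently provides table recognition models for two languages, Chinese and English"), but it does not include the detection, recognition, and structure algorithms I want to use, which I chose because they can be trained on my own dataset. Alternatively, for the models in the PP-Structure table recognition model list, I don't see any corresponding config files, so how would I train them on my own dataset? Any guidance would be appreciated. Thanks!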

6

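Another question: when I ran python3 ppstructure/table/eval_table.py on pubtabnet_examples (20 images), the TEDS score was very good at first, 0.9935536326691713. I then took 20 other images and renamed them to the 20 file names listed in gt.txt, so the file names no longer matched the image contents. Evaluating again still gave a very high score, 0.9945262718172962. What is going on?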

6

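"Another question: when I ran python3 ppstructure/table/eval_table.py on pubtabnet_examples (20 images), the TEDS score was very good at first, 0.9935536326691713. I then took 20 other images and renamed them to the 20 file names listed in gt.txt, so the file names no longer matched the image contents. Evaluating again still gave a very high score, 0.9945262718172962. What is going on?"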

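The first evaluation caches the model output for each image, so later runs that use the same file names reuse those cached results.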
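
To see why a file-name-keyed cache would produce this behaviour, here is a minimal sketch. The function, the callables, and the dict are all hypothetical illustrations, not eval_table.py's real interface.

```python
def evaluate(image_names, load_image, predict, cache):
    """Return predictions per image name, reusing cached ones.

    Hypothetical sketch of a cache keyed only by file name.
    """
    results = {}
    for name in image_names:
        if name not in cache:        # first run: compute and remember
            cache[name] = predict(load_image(name))
        results[name] = cache[name]  # later runs: pixels are never read
    return results


cache = {}
first = evaluate(["a.png"], lambda n: "original pixels", str.upper, cache)
# Same file name, different content: the stale prediction comes back.
second = evaluate(["a.png"], lambda n: "swapped pixels", str.upper, cache)
print(first == second)  # True
```

If this is what happened, clearing whatever cached results the script wrote before re-running should force fresh predictions for the renamed images.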

6

Is it true that TableMaster cannot output the cell contents of a table, and can only output structure information and bbox coordinates?

2

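Or, put differently: if I want to predict an HTML file with cell contents from a table image, can I only run predict_table with the specific detection, recognition, and structure models listed in the model_list of the PP-Structure doc? Is a combination such as pse, sar, and tablemaster unusable? The doc says: "PP-Structure currently provides table recognition models for two languages, Chinese and English; for model links see models_list. A whl package is also provided for quick use; see quickstart."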

8

Other models can be swapped in directly. If you hit an error, just make a targeted fix for it.

9

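"Is it true that TableMaster cannot output the cell contents of a table, and can only output structure information and bbox coordinates?"

TableMaster can also output structure and coordinates.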

1

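Did you ever solve this? I have recently been trying to replace the table structure prediction part with TableMaster as well, and it still fails after adding table_max_len=480.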