Environment:
paddle.utils.run_check()
Running verify PaddlePaddle program ...
W0213 15:33:14.557688 1583214 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.0, Runtime API Version: 11.2
W0213 15:33:14.559726 1583214 gpu_resources.cc:91] device: 0, cuDNN Version: 8.4.
PaddlePaddle works well on 1 GPU.
PaddlePaddle works well on 1 GPUs.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
Exporting the model fails with the following error log:
Process Process-9:
Traceback (most recent call last):
  File "/home/zksc/anaconda3/envs/PaddleX/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/zksc/anaconda3/envs/PaddleX/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/zksc/duanzhiqiang/code/easy_deploy/paddlex_restful/restful/project/operate.py", line 371, in _call_paddlex_export_infer
    p_export(paddleclas_yaml_path, pretrained_model, save_inference_dir)
  File "/home/zksc/duanzhiqiang/code/easy_deploy/PaddleClas/tools/export_model.py", line 46, in p_export
    engine = Engine(p_config, mode="export")
  File "/home/zksc/duanzhiqiang/code/easy_deploy/PaddleClas/ppcls/engine/engine.py", line 192, in __init__
    self.model = build_model(self.config, self.mode)
  File "/home/zksc/duanzhiqiang/code/easy_deploy/PaddleClas/ppcls/arch/__init__.py", line 40, in build_model
    arch = getattr(mod, model_type)(arch_config)
  File "/home/zksc/duanzhiqiang/code/easy_deploy/PaddleClas/ppcls/arch/backbone/legendary_models/mobilenet_v1.py", line 257, in MobileNetV1
    **kwargs)
  File "/home/zksc/duanzhiqiang/code/easy_deploy/PaddleClas/ppcls/arch/backbone/legendary_models/mobilenet_v1.py", line 125, in __init__
    padding=1)
  File "/home/zksc/duanzhiqiang/code/easy_deploy/PaddleClas/ppcls/arch/backbone/legendary_models/mobilenet_v1.py", line 64, in __init__
    bias_attr=False)
  File "/home/zksc/anaconda3/envs/PaddleX/lib/python3.7/site-packages/paddle/nn/layer/conv.py", line 700, in __init__
    data_format=data_format,
  File "/home/zksc/anaconda3/envs/PaddleX/lib/python3.7/site-packages/paddle/nn/layer/conv.py", line 160, in __init__
    default_initializer=_get_default_param_initializer(),
  File "/home/zksc/anaconda3/envs/PaddleX/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 447, in create_parameter
    default_initializer)
  File "/home/zksc/anaconda3/envs/PaddleX/lib/python3.7/site-packages/paddle/fluid/layer_helper_base.py", line 379, in create_parameter
    **attr._to_kwargs(with_initializer=True))
  File "/home/zksc/anaconda3/envs/PaddleX/lib/python3.7/site-packages/paddle/fluid/framework.py", line 3965, in create_parameter
    initializer(param, self)
  File "/home/zksc/anaconda3/envs/PaddleX/lib/python3.7/site-packages/paddle/fluid/initializer.py", line 56, in __call__
    return self.forward(param, block)
  File "/home/zksc/anaconda3/envs/PaddleX/lib/python3.7/site-packages/paddle/fluid/initializer.py", line 803, in forward
    place)
OSError: (External) CUDA error(3), initialization error.
  [Hint: Please search for the error code(3) on website (https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038) to get Nvidia's official solution and advice about CUDA Error.] (at /paddle/paddle/phi/backends/gpu/cuda/cuda_info.cc:243)
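Judging from the log, the error is raised when a child process started via multiprocessing calls the fluid.initializer initialization function. The failure is intermittent: it sometimes occurs when running clas, seg, and det tasks. I have read through many issues, but none of them gives a solution. How should this problem be solved?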
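
One possibility worth ruling out, offered as an assumption rather than a confirmed diagnosis: on Linux, Python 3.7 starts child processes with fork() by default, and a forked child cannot safely reuse a CUDA context that the parent process has already initialized. If the parent service touches the GPU before launching the export process, that could produce an initialization error like the one above. The sketch below creates the export process with the "spawn" start method instead. `export_worker` and `make_export_process` are hypothetical names for illustration, not part of PaddleX or PaddleClas; in real code the worker would call the export function shown in the traceback.

```python
import multiprocessing


def export_worker(config_path, weights_path, save_dir):
    # Hypothetical stand-in for the real export call (p_export in the traceback).
    # Importing paddle inside this function, rather than at module level,
    # means CUDA is first initialized in the child process itself.
    print("exporting", config_path, weights_path, save_dir)


def make_export_process(config_path, weights_path, save_dir):
    # "spawn" launches a fresh Python interpreter instead of fork()-ing the
    # parent, so the child does not inherit any CUDA state from the parent.
    ctx = multiprocessing.get_context("spawn")
    return ctx.Process(
        target=export_worker,
        args=(config_path, weights_path, save_dir),
    )
```

Calling `start()` and then `join()` on the returned process runs the export. With "spawn", the worker and its arguments must be picklable, and the launching script needs an `if __name__ == "__main__":` guard. Whether this removes the intermittent failure in this particular setup is untested.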