An error occurs when training the model on multiple GPUs:
training...
Train 0:   0%|          | 0/5334 [00:20<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 221, in <module>
    train.run()
  File "train.py", line 160, in run
    self.trainer.train(epoch=epoch)
  File "/home/xueruini/onion_rain/pytorch/object_detection/traintest/trainer.py", line 137, in train
    loss, outputs = self.model(inputs, targets)
  File "/home/xueruini/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xueruini/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 461, in forward
    output = self.gather(outputs, self.output_device)
  File "/home/xueruini/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 488, in gather
    return gather(outputs, output_device, dim=self.dim)
  File "/home/xueruini/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 68, in gather
    res = gather_map(outputs)
  File "/home/xueruini/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 63, in gather_map
    return type(out)(map(gather_map, zip(*outputs)))
  File "/home/xueruini/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 55, in gather_map
    return Gather.apply(target_device, dim, *outputs)
  File "/home/xueruini/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/_functions.py", line 54, in forward
    assert all(map(lambda i: i.is_cuda, inputs))
AssertionError

model.forward() called output.cpu() before returning output, which caused the error when the results from the multiple GPUs were merged (gathered).
Remove the .cpu() call.
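
A minimal sketch of the fix, assuming a forward() that returns (loss, outputs) as in the trainer.py call shown in the traceback. The Detector class, its layer sizes, and the random tensors are hypothetical stand-ins, not the original model. The point is only that forward() returns tensors on the device they were computed on, and any move to the CPU happens after the model call has returned:

```python
import torch
import torch.nn as nn


class Detector(nn.Module):
    """Hypothetical stand-in for the real detection model."""

    def __init__(self):
        super().__init__()
        self.head = nn.Linear(8, 4)

    def forward(self, inputs, targets):
        outputs = self.head(inputs)
        loss = nn.functional.mse_loss(outputs, targets)
        # Return the tensors on the device they were computed on.
        # Do not call outputs.cpu() here: the gather step in the
        # traceback asserts that every tensor it receives is a CUDA tensor.
        return loss, outputs


# Falls back to the CPU so the sketch also runs on a machine without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = Detector().to(device)
inputs = torch.randn(2, 8, device=device)
targets = torch.randn(2, 4, device=device)

loss, outputs = model(inputs, targets)

# Move results to the CPU only after the model call has returned,
# for example before logging or evaluation.
outputs_cpu = outputs.detach().cpu()
```

In the multi-GPU case the model would additionally be wrapped in a parallel module. The sketch leaves that out so that it stays runnable on a single device.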