Configuring a TensorFlow environment

    Tech  2022-07-12

    I recently installed Ubuntu 20.04 on a physical machine and got nvidia-driver 435 + CUDA 10.0 + cuDNN v7.6.5 for CUDA 10.0 working on it.

    Installing CUDA and cuDNN

    See this post: https://blog.csdn.net/ashome123/article/details/105822040

    Errors encountered at runtime

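    File not found, or cannot open a file such as libcudnn.so.10.1: for this kind of problem, creating a symbolic link is enough:

        sudo ln -s libcudnn.so.10 libcudnn.so.10.1

    Note that the link must be created in the place where the file is actually looked up. That is usually the lib directory of the virtual environment (the error message names the path), or the lib directory inside the tensorflow package. Check each case individually.

    "out of memory", "cannot create cudnn" and similar errors: if the log does not also report a file that failed to load, these mean the GPU has run out of memory. There are two fixes: let TensorFlow allocate GPU memory on demand, and reduce batch_size. The first one looks like this:

        import tensorflow as tf

        gpus = tf.config.experimental.list_physical_devices('GPU')
        if gpus:
            try:
                # Currently, memory growth needs to be the same across GPUs
                for gpu in gpus:
                    # Allocate GPU memory on demand instead of reserving it all up front
                    tf.config.experimental.set_memory_growth(gpu, True)
                logical_gpus = tf.config.experimental.list_logical_devices('GPU')
                print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
            except RuntimeError as e:
                # Memory growth must be set before GPUs have been initialized
                print(e)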
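    For the missing-library case above, the same link can also be created from Python, which is handy when the fix has to be repeated across several virtual environments. The following is only a minimal sketch of the `ln -s` step: the helper name `link_missing_library` and the path in the usage comment are hypothetical, and the real file names should be taken from the error message.

```python
import os
from pathlib import Path


def link_missing_library(lib_dir, existing_name, missing_name):
    """Create lib_dir/missing_name as a symlink to existing_name.

    Hypothetical helper; equivalent to running
    `ln -s existing_name missing_name` inside lib_dir.
    """
    lib_dir = Path(lib_dir)
    target = lib_dir / existing_name
    link = lib_dir / missing_name

    # Refuse to create a dangling link.
    if not target.exists():
        raise FileNotFoundError(f"{target} does not exist")

    # Leave an already present file or link untouched.
    if link.exists() or link.is_symlink():
        return link

    # Use a relative target so the link survives moving the directory.
    os.symlink(existing_name, link)
    return link


# Example call (placeholder path, adjust to the directory named in the error):
# link_missing_library("/path/to/venv/lib", "libcudnn.so.10", "libcudnn.so.10.1")
```

    Writing into a system directory still needs the same privileges that `sudo ln -s` would need; inside a virtual environment owned by the current user, no elevated rights should be required.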