Not enough GPU memory (TensorFlow)

by ysawa

A computation using TensorFlow is not working. orz

If I switch to running it on the CPU, the computation goes through, so what is going on here?
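The post does not say how the CPU run was done. One common way, shown here as a minimal sketch, is to hide the GPU from CUDA through the `CUDA_VISIBLE_DEVICES` environment variable before TensorFlow is imported. The helper name `hide_gpus_from_cuda` is made up for this example.

```python
import os


def hide_gpus_from_cuda():
    """Hide all CUDA devices from libraries loaded later in this process.

    Hypothetical helper: it has to run before TensorFlow is imported,
    because the CUDA runtime reads this variable when it initializes.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = ""


hide_gpus_from_cuda()

# Importing TensorFlow after this point should find no GPU and fall back
# to the CPU (left commented out so the sketch runs without TensorFlow):
# import tensorflow as tf
```

The same effect is available from the shell by prefixing the command, for example `CUDA_VISIBLE_DEVICES= python ch5-mnist-deep.py`.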

Training on MNIST

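I wrote the program exactly as in the reference book,
and it works when I run it on the CPU, so I don't think the code is wrong,
but it looks like my current GPU can't handle it.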

For GPU computation the number of cores matters, but memory seems to be quite important too.
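To put rough numbers on the memory side, here is a small back-of-the-envelope sketch using figures from the log further down: the tensor shape `[10000,1,28,28]` and the `Limit` / `InUse` allocator stats. The helper names are made up for this example, and the 32-channel figure for the first convolution output is an assumption (a common layout for this kind of MNIST network) that the post does not confirm.

```python
from functools import reduce
from operator import mul

FLOAT32_BYTES = 4  # the log reports T=DT_FLOAT, i.e. 32-bit floats


def tensor_bytes(shape, itemsize=FLOAT32_BYTES):
    """Bytes needed for a dense tensor of the given shape."""
    return reduce(mul, shape, 1) * itemsize


def to_mib(num_bytes):
    """Convert bytes to MiB (2**20 bytes)."""
    return num_bytes / 2**20


# The allocation that failed: all 10000 test images at once.
requested = tensor_bytes((10000, 1, 28, 28))

# Allocator stats copied from the log: Limit and InUse.
headroom = 1840906240 - 1828143616

# ASSUMPTION: 32 output channels in the first convolution.
assumed_conv1_out = tensor_bytes((10000, 28, 28, 32))

print("requested      : %.2f MiB" % to_mib(requested))
print("headroom left  : %.2f MiB" % to_mib(headroom))
print("conv1 out (32ch, assumed): %.2f MiB" % to_mib(assumed_conv1_out))
```

The requested size works out to the 29.91MiB the allocator complains about, while the allocator had far less than that left. The footprint scales linearly with the number of images fed in one call.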

GPU used


ASUSTek R.O.G. STRIX series video card with NVIDIA GeForce GTX1060, overclocked, 6GB memory, STRIX-GTX1060-O6G-GAMING

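I think it was a GPU of around 30,000 yen that I bought when I was short on cash, but it only has 2GB of memory (the log below reports a GeForce GTX 960 with 1.95GiB in total), so it seems to be no match for this at all.
I guess this is what "buy cheap, pay twice" means.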

I recently bought an additional, pricier GPU, so I'll just have to try it with that.

Execution result

For reference, here is the output.

$ python ch5-mnist-deep.py
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Extracting mnist/train-images-idx3-ubyte.gz
Extracting mnist/train-labels-idx1-ubyte.gz
Extracting mnist/t10k-images-idx3-ubyte.gz
Extracting mnist/t10k-labels-idx1-ubyte.gz
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 960
major: 5 minor: 2 memoryClockRate (GHz) 1.291
pciBusID 0000:03:00.0
Total memory: 1.95GiB
Free memory: 1.91GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (256): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (512): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (1024): 	Total Chunks: 1, Chunks in use: 0 1.0KiB allocated for chunks. 4.79MiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (2048): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (4096): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (8192): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (16384): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (32768): 	Total Chunks: 1, Chunks in use: 0 40.0KiB allocated for chunks. 3.1KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (65536): 	Total Chunks: 1, Chunks in use: 0 78.5KiB allocated for chunks. 1.20MiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (131072): 	Total Chunks: 1, Chunks in use: 0 200.0KiB allocated for chunks. 153.1KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (262144): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (524288): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (1048576): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (2097152): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (4194304): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (8388608): 	Total Chunks: 1, Chunks in use: 0 11.86MiB allocated for chunks. 390.6KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (16777216): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (33554432): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (67108864): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (134217728): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (268435456): 	Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:656] Bin for 29.91MiB was 16.00MiB, Chunk State:
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e40000 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e40100 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e40200 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e40300 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e40400 of size 4096
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e41400 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e41500 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e41600 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e41700 of size 3328
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e42400 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e42500 of size 204800
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e74500 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb03e74600 of size 12845056
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04ab4600 of size 4096
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04ab5600 of size 40960
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04abf600 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04abf700 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04abf800 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04abf900 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04abfa00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04abfb00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04abfc00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04abfd00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04abfe00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04abff00 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04ac0400 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04ac0500 of size 3328
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04ac1200 of size 40960
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04acb200 of size 80128
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04af2500 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb04b24600 of size 204800
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb05732600 of size 4096
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb0573d600 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb0573d700 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb0573d800 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb0573d900 of size 3328
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb0573e600 of size 3328
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb0573f300 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb0573f400 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb0573f500 of size 204800
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb05771500 of size 204800
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb057a3500 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb057a3600 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb057a3700 of size 12845056
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb063e3700 of size 12845056
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb07023700 of size 4096
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb07024700 of size 4096
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb07025700 of size 40960
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb0702f700 of size 40960
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb07039700 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb07039800 of size 256
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb07039900 of size 13053184
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb07cac600 of size 31360000
I tensorflow/core/common_runtime/bfc_allocator.cc:674] Chunk at 0xb09a94a00 of size 1744090624
I tensorflow/core/common_runtime/bfc_allocator.cc:683] Free at 0xb04ac0000 of size 1024
I tensorflow/core/common_runtime/bfc_allocator.cc:683] Free at 0xb04adeb00 of size 80384
I tensorflow/core/common_runtime/bfc_allocator.cc:683] Free at 0xb04af2600 of size 204800
I tensorflow/core/common_runtime/bfc_allocator.cc:683] Free at 0xb04b56600 of size 12435456
I tensorflow/core/common_runtime/bfc_allocator.cc:683] Free at 0xb05733600 of size 40960
I tensorflow/core/common_runtime/bfc_allocator.cc:689]      Summary of in-use Chunks by size:
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 30 Chunks of size 256 totalling 7.5KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 4 Chunks of size 3328 totalling 13.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 5 Chunks of size 4096 totalling 20.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 4 Chunks of size 40960 totalling 160.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 80128 totalling 78.2KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 4 Chunks of size 204800 totalling 800.0KiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 3 Chunks of size 12845056 totalling 36.75MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 13053184 totalling 12.45MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 31360000 totalling 29.91MiB
I tensorflow/core/common_runtime/bfc_allocator.cc:692] 1 Chunks of size 1744090624 totalling 1.62GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:696] Sum Total of in-use chunks: 1.70GiB
I tensorflow/core/common_runtime/bfc_allocator.cc:698] Stats:
Limit:                  1840906240
InUse:                  1828143616
MaxInUse:               1828543744
NumAllocs:                     201
MaxAllocSize:           1744090624

W tensorflow/core/common_runtime/bfc_allocator.cc:270] ************************************************************xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 29.91MiB.  See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:940] Resource exhausted: OOM when allocating tensor with shape[10000,1,28,28]
Traceback (most recent call last):
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 965, in _do_call
    return fn(*args)
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 947, in _run_fn
    status, run_metadata)
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tensorflow/python/framework/errors.py", line 450, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors.ResourceExhaustedError: OOM when allocating tensor with shape[10000,1,28,28]
	 [[Node: conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](conv1/Reshape, conv1/W_conv1/read)]]
	 [[Node: predict/Mean/_18 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_77_predict/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ch5-mnist-deep.py", line 155, in <module>
    acc = sess.run(accuracy_step, feed_dict=test_fd)
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 710, in run
    run_metadata_ptr)
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 908, in _run
    feed_dict_string, options, run_metadata)
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 958, in _do_run
    target_list, options, run_metadata)
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 978, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.ResourceExhaustedError: OOM when allocating tensor with shape[10000,1,28,28]
	 [[Node: conv1/Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](conv1/Reshape, conv1/W_conv1/read)]]
	 [[Node: predict/Mean/_18 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_77_predict/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op 'conv1/Conv2D', defined at:
  File "ch5-mnist-deep.py", line 62, in <module>
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
  File "ch5-mnist-deep.py", line 46, in conv2d
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 394, in conv2d
    data_format=data_format, name=name)
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
    op_def=op_def)
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2317, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/ysawa/.pyenv/versions/3.5.2/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1239, in __init__
    self._traceback = _extract_stack()
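
The traceback shows the failure in `sess.run(accuracy_step, feed_dict=test_fd)`, while allocating a tensor for all 10000 test images at once. A workaround that may avoid buying more memory is to evaluate the test set in smaller batches and average the results. Below is a minimal sketch of that idea, not the book's code: `batched_accuracy`, `eval_fn` and `toy_eval` are hypothetical names, and `eval_fn` stands in for a call such as `sess.run(accuracy_step, feed_dict=...)` on one batch.

```python
def batched_accuracy(eval_fn, images, labels, batch_size=1000):
    """Accuracy over the whole set, computed one batch at a time.

    eval_fn(batch_images, batch_labels) must return the mean accuracy
    of that batch. Each batch is weighted by its size so that a smaller
    final batch does not skew the average.
    """
    total = len(images)
    weighted_sum = 0.0
    for start in range(0, total, batch_size):
        xb = images[start:start + batch_size]
        yb = labels[start:start + batch_size]
        weighted_sum += eval_fn(xb, yb) * len(xb)
    return weighted_sum / total


def toy_eval(xb, yb):
    """Stand-in for the real model: predict the index of the largest score."""
    hits = sum(1 for scores, label in zip(xb, yb)
               if scores.index(max(scores)) == label)
    return hits / len(xb)


# Tiny fake data set: 5 samples, 3 classes, labels given as class indices.
toy_images = [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05], [0.3, 0.3, 0.4],
              [0.2, 0.5, 0.3], [0.6, 0.1, 0.3]]
toy_labels = [1, 0, 0, 1, 2]

print(batched_accuracy(toy_eval, toy_images, toy_labels, batch_size=2))
```

With the real script, `eval_fn` would wrap the session call, for instance `lambda xb, yb: sess.run(accuracy_step, feed_dict={x: xb, y_: yb})`, where the placeholder names `x` and `y_` are assumptions about the book's code. Any other fed values, such as a dropout keep probability, would need to be passed as well.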

・ω・

ysawa

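Representative Director of エヌ次元株式会社
Graduated from the Department of Computer Science, School of Engineering, Tokyo Institute of Technology
Researched applications of coding theory
Active as a freelance engineer since student days
Specializes in "sustainable design"
Since founding the company, has continued to handle a wide range of work,
from design to app development and website coding
Security Specialist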

 About this blog

This blog churns out information that may be useful to programmers and engineers.
It may include some slightly unorthodox material.