CUDA10: Speed comparison of TensorFlow-CPU and TensorFlow-GPU

Let's compare the speed of tensorflow-cpu and tensorflow-gpu using the code from this site.

import tensorflow as tf
import time
import matplotlib.pyplot as plt

def performanceTest(device_name, size):
    # Build the graph on the requested device: multiply two random
    # size x size matrices and reduce the product to a single scalar.
    with tf.device(device_name):
        W = tf.random_normal([size, size], name = 'W')
        X = tf.random_normal([size, size], name = 'X')
        mul = tf.matmul(W, X, name = 'mul')
        sum_result = tf.reduce_sum(mul, name = 'sum')

    # The timed region covers session creation and one run of the graph.
    startTime = time.time()
    tfconfig = tf.ConfigProto(log_device_placement = True)
    with tf.Session(config = tfconfig) as sess:
        result = sess.run(sum_result)
    takeTimes = time.time() - startTime
    print(device_name, " size = ", size, ",Time: ", takeTimes)
    return takeTimes

# Elapsed times per matrix size, collected for plotting later.
gpu_set = []
cpu_set = []
i_set = []
for i in range(500, 15001, 500):
    g = performanceTest("/gpu:0", i)
    c = performanceTest("/cpu:0", i)
    # A ratio above 1 means the GPU run was faster than the CPU run.
    print('Speed difference: {:0.1f}x'.format(c / g))
    gpu_set.append(g)
    cpu_set.append(c)
    i_set.append(i)
/gpu:0  size =  500 ,Time:  0.034740447998046875
/cpu:0  size =  500 ,Time:  0.02341914176940918
Speed difference: 0.7x
/gpu:0  size =  1000 ,Time:  0.023832082748413086
/cpu:0  size =  1000 ,Time:  0.031587839126586914
Speed difference: 1.3x
/gpu:0  size =  1500 ,Time:  0.025037288665771484
/cpu:0  size =  1500 ,Time:  0.0460662841796875
Speed difference: 1.8x
/gpu:0  size =  2000 ,Time:  0.03046250343322754
/cpu:0  size =  2000 ,Time:  0.08000874519348145
Speed difference: 2.6x
/gpu:0  size =  2500 ,Time:  0.03577899932861328
/cpu:0  size =  2500 ,Time:  0.12499022483825684
Speed difference: 3.5x
/gpu:0  size =  3000 ,Time:  0.04348945617675781
/cpu:0  size =  3000 ,Time:  0.20012235641479492
Speed difference: 4.6x
/gpu:0  size =  3500 ,Time:  0.0599055290222168
/cpu:0  size =  3500 ,Time:  0.2805497646331787
Speed difference: 4.7x
/gpu:0  size =  4000 ,Time:  0.08099937438964844
/cpu:0  size =  4000 ,Time:  0.3906996250152588
Speed difference: 4.8x
/gpu:0  size =  4500 ,Time:  0.08886313438415527
/cpu:0  size =  4500 ,Time:  0.5612015724182129
Speed difference: 6.3x
/gpu:0  size =  5000 ,Time:  0.1092383861541748
/cpu:0  size =  5000 ,Time:  0.6990952491760254
Speed difference: 6.4x
/gpu:0  size =  5500 ,Time:  0.14148521423339844
/cpu:0  size =  5500 ,Time:  0.8891515731811523
Speed difference: 6.3x
/gpu:0  size =  6000 ,Time:  0.18179869651794434
/cpu:0  size =  6000 ,Time:  1.1581580638885498
Speed difference: 6.4x
/gpu:0  size =  6500 ,Time:  0.21825504302978516
/cpu:0  size =  6500 ,Time:  1.404735803604126
Speed difference: 6.4x
/gpu:0  size =  7000 ,Time:  0.2697911262512207
/cpu:0  size =  7000 ,Time:  1.6760876178741455
Speed difference: 6.2x
/gpu:0  size =  7500 ,Time:  0.3307380676269531
/cpu:0  size =  7500 ,Time:  2.054910898208618
Speed difference: 6.2x
/gpu:0  size =  8000 ,Time:  0.38135743141174316
/cpu:0  size =  8000 ,Time:  2.4505362510681152
Speed difference: 6.4x
/gpu:0  size =  8500 ,Time:  0.4736137390136719
/cpu:0  size =  8500 ,Time:  2.925536632537842
Speed difference: 6.2x
/gpu:0  size =  9000 ,Time:  0.5450668334960938
/cpu:0  size =  9000 ,Time:  3.538905143737793
Speed difference: 6.5x
/gpu:0  size =  9500 ,Time:  0.6508128643035889
/cpu:0  size =  9500 ,Time:  6.326648950576782
Speed difference: 9.7x
/gpu:0  size =  10000 ,Time:  0.7966570854187012
/cpu:0  size =  10000 ,Time:  8.249387264251709
Speed difference: 10.4x
/gpu:0  size =  10500 ,Time:  0.8959507942199707
/cpu:0  size =  10500 ,Time:  6.489131450653076
Speed difference: 7.2x
/gpu:0  size =  11000 ,Time:  0.9568905830383301
/cpu:0  size =  11000 ,Time:  6.871312856674194
Speed difference: 7.2x
/gpu:0  size =  11500 ,Time:  1.1280040740966797
/cpu:0  size =  11500 ,Time:  7.875150680541992
Speed difference: 7.0x
/gpu:0  size =  12000 ,Time:  1.2369775772094727
/cpu:0  size =  12000 ,Time:  8.983980655670166
Speed difference: 7.3x
/gpu:0  size =  12500 ,Time:  1.45833420753479
/cpu:0  size =  12500 ,Time:  10.16003131866455
Speed difference: 7.0x
/gpu:0  size =  13000 ,Time:  1.565847396850586
/cpu:0  size =  13000 ,Time:  11.471773862838745
Speed difference: 7.3x
/gpu:0  size =  13500 ,Time:  1.798452615737915
/cpu:0  size =  13500 ,Time:  13.203768014907837
Speed difference: 7.3x
/gpu:0  size =  14000 ,Time:  1.912426471710205
/cpu:0  size =  14000 ,Time:  14.051488161087036
Speed difference: 7.3x
/gpu:0  size =  14500 ,Time:  2.2451210021972656
/cpu:0  size =  14500 ,Time:  15.841988563537598
Speed difference: 7.1x
/gpu:0  size =  15000 ,Time:  2.4142425060272217
/cpu:0  size =  15000 ,Time:  17.313440322875977
Speed difference: 7.2x

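Interestingly, the CPU-to-GPU speed ratio does not simply keep rising as the workload gets heavier: it peaks at 10.4x (size 10000) and otherwise hovers at roughly 7x for the larger sizes.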
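As a small check of the ratios quoted above, the sketch below recomputes the speedup for a handful of sizes and reports where it peaks. The times are copied from the log; the names `gpu_times`, `cpu_times` and `speedups` are made up for this illustration and are not part of the original script.

```python
# Elapsed times in seconds, copied from the log above for a few sizes.
gpu_times = {
    9000: 0.5450668334960938,
    9500: 0.6508128643035889,
    10000: 0.7966570854187012,
    10500: 0.8959507942199707,
    15000: 2.4142425060272217,
}
cpu_times = {
    9000: 3.538905143737793,
    9500: 6.326648950576782,
    10000: 8.249387264251709,
    10500: 6.489131450653076,
    15000: 17.313440322875977,
}

def speedups(cpu, gpu):
    """Return {size: cpu_time / gpu_time} for every size in cpu."""
    return {size: cpu[size] / gpu[size] for size in sorted(cpu)}

ratios = speedups(cpu_times, gpu_times)
peak_size = max(ratios, key=ratios.get)  # size with the largest ratio

for size, ratio in ratios.items():
    print('size = {:5d}  speedup = {:0.1f}x'.format(size, ratio))
print('peak at size', peak_size)
```

With the full `cpu_set` and `gpu_set` lists from the benchmark, the same ratio could be computed as `[c / g for c, g in zip(cpu_set, gpu_set)]`.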

fig = plt.gcf()
fig.set_size_inches(15,8)
plt.rcParams["font.size"] = "17"
plt.plot(i_set, gpu_set, label = 'gpu')
plt.plot(i_set, cpu_set, label = 'cpu')
plt.legend()

print('slowest with gpu: ', max(gpu_set), 'slowest with cpu: ', max(cpu_set))
slowest with gpu:  2.4142425060272217 slowest with cpu:  17.313440322875977
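The two maxima above are the size 15000 runs. For a rough sense of scale, they can be converted into arithmetic throughput, assuming the usual count of about 2n^3 floating-point operations for an n x n matrix product. This is only a sketch: the measured time also includes random number generation, the reduction and session start-up, so the figures are lower bounds at best, and `matmul_gflops` is a name invented here.

```python
def matmul_gflops(size, seconds):
    """Approximate GFLOP/s for one size x size matmul taking `seconds`.

    Assumes about 2 * size**3 floating-point operations
    (size**3 multiplications and roughly as many additions).
    """
    flops = 2 * size ** 3
    return flops / seconds / 1e9

# Times copied from the size 15000 lines of the log.
gpu_rate = matmul_gflops(15000, 2.4142425060272217)
cpu_rate = matmul_gflops(15000, 17.313440322875977)
print('gpu: {:.0f} GFLOP/s, cpu: {:.0f} GFLOP/s'.format(gpu_rate, cpu_rate))
```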

Compared with the GPU line, the CPU line behaves rather suspiciously, which is amusing. The fairly sloppy way I did the build may be coming back to bite me.
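One way to tell whether the CPU irregularities are measurement noise rather than an effect of the build would be to repeat each measurement and compare medians instead of single runs. The helper below is a minimal sketch of that idea using only the standard library; `time_median`, `repeats` and `warmup` are hypothetical names, and the workload is a stand-in rather than the TensorFlow benchmark.

```python
import statistics
import time

def time_median(fn, repeats=5, warmup=1):
    """Return the median wall-clock time of fn() over `repeats` timed runs."""
    for _ in range(warmup):
        fn()  # warm-up runs are executed but not timed
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Stand-in workload. In the benchmark this could instead be something like
# lambda: performanceTest("/cpu:0", 9500), using the function defined earlier.
elapsed = time_median(lambda: sum(i * i for i in range(100000)))
print('median time: {:.6f} s'.format(elapsed))
```

The median is used here because a single slow outlier, such as the jump seen around sizes 9500 and 10000, affects it less than it would affect a mean.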
