How to fix the No module named 'StringIO' error in Python 3

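The caffe2 tutorials are written for Python 2, so it goes without saying that caffe2 is best installed under Python 2.7. If, however, you have mistakenly installed it into a Python 3.6 environment and do not want to lose half a day rebuilding, you are left with the tedious job of porting the Python 2 code to Python 3.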

The following code, which prepares the data needed for the caffe2 MNIST tutorial, is published on this site.

Code

import os

def DownloadResource(url, path):
    '''Downloads resources from s3 by url and unzips them to the provided path'''
    import requests, zipfile, StringIO
    print("Downloading... {} to {}".format(url, path))
    r = requests.get(url, stream=True)
    z = zipfile.ZipFile(StringIO.StringIO(r.content))
    z.extractall(path)
    print("Completed download and extraction.")

def GenerateDB(image, label, name):
    '''Calls the make_mnist_db binary to generate a leveldb from a mnist dataset'''
    name = os.path.join(data_folder, name)
    print 'DB: ', name
    if not os.path.exists(name):
        syscall = "/usr/local/bin/make_mnist_db --channel_first --db leveldb --image_file " + image + " --label_file " + label + " --output_file " + name
        # print "Creating database with: ", syscall
        os.system(syscall)
    else:
        print "Database exists already. Delete the folder if you have issues/corrupted DB, then rerun this."
        if os.path.exists(os.path.join(name, "LOCK")):
            # print "Deleting the pre-existing lock file"
            os.remove(os.path.join(name, "LOCK"))

current_folder = os.path.join(os.path.expanduser('~'), 'caffe2_notebooks')
data_folder = os.path.join(current_folder, 'tutorial_data', 'mnist')

# Downloads and extracts the lmdb databases of MNIST images - both test and train
if not os.path.exists(os.path.join(data_folder,"mnist-train-nchw-lmdb")):
    DownloadResource("http://download.caffe2.ai/databases/mnist-lmdb.zip", data_folder)

# Downloads and extracts the MNIST data set
if not os.path.exists(os.path.join(data_folder, "train-images-idx3-ubyte")):
    DownloadResource("http://download.caffe2.ai/datasets/mnist/mnist.zip", data_folder)

# (Re)generate the leveldb database (it can get locked with multi-user setups or abandoned threads)
# Requires the download of the dataset (mnist.zip) - see DownloadResource above.
# You also need to change references in the MNIST tutorial code where you train or test from lmdb to leveldb
image_file_train = os.path.join(data_folder, "train-images-idx3-ubyte")
label_file_train = os.path.join(data_folder, "train-labels-idx1-ubyte")
image_file_test = os.path.join(data_folder, "t10k-images-idx3-ubyte")
label_file_test = os.path.join(data_folder, "t10k-labels-idx1-ubyte")
GenerateDB(image_file_train, label_file_train, "mnist-train-nchw-leveldb")
GenerateDB(image_file_test, label_file_test, "mnist-test-nchw-leveldb")

After wrapping the arguments of the print statements in parentheses, running this code produces the following error.

ModuleNotFoundError: No module named 'StringIO'

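Following the approach suggested by this site produces the following error instead. The message indicates that the name StringIO now refers to the io.StringIO class itself, which has no StringIO attribute.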

AttributeError: type object '_io.StringIO' has no attribute 'StringIO'

Rewriting StringIO.StringIO(r.content) as StringIO(r.content) then produced the following error.

TypeError: initial_value must be str or None, not bytes
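
This error arises because in Python 3 io.StringIO only holds text (str), whereas r.content from requests is raw bytes. The following is a minimal sketch that reproduces the situation without any network access. The byte string and the helper name try_stringio are made up for illustration and are not part of the tutorial code.

```python
import io

# Hypothetical stand-in for r.content: requests returns the response body as bytes.
content = b"PK\x03\x04 fake zip bytes"

def try_stringio(data):
    """Return the TypeError message raised by io.StringIO, or None on success."""
    try:
        io.StringIO(data)
    except TypeError as exc:
        return str(exc)
    return None

print(try_stringio(content))  # bytes are rejected, so the error message is printed
print(try_stringio("text"))   # str is accepted, so this prints None
```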

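After a good deal of searching online, this site turned out to offer the most accurate solution.
Replacing StringIO(r.content) with io.BytesIO(r.content), with import io added, let the data download complete successfully.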

Downloading... http://download.caffe2.ai/databases/mnist-lmdb.zip to /home/aho/caffe2_notebooks/tutorial_data/mnist
Completed download and extraction.
Downloading... http://download.caffe2.ai/datasets/mnist/mnist.zip to /home/aho/caffe2_notebooks/tutorial_data/mnist
Completed download and extraction.
DB:  /home/aho/caffe2_notebooks/tutorial_data/mnist/mnist-train-nchw-leveldb
DB:  /home/aho/caffe2_notebooks/tutorial_data/mnist/mnist-test-nchw-leveldb
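
The fix described above can be sketched as a Python 3 version of DownloadResource. This is only one possible way to write it, not the tutorial's official code: the helper extract_zip_bytes and the raise_for_status() call are additions made here for illustration, and requests is assumed to be installed.

```python
import io
import zipfile

def extract_zip_bytes(content, path):
    """Unzip an in-memory archive (bytes) into the directory `path`."""
    # io.BytesIO wraps the bytes in a file-like object that ZipFile can read.
    with zipfile.ZipFile(io.BytesIO(content)) as z:
        z.extractall(path)

def DownloadResource(url, path):
    """Download a zip archive from `url` and unzip it into `path`."""
    import requests  # third-party; imported inside the function as in the original
    print("Downloading... {} to {}".format(url, path))
    r = requests.get(url, stream=True)
    r.raise_for_status()  # addition (not in the original): stop on HTTP errors
    extract_zip_bytes(r.content, path)  # r.content is bytes, hence BytesIO
    print("Completed download and extraction.")
```

Splitting the extraction into its own function makes it possible to check the BytesIO part with a small in-memory zip file, without downloading anything.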

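Learning to program in Python is harder than I expected. Machine learning in particular demands advanced mathematics, so I am starting to feel that the barrier to entry is quite high for anyone who is not confident in their math.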