[Machine Learning] Making Deep Learning Training Far More Efficient [Deep Learning]

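Computer scientists at Rice University have developed a new technique that can cut the computation needed for deep learning by up to 95 percent. By sharply reducing the amount of computation, it should substantially lower the energy and time that deep learning (a branch of machine learning) requires. According to Anshumali Shrivastava, a Rice assistant professor and the lead researcher, the technique can be applied to any type of deep-learning architecture, and the larger the deep neural network it is applied to, the greater the computational savings.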

Deep learning networks

Rice U. scientists slash computations for deep learning

The research will be presented in August at the KDD 2017 conference in Halifax, Nova Scotia. It addresses one of the biggest issues facing tech giants like Google, Facebook and Microsoft as they race to build, train and deploy massive deep-learning networks for a growing body of products as diverse as self-driving cars, language translators and intelligent replies to emails.

A new technique derived from hashing

Shrivastava and Rice graduate student Ryan Spring have shown that techniques from “hashing,” a tried-and-true data-indexing method, can be adapted to dramatically reduce the computational overhead for deep learning. Hashing involves the use of smart hash functions that convert data into manageable small numbers called hashes. The hashes are stored in tables that work much like the index in a printed book.
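To make the "index in a printed book" analogy concrete, here is a minimal sketch of an ordinary hash table index. It is only an illustration of hashing in general, not the researchers' method: the names `small_hash`, `build_index` and `lookup`, the use of SHA-256, and the bucket count of 16 are all assumptions chosen for this example.

```python
import hashlib
from collections import defaultdict


def small_hash(text: str, num_buckets: int = 16) -> int:
    """Convert a string into a small number in range(num_buckets)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets


def build_index(items, num_buckets: int = 16):
    """Group items by hash, like page entries grouped under an index heading."""
    table = defaultdict(list)
    for item in items:
        table[small_hash(item, num_buckets)].append(item)
    return table


def lookup(table, query: str, num_buckets: int = 16) -> bool:
    """Check membership by scanning a single bucket, not the whole collection."""
    return query in table.get(small_hash(query, num_buckets), [])


index = build_index(["dog", "cat", "car", "tree"])
print(lookup(index, "dog"), lookup(index, "bird"))
```

The point of the sketch is that a lookup touches only the one bucket the hash points to, which is what makes hashing attractive when the collection being searched is very large.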

The new technique is efficient and accurate

“Our approach blends two techniques, a clever variant of locality-sensitive hashing and sparse backpropagation, to reduce computational requirements without significant loss of accuracy,” Spring said. “For example, in small-scale tests we found we could reduce computation by as much as 95 percent and still be within 1 percent of the accuracy obtained with standard approaches.”
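The quote does not say which variant of locality-sensitive hashing is used, so the following is only a rough sketch of the general idea, assuming signed random projections (SimHash) as the hash family. Each neuron's weight vector is hashed into a few tables; an input is hashed the same way, and only the neurons that land in the same buckets are evaluated. Every name and parameter here (`make_planes`, `simhash`, `build_tables`, `active_neurons`, `sparse_forward`, 8 bits, 4 tables, 1,000 neurons) is hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def make_planes(num_bits, dim, rng):
    """Random hyperplanes; each contributes one bit of the hash code."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def simhash(vec, planes):
    """Signed random projection: similar vectors tend to share a code."""
    code = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def build_tables(weights, all_planes):
    """One hash table per set of planes, mapping code -> neuron ids."""
    tables = []
    for planes in all_planes:
        table = defaultdict(list)
        for neuron_id, w in enumerate(weights):
            table[simhash(w, planes)].append(neuron_id)
        tables.append(table)
    return tables


def active_neurons(x, tables, all_planes):
    """Neurons whose weights collide with the input in at least one table."""
    active = set()
    for table, planes in zip(tables, all_planes):
        active.update(table.get(simhash(x, planes), []))
    return active


def sparse_forward(x, weights, active):
    """ReLU activations for the selected neurons only; the rest are skipped."""
    return {j: max(0.0, sum(w * v for w, v in zip(weights[j], x))) for j in active}


rng = random.Random(0)
dim, num_neurons = 16, 1000
weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_neurons)]
all_planes = [make_planes(8, dim, rng) for _ in range(4)]
tables = build_tables(weights, all_planes)

x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
active = active_neurons(x, tables, all_planes)
outputs = sparse_forward(x, weights, active)
print(f"evaluated {len(active)} of {num_neurons} neurons")
```

In a sparse backpropagation step, gradient updates would presumably be restricted to the same selected neurons, and the tables would need to be refreshed as weights change. Both of those parts are left out of this sketch.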

Machine learning needs computing power

“Adding more neurons to a network layer increases its expressive power, and there’s no upper limit to how big we want our networks to be,” Shrivastava said. “Google is reportedly trying to train one with 137 billion neurons.” By contrast, he said, there are limits to the amount of computational power that can be brought to bear to train and deploy such networks.

“Most machine-learning algorithms in use today were developed 30-50 years ago,” he said. “They were not designed with computational complexity in mind. But with ‘big data,’ there are fundamental limits on resources like compute cycles, energy and memory. Our lab focuses on addressing those limitations.”

Ultra-efficient deep learning

“The savings increase with scale because we are exploiting the inherent sparsity in big data,” he said. “For instance, let’s say a deep net has a billion neurons. For any given input, like a picture of a dog, only a few of those will become excited. In data parlance, we refer to that as sparsity, and because of sparsity our method will save more as the network grows in size. So while we’ve shown a 95 percent savings with 1,000 neurons, the mathematics suggests we can save more than 99 percent with a billion neurons.”
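The relationship between sparsity and savings can be checked with simple arithmetic. The sketch below assumes that cost is proportional to the number of neurons actually evaluated and ignores the overhead of the hashing itself. The function name `savings_fraction` and the active-neuron counts (50 and 10 million) are hypothetical values picked to match the quoted 95 and 99 percent figures, not numbers from the research.

```python
def savings_fraction(active_neurons: int, total_neurons: int) -> float:
    """Fraction of per-input neuron computations that are skipped."""
    if total_neurons <= 0 or not 0 <= active_neurons <= total_neurons:
        raise ValueError("need total_neurons > 0 and 0 <= active_neurons <= total_neurons")
    return 1.0 - active_neurons / total_neurons


# Hypothetical counts: 50 of 1,000 neurons active, 10 million of a billion active.
small = savings_fraction(50, 1_000)
large = savings_fraction(10_000_000, 1_000_000_000)
print(f"{small:.1%} savings at 1,000 neurons, {large:.1%} savings at a billion")
```

Because the number of active neurons can never be negative, the savings fraction is capped at 1.0. Larger and sparser networks can push it closer to 100 percent but never past it.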

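If the savings are 95 percent at 1,000 neurons and more than 99 percent at a billion neurons, then a neural network with 100 trillion neurons should see savings even closer to 100 percent, although a reduction can never exceed 100 percent. If these results hold up in practice, deep-learning training could become dramatically more efficient. More efficient training would demand less machine power, so complex processing might become possible on very low-power, very small circuits, which could bring major advances to the field of AI (artificial intelligence). In practice it is unclear how much the reduced computational overhead will improve training overall, but the reported results suggest it should make deep learning more efficient.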