今回はこのサイトを参考にしながらPythonプログラミングの基礎であるセットの学習を試みる。
Sets¶
A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference. Sets can also have important efficiency benefits.
セットは、重複要素を持たない非順序コレクションである。基本的な用途には、メンバーシップテストと重複入力消去を含む。セットオブジェクトは、和集合、交差、差分、対称差のような数学演算もサポートする。セットは、重要な効率効果も有し得る。
リストはとにかく処理が遅い¶
One motivation for using sets is that several important operations (adding an element, determining whether an element is in the set) take constant time regardless of the size of the set, rather than linear time in the size of the list.
セットを使う動機の一つが、いくつかの重要な演算(要素の加算、セットの中にある要素が存在するかどうかを割り出す)が、リストの大きさに比例する線形時間ではなく、セットのサイズに関係なく一定時間を取ることにある。
big_num = 10000000 # ten million
big_num_list = list(range(big_num))
print(len(big_num_list))
How long do you think the following will take?
下のセルの実行時間がどれくらいかかるか?
- less than 1 second
1秒未満 - longer than 1 second but less than 10 seconds
1秒以上10秒未満 - longer than 10 seconds but less than 1 minute
10秒以上1分未満 - longer than 1 minute
1分以上
small_num = 100
small_num_list = list(range(big_num - small_num, big_num))
# how many of small_num_list elements are in big_num_list?
import time
start = time.time()
count = 0
print("counting...")
for i in small_num_list:
count = count + (1 if i in big_num_list else 0) #side question: parens needed?
end = time.time()
print("count using list:", count, "; time:", end-start, "sec")
How long for the following different version?
以下の別バージョンはどれくらいかかるか?
# how many of small_num_list elements are in big_num_set?
count = 0
##small_num_list = big_num_list
print("counting...")
start = time.time()
big_num_set = set(big_num_list) #include the time to build this
end1 = time.time()
print("time to build big_num_set:", end1-start, "sec")
for i in small_num_list:
count = count + (1 if i in big_num_set else 0)
end2 = time.time()
print("count using set:", count, "; time:", end2-end1, "sec")
start = time.time()
small_num_set = set(small_num_list)
count_intersection = len(big_num_set.intersection(small_num_set))
end = time.time()
print("count using set intersection:", count_intersection, "; time:", end-start, "sec")
セットはリストに比べると桁違いに演算処理が高速なことが分かる。