A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference. Sets can also have important efficiency benefits.
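As a quick illustration of the operations just listed, here is a minimal sketch. The sets `a` and `b` and the result names are hypothetical sample values chosen only for this example.

```python
# Hypothetical sample data, chosen only for illustration.
a = {1, 2, 3, 4}
b = {3, 4, 5}

# Building a set from a list drops the duplicate entries.
deduplicated = set([1, 1, 2, 2, 3])

# Membership testing uses the same `in` operator as lists.
has_three = 3 in a

# Mathematical set operations.
union = a | b                  # elements in a or b (or both)
intersection = a & b           # elements in both a and b
difference = a - b             # elements in a but not in b
symmetric_difference = a ^ b   # elements in exactly one of a, b

print(deduplicated, has_three)
print(union, intersection, difference, symmetric_difference)
```

Each operator also has a method form (`a.union(b)`, `a.intersection(b)`, `a.difference(b)`, `a.symmetric_difference(b)`), which accepts any iterable as its argument rather than only another set.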
One motivation for using sets is that several important operations (adding an element, determining whether an element is in the set) take constant time on average, regardless of the size of the set. By contrast, determining whether an element is in a list takes time linear in the size of the list.
big_num = 10000000 # ten million
big_num_list = list(range(big_num))
How long do you think the following will take?
- less than 1 second
- longer than 1 second but less than 10 seconds
- longer than 10 seconds but less than 1 minute
- longer than 1 minute
small_num = 100
small_num_list = list(range(big_num - small_num, big_num))
# how many of small_num_list elements are in big_num_list?
import time
start = time.time()
count = 0
for i in small_num_list:
    count = count + (1 if i in big_num_list else 0) #side question: parens needed?
end = time.time()
print("count using list:", count, "; time:", end-start, "sec")
How long do you think the following version, which uses a set instead of a list, will take?
# how many of small_num_list elements are in big_num_set?
count = 0
##small_num_list = big_num_list
start = time.time()
big_num_set = set(big_num_list) #include the time to build this
end1 = time.time()
print("time to build big_num_set:", end1-start, "sec")
for i in small_num_list:
    count = count + (1 if i in big_num_set else 0)
end2 = time.time()
print("count using set:", count, "; time:", end2-end1, "sec")
start = time.time()
small_num_set = set(small_num_list)
count_intersection = len(big_num_set.intersection(small_num_set))
end = time.time()
print("count using set intersection:", count_intersection, "; time:", end-start, "sec")
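The timings above use `time.time()` around a single run. As an alternative, the standard-library `timeit` module repeats a statement many times, which tends to give steadier numbers for very fast operations such as set lookups. The following is a minimal sketch under assumed sizes: `n`, `sample_list`, `sample_set`, and `target` are hypothetical names, and `n` is deliberately much smaller than `big_num` so the example runs quickly.

```python
import timeit

# Assumed size, far smaller than big_num above, so the sketch finishes quickly.
n = 100_000
sample_list = list(range(n))
sample_set = set(sample_list)

# Look up the last element: the slowest case for a list scan from the front.
target = n - 1

# Time 100 repeated membership tests against each container.
list_time = timeit.timeit(lambda: target in sample_list, number=100)
set_time = timeit.timeit(lambda: target in sample_set, number=100)

print("list membership:", list_time, "sec for 100 lookups")
print("set membership: ", set_time, "sec for 100 lookups")
```

Exact numbers depend on the machine, so none are shown here. Increasing `n` should make the list timing grow roughly in proportion, while the set timing should stay about the same.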