Stanford / CS131 / Homework 3-5: Stitching Multiple Images

This post continues from the previous one, Stanford University / CS131 / Homework 3-4: RANSAC and HOG. This time we tackle the Extra Credit: Stitching Multiple Images problem, which is the culmination of building a panorama image.

Extra Credit: Stitching Multiple Images

Work in the cell below to complete the code to stitch an ordered chain of images.

Given a sequence of $m$ images ($I_1, I_2,…,I_m$), take every neighboring pair of images and compute the transformation matrix which converts points from the coordinate frame of $I_{i+1}$ to the frame of $I_{i}$. Then, select a reference image $I_{ref}$, which is in the middle of the chain. We want our final panorama image to be in the coordinate frame of $I_{ref}$. So, for each $I_i$ that is not the reference image, we need a transformation matrix that will convert points in frame $i$ to frame $ref$.

Hints:

  • If you are confused, you may want to review the Linear Algebra slides on how to combine the effects of multiple transformation matrices.
  • The inverse of a transformation matrix has the reverse effect. Use the numpy.linalg.inv function whenever you need to compute a matrix inverse.
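The chaining described above can be sketched with toy matrices. In this assignment's convention, fit_affine_matrix solves for H acting on row-vector points, so a point $p$ in frame $i+1$ maps to frame $i$ via p @ H, transforms compose by right-multiplication, and numpy.linalg.inv reverses a transform. The translations below are made-up values for illustration, not the actual Yosemite homographies:

```python
import numpy as np

# Toy affine transforms in the row-vector convention: a point
# p = [x, y, 1] in frame i+1 maps into frame i via p @ H.
H12 = np.array([[1.0, 0.0, 0.0],   # frame 2 -> frame 1: shift x by +5
                [0.0, 1.0, 0.0],
                [5.0, 0.0, 1.0]])
H23 = np.array([[1.0, 0.0, 0.0],   # frame 3 -> frame 2: shift x by +3
                [0.0, 1.0, 0.0],
                [3.0, 0.0, 1.0]])

# With frame 2 chosen as the reference:
T1 = np.linalg.inv(H12)  # frame 1 -> frame 2: the inverse reverses the effect
T3 = H23                 # frame 3 -> frame 2 directly
# Frame 4 -> frame 2 would chain two steps: first 4 -> 3, then 3 -> 2,
# i.e. p4 @ H34 @ H23, so the combined matrix is H34.dot(H23).

p3 = np.array([1.0, 2.0, 1.0])  # a point in frame 3
p2 = p3 @ T3                    # the same point in frame 2
p1 = p3 @ H23 @ H12             # the same point walked all the way to frame 1
print(p2)  # [4. 2. 1.]
print(p1)  # [9. 2. 1.]
```

Note the order: because points are row vectors, the matrix applied first sits leftmost in the product.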
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (20.0, 16.0) # set default size of plots
plt.rcParams["font.size"] = "17"
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
from utils import get_output_space, warp_image, pad, plot_matches
from panorama import harris_corners, describe_keypoints, ransac
from skimage.feature import corner_peaks
from panorama import simple_descriptor, match_descriptors, fit_affine_matrix
from skimage.io import imread
img1 = imread('yosemite1.jpg', as_gray=True)
img2 = imread('yosemite2.jpg', as_gray=True)
img3 = imread('yosemite3.jpg', as_gray=True)
img4 = imread('yosemite4.jpg', as_gray=True)

patch_size = 5

# Detect keypoints in each image
keypoints1 = corner_peaks(harris_corners(img1, window_size=3),
                          threshold_rel=0.05,
                          exclude_border=8)
keypoints2 = corner_peaks(harris_corners(img2, window_size=3),
                          threshold_rel=0.05,
                          exclude_border=8)
keypoints3 = corner_peaks(harris_corners(img3, window_size=3),
                          threshold_rel=0.05,
                          exclude_border=8)
keypoints4 = corner_peaks(harris_corners(img4, window_size=3),
                          threshold_rel=0.05,
                          exclude_border=8)

# Describe keypoints
desc1 = describe_keypoints(img1, keypoints1,
                           desc_func=simple_descriptor,
                           patch_size=patch_size)
desc2 = describe_keypoints(img2, keypoints2,
                           desc_func=simple_descriptor,
                           patch_size=patch_size)
desc3 = describe_keypoints(img3, keypoints3,
                           desc_func=simple_descriptor,
                           patch_size=patch_size)
desc4 = describe_keypoints(img4, keypoints4,
                           desc_func=simple_descriptor,
                           patch_size=patch_size)

# Match keypoints in neighboring images
matches12 = match_descriptors(desc1, desc2, 0.7)
matches23 = match_descriptors(desc2, desc3, 0.7)
matches34 = match_descriptors(desc3, desc4, 0.7)

### YOUR CODE HERE
# Robustly estimate the affine transform for each neighboring pair
# (each H below maps points from the second image's frame to the first's)
H12, _ = ransac(keypoints1, keypoints2, matches12)
H23, _ = ransac(keypoints2, keypoints3, matches23)
H34, _ = ransac(keypoints3, keypoints4, matches34)
# Take image2 as the reference image; in the row-vector convention the
# frame 4 -> frame 2 transform chains as H34.dot(H23)
output_shape, offset = get_output_space(img2, [img1, img3, img4],
                                        [np.linalg.inv(H12), H23, H34.dot(H23)])

img1_warped = warp_image(img1, np.linalg.inv(H12), output_shape, offset)
img1_mask = (img1_warped != -1)
img1_warped[~img1_mask] = 0

img2_warped = warp_image(img2, np.eye(3), output_shape, offset)
img2_mask = (img2_warped != -1)
img2_warped[~img2_mask] = 0

img3_warped = warp_image(img3, H23, output_shape, offset)
img3_mask = (img3_warped != -1)
img3_warped[~img3_mask] = 0

img4_warped = warp_image(img4, H34.dot(H23), output_shape, offset)
img4_mask = (img4_warped != -1)
img4_warped[~img4_mask] = 0
### END YOUR CODE
# Plot warped images
plt.imshow(img1_warped)
plt.axis('off')
plt.title('Image 1 warped')
plt.show()

plt.imshow(img2_warped)
plt.axis('off')
plt.title('Image 2 warped')
plt.show()

plt.imshow(img3_warped)
plt.axis('off')
plt.title('Image 3 warped')
plt.show()

plt.imshow(img4_warped)
plt.axis('off')
plt.title('Image 4 warped')
plt.show()
merged = img1_warped + img2_warped + img3_warped + img4_warped

# Track the overlap by adding the masks together
overlap = (img2_mask * 1.0 +  # Multiply by 1.0 for bool -> float conversion
           img1_mask + img3_mask + img4_mask)

# Normalize through division by `overlap` - but ensure the minimum is 1
normalized = merged / np.maximum(overlap, 1)
plt.imshow(normalized)
plt.axis('off')
plt.show()
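The mask-and-divide averaging at the end can be checked on toy data. This sketch uses synthetic one-pixel-row "images" (not the Yosemite data) to show why dividing by np.maximum(overlap, 1) matters: pixels covered by no image have an overlap count of 0, and the maximum guards against dividing by zero there:

```python
import numpy as np

# Two tiny warped "images": warp_image would mark unmapped pixels
# with -1; here they are already zeroed and tracked by boolean masks.
a = np.array([1.0, 0.0, 0.0])          # valid only at index 0
b = np.array([0.0, 0.0, 6.0])          # valid only at index 2
a_mask = np.array([True, False, False])
b_mask = np.array([False, False, True])

merged = a + b
overlap = a_mask * 1.0 + b_mask        # per-pixel count of contributing images
# Index 1 is covered by neither image (overlap == 0); np.maximum keeps
# the division well-defined there instead of producing 0/0.
normalized = merged / np.maximum(overlap, 1)
print(normalized)  # [1. 0. 6.]
```

Where two or more images overlap, the same division averages their intensities instead of letting the sum blow out.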