Stanford/CS131/Homework 3-3: Transformation Estimation

This post continues from the previous one, Stanford University/CS131/Homework 3-2: Keypoints and Descriptors. This part of the homework covers transformation estimation.

Part 3 Transformation Estimation

We now have a list of matched keypoints across the two images. We will use it to find a transformation matrix that maps points in the second image to the corresponding coordinates in the first image. In other words, if the point $p_1 = [y_1,x_1]$ in image 1 matches $p_2=[y_2, x_2]$ in image 2, we need to find an affine transformation matrix $H$ such that

$$
\tilde{p_2}H = \tilde{p_1},
$$

where $\tilde{p_1}$ and $\tilde{p_2}$ are the homogeneous coordinates of $p_1$ and $p_2$.
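To make the row-vector convention concrete, here is a minimal sketch of converting points to homogeneous coordinates and applying an affine matrix on the right. The helper `to_homogeneous` and the matrix `H_example` are hypothetical names and values chosen for illustration; they are not part of the assignment code.

```python
import numpy as np

def to_homogeneous(points):
    """Append a column of ones: (N, 2) -> (N, 3)."""
    ones = np.ones((points.shape[0], 1))
    return np.hstack([points, ones])

# Hypothetical affine matrix in the convention p2_h @ H = p1_h:
# the top-left 2x2 block is the linear part, the last row is the
# translation, and the last column is [0, 0, 1].
H_example = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [5.0, -3.0, 1.0]])

p2 = np.array([[10.0, 20.0], [0.0, 0.0]])  # points given as [y, x]
p1_h = to_homogeneous(p2) @ H_example      # still homogeneous, last column is 1
p1 = p1_h[:, :2]                           # drop the trailing 1
print(p1)
```

Because the points are rows and $H$ multiplies from the right, the translation sits in the last row of $H$ rather than the last column.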

Note that it may be impossible to find a transformation $H$ that maps every point in image 2 exactly to the corresponding point in image 1. However, we can estimate the transformation matrix with least squares. Given $N$ matched keypoint pairs, let $X_1$ and $X_2$ be $N \times 3$ matrices whose rows are the homogeneous coordinates of the corresponding keypoints in image 1 and image 2, respectively. Then we can estimate $H$ by solving the least squares problem

$$
X_2 H = X_1
$$
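Written out, the least squares estimate minimizes the squared residual over all matched pairs. As a sketch, assuming $X_2$ has full column rank (at least three matches that are not collinear), the minimizer has the closed form

$$
\hat{H} = \arg\min_{H} \lVert X_2 H - X_1 \rVert_F^2 = (X_2^\top X_2)^{-1} X_2^\top X_1
$$

In practice there is no need to form the inverse by hand, since np.linalg.lstsq solves the same minimization directly.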

Implement fit_affine_matrix in panorama.py.

Hint: read the documentation for np.linalg.lstsq.
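As a rough illustration of how np.linalg.lstsq behaves on this kind of problem, the sketch below fits an affine matrix to synthetic, slightly noisy matches. The matrix `H_true`, the noise level, and the number of points are made-up values for the demonstration, not data from the assignment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground-truth affine matrix (row-vector convention).
H_true = np.array([[0.9, 0.1, 0.0],
                   [-0.1, 0.9, 0.0],
                   [4.0, -2.0, 1.0]])

# 50 synthetic keypoints in image 2, in homogeneous coordinates.
X2 = np.hstack([rng.uniform(0, 100, size=(50, 2)), np.ones((50, 1))])

# Matching points in image 1, with a little noise on the coordinates.
X1 = X2 @ H_true
X1[:, :2] += rng.normal(scale=0.01, size=(50, 2))

# lstsq returns (solution, residuals, rank, singular_values);
# the right-hand side may be a matrix, so all columns of H are solved at once.
H_est, residuals, rank, sv = np.linalg.lstsq(X2, X1, rcond=None)

print(rank)
print(np.abs(H_est - H_true).max())
```

Only the first return value is needed for fit_affine_matrix, which is why the solution below indexes the result with `[0]`.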

import numpy as np
from utils import pad, unpad

def fit_affine_matrix(p1, p2):
    """ Fit affine matrix such that p2 * H = p1     
    Hint:
        You can use np.linalg.lstsq function to solve the problem.         
    Args:
        p1: an array of shape (M, P)
        p2: an array of shape (M, P)        
    Return:
        H: a matrix of shape (P+1, P+1) that transforms the padded p2 to the padded p1.
    """
    assert (p1.shape[0] == p2.shape[0]),\
        'Different number of points in p1 and p2'
    p1 = pad(p1)
    p2 = pad(p2)
    ### YOUR CODE HERE
    H = np.linalg.lstsq(p2, p1, rcond=None)[0]
    ### END YOUR CODE
    # Numerical error can leave the last column of the least-squares
    # solution slightly off [0, 0, 1], so set it explicitly.
    H[:,2] = np.array([0, 0, 1])
    return H

# Sanity check for fit_affine_matrix
# Test inputs
a = np.array([[0.5, 0.1], [0.4, 0.2], [0.8, 0.2]])
b = np.array([[0.3, -0.2], [-0.4, -0.9], [0.1, 0.1]])

H = fit_affine_matrix(b, a)

# Target output
sol = np.array(
    [[1.25, 2.5, 0.0],
     [-5.75, -4.5, 0.0],
     [0.25, -1.0, 1.0]]
)

error = np.sum((H - sol) ** 2)

if error < 1e-20:
    print('Implementation correct!')
else:
    print('There is something wrong.')

Implementation correct!
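The sanity check passes with essentially zero error because a 2D affine transform has six free parameters, and three non-collinear point pairs supply exactly six equations, so the fit is exact rather than approximate. The sketch below confirms that the target matrix maps `a` onto `b`. Here `pad_ones` is a stand-in for utils.pad, which is assumed to append a column of ones.

```python
import numpy as np

def pad_ones(x):
    """Stand-in for utils.pad (assumed to append a column of ones)."""
    return np.hstack([x, np.ones((x.shape[0], 1))])

a = np.array([[0.5, 0.1], [0.4, 0.2], [0.8, 0.2]])
b = np.array([[0.3, -0.2], [-0.4, -0.9], [0.1, 0.1]])
sol = np.array([[1.25, 2.5, 0.0],
                [-5.75, -4.5, 0.0],
                [0.25, -1.0, 1.0]])

# fit_affine_matrix(b, a) treats b as p1 and a as p2,
# so pad(a) @ H should reproduce pad(b).
mapped = pad_ones(a) @ sol
max_err = np.abs(mapped - pad_ones(b)).max()
print(max_err)
```

With real keypoint matches there are many more than three pairs and some of them are wrong, so the residual will not vanish like this.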

After checking that your fit_affine_matrix function runs correctly, run the following code to apply it to images.
The images will be warped so that image 2 is mapped onto image 1, and the two are then merged into a panorama. Your panorama may not look good at this point, but we will later use other techniques to get a better result.

import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (15.0, 12.0) # set default size of plots
plt.rcParams["font.size"] = "17"
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
from utils import get_output_space, warp_image
from panorama import harris_corners,describe_keypoints
from skimage.feature import corner_peaks
from panorama import simple_descriptor,match_descriptors
from skimage.io import imread

img1 = imread('uttower1.jpg', as_gray=True)
img2 = imread('uttower2.jpg', as_gray=True)

# Detect keypoints in two images
keypoints1 = corner_peaks(harris_corners(img1, window_size=3),
                          threshold_rel=0.05,
                          exclude_border=8)
keypoints2 = corner_peaks(harris_corners(img2, window_size=3),
                          threshold_rel=0.05,
                          exclude_border=8)

patch_size = 5
# Extract features from the corners
desc1 = describe_keypoints(img1, keypoints1,
                           desc_func=simple_descriptor,
                           patch_size=patch_size)
desc2 = describe_keypoints(img2, keypoints2,
                           desc_func=simple_descriptor,
                           patch_size=patch_size)

# Match descriptors in image1 to those in image2
matches = match_descriptors(desc1, desc2, 0.7)

# Extract matched keypoints
p1 = keypoints1[matches[:,0]]
p2 = keypoints2[matches[:,1]]

# Find affine transformation matrix H that maps p2 to p1
H = fit_affine_matrix(p1, p2)

output_shape, offset = get_output_space(img1, [img2], [H])
print("Output shape:", output_shape)
print("Offset:", offset)

# Warp images into output space
img1_warped = warp_image(img1, np.eye(3), output_shape, offset)
img1_mask = (img1_warped != -1) # Mask == 1 inside the image
img1_warped[~img1_mask] = 0     # Return background values to 0

img2_warped = warp_image(img2, H, output_shape, offset)
img2_mask = (img2_warped != -1) # Mask == 1 inside the image
img2_warped[~img2_mask] = 0     # Return background values to 0

# Plot warped images
plt.subplot(1,2,1)
plt.imshow(img1_warped)
plt.title('Image 1 warped')
plt.axis('off')

plt.subplot(1,2,2)
plt.imshow(img2_warped)
plt.title('Image 2 warped')
plt.axis('off')
plt.show()

Output shape: [496 615]
Offset: [-39.37184617   0.        ]

merged = img1_warped + img2_warped

# Track the overlap by adding the masks together
overlap = (img1_mask * 1.0 +  # Multiply by 1.0 for bool -> float conversion
           img2_mask)

# Normalize through division by `overlap` - but ensure the minimum is 1
normalized = merged / np.maximum(overlap, 1)
plt.imshow(normalized)
plt.axis('off')
plt.show()
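
The merging step above averages the two warped images wherever they overlap and keeps a single image's value elsewhere. A toy sketch with two hypothetical 1x4 arrays shows the arithmetic; the pixel values are made up, and -1 marks pixels outside the source image, mirroring the convention used in the code above.

```python
import numpy as np

# Two hypothetical "warped images"; -1 means no source pixel there.
img1_w = np.array([[0.2, 0.4, -1.0, -1.0]])
img2_w = np.array([[-1.0, 0.8, 0.6, -1.0]])

mask1 = img1_w != -1
mask2 = img2_w != -1
img1_w[~mask1] = 0  # reset background to 0 so it does not affect the sum
img2_w[~mask2] = 0

merged = img1_w + img2_w
overlap = mask1 * 1.0 + mask2  # 0, 1, or 2 contributing images per pixel

# Divide by the number of contributing images, but never by zero.
normalized = merged / np.maximum(overlap, 1)
print(normalized)
```

Only the second pixel is covered by both images, so it becomes the mean of 0.4 and 0.8, while the last pixel is covered by neither and stays 0.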