Stanford / CS131 / Homework 3-2: Keypoints and Descriptors

This continues the previous post, Stanford University / CS131 / Homework 3-1: Harris Corner Detector. This assignment covers creating and matching descriptors.


Part 2 Describing and Matching Keypoints

We are now able to localize keypoints in two images by running the Harris corner detector independently on them. The next question is: how do we determine which pairs of keypoints come from corresponding locations in the two images? To match the detected keypoints, we must come up with a way to describe them based on their local appearance. Generally, the region around each detected keypoint location is converted into a fixed-size vector called a descriptor.
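As a rough sketch of this idea (using a toy image and hand-picked keypoint coordinates, not the homework's actual `describe_keypoints` helper), the patch around each keypoint can be cut out and flattened into a fixed-size vector:

```python
import numpy as np

def patches_to_vectors(image, keypoints, patch_size=5):
    """Cut out a patch_size x patch_size patch around each keypoint and
    flatten it into a fixed-size 1D vector; border keypoints are skipped."""
    half = patch_size // 2
    vectors = []
    for r, c in keypoints:
        patch = image[r - half:r + half + 1, c - half:c + half + 1]
        if patch.shape == (patch_size, patch_size):
            vectors.append(patch.flatten())
    return np.array(vectors)

# Toy 10x10 "image" and two hand-picked keypoints
img = np.arange(100, dtype=float).reshape(10, 10)
vecs = patches_to_vectors(img, [(5, 5), (0, 0)])  # (0, 0) is too close to the border
print(vecs.shape)  # (1, 25): one surviving keypoint, 5*5 values each
```

Every keypoint thus maps to a vector of the same length, so descriptors from different images can be compared directly.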

Creating Descriptors

In this section, you are going to implement simple_descriptor; each keypoint is described by the normalized intensity of a small patch around it.

from __future__ import print_function

import numpy as np
from skimage import filters
from skimage.feature import corner_peaks
from skimage.io import imread
import matplotlib.pyplot as plt
from time import time

%matplotlib inline
plt.rcParams['figure.figsize'] = (20.0, 20.0) # set default size of plots
plt.rcParams["font.size"] = "17"
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
from panorama import harris_corners

img1 = imread('uttower1.jpg', as_gray=True)
img2 = imread('uttower2.jpg', as_gray=True)

# Detect keypoints in two images
keypoints1 = corner_peaks(harris_corners(img1, window_size=3),
                          threshold_rel=0.05,
                          exclude_border=8)
keypoints2 = corner_peaks(harris_corners(img2, window_size=3),
                          threshold_rel=0.05,
                          exclude_border=8)

# Display detected keypoints
plt.subplot(1,2,1)
plt.imshow(img1)
plt.scatter(keypoints1[:,1], keypoints1[:,0], marker='x')
plt.axis('off')
plt.title('Detected Keypoints for Image 1')

plt.subplot(1,2,2)
plt.imshow(img2)
plt.scatter(keypoints2[:,1], keypoints2[:,0], marker='x')
plt.axis('off')
plt.title('Detected Keypoints for Image 2')
plt.show()

Matching Descriptors

Then, implement the match_descriptors function to find good matches between two sets of descriptors. First, calculate the Euclidean distance between all pairs of descriptors from image 1 and image 2. Then use these distances to decide whether a pair is a good match: if the distance to the closest vector is significantly smaller (by a given factor) than the distance to the second-closest, we call it a match. The output of the function is an array where each row holds the indices of one pair of matching descriptors.

def simple_descriptor(patch):
    """
    Describe the patch by normalizing the image values into a standard 
    normal distribution (having mean of 0 and standard deviation of 1) 
    and then flattening into a 1D array.     
    The normalization will make the descriptor more robust to change 
    in lighting condition.    
    Hint:
        If a denominator is zero, divide by 1 instead.    
    Args:
        patch: grayscale image patch of shape (h, w)    
    Returns:
        feature: 1D array of shape (h * w)
    """
    feature = []
    ### YOUR CODE HERE
    std = np.std(patch)
    mean = np.mean(patch)
    if std > 0.0:
        feature = (patch - mean) / std
    else:
        feature = patch - mean
    feature = feature.flatten()
    ### END YOUR CODE
    return feature
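As a quick sanity check of why this normalization helps, here is a toy snippet (it re-implements the same zero-mean, unit-std normalization inline so it runs standalone): the descriptor is unchanged by a constant brightness shift or a contrast scaling of the patch.

```python
import numpy as np

def normalize_patch(patch):
    # Same zero-mean, unit-std normalization as simple_descriptor above
    std = np.std(patch)
    return ((patch - np.mean(patch)) / (std if std > 0 else 1)).flatten()

patch = np.array([[0.2, 0.4], [0.6, 0.8]])
bright = normalize_patch(patch + 0.3)  # constant brightness shift
scaled = normalize_patch(patch * 2.0)  # contrast scaling
print(np.allclose(normalize_patch(patch), bright))  # True
print(np.allclose(normalize_patch(patch), scaled))  # True
```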

from scipy.spatial.distance import cdist

def match_descriptors(desc1, desc2, threshold=0.5):
    """
    Match the feature descriptors by finding distances between them. A match is formed 
    when the distance to the closest vector is much smaller than the distance to the 
    second-closest, that is, the ratio of the distances should be smaller
    than the threshold. Return the matches as pairs of vector indices.    
    Args:
        desc1: an array of shape (M, P) holding descriptors of size P about M keypoints
        desc2: an array of shape (N, P) holding descriptors of size P about N keypoints
        threshold: maximum allowed ratio of closest to second-closest distance
    Returns:
        matches: an array of shape (Q, 2) where each row holds the indices of one pair 
        of matching descriptors
    """
    matches = []    
    N = desc1.shape[0]
    dists = cdist(desc1,desc2)
    ### YOUR CODE HERE
    for i in range(N):
        dist = dists[i, :]
        # Ratio test: np.partition(dist, 1) guarantees the two smallest
        # distances occupy indices 0 and 1 in order
        closest, second_closest = np.partition(dist, 1)[:2]
        if closest / second_closest <= threshold:
            matches.append([i, np.argmin(dist)])
    matches = np.array(matches).reshape(-1, 2)
    ### END YOUR CODE    
    return matches
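To see the ratio test in action on toy data (the descriptor values below are made up for illustration), the same nearest/second-nearest logic can be run directly on a small distance matrix:

```python
import numpy as np
from scipy.spatial.distance import cdist

# Made-up descriptors: rows of desc1 should match rows 0 and 2 of desc2
desc1 = np.array([[0.0, 0.0], [1.0, 1.0]])
desc2 = np.array([[0.1, 0.0], [5.0, 5.0], [1.0, 1.1]])

dists = cdist(desc1, desc2)  # (2, 3) matrix of pairwise Euclidean distances
matches = []
for i, row in enumerate(dists):
    closest, second_closest = np.partition(row, 1)[:2]
    if closest / second_closest <= 0.5:  # ratio test
        matches.append([i, int(np.argmin(row))])
print(matches)  # [[0, 0], [1, 2]]
```

Both keypoints pass the ratio test because the far-away descriptor `[5.0, 5.0]` never comes close to being the second-nearest by a small margin.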

from panorama import describe_keypoints
from utils import plot_matches

patch_size = 5

# Extract features from the corners
desc1 = describe_keypoints(img1, keypoints1,
                           desc_func=simple_descriptor,
                           patch_size=patch_size)
desc2 = describe_keypoints(img2, keypoints2,
                           desc_func=simple_descriptor,
                           patch_size=patch_size)

# Match descriptors in image1 to those in image2
matches = match_descriptors(desc1, desc2, 0.7)

# Plot matches
fig, ax = plt.subplots(1, 1, figsize=(15, 12))
ax.axis('off')
plot_matches(ax, img1, img2, keypoints1, keypoints2, matches)
plt.show()
plt.imshow(imread('solution_simple_descriptor.png'))
plt.axis('off')
plt.title('Matched Simple Descriptor Solution')
plt.show()