Stanford/CS131/Homework 5-2: Pixel-Level Features (Color and Position Features)

This post continues from the previous one, Stanford University/CS131/Homework 5-1: Clustering Algorithm. This homework covers Pixel-Level Features, Color Features, Color and Position Features, and Implement Your Own Feature.

Pixel-Level Features

Before we can use a clustering algorithm to segment an image, we must compute a feature vector for each pixel. The feature vector for each pixel should encode the qualities that we care about in a good segmentation. More concretely, for a pair of pixels $p_i$ and $p_j$ with corresponding feature vectors $f_i$ and $f_j$, the distance between $f_i$ and $f_j$ should be small if we believe that $p_i$ and $p_j$ should be placed in the same segment, and large otherwise.
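As a minimal illustration of this requirement, the sketch below compares Euclidean distances between made-up RGB feature vectors. The three colors are hypothetical values chosen for the example, not pixels from train.jpg.

```python
import numpy as np

# Hypothetical RGB feature vectors in [0, 1]; values are illustrative only.
f_red_a = np.array([0.90, 0.10, 0.10])
f_red_b = np.array([0.85, 0.15, 0.12])
f_blue = np.array([0.10, 0.10, 0.90])

# Euclidean distance between feature vectors
d_same = np.linalg.norm(f_red_a - f_red_b)  # two similar reds
d_diff = np.linalg.norm(f_red_a - f_blue)   # a red and a blue

print(d_same, d_diff)
```

The two reds end up far closer to each other than either is to the blue, which is the property a clustering algorithm relies on to group them into one segment.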

from __future__ import print_function  # __future__ imports must come first
from time import time
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rc
from skimage import io, color
import random
from segmentation import hierarchical_clustering, kmeans_fast
from scipy.spatial.distance import squareform, pdist
from skimage.util import img_as_float

%matplotlib inline
plt.rcParams['figure.figsize'] = (15.0, 10.0) # set default size of plots
plt.rcParams["font.size"] = "17"
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
# Load and display image
img = io.imread('train.jpg')
H, W, C = img.shape

plt.imshow(img)
plt.axis('off')
plt.show()

Color Features

One of the simplest possible feature vectors for a pixel is the vector of colors for that pixel. Implement color_features in segmentation.py. The output should look like the following:
[Figure: expected segmentation using color features]

def color_features(img):
    """ Represents a pixel by its color.
    Args:
        img - array of shape (H, W, C)
    Returns:
        features - array of (H * W, C)
    """
    H, W, C = img.shape
    img = img_as_float(img)
    features = np.zeros((H*W, C))
    ### YOUR CODE HERE
    features = img.reshape(H*W, C)
    ### END YOUR CODE
    return features
np.random.seed(0)

features = color_features(img)

# Sanity checks
assert features.shape == (H * W, C),\
    "Incorrect shape! Check your implementation."

# np.float was removed in NumPy 1.24; np.float64 is the equivalent dtype
assert features.dtype == np.float64,\
    "dtype of color_features should be float."

assignments = kmeans_fast(features, 8)
segments = assignments.reshape((H, W))

# Display segmentation
plt.imshow(segments, cmap='viridis')
plt.axis('off')
plt.show()

In the cell below, we visualize each segment as the mean color of the pixels in that segment.

from utils import visualize_mean_color_image
visualize_mean_color_image(img, segments)
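The source of utils.visualize_mean_color_image is not shown in this post, so the following is only a sketch of how such a mean-color image could be computed. The function name mean_color_image and the toy arrays are hypothetical.

```python
import numpy as np

def mean_color_image(img, segments):
    """Replace every pixel with the mean color of its segment.

    img: float array of shape (H, W, C)
    segments: int array of shape (H, W) holding one label per pixel
    """
    out = np.zeros_like(img, dtype=float)
    for label in np.unique(segments):
        mask = segments == label            # pixels belonging to this segment
        out[mask] = img[mask].mean(axis=0)  # per-channel mean over those pixels
    return out

# Tiny 2x2 example with two segments (left column = 0, right column = 1)
toy_img = np.array([[[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],
                    [[0.2, 0.2, 0.2], [0.8, 0.8, 0.8]]])
toy_segments = np.array([[0, 1],
                         [0, 1]])
result = mean_color_image(toy_img, toy_segments)
```

The result could then be displayed with plt.imshow, as in the other cells.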

Color and Position Features

Another simple feature vector for a pixel concatenates its color and its position within the image. In other words, for a pixel of color $(r, g, b)$ located at position $(x, y)$ in the image, its feature vector would be $(r, g, b, x, y)$. However, the color and position features may have drastically different ranges; for example, each color channel of an image may be in the range $[0, 1)$, while the position of each pixel may have a much wider range. Uneven scaling between different features in the feature vector may cause clustering algorithms to behave poorly.
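To make the scaling problem concrete, here is a small sketch with made-up $(r, g, b, x, y)$ vectors that use raw pixel coordinates. All values are hypothetical.

```python
import numpy as np

# Hypothetical (r, g, b, x, y) features with unnormalized pixel coordinates.
black_near = np.array([0.0, 0.0, 0.0, 10.0, 10.0])
white_near = np.array([1.0, 1.0, 1.0, 11.0, 10.0])  # opposite color, 1 px away
black_far = np.array([0.0, 0.0, 0.0, 110.0, 10.0])  # same color, 100 px away

d_color = np.linalg.norm(black_near - white_near)    # differs mostly in color
d_position = np.linalg.norm(black_near - black_far)  # differs only in position

print(d_color, d_position)
```

The black-to-white distance is tiny next to the 100-pixel offset, so without rescaling, position would dominate the clustering and color would barely matter.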

One way to correct for uneven scaling between different features is to apply some sort of normalization to the feature vector. One of the simplest types of normalization is to force each feature to have zero mean and unit variance.
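Zero-mean, unit-variance normalization can be sketched as follows. The helper name standardize and its eps guard are assumptions added for this example; the guard only avoids division by zero when a column is constant.

```python
import numpy as np

def standardize(features, eps=1e-12):
    """Scale each column of `features` to zero mean and unit variance.

    eps is an assumption added here: it keeps the division finite when a
    column is constant (for example, in a single-color image).
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / np.maximum(std, eps)

# Toy features: one column in [0, 1], one in pixel units
toy = np.array([[0.0, 0.0],
                [0.5, 100.0],
                [1.0, 200.0]])
scaled = standardize(toy)
print(scaled)
```

After scaling, both columns have the same spread, so neither dominates a Euclidean distance.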

Implement color_position_features in segmentation.py.

The output segmentation should look like the following:
[Figure: expected segmentation using color and position features]

def color_position_features(img):
    """ Represents a pixel by its color and position.
    Combine pixel's RGB value and xy coordinates into a feature vector.
    i.e. for a pixel of color (r, g, b) located at position (x, y) in the
    image. its feature vector would be (r, g, b, x, y).
    Don't forget to normalize features.
    Hints
    - You may find np.mgrid and np.dstack useful
    - You may use np.mean and np.std
    Args:
        img - array of shape (H, W, C)
    Returns:
        features - array of (H * W, C+2)
    """
    H, W, C = img.shape
    color = img_as_float(img)
    features = np.zeros((H*W, C+2))
    ### YOUR CODE HERE
    position = np.dstack(np.mgrid[0:H,0:W]).reshape((H*W,2))
    features[:,0:C] = color.reshape((H*W,C))
    features[:,C:C+2] = position
    features = (features-np.mean(features,axis=0))\
                        /np.std(features,axis=0)
    ### END YOUR CODE
    return features
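A note on the position columns: the sketch below runs the same np.mgrid / np.dstack expression on a tiny hypothetical 2x3 grid. Each row comes out as (row, column), that is (y, x) rather than (x, y). Since every column is normalized independently, the order should not matter for k-means.

```python
import numpy as np

H, W = 2, 3  # hypothetical tiny image size

# np.mgrid[0:H, 0:W] has shape (2, H, W): row indices first, then column indices.
# np.dstack stacks the two (H, W) grids along a new last axis -> (H, W, 2).
position = np.dstack(np.mgrid[0:H, 0:W]).reshape((H * W, 2))

print(position)
```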
np.random.seed(0)

features = color_position_features(img)

# Sanity checks
assert features.shape == (H * W, C + 2),\
    "Incorrect shape! Check your implementation."

# np.float was removed in NumPy 1.24; np.float64 is the equivalent dtype
assert features.dtype == np.float64,\
    "dtype of color_position_features should be float."

assignments = kmeans_fast(features, 8)
segments = assignments.reshape((H, W))

# Display segmentation
plt.imshow(segments, cmap='viridis')
plt.axis('off')
plt.show()
visualize_mean_color_image(img, segments)

Extra Credit: Implement Your Own Feature

For this programming assignment we have asked you to implement a very simple feature transform for each pixel. While it is not required, you should feel free to experiment with other feature transforms. Could your final segmentations be improved by adding gradients, edges, SIFT (Scale-Invariant Feature Transform) descriptors, or other information to your feature vectors? Could a different type of normalization give better results?

Implement your feature extractor my_features in segmentation.py.
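On the question of a different normalization, one option to try is min-max scaling, which maps each feature linearly to $[0, 1]$. The sketch below is an assumption-laden example rather than part of the assignment: the helper name min_max_scale and its eps guard are hypothetical, and whether it improves the segmentation of train.jpg is not tested here.

```python
import numpy as np

def min_max_scale(features, eps=1e-12):
    """Rescale each column of `features` linearly to the range [0, 1].

    eps is an assumption added here to avoid dividing by zero when a
    column is constant.
    """
    lo = features.min(axis=0)
    hi = features.max(axis=0)
    return (features - lo) / np.maximum(hi - lo, eps)

# Toy features: one color-like column and one position-like column
toy = np.array([[0.2, 0.0],
                [0.4, 50.0],
                [0.6, 200.0]])
scaled = min_max_scale(toy)
print(scaled)
```

Unlike zero-mean, unit-variance scaling, this keeps every feature in a fixed range, so a single outlier can compress the remaining values of that feature.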

def my_features(img):
    """ Implement your own features
    Args:
        img - array of shape (H, W, C)
    Returns:
        features - array of (H * W, C+3): color, position, gradient magnitude
    """
    features = None
    ### YOUR CODE HERE
    H, W, C = img.shape
    colors = img_as_float(img)
    position = np.dstack(np.mgrid[0:H,0:W]).reshape((H*W,2))
    gray = color.rgb2gray(img)
    grad = np.gradient(gray)
    grad = np.abs(grad[0])+np.abs(grad[1])
    features = np.zeros((H*W,C+3))
    features[:,0:C] = np.reshape(colors,(H*W,C))
    features[:,C:C+2] = position
    features[:,C+2] = grad.reshape((H*W))
    features = (features-np.mean(features,axis=0))\
                        / (np.std(features,axis=0))
    ### END YOUR CODE
    return features
# Feel free to experiment with different images
# and varying number of segments
img = io.imread('train.jpg')
num_segments = 8

H, W, C = img.shape

# Extract pixel-level features
features = my_features(img)

# Run clustering algorithm
assignments = kmeans_fast(features, num_segments)

segments = assignments.reshape((H, W))

# Display segmentation
plt.imshow(segments, cmap='viridis')
plt.axis('off')
plt.show()
Reference: https://github.com/