スタンフォード/CS131/宿題2-1 キャニーエッジ検出

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from time import time
from skimage import io
from __future__ import print_function
%matplotlib inline
plt.rcParams['figure.figsize'] = 18.0, 11.0 # set default size of plots
plt.rcParams["font.size"] = "18"
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
スポンサーリンク

Part 1: Canny Edge Detector¶

In this part, you are going to implment Canny edge detector. The Canny edge detection algorithm can be broken down in to five steps:
このパートでは、キャニー法を実装する。キャニーエッジ検出アルゴリズムは、5つのステップに分解することができる。

1. Smoothing(スムージング)
3. Non-maximum suppression(非極大値抑制)
4. Double thresholding(二重閾値)
5. Edge tracking by hysterisis(ヒステリシスによるエッジ追跡)

Smoothing¶

Implementation¶

We first smooth the input image by convolving it with a Gaussian kernel. The equation for a Gaussian kernel of size $(2k+1)\times(2k+1)$ is given by:
ガウス核を使った畳み込みで最初に入力画像をスムーズにする。サイズが $(2k+1)\times(2k+1)$のガウス核に対する方程式は以下の通り。

$$h_{ij}=\frac{1}{2\pi\sigma^2}\exp{\Bigl(-\frac{(i-k)^2+(j-k)^2}{2\sigma^2}\Bigr)}, 0\leq i,j < 2k+1$$

Implement gaussian_kernel in edge.py and run the code below.
edge.pyにconvgaussian_kernelを実装して下のコードを走らせる。

def conv(image, kernel):
""" An implementation of convolution filter.
This function uses element-wise multiplication and np.sum()
to efficiently compute weighted sum of neighborhood at each
pixel.
Args:
image: numpy array of shape (Hi, Wi)
kernel: numpy array of shape (Hk, Wk)
Returns:
out: numpy array of shape (Hi, Wi)
"""
Hi, Wi = image.shape
Hk, Wk = kernel.shape
out = np.zeros((Hi, Wi))
# For this assignment, we will use edge values to pad the images.
# Zero padding will make derivatives at the image boundary very big,
# whereas we want to ignore the edges at the boundary.
kernel = np.flip(np.flip(kernel, 0), 1)
for i in range(Hi):
for j in range(Wi):
return out
def gaussian_kernel(size, sigma):
""" Implementation of Gaussian Kernel.
This function follows the gaussian kernel formula,
and creates a kernel matrix.
Hints:
- Use np.pi and np.exp to compute pi and exp
Args:
size: int of the size of output matrix
sigma: float of sigma to calculate kernel
Returns:
kernel: numpy array of shape (size, size)
"""
kernel = np.zeros((size, size))
k = size // 2
for i in range(size):
for j in range(size):
kernel[i][j]=(1/(2*np.pi*sigma**2))* \
np.exp(-((i-k)**2+(j-k)**2)/(2*sigma**2))
return kernel
# Define 3x3 Gaussian kernel with std = 1
kernel = gaussian_kernel(3, 1)
kernel_test = np.array(
[[ 0.05854983, 0.09653235, 0.05854983],
[ 0.09653235, 0.15915494, 0.09653235],
[ 0.05854983, 0.09653235, 0.05854983]]
)
# Test Gaussian kernel
if not np.allclose(kernel, kernel_test):
# Test with different kernel_size and sigma
kernel_size = 5
sigma = 1.4

# Define 5x5 Gaussian kernel with std = sigma
kernel = gaussian_kernel(kernel_size, sigma)

# Convolve image with kernel to achieve smoothed effect
smoothed = conv(img, kernel)

plt.subplot(1,2,1)
plt.imshow(img)
plt.title('Original image')
plt.axis('off')

plt.subplot(1,2,2)
plt.imshow(smoothed)
plt.title('Smoothed image')
plt.axis('off')
plt.show()

What is the effect of the kernel_size and sigma?
カーネルとシグマの効果は何か？
The larger sigma and kernel size makes the image more blurry.
シグマとカーネルサイズが大きくなるほど画像はより不鮮明になる。

The gradient of a 2D scalar function $I:\mathbb{R}^2\rightarrow{\mathbb{R}}$ in Cartesian coordinate is defined by:
デカルト座標の2Dスカラー関数$I:\mathbb{R}^2\rightarrow{\mathbb{R}}$の勾配は以下のように表わせる。

$$\nabla{I(x,y)}=\bigl[\frac{\partial{I}}{\partial{x}},\frac{\partial{I}}{\partial{y}}\bigr],$$

where

$$\frac{\partial{I(x,y)}}{\partial{x}}=\lim_{\Delta{x}\to{0}}\frac{I(x+\Delta{x},y)-I(x,y)}{\Delta{x}} \\ \frac{\partial{I(x,y)}}{\partial{y}}=\lim_{\Delta{y}\to{0}}\frac{I(x,y+\Delta{y})-I(x,y)}{\Delta{y}}.$$

In case of images, we can approximate the partial derivatives by taking differences at one pixel intervals:

$$\frac{\partial{I(x,y)}}{\partial{x}}\approx{\frac{I(x+1,y)-I(x-1,y)}{2}} \\ \frac{\partial{I(x,y)}}{\partial{y}}\approx{\frac{I(x,y+1)-I(x,y-1)}{2}}$$

Note that the partial derivatives can be computed by convolving the image $I$ with some appropriate kernels $D_x$ and $D_y$:

$$\frac{\partial{I}}{\partial{x}}\approx{I*D_x}=G_x \\ \frac{\partial{I}}{\partial{y}}\approx{I*D_y}=G_y$$

Implementation¶

Find the kernels $D_x$ and $D_y$ and implement partial_x and partial_y using conv defined in edge.py.
カーネル$D_x$と$D_y$を探し、edge.pyで定義されているconvを利用してpartial_xpartial_yを実装する。

-Hint: Remember that convolution flips the kernel.
ヒント：畳み込みがカーネルを反転させることを忘れずに。

def partial_x(img):
""" Computes partial x-derivative of input img.
Hints:
- You may use the conv function in defined in this file.
Args:
img: numpy array of shape (H, W)
Returns:
out: x-derivative image
"""
out = None
kernel = np.array([[.5, 0., -.5]])
out = conv(img, kernel)
return out
def partial_y(img):
""" Computes partial y-derivative of input img.
Hints:
- You may use the conv function in defined in this file.
Args:
img: numpy array of shape (H, W)
Returns:
out: y-derivative image
"""
out = None
kernel = np.array([[.5], [0.], [-.5]])
out = conv(img, kernel)
return out
# Test input
I = np.array(
[[0, 0, 0],
[0, 1, 0],
[0, 0, 0]]
)

# Expected outputs
I_x_test = np.array(
[[ 0, 0, 0],
[ 0.5, 0, -0.5],
[ 0, 0, 0]]
)

I_y_test = np.array(
[[ 0, 0.5, 0],
[ 0, 0, 0],
[ 0, -0.5, 0]]
)

# Compute partial derivatives
I_x = partial_x(I)
I_y = partial_y(I)

# Test correctness of partial_x and partial_y
if not np.all(I_x == I_x_test):
print('partial_x incorrect')

if not np.all(I_y == I_y_test):
print('partial_y incorrect')
# Compute partial derivatives of smoothed image
Gx = partial_x(smoothed)
Gy = partial_y(smoothed)

plt.subplot(1,2,1)
plt.imshow(Gx)
plt.title('Derivative in x direction')
plt.axis('off')

plt.subplot(1,2,2)
plt.imshow(Gy)
plt.title('Derivative in y direction')
plt.axis('off')
plt.show()

What is the reason for performing smoothing prior to computing the gradients?

Because if there is a lot of noise, it degrades the computation performance.
ノイズが多いと計算速度を低下させるから。

Implementation¶

Now, we can compute the magnitude and direction of gradient with the two partial derivatives:

$$G = \sqrt{G_{x}^{2}+G_{y}^{2}} \\ \Theta = arctan\bigl(\frac{G_{y}}{G_{x}}\bigr)$$

Implement gradient in edge.py which takes in an image and outputs $G$ and $\Theta$.
edge.pyに、画層を取り込んで、$G$と$\Theta$を出力するgradientを実装せよ。

-Hint: Use np.arctan2 to compute $\Theta$.
ヒント；$\Theta$の計算にnp.arctan2が使える。

""" Returns gradient magnitude and direction of input img.
Args:
img: Grayscale image. Numpy array of shape (H, W)
Returns:
G: Magnitude of gradient at each pixel in img.
Numpy array of shape (H, W)
theta: Direction(in degrees, 0 <= theta < 360) of gradient
at each pixel in img. Numpy array of shape (H, W)
"""
G = np.zeros(img.shape)
theta = np.zeros(img.shape)
Gx = partial_x(img)
Gy = partial_y(img)
G = np.sqrt(Gx**2+Gy**2)
theta = np.arctan2(Gx,Gx)
return G, theta

if not np.all(G >= 0):
print('Magnitude of gradients should be non-negative.')

if not np.all((theta >= 0) * (theta < 360)):
print('Direction of gradients should be in range 0 <= theta < 360')

plt.imshow(G)
plt.axis('off')
plt.show()

Non-maximum suppression¶

You should be able to note that the edges extracted from the gradient of the smoothed image is quite thick and blurry. The purpose of this step is to convert the “blurred” edges into “sharp” edges. Basically, this is done by preserving all local maxima in the gradient image and discarding everything else. The algorithm is for each pixel (x,y) in the gradient image:

1. Round the gradient direction $\Theta[y,x]$ to the nearest 45 degrees, corresponding to the use of an 8-connected neighbourhood.
勾配方向$\Theta[y,x]$を、8連結近傍適用に相当する直近45度へ四捨五入する。

2. Compare the edge strength of the current pixel with the edge strength of the pixel in the positive and negative gradient direction. For example, if the gradient direction is south (theta=90), compare with the pixels to the north and south.
現画素のエッジ強度と、正・負勾配方向画素のエッジ強度を比較する。例えば、勾配方向が南(θ=90)であれば、画素を北と南で比較する。

3. If the edge strength of the current pixel is the largest; preserve the value of the edge strength. If not, suppress (i.e. remove) the value.
現画素のエッジ強度が最大な場合はエッジ強度の値を維持する。もし、最大でない場合は値を除去する。

Implement non_maximum_suppression in edge.py.
edge.pyにnon_maximum_suppressionを実装せよ。

def non_maximum_suppression(G, theta):
""" Performs non-maximum suppression
This function performs non-maximum suppression along the direction
Args:
G: gradient magnitude image with shape of (H, W)
theta: direction of gradients with shape of (H, W)
Returns:
out: non-maxima suppressed image
"""
H, W = G.shape
out = np.zeros((H, W))
# Round the gradient direction to the nearest 45 degrees
theta = np.floor((theta+22.5)/45)*45
for i in range(1,H-1):
for j in range(1,W-1):
p1 = G[i-m,j-n]
p2 = G[i+m,j+n]
if not (G[i,j]>=p1 and G[i,j]>=p2):
out[i,j] = 0
else:
out[i,j] = G[i,j]
return out
# Test input
g = np.array(
[[0.4, 0.5, 0.6],
[0.3, 0.5, 0.7],
[0.4, 0.5, 0.6]]
)
# Print out non-maximum suppressed output
# varying theta
for angle in range(0, 180, 45):
print('Thetas:', angle)
t = np.ones((3, 3)) * angle # Initialize theta
print(non_maximum_suppression(g, t))
Thetas: 0
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
Thetas: 45
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
Thetas: 90
[[0.  0.  0. ]
[0.  0.5 0. ]
[0.  0.  0. ]]
Thetas: 135
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
nms = non_maximum_suppression(G, theta)
plt.imshow(nms)
plt.title('Non-maximum suppressed')
plt.axis('off')
plt.show()
スポンサーリンク
スポンサーリンク

フォローする