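This post continues from the previous one, Stanford University/CS131/Homework 2-2. This time the tutorial covers the questions part and the extra-credit part; the extra-credit part is optional.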
(a) Suppose that the Canny edge detector successfully detects an edge in an image. The edge (see the figure above) is then rotated by θ, where the relationship between a point on the original edge $(x, y)$ and a point on the rotated edge $(x', y')$ is defined as
Will the rotated edge be detected using the same Canny edge detector? Provide either a mathematical proof or a counter example.
Hint: The detection of an edge by the Canny edge detector depends only on the magnitude of its derivative. The derivative at point $(x, y)$ is determined by its components along the x and y directions. Think about how these magnitudes have changed because of the rotation.
Suppose the magnitude of the original derivative at point $(x, y)$ is
Then the magnitude of the rotated derivative at point $(x', y')$ is
So the magnitude does not change, and the rotated edge can be detected using the same Canny edge detector.
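As a sketch of the argument, assume the rotation is the standard 2D rotation $x' = x\cos\theta - y\sin\theta$, $y' = x\sin\theta + y\cos\theta$, and that the derivative components rotate in the same way. The symbols $I_x$, $I_y$ (original components) and $I_{x'}$, $I_{y'}$ (rotated components) are notation introduced here for illustration only.

$$
\begin{aligned}
I_{x'} &= I_x\cos\theta - I_y\sin\theta, \qquad I_{y'} = I_x\sin\theta + I_y\cos\theta \\
I_{x'}^2 + I_{y'}^2 &= I_x^2(\cos^2\theta + \sin^2\theta) + I_y^2(\sin^2\theta + \cos^2\theta) - 2 I_x I_y\sin\theta\cos\theta + 2 I_x I_y\sin\theta\cos\theta \\
&= I_x^2 + I_y^2 \\
\sqrt{I_{x'}^2 + I_{y'}^2} &= \sqrt{I_x^2 + I_y^2}
\end{aligned}
$$

Under that assumption the cross terms cancel and $\cos^2\theta + \sin^2\theta = 1$, so the magnitude compared against the thresholds is the same before and after rotation.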
(b) After running the Canny edge detector on an image, you notice that long edges are broken into short segments separated by gaps. In addition, some spurious edges appear. For each of the two thresholds (low and high) used in hysteresis thresholding, explain how you would adjust the threshold (up or down) to address both problems. Assume that a setting exists for the two thresholds that produces the desired result. Briefly explain your answer.
If long edges are broken into short segments, weak edge pixels that should link the strong segments are being discarded, so we should decrease the low threshold. If spurious edges appear, noise responses are being accepted as strong edges, so we should increase the high threshold.
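The effect of the two thresholds can be illustrated with a minimal sketch of hysteresis thresholding on a toy gradient-magnitude array. The function `hysteresis` and the array `mag` are hypothetical names written for this illustration; this is not the homework's `edge.py` implementation, and the threshold values are chosen only to make the behavior visible.

```python
import numpy as np
from collections import deque

def hysteresis(magnitude, low, high):
    """Keep strong pixels (> high) plus weak pixels (> low) that are
    8-connected to a strong pixel, directly or through other weak pixels."""
    strong = magnitude > high
    weak = (magnitude > low) & ~strong
    edges = strong.copy()
    queue = deque(zip(*np.nonzero(strong)))  # start the search from strong pixels
    height, width = magnitude.shape
    while queue:
        i, j = queue.popleft()
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                inside = 0 <= ni < height and 0 <= nj < width
                if inside and weak[ni, nj] and not edges[ni, nj]:
                    edges[ni, nj] = True
                    queue.append((ni, nj))
    return edges

# Toy magnitudes (illustrative values): row 1 is one long edge with a weak
# stretch in the middle, and (3, 5) is an isolated noise response.
mag = np.zeros((5, 7))
mag[1] = [12, 12, 6, 6, 6, 12, 12]
mag[3, 5] = 9

broken = hysteresis(mag, low=8, high=8.5)  # edge split in two, noise pixel kept
fixed = hysteresis(mag, low=5, high=10)    # edge complete, noise pixel dropped

print(broken.astype(int))
print(fixed.astype(int))
```

With the lower `low`, the weak stretch is linked to its strong neighbors; with the higher `high`, the isolated response no longer counts as strong and, having no strong neighbor, is discarded.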
Extra Credit: Optimizing Edge Detector
One way of evaluating an edge detector is to compare detected edges with manually specified ground truth edges. Here, we use precision, recall and F1 score as evaluation metrics. We provide you 40 images of objects with ground truth edge annotations. Run the code below to compute precision, recall and F1 score over the entire set of images. Then, tweak the parameters of the Canny edge detector to get as high F1 score as possible. You should be able to achieve F1 score higher than 0.31 by carefully setting the parameters.
from __future__ import print_function  # must precede all other statements; no-op on Python 3

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from time import time
from skimage import io

%matplotlib inline
plt.rcParams['figure.figsize'] = (15.0, 12.0)  # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Automatically reload edited modules (such as edge.py)
%load_ext autoreload
%autoreload 2
# canny is defined in the homework's edge.py and is used in the evaluation loop
from edge import conv, gaussian_kernel, partial_x, partial_y, canny
from os import listdir
from itertools import product
# Define parameters to test
sigmas = [1.25]
highs = [10.0]
lows = [9.0, 9.25, 9.50, 9.75, 9.90, 9.95]

for sigma, high, low in product(sigmas, highs, lows):
    print("sigma={}, high={}, low={}".format(sigma, high, low))

    # Counts are pooled over the whole image set
    n_detected = 0.0
    n_gt = 0.0
    n_correct = 0.0

    for img_file in listdir('images/objects'):
        img = io.imread('images/objects/' + img_file, as_gray=True)
        gt = io.imread('images/gt/' + img_file + '.gtf.pgm', as_gray=True)

        mask = (gt != 5)  # exclude the "don't care" region
        gt = (gt == 0)    # binary image of GT edges

        edges = canny(img, kernel_size=5, sigma=sigma, high=high, low=low)
        edges = edges * mask

        n_detected += np.sum(edges)
        n_gt += np.sum(gt)
        n_correct += np.sum(edges * gt)

    p_total = n_correct / n_detected
    r_total = n_correct / n_gt
    f1 = 2 * (p_total * r_total) / (p_total + r_total)
    print('Total precision={:.4f}, Total recall={:.4f}'.format(p_total, r_total))
    print('F1 score={:.4f}'.format(f1))
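To make the metrics in the loop above concrete, here is a minimal sketch that computes precision, recall and F1 for a single pair of toy binary masks, using the same pixel counts (`n_detected`, `n_gt`, `n_correct`). The function `edge_scores` and the toy arrays are hypothetical and written for illustration only; the zero-division guards are an addition that the loop above does not have.

```python
import numpy as np

def edge_scores(edges, gt, mask=None):
    """Pixel-wise precision, recall and F1 of detected edges against ground truth."""
    edges = edges.astype(bool)
    gt = gt.astype(bool)
    if mask is not None:
        edges = edges & mask.astype(bool)  # ignore detections in the "don't care" region
    n_detected = edges.sum()        # pixels the detector marked as edges
    n_gt = gt.sum()                 # pixels annotated as edges
    n_correct = (edges & gt).sum()  # pixels marked in both
    precision = n_correct / n_detected if n_detected else 0.0
    recall = n_correct / n_gt if n_gt else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Toy example: 4 ground-truth edge pixels, 3 detected pixels, 2 of them correct
gt = np.array([[1, 1, 1, 1],
               [0, 0, 0, 0]])
edges = np.array([[1, 1, 0, 0],
                  [1, 0, 0, 0]])

p, r, f1 = edge_scores(edges, gt)
print('precision={:.4f}, recall={:.4f}, F1={:.4f}'.format(p, r, f1))
```

Here precision is 2/3 (two of the three detections are correct), recall is 2/4, and F1 is their harmonic mean, 4/7. Lowering thresholds tends to raise recall at the cost of precision, which is why F1 is a reasonable single number to tune against.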