Stanford/CS131/Homework 2-4 Lane Detection

This post continues from Stanford University/CS131/Homework 2-3. This assignment covers road lane detection using the Canny edge detector.


Part 2: Lane Detection

In this section we will implement a simple lane detection application using Canny edge detector and Hough transform.
Here are some example images of what your final lane detector's output will look like.


The algorithm can be broken down into the following steps:

  1. Detect edges using the Canny edge detector.
  2. Extract the edges in the region of interest (a triangle covering the bottom corners and the center of the image).
  3. Run Hough transform to detect lanes.

Edge detection

Lanes on the roads are usually thin and long lines with bright colors. Our edge detection algorithm by itself should be able to find the lanes pretty well. Run the code cell below to load the example image and detect edges from the image.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from skimage import io
from edge import canny

# Load image
img = io.imread('road.jpg', as_gray=True)

# Run Canny edge detector
edges = canny(img, kernel_size=5, sigma=1.4, high=0.03, low=0.02)

%matplotlib inline
plt.rcParams['figure.figsize'] = 30.0, 20.0 # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
plt.rcParams["font.size"] = "17"

plt.subplot(211)
plt.imshow(img)
plt.axis('off')
plt.title('Input Image')

plt.subplot(212)
plt.imshow(edges)
plt.axis('off')
plt.title('Edges')
plt.show()

Extracting region of interest (ROI)

We can see that the Canny edge detector could find the edges of the lanes. However, there are also edges of other objects that we are not interested in. Given the position and orientation of the camera, we know that the lanes will be located in the lower half of the image. The code below defines a binary mask for the ROI and extracts the edges within the region.

H, W = img.shape

# Generate mask for ROI (Region of Interest)
mask = np.zeros((H, W))
for i in range(H):
    for j in range(W):
        if i > (float(H) / W) * j and i > -(float(H) / W) * j + H:
            mask[i, j] = 1

# Extract edges in ROI
roi = edges * mask

plt.subplot(1,2,1)
plt.imshow(mask)
plt.title('Mask')
plt.axis('off')

plt.subplot(1,2,2)
plt.imshow(roi)
plt.title('Edges in ROI')
plt.axis('off')
plt.show()
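As an aside, the double loop above can be vectorized with NumPy index grids. This is a sketch of an equivalent mask construction, not part of the assignment code:

```python
import numpy as np

def triangle_mask(H, W):
    """Vectorized version of the ROI mask loop: 1 inside the triangle
    spanning the bottom corners and the center of the image."""
    i, j = np.indices((H, W))            # row and column index grids
    left = i > (float(H) / W) * j        # below the left diagonal
    right = i > -(float(H) / W) * j + H  # below the right diagonal
    return (left & right).astype(float)
```

For a 480x640 image this replaces roughly 300k Python-level iterations with three array operations.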

Fitting lines using Hough transform

The output from the edge detector is still a collection of connected points. However, it would be more natural to represent a lane as a line parameterized as $y = ax + b$, with a slope $a$ and y-intercept $b$. We will use Hough transform to find parameterized lines that represent the detected edges.

In general, a straight line $y = ax + b$ can be represented as a point $(a, b)$ in the parameter space. However, this cannot represent vertical lines as the slope parameter will be unbounded. Alternatively, we parameterize a line using $\theta\in{[-\pi, \pi]}$ and $\rho\in{\mathbb{R}}$ as follows:

$$
\rho = x\cdot\cos\theta + y\cdot\sin\theta
$$

Using this parameterization, we can map every point in $xy$-space to a sine-like curve in $\theta\rho$-space (or Hough space). We then accumulate the parameterized points in the Hough space and choose the points (in Hough space) with the highest accumulated values. A point in Hough space can then be transformed back into a line in $xy$-space.
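As a quick numeric illustration (with two made-up points): the sinusoids of collinear points intersect at the $(\theta, \rho)$ of the line through them.

```python
import numpy as np

# Two hypothetical points on the line y = x + 1
p1, p2 = (1, 2), (3, 4)
thetas = np.deg2rad(np.arange(-90.0, 90.0))

# Each point maps to a sinusoid rho(theta) in Hough space
rho1 = p1[0] * np.cos(thetas) + p1[1] * np.sin(thetas)
rho2 = p2[0] * np.cos(thetas) + p2[1] * np.sin(thetas)

# The sinusoids intersect where both points vote for the same line
k = np.argmin(np.abs(rho1 - rho2))
print(np.rad2deg(thetas[k]), rho1[k])
```

For this line the intersection falls at $\theta = -45°$ and $\rho = -1/\sqrt{2}$, the normal form of $y = x + 1$.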

See notes on Hough transform.

Implement hough_transform in edge.py.

def hough_transform(img):
    """ Transform points in the input image into Hough space.
    Use the parameterization:
        rho = x * cos(theta) + y * sin(theta)
    to transform a point (x,y) to a sine-like function in Hough space.
    Args:
        img: binary image of shape (H, W)        
    Returns:
        accumulator: numpy array of shape (m, n)
        rhos: numpy array of shape (m, )
        thetas: numpy array of shape (n, )
    """
    # Set rho and theta ranges
    H, W = img.shape
    diag_len = int(np.ceil(np.sqrt(W * W + H * H)))
    rhos = np.linspace(-diag_len, diag_len, 2 * diag_len + 1)
    thetas = np.deg2rad(np.arange(-90.0, 90.0))
    # Cache some reusable values
    cos_t = np.cos(thetas)
    sin_t = np.sin(thetas)
    num_thetas = len(thetas)
    # Initialize accumulator in the Hough space
    accumulator = np.zeros((2 * diag_len + 1, num_thetas), dtype=np.uint64)
    ys, xs = np.nonzero(img)
    # Transform each point (x, y) in image
    # Find rho corresponding to values in thetas
    # and increment the accumulator in the corresponding coordinate.
    ### YOUR CODE HERE
    for i, j in zip(ys, xs):
        for idx in range(thetas.shape[0]):
            r = j * cos_t[idx] + i * sin_t[idx]
            accumulator[int(r + diag_len), idx] += 1
    ### END YOUR CODE
    return accumulator, rhos, thetas
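The voting logic can be sanity-checked on a tiny synthetic image; the accumulation below is re-implemented inline (rather than importing edge.py) so it runs standalone. Five collinear pixels should all vote for a single bin:

```python
import numpy as np

# Five pixels on the line y = x in a 10x10 image
img = np.zeros((10, 10))
img[np.arange(5), np.arange(5)] = 1

H, W = img.shape
diag_len = int(np.ceil(np.sqrt(W * W + H * H)))
thetas = np.deg2rad(np.arange(-90.0, 90.0))
acc = np.zeros((2 * diag_len + 1, len(thetas)), dtype=np.uint64)

# Same voting rule as hough_transform, written with one vote per theta
ys, xs = np.nonzero(img)
for y, x in zip(ys, xs):
    r = x * np.cos(thetas) + y * np.sin(thetas)
    acc[(r + diag_len).astype(int), np.arange(len(thetas))] += 1

r_idx, t_idx = np.unravel_index(np.argmax(acc), acc.shape)
```

The maximum vote count equals the number of collinear points, and the winning bin sits at $\theta = -45°$, $\rho = 0$, which is indeed the line $y = x$.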

from edge import hough_transform

# Perform Hough transform on the ROI
acc, rhos, thetas = hough_transform(roi)

# Coordinates for right lane
xs_right = []
ys_right = []

# Coordinates for left lane
xs_left = []
ys_left = []

for i in range(20):
    idx = np.argmax(acc)
    r_idx = idx // acc.shape[1]
    t_idx = idx % acc.shape[1]
    acc[r_idx, t_idx] = 0 # Zero out the max value in accumulator

    rho = rhos[r_idx]
    theta = thetas[t_idx]
    
    # Transform a point in Hough space to a line in xy-space.
    if np.isclose(np.sin(theta), 0):
        continue # skip (near-)vertical lines, which y = ax + b cannot represent
    a = - (np.cos(theta)/np.sin(theta)) # slope of the line
    b = (rho/np.sin(theta)) # y-intercept of the line

    # Break if both right and left lanes are detected
    if xs_right and xs_left:
        break
    
    if a < 0: # Left lane
        if xs_left:
            continue
        xs = xs_left
        ys = ys_left
    else: # Right Lane
        if xs_right:
            continue
        xs = xs_right
        ys = ys_right

    for x in range(img.shape[1]):
        y = a * x + b
        if y > img.shape[0] * 0.6 and y < img.shape[0]:
            xs.append(x)
            ys.append(int(round(y)))

plt.imshow(img)
plt.plot(xs_left, ys_left, linewidth=5.0)
plt.plot(xs_right, ys_right, linewidth=5.0)
plt.axis('off')
plt.show()
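One refinement worth noting: zeroing only the single maximum bin (as in the loop above) means the next argmax often lands on a neighboring bin of the same peak. A common fix, sketched here as a hypothetical helper rather than part of the assignment code, suppresses a small neighborhood around each selected peak:

```python
import numpy as np

def top_peaks(acc, num_peaks, nhood=(11, 11)):
    """Return (rho_idx, theta_idx) of the strongest accumulator bins,
    zeroing a small neighborhood around each peak instead of one cell."""
    acc = acc.copy()  # leave the caller's accumulator untouched
    peaks = []
    for _ in range(num_peaks):
        r, t = np.unravel_index(np.argmax(acc), acc.shape)
        peaks.append((r, t))
        # Suppress the neighborhood so the next argmax is a distinct peak
        r0 = max(r - nhood[0] // 2, 0)
        t0 = max(t - nhood[1] // 2, 0)
        acc[r0:r + nhood[0] // 2 + 1, t0:t + nhood[1] // 2 + 1] = 0
    return peaks
```

`np.unravel_index` also replaces the `idx // shape` and `idx % shape` arithmetic used in the loop above with a single idiomatic call.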