スタンフォード/CS131/宿題4-1 シームカービング

Stanford University/CS131/宿題4-1をやる。今回の宿題はSeam Carving (シームカービング)をカバーしている。

スポンサーリンク

Seam Carving

The material presented here is inspired from:
ここにあるマテリアルは以下からインスパイアされている。

Don’t hesitate to check these links if you have any doubt on the seam carving process.
シームカービングプロセスで混乱したら上のリンクをチェックしよう。

The whole seam carving process was covered in lecture 8, please refer to the slides for more details to the different concepts introduced here.
シームカービングプロセスについてはレクチャー8で全てカバーされているので、ここで紹介されているものと異なるコンセプトに対する詳細に関してはスライドを参照して下さい。

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rc
from skimage import color
from time import time
from IPython.display import HTML
from __future__ import print_function

%matplotlib inline
plt.rcParams['figure.figsize'] = (15.0, 12.0) # set default size of plots
plt.rcParams["font.size"] = "17"
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

Image Reducing using Seam Carving

Seam carving is an algorithm for content-aware image resizing.
To understand all the concepts in this homework, make sure to read again the slides from lecture 8: http://vision.stanford.edu/teaching/cs131_fall1718/files/08_seam_carving.pdf
シームカービングは、コンテンツに合わせて画像をリサイズするためのアルゴリズムだ。この宿題の全てのコンセプトを理解するために、講義8のスライドを必ず再読するように。

from skimage import io, util

# Load image
img = io.imread('imgs/broadway_tower.jpg')
img = util.img_as_float(img)

plt.title('Original Image')
plt.imshow(img)
plt.show()

Energy function

We will now implement the energy_function to compute the energy of the image.
次に、画像のエネルギーを求めるenergy_functionを実装する。
The energy at each pixel is the sum of:
各画素のエネルギーは以下の総和である。

  • absolute value of the gradient in the $x$ direction
    $x$方向の傾きの絶対値
  • absolute value of the gradient in the $y$ direction
    $y$方向の傾きの絶対値
    The function should take around 0.01 to 0.1 seconds to compute.
    作成した関数は、計算におおよそ.01〜.1秒を要するはず。
import numpy as np
from skimage import color

def energy_function(image):
    """Computes energy of the input image.
    For each pixel, we will sum the absolute value of the gradient in each direction.
    Don't forget to convert to grayscale first.
    Hint: you can use np.gradient here
    Args:
        image: numpy array of shape (H, W, 3)
    Returns:
        out: numpy array of shape (H, W)
    """
    H, W, _ = image.shape
    out = np.zeros((H, W))
    ### YOUR CODE HERE
    gray = color.rgb2gray(image)
    grad = np.gradient(gray)
    out = np.abs(grad[0]) + np.abs(grad[1])
    ### END YOUR CODE
    return out
test_img = np.array([[1.0, 2.0, 1.5],
                     [3.0, 1.0, 2.0],
                     [4.0, 0.5, 3.0]])
test_img = np.stack([test_img] * 3, axis=2)
assert test_img.shape == (3, 3, 3)

# Compute energy function
test_energy = energy_function(test_img)

solution_energy = np.array([[3.0, 1.25,  1.0],
                            [3.5, 1.25, 1.75],
                            [4.5,  1.0,  3.5]])

print("Image (channel 0):")
print(test_img[:, :, 0])

print("Energy:")
print(test_energy)
print("Solution energy:")
print(solution_energy)

assert np.allclose(test_energy, solution_energy)
Image (channel 0):
[[1.  2.  1.5]
 [3.  1.  2. ]
 [4.  0.5 3. ]]
Energy:
[[3.   1.25 1.  ]
 [3.5  1.25 1.75]
 [4.5  1.   3.5 ]]
Solution energy:
[[3.   1.25 1.  ]
 [3.5  1.25 1.75]
 [4.5  1.   3.5 ]]
# Compute energy function
start = time()
energy = energy_function(img)
end = time()

print("Computing energy function: %f seconds." % (end - start))

plt.title('Energy')
plt.axis('off')
plt.imshow(energy)
plt.show()
Computing energy function: 0.015337 seconds.

Compute cost

Now implement the function compute_cost.
次に、compute_cost関数を実装する。

Starting from the energy map, we’ll go from the first row of the image to the bottom and compute the minimal cost at each pixel.
エネルギーマップからスタートし、画像の最初の行から底の行へ移動して各画素の最少コストを計算する。

We’ll use dynamic programming to compute the cost line by line starting from the first row.
先頭行から始めるライン毎のコスト計算に動的プログラミングを用いる。

The function should take around 0.05 seconds to complete.
この関数は、計算に約.05秒要するだろう。

def compute_cost(image, energy, axis=1):
    """Computes optimal cost map (vertical) and paths of the seams.
    Starting from the first row, compute the cost of each pixel as the sum of energy along the
    lowest energy path from the top.
    We also return the paths, which will contain at each pixel either -1, 0 or 1 depending on
    where to go up if we follow a seam at this pixel.
    Make sure your code is vectorized because this function will be called a lot.
    You should only have one loop iterating through the rows.
    Args:
        image: not used for this function
               (this is to have a common interface with compute_forward_cost)
        energy: numpy array of shape (H, W)
        axis: compute cost in width (axis=1) or height (axis=0)
    Returns:
        cost: numpy array of shape (H, W)
        paths: numpy array of shape (H, W) containing values -1, 0 or 1
    """
    energy = energy.copy()
    if axis == 0:
        energy = np.transpose(energy, (1, 0))
    H, W = energy.shape
    cost = np.zeros((H, W))
    paths = np.zeros((H, W), dtype=np.int)
    # Initialization
    cost[0] = energy[0]
    paths[0] = 0  # we don't care about the first row of paths
    ### YOUR CODE HERE
    for i in range(1, H):
        M1 = np.insert(cost[i-1,0:W-1],0,1e10,axis=0)
        M2 = cost[i-1,:]
        M3 = np.insert(cost[i-1,1:W],W-1,1e10,axis=0)
        M = np.r_[M1,M2,M3].reshape(3,-1)
        cost[i] = energy[i]+np.min(M,axis=0)
        paths[i] = np.argmin(M,axis=0)-1     
    ### END YOUR CODE
    if axis == 0:
        cost = np.transpose(cost, (1, 0))
        paths = np.transpose(paths, (1, 0))
    # Check that paths only contains -1, 0 or 1
    assert np.all(np.any([paths == 1, paths == 0, \
     paths == -1], axis=0)), \
      "paths contains other values than -1, 0 or 1"
    return cost, paths
# Let's first test with a small example

test_energy = np.array([[1.0, 2.0, 1.5],
                        [3.0, 1.0, 2.0],
                        [4.0, 0.5, 3.0]])

solution_cost = np.array([[1.0, 2.0, 1.5],
                          [4.0, 2.0, 3.5],
                          [6.0, 2.5, 5.0]])

solution_paths = np.array([[ 0,  0,  0],
                           [ 0, -1,  0],
                           [ 1,  0, -1]])

# Vertical Cost Map
vcost, vpaths = compute_cost(_, test_energy, axis=1)  # don't need the first argument for compute_cost

print("Energy:")
print(test_energy)

print("Cost:")
print(vcost)
print("Solution cost:")
print(solution_cost)

print("Paths:")
print(vpaths)
print("Solution paths:")
print(solution_paths)
Energy:
[[1.  2.  1.5]
 [3.  1.  2. ]
 [4.  0.5 3. ]]
Cost:
[[1.  2.  1.5]
 [4.  2.  3.5]
 [6.  2.5 5. ]]
Solution cost:
[[1.  2.  1.5]
 [4.  2.  3.5]
 [6.  2.5 5. ]]
Paths:
[[ 0  0  0]
 [ 0 -1  0]
 [ 1  0 -1]]
Solution paths:
[[ 0  0  0]
 [ 0 -1  0]
 [ 1  0 -1]]
# Vertical Cost Map
start = time()
vcost, _ = compute_cost(_, energy, axis=1)  # don't need the first argument for compute_cost
end = time()

print("Computing vertical cost map: %f seconds." % (end - start))

plt.title('Vertical Cost Map')
plt.axis('off')
plt.imshow(vcost, cmap='inferno')
plt.show()
Computing vertical cost map: 0.051027 seconds.
# Horizontal Cost Map
start = time()
hcost, _ = compute_cost(_, energy, axis=0)
end = time()

print("Computing horizontal cost map: %f seconds." % (end - start))

plt.title('Horizontal Cost Map')
plt.axis('off')
plt.imshow(hcost, cmap='inferno')
plt.show()
Computing horizontal cost map: 0.060227 seconds.