Stanford/CS131/Homework 4-3: Image Enlarging

This post continues from the previous one, Stanford University/CS131/Homework 4-2: Optimal Seam Search. This homework covers image enlarging: Enlarge naive and Enlarge.


Enlarge naive

We now want to tackle the reverse problem: enlarging an image.
One naive approach is to duplicate the optimal seam iteratively until we reach the desired size.

from __future__ import print_function  # __future__ imports must come first

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rc
from skimage import color, io, util
from time import time
from IPython.display import HTML
from seam_carving import compute_cost,energy_function
from seam_carving import backtrack_seam

%matplotlib inline
plt.rcParams['figure.figsize'] = (15.0, 12.0) # set default size of plots
plt.rcParams["font.size"] = "17"
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'
def duplicate_seam(image, seam):
    """Duplicates pixels of the seam, making the pixels on the seam path "twice larger".
    This function will be helpful in functions enlarge_naive and enlarge.
    Args:
        image: numpy array of shape (H, W, C)
        seam: numpy array of shape (H,) of indices
    Returns:
        out: numpy array of shape (H, W+1, C)
    """
    H, W, C = image.shape
    out = np.zeros((H, W + 1, C))
    ### YOUR CODE HERE
    for i in range(H):
        # Insert a copy of the seam pixel at its own column, shifting the rest right
        out[i] = np.insert(image[i], seam[i], image[i, seam[i]], axis=0)
    ### END YOUR CODE
    return out
def enlarge_naive(image, size, axis=1, efunc=energy_function, cfunc=compute_cost):
    """Increases the size of the image using the seam duplication process.
    At each step, we duplicate the lowest energy seam from the image. We repeat the process
    until we obtain an output of desired size.
    Use functions:
        - efunc
        - cfunc
        - backtrack_seam
        - duplicate_seam
    Args:
        image: numpy array of shape (H, W, C)
        size: size to increase height or width to (depending on axis)
        axis: increase in width (axis=1) or height (axis=0)
        efunc: energy function to use
        cfunc: cost function to use
    Returns:
        out: numpy array of shape (size, W, C) if axis=0, or (H, size, C) if axis=1
    """
    out = np.copy(image)
    if axis == 0:
        out = np.transpose(out, (1, 0, 2))
    H = out.shape[0]
    W = out.shape[1]
    assert size > W, "size must be greater than %d" % W
    ### YOUR CODE HERE
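    for _ in range(size - W):
        # Recompute energy and cost on the current (already widened) image
        energy = efunc(out)
        cost, paths = cfunc(out, energy)
        # Backtrack from the cheapest pixel in the last row, then duplicate that seam
        seam = backtrack_seam(paths, np.argmin(cost[-1]))
        out = duplicate_seam(out, seam)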
    ### END YOUR CODE
    if axis == 0:
        out = np.transpose(out, (1, 0, 2))
    return out
# Let's first test with a small example
test_img = np.arange(9, dtype=np.float64).reshape((3, 3))
test_img = np.stack([test_img, test_img, test_img], axis=2)
assert test_img.shape == (3, 3, 3)

cost = np.array([[1.0, 2.0, 1.5],
                 [4.0, 2.0, 3.5],
                 [6.0, 2.5, 5.0]])

paths = np.array([[ 0,  0,  0],
                  [ 0, -1,  0],
                  [ 1,  0, -1]])

# Increase image width
W_new = 4

# We force the cost and paths to our values
out = enlarge_naive(test_img, W_new, cfunc=lambda x, y: (cost, paths))

print("Original image (channel 0):")
print(test_img[:, :, 0])
print("Enlarged image (channel 0): we see that seam [0, 4, 7] is duplicated")
print(out[:, :, 0])

assert np.allclose(out[:, :, 0], np.array([[0, 0, 1, 2], [3, 4, 4, 5], [6, 7, 7, 8]]))
Original image (channel 0):
[[0. 1. 2.]
 [3. 4. 5.]
 [6. 7. 8.]]
Enlarged image (channel 0): we see that seam [0, 4, 7] is duplicated
[[0. 0. 1. 2.]
 [3. 4. 4. 5.]
 [6. 7. 7. 8.]]
W_new = 800

img = io.imread('imgs/broadway_tower.jpg')
img = util.img_as_float(img)
H, W, _ = img.shape
# This is a naive implementation of image enlarging
# which iteratively computes energy function, finds optimal seam
# and duplicates it.
# This process will create a stretching artifact by choosing the same seam
start = time()
enlarged = enlarge_naive(img, W_new)
end = time()

# Can take around 20 seconds
print("Enlarging(naive) width from %d to %d: %f seconds." \
      % (W, W_new, end - start))

plt.imshow(enlarged)
plt.show()
Enlarging(naive) width from 640 to 800: 9.395121 seconds.

The issue with enlarge_naive is that the same seam is selected again and again, so this one low-energy seam is the only one that gets duplicated.
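
The repeated selection can be reproduced on a tiny grayscale image. The following is a minimal sketch, not the course code: toy_energy, toy_min_seam and toy_duplicate are hypothetical stand-ins for energy_function, compute_cost/backtrack_seam and duplicate_seam, and the energy is assumed to be a plain absolute gradient.

```python
import numpy as np

def toy_energy(gray):
    """Hypothetical energy: absolute vertical plus horizontal gradient."""
    gy, gx = np.gradient(gray)
    return np.abs(gx) + np.abs(gy)

def toy_min_seam(energy):
    """Column index per row of the cheapest top-to-bottom seam."""
    H, W = energy.shape
    cost = energy.copy()
    for r in range(1, H):
        # Cheapest of the three pixels above (upper-left, up, upper-right)
        left = np.r_[np.inf, cost[r - 1, :-1]]
        right = np.r_[cost[r - 1, 1:], np.inf]
        cost[r] += np.minimum(np.minimum(left, cost[r - 1]), right)
    seam = np.empty(H, dtype=int)
    seam[-1] = np.argmin(cost[-1])
    for r in range(H - 2, -1, -1):
        # Walk upward, staying within one column of the pixel below
        lo, hi = max(seam[r + 1] - 1, 0), min(seam[r + 1] + 2, W)
        seam[r] = lo + np.argmin(cost[r, lo:hi])
    return seam

def toy_duplicate(gray, seam):
    """Duplicate the seam pixel in every row (grayscale duplicate_seam)."""
    return np.stack([np.insert(row, c, row[c]) for row, c in zip(gray, seam)])

gray = np.tile(np.array([0.0, 1.0, 3.0]), (3, 1))
chosen = []
for _ in range(2):
    seam = toy_min_seam(toy_energy(gray))
    chosen.append(seam.tolist())
    gray = toy_duplicate(gray, seam)

print(chosen)  # [[0, 0, 0], [0, 0, 0]]
print(gray)
```

Duplicating a seam creates a flat two-pixel strip whose gradient is even lower than before, so under this assumed energy the next pass picks column 0 again and the image becomes 0, 0, 0, 1, 3 in every row.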

Another way to get k different seams is to apply the process we used in the function reduce and keep track of the seams we delete progressively.

The function find_seams(image, k) iteratively finds the top k seams for removal.

The inner workings of the function are a bit tricky, so we’ve implemented it for you, but you should go into the code and understand how it works.
This should also help you with the implementation of enlarge.
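
To make the bookkeeping idea concrete, here is a rough sketch of how such a function could track seams. It is not the course's find_seams: toy_min_seam and toy_find_seams are hypothetical names, the image is assumed to be grayscale, and the energy is assumed to be a plain absolute gradient. The key trick is an index map that remembers which original column each surviving pixel came from.

```python
import numpy as np

def toy_min_seam(gray):
    """Cheapest vertical seam under an assumed absolute-gradient energy."""
    gy, gx = np.gradient(gray)
    cost = np.abs(gx) + np.abs(gy)
    H, W = cost.shape
    for r in range(1, H):
        left = np.r_[np.inf, cost[r - 1, :-1]]
        right = np.r_[cost[r - 1, 1:], np.inf]
        cost[r] += np.minimum(np.minimum(left, cost[r - 1]), right)
    seam = np.empty(H, dtype=int)
    seam[-1] = np.argmin(cost[-1])
    for r in range(H - 2, -1, -1):
        lo, hi = max(seam[r + 1] - 1, 0), min(seam[r + 1] + 2, W)
        seam[r] = lo + np.argmin(cost[r, lo:hi])
    return seam

def toy_find_seams(gray, k):
    """Label map in original coordinates: n marks the n-th removed seam, 0 means untouched."""
    H, W = gray.shape
    assert 0 < k < W, "np.gradient needs at least 2 columns left"
    rows = np.arange(H)
    # indices[r, c] = original column of the pixel now sitting at (r, c)
    indices = np.tile(np.arange(W), (H, 1))
    seams = np.zeros((H, W), dtype=int)
    work = gray.copy()
    for n in range(1, k + 1):
        seam = toy_min_seam(work)
        # Record the seam at its ORIGINAL position, not its current one
        seams[rows, indices[rows, seam]] = n
        # Remove the seam from both the working image and the index map
        keep = np.ones(work.shape, dtype=bool)
        keep[rows, seam] = False
        work = work[keep].reshape(H, -1)
        indices = indices[keep].reshape(H, -1)
    return seams

gray = np.tile(np.array([0.0, 1.0, 3.0]), (3, 1))
print(toy_find_seams(gray, 2))  # every row is [1 2 0]
```

Because each seam is removed before the next search, the k seams are guaranteed to be distinct, which is exactly what the naive approach cannot provide.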

from seam_carving import find_seams

# Alternatively, find k seams for removal and duplicate them.
start = time()
seams = find_seams(img, W_new - W)
end = time()

# Can take around 10 seconds
print("Finding %d seams: %f seconds." % (W_new - W, end - start))

plt.imshow(seams, cmap='viridis')
plt.show()
Finding 160 seams: 8.977444 seconds.

Enlarge

We can see that all the seams found are different, and they avoid the castle and the person.

One issue we can mention is that we cannot enlarge more than we can reduce. Our process first needs to find k different seams in the image, and an image of width W contains at most W of them, so we can add at most W columns and the maximum output width is 2W.

One effect we can see on this image is that the blue sky to the right of the castle can only be enlarged by a factor of 2. The concentration of seams in this area is very high.

We can also note that the seams to the right of the castle are blue, which means they have low values and were removed first in the seam selection process.
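
The "concentration of seams" can be quantified directly from a seam map. The sketch below uses a small hand-made label map (a hypothetical example, not output from find_seams) and assumes the convention seen in the plot: 0 for pixels on no seam and n for pixels on the n-th removed seam.

```python
import numpy as np

# Hypothetical 4x6 seam map: 0 = on no seam, n = on the n-th removed seam
seams = np.array([[0, 0, 0, 2, 1, 3],
                  [0, 0, 0, 2, 1, 3],
                  [0, 0, 2, 0, 1, 3],
                  [0, 0, 2, 1, 0, 3]])

# Fraction of pixels in each column that lie on some seam
density = (seams > 0).mean(axis=0)
print(density)

# Columns where every pixel is already on a seam cannot grow beyond x2
saturated = np.flatnonzero(density == 1.0)
print(saturated)  # [5]
```

A column with density 1.0 has every pixel duplicated exactly once, which is why a region saturated with seams, like the sky to the right of the castle, tops out at twice its original width.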

def enlarge(image, size, axis=1, efunc=energy_function, cfunc=compute_cost):
    """Enlarges the size of the image by duplicating the low energy seams.
    We start by getting the k seams to duplicate through function find_seams.
    We iterate through these seams and duplicate each one iteratively.
    Use functions:
        - find_seams
        - duplicate_seam
    Args:
        image: numpy array of shape (H, W, C)
        size: size to enlarge height or width to (depending on axis)
        axis: enlarge in width (axis=1) or height (axis=0)
        efunc: energy function to use
        cfunc: cost function to use
    Returns:
        out: numpy array of shape (size, W, C) if axis=0, or (H, size, C) if axis=1
    """
    out = np.copy(image)
    # Transpose for height resizing
    if axis == 0:
        out = np.transpose(out, (1, 0, 2))
    H, W, C = out.shape
    assert size > W, "size must be greater than %d" % W
    assert size <= 2 * W, "size must be at most %d" % (2 * W)
    ### YOUR CODE HERE
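    # Label map: seams[r, c] == n means pixel (r, c) lies on the n-th seam
    seams = find_seams(out, size - W)
    seams = np.expand_dims(seams, axis=2)  # add a channel axis for duplicate_seam
    for i in range(size - W):
        # Column of seam i+1 in each row (np.where lists matches in row order)
        seam = np.where(seams == i + 1)[1]
        out = duplicate_seam(out, seam)
        # Widen the label map the same way so later seams stay aligned with out
        seams = duplicate_seam(seams, seam)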
    ### END YOUR CODE
    if axis == 0:
        out = np.transpose(out, (1, 0, 2))
    return out
# Let's first test with a small example
test_img = np.array([[0.0, 1.0, 3.0],
                     [0.0, 1.0, 3.0],
                     [0.0, 1.0, 3.0]])
#test_img = np.arange(9, dtype=np.float64).reshape((3, 3))
test_img = np.stack([test_img, test_img, test_img], axis=2)
assert test_img.shape == (3, 3, 3)

# Increase image width
W_new = 5

out_naive = enlarge_naive(test_img, W_new)
out = enlarge(test_img, W_new)

print("Original image (channel 0):")
print(test_img[:, :, 0])
print("Enlarged naive image (channel 0): first seam is duplicated twice.")
print(out_naive[:, :, 0])
print("Enlarged image (channel 0): first and second seam are each duplicated once.")
print(out[:, :, 0])

assert np.allclose(out[:, :, 0], np.array([[0, 0, 1, 1, 3], [0, 0, 1, 1, 3], [0, 0, 1, 1, 3]]))
Original image (channel 0):
[[0. 1. 3.]
 [0. 1. 3.]
 [0. 1. 3.]]
Enlarged naive image (channel 0): first seam is duplicated twice.
[[0. 0. 0. 1. 3.]
 [0. 0. 0. 1. 3.]
 [0. 0. 0. 1. 3.]]
Enlarged image (channel 0): first and second seam are each duplicated once.
[[0. 0. 1. 1. 3.]
 [0. 0. 1. 1. 3.]
 [0. 0. 1. 1. 3.]]
W_new = 800

start = time()
out = enlarge(img, W_new)
end = time()

# Can take around 20 seconds
print("Enlarging width from %d to %d: %f seconds." \
      % (W, W_new, end - start))

plt.subplot(2, 1, 1)
plt.title('Original')
plt.imshow(img)

plt.subplot(2, 1, 2)
plt.title('Resized')
plt.imshow(out)
plt.show()
Enlarging width from 640 to 800: 12.055351 seconds.
# Map of the seams for horizontal seams.
start = time()
seams = find_seams(img, W_new - W, axis=0)
end = time()

# Can take around 15 seconds
print("Finding %d seams: %f seconds." % (W_new - W, end - start))

plt.imshow(seams, cmap='viridis')
plt.show()
Finding 160 seams: 12.105962 seconds.
H_new = 600
import sys
np.set_printoptions(threshold=sys.maxsize)  # print arrays in full; threshold must be numeric
start = time()
out = enlarge(img, H_new, axis=0)
end = time()

# Can take around 20 seconds
print("Enlarging height from %d to %d: %f seconds." \
      % (H, H_new, end - start))

plt.subplot(1, 2, 1)
plt.title('Original')
plt.imshow(img)

plt.subplot(1, 2, 2)
plt.title('Resized')
plt.imshow(out)
plt.show()
Enlarging height from 434 to 600: 16.042105 seconds.

As you can see in the example above, the sky above the castle has doubled in size and the grass below has doubled in size, but we still can’t reach a height of 600.
The algorithm then needs to enlarge the castle itself, while trying to avoid enlarging the windows, for instance.