Stanford / CS131 / Homework 4-3: Image Enlargement


Enlarge naive

We now want to tackle the reverse problem of enlarging an image.
One naive way to approach the problem would be to duplicate the optimal seam iteratively until we reach the desired size.

from __future__ import print_function  # must come before every other statement in the cell

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rc
from skimage import color, io, util
from time import time
from IPython.display import HTML
from seam_carving import compute_cost, energy_function
from seam_carving import backtrack_seam

%matplotlib inline
plt.rcParams['figure.figsize'] = (15.0, 12.0) # set default size of plots
plt.rcParams["font.size"] = "17"
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

def duplicate_seam(image, seam):
    """Duplicates pixels of the seam, making the pixels on the seam path "twice larger".
    This function will be helpful in functions enlarge_naive and enlarge.
    Args:
        image: numpy array of shape (H, W, C)
        seam: numpy array of shape (H,) of indices
    Returns:
        out: numpy array of shape (H, W+1, C)
    """
    H, W, C = image.shape
    out = np.zeros((H, W + 1, C))
    for i in range(H):
        # Insert a copy of the seam pixel at the seam position of row i.
        out[i] = np.insert(image[i], seam[i], image[i, seam[i]], axis=0)
    return out
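
The row-by-row loop in duplicate_seam can also be written without a Python loop. The following is only a sketch: duplicate_seam_vectorized is a hypothetical name that is not part of the assignment, and unlike the version above it keeps the dtype of the input instead of converting to float.

```python
import numpy as np

def duplicate_seam_vectorized(image, seam):
    """Hypothetical loop-free variant of duplicate_seam (not part of the assignment)."""
    H, W, C = image.shape
    seam = np.asarray(seam)
    cols = np.arange(W + 1)[None, :]           # (1, W+1): output column indices
    # Output columns up to and including the seam read the same source column;
    # columns after the seam read from one position to the left.
    src = cols - (cols > seam[:, None])        # (H, W+1): source column per output pixel
    return image[np.arange(H)[:, None], src]   # (H, W+1, C)

# Small check on a 3x3 image with three identical channels.
img = np.arange(9, dtype=np.float64).reshape((3, 3))
img = np.stack([img, img, img], axis=2)
out = duplicate_seam_vectorized(img, np.array([0, 1, 1]))
print(out[:, :, 0])
```

Building an index array once and gathering with it avoids H separate calls to np.insert, which may matter when the function is called hundreds of times on a large image.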

def enlarge_naive(image, size, axis=1, efunc=energy_function, cfunc=compute_cost):
    """Increases the size of the image using the seam duplication process.
    At each step, we duplicate the lowest energy seam from the image. We repeat the process
    until we obtain an output of desired size.
    Use functions:
        - efunc
        - cfunc
        - backtrack_seam
        - duplicate_seam
    Args:
        image: numpy array of shape (H, W, C)
        size: size to increase height or width to (depending on axis)
        axis: increase in width (axis=1) or height (axis=0)
        efunc: energy function to use
        cfunc: cost function to use
    Returns:
        out: numpy array of shape (size, W, C) if axis=0, or (H, size, C) if axis=1
    """
    out = np.copy(image)
    # Transpose for height resizing
    if axis == 0:
        out = np.transpose(out, (1, 0, 2))
    H = out.shape[0]
    W = out.shape[1]
    assert size > W, "size must be greater than %d" % W
    for _ in range(size - W):
        energy = efunc(out)
        cost, paths = cfunc(out, energy)
        # Start from the cheapest pixel in the last row and trace the seam upwards.
        seam = backtrack_seam(paths, np.argmin(cost[-1]))
        out = duplicate_seam(out, seam)
    if axis == 0:
        out = np.transpose(out, (1, 0, 2))
    return out

# Let's first test with a small example
test_img = np.arange(9, dtype=np.float64).reshape((3, 3))
test_img = np.stack([test_img, test_img, test_img], axis=2)
assert test_img.shape == (3, 3, 3)

cost = np.array([[1.0, 2.0, 1.5],
                 [4.0, 2.0, 3.5],
                 [6.0, 2.5, 5.0]])

paths = np.array([[ 0,  0,  0],
                  [ 0, -1,  0],
                  [ 1,  0, -1]])

# Increase image width
W_new = 4

# We force the cost and paths to our values
out = enlarge_naive(test_img, W_new, cfunc=lambda x, y: (cost, paths))

print("Original image (channel 0):")
print(test_img[:, :, 0])
print("Enlarged image (channel 0): we see that seam [0, 4, 7] is duplicated")
print(out[:, :, 0])

assert np.allclose(out[:, :, 0], np.array([[0, 0, 1, 2], [3, 4, 4, 5], [6, 7, 7, 8]]))

Original image (channel 0):
[[0. 1. 2.]
[3. 4. 5.]
[6. 7. 8.]]
Enlarged image (channel 0): we see that seam [0, 4, 7] is duplicated
[[0. 0. 1. 2.]
[3. 4. 4. 5.]
[6. 7. 7. 8.]]

W_new = 800

img = util.img_as_float(img)
H, W, _ = img.shape
# This is a naive implementation of image enlarging
# which iteratively computes energy function, finds optimal seam
# and duplicates it.
# This process will create a stretching artifact by choosing the same seam
start = time()
enlarged = enlarge_naive(img, W_new)
end = time()

# Can take around 20 seconds
print("Enlarging(naive) width from %d to %d: %f seconds." \
      % (W, W_new, end - start))

plt.imshow(enlarged)
plt.show()

Enlarging(naive) width from 640 to 800: 9.395121 seconds.


The issue with enlarge_naive is that the same seam will be selected again and again, so this low-energy seam will be the only one to be duplicated. Duplicating a seam places identical pixels side by side, which lowers the gradient energy around it even further.
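
The effect can be reproduced on a tiny grayscale image. The sketch below does not use the assignment's seam_carving module: toy_energy and toy_min_seam are hypothetical stand-ins, assuming a simple sum-of-absolute-gradients energy and a plain dynamic-programming seam search.

```python
import numpy as np

def toy_energy(gray):
    # Sum of absolute gradients along both axes (an assumed, simple energy).
    gy, gx = np.gradient(gray)
    return np.abs(gy) + np.abs(gx)

def toy_min_seam(energy):
    # Dynamic programming over vertical seams; returns one column index per row.
    H, W = energy.shape
    cost = energy.copy()
    for r in range(1, H):
        for c in range(W):
            lo, hi = max(c - 1, 0), min(c + 1, W - 1)
            cost[r, c] += cost[r - 1, lo:hi + 1].min()
    seam = np.zeros(H, dtype=int)
    seam[-1] = np.argmin(cost[-1])
    for r in range(H - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(c - 1, 0), min(c + 1, W - 1)
        seam[r] = lo + np.argmin(cost[r, lo:hi + 1])
    return seam

gray = np.array([[0.0, 1.0, 3.0]] * 3)
chosen = []
for _ in range(2):
    seam = toy_min_seam(toy_energy(gray))
    chosen.append(seam.tolist())
    # Duplicate the seam pixel in every row (same idea as duplicate_seam).
    gray = np.stack([np.insert(row, s, row[s]) for row, s in zip(gray, seam)])

print(chosen)  # the seam picked in each of the two iterations
print(gray)
```

On this input both iterations pick column 0, because after the first duplication the two identical leftmost pixels have zero horizontal gradient between them.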

Another way to get k different seams is to apply the process we used in the function reduce and keep track of the seams we delete along the way.

The function find_seams(image, k) will find the top k seams for removal iteratively.

The inner workings of the function are a bit tricky, so we’ve implemented it for you, but you should go into the code and understand how it works. This should also help you implement the enlarge function.
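
To make the bookkeeping concrete, here is a rough sketch of the idea for a grayscale image. It is not the course's find_seams: find_seams_sketch and min_seam are hypothetical names, the energy is an assumed sum of absolute gradients, and the sketch assumes k is at most W - 1. The key point is the indices array, which remembers the original column of every pixel that is still present.

```python
import numpy as np

def min_seam(energy):
    """Lowest-cost vertical seam by dynamic programming (one column per row)."""
    H, W = energy.shape
    cost = energy.copy()
    for r in range(1, H):
        for c in range(W):
            lo, hi = max(c - 1, 0), min(c + 1, W - 1)
            cost[r, c] += cost[r - 1, lo:hi + 1].min()
    seam = np.zeros(H, dtype=int)
    seam[-1] = np.argmin(cost[-1])
    for r in range(H - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(c - 1, 0), min(c + 1, W - 1)
        seam[r] = lo + np.argmin(cost[r, lo:hi + 1])
    return seam

def find_seams_sketch(gray, k):
    """Label the first k removed seams with 1..k in original image coordinates."""
    H, W = gray.shape
    work = gray.copy()
    # indices[r, c] = original column of the pixel currently at (r, c)
    indices = np.tile(np.arange(W), (H, 1))
    seams = np.zeros((H, W), dtype=int)
    rows = np.arange(H)
    for i in range(k):
        gy, gx = np.gradient(work)
        seam = min_seam(np.abs(gy) + np.abs(gx))
        # Record the seam at its position in the original image.
        seams[rows, indices[rows, seam]] = i + 1
        # Remove the seam from both the working image and the index map.
        keep = np.ones(work.shape, dtype=bool)
        keep[rows, seam] = False
        work = work[keep].reshape(H, -1)
        indices = indices[keep].reshape(H, -1)
    return seams

labels = find_seams_sketch(np.array([[0.0, 1.0, 3.0]] * 3), 2)
print(labels)
```

Because each seam is removed before the next one is searched for, no pixel can be picked twice, which is what guarantees k different seams.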

from seam_carving import find_seams

# Alternatively, find k seams for removal and duplicate them.
start = time()
seams = find_seams(img, W_new - W)
end = time()

# Can take around 10 seconds
print("Finding %d seams: %f seconds." % (W_new - W, end - start))

plt.imshow(seams, cmap='viridis')
plt.show()

Finding 160 seams: 8.977444 seconds.


Enlarge

We can see that all the seams found are different, and they avoid the castle and the person.

One issue we can mention is that we cannot enlarge more than we can reduce. Because of our process, the image can grow by at most its own width W, to a total width of 2W: we first need to find that many different seams, and at most W seams can be removed from an image of width W.

One effect we can see on this image is that the blue sky to the right of the castle can be enlarged by at most a factor of 2. The concentration of seams in this area is very high.

We can also note that the seams to the right of the castle are drawn in blue, which means they have low index values and were among the first to be removed in the seam selection process.

def enlarge(image, size, axis=1, efunc=energy_function, cfunc=compute_cost):
    """Enlarges the size of the image by duplicating the low energy seams.
    We start by getting the k seams to duplicate through function find_seams.
    We iterate through these seams and duplicate each one iteratively.
    Use functions:
        - find_seams
        - duplicate_seam
    Args:
        image: numpy array of shape (H, W, C)
        size: size to increase height or width to (depending on axis)
        axis: enlarge in width (axis=1) or height (axis=0)
        efunc: energy function to use
        cfunc: cost function to use
    Returns:
        out: numpy array of shape (size, W, C) if axis=0, or (H, size, C) if axis=1
    """
    out = np.copy(image)
    # Transpose for height resizing
    if axis == 0:
        out = np.transpose(out, (1, 0, 2))
    H, W, C = out.shape
    assert size > W, "size must be greater than %d" % W
    assert size <= 2 * W, "size must be at most %d" % (2 * W)
    # seams[r, c] == i marks the pixel of the i-th seam (1-based) in row r.
    seams = find_seams(out, size - W)
    # Add a channel axis so duplicate_seam can widen the seam map as well.
    seams = np.expand_dims(seams, axis=2)
    for i in range(size - W):
        # Column of seam i+1 in every row, in current (already widened) coordinates.
        seam = np.where(seams == i + 1)[1]
        out = duplicate_seam(out, seam)
        seams = duplicate_seam(seams, seam)
    if axis == 0:
        out = np.transpose(out, (1, 0, 2))
    return out
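
The reason enlarge widens the seam map together with the image is that every duplication shifts the columns to its right by one. The following sketch illustrates this on a 2-D toy example; dup_cols is a hypothetical helper, and the label map is assumed to be what find_seams would return for this image.

```python
import numpy as np

def dup_cols(arr, seam):
    # Duplicate arr[r, seam[r]] in every row of a 2-D array.
    return np.stack([np.insert(row, s, row[s]) for row, s in zip(arr, seam)])

image = np.array([[0.0, 1.0, 3.0]] * 3)
seams = np.array([[1, 2, 0]] * 3)   # assumed seam labels in original coordinates

# Stale lookup: reuse the original column of seam 2 after seam 1 was inserted.
stale = dup_cols(image, np.where(seams == 1)[1])
stale = dup_cols(stale, np.where(seams == 2)[1])

# Consistent lookup: widen the label map with the image, then find seam 2 again.
out, labels = image, seams
for i in (1, 2):
    cols = np.where(labels == i)[1]
    out, labels = dup_cols(out, cols), dup_cols(labels, cols)

print(stale[0])  # seam 1 ends up duplicated twice
print(out[0])    # seams 1 and 2 are each duplicated once
```

With the stale lookup the second duplication hits the copy of the first seam instead of the second seam, which reproduces the stretching behaviour of the naive method.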

# Let's first test with a small example
test_img = np.array([[0.0, 1.0, 3.0],
                     [0.0, 1.0, 3.0],
                     [0.0, 1.0, 3.0]])
#test_img = np.arange(9, dtype=np.float64).reshape((3, 3))
test_img = np.stack([test_img, test_img, test_img], axis=2)
assert test_img.shape == (3, 3, 3)

# Increase image width
W_new = 5

out_naive = enlarge_naive(test_img, W_new)
out = enlarge(test_img, W_new)

print("Original image (channel 0):")
print(test_img[:, :, 0])
print("Enlarged naive image (channel 0): first seam is duplicated twice.")
print(out_naive[:, :, 0])
print("Enlarged image (channel 0): first and second seam are each duplicated once.")
print(out[:, :, 0])

assert np.allclose(out[:, :, 0], np.array([[0, 0, 1, 1, 3], [0, 0, 1, 1, 3], [0, 0, 1, 1, 3]]))

Original image (channel 0):
[[0. 1. 3.]
[0. 1. 3.]
[0. 1. 3.]]
Enlarged naive image (channel 0): first seam is duplicated twice.
[[0. 0. 0. 1. 3.]
[0. 0. 0. 1. 3.]
[0. 0. 0. 1. 3.]]
Enlarged image (channel 0): first and second seam are each duplicated once.
[[0. 0. 1. 1. 3.]
[0. 0. 1. 1. 3.]
[0. 0. 1. 1. 3.]]

W_new = 800

start = time()
out = enlarge(img, W_new)
end = time()

# Can take around 20 seconds
print("Enlarging width from %d to %d: %f seconds." \
% (W, W_new, end - start))

plt.subplot(2, 1, 1)
plt.title('Original')
plt.imshow(img)

plt.subplot(2, 1, 2)
plt.title('Resized')
plt.imshow(out)
plt.show()

Enlarging width from 640 to 800: 12.055351 seconds.

# Map of the seams for horizontal seams.
start = time()
seams = find_seams(img, W_new - W, axis=0)
end = time()

# Can take around 15 seconds
print("Finding %d seams: %f seconds." % (W_new - W, end - start))

plt.imshow(seams, cmap='viridis')
plt.show()

Finding 160 seams: 12.105962 seconds.

H_new = 600
np.set_printoptions(threshold=np.inf)  # print arrays in full; the string 'nan' is rejected by current NumPy
start = time()
out = enlarge(img, H_new, axis=0)
end = time()

# Can take around 20 seconds
print("Enlarging height from %d to %d: %f seconds." \
% (H, H_new, end - start))

plt.subplot(1, 2, 1)
plt.title('Original')
plt.imshow(img)

plt.subplot(1, 2, 2)
plt.title('Resized')
plt.imshow(out)
plt.show()

Enlarging height from 434 to 600: 16.042105 seconds.


As you can see in the example above, both the sky above the castle and the grass below it have doubled in size, but that alone is still not enough to reach a height of 600.

The algorithm then needs to enlarge the castle itself, while trying to avoid enlarging the windows for instance.
