tf.nn.conv2d

In this post I will evaluate one of TensorFlow's convolution APIs, tf.nn.conv2d.

As you may already know, the numeric suffix of the tf.nn.convNd APIs indicates the dimensionality of the input data.

So TensorFlow provides three such APIs: conv1d, conv2d, and conv3d.

e.g.

  • conv1d : the input data has width and channels

  • conv2d : the input data has height, width, and channels

  • conv3d : the input data has depth, height, width, and channels.

Here I will go through conv2d with the data format “NHWC”:

  • N : the batch size

  • H : the height

  • W : the width

  • C : the number of channels
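
As a quick sketch of the expected input shapes (my own example, the names x1/x2/x3 are invented; TF 1.x assumed), the batch dimension always comes first:

import tensorflow as tf

x1 = tf.zeros([8, 100, 3])         # conv1d input: [batch, width, channels]
x2 = tf.zeros([8, 64, 64, 3])      # conv2d input (NHWC): [batch, height, width, channels]
x3 = tf.zeros([8, 16, 64, 64, 3])  # conv3d input: [batch, depth, height, width, channels]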

tf.nn.conv2d computes a 2-D convolution given 4-D input and filter tensors.

Given an input tensor of shape [batch, in_height, in_width, in_channels] and a filter/kernel tensor of shape [filter_height, filter_width, in_channels, out_channels], this op performs the following:

  1. Flattens the filter to a 2-D matrix with shape [filter_height * filter_width * in_channels, output_channels].

  2. Extracts image patches from the input tensor to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].

  3. For each patch, right-multiplies the filter matrix and the image patch vector.

In detail, with the default NHWC format,

output[b, i, j, k] =
    sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] *
                    filter[di, dj, q, k]

This format must have strides[0] = strides[3] = 1. For the most common case of the same horizontal and vertical strides,

strides = [1, stride, stride, 1]
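
To make these three steps concrete, here is a small NumPy sketch of my own (the names conv2d_valid_nhwc, patches, and filter_matrix are invented for this illustration); it flattens the filter, extracts the patches, and right-multiplies, exactly as the formula above describes (NHWC layout, stride 1, no padding):

import numpy as np

def conv2d_valid_nhwc(inp, filt):
    # inp: [batch, in_height, in_width, in_channels], filt: [filter_height, filter_width, in_channels, out_channels]
    n, h, w, c = inp.shape
    fh, fw, _, k = filt.shape
    out_h, out_w = h - fh + 1, w - fw + 1
    # 1. flatten the filter to [filter_height * filter_width * in_channels, out_channels]
    filter_matrix = filt.reshape(fh * fw * c, k)
    # 2. extract image patches into [batch, out_h, out_w, filter_height * filter_width * in_channels]
    patches = np.zeros((n, out_h, out_w, fh * fw * c), dtype=inp.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patches[:, i, j, :] = inp[:, i:i + fh, j:j + fw, :].reshape(n, -1)
    # 3. right-multiply each patch vector by the flattened filter matrix
    return patches @ filter_matrix

x = np.random.rand(1, 4, 5, 1).astype(np.float32)
w = np.ones((2, 5, 1, 1), dtype=np.float32)
print(conv2d_valid_nhwc(x, w).shape)  # (1, 3, 1, 1)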

"""Example code for tf.nn.embedding_lookup(https://www.tensorflow.org/versions/r1.8/api_docs/python/tf/nn/conv2d)

tf.nn.conv2d(
    input,
    filter,
    strides,
    padding,
    use_cudnn_on_gpu=True,
    data_format='NHWC',
    dilations=[1, 1, 1, 1],
    name=None
)
"""

import sys
import tensorflow as tf
import numpy as np

print("=== Version checking ===")
print("The version of sys: \n{}".format(sys.version))
print("Tensorflow version: {}".format(tf.__version__))
print("========================")
=== Version checking ===
The version of sys: 
3.5.2 (default, Nov 23 2017, 16:37:01) 
[GCC 5.4.0 20160609]
Tensorflow version: 1.8.0
========================

The following functions show how large the height and width of the output will be.

You have to provide the height and width of the input image, the filter size, and the strides, together with the padding scheme.

From these you can determine the height and width of the output as well as the size of the padding.

In TensorFlow, the padding used by the “SAME” scheme is the smallest possible padding.

Let’s see an example:

Here, the in_channels of the input has to match the filter’s in_channels,

and filter_out_channels is the number of output channels produced after filtering each patch of the input.

Finally, strides is a list with four entries, one stride for each input dimension.

batch = 1
in_height = 4 
in_width = 5
in_channels = 1


filter_height = 2
filter_width = 5
filter_in_channels = in_channels
filter_out_channels = 1

strides = [1, 1, 1, 1]

print("[batch, in_height, in_width, in_channels] = \n[{}, {}, {}, {}]".format(batch, in_height, in_width, in_channels))

print("[filter_height, filter_width, filter_in_channels, filter_out_channels] = \n[{}, {}, {}, {}]".format(filter_height, filter_width, filter_in_channels, filter_out_channels))

print("strides = \n[{}, {}, {}, {}]".format(strides[0], strides[1], strides[2], strides[3]))
[batch, in_height, in_width, in_channels] = 
[1, 4, 5, 1]
[filter_height, filter_width, filter_in_channels, filter_out_channels] = 
[2, 5, 1, 1]
strides = 
[1, 1, 1, 1]

First of all, let’s evaluate the output size with the “VALID” padding scheme:

def check_output_size_with_VALID(in_height, in_width, strides, filter_height, filter_width):
    out_height = np.ceil(float(in_height - filter_height + 1) / float(strides[1]))
    out_width  = np.ceil(float(in_width - filter_width + 1) / float(strides[2]))
    print("VALID padding means no padding")
    print("output_height: {}".format(out_height))
    print("output_width: {}".format(out_width))

print("What is the height and width of the output?")
check_output_size_with_VALID(in_height, in_width, strides, filter_height, filter_width)
What is the height and width of the output?
VALID padding means no padding
output_height: 3.0
output_width: 1.0

Secondly, check the output size with the “SAME” padding scheme:

def check_output_size_with_SAME(in_height, in_width, strides, filter_height, filter_width):
    out_height = np.ceil(float(in_height) / float(strides[1]))
    out_width  = np.ceil(float(in_width) / float(strides[2]))
    print("SAME padding uses the smallest possible padding")
    print("output_height: {}".format(out_height))
    print("output_width: {}".format(out_width))

    if (in_height % strides[1] == 0):
        pad_along_height = max(filter_height - strides[1], 0)
    else:
        pad_along_height = max(filter_height - (in_height % strides[1]), 0)
    if (in_width % strides[2] == 0):
        pad_along_width = max(filter_width - strides[2], 0)
    else:
        pad_along_width = max(filter_width - (in_width % strides[2]), 0)

    print("pad along height and width...")
    print("pad along height: {}".format(pad_along_height))
    print("pad along width: {}".format(pad_along_width))

    # split the total padding between the two sides; any extra pixel goes to the bottom/right
    pad_top = pad_along_height // 2
    pad_bottom = pad_along_height - pad_top
    pad_left = pad_along_width // 2
    pad_right = pad_along_width - pad_left

    print("Padding size on top, bottom, left and right")
    print("top: {}".format(pad_top))
    print("bottom: {}".format(pad_bottom))
    print("left: {}".format(pad_left))
    print("right: {}".format(pad_right))

print("What is height and width of output???????")
check_output_size_with_SAME(in_height, in_width, strides, filter_height, filter_width)
What is height and width of output???????
SAME has padding and it the smallest possible padding
output_height: 4.0
output_width: 5.0
pad along height and width...
pad along height: 1
pad along width: 4
Padding size on top, bottom, left and right
top: 0
bottom: 1
left: 2
right: 2

Basically, convolution is computed per channel like this.

From http://cs231n.github.io/convolutional-networks/
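
To see what “per channel” means (a small NumPy check of my own, not the cs231n demo itself), a single output value is the elementwise product of a patch and the kernel, summed over height, width, and every channel:

import numpy as np

patch  = np.random.rand(2, 5, 3)    # one image patch: [filter_height, filter_width, in_channels]
kernel = np.random.rand(2, 5, 3)    # one filter for a single output channel, same shape as the patch
out_value = np.sum(patch * kernel)  # multiply elementwise, then sum over height, width and channels
per_channel = [np.sum(patch[:, :, c] * kernel[:, :, c]) for c in range(3)]  # channel-by-channel sums
print(np.isclose(out_value, sum(per_channel)))  # True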

Let’s go through the “VALID” padding scheme in detail, from the ground up.

The input image data is random numbers and the kernel is a constant tensor.

batch = 1
in_height = 4 
in_width = 5
in_channels = 1


filter_height = 2
filter_width = 5
filter_in_channels = in_channels
filter_out_channels = 1

strides = [1, 1, 1, 1]

# image is [batch, in_height, in_width, in_channels]
image_tensor = tf.get_variable(shape=[batch, in_height, in_width, in_channels], dtype=tf.float32, name="image")

# filter's size is [filter_height, filter_width, in_channels, out_channels]
kernel = tf.constant(1, shape=[filter_height, filter_width, filter_in_channels, filter_out_channels], dtype=tf.float32, name="kernel")

# the result of convolving each patch of the image with the filter
result = tf.nn.conv2d(image_tensor, kernel, strides=strides, padding="VALID")

init_op = tf.global_variables_initializer()

print("What is height and width of output ?")
check_output_size_with_VALID(in_height, in_width, strides, filter_height, filter_width)
What is height and width of output ?
VAILD padding is no padding
output_height: 3.0
output_width: 5
with tf.Session() as sess:
    sess.run(init_op)
    print("====== Image Tensor ======= \n{}".format(image_tensor))
    image = sess.run(image_tensor)
    print(image)
    print("\n====== Kernel Tensor ======= \n{}".format(kernel))
    print(sess.run(kernel))
    #print("\n====== Result Tensor ======= \n{}".format(result))
    #print(sess.run(result))
    print("\n========== Checking separately how to evaluate conv2d =============")
    a = image[0, 0]; b = image[0, 1]; c = image[0, 2]; d = image[0, 3];
    print("\nimage(0, 0, :, :) \n{}".format(a))
    print("\nimage(0, 1, :, :) \n{}".format(b))
    print("\nimage(0, 2, :, :) \n{}".format(c))
    print("\nimage(0, 3, :, :) \n{}".format(d))
    print("\n=============== Sum ===========")
    sumA=np.sum(a); sumB = np.sum(b); sumC = np.sum(c); sumD = np.sum(d)
    print("\nSum of image(0, 0, :, :) \n{}".format(sumA))
    print("\nSum of image(0, 1, :, :) \n{}".format(sumB))
    print("\nSum of image(0, 2, :, :) \n{}".format(sumC))
    print("\nSum of image(0, 3, :, :) \n{}".format(sumD))
    print("\n============== after conv =================")
    print("Predcition of mine")
    print("\nSum of image(0, 0:2, :, :) \n{}".format(sumA + sumB))
    print("\nSum of image(0, 1:3, :, :) \n{}".format(sumB + sumC))
    print("\nSum of image(0, 2:4, :, :) \n{}".format(sumC + sumD))
    print("Real result of conv2d")
    print("\nThe result after conv2d: \n{}".format(sess.run(result)))
====== Image Tensor ======= 
<tf.Variable 'image:0' shape=(1, 4, 5, 1) dtype=float32_ref>
[[[[-0.1517433 ]
   [-0.07601047]
   [ 0.04082358]
   [ 0.23615754]
   [ 0.09633338]]

  [[ 0.2741418 ]
   [-0.0265708 ]
   [ 0.26267707]
   [ 0.20932722]
   [ 0.05905843]]

  [[-0.13125479]
   [ 0.26252484]
   [-0.41824389]
   [ 0.34488845]
   [ 0.39004552]]

  [[ 0.3134501 ]
   [ 0.4909582 ]
   [-0.40073442]
   [ 0.4317845 ]
   [-0.27387238]]]]

====== Kernel Tensor ======= 
Tensor("kernel:0", shape=(2, 5, 1, 1), dtype=float32)
[[[[1.]]

  [[1.]]

  [[1.]]

  [[1.]]

  [[1.]]]


 [[[1.]]

  [[1.]]

  [[1.]]

  [[1.]]

  [[1.]]]]

========== Checking separately how to evaluate conv2d =============

image(0, 0, :, :) 
[[-0.1517433 ]
 [-0.07601047]
 [ 0.04082358]
 [ 0.23615754]
 [ 0.09633338]]

image(0, 1, :, :) 
[[ 0.2741418 ]
 [-0.0265708 ]
 [ 0.26267707]
 [ 0.20932722]
 [ 0.05905843]]

image(0, 2, :, :) 
[[-0.13125479]
 [ 0.26252484]
 [-0.41824389]
 [ 0.34488845]
 [ 0.39004552]]

image(0, 3, :, :) 
[[ 0.3134501 ]
 [ 0.4909582 ]
 [-0.40073442]
 [ 0.4317845 ]
 [-0.27387238]]

=============== Sum ===========

Sum of image(0, 0, :, :) 
0.14556074142456055

Sum of image(0, 1, :, :) 
0.778633713722229

Sum of image(0, 2, :, :) 
0.44796013832092285

Sum of image(0, 3, :, :) 
0.5615860223770142

============== after conv =================
My prediction

Sum of image(0, 0:2, :, :) 
0.9241944551467896

Sum of image(0, 1:3, :, :) 
1.2265938520431519

Sum of image(0, 2:4, :, :) 
1.009546160697937
Real result of conv2d

The result after conv2d: 
[[[[0.92419446]]

  [[1.2265939 ]]

  [[1.0095462 ]]]]

Let’s go through another example, changing in_channels and out_channels.

In this case, in_channels is 2 and out_channels is 1.

Keep in mind that for each patch, conv2d right-multiplies the filter matrix and the image patch vector (see the sketch below).
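
As a small illustration of that right-multiplication (my own sketch, not from the code below): with an all-ones [2, 5, 2, 1] kernel, multiplying a flattened patch vector by the flattened filter matrix simply sums the patch over both channels, which is what the prediction in the code below computes:

import numpy as np

patch = np.random.rand(2, 5, 2)              # one patch: [filter_height, filter_width, in_channels]
filter_matrix = np.ones((2 * 5 * 2, 1))      # flattened all-ones kernel: [FH * FW * C, out_channels]
out = patch.reshape(1, -1) @ filter_matrix   # right-multiplication: [1, FH*FW*C] x [FH*FW*C, 1]
print(np.isclose(out[0, 0], patch.sum()))    # True: both channels collapse into one output value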

batch = 1
in_height = 4 
in_width = 5
in_channels = 2


filter_height = 2
filter_width = 5
filter_in_channels = in_channels
filter_out_channels = 1

strides = [1, 1, 1, 1]

# image is [batch, in_height, in_width, in_channels]
image_tensor = tf.get_variable(shape=[batch, in_height, in_width, in_channels], dtype=tf.float32, name="image_with_in_channel2")

# filter's size is [filter_height, filter_width, in_channels, out_channels]
kernel = tf.constant(1, shape=[filter_height, filter_width, filter_in_channels, filter_out_channels], dtype=tf.float32, name="kernel")

# the result of convolving each patch of the image with the filter
result = tf.nn.conv2d(image_tensor, kernel, strides=strides, padding="VALID")

init_op = tf.global_variables_initializer()

print("What is height and width of output ?")
check_output_size_with_VALID(in_height, in_width, strides, filter_height, filter_width)
What is height and width of output ?
VAILD padding is no padding
output_height: 3.0
output_width: 5
with tf.Session() as sess:
    sess.run(init_op)
    print("====== Image Tensor ======= \n{}".format(image_tensor))
    image = sess.run(image_tensor)
    print(image)
    print("\n====== Kernel Tensor ======= \n{}".format(kernel))
    print(sess.run(kernel))
    #print("\n====== Result Tensor ======= \n{}".format(result))
    #print(sess.run(result))
    print("\n========== Checking separately how to evaluate conv2d =============")
    a = image[0, 0]; b = image[0, 1]; c = image[0, 2]; d = image[0, 3];
    print("\nimage(0, 0, :, :) \n{}".format(a))
    print("\nimage(0, 1, :, :) \n{}".format(b))
    print("\nimage(0, 2, :, :) \n{}".format(c))
    print("\nimage(0, 3, :, :) \n{}".format(d))
    print("\n=============== Sum ===========")
    sumA=np.sum(a); sumB = np.sum(b); sumC = np.sum(c); sumD = np.sum(d)
    print("\nSum of image(0, 0, :, :) \n{}".format(sumA))
    print("\nSum of image(0, 1, :, :) \n{}".format(sumB))
    print("\nSum of image(0, 2, :, :) \n{}".format(sumC))
    print("\nSum of image(0, 3, :, :) \n{}".format(sumD))
    print("\n============== after conv =================")
    print("Predcition of mine")
    print("\nSum of image(0, 0:2, :, :) \n{}".format(sumA + sumB))
    print("\nSum of image(0, 1:3, :, :) \n{}".format(sumB + sumC))
    print("\nSum of image(0, 2:4, :, :) \n{}".format(sumC + sumD))
    print("Real result of conv2d")
    print("\nThe result after conv2d: \n{}".format(sess.run(result)))
====== Image Tensor ======= 
<tf.Variable 'image_with_in_channel2:0' shape=(1, 4, 5, 2) dtype=float32_ref>
[[[[ 0.08029783 -0.28304037]
   [ 0.27321577 -0.30532986]
   [ 0.11224103 -0.14924899]
   [-0.36968923 -0.01813132]
   [-0.06485182  0.00812727]]

  [[ 0.12902611  0.15143675]
   [ 0.090442    0.4418546 ]
   [ 0.1522311   0.03220689]
   [ 0.20738578 -0.33341202]
   [-0.31545156 -0.3610437 ]]

  [[ 0.28795433  0.32723296]
   [ 0.2937019  -0.39851546]
   [-0.11825514  0.40925205]
   [-0.38389468 -0.33764756]
   [-0.16338924  0.05253029]]

  [[ 0.39701647  0.20766819]
   [-0.3134635  -0.2894466 ]
   [ 0.03414229 -0.00896153]
   [ 0.17535049  0.13804382]
   [ 0.07206517  0.16677117]]]]

====== Kernel Tensor ======= 
Tensor("kernel_1:0", shape=(2, 5, 2, 1), dtype=float32)
[[[[1.]
   [1.]]

  [[1.]
   [1.]]

  [[1.]
   [1.]]

  [[1.]
   [1.]]

  [[1.]
   [1.]]]


 [[[1.]
   [1.]]

  [[1.]
   [1.]]

  [[1.]
   [1.]]

  [[1.]
   [1.]]

  [[1.]
   [1.]]]]

========== Checking separately how to evaluate conv2d =============

image(0, 0, :, :) 
[[ 0.08029783 -0.28304037]
 [ 0.27321577 -0.30532986]
 [ 0.11224103 -0.14924899]
 [-0.36968923 -0.01813132]
 [-0.06485182  0.00812727]]

image(0, 1, :, :) 
[[ 0.12902611  0.15143675]
 [ 0.090442    0.4418546 ]
 [ 0.1522311   0.03220689]
 [ 0.20738578 -0.33341202]
 [-0.31545156 -0.3610437 ]]

image(0, 2, :, :) 
[[ 0.28795433  0.32723296]
 [ 0.2937019  -0.39851546]
 [-0.11825514  0.40925205]
 [-0.38389468 -0.33764756]
 [-0.16338924  0.05253029]]

image(0, 3, :, :) 
[[ 0.39701647  0.20766819]
 [-0.3134635  -0.2894466 ]
 [ 0.03414229 -0.00896153]
 [ 0.17535049  0.13804382]
 [ 0.07206517  0.16677117]]

=============== Sum ===========

Sum of image(0, 0, :, :) 
-0.7164096832275391

Sum of image(0, 1, :, :) 
0.1946759819984436

Sum of image(0, 2, :, :) 
-0.0310305655002594

Sum of image(0, 3, :, :) 
0.579185962677002

============== after conv =================
My prediction

Sum of image(0, 0:2, :, :) 
-0.5217337012290955

Sum of image(0, 1:3, :, :) 
0.1636454164981842

Sum of image(0, 2:4, :, :) 
0.5481554269790649
Real result of conv2d

The result after conv2d: 
[[[[-0.52173376]]

  [[ 0.16364542]]

  [[ 0.5481554 ]]]]
batch = 1
in_height = 4 
in_width = 5
in_channels = 2


filter_height = 2
filter_width = 5
filter_in_channels = in_channels
filter_out_channels = 1


strides = [1, 1, 1, 1]

# image is [batch, in_height, in_width, in_channels]
image_tensor = tf.get_variable(shape=[batch, in_height, in_width, in_channels], dtype=tf.float32, name="image_other_filter")

# filter's size is concatenation of two, [filter_height, filter_width, in_channels-1, out_channels].
kernel1 = tf.constant(1, shape=[filter_height, filter_width, filter_in_channels-1, filter_out_channels], dtype=tf.float32, name="kernel_1")
kernel2 = tf.constant(0, shape=[filter_height, filter_width, filter_in_channels-1, filter_out_channels], dtype=tf.float32, name="kernel_2")

kernel = tf.concat([kernel1, kernel2], axis=2)

# the result of convolving each patch of the image with the filter
result = tf.nn.conv2d(image_tensor, kernel, strides=strides, padding="VALID")

init_op = tf.global_variables_initializer()

print("What is height and width of output ?")
check_output_size_with_VALID(in_height, in_width, strides, filter_height, filter_width)
What is height and width of output ?
VAILD padding is no padding
output_height: 3.0
output_width: 5
with tf.Session() as sess:
    sess.run(init_op)
    print("====== Image Tensor ======= \n{}".format(image_tensor))
    image = sess.run(image_tensor)
    print(image)
    print("\n====== Kernel Tensor ======= \n{}, \n{}, \n{}".format(kernel1, kernel2, kernel))
    #print(sess.run(kernel1))
    #print(sess.run(kernel2))
    print(sess.run(kernel))
    #print("\n====== Result Tensor ======= \n{}".format(result))
    #print(sess.run(result))
    print("\n========== Checking separately how to evaluate conv2d =============")
    a = image[0, 0]; b = image[0, 1]; c = image[0, 2]; d = image[0, 3];
    one_hot = np.array([1., 0.])
    aMul=np.multiply(a, one_hot); bMul=np.multiply(b, one_hot);
    cMul=np.multiply(c, one_hot); dMul=np.multiply(d, one_hot);
    print("\nimage(0, 0, :, :), one_hot  \n{}, \n{}".format(a, aMul))
    print("\nimage(0, 1, :, :), one_hot  \n{}, \n{}".format(b, bMul))
    print("\nimage(0, 2, :, :), one_hot  \n{}, \n{}".format(c, cMul))
    print("\nimage(0, 3, :, :), one_hot  \n{}, \n{}".format(d, dMul))
    print("\n=============== Sum ===========")
    sumA=np.sum(aMul); sumB = np.sum(bMul); sumC = np.sum(cMul); sumD = np.sum(dMul)
    print("\nSum of image(0, 0, :, :)*one_hot \n{}".format(sumA))
    print("\nSum of image(0, 1, :, :)*one_hot \n{}".format(sumB))
    print("\nSum of image(0, 2, :, :)*one_hot \n{}".format(sumC))
    print("\nSum of image(0, 3, :, :)*one_hot \n{}".format(sumD))
    print("\n============== after conv =================")
    print("Predcition of mine")
    print("\nSum of image(0, 0:2, :, :) \n{}".format(sumA + sumB))
    print("\nSum of image(0, 1:3, :, :) \n{}".format(sumB + sumC))
    print("\nSum of image(0, 2:4, :, :) \n{}".format(sumC + sumD))
    print("Real result of conv2d")
    print("\nThe result after conv2d: \n{}".format(sess.run(result)))
====== Image Tensor ======= 
<tf.Variable 'image_other_filter:0' shape=(1, 4, 5, 2) dtype=float32_ref>
[[[[-0.02370977 -0.01503521]
   [-0.16839004 -0.38485444]
   [-0.20123771 -0.39222306]
   [ 0.32345092  0.28650707]
   [-0.373312    0.17185831]]

  [[ 0.24836594 -0.38405308]
   [-0.17415094 -0.23223642]
   [ 0.42959112 -0.13308945]
   [ 0.06653082 -0.09891927]
   [ 0.22589082  0.3532731 ]]

  [[-0.12551215  0.01440436]
   [ 0.17919159 -0.08958769]
   [-0.0690102   0.20290041]
   [-0.06356671  0.04334927]
   [ 0.30124927  0.34986454]]

  [[ 0.3900764  -0.23495927]
   [-0.24632071 -0.22620258]
   [ 0.43515533 -0.15318993]
   [-0.10825992 -0.12556458]
   [-0.35311309  0.07129389]]]]

====== Kernel Tensor ======= 
Tensor("kernel_1_1:0", shape=(2, 5, 1, 1), dtype=float32), 
Tensor("kernel_2:0", shape=(2, 5, 1, 1), dtype=float32), 
Tensor("concat:0", shape=(2, 5, 2, 1), dtype=float32)
[[[[1.]
   [0.]]

  [[1.]
   [0.]]

  [[1.]
   [0.]]

  [[1.]
   [0.]]

  [[1.]
   [0.]]]


 [[[1.]
   [0.]]

  [[1.]
   [0.]]

  [[1.]
   [0.]]

  [[1.]
   [0.]]

  [[1.]
   [0.]]]]

========== Checking separately how to evaluate conv2d =============

image(0, 0, :, :), one_hot  
[[-0.02370977 -0.01503521]
 [-0.16839004 -0.38485444]
 [-0.20123771 -0.39222306]
 [ 0.32345092  0.28650707]
 [-0.373312    0.17185831]], 
[[-0.02370977 -0.        ]
 [-0.16839004 -0.        ]
 [-0.20123771 -0.        ]
 [ 0.32345092  0.        ]
 [-0.373312    0.        ]]

image(0, 1, :, :), one_hot  
[[ 0.24836594 -0.38405308]
 [-0.17415094 -0.23223642]
 [ 0.42959112 -0.13308945]
 [ 0.06653082 -0.09891927]
 [ 0.22589082  0.3532731 ]], 
[[ 0.24836594 -0.        ]
 [-0.17415094 -0.        ]
 [ 0.42959112 -0.        ]
 [ 0.06653082 -0.        ]
 [ 0.22589082  0.        ]]

image(0, 2, :, :), one_hot  
[[-0.12551215  0.01440436]
 [ 0.17919159 -0.08958769]
 [-0.0690102   0.20290041]
 [-0.06356671  0.04334927]
 [ 0.30124927  0.34986454]], 
[[-0.12551215  0.        ]
 [ 0.17919159 -0.        ]
 [-0.0690102   0.        ]
 [-0.06356671  0.        ]
 [ 0.30124927  0.        ]]

image(0, 3, :, :), one_hot  
[[ 0.3900764  -0.23495927]
 [-0.24632071 -0.22620258]
 [ 0.43515533 -0.15318993]
 [-0.10825992 -0.12556458]
 [-0.35311309  0.07129389]], 
[[ 0.3900764  -0.        ]
 [-0.24632071 -0.        ]
 [ 0.43515533 -0.        ]
 [-0.10825992 -0.        ]
 [-0.35311309  0.        ]]

=============== Sum ===========

Sum of image(0, 0, :, :)*one_hot 
-0.4431985914707184

Sum of image(0, 1, :, :)*one_hot 
0.796227753162384

Sum of image(0, 2, :, :)*one_hot 
0.2223517894744873

Sum of image(0, 3, :, :)*one_hot 
0.11753802001476288

============== after conv =================
My prediction

Sum of image(0, 0:2, :, :) 
0.35302916169166565

Sum of image(0, 1:3, :, :) 
1.0185795426368713

Sum of image(0, 2:4, :, :) 
0.3398898094892502
Real result of conv2d

The result after conv2d: 
[[[[0.35302916]]

  [[1.0185795 ]]

  [[0.3398898 ]]]]

Let’s go through the “SAME” padding scheme.

The “SAME” scheme uses the smallest possible zero-padding so that (with stride 1) the output height and width match the input’s.
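
As a minimal sketch of what “SAME” does under the hood (my own example, assuming TF 1.x; the names x, k, same_out, valid_out are invented), zero-padding the input by the amounts computed above (top=0, bottom=1, left=2, right=2) and then running a “VALID” convolution gives the same result as padding="SAME":

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[1, 4, 5, 1])   # [batch, in_height, in_width, in_channels]
k = tf.constant(1.0, shape=[2, 5, 1, 1])             # [filter_height, filter_width, in_channels, out_channels]

same_out = tf.nn.conv2d(x, k, strides=[1, 1, 1, 1], padding="SAME")
padded = tf.pad(x, [[0, 0], [0, 1], [2, 2], [0, 0]]) # pad height by (0, 1) and width by (2, 2)
valid_out = tf.nn.conv2d(padded, k, strides=[1, 1, 1, 1], padding="VALID")

with tf.Session() as sess:
    feed = {x: np.random.rand(1, 4, 5, 1).astype(np.float32)}
    a, b = sess.run([same_out, valid_out], feed_dict=feed)
    print(a.shape, np.allclose(a, b))  # (1, 4, 5, 1) True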

batch = 1
in_height = 4 
in_width = 5
in_channels = 1


filter_height = 2
filter_width = 5
filter_in_channels = in_channels
filter_out_channels = 1


strides = [1, 1, 1, 1]


image_tensor = tf.get_variable(shape=[batch, in_height, in_width, in_channels], dtype=tf.float32, name="same_zero_padding")

kernel = tf.constant(1, shape=[filter_height, filter_width, filter_in_channels, filter_out_channels], dtype=tf.float32, name="image_tensor")

result = tf.nn.conv2d(image_tensor, kernel, strides=strides, padding="SAME")

init_op = tf.global_variables_initializer()

print("What is height and width of output ?")
check_output_size_with_SAME(in_height, in_width, strides, filter_height, filter_width)
What is height and width of output ?
SAME has padding and it the smallest possible padding
output_height: 4.0
output_width: 5.0
pad along height and width...
pad along height: 1
pad along width: 4
Padding size on top, bottom, left and right
top: 0
bottom: 1
left: 2
right: 2
with tf.Session() as sess:
    sess.run(init_op)
    print("====== Image Tensor ======= \n{}".format(image_tensor))
    image = sess.run(image_tensor)
    print(image)
    print("\n====== Kernel Tensor ======= \n{}".format(kernel))
    print(sess.run(kernel))
    #print("\n====== Result Tensor ======= \n{}".format(result))
    #print(sess.run(result))
    print("\n========== Checking separately how to evaluate conv2d =============")
    a0 = image[0, 0, 0:3]; a1 = image[0, 0, 0:4]; a2 = image[0, 0, 0:5]; a3 = image[0, 0, 1:5]; a4 = image[0, 0, 2:5]; 
    b0 = image[0, 1, 0:3]; b1 = image[0, 1, 0:4]; b2 = image[0, 1, 0:5]; b3 = image[0, 1, 1:5]; b4 = image[0, 1, 2:5]; 
    c0 = image[0, 2, 0:3]; c1 = image[0, 2, 0:4]; c2 = image[0, 2, 0:5]; c3 = image[0, 2, 1:5]; c4 = image[0, 2, 2:5]; 
    d0 = image[0, 3, 0:3]; d1 = image[0, 3, 0:4]; d2 = image[0, 3, 0:5]; d3 = image[0, 3, 1:5]; d4 = image[0, 3, 2:5]; 
    print("\nimage(0, 0, 0:3, :)- \n{}".format(a0))
    print("\nimage(0, 0, 0:4, :)- \n{}".format(a1))
    print("\nimage(0, 0, 0:5, :)- \n{}".format(a2))
    print("\nimage(0, 0, 1:5, :)- \n{}".format(a3))
    print("\nimage(0, 0, 2:5, :)- \n{}".format(a4))
    print("\n=============== Sum ===========")
    sumA0 = np.sum(a0); sumA1 = np.sum(a1); sumA2 = np.sum(a2); sumA3 = np.sum(a3); sumA4 = np.sum(a4);
    sumB0 = np.sum(b0); sumB1 = np.sum(b1); sumB2 = np.sum(b2); sumB3 = np.sum(b3); sumB4 = np.sum(b4);
    sumC0 = np.sum(c0); sumC1 = np.sum(c1); sumC2 = np.sum(c2); sumC3 = np.sum(c3); sumC4 = np.sum(c4);
    sumD0 = np.sum(d0); sumD1 = np.sum(d1); sumD2 = np.sum(d2); sumD3 = np.sum(d3); sumD4 = np.sum(d4);
    print("\nSum of image(0, 0, :, :)- \n({}, {}, {}, {}, {})".format(sumA0,sumA1,sumA2,sumA3,sumA4))
    print("\nSum of image(0, 1, :, :)- \n({}, {}, {}, {}, {})".format(sumB0,sumB1,sumB2,sumB3,sumB4))
    print("\nSum of image(0, 2, :, :)- \n({}, {}, {}, {}, {})".format(sumC0,sumC1,sumC2,sumC3,sumC4))
    print("\nSum of image(0, 3, :, :)- \n({}, {}, {}, {}, {})".format(sumD0,sumD1,sumD2,sumD3,sumD4))
    print("\n============== after conv =================")
    print("\nSum of image(0, 0:2, :, :)- \n({}, {}, {}, {}, {})".format((sumA0+sumB0),(sumA1+sumB1),(sumA2+sumB2),(sumA3+sumB3),(sumA4+sumB4)))
    print("\nSum of image(0, 1:3, :, :)- \n({}, {}, {}, {}, {})".format((sumB0+sumC0),(sumB1+sumC1),(sumB2 + sumC2),(sumB3 + sumC3),(sumB4 + sumC4)))
    print("\nSum of image(0, 2:4, :, :)- \n({}, {}, {}, {}, {})".format((sumC0 + sumD0),(sumC1 + sumD1),(sumC2 + sumD2),(sumC3 + sumD3),(sumC4 + sumD4)))
    print("\nSum of image(0, 2:4, :, :)- \n({}, {}, {}, {}, {})".format(sumD0, sumD1, sumD2, sumD3, sumD4))
    print("\nThe result after conv2d: \n{}".format(sess.run(result)))
====== Image Tensor ======= 
<tf.Variable 'same_zero_padding:0' shape=(1, 4, 5, 1) dtype=float32_ref>
[[[[ 0.03944325]
   [ 0.41089988]
   [ 0.17444944]
   [ 0.05654764]
   [-0.01039255]]

  [[ 0.4735849 ]
   [ 0.09873235]
   [ 0.12767589]
   [ 0.04909849]
   [-0.22798371]]

  [[-0.09408021]
   [-0.0354234 ]
   [ 0.1446271 ]
   [ 0.48654997]
   [ 0.1193465 ]]

  [[-0.01348555]
   [-0.16268647]
   [ 0.45568395]
   [ 0.40519345]
   [ 0.19998193]]]]

====== Kernel Tensor ======= 
Tensor("image_tensor:0", shape=(2, 5, 1, 1), dtype=float32)
[[[[1.]]

  [[1.]]

  [[1.]]

  [[1.]]

  [[1.]]]


 [[[1.]]

  [[1.]]

  [[1.]]

  [[1.]]

  [[1.]]]]

========== Checking separately how to evaluate conv2d =============

image(0, 0, 0:3, :)- 
[[0.03944325]
 [0.41089988]
 [0.17444944]]

image(0, 0, 0:4, :)- 
[[0.03944325]
 [0.41089988]
 [0.17444944]
 [0.05654764]]

image(0, 0, 0:5, :)- 
[[ 0.03944325]
 [ 0.41089988]
 [ 0.17444944]
 [ 0.05654764]
 [-0.01039255]]

image(0, 0, 1:5, :)- 
[[ 0.41089988]
 [ 0.17444944]
 [ 0.05654764]
 [-0.01039255]]

image(0, 0, 2:5, :)- 
[[ 0.17444944]
 [ 0.05654764]
 [-0.01039255]]

=============== Sum ===========

Sum of image(0, 0, :, :)- 
(0.6247925758361816, 0.681340217590332, 0.6709476709365845, 0.6315044164657593, 0.2206045389175415)

Sum of image(0, 1, :, :)- 
(0.6999931335449219, 0.749091625213623, 0.5211079120635986, 0.04752302169799805, -0.051209330558776855)

Sum of image(0, 2, :, :)- 
(0.015123486518859863, 0.5016734600067139, 0.6210199594497681, 0.7151001691818237, 0.750523567199707)

Sum of image(0, 3, :, :)- 
(0.2795119285583496, 0.684705376625061, 0.8846873044967651, 0.8981728553771973, 1.0608593225479126)

============== after conv =================

Sum of image(0, 0:2, :, :)- 
(1.3247857093811035, 1.430431842803955, 1.192055583000183, 0.6790274381637573, 0.16939520835876465)

Sum of image(0, 1:3, :, :)- 
(0.7151166200637817, 1.250765085220337, 1.1421278715133667, 0.7626231908798218, 0.6993142366409302)

Sum of image(0, 2:4, :, :)- 
(0.2946354150772095, 1.186378836631775, 1.5057072639465332, 1.613273024559021, 1.8113828897476196)

Sum of image(0, 3:4, :, :) + zero padding- 
(0.2795119285583496, 0.684705376625061, 0.8846873044967651, 0.8981728553771973, 1.0608593225479126)

The result after conv2d: 
[[[[1.3247857 ]
   [1.4304318 ]
   [1.1920556 ]
   [0.67902744]
   [0.16939521]]

  [[0.7151166 ]
   [1.2507651 ]
   [1.1421279 ]
   [0.7626232 ]
   [0.69931424]]

  [[0.29463542]
   [1.1863788 ]
   [1.5057073 ]
   [1.613273  ]
   [1.8113829 ]]

  [[0.27951193]
   [0.6847054 ]
   [0.8846873 ]
   [0.89817286]
   [1.0608593 ]]]]

If you want to see more test results, visit here.

Reference