Pooling, in particular Max pooling
Pooling helps reduce over-fitting by producing an abstracted form of the representation (e.g. an image, activation features, etc.).
Basically, pooling is down-sampling, as shown below.
Max pooling takes the maximum value of the input within each ksize window.
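To make the idea concrete, here is a minimal NumPy sketch (an illustration added here, not part of the tf.nn.max_pool example below) that slides a 2x2 window with stride 2 over a 4x4 array and keeps the maximum of each window:

import numpy as np

# A minimal sketch of 2x2 max pooling with stride 2 on a 4x4 array
# (illustration only; the TensorFlow example below uses tf.nn.max_pool).
x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 0],
              [3, 4, 1, 8]], dtype=np.float32)

pooled = np.zeros((2, 2), dtype=np.float32)
for i in range(2):
    for j in range(2):
        window = x[2 * i:2 * i + 2, 2 * j:2 * j + 2]
        pooled[i, j] = window.max()

print(pooled)  # [[6. 4.]
               #  [7. 9.]]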
"""Example code for tf.nn.max_pool(https://www.tensorflow.org/versions/r1.8/api_docs/python/tf/nn/max_pool)
tf.nn.max_pool(
value,
ksize,
strides,
padding,
data_format='NHWC',
name=None
)
"""
import sys
import tensorflow as tf
import numpy as np
print("=== Version checking ===")
print("The version of sys: \n{}".format(sys.version))
print("Tensorflow version: {}".format(tf.__version__))
print("========================")
Before pooling, let’s predict the output’s height and width.
The padding scheme is the same as when you convolve data in the NHWC format.
If you want another data format, pass the data_format argument; its default value is 'NHWC'.
The following functions calculate the output’s height and width,
given in_height, in_width, the ksize list, and the strides list.
The ksize list plays a role similar to the filter in the conv2d operation.
Let’s see an example which calculates the output’s height and width.
# max pooling verification after cnn text classification
batch_size = 1
in_height = 5   # the same as the sequence length
in_width = 1
in_channels = 1
filter_size = 2
sequence_length = in_height
strides = [1, 1, 1, 1]
filter_height = sequence_length - filter_size + 1
filter_width = 1
ksize = [1, filter_height, filter_width, 1]

def check_output_size_with_VALID(in_height, in_width, strides, filter_height, filter_width):
    # VALID padding means no padding at all
    out_height = np.ceil(float(in_height - filter_height + 1) / float(strides[1]))
    out_width = np.ceil(float(in_width - filter_width + 1) / float(strides[2]))
    print("VALID padding is no padding")
    print("output_height: {}".format(out_height))
    print("output_width: {}".format(out_width))

print("What is the height and width of the output?")
# With the values above (in_height=5, filter_height=4, strides of 1),
# this prints output_height: 2.0 and output_width: 1.0
check_output_size_with_VALID(in_height, in_width, strides, filter_height, filter_width)
def check_output_size_with_SAME(in_height, in_width, strides, filter_height, filter_width):
    # SAME padding adds the smallest possible padding so that
    # the output size is ceil(input size / stride)
    out_height = np.ceil(float(in_height) / float(strides[1]))
    out_width = np.ceil(float(in_width) / float(strides[2]))
    print("SAME padding uses the smallest possible padding")
    print("output_height: {}".format(out_height))
    print("output_width: {}".format(out_width))
    if in_height % strides[1] == 0:
        pad_along_height = max(filter_height - strides[1], 0)
    else:
        pad_along_height = max(filter_height - (in_height % strides[1]), 0)
    if in_width % strides[2] == 0:
        pad_along_width = max(filter_width - strides[2], 0)
    else:
        pad_along_width = max(filter_width - (in_width % strides[2]), 0)
    print("pad along height and width...")
    print("pad along height: {}".format(pad_along_height))
    print("pad along width: {}".format(pad_along_width))
    # Split the total padding between the two sides; any extra pixel
    # goes to the bottom and to the right
    pad_top = pad_along_height // 2
    pad_bottom = pad_along_height - pad_top
    pad_left = pad_along_width // 2
    pad_right = pad_along_width - pad_left
    print("Padding size on top, bottom, left and right")
    print("top: {}".format(pad_top))
    print("bottom: {}".format(pad_bottom))
    print("left: {}".format(pad_left))
    print("right: {}".format(pad_right))

print("What is the height and width of the output?")
check_output_size_with_SAME(in_height, in_width, strides, filter_height, filter_width)
Given the output height and width computed by the functions above, you can expect the pooling results to look as follows:
input_ = tf.get_variable("input_data", shape=[batch_size, in_height, in_width, in_channels], dtype=tf.float32)
pooled_with_valid = tf.nn.max_pool(input_,
                                   ksize=ksize,
                                   strides=strides,
                                   padding="VALID",
                                   name="pooling_valid")
pooled_with_same = tf.nn.max_pool(input_,
                                  ksize=ksize,
                                  strides=strides,
                                  padding="SAME",
                                  name="pooling_same")
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    print("""Original input data""")
    print(sess.run(input_))
    print("""\nMax pooling with VALID scheme""")
    print(sess.run(pooled_with_valid))
    print("""\nMax pooling with SAME scheme""")
    print(sess.run(pooled_with_same))
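One way to sanity-check the calculations above (a small sketch, assuming the same parameters) is to inspect the static shapes TensorFlow infers for the pooled tensors; with batch_size=1, in_height=5, in_width=1, filter_height=4 and strides of 1, VALID pooling should report a shape of [1, 2, 1, 1] and SAME pooling [1, 5, 1, 1]:

# Optional sanity check: the static shapes inferred by TensorFlow should match
# the values from check_output_size_with_VALID / check_output_size_with_SAME.
print("VALID pooled shape: {}".format(pooled_with_valid.get_shape().as_list()))  # [1, 2, 1, 1]
print("SAME pooled shape: {}".format(pooled_with_same.get_shape().as_list()))    # [1, 5, 1, 1]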