tensorflow padding解析

Posted by zihuaweng on April 9, 2018

卷积神经网络卷积后得到的feature map大小计算如下:

W‘ = (W − F + 2P )/S + 1
  • W’:out_h,w
  • W:in_h, w
  • F: filter大小
  • P: padding大小
  • S:stride

padding的方式在tensorflow里分两种,一种是VALID,一种是SAME.

SAME padding

SAME会自动加入padding里面的0

new_height = ceil(new_width = W / S)

在高度上需要pad的像素数为

pad_needed_height = (new_height – 1)  × S + F - W

根据上式,输入矩阵上方添加的像素数为

pad_top = pad_needed_height / 2  (结果取整)

下方添加的像素数为

pad_down = pad_needed_height - pad_top

以此类推,在宽度上需要pad的像素数和左右分别添加的像素数为

pad_needed_width = (new_width – 1)  × S + F - W

pad_left = pad_needed_width  / 2 (结果取整)

pad_right = pad_needed_width – pad_left

VALID padding

VALID不会添加padding里面的0

new_height = new_width = ceil((W – F + 1) / S)

ceil: 向上取整. ceil(5.1) = 6

自定义padding

如果想要自己定义padding大小的话,可以使用tf.pad()

# 't' is [[1, 2, 3], [4, 5, 6]].
# 'paddings' is [[1, 1,], [2, 2]].
# rank of 't' is 2.
tf.pad(t, paddings, "CONSTANT") ==> [[0, 0, 0, 0, 0, 0, 0],
                                  [0, 0, 1, 2, 3, 0, 0],
                                  [0, 0, 4, 5, 6, 0, 0],
                                  [0, 0, 0, 0, 0, 0, 0]]

具体操作:

input = tf.placeholder(tf.float32, [None, 28, 28, 3])
padded_input = tf.pad(input, [[0, 0], [2, 2], [1, 1], [0, 0]], "CONSTANT")

filter = tf.placeholder(tf.float32, [5, 5, 3, 16])
output = tf.nn.conv2d(padded_input, filter, strides=[1, 1, 1, 1], padding="VALID")

Reference:

  1. https://www.jianshu.com/p/05c4f1621c7e
  2. https://stackoverflow.com/questions/45238939/tensorflow-is-there-a-way-to-specify-padding-along-axes
  3. https://stackoverflow.com/questions/37659538/custom-padding-for-convolutions-in-tensorflow