P_VggNet: A convolutional neural network (CNN) with a pixel-based attention map

 


Abstract

This paper presents P_VggNet, a new convolutional neural network (CNN) architecture based on VGGNet. P_VggNet comprises two parts: P_Net and a 16-layer VggNet.

Pixel-based attention map

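Local Binary Patterns (LBP) is a classical feature extraction approach that converts image pixel values into weight values. Each pixel is compared with a reference pixel value (in the standard formulation, the center of its local neighborhood): if it is larger, it is set to 1; otherwise, it is set to 0.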
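The thresholding rule can be illustrated with a minimal sketch for a single 3x3 patch. The choice of the center pixel as the reference, the `>=` comparison, and the clockwise bit order follow the common LBP formulation and are assumptions here, since the text only says "larger than"; the function name `lbp_code` is hypothetical.

```python
def lbp_code(patch):
    """Return (bits, code) for a 3x3 patch given as 3 rows of 3 integers.

    Assumption: each neighbour is thresholded against the centre pixel
    with >=, as in the common LBP formulation.
    """
    center = patch[1][1]
    # Neighbours in clockwise order, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [1 if patch[r][c] >= center else 0 for r, c in offsets]
    # Pack the bits into one integer, first neighbour as the most significant bit.
    code = 0
    for b in bits:
        code = (code << 1) | b
    return bits, code


patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
bits, code = lbp_code(patch)
print(bits, code)  # [1, 0, 0, 0, 1, 1, 1, 1] 143
```

The 0/1 values are what the text refers to as weight values; packing them into a single code is the usual LBP step and may not be needed for an attention map.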

As the number of pooling layers increases, the feature maps become much smaller than the original image. Therefore, an attention map generated by LBP at the original image resolution is inefficient.
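To make the size mismatch concrete, the following sketch computes the spatial size of the feature map after each pooling layer. It assumes the standard VGG16 configuration (a 224x224 input and five 2x2 max-pooling layers with stride 2); the source does not state the input size, and the helper name `pooled_sizes` is hypothetical.

```python
def pooled_sizes(input_size, num_pools, stride=2):
    """List the spatial size before and after each stride-`stride` pooling layer."""
    sizes = [input_size]
    for _ in range(num_pools):
        sizes.append(sizes[-1] // stride)
    return sizes


# Assumed values: 224x224 input, five pooling layers (standard VGG16).
sizes = pooled_sizes(224, 5)
print(sizes)  # [224, 112, 56, 28, 14, 7]
```

Under these assumptions the last feature map is 7x7, so a 224x224 attention map cannot be applied to it without first being reduced to a matching size.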

To address this, the raw image is resized to several different sizes, and the weight of each pixel is then calculated at each size.

This paper uses a new method to generate the attention map. The attention map is generated by the following steps:

graph LR
    A[Image] --> B[Gray Image]
    B --> C[Resized Image]
    C --> D[Weight Image]
    D --> E[Attention Map]
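
The steps in the diagram can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the paper's implementation: the source does not specify how the weights are computed or how the weight image becomes the attention map, so the sketch uses thresholding against the image mean as a placeholder weighting and treats the resulting weight image as the attention map. All function names are hypothetical, and nearest-neighbour resizing is an arbitrary choice.

```python
def to_gray(rgb):
    """Convert an RGB image (rows of (r, g, b) tuples) to grayscale (BT.601 luma)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in rgb]


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to out_h x out_w."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def weight_image(gray):
    """Placeholder weighting (assumption): 1.0 if pixel >= image mean, else 0.0."""
    flat = [v for row in gray for v in row]
    mean = sum(flat) / len(flat)
    return [[1.0 if v >= mean else 0.0 for v in row] for row in gray]


def attention_maps(rgb, sizes):
    """Build one map per target size; the weight image is used as the attention map."""
    gray = to_gray(rgb)
    return {s: weight_image(resize_nearest(gray, s, s)) for s in sizes}


# Toy 4x4 image: left half black, right half white.
image = [[(0, 0, 0), (0, 0, 0), (255, 255, 255), (255, 255, 255)] for _ in range(4)]
maps = attention_maps(image, sizes=[4, 2])
print(maps[2])  # [[0.0, 1.0], [0.0, 1.0]]
```

Producing one map per size means each map can match the spatial size of a feature map at a given depth of the network, which is the motivation given above for resizing the raw image.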

Note