This paper shows that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, exceed the state-of-the-art in semantic segmentation. This repo adapts vgg16 networks into fully convolutional networks and transfer their learned representations by fine-tuning to the segmentation task on PASCAL VOC2012 dataset.
Download VOC PASCAL trainval and test data.
$ wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
$ wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
$ wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
Extract all of these tars into one directory and rename them, which should have the following basic structure.
VOC # path: /home/yang/dataset/VOC
├── test
| └──VOCdevkit
| └──VOC2007 (from VOCtest_06-Nov-2007.tar)
└── train
└──VOCdevkit
└──VOC2007 (from VOCtrainval_06-Nov-2007.tar)
└──VOC2012 (from VOCtrainval_11-May-2012.tar)
Then you need parser_voc.py to parser PASCAL VOC dataset and train it. Finally you can evaluate model by python test.py
. The prediction results are expected to save in ./data/prediction
. Here is my trained weight FCN8s.h5
$ python parser_voc.py --voc_path /home/yang/dataset/VOC
$ python train.py
Epoch 1/30
6000/6000 [==============================] - 3571s 595ms/step - loss: 1.2374 - accuracy: 0.7233
Epoch 2/30
6000/6000 [==============================] - 3552s 592ms/step - loss: 1.1319 - accuracy: 0.7497
...
Epoch 30/30
6000/6000 [==============================] - 3552s 592ms/step - loss: 0.0811 - accuracy: 0.9797
$ python test.py
@Github_Project{TensorFlow2.0-Examples,
author = YunYang1994,
email = www.dreameryangyun@sjtu.edu.cn,
title = "FCN: Fully Convolutional Networks for Semantic Segmentation.",
url = https://github.com/YunYang1994/TensorFlow2.0-Examples,
year = 2019,
}