Section 4 Deep Learning Assignment
<Find a Segmentation / Object Detection repository and follow along>
Searching GitHub for "segmentation", the first result was 'awesome-semantic-segmentation' (https://github.com/mrgloom/awesome-semantic-segmentation), where I found a repository worth following along with. It's a curated list of segmentation repositories built on various deep learning models; just working through everything listed there would already teach a lot.
The repository I decided to follow is image-segmentation-keras (https://github.com/divamgupta/image-segmentation-keras). I picked it because it looked simple and well documented, but it didn't run right away; as usual, environment problems kept coming up.
AttributeError: module 'keras.utils' has no attribute 'get_file'
AttributeError: module 'tensorflow_core.compat.v2' has no attribute '__internal__'
AttributeErrors like these kept appearing, and the cause was not having the right versions of TensorFlow and Keras installed.
Before running anything in Colab:
!pip install keras==2.4.3
!pip install tensorflow==2.4.1
!apt-get install -y libsm6 libxext6 libxrender-dev
!pip install opencv-python
Reinstalling the versions recommended in the GitHub repo this way made everything run cleanly.
The tutorial I followed segments street-scene images like the one below, classifying every pixel in the image.
Since it's a simple tutorial, there was no special preprocessing.
from keras_segmentation.models.unet import vgg_unet
model = vgg_unet(n_classes=50, input_height=320, input_width=640)
I used the vgg_unet model, a U-Net whose VGG16 encoder comes with pre-trained weights.
Looking at the model's layers: convolution and max-pooling layers repeat on the way down (the VGG encoder), then after zero-padding, convolution, and batch normalization, blocks of upsampling, concatenation (the skip connections), zero-padding, and batch normalization repeat on the way up, showing the characteristic U-Net shape.
Model: "model_3"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 320, 640, 3) 0
__________________________________________________________________________________________________
block1_conv1 (Conv2D) (None, 320, 640, 64) 1792 input_1[0][0]
__________________________________________________________________________________________________
block1_conv2 (Conv2D) (None, 320, 640, 64) 36928 block1_conv1[0][0]
__________________________________________________________________________________________________
block1_pool (MaxPooling2D) (None, 160, 320, 64) 0 block1_conv2[0][0]
__________________________________________________________________________________________________
block2_conv1 (Conv2D) (None, 160, 320, 128 73856 block1_pool[0][0]
__________________________________________________________________________________________________
block2_conv2 (Conv2D) (None, 160, 320, 128 147584 block2_conv1[0][0]
__________________________________________________________________________________________________
block2_pool (MaxPooling2D) (None, 80, 160, 128) 0 block2_conv2[0][0]
__________________________________________________________________________________________________
block3_conv1 (Conv2D) (None, 80, 160, 256) 295168 block2_pool[0][0]
__________________________________________________________________________________________________
block3_conv2 (Conv2D) (None, 80, 160, 256) 590080 block3_conv1[0][0]
__________________________________________________________________________________________________
block3_conv3 (Conv2D) (None, 80, 160, 256) 590080 block3_conv2[0][0]
__________________________________________________________________________________________________
block3_pool (MaxPooling2D) (None, 40, 80, 256) 0 block3_conv3[0][0]
__________________________________________________________________________________________________
block4_conv1 (Conv2D) (None, 40, 80, 512) 1180160 block3_pool[0][0]
__________________________________________________________________________________________________
block4_conv2 (Conv2D) (None, 40, 80, 512) 2359808 block4_conv1[0][0]
__________________________________________________________________________________________________
block4_conv3 (Conv2D) (None, 40, 80, 512) 2359808 block4_conv2[0][0]
__________________________________________________________________________________________________
block4_pool (MaxPooling2D) (None, 20, 40, 512) 0 block4_conv3[0][0]
__________________________________________________________________________________________________
zero_padding2d (ZeroPadding2D) (None, 22, 42, 512) 0 block4_pool[0][0]
__________________________________________________________________________________________________
conv2d (Conv2D) (None, 20, 40, 512) 2359808 zero_padding2d[0][0]
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 20, 40, 512) 2048 conv2d[0][0]
__________________________________________________________________________________________________
up_sampling2d (UpSampling2D) (None, 40, 80, 512) 0 batch_normalization[0][0]
__________________________________________________________________________________________________
concatenate (Concatenate) (None, 40, 80, 768) 0 up_sampling2d[0][0]
block3_pool[0][0]
__________________________________________________________________________________________________
zero_padding2d_1 (ZeroPadding2D (None, 42, 82, 768) 0 concatenate[0][0]
__________________________________________________________________________________________________
conv2d_1 (Conv2D) (None, 40, 80, 256) 1769728 zero_padding2d_1[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 40, 80, 256) 1024 conv2d_1[0][0]
__________________________________________________________________________________________________
up_sampling2d_1 (UpSampling2D) (None, 80, 160, 256) 0 batch_normalization_1[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 80, 160, 384) 0 up_sampling2d_1[0][0]
block2_pool[0][0]
__________________________________________________________________________________________________
zero_padding2d_2 (ZeroPadding2D (None, 82, 162, 384) 0 concatenate_1[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D) (None, 80, 160, 128) 442496 zero_padding2d_2[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 80, 160, 128) 512 conv2d_2[0][0]
__________________________________________________________________________________________________
up_sampling2d_2 (UpSampling2D) (None, 160, 320, 128 0 batch_normalization_2[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate) (None, 160, 320, 192 0 up_sampling2d_2[0][0]
block1_pool[0][0]
__________________________________________________________________________________________________
zero_padding2d_3 (ZeroPadding2D (None, 162, 322, 192 0 concatenate_2[0][0]
__________________________________________________________________________________________________
seg_feats (Conv2D) (None, 160, 320, 64) 110656 zero_padding2d_3[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 160, 320, 64) 256 seg_feats[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D) (None, 160, 320, 50) 28850 batch_normalization_3[0][0]
__________________________________________________________________________________________________
reshape (Reshape) (None, 51200, 50) 0 conv2d_3[0][0]
__________________________________________________________________________________________________
activation (Activation) (None, 51200, 50) 0 reshape[0][0]
==================================================================================================
Total params: 12,350,642
Trainable params: 12,348,722
Non-trainable params: 1,920
__________________________________________________________________________________________________
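The parameter counts in the summary can be spot-checked by hand: a Conv2D layer has kernel_h × kernel_w × in_channels × out_channels weights plus one bias per filter, and BatchNormalization has 4 parameters per channel (gamma and beta are trainable; the moving mean and variance are not, which is where the 1,920 non-trainable parameters come from). A quick check:

```python
# Spot-check parameter counts from the model summary above.
def conv2d_params(k, c_in, c_out):
    # k*k*c_in weights per filter, c_out filters, plus one bias per filter
    return k * k * c_in * c_out + c_out

def batchnorm_params(channels):
    # gamma + beta (trainable) and moving mean + variance (non-trainable)
    return 4 * channels

print(conv2d_params(3, 3, 64))     # block1_conv1: 1792
print(conv2d_params(3, 512, 512))  # block4_conv2: 2359808
print(batchnorm_params(512))       # batch_normalization: 2048
# Non-trainable params: 2 per channel over the four BN layers
print(2 * (512 + 256 + 128 + 64))  # 1920
```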
Training for just 5 epochs gave loss: 0.2675 - accuracy: 0.9123, i.e. about 91% pixel accuracy, which looks very high.
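The five-epoch run above boils down to a single train() call in the library. A rough sketch following the repo's README (the dataset paths are the README's example paths, not something I verified here; it's wrapped in a function because actually running it needs the package, the prepped dataset, and a long training session):

```python
# Sketch only -- assumes the keras_segmentation package is installed and the
# repo's example dataset (dataset1/, with prepped images and annotations)
# has been downloaded.
def train_and_predict(model, test_image):
    model.train(
        train_images="dataset1/images_prepped_train/",
        train_annotations="dataset1/annotations_prepped_train/",
        checkpoints_path="/tmp/vgg_unet_1",
        epochs=5,
    )
    # Writes a colorized segmentation map for one test image to out.png.
    return model.predict_segmentation(inp=test_image, out_fname="/tmp/out.png")
```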
Feeding an image from the test folder gives the result below: the top image gets segmented into the bottom one, and if you pass in the labels it also shows which color corresponds to which label.
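The last three layers of the summary (conv2d_3 → reshape → activation) explain the raw output format: a softmax over the 50 classes for each of the 160 × 320 output pixels (half the 320 × 640 input resolution), flattened to (51200, 50). A minimal sketch of turning that output back into a per-pixel class map, using random numbers in place of real model output:

```python
import numpy as np

out_h, out_w, n_classes = 160, 320, 50  # from the summary above

# Stand-in for the model's softmax output, shape (51200, 50).
rng = np.random.default_rng(0)
probs = rng.random((out_h * out_w, n_classes))
probs /= probs.sum(axis=1, keepdims=True)

# Pick the most probable class per pixel and restore the 2D layout.
seg_map = probs.argmax(axis=1).reshape(out_h, out_w)
print(seg_map.shape)  # (160, 320)
```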
The results probably came out this well because it's a tutorial. Next time I want to experiment with data other than the provided set.
Finally, the blog post that walks through this GitHub repo:
https://divamgupta.com/image-segmentation/2019/06/06/deep-learning-semantic-segmentation-keras.html
A Beginner’s guide to Deep Learning based Semantic Segmentation using Keras