# U-Net: Semantic segmentation with PyTorch
<a href="#"><img src="https://img.shields.io/github/workflow/status/milesial/PyTorch-UNet/Publish%20Docker%20image?logo=github&style=for-the-badge" /></a>
<a href="https://hub.docker.com/r/milesial/unet"><img src="https://img.shields.io/badge/docker%20image-available-blue?logo=Docker&style=for-the-badge" /></a>
<a href="https://pytorch.org/"><img src="https://img.shields.io/badge/PyTorch-v1.9.0-red.svg?logo=PyTorch&style=for-the-badge" /></a>
<a href="#"><img src="https://img.shields.io/badge/python-v3.6+-blue.svg?logo=python&style=for-the-badge" /></a>
![input and output for a random image in the test dataset](https://i.imgur.com/GD8FcB7.png)

Customized implementation of the [U-Net](https://arxiv.org/abs/1505.04597) in PyTorch for Kaggle's [Carvana Image Masking Challenge](https://www.kaggle.com/c/carvana-image-masking-challenge), which works on high-definition images.
- [Quick start](#quick-start)
  - [Without Docker](#without-docker)
  - [With Docker](#with-docker)
- [Description](#description)
- [Usage](#usage)
  - [Docker](#docker)
  - [Training](#training)
  - [Prediction](#prediction)
- [Weights & Biases](#weights--biases)
- [Pretrained model](#pretrained-model)
- [Data](#data)
## Quick start
### Without Docker
1. [Install CUDA](https://developer.nvidia.com/cuda-downloads)
2. [Install PyTorch](https://pytorch.org/get-started/locally/)
3. Install dependencies
```bash
pip install -r requirements.txt
```
4. Download the data and run training:
```bash
bash scripts/download_data.sh
python train.py --amp
```
### With Docker
1. [Install Docker 19.03 or later](https://docs.docker.com/get-docker/):
```bash
curl https://get.docker.com | sh && sudo systemctl --now enable docker
```
2. [Install the NVIDIA container toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html):
```bash
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
```
3. [Download and run the image](https://hub.docker.com/repository/docker/milesial/unet):
```bash
sudo docker run --rm --shm-size=8g --ulimit memlock=-1 --gpus all -it milesial/unet
```
4. Download the data and run training:
```bash
bash scripts/download_data.sh
python train.py --amp
```
## Description
This model was trained from scratch with 5k images and scored a [Dice coefficient](https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient) of 0.988423 on over 100k test images.

It can easily be adapted to other tasks such as multiclass segmentation, portrait segmentation, or medical segmentation.
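For reference, the Dice coefficient of a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal pure-Python sketch for flat binary masks. The function name `dice_coefficient` and the `eps` smoothing term are illustrative assumptions, not necessarily how this repository computes the score.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int], eps: float = 1e-6) -> float:
    """Dice score 2*|A∩B| / (|A| + |B|) for two flat binary masks (values 0 or 1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps (an assumption of this sketch) keeps the score defined when both masks are empty
    return (2 * intersection + eps) / (total + eps)


pred = [1, 1, 0, 0]
target = [1, 0, 0, 0]
print(round(dice_coefficient(pred, target), 3))  # 2*1 / (2+1), about 0.667
```

A score of 1 means the two masks overlap perfectly, and a score near 0 means they barely overlap.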
## Usage
**Note: Use Python 3.6 or newer.**
### Docker
A Docker image containing the code and its dependencies is available on [DockerHub](https://hub.docker.com/repository/docker/milesial/unet).
You can pull it and open a shell in the container with the following command (requires [Docker >= 19.03](https://docs.docker.com/get-docker/)):

```console
docker run -it --rm --shm-size=8g --ulimit memlock=-1 --gpus all milesial/unet
```
### Training
```console
> python train.py -h
usage: train.py [-h] [--epochs E] [--batch-size B] [--learning-rate LR]
                [--load LOAD] [--scale SCALE] [--validation VAL] [--amp]

Train the UNet on images and target masks

optional arguments:
  -h, --help            show this help message and exit
  --epochs E, -e E      Number of epochs
  --batch-size B, -b B  Batch size
  --learning-rate LR, -l LR
                        Learning rate
  --load LOAD, -f LOAD  Load model from a .pth file
  --scale SCALE, -s SCALE
                        Downscaling factor of the images
  --validation VAL, -v VAL
                        Percent of the data that is used as validation (0-100)
  --amp                 Use mixed precision
```
By default, the `scale` is 0.5, so if you wish to obtain better results (but use more memory), set it to 1.
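As a rough illustration of the memory trade-off, the sketch below computes the resized dimensions and pixel count for a given scale. It assumes the image is resized to `int(scale * width)` by `int(scale * height)` and uses 1918×1280, the Carvana image resolution, as the example input. `scaled_size` is a hypothetical helper, not part of this repository.

```python
def scaled_size(width: int, height: int, scale: float) -> tuple:
    """Return (width, height) after downscaling by `scale` (assumed truncating resize)."""
    if not 0 < scale <= 1:
        raise ValueError("scale must be in (0, 1]")
    return int(scale * width), int(scale * height)


full = scaled_size(1918, 1280, 1.0)
half = scaled_size(1918, 1280, 0.5)
# The pixel count (and, roughly, activation memory) grows with the square of the scale
ratio = (full[0] * full[1]) / (half[0] * half[1])
print(half, ratio)  # (959, 640) 4.0
```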
Automatic mixed precision is also available with the `--amp` flag. [Mixed precision ](https://arxiv.org/abs/1710.03740 ) allows the model to use less memory and to be faster on recent GPUs by using FP16 arithmetic. Enabling AMP is recommended.
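To make the FP16 trade-off concrete, this standard-library sketch compares the storage size and rounding behaviour of half-precision and single-precision values. It only illustrates the number format and is not how `train.py` implements AMP; `to_fp16` is a hypothetical helper.

```python
import struct

# Bytes needed to store one value: 'e' is IEEE 754 half precision, 'f' is single precision
fp16_bytes = struct.calcsize("e")
fp32_bytes = struct.calcsize("f")
print(fp16_bytes, fp32_bytes)  # 2 4


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


# Half precision keeps only about three significant decimal digits
print(to_fp16(0.1))
# Very small values underflow to zero, which is why AMP implementations
# typically scale the loss before the backward pass
print(to_fp16(1e-8), to_fp16(1e-8 * 65536))
```

Halving the bytes per value is where the memory saving comes from; the reduced precision is the reason some operations are still kept in FP32 under mixed precision.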
### Prediction
After training your model and saving it to `MODEL.pth`, you can test the output masks on your images via the CLI.

To predict a single image and save the result:

`python predict.py -i image.jpg -o output.jpg`

To predict multiple images and show them without saving them:

`python predict.py -i image1.jpg image2.jpg --viz --no-save`

```console
> python predict.py -h
usage: predict.py [-h] [--model FILE] --input INPUT [INPUT ...]
                  [--output INPUT [INPUT ...]] [--viz] [--no-save]
                  [--mask-threshold MASK_THRESHOLD] [--scale SCALE]

Predict masks from input images

optional arguments:
  -h, --help            show this help message and exit
  --model FILE, -m FILE
                        Specify the file in which the model is stored
  --input INPUT [INPUT ...], -i INPUT [INPUT ...]
                        Filenames of input images
  --output INPUT [INPUT ...], -o INPUT [INPUT ...]
                        Filenames of output images
  --viz, -v             Visualize the images as they are processed
  --no-save, -n         Do not save the output masks
  --mask-threshold MASK_THRESHOLD, -t MASK_THRESHOLD
                        Minimum probability value to consider a mask pixel white
  --scale SCALE, -s SCALE
                        Scale factor for the input images
```
You can specify which model file to use with `--model MODEL.pth`.
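The `--mask-threshold` option can be pictured as a per-pixel comparison. This is a minimal sketch on a nested list of probabilities. `binarize_mask` is a hypothetical helper, and both the default of 0.5 and the strict `>` comparison are assumptions rather than the documented behaviour of `predict.py`.

```python
from typing import List


def binarize_mask(probs: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """Map each probability to 255 (white) if it exceeds `threshold`, else 0 (black)."""
    return [[255 if p > threshold else 0 for p in row] for row in probs]


probs = [[0.10, 0.70],
         [0.55, 0.40]]
print(binarize_mask(probs))       # [[0, 255], [255, 0]]
print(binarize_mask(probs, 0.6))  # [[0, 255], [0, 0]]
```

Raising the threshold makes the mask more conservative: fewer pixels are marked white.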
## Weights & Biases
Training progress can be visualized in real time using [Weights & Biases](https://wandb.ai/). Loss curves, validation curves, weight and gradient histograms, and predicted masks are logged to the platform.

When you launch a training run, a link is printed in the console. Click on it to go to your dashboard. If you have an existing W&B account, you can link it by setting the `WANDB_API_KEY` environment variable.
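One way to set the variable for a single training run is to pass a modified environment to a child process. The sketch below only builds that environment. `training_env` is a hypothetical helper and `YOUR_API_KEY` is a placeholder for the key from your own account.

```python
import os


def training_env(api_key: str) -> dict:
    """Copy the current environment and add the W&B API key for a child process."""
    env = dict(os.environ)
    env["WANDB_API_KEY"] = api_key
    return env


env = training_env("YOUR_API_KEY")  # placeholder value, not a real key
print("WANDB_API_KEY" in env)  # True
```

The returned mapping can be passed as `env=` to `subprocess.run` when launching `train.py`. Exporting the variable in your shell before running the script has the same effect.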
## Pretrained model
A [pretrained model](https://github.com/milesial/Pytorch-UNet/releases/tag/v2.0) is available for the Carvana dataset. It can also be loaded from `torch.hub`:

```python
import torch
net = torch.hub.load('milesial/Pytorch-UNet', 'unet_carvana', pretrained=True)
```
The model was trained at 50% scale with bilinear upsampling.
## Data
The Carvana data is available on the [Kaggle website](https://www.kaggle.com/c/carvana-image-masking-challenge/data).

You can also download it using the helper script:

```bash
bash scripts/download_data.sh
```

The input images and target masks should be in the `data/imgs` and `data/masks` folders respectively. Because the data loader is greedy, these two folders must not contain any sub-folders or other files. For Carvana, images are RGB and masks are black and white.
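Before training, it can help to verify that both folders are flat. The following standard-library sketch reports sub-folders and hidden files. `check_flat_folder` is a hypothetical helper, and treating hidden files (such as `.DS_Store`) as stray entries is an assumption of this sketch.

```python
from pathlib import Path
from typing import List


def check_flat_folder(folder: str) -> List[str]:
    """Return a list of problems found in a folder that should hold only regular files."""
    root = Path(folder)
    if not root.is_dir():
        return [f"{folder} is missing or not a directory"]
    problems = []
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            problems.append(f"unexpected sub-folder: {entry.name}")
        elif entry.name.startswith("."):
            # Hidden files are a common source of stray entries
            problems.append(f"unexpected hidden file: {entry.name}")
    return problems


for name in ("data/imgs", "data/masks"):
    print(name, check_flat_folder(name) or "OK")
```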
You can use your own dataset as long as you make sure it is loaded properly in `utils/data_loading.py`.

---

Original paper by Olaf Ronneberger, Philipp Fischer, Thomas Brox:

[U-Net: Convolutional Networks for Biomedical Image Segmentation](https://arxiv.org/abs/1505.04597)

![network architecture](https://i.imgur.com/jeDVpqF.png)