
# Pytorch-UNet
![input and output for a random image in the test dataset](https://framapic.org/OcE8HlU6me61/KNTt8GFQzxDR.png)


Customized implementation of the [U-Net](https://arxiv.org/pdf/1505.04597.pdf) in PyTorch for Kaggle's [Carvana Image Masking Challenge](https://www.kaggle.com/c/carvana-image-masking-challenge), working from high-definition images. It was used with only one output class, but it can easily be extended to more.

This model was trained from scratch with 5000 images (no data augmentation) and scored a [Dice coefficient](https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient) of 0.988423 (511 out of 735) on over 100k test images. This score is not great, but it could be improved with more training, data augmentation, fine-tuning, CRF post-processing, and giving more weight to the edges of the masks.
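
For reference, the Dice coefficient of two binary masks A and B is 2|A ∩ B| / (|A| + |B|). Below is a minimal pure-Python sketch of that formula. The function name `dice_coefficient`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; this is not the evaluation code used by Kaggle or by this repository.

```python
def dice_coefficient(pred, target):
    """Dice = 2*|A ∩ B| / (|A| + |B|) for two binary masks.

    Masks are nested lists of 0/1 values (hypothetical format for this sketch).
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels set in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    # Pixels set in each mask, summed.
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)

    if total == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


predicted = [[1, 1], [0, 0]]
truth = [[1, 0], [0, 0]]
score = dice_coefficient(predicted, truth)  # one shared pixel, three set pixels in total
```

In practice the same computation would be done on tensors rather than lists; the lists only keep the sketch dependency-free.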

The model used for the last submission is stored in the `MODEL.pth` file, if you wish to play with it. The data is available on the [Kaggle website](https://www.kaggle.com/c/carvana-image-masking-challenge/data).

## Usage

### Prediction

You can easily test the output masks on your images via the CLI.

To see all options:
`python predict.py -h`

To predict a single image and save it:

`python predict.py -i image.jpg -o output.jpg`

To predict multiple images and show them without saving them:

`python predict.py -i image1.jpg image2.jpg --viz --no-save`

You can use the CPU-only version with `--cpu`.

You can specify which model file to use with `--model MODEL.pth`.

### Training

`python train.py -h` should get you started. A proper CLI is yet to be added.

## Warning
To process an image, it is split into two squares (a left one and a right one), and each square is passed through the net. The two square masks are then merged to produce the final mask. As a consequence, the height of the image must be strictly greater than half the width. Make sure the width is even too.
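
The size constraint can be illustrated with a small sketch. It computes the two square crop boxes for a given image size and stitches one row of the two square masks back together at the vertical midline. The function names, the `(left, upper, right, lower)` box ordering (the same ordering Pillow's `Image.crop` takes), the midline merge rule, and the extra `height <= width` check are all assumptions of this sketch, not necessarily what `predict.py` does.

```python
def square_crop_boxes(width, height):
    """Return (left_box, right_box), each as (left, upper, right, lower)."""
    if width % 2 != 0:
        raise ValueError("width must be even")
    if not height > width / 2:
        raise ValueError("height must be strictly greater than half the width")
    if height > width:
        # Sketch-only assumption: landscape or square images.
        raise ValueError("this sketch assumes height <= width")
    left_box = (0, 0, height, height)               # square flush with the left edge
    right_box = (width - height, 0, width, height)  # square flush with the right edge
    return left_box, right_box


def merge_rows(left_row, right_row, width):
    """Merge one row of the two square masks into a full-width row.

    Columns left of the midline come from the left square, the rest from
    the right square (an assumed merge rule for illustration).
    """
    half = width // 2
    side = len(left_row)             # side of each square, equal to the image height
    right_offset = width - side      # image column where the right square starts
    return left_row[:half] + right_row[half - right_offset:]


left_box, right_box = square_crop_boxes(1918, 1280)

# Use column indices as fake mask values to check that the merge lines up.
merged = merge_rows(list(range(1280)), list(range(638, 1918)), 1918)
```

Because the height exceeds half the width, the two squares overlap in the middle, so every column of the image is covered by at least one square.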

## Dependencies
This package depends on [pydensecrf](https://github.com/lucasb-eyer/pydensecrf), available via `pip install`.

## Notes on memory

The model has been trained from scratch on a GTX970M 3GB.
Predicting images of 1918*1280 takes 1.5GB of memory.
Training takes approximately 3GB, so if you are a few MB shy of memory, consider turning off all graphical displays.
This assumes you use bilinear up-sampling, and not transposed convolution in the model.