---
marp: true
paginate: true
author: Laurent Fainsin, Damien Guillotin, Pierre-Eliot Jourdan
math: katex
---

<style>
section::after {
  /* custom pagination */
  content: attr(data-marpit-pagination) ' / ' attr(data-marpit-pagination-total);
}
</style>

<style scoped>
h1, h2 {
  color: white;
}
</style>

# IAM Project
## SimCLR + SGAN

![bg 100%](https://images.unsplash.com/photo-1600174097100-3f347cf15996?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8)

<footer>

Laurent Fainsin, Damien Guillotin, Pierre-Eliot Jourdan

</footer>

---

<header>

# Subject

</header>

<style scoped>
table, td, th, tr {
  border: none !important;
  border-collapse: collapse !important;
  border-style: none !important;
  background-color: unset !important;
  overflow: hidden;
  margin: auto;
  text-align: center;
}
</style>

Animal images $\rightarrow$ 18 different classes

|     |     |     |
| :-: | :-: | :-: |
| ![height:225px](https://github.com/axelcarlier/projsemisup/blob/master/Lab/Chacal/23954942153_7c3b7c0ec5_c.jpg?raw=true) | ![height:225px](https://github.com/axelcarlier/projsemisup/blob/master/Lab/Z%C3%A8bre/51854102817_e3ae6af27f_c.jpg?raw=true) | ![height:225px](https://github.com/axelcarlier/projsemisup/blob/master/Lab/autruche/48114752957_be666e72ca_c.jpg?raw=true) |
| ![height:225px](https://github.com/axelcarlier/projsemisup/blob/master/Lab/Girafe/19646362821_2cda943958_c.jpg?raw=true) | ![height:225px](https://github.com/axelcarlier/projsemisup/blob/master/Lab/Lion/51110276872_152f4fdf38_c.jpg?raw=true) | ![height:225px](https://github.com/axelcarlier/projsemisup/blob/master/Lab/Gnou/6967679426_ce23f4fef3_c.jpg?raw=true) |

---

<header>

# Subject

</header>

### Dataset
- Labeled data $\rightarrow$ 20 images/class $\rightarrow$ 360 images
- Unlabeled data $\rightarrow$ 2000 images
- Test data $\rightarrow$ 100 images/class $\rightarrow$ 1800 images

### Model
- Input $\rightarrow$ 128x128 px
- Network $\rightarrow$ [MobileNetV1](https://www.tensorflow.org/api_docs/python/tf/compat/v1/keras/applications/mobilenet)

<!-- The constraints we had to follow. A minimal encoder sketch is given below. -->
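
As a rough sketch of this setup (illustrative, not the project's exact code): a MobileNetV1 backbone with a 128x128 input wrapped into a Keras encoder. The 128-dimensional latent `width` and the from-scratch `weights=None` choice are assumptions taken from the speaker notes.

```python
import tensorflow as tf

def build_encoder(width: int = 128) -> tf.keras.Model:
    """MobileNetV1 backbone without its classifier, pooled into a latent vector."""
    backbone = tf.keras.applications.MobileNet(
        input_shape=(128, 128, 3),
        include_top=False,   # drop the ImageNet classification head
        weights=None,        # assumed: trained from scratch on the animal images
        pooling="avg",       # global average pooling -> one vector per image
    )
    # Assumed projection of the pooled features down to the latent `width`.
    return tf.keras.Sequential(
        [backbone, tf.keras.layers.Dense(width, activation="relu")],
        name="encoder",
    )
```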

---

<header>

# Contrastive method <br> (SimCLR)

</header>

![bg 50%](https://camo.githubusercontent.com/5ab5e0c019cdd8129b4450539231f34dc028c0cd64ba5d50db510d1ba2184160/68747470733a2f2f312e62702e626c6f6773706f742e636f6d2f2d2d764834504b704539596f2f586f3461324259657276492f414141414141414146704d2f766146447750584f79416f6b4143385868383532447a4f67457332324e68625877434c63424741735948512f73313630302f696d616765342e676966)

<!--

augmentations: see the next slide
CNN (the same in both columns): MobileNetV1
representation: latent space of size `width` = 128
MLP: multi-layer perceptron (projection head) -> output size = `width` = 128
linear probe (not shown) -> input: latent representation -> dense layer -> output size = len(labels) = 18

During training, we encode our images into the latent space (the `representation`). Then two things happen:
1. We project the latent space through the MLP and compute the contrastive loss (which takes care of the attract/repel behaviour).
2. We predict a label with the linear probe on the latent `representation` and compute a loss with SparseCategoricalCrossentropy.

Once we have our two losses (and the gradients from the forward pass), we backpropagate everything (i.e. we update the parameters) and move on to the next batch. A sketch of one such training step follows after this note.

Note that the MLP is not used at inference time, so it can be discarded.

-->
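
To make the training step described in the note concrete, here is a minimal `tf.keras` sketch; `encoder`, `projection_head`, `linear_probe`, `augment` and `nt_xent_loss` are assumed objects with illustrative names, not the project's code.

```python
import tensorflow as tf

# Sketch of one semi-supervised training step as described in the note above.
probe_loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

def train_step(optimizer, encoder, projection_head, linear_probe,
               augment, nt_xent_loss, images, labeled_images, labels):
    with tf.GradientTape() as tape:
        # Encode two augmented views of the same batch into the latent space.
        h1 = encoder(augment(images), training=True)
        h2 = encoder(augment(images), training=True)
        # 1. Project through the MLP and compute the contrastive (attract/repel) loss.
        contrastive_loss = nt_xent_loss(projection_head(h1, training=True),
                                        projection_head(h2, training=True))
        # 2. Linear probe on the latent representation of the labeled images.
        logits = linear_probe(encoder(labeled_images, training=True), training=True)
        probe_loss = probe_loss_fn(labels, logits)
        total_loss = contrastive_loss + probe_loss
    # Backpropagate both losses and update every parameter at once.
    variables = (encoder.trainable_variables
                 + projection_head.trainable_variables
                 + linear_probe.trainable_variables)
    optimizer.apply_gradients(zip(variables, tape.gradient(total_loss, variables)))
    return contrastive_loss, probe_loss
```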

---

<header>

# Augmentations

</header>

<div style="display: flex; align-items: center;">

![width:800](https://1.bp.blogspot.com/-bO6c2IGpXDY/Xo4cR6ebFUI/AAAAAAAAFpo/CPVNlMP08hUfNPHQb2tKeHju4Y_UsNzegCLcBGAsYHQ/s640/image3.png)

![width:400](https://1.bp.blogspot.com/-ZzzYCgg9g0s/Xo4bo4oj7bI/AAAAAAAAFpc/W-LAIS28d1sJ3-KETCXlaxvLKlS_KG8-QCLcBGAsYHQ/s320/image1.png)

</div>

<!-- Many augmentations are possible, but Google finds that the only ones really needed for good performance are cropping (a special variant, see the figure on the right) and color jittering. A sketch of such a pipeline follows below. -->
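
A minimal sketch of such an augmentation pipeline in `tf.keras`, assuming a fixed-size random crop (the SimCLR paper uses a random resized crop) and hand-picked jitter strengths; all values here are illustrative.

```python
import tensorflow as tf

# Random crop resized back to 128x128, plus a horizontal flip.
random_crop = tf.keras.Sequential([
    tf.keras.layers.RandomCrop(96, 96),    # keep a random 96x96 patch...
    tf.keras.layers.Resizing(128, 128),    # ...and resize it back to the input size
    tf.keras.layers.RandomFlip("horizontal"),
])

def color_jitter(images: tf.Tensor) -> tf.Tensor:
    """Random brightness / contrast / saturation / hue on a [0, 1] float batch."""
    images = tf.image.random_brightness(images, max_delta=0.3)
    images = tf.image.random_contrast(images, lower=0.7, upper=1.3)
    images = tf.image.random_saturation(images, lower=0.7, upper=1.3)
    images = tf.image.random_hue(images, max_delta=0.1)
    return tf.clip_by_value(images, 0.0, 1.0)

def augment(images: tf.Tensor) -> tf.Tensor:
    # Compose the two families of augmentations discussed on this slide.
    return color_jitter(random_crop(images, training=True))
```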

---

<header>

# Contrastive method <br> (SimCLR)

</header>

![bg 50%](https://miro.medium.com/max/720/1*E6UUEmxKp5ZTRgCRNbIP-g.webp)

<!--

The intuition behind it:

In the supervised setting it is very easy to find the classification boundaries: we just compute the loss and the gradient moves them into place. Here we do not have that luxury, so we start from the principle that an image, even after augmentation, should land at roughly the same spot in the latent space, and that is exactly what the contrastive loss (next slide) tries to enforce. On top of that, we sprinkle in a few supervised images to "regularize" the whole thing.

-->
---

<header>

# Contrastive loss

</header>

$$l_{i,j} = -\log \frac{ \exp( \text{sim}(z_i, z_j) / \tau ) }{ \sum^{2N}_{k=1,\, k \neq i} \exp( \text{sim}(z_i, z_k) / \tau ) }$$

<!--

sim -> cosine similarity
+ basically a softmax
+ the whole thing inside a log

tau -> temperature (a hyperparameter)

A sketch of this NT-Xent loss follows below.

-->
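
For reference, a compact TensorFlow sketch of this NT-Xent loss, consistent with the formula above but not taken from the project's code; the `temperature` default and the masking constant are assumptions.

```python
import tensorflow as tf

def nt_xent_loss(z1: tf.Tensor, z2: tf.Tensor, temperature: float = 0.1) -> tf.Tensor:
    """NT-Xent loss for two batches of projections z1, z2 of shape (N, dim)."""
    n = tf.shape(z1)[0]
    # Cosine similarity = dot product of L2-normalized vectors.
    z = tf.math.l2_normalize(tf.concat([z1, z2], axis=0), axis=1)   # (2N, dim)
    sim = tf.matmul(z, z, transpose_b=True) / temperature           # (2N, 2N)
    # Remove self-similarity from the denominator (the k != i condition).
    sim = sim - 1e9 * tf.eye(2 * n)
    # The positive of sample i is its other augmented view: i <-> i + N.
    labels = tf.concat([tf.range(n, 2 * n), tf.range(n)], axis=0)   # (2N,)
    # Softmax + log over each row, averaged over the 2N anchors.
    loss = tf.keras.losses.sparse_categorical_crossentropy(
        labels, sim, from_logits=True)
    return tf.reduce_mean(loss)
```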

---

<header>

# Fully-supervised results

</header>

![bg 98%](assets/baseline_accuracy_simclr.png)
![bg 98%](assets/baseline_loss_simclr.png)

---

<header>

# Semi-supervised results

</header>

![bg 98%](assets/accuracy_simclr.png)
![bg 98%](assets/loss_simclr.png)

---

<header>

# Supervised fine-tuning results

</header>

![bg 98%](assets/finetuning_accuracy_simclr.png)
![bg 98%](assets/finetuning_loss_simclr.png)

---

<header>

# Comparison of results

</header>

<style scoped>
img {
  margin: auto;
  margin-top: 2.5rem;
  display: block;
}
</style>

![height:620px](assets/comp_accuracy_simclr.png)

---

<header>

# Generative method (SGAN)

</header>

![bg 60%](https://miro.medium.com/max/640/1*Grve_j-Mv4Jgmtq3u7yKyQ.webp)

---

<header>

# SGAN architecture

</header>

![bg 110%](https://cdn.discordapp.com/attachments/953586522572066826/1068158716945379358/Screenshot_from_2023-01-26_14-04-44.png)

<!-- Generator: 3,425,155 parameters -->

![bg 100%](https://cdn.discordapp.com/attachments/953586522572066826/1068157883717517382/a_1.png)

<!-- Discriminator: 710,930 parameters (708,737). A sketch of the two discriminator heads follows below. -->
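
As an illustration of the semi-supervised discriminator idea, here is a hedged sketch with assumed layer sizes, not necessarily the exact architecture shown in the figure: a shared convolutional trunk with a supervised 18-class head and an unsupervised real/fake head. The class head is trained with sparse categorical cross-entropy on labeled batches, the real/fake head with binary cross-entropy on real and generated batches.

```python
import tensorflow as tf

def build_sgan_discriminator(num_classes: int = 18) -> tf.keras.Model:
    """Shared conv trunk with a supervised class head and a real/fake head."""
    inputs = tf.keras.Input(shape=(128, 128, 3))
    x = inputs
    # Assumed trunk: a small stack of strided convolutions.
    for filters in (32, 64, 128, 256):
        x = tf.keras.layers.Conv2D(filters, 3, strides=2, padding="same")(x)
        x = tf.keras.layers.LeakyReLU(0.2)(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    # Supervised head: class probabilities, trained on the labeled real images.
    class_output = tf.keras.layers.Dense(num_classes, activation="softmax",
                                         name="class_output")(x)
    # Unsupervised head: real vs. generated, trained on all real + fake images.
    real_fake_output = tf.keras.layers.Dense(1, activation="sigmoid",
                                             name="real_fake_output")(x)
    return tf.keras.Model(inputs, [class_output, real_fake_output],
                          name="sgan_discriminator")
```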

---

<header>

# Generator results

</header>

<style scoped>
table, td, th, tr {
  border: none !important;
  border-collapse: collapse !important;
  border-style: none !important;
  background-color: unset !important;
  overflow: hidden;
  margin: auto;
  text-align: center;
}
</style>

|                                                   | 5 epochs | 100 epochs |
| :-----------------------------------------------: | :------: | :--------: |
| <p style="font-size:1.5rem">no pre-training</p>   | ![](https://cdn.discordapp.com/attachments/953586522572066826/1068113375474753566/download4.png) | ![](https://cdn.discordapp.com/attachments/953586522572066826/1065674997672190082/download2.png) |
| <p style="font-size:1.5rem">with pre-training</p> | ![](https://cdn.discordapp.com/attachments/953586522572066826/1068115321807962122/image.png) | ![](https://cdn.discordapp.com/attachments/953586522572066826/1067144843610034227/download3.png) |

---

<header>

# Fully-supervised results

</header>

<style scoped>
img {
  margin: auto;
  margin-top: 2.5rem;
  display: block;
}
</style>

![height:620px](assets/baseline_accuracy_sgan.png)

---

<header>

# Semi-supervised results

</header>

<style scoped>
img {
  margin: auto;
  margin-top: 2.5rem;
  display: block;
}
</style>

![height:620px](assets/accuracy_sgan.png)

---

<header>

# Pre-training results

</header>

<style scoped>
img {
  margin: auto;
  margin-top: 2.5rem;
  display: block;
}
</style>

![height:620px](assets/pretrain_accuracy_sgan.png)

---

<header>

# Comparison of results

</header>

<style scoped>
img {
  margin: auto;
  margin-top: 2.5rem;
  display: block;
}
</style>

![height:620px](assets/comp_accuracy_sgan.png)