more shit

This commit is contained in:
Laureηt 2023-01-29 23:18:17 +01:00
parent b8940f367d
commit 7e8f1fe174
Signed by: Laurent
SSH key fingerprint: SHA256:kZEpW8cMJ54PDeCvOhzreNr4FSh6R13CMGH/POoO8DI
6 changed files with 247 additions and 63 deletions

BIN assets/inp_n7.jpg: new binary file (63 KiB), not shown.

View file

@@ -10,6 +10,7 @@
\usepackage{graphicx}
\usepackage{microtype}
\usepackage{amsmath}
\usepackage[numbers]{natbib}
% pdfx loads both hyperref and xcolor internally
% \usepackage{hyperref}
@@ -24,13 +25,35 @@
\graphicspath{{../assets/}}
\usepackage{lastpage}
\usepackage{fancyhdr}
\pagestyle{fancy}
\renewcommand{\headrulewidth}{0pt}
\fancyhead{}
\cfoot{}
\rfoot{\hypersetup{hidelinks}\thepage/\pageref{LastPage}}
\title{
\vspace{5cm}
\textbf{Bibliographie de projet long}
}
\author{Laurent Fainsin}
\date{
\vspace{10cm}
Département Sciences du Numérique \\
Troisième année \\
2022--2023
}
\begin{document}
\title{"Projet Long" Bibliography}
\author{Laurent Fainsin}
\date{2023-01-24}
\begin{figure}[t]
\centering
\includegraphics[width=5cm]{inp_n7.jpg}
\end{figure}
\maketitle
\thispagestyle{empty}
\newpage
{
\hypersetup{hidelinks}
@@ -87,7 +110,7 @@ In~\cite{spheredetect}, it is explained that obtaining clean photographs with sp
\label{fig:spheredetect_dataset}
\end{figure}
Additionally, synthetic images of chrome spheres can also be generated using free (CC0 1.0 Universal Public Domain Dedication) environment maps from PolyHaven~\cite{haven_hdris_nodate}. These environment maps provide a wide range of realistic lighting conditions and can be used to simulate different lighting scenarios, such as different times of day, weather conditions, or indoor lighting setups. This can help to further increase the diversity of the dataset and make the model more robust to different lighting conditions, which is crucial for the task of detecting chrome sphere markers.
Additionally, synthetic images of chrome spheres can be generated using free (CC0 1.0 Universal Public Domain Dedication) environment maps from PolyHaven~\cite{polyhaven}. These environment maps cover a wide range of realistic lighting conditions and can be used to simulate different scenarios, such as different times of day, weather conditions, or indoor lighting setups. This further increases the diversity of the dataset and makes the model more robust to varying illumination, which is crucial for the task of detecting chrome sphere markers.
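As a sketch of how such a synthetic image could be produced, a perfect mirror sphere can be ray-traced analytically from an environment map. The HDRI file name below is only an example, the camera is assumed orthographic, and equirectangular axis conventions vary between maps:

# Sketch: render a synthetic chrome-sphere image from an equirectangular HDRI
# (e.g. a CC0 environment map downloaded from PolyHaven). The file name is an
# example; the sphere is treated as a perfect mirror.
import cv2
import numpy as np

env = cv2.imread("studio_small_08_2k.hdr", cv2.IMREAD_UNCHANGED)  # float32, H x W x 3
H, W = env.shape[:2]

size = 512
ys, xs = np.mgrid[-1:1:size * 1j, -1:1:size * 1j]
mask = xs**2 + ys**2 <= 1.0                      # pixels covered by the sphere

# Normal of the unit sphere as seen by an orthographic camera on the +z axis.
nz = np.sqrt(np.clip(1.0 - xs**2 - ys**2, 0.0, 1.0))
n = np.stack([xs, -ys, nz], axis=-1)

# Mirror-reflect the view ray d = (0, 0, -1): r = d - 2 (d . n) n
d = np.array([0.0, 0.0, -1.0])
r = d - 2.0 * (n @ d)[..., None] * n

# Equirectangular longitude/latitude lookup (axis conventions vary per map).
u = ((np.arctan2(r[..., 0], -r[..., 2]) / (2 * np.pi) + 0.5) * (W - 1)).astype(int)
v = ((0.5 - np.arcsin(np.clip(r[..., 1], -1.0, 1.0)) / np.pi) * (H - 1)).astype(int)

out = np.zeros((size, size, 3), dtype=env.dtype)
out[mask] = env[v[mask], u[mask]]
cv2.imwrite("chrome_sphere.png", np.clip(out * 255, 0, 255).astype(np.uint8))  # crude tonemap

Compositing such spheres onto real backgrounds, with varied sizes and positions, would then yield labelled training images at essentially no annotation cost.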
\subsection{Antoine Laurent}
@@ -138,9 +161,9 @@ In the paper "A Dataset of Multi-Illumination Images in the Wild"~\cite{murmann_
\subsection{Labelling \& Versionning}
Label Studio~\cite{noauthor_label_nodate} is an open source web-based annotation tool that allows multiple annotators to label data simultaneously and provides a user-friendly interface for creating annotation tasks. It also enables to manage annotation projects, assign tasks to different annotators, and view the progress of the annotation process. It also allows to version the data and can handle different annotation formats.
Label Studio~\cite{Label_Studio} is an open-source, web-based annotation tool that allows multiple annotators to label data simultaneously and provides a user-friendly interface for creating annotation tasks. It also makes it possible to manage annotation projects, assign tasks to different annotators, and monitor the progress of the annotation process, and it supports data versioning and several annotation formats.
The output of such annotators can be integrated with HuggingFace Datasets~\cite{noauthor_datasets_nodate} library, which allows to load, preprocess, share and version datasets, and easily reproduce experiments. This library has built-in support for a wide range of datasets and can handle different file formats, making it easy to work with data from multiple sources. By integrating these tools, one can have a powerful pipeline for annotation, versioning, and sharing datasets, which can improve reproducibility and collaboration in computer vision research and development.
The output of such annotators can be integrated with the HuggingFace Datasets~\cite{lhoest-etal-2021-datasets} library, which makes it possible to load, preprocess, share, and version datasets, and to reproduce experiments easily. The library has built-in support for a wide range of datasets and handles many file formats, making it easy to work with data from multiple sources. By combining these tools, one obtains a powerful pipeline for annotating, versioning, and sharing datasets, which improves reproducibility and collaboration in computer vision research and development.
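As a sketch of that pipeline (the file paths, column names, and Hub repository id below are hypothetical), annotation exports can be wrapped into a dataset, saved as a reproducible snapshot, and shared:

# Sketch: turn annotation exports into a versioned HuggingFace dataset.
# Paths, column names, and the repository id are hypothetical.
from datasets import Dataset, Image

annotations = [
    {"image": "data/img_0001.jpg", "mask": "data/mask_0001.png"},
    {"image": "data/img_0002.jpg", "mask": "data/mask_0002.png"},
]

ds = Dataset.from_list(annotations)
ds = ds.cast_column("image", Image())        # lazily decode image files
ds = ds.cast_column("mask", Image())

ds.save_to_disk("spheres_v1")                # local, reproducible snapshot
# ds.push_to_hub("username/chrome-spheres")  # share (and version) on the Hub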
\section{Models}
@@ -226,7 +249,7 @@ DINO (DETR with Improved deNoising anchOr boxes)~\cite{zhang_dino_2022} is a sta
\subsection{Mask2Former}
Mask2Former~\cite{cheng_masked-attention_2022} is a novel method for instance segmentation. It represents a transformer-based approach that leverages the strengths of the Transformer architecture to perform instance segmentation in a direct and simple manner. The main idea behind Mask2Former is to treat instance segmentation as a direct prediction problem, where the goal is to predict a set of instance masks directly from an input image. Unlike traditional instance segmentation methods that require multiple stages and hand-designed components, such as anchor generation, Non-maximum suppression, or post-processing steps, Mask2Former streamlines the instance segmentation pipeline.
Mask2Former~\cite{cheng_masked-attention_2022} is a recent development in object detection and instance segmentation. It leverages the strengths of two popular families of models in this field: transformer-based architectures, such as DETR, and fully convolutional networks (FCNs), like Mask R-CNN.
\begin{figure}[ht]
\centering
@@ -235,30 +258,30 @@ Mask2Former~\cite{cheng_masked-attention_2022} is a novel method for instance se
\label{fig:mask2former}
\end{figure}
Mask2Former uses a set-based loss function and a transformer encoder-decoder architecture to perform instance segmentation. Given a fixed set of instance queries, Mask2Former uses its encoder to extract features from the input image and the decoder to directly output the final set of instance masks. The set-based loss function enforces unique predictions and ensures that the output masks are well-formed and accurate. The use of the transformer architecture in Mask2Former enables it to effectively model the relations between the instances and the image context, leading to improved instance segmentation performance.
Similar to DETR, Mask2Former views object detection as a direct set prediction problem, streamlining the detection pipeline and removing the need for hand-designed components like non-maximum suppression and anchor generation. Unlike DETR, however, Mask2Former also uses a fully convolutional network to perform instance segmentation, outputting a mask for each detected object. This combination of a transformer-based architecture and an FCN provides a balance between the speed and accuracy of both models.
Overall, Mask2Former offers a simple and effective approach to instance segmentation that can achieve state-of-the-art performance on standard instance segmentation benchmarks. Its direct and efficient pipeline makes it well-suited for real-world applications, and its ability to leverage the strengths of the Transformer architecture makes it an attractive choice for researchers and practitioners alike.
Compared to Mask R-CNN, Mask2Former has a simpler architecture, with fewer components and a more straightforward pipeline. This simplicity leads to improved efficiency, making Mask2Former a good choice for real-time applications. The use of a transformer-based architecture also provides an advantage in handling complex scenes, where objects may have arbitrary shapes and sizes.
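A minimal inference sketch using the Mask2Former implementation from HuggingFace Transformers follows; the checkpoint name refers to one of the publicly released COCO instance models, the input photograph is assumed to exist, and exact processor method names may vary with the library version:

# Sketch: instance segmentation with a pretrained Mask2Former checkpoint.
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

ckpt = "facebook/mask2former-swin-tiny-coco-instance"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = Mask2FormerForUniversalSegmentation.from_pretrained(ckpt)

image = Image.open("photo.jpg")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One binary mask and one label per detected instance.
result = processor.post_process_instance_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
print(result["segmentation"].shape, result["segments_info"])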
\section{Training}
For the training process, we plan to utilize PyTorch Lightning, a high-level library for PyTorch, and the HuggingFace Transformers library for our transformer model. The optimizer we plan to use is AdamW, a variation of the Adam optimizer that is well-suited for training deep learning models. We aim to ensure reproducibility by using Nix for our setup. The development environment will be in Visual Studio Code and we will use Poetry for managing Python dependencies. This combination of tools is expected to streamline the training process and ensure reliable results.
For the training process, we plan to use PyTorch Lightning~\cite{Falcon_PyTorch_Lightning_2019}, a high-level library for PyTorch~\cite{NEURIPS2019_9015}, and the HuggingFace Transformers~\cite{wolf-etal-2020-transformers} library for our transformer model. The optimizer we plan to use is AdamW~\cite{loshchilov_decoupled_2019}, a variant of the Adam~\cite{kingma_adam_2017} optimizer whose decoupled weight decay makes it well suited to training deep learning models. We aim to ensure reproducibility by using Nix for our environment setup and Poetry for managing Python dependencies. This combination of tools is expected to streamline the training process and ensure reliable results.
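As a sketch of this setup (the backbone, loss, and hyperparameters are placeholders, not our final choices), the optimizer choice reduces to a single hook in a LightningModule:

# Sketch: minimal PyTorch Lightning training setup with AdamW.
import torch
import pytorch_lightning as pl

class SphereDetector(pl.LightningModule):
    def __init__(self, model: torch.nn.Module, lr: float = 1e-4):
        super().__init__()
        self.model = model
        self.lr = lr

    def training_step(self, batch, batch_idx):
        images, targets = batch
        loss = self.model(images, targets)   # assumes the model returns a loss
        self.log("train/loss", loss)
        return loss

    def configure_optimizers(self):
        # AdamW decouples weight decay from the gradient update.
        return torch.optim.AdamW(self.parameters(), lr=self.lr, weight_decay=1e-2)

# trainer = pl.Trainer(max_epochs=10)
# trainer.fit(SphereDetector(my_model), train_dataloader)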
\subsection{Loss functions}
\subsection{Metrics}
pytorch metrics
TorchMetrics~\cite{TorchMetrics_2022}
Dice
IoU
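A minimal sketch of computing these metrics with TorchMetrics, using dummy binary masks (the exact class names and signatures depend on the TorchMetrics version):

# Sketch: Dice and IoU for binary segmentation masks with TorchMetrics.
import torch
from torchmetrics import Dice, JaccardIndex

dice = Dice()                                # 2*|A ∩ B| / (|A| + |B|)
iou = JaccardIndex(task="binary")            # |A ∩ B| / |A ∪ B|

preds = torch.randint(0, 2, (4, 256, 256))   # dummy predicted masks
target = torch.randint(0, 2, (4, 256, 256))  # dummy ground-truth masks

print(dice(preds, target), iou(preds, target))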
\subsection{Experiment tracking}
To keep track of our experiments and their results, we will utilize Weights \& Biases (W\&B) and Aim. W\&B is a popular experiment tracking tool that provides a simple interface for logging and visualizing metrics, models, and artifacts. Aim is a collaborative machine learning platform that provides a unified way to track, compare, and explain experiments across teams and tools. By utilizing these tools, we aim to efficiently track our experiments and compare results. This will allow us to make data-driven decisions and achieve better results if we have enough time.
To keep track of our experiments and their results, we will use Weights \& Biases (W\&B)~\cite{wandb} and Aim~\cite{Arakelyan_Aim_2020}. W\&B is a popular experiment-tracking tool that provides a simple interface for logging and visualizing metrics, models, and artifacts. Aim is an open-source experiment tracker that provides a unified way to track, compare, and explain experiments across teams and tools. These tools will let us track experiments efficiently, compare results, make data-driven decisions, and, time permitting, achieve better results.
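As a sketch (the project, experiment, and metric names are placeholders), both trackers can be attached to a training run in a few lines:

# Sketch: experiment tracking with W&B (via Lightning) and Aim.
import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger
from aim import Run

# W&B: pass a logger to the Trainer; metrics logged with self.log() are sent.
wandb_logger = WandbLogger(project="sphere-detection")
trainer = pl.Trainer(logger=wandb_logger, max_epochs=10)

# Aim: track arbitrary values alongside (or instead of) W&B.
run = Run(experiment="sphere-detection")
run.track(0.42, name="val/dice", step=0)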
\section{Deployment}
For deployment, we plan to use the ONNX format. This format provides a standard for interoperability between different AI frameworks and helps ensure compatibility with a wide range of deployment scenarios. To ensure the deployment process is seamless, we will carefully choose an architecture that is exportable, though most popular architectures are compatible with ONNX. Our model will be run in production using ONNXRuntime, a framework that allows for efficient inference using ONNX models. This combination of tools and formats will ensure that our model can be deployed quickly and easily in a variety of production environments such as AliceVision Meshroom.
For deployment, we plan to use the ONNX~\cite{ONNX} format. This format provides a standard for interoperability between AI frameworks and helps ensure compatibility with a wide range of deployment scenarios. To keep the deployment process seamless, we will choose an architecture that is exportable, though most popular architectures are compatible with ONNX. Our model will run in production using ONNX Runtime~\cite{ONNX_Runtime_2018}, a framework for efficient inference with ONNX models. This combination of tools and formats will ensure that our model can be deployed quickly and easily in a variety of production environments, such as AliceVision Meshroom.
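A minimal export-and-inference sketch follows; a trivial convolution stands in for the real detector, and the file name, input shape, and tensor names are placeholders:

# Sketch: export a model to ONNX and run it with ONNX Runtime.
import torch
import onnxruntime as ort

model = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)  # stand-in for the real detector
model.eval()

dummy = torch.randn(1, 3, 512, 512)
torch.onnx.export(model, dummy, "detector.onnx",
                  input_names=["image"], output_names=["masks"],
                  dynamic_axes={"image": {0: "batch"}})

session = ort.InferenceSession("detector.onnx")
outputs = session.run(None, {"image": dummy.numpy()})
print(outputs[0].shape)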
\section{Conclusion}
@@ -268,7 +291,8 @@ In conclusion, the detection of matte spheres has been explored and is possible,
\newpage
\bibliography{zotero,qcav}
\bibliographystyle{plain}
\addcontentsline{toc}{section}{References}
\bibliography{zotero,qcav,softs}
\bibliographystyle{plainnat}
\end{document}

View file

@@ -1,6 +1,6 @@
\Author{Laurent Fainsin}
\Title{
"Projet Long" Bibliography
Bibliographie de projet long
}
\Language{English}
\Keywords{}

src/softs.bib (new file, 144 lines added)
View file

@@ -0,0 +1,144 @@
@inproceedings{wolf-etal-2020-transformers,
title = {Transformers: State-of-the-Art Natural Language Processing},
author = {Thomas Wolf and Lysandre Debut and Victor Sanh and Julien Chaumond and Clement Delangue and Anthony Moi and Pierric Cistac and Tim Rault and Rémi Louf and Morgan Funtowicz and Joe Davison and Sam Shleifer and Patrick von Platen and Clara Ma and Yacine Jernite and Julien Plu and Canwen Xu and Teven Le Scao and Sylvain Gugger and Mariama Drame and Quentin Lhoest and Alexander M. Rush},
booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
month = oct,
year = {2020},
address = {Online},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/2020.emnlp-demos.6},
pages = {38--45}
}
@inproceedings{lhoest-etal-2021-datasets,
title = {Datasets: A Community Library for Natural Language Processing},
author = {Lhoest, Quentin and
Villanova del Moral, Albert and
Jernite, Yacine and
Thakur, Abhishek and
von Platen, Patrick and
Patil, Suraj and
Chaumond, Julien and
Drame, Mariama and
Plu, Julien and
Tunstall, Lewis and
Davison, Joe and
{\v{S}}a{\v{s}}ko, Mario and
Chhablani, Gunjan and
Malik, Bhavitvya and
Brandeis, Simon and
Le Scao, Teven and
Sanh, Victor and
Xu, Canwen and
Patry, Nicolas and
McMillan-Major, Angelina and
Schmid, Philipp and
Gugger, Sylvain and
Delangue, Cl{\'e}ment and
Matussi{\`e}re, Th{\'e}o and
Debut, Lysandre and
Bekman, Stas and
Cistac, Pierric and
Goehringer, Thibault and
Mustar, Victor and
Lagunas, Fran{\c{c}}ois and
Rush, Alexander and
Wolf, Thomas},
booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
month = nov,
year = {2021},
address = {Online and Punta Cana, Dominican Republic},
publisher = {Association for Computational Linguistics},
url = {https://aclanthology.org/2021.emnlp-demo.21},
pages = {175--184},
abstract = {The scale, variety, and quantity of publicly-available NLP datasets has grown rapidly as researchers propose new tasks, larger models, and novel benchmarks. Datasets is a community library for contemporary NLP designed to support this ecosystem. Datasets aims to standardize end-user interfaces, versioning, and documentation, while providing a lightweight front-end that behaves similarly for small datasets as for internet-scale corpora. The design of the library incorporates a distributed, community-driven approach to adding datasets and documenting usage. After a year of development, the library now includes more than 650 unique datasets, has more than 250 contributors, and has helped support a variety of novel cross-dataset research projects and shared tasks. The library is available at https://github.com/huggingface/datasets.},
eprint = {2109.02846},
archiveprefix = {arXiv},
primaryclass = {cs.CL}
}
@software{ONNX,
title = {{ONNX}: Open Neural Network Exchange},
url = {https://github.com/onnx/onnx},
license = {Apache-2.0},
version = {1.13.0},
author = {{ONNX community}},
year = {2018-2023}
}
@software{ONNX_Runtime_2018,
author = {ONNX Runtime developers},
license = {MIT},
month = {11},
title = {{ONNX Runtime}},
url = {https://github.com/microsoft/onnxruntime},
year = {2018}
}
@software{Arakelyan_Aim_2020,
author = {Arakelyan, Gor and Soghomonyan, Gevorg and {The Aim team}},
doi = {10.5281/zenodo.6536395},
license = {Apache-2.0},
month = {6},
title = {{Aim}},
url = {https://github.com/aimhubio/aim},
version = {3.9.3},
year = {2020}
}
@software{Label_Studio,
title = {{Label Studio}: Data labeling software},
url = {https://github.com/heartexlabs/label-studio},
license = {Apache-2.0},
version = {1.7.1},
author = {Tkachenko, Maxim and Malyuk, Mikhail and Holmanyuk, Andrey and Liubimov, Nikolai},
year = {2020-2022}
}
@software{wandb,
title = {{Weights \& Biases}: Track, visualize, and share your machine learning experiments},
url = {https://github.com/wandb/wandb},
license = {MIT},
version = {0.13.9},
author = {{Wandb team}},
year = {2023}
}
@software{Falcon_PyTorch_Lightning_2019,
author = {Falcon, William and {The PyTorch Lightning team}},
doi = {10.5281/zenodo.3828935},
license = {Apache-2.0},
month = {3},
title = {{PyTorch Lightning}},
url = {https://github.com/Lightning-AI/lightning},
version = {1.4},
year = {2019}
}
@software{TorchMetrics_2022,
author = {Detlefsen, Nicki Skafte and Borovec, Jiri and Schock, Justus and Harsh, Ananya and Koker, Teddy and Di Liello, Luca and Stancl, Daniel and Quan, Changsheng and Grechkin, Maxim and Falcon, William},
doi = {10.21105/joss.04101},
license = {Apache-2.0},
month = {2},
title = {{TorchMetrics - Measuring Reproducibility in PyTorch}},
url = {https://github.com/Lightning-AI/metrics},
year = {2022}
}
@incollection{NEURIPS2019_9015,
title = {PyTorch: An Imperative Style, High-Performance Deep Learning Library},
author = {Paszke, Adam and Gross, Sam and Massa, Francisco and Lerer, Adam and Bradbury, James and Chanan, Gregory and Killeen, Trevor and Lin, Zeming and Gimelshein, Natalia and Antiga, Luca and Desmaison, Alban and Kopf, Andreas and Yang, Edward and DeVito, Zachary and Raison, Martin and Tejani, Alykhan and Chilamkurthy, Sasank and Steiner, Benoit and Fang, Lu and Bai, Junjie and Chintala, Soumith},
booktitle = {Advances in Neural Information Processing Systems 32},
pages = {8024--8035},
year = {2019},
publisher = {Curran Associates, Inc.},
url = {http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf}
}
@software{polyhaven,
title = {{Poly Haven}: 3D models for everyone},
url = {https://polyhaven.com/},
license = {CC-BY-NC-4.0},
author = {{Poly Haven team}},
year = {2021}
}

View file

@@ -165,14 +165,6 @@
keywords = {balloon, balloons, colab, colab-notebook, colaboratory, detr, facebook, finetune, finetunes, finetuning, google-colab, google-colab-notebook, google-colaboratory, instance, instance-segmentation, instances, segementation, segment},
}
@misc{noauthor_datasets_nodate,
title = {Datasets {HuggingFace}},
url = {https://huggingface.co/docs/datasets/index},
abstract = {Were on a journey to advance and democratize artificial intelligence through open source and open science.},
urldate = {2023-01-17},
file = {Snapshot:/home/laurent/Zotero/storage/RYXSCZR7/index.html:text/html},
}
@misc{rogge_transformers_2020,
title = {Transformers {Tutorials}},
copyright = {MIT},
@@ -248,28 +240,6 @@
note = {original-date: 2022-03-09T05:11:49Z},
}
@misc{arakelyan_aim_2020,
title = {Aim},
copyright = {Apache-2.0},
url = {https://github.com/aimhubio/aim},
abstract = {Aim 💫 — easy-to-use and performant open-source ML experiment tracker.},
urldate = {2023-01-17},
author = {Arakelyan, Gor and Soghomonyan, Gevorg and {The Aim team}},
month = jun,
year = {2020},
doi = {10.5281/zenodo.6536395},
}
@misc{noauthor_label_nodate,
title = {Label {Studio}},
url = {https://labelstud.io/},
abstract = {A flexible data labeling tool for all data types. Prepare training data for computer vision, natural language processing, speech, voice, and video models.},
language = {en},
urldate = {2023-01-17},
journal = {Label Studio},
file = {Snapshot:/home/laurent/Zotero/storage/7Y3X7GTY/labelstud.io.html:text/html},
}
@misc{noauthor_miscellaneous_nodate,
title = {Miscellaneous {Transformations} and {Projections}},
url = {http://paulbourke.net/geometry/transformationprojection/},
@@ -327,14 +297,6 @@ Publisher: IEEE},
file = {arXiv Fulltext PDF:/home/laurent/Zotero/storage/AN3SNSVC/Lahoud et al. - 2022 - 3D Vision with Transformers A Survey.pdf:application/pdf;arXiv.org Snapshot:/home/laurent/Zotero/storage/6BXWCFI5/2208.html:text/html},
}
@misc{noauthor_weights_nodate,
title = {Weights \& {Biases} {Developer} tools for {ML}},
url = {https://wandb.ai/site/, http://wandb.ai/site},
abstract = {WandB is a central dashboard to keep track of your hyperparameters, system metrics, and predictions so you can compare models live, and share your findings.},
urldate = {2023-01-17},
file = {Snapshot:/home/laurent/Zotero/storage/GRIMYX6J/site.html:text/html},
}
@article{dong_ellipse_2021,
title = {Ellipse {R}-{CNN}: {Learning} to {Infer} {Elliptical} {Object} from {Clustering} and {Occlusion}},
volume = {30},
@@ -393,13 +355,6 @@ Publisher: IEEE},
urldate = {2023-01-24},
}
@misc{noauthor_format_nodate,
title = {Format selector for 2112.01527},
url = {https://arxiv.org/format/2112.01527},
urldate = {2023-01-25},
file = {Format selector for 2112.01527:/home/laurent/Zotero/storage/LUPN2K2W/2112.html:text/html},
}
@misc{cheng_masked-attention_2022,
title = {Masked-attention {Mask} {Transformer} for {Universal} {Image} {Segmentation}},
url = {http://arxiv.org/abs/2112.01527},
@@ -411,7 +366,68 @@ Publisher: IEEE},
month = jun,
year = {2022},
note = {arXiv:2112.01527 [cs]},
keywords = {Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning},
keywords = {Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Computer Science - Artificial Intelligence},
annote = {Comment: CVPR 2022. Project page/code/models: https://bowenc0221.github.io/mask2former},
file = {arXiv Fulltext PDF:/home/laurent/Zotero/storage/9XS7V8FP/Cheng et al. - 2022 - Masked-attention Mask Transformer for Universal Im.pdf:application/pdf;arXiv.org Snapshot:/home/laurent/Zotero/storage/LC5ZEEIC/2112.html:text/html},
}
@misc{dai_ao2-detr_2022,
title = {{AO2}-{DETR}: {Arbitrary}-{Oriented} {Object} {Detection} {Transformer}},
shorttitle = {{AO2}-{DETR}},
url = {http://arxiv.org/abs/2205.12785},
abstract = {Arbitrary-oriented object detection (AOOD) is a challenging task to detect objects in the wild with arbitrary orientations and cluttered arrangements. Existing approaches are mainly based on anchor-based boxes or dense points, which rely on complicated hand-designed processing steps and inductive bias, such as anchor generation, transformation, and non-maximum suppression reasoning. Recently, the emerging transformer-based approaches view object detection as a direct set prediction problem that effectively removes the need for handdesigned components and inductive biases. In this paper, we propose an Arbitrary-Oriented Object DEtection TRansformer framework, termed AO2-DETR, which comprises three dedicated components. More precisely, an oriented proposal generation mechanism is proposed to explicitly generate oriented proposals, which provides better positional priors for pooling features to modulate the cross-attention in the transformer decoder. An adaptive oriented proposal refinement module is introduced to extract rotation-invariant region features and eliminate the misalignment between region features and objects. And a rotationaware set matching loss is used to ensure the one-to-one matching process for direct set prediction without duplicate predictions. Our method considerably simplifies the overall pipeline and presents a new AOOD paradigm. Comprehensive experiments on several challenging datasets show that our method achieves superior performance on the AOOD task.},
language = {en},
urldate = {2023-01-25},
publisher = {arXiv},
author = {Dai, Linhui and Liu, Hong and Tang, Hao and Wu, Zhiwei and Song, Pinhao},
month = may,
year = {2022},
note = {arXiv:2205.12785 [cs]},
keywords = {Computer Science - Computer Vision and Pattern Recognition},
file = {Dai et al. - 2022 - AO2-DETR Arbitrary-Oriented Object Detection Tran.pdf:/home/laurent/Zotero/storage/BL5QA9W7/Dai et al. - 2022 - AO2-DETR Arbitrary-Oriented Object Detection Tran.pdf:application/pdf},
}
@misc{mmrotate_contributors_openmmlab_2022,
title = {{OpenMMLab} rotated object detection toolbox and benchmark},
copyright = {Apache-2.0},
url = {https://github.com/open-mmlab/mmrotate},
abstract = {AO2-DETR: Arbitrary-Oriented Object Detection Transformer},
urldate = {2023-01-25},
author = {{MMRotate Contributors}},
month = feb,
year = {2022},
note = {original-date: 2022-05-26T01:38:15Z},
}
@misc{loshchilov_decoupled_2019,
title = {Decoupled {Weight} {Decay} {Regularization}},
url = {http://arxiv.org/abs/1711.05101},
doi = {10.48550/arXiv.1711.05101},
abstract = {L\$\_2\$ regularization and weight decay regularization are equivalent for standard stochastic gradient descent (when rescaled by the learning rate), but as we demonstrate this is {\textbackslash}emph\{not\} the case for adaptive gradient algorithms, such as Adam. While common implementations of these algorithms employ L\$\_2\$ regularization (often calling it "weight decay" in what may be misleading due to the inequivalence we expose), we propose a simple modification to recover the original formulation of weight decay regularization by {\textbackslash}emph\{decoupling\} the weight decay from the optimization steps taken w.r.t. the loss function. We provide empirical evidence that our proposed modification (i) decouples the optimal choice of weight decay factor from the setting of the learning rate for both standard SGD and Adam and (ii) substantially improves Adam's generalization performance, allowing it to compete with SGD with momentum on image classification datasets (on which it was previously typically outperformed by the latter). Our proposed decoupled weight decay has already been adopted by many researchers, and the community has implemented it in TensorFlow and PyTorch; the complete source code for our experiments is available at https://github.com/loshchil/AdamW-and-SGDW},
urldate = {2023-01-29},
publisher = {arXiv},
author = {Loshchilov, Ilya and Hutter, Frank},
month = jan,
year = {2019},
note = {arXiv:1711.05101 [cs, math]},
keywords = {Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing, Mathematics - Optimization and Control},
annote = {Comment: Published as a conference paper at ICLR 2019},
file = {arXiv Fulltext PDF:/home/laurent/Zotero/storage/JJ33N7CY/Loshchilov and Hutter - 2019 - Decoupled Weight Decay Regularization.pdf:application/pdf;arXiv.org Snapshot:/home/laurent/Zotero/storage/R3Y868LM/1711.html:text/html},
}
@misc{kingma_adam_2017,
title = {Adam: {A} {Method} for {Stochastic} {Optimization}},
shorttitle = {Adam},
url = {http://arxiv.org/abs/1412.6980},
doi = {10.48550/arXiv.1412.6980},
abstract = {We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.},
urldate = {2023-01-29},
publisher = {arXiv},
author = {Kingma, Diederik P. and Ba, Jimmy},
month = jan,
year = {2017},
note = {arXiv:1412.6980 [cs]},
keywords = {Computer Science - Machine Learning},
annote = {Comment: Published as a conference paper at the 3rd International Conference for Learning Representations, San Diego, 2015},
file = {arXiv Fulltext PDF:/home/laurent/Zotero/storage/EQ38Q4BJ/Kingma and Ba - 2017 - Adam A Method for Stochastic Optimization.pdf:application/pdf;arXiv.org Snapshot:/home/laurent/Zotero/storage/JSNDPECJ/1412.html:text/html},
}