style: change tab size to 2 spaces

Laureηt 2023-01-18 16:06:21 +01:00
parent ca2f6e11e0
commit 8f65a57b33
Signed by: Laurent
SSH key fingerprint: SHA256:kZEpW8cMJ54PDeCvOhzreNr4FSh6R13CMGH/POoO8DI
2 changed files with 40 additions and 40 deletions


@@ -5,7 +5,7 @@ root = true
[*]
indent_style = space
-indent_size = 4
+indent_size = 2
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true


@@ -19,8 +19,8 @@
\newpage
{
  \hypersetup{hidelinks}
  \tableofcontents
}
\newpage
@@ -31,13 +31,13 @@
3D reconstruction techniques in photography, such as Reflectance Transformation Imaging (RTI)~\cite{giachetti2018} and Photometric Stereo~\cite{durou2020}, often require a precise understanding of the lighting conditions in the captured scene. One common method for calibrating the lighting is to include one or more spheres in the scene, as shown in the left example of Figure~\ref{fig:intro}. However, manually outlining these spheres is tedious and time-consuming, especially in visual effects work, where chrome spheres are prevalent~\cite{jahirul_grey_2021}. This task can be made more efficient by using deep learning methods for detection. The goal of this project is to develop a neural network that can accurately detect both matte and shiny spheres in a scene.
\begin{figure}[h]
  \centering
  \begin{tabular}{cc}
    \includegraphics[height=0.35\linewidth]{matte.jpg} &
    \includegraphics[height=0.35\linewidth]{shiny.jpg}
  \end{tabular}
  \caption{Left: a scene with matte spheres. Right: a scene with a shiny sphere.}
  \label{fig:intro}
\end{figure}
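As an aside, the reason such markers are useful is that a light direction can be read off a mirror sphere almost directly. The following is a minimal sketch of this computation, assuming an orthographic camera looking along the $z$ axis and hypothetical pixel coordinates for the sphere outline and its specular highlight.
\begin{verbatim}
import numpy as np

def light_direction(center, radius, highlight):
    """Estimate the light direction from the specular highlight on a
    mirror sphere (orthographic camera looking along +z assumed)."""
    nx = (highlight[0] - center[0]) / radius
    ny = (highlight[1] - center[1]) / radius
    nz = np.sqrt(max(0.0, 1.0 - nx**2 - ny**2))
    n = np.array([nx, ny, nz])         # surface normal at the highlight
    v = np.array([0.0, 0.0, 1.0])      # viewing direction
    return 2.0 * np.dot(n, v) * n - v  # mirror reflection of v about n

# Hypothetical detection: sphere at (320, 240) with radius 100 px,
# brightest pixel at (350, 200).
print(light_direction((320, 240), 100, (350, 200)))
\end{verbatim}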
\section{Previous work}
@@ -45,13 +45,13 @@ 3D reconstruction techniques in photography, such as Reflectance Tr
Previous work by Laurent Fainsin et al.~\cite{spheredetect} attempted to address this problem by using Mask R-CNN~\cite{MaskRCNN}, a neural network for instance segmentation, to segment spheres in images. However, this approach is limited in its ability to detect shiny spheres, as demonstrated in the right image of Figure~\ref{fig:previouswork}: the network was trained on images of matte spheres and was unable to generalize to shiny ones, which highlights the need for further research in this area.
\begin{figure}[h]
  \centering
  \begin{tabular}{cc}
    \includegraphics[height=0.35\linewidth]{matte_inference.png} &
    \includegraphics[height=0.35\linewidth]{shiny_inference.png}
  \end{tabular}
  \caption{Mask R-CNN~\cite{MaskRCNN} inferences from~\cite{spheredetect} on Figure~\ref{fig:intro}.}
  \label{fig:previouswork}
\end{figure}
\section{Current state of the art}
@@ -63,13 +63,13 @@ The automatic detection (or segmentation) of spheres in scenes is a rather niche
In~\cite{spheredetect}, it is explained that clean photographs containing spherical markers, as used in 3D reconstruction techniques, are unsurprisingly rare. To address this issue, the authors crafted a custom training dataset using Python and Blender scripts, compositing known spherical markers (real or synthetic) onto background images from the COCO dataset~\cite{COCO}. The result of this technique is visible in Figure~\ref{fig:spheredetectdataset}.
\begin{figure}[h]
  \centering
  \begin{tabular}{cc}
    \includegraphics[height=0.3\linewidth]{dataset1.jpg} &
    \includegraphics[height=0.3\linewidth]{dataset2.jpg}
  \end{tabular}
  \caption{Example of the synthetic dataset used in~\cite{spheredetect}.}
  \label{fig:spheredetectdataset}
\end{figure}
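The compositing scripts themselves are not reproduced in~\cite{spheredetect}; a minimal sketch of the idea, with hypothetical file names and assuming a pre-rendered sphere image with an alpha channel, could look as follows.
\begin{verbatim}
import random
from PIL import Image

def composite_sphere(background_path, sphere_path):
    """Paste a pre-rendered sphere (RGBA, transparent background) at a
    random position and scale onto a COCO background image."""
    background = Image.open(background_path).convert("RGBA")
    sphere = Image.open(sphere_path).convert("RGBA")

    # Random scale and position, kept fully inside the background.
    size = random.randint(32, min(background.size) // 2)
    sphere = sphere.resize((size, size))
    x = random.randint(0, background.width - size)
    y = random.randint(0, background.height - size)

    background.alpha_composite(sphere, dest=(x, y))
    # The alpha channel of the pasted sphere directly provides the
    # ground-truth segmentation mask for training.
    return background, (x, y, size, size)

image, bbox = composite_sphere("coco_000001.jpg", "matte_sphere.png")
\end{verbatim}
A nice property of this approach is that the annotations come for free: the position, scale, and alpha mask of each pasted sphere are known exactly.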
While researching this bibliography, we found some additional datasets that we may be able to use.
@@ -87,10 +87,10 @@ In~\cite{spheredetect}, the authors use Mask R-CNN~\cite{MaskRCNN} as a base mod
The network is composed of two main parts: a backbone network and a region proposal network (RPN). The backbone is a convolutional neural network that extracts features from the input image. The RPN is a fully convolutional network that generates region proposals, i.e. bounding boxes of candidate objects, which are used to crop the extracted features. A dedicated mask head then generates a segmentation mask for each region proposal, segmenting the object in the image.
\begin{figure}[h]
  \centering
  \includegraphics[width=0.6\linewidth]{MaskRCNN.png}
  \caption{The Mask R-CNN~\cite{MaskRCNN} architecture.}
  \label{fig:maskrcnn}
\end{figure}
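For reference, adapting a pretrained Mask R-CNN to a single \texttt{sphere} class is straightforward with torchvision. The following sketch (not the exact code of~\cite{spheredetect}) replaces the box and mask prediction heads so they output two classes: background and sphere.
\begin{verbatim}
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

num_classes = 2  # background + sphere

# Backbone + RPN + RoI heads, pretrained on COCO.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box classification/regression head.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Replace the mask prediction head.
in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, num_classes)
\end{verbatim}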
The network is trained with a loss function composed of three terms: a classification loss, a bounding-box regression loss, and a mask loss. The classification loss trains the network to classify each region proposal (here, as a sphere or not a sphere), the bounding-box regression loss trains it to refine the bounding box of each proposal, and the mask loss trains it to generate a mask for each proposal. The original network was trained on the COCO dataset~\cite{COCO}.
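Concretely, each sampled region proposal contributes a multi-task loss of the form
\[
\mathcal{L} = \mathcal{L}_{\text{cls}} + \mathcal{L}_{\text{box}} + \mathcal{L}_{\text{mask}},
\]
where $\mathcal{L}_{\text{cls}}$ is a cross-entropy classification loss, $\mathcal{L}_{\text{box}}$ a smooth-$L_1$ regression loss on the box coordinates, and $\mathcal{L}_{\text{mask}}$ an average per-pixel binary cross-entropy on the predicted mask~\cite{MaskRCNN}.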
@@ -104,28 +104,28 @@ To detect spheres in images, it is sufficient to estimate the center and radius
Ellipse R-CNN~\cite{dong_ellipse_2021} is a modified version of Mask R-CNN~\cite{MaskRCNN} that detects ellipses in images. It addresses this issue with an additional branch that predicts the axes and orientation of each ellipse, which allows more accurate detection of elliptical objects, and in our case spheres. It also handles occlusion: by predicting a segmentation mask for each ellipse, it can deal with overlapping and occluded objects. This makes it a good candidate for detecting spheres in real-world images with complex backgrounds and variable lighting conditions.
\begin{figure}[h]
  \centering
  \includegraphics[width=0.6\linewidth]{EllipseRCNN.png}
  \caption{The Ellipse R-CNN~\cite{dong_ellipse_2021} architecture.}
  \label{fig:ellipsercnn}
\end{figure}
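In this setting, each detection is no longer an axis-aligned box but an ellipse, which can be described by five parameters (this parameterization is only illustrative and not necessarily the exact one used in~\cite{dong_ellipse_2021}):
\[
\frac{\big((x - x_c)\cos\theta + (y - y_c)\sin\theta\big)^2}{a^2}
+ \frac{\big((y - y_c)\cos\theta - (x - x_c)\sin\theta\big)^2}{b^2} = 1,
\]
where $(x_c, y_c)$ is the center, $a$ and $b$ are the semi-axes, and $\theta$ is the orientation. A sphere projects to a circle under orthographic projection and, more generally, to an ellipse under perspective projection, which motivates this representation.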
\subsubsection{GPN}
\begin{figure}[h]
  \centering
  \includegraphics[width=0.6\linewidth]{GPN.png}
  \caption{The GPN~\cite{li_detecting_2019} architecture.}
  \label{fig:gpn}
\end{figure}
\subsubsection{DETR}
\begin{figure}[h]
  \centering
  \includegraphics[width=0.8\linewidth]{DETR.png}
  \caption{The DETR~\cite{carion_end--end_2020} architecture.}
  \label{fig:detr}
\end{figure}
+\cite{zhang_dino_2022}