Added connective paragraphs in the setup environment and runtimes section

Jan Kowalczyk
2025-09-11 14:00:33 +02:00
parent 85cd33cd5b
commit 35766b9028
2 changed files with 46 additions and 40 deletions


@@ -1213,6 +1213,12 @@ We adapted the baseline implementations to our data loader and input format \tod
\section{Experiment Overview \& Computational Environment}
\threadtodo
{\textit{"What should the reader know after reading this section?"}}
{\textit{"Why is that of interest to the reader at this point?"}}
{\textit{"How am I achieving the stated goal?"}}
{\textit{"How does this lead to the next question or section?"}}
\threadtodo
{give overview of experiments and their motivations}
{training setup clear, but not what was trained/tested}
@@ -1246,9 +1252,6 @@ Table~\ref{tab:exp_grid} summarizes the full experiment matrix.
\label{tab:exp_grid}
\end{table}
% Combines: Experiment Matrix + Hardware & Runtimes
% Goals: clearly enumerate each experiment configuration and give practical runtime details
%\newsubsubsectionNoTOC{Table of experiment variants (architectures, hyperparameters, data splits)}
\threadtodo
{give overview about hardware setup and how long things take to train}
@@ -1256,56 +1259,54 @@ Table~\ref{tab:exp_grid} summarizes the full experiment matrix.
{table of hardware and of how long different trainings took}
{experiment setup understood $\rightarrow$ what were the experiments' results}
Having outlined the full grid of experiments in Table~\ref{tab:exp_grid}, we next describe the computational environment in which they were conducted. The hardware and software stack used throughout all experiments is summarized in Table~\ref{tab:system_setup}.
\begin{table}[p]
\centering
\caption{Computational Environment (Hardware \& Software)} \label{tab:system_setup}
\begin{tabularx}{0.8\textwidth}{rX}
\toprule
\textbf{Item} & \textbf{Details} \\
\midrule
\multicolumn{2}{l}{\textbf{System}} \\
Operating System & \ttfamily NixOS 25.11 (Xantusia) \\
Kernel & \ttfamily 6.12.45 \\
Architecture & \ttfamily x86\_64 \\
CPU Model & \ttfamily AMD Ryzen 5 3600 6-Core Processor \\
CPU Cores (physical) & \ttfamily 6 × 1 \\
CPU Threads (logical) & \ttfamily 12 \\
CPU Base Frequency & \ttfamily 2200 MHz \\
CPU Max Frequency & \ttfamily 4208 MHz \\
Total RAM & \ttfamily 31.29 GiB \\
\addlinespace
\multicolumn{2}{l}{\textbf{GPU}} \\
GPU Name & \ttfamily NVIDIA GeForce RTX 2070 SUPER \\
GPU Memory & \ttfamily 8.00 GiB \\
GPU Compute Capability & \ttfamily 7.5 \\
NVIDIA Driver Version & \ttfamily 570.181 \\
CUDA (Driver) Version & \ttfamily 12.8 \\
\addlinespace
\multicolumn{2}{l}{\textbf{Software Environment}} \\
Python & \ttfamily 3.12.11 \\
PyTorch & \ttfamily 2.7.1+cu128 \\
PyTorch Built CUDA & \ttfamily 12.8 \\
cuDNN (PyTorch build) & \ttfamily 91100 \\
scikit-learn & \ttfamily 1.7.0 \\
NumPy & \ttfamily 2.3.1 \\
SciPy & \ttfamily 1.16.0 \\
NumPy Build Config & \begin{minipage}[t]{\linewidth}\ttfamily\small blas:
name: blas
openblas configuration: unknown
pc file directory: /nix/store/x19i4pf7zs1pp96mikj8azyn6v891i33-blas-3-dev/lib/pkgconfig
lapack:
name: lapack
openblas configuration: unknown
pc file directory: /nix/store/g819v6ri55f2gdczsi8s8bljkh0lkgwb-lapack-3-dev/lib/pkgconfig\end{minipage} \\
\addlinespace
\bottomrule
\end{tabularx}
\end{table}
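For reference, the values in Table~\ref{tab:system_setup} can be collected programmatically. The following minimal sketch (illustrative only, not our exact logging script) uses standard Python and PyTorch introspection APIs:
\begin{verbatim}
import platform
import torch

# Software versions reported by the installed packages
print("Python:", platform.python_version())
print("PyTorch:", torch.__version__)
print("Built CUDA:", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())

# Hardware properties of the first visible GPU
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("GPU:", props.name)
    print("GPU Memory (GiB):", round(props.total_memory / 2**30, 2))
    print("Compute Capability:", f"{props.major}.{props.minor}")
\end{verbatim}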
Pretraining runtimes for the autoencoders are reported in Table~\ref{tab:ae_pretrain_runtimes}. These values are averaged across folds and labeling regimes, since the pretraining step itself does not make use of labels. \todo[inline]{why is efficient taking longer with less params and MACs?}
\begin{table}
\centering
\caption{Autoencoder pretraining runtime (seconds): mean ± std across folds and semi-supervised labeling regimes.}
\label{tab:ae_pretrain_runtimes}
\begin{tabularx}{\textwidth}{cYY}
\toprule
@@ -1323,9 +1324,11 @@ Table~\ref{tab:exp_grid} summarizes the full experiment matrix.
\end{tabularx}
\end{table}
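Because pretraining never sees labels, runs that differ only in their labeling regime are repeat measurements of the same procedure, which justifies pooling them. Assuming the per-run timings were logged with (hypothetical) columns \texttt{architecture}, \texttt{fold}, \texttt{regime}, and \texttt{seconds}, the aggregation reduces to a short sketch:
\begin{verbatim}
import pandas as pd

# Hypothetical log: one row per pretraining run
runs = pd.read_csv("ae_pretrain_runtimes.csv")

# Pool folds and labeling regimes, since pretraining ignores labels;
# this yields the mean and std reported per architecture
stats = runs.groupby("architecture")["seconds"].agg(["mean", "std"])
print(stats)
\end{verbatim}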
The full DeepSAD training times are shown in Table~\ref{tab:train_runtimes_compact}, alongside the two classical baselines, Isolation Forest and One-Class SVM. Here the contrast between methods is clear: while DeepSAD requires on the order of 15--20 minutes of GPU training per configuration, both baselines complete training within seconds on a CPU. The OCSVM is only this fast because DeepSAD's pretrained encoder reduces the input dimensionality in a preprocessing step; other dimensionality-reduction methods could fill the same role, potentially at lower computational cost.
\begin{table}
\centering
\caption{Training runtime: total seconds (mean ± std).}
\label{tab:train_runtimes_compact}
\begin{tabularx}{\textwidth}{crrrr}
\toprule
@@ -1343,6 +1346,7 @@ Table~\ref{tab:exp_grid} summarizes the full experiment matrix.
\end{tabularx}
\end{table}
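The OCSVM timing above presupposes the encoder-based preprocessing described in the text. The following sketch illustrates that pipeline with stand-in dimensions and data; in our setup the encoder is the pretrained DeepSAD network:
\begin{verbatim}
import torch
import torch.nn as nn
from sklearn.svm import OneClassSVM

# Stand-ins: in practice the encoder comes from DeepSAD pretraining
encoder = nn.Sequential(nn.Linear(512, 32))  # raw dim -> latent dim (hypothetical)
X_train = torch.randn(1000, 512)
X_test = torch.randn(200, 512)

@torch.no_grad()
def encode(model, x):
    """Project raw inputs into the pretrained latent space."""
    model.eval()
    return model(x).cpu().numpy()

# Fitting on the compact latent codes instead of raw inputs is what
# keeps the OCSVM training time in the range of seconds
ocsvm = OneClassSVM(kernel="rbf", nu=0.1).fit(encode(encoder, X_train))
scores = ocsvm.decision_function(encode(encoder, X_test))  # higher = more normal
\end{verbatim}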
Inference latency per sample is presented in Table~\ref{tab:inference_latency_compact}. These measurements highlight an important property: once trained, all methods are extremely fast at inference, with DeepSAD operating in the sub-millisecond range and the classical baselines being even faster. This confirms that, despite higher training costs, DeepSAD can be deployed in real-time systems without inference becoming a bottleneck.
\begin{table}
\centering
@@ -1364,6 +1368,8 @@ Table~\ref{tab:exp_grid} summarizes the full experiment matrix.
\end{tabularx}
\end{table}
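Sub-millisecond latencies are easy to mismeasure on a GPU because kernel launches are asynchronous. The sketch below (stand-in model and data, not our benchmarking code) shows the measurement pattern of warm-up runs followed by explicit synchronization around the timed loop:
\begin{verbatim}
import time
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(512, 32)).to(device).eval()  # stand-in network
sample = torch.randn(1, 512, device=device)

def sync():
    # GPU kernels run asynchronously; without synchronization the timer
    # would capture launch overhead rather than the actual compute time
    if device == "cuda":
        torch.cuda.synchronize()

n = 1000
with torch.no_grad():
    for _ in range(10):  # warm-up: allocator and kernel caches
        model(sample)
    sync()
    t0 = time.perf_counter()
    for _ in range(n):
        model(sample)
    sync()
print(f"{(time.perf_counter() - t0) / n * 1e3:.3f} ms per sample")
\end{verbatim}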
Together, these results provide a comprehensive overview of the computational requirements of our experimental setup. They show that while our deep semi-supervised approach is significantly more demanding during training than classical baselines, it remains highly efficient at inference, which is the decisive factor for deployment in time-critical domains such as rescue robotics.
\newchapter{results_discussion}{Results and Discussion}
\newsection{results}{Results}
\todo[inline]{some results, ROC curves, for both global and local}