Asymptotics of Stochastic Gradient Descent with Dropout Regularization in Linear Models

Publication Date: Sept. 11, 2024

Journal: arXiv

DOI: http://arxiv.org/abs/2409.07434v1

Preprint: http://arxiv.org/abs/2409.07434v1

Abstract

This paper proposes an asymptotic theory for online inference of the stochastic gradient descent (SGD) iterates with dropout regularization in linear regression. Specifically, we establish the geometric-moment contraction (GMC) for constant step-size SGD dropout iterates to show the existence of a unique stationary distribution of the dropout recursive function. By the GMC property, we provide quenched central limit theorems (CLT) for the difference between dropout and $\ell^2$-regularized iterates, regardless of initialization. The CLT for the difference between the Ruppert-Polyak averaged SGD (ASGD) with dropout and $\ell^2$-regularized iterates is also presented. Based on these asymptotic normality results, we further introduce an online estimator for the long-run covariance matrix of ASGD dropout to facilitate inference in a recursive manner with efficiency in computational time and memory. The numerical experiments demonstrate that for sufficiently large samples, the proposed confidence intervals for ASGD with dropout nearly achieve the nominal coverage probability.
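
As an illustration of the setting the abstract describes, the following is a minimal sketch of constant step-size SGD with dropout in linear regression, together with its Ruppert-Polyak average. It is not taken from the associated repository. The function names, the simulated design, the step size, and the retention probability `keep_prob` are all assumptions chosen for illustration. The comparison target assumes the standard form of the $\ell^2$-regularized minimizer for dropout in linear models, $(p\,\Sigma + (1-p)\,\mathrm{Diag}(\Sigma))^{-1}\,\mathbb{E}[xy]$, with $p$ the retention probability and $\Sigma = \mathbb{E}[xx^\top]$.

```python
import numpy as np


def sgd_dropout(X, y, step, keep_prob, rng):
    """Constant step-size SGD with dropout for least squares (illustrative sketch).

    Each iteration draws a fresh Bernoulli(keep_prob) mask D_k and takes a
    gradient step on 0.5 * (y_k - x_k^T D_k beta)^2. Returns the last iterate
    and the running (Ruppert-Polyak) average of all iterates.
    """
    n, d = X.shape
    beta = np.zeros(d)
    beta_bar = np.zeros(d)
    for k in range(n):
        mask = rng.random(d) < keep_prob       # diagonal of the dropout matrix D_k
        x_masked = X[k] * mask                 # D_k x_k
        resid = y[k] - x_masked @ beta         # y_k - x_k^T D_k beta
        beta = beta + step * resid * x_masked  # SGD step
        beta_bar += (beta - beta_bar) / (k + 1)  # online average of the iterates
    return beta, beta_bar


def l2_target(Sigma, cross, keep_prob):
    """Assumed l2-regularized target: (p*Sigma + (1-p)*Diag(Sigma))^{-1} E[x y]."""
    A = keep_prob * Sigma + (1.0 - keep_prob) * np.diag(np.diag(Sigma))
    return np.linalg.solve(A, cross)


# Hypothetical simulation settings, chosen only for this example.
rng = np.random.default_rng(0)
n, d, keep_prob, step = 50_000, 3, 0.8, 0.01
Sigma = np.array([[1.0, 0.5, 0.0],
                  [0.5, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
beta_star = np.array([1.0, -2.0, 0.5])

# Correlated Gaussian design with covariance Sigma, plus Gaussian noise.
X = rng.standard_normal((n, d)) @ np.linalg.cholesky(Sigma).T
y = X @ beta_star + 0.5 * rng.standard_normal(n)

beta_last, beta_bar = sgd_dropout(X, y, step, keep_prob, rng)
target = l2_target(Sigma, Sigma @ beta_star, keep_prob)

print("averaged SGD dropout iterate:", beta_bar)
print("l2-regularized target:       ", target)
print("unregularized coefficients:  ", beta_star)
```

With a correlated design, the averaged iterate should settle near the regularized target rather than the unregularized coefficients, which is the sense in which dropout acts as an $\ell^2$-type penalty here. The sketch does not implement the online long-run covariance estimator or the confidence intervals discussed in the abstract.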

Associated Code Repositories

https://github.com/jiaqili97/Dropout_SGD
GitHub repository associated with the preprint: Asymptotics of Stochastic Gradient Descent with Dropout Regularization in Linear Models

Mention Context: dology. The codes for reproducing all results and figures can be found online\footnote{\url{https://github.com/jiaqili97/Dropout_SGD}}. \subsection{Sharp Range of the Learning Rate} The GD dropout iterates can be defined via the
