Endogeneity, i.e. the dependence between noise and covariates, is a common
phenomenon in real data due to omitted variables, strategic behaviours,
measurement errors, etc. In contrast, the existing analyses of stochastic online
linear regression with unbounded noise and of linear bandits depend heavily on
exogeneity, i.e. the independence between noise and covariates. Motivated by this
gap, we study over- and just-identified Instrumental Variable (IV)
regression, specifically Two-Stage Least Squares (2SLS), for stochastic online
learning, and propose an online variant of 2SLS,
namely O2SLS. We show that O2SLS achieves $O(d_x d_z \log^2 T)$
identification and $O(\gamma \sqrt{d_z T})$ oracle
regret after $T$ interactions, where $d_x$ and $d_z$ are the dimensions of
covariates and IVs, and $\gamma$ is the bias due to endogeneity. For
$\gamma = 0$, i.e. under exogeneity, O2SLS exhibits $O(d_x^2 \log^2 T)$ oracle regret, which is of the same order as that of stochastic online ridge regression.
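As a concrete illustration, the following is a minimal sketch of how an online two-stage least-squares learner of the kind O2SLS refers to could be implemented, with both stages kept as recursive (ridge-regularised) least squares so that each round only requires an incremental update. The class name, the regulariser `lam`, and the exact recursion are illustrative assumptions rather than the paper's precise construction.

```python
import numpy as np

class OnlineTwoStageLS:
    """Sketch of an online two-stage least-squares learner.

    Stage 1 regresses covariates x_t on instruments z_t; stage 2 regresses
    outcomes y_t on the stage-1 predictions x_hat_t. The regulariser `lam`
    is an illustrative choice, not a quantity specified in the abstract.
    """

    def __init__(self, d_x: int, d_z: int, lam: float = 1.0):
        self.M = lam * np.eye(d_z)      # sum of z z^T   (stage-1 Gram matrix)
        self.C = np.zeros((d_z, d_x))   # sum of z x^T   (stage-1 cross moments)
        self.A = lam * np.eye(d_x)      # sum of x_hat x_hat^T (stage-2 Gram matrix)
        self.b = np.zeros(d_x)          # sum of x_hat y (stage-2 cross moments)

    def predict(self, x: np.ndarray) -> float:
        """Predict the outcome for covariate x with the current estimate."""
        beta = np.linalg.solve(self.A, self.b)
        return float(x @ beta)

    def update(self, z: np.ndarray, x: np.ndarray, y: float) -> None:
        """Observe one (instrument, covariate, outcome) triple and update."""
        # Stage 1: online regression of x on z.
        self.M += np.outer(z, z)
        self.C += np.outer(z, x)
        phi = np.linalg.solve(self.M, self.C)   # d_z x d_x first-stage coefficients
        x_hat = phi.T @ z                        # instrumented covariate
        # Stage 2: online regression of y on the instrumented covariate.
        self.A += np.outer(x_hat, x_hat)
        self.b += x_hat * y
```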
Then, we leverage O2SLS as an oracle to design OFUL-IV, a stochastic
linear bandit algorithm that tackles endogeneity. OFUL-IV yields
$O(\sqrt{d_x d_z T})$ regret, which matches the regret
lower bound under exogeneity. On different datasets with endogeneity, we
experimentally demonstrate the efficiency of O2SLS and OFUL-IV.
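For intuition, a rough sketch of the optimistic action selection that OFUL-style algorithms such as OFUL-IV rely on, centred at an IV-based estimate, might look as follows; the function name, the design matrix `V`, and the confidence radius `beta_t` are placeholders for quantities the paper defines precisely.

```python
import numpy as np

def optimistic_action(actions, beta_hat, V, beta_t):
    """Pick the action maximising the upper confidence bound
    <a, beta_hat> + beta_t * ||a||_{V^{-1}}, in OFUL style.

    `beta_hat` would be the O2SLS estimate and `V` a regularised design
    matrix; both are stand-ins for the paper's exact construction.
    """
    V_inv = np.linalg.inv(V)
    ucb = [a @ beta_hat + beta_t * np.sqrt(a @ V_inv @ a) for a in actions]
    return actions[int(np.argmax(ucb))]
```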