The Theory Behind Overfitting, Cross Validation, Regularization, Bagging, and Boosting: Tutorial
In this tutorial paper, we first define mean squared error, variance,
covariance, and bias of both random variables and classification/predictor
models. Then, we formulate the true and generalization errors of the model for
both training and validation/test instances where we make use of Stein's
Unbiased Risk Estimator (SURE). We define overfitting, underfitting, and
generalization using the obtained true and generalization errors. We introduce
cross validation and two well-known examples which are K-fold and
leave-one-out cross validations. We briefly introduce generalized cross
validation and then move on to regularization where we use the SURE again. We
work on both ℓ2 and ℓ1 norm regularizations. Then, we show that
bootstrap aggregating (bagging) reduces the variance of estimation. Boosting,
specifically AdaBoost, is introduced and it is explained as both an additive
model and a maximum margin model, i.e., Support Vector Machine (SVM). The upper
bound on the generalization error of boosting is also provided to show why
boosting prevents overfitting. As examples of regularization, the theory
of ridge and lasso regressions, weight decay, noise injection to input/weights,
and early stopping are explained. Random forest, dropout, histogram of oriented
gradients, and single shot multi-box detector are explained as examples of
bagging in machine learning and computer vision. Finally, boosting tree and SVM
models are mentioned as examples of boosting.
Comment: 23 pages, 9 figures
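The K-fold and leave-one-out cross validations named in the abstract can be illustrated with a minimal sketch. It is not taken from the tutorial: the function names `k_fold_splits` and `cross_val_mse` are hypothetical, and the "model" is assumed to be a plain mean predictor so that the example stays self-contained.

```python
def k_fold_splits(n, k):
    """Yield (train_indices, validation_indices) for K-fold cross validation.

    The n sample indices are partitioned into k contiguous folds whose sizes
    differ by at most one. Setting k == n gives leave-one-out cross validation.
    """
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, validation
        start += size


def cross_val_mse(targets, k):
    """Average validation MSE over k folds for a mean predictor (illustrative model)."""
    fold_errors = []
    for train, validation in k_fold_splits(len(targets), k):
        # "Training" here is just computing the mean of the training targets.
        prediction = sum(targets[i] for i in train) / len(train)
        squared = [(targets[i] - prediction) ** 2 for i in validation]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / len(fold_errors)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    print("2-fold CV error:", cross_val_mse(data, 2))
    print("leave-one-out CV error:", cross_val_mse(data, len(data)))
```

Every sample is used for validation exactly once, which is what lets the averaged fold error serve as an estimate of the generalization error.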
Impact of impurities on thermo-physical properties of CO2-rich systems : experimental and modelling
Numerous industrial and academic communities have directed their efforts toward
developing technologies for reducing the emission of CO2 into the atmosphere. Carbon
dioxide capture and storage (CCS) is one of the most promising technologies that can
eliminate/reduce global warming, helping the world to move towards a low-carbon
society. The process comprises the separation of CO2 from industrial sources,
transport to a storage location and then long-term isolation from the atmosphere.
CO2-rich pipelines are a key part of any carbon capture and storage project. Modelling
these types of pipelines is challenging due to the lack of data on the thermo-physical
properties of CO2 in the presence of impurities. As these properties, particularly density
and viscosity, have a significant impact on the sizing of equipment, it is crucial to
investigate the impact of different impurities on the thermo-physical properties of
CO2-rich systems.
Densities and viscosities of pure CO2, two CO2 – H2 binary systems (with 5 and with 10
mol% H2), and 6 multi-component mixtures (MIX 1 with 5 mol% impurity, MIX 2 with
10 mol% impurity, MIX 3 with 30 mol% impurity, MIX 4 with 50 mol% impurity,
MIX 5 with 4 mol% impurity and MIX 6 with 30 mol% impurity) were measured at
pressures ranging from 10 to 1,400 bar (1 to 140 MPa) and six different temperatures, 0,
10, 25, 50, 100, 150 °C (273.15, 283.15, 298.15, 323.15, 373.15 and 423.15 K) in the
gas, liquid, and supercritical regions using an Anton Paar densitometer and capillary
tube technique for density and viscosity measurements, respectively. The experimental
density data were then used to evaluate models using the CO2 volume correction, the
Peneloux shift parameter and the original equations of state (PR and SRK). The
obtained viscosity data were also employed to tune the correlative Lohrenz-Bray-Clark
(LBC) and CO2-LBC models and to evaluate the predictive models. The predictive
models in this work are based on corresponding states (CS) theory. The “one
reference fluid” corresponding states model is based on the approach developed by
Pedersen et al. and modified for CO2-rich fluids; the “two reference fluids”
corresponding states models (CS2 and CO2-CS2) are based on the model proposed by
Aasberg-Petersen. Two models based on the extended corresponding states (ECS) theory,
SUPERTRAP and CO2-SUPERTRAP, were also tested.
The densities of 95%CO2-5%H2S and 95%CO2-5%SO2 systems were measured
continuously using a high temperature and pressure Vibrating Tube Densitometer
(VTD), an Anton Paar DMA 512, at pressures up to 400 bar (40 MPa) and at five different
temperatures, 0, 10, 25, 50 and 80 °C (273.15, 283.15, 298.15, 323.15 and 353.15 K), in
the gas, liquid and supercritical regions at MINES ParisTech, France. The experimental
data were then used to evaluate the new CO2 volume correction model by comparing it with
the original PR and PR-Peneloux equations of state.
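To make the comparison between the original PR equation of state and a Peneloux-type volume shift concrete, the following is a rough sketch for pure CO2 only. It is not the thesis's model: the critical constants are assumed textbook values, the shift `c` is a hypothetical placeholder (not a fitted parameter), and the solver is a simple bisection that is only intended for supercritical temperatures.

```python
import math

R = 8.314462618        # universal gas constant, J/(mol K)

# Pure-CO2 constants: assumed textbook values, not taken from the source.
TC = 304.13            # critical temperature, K
PC = 7.3773e6          # critical pressure, Pa
OMEGA = 0.22394        # acentric factor
MOLAR_MASS = 0.04401   # kg/mol

A = 0.45724 * R**2 * TC**2 / PC   # PR attraction parameter
B = 0.07780 * R * TC / PC         # PR co-volume
KAPPA = 0.37464 + 1.54226 * OMEGA - 0.26992 * OMEGA**2


def pr_pressure(temperature, molar_volume):
    """Peng-Robinson pressure (Pa) at a given T (K) and molar volume (m^3/mol)."""
    alpha = (1.0 + KAPPA * (1.0 - math.sqrt(temperature / TC))) ** 2
    repulsive = R * temperature / (molar_volume - B)
    attractive = A * alpha / (molar_volume**2 + 2.0 * B * molar_volume - B**2)
    return repulsive - attractive


def pr_molar_volume(temperature, pressure):
    """Solve PR for molar volume by bisection (single root assumed, T > TC)."""
    if temperature <= TC:
        raise ValueError("this sketch only handles supercritical temperatures")
    low, high = B * (1.0 + 1e-9), 1.0   # bracket: just above co-volume to 1 m^3/mol
    for _ in range(200):
        mid = 0.5 * (low + high)
        if pr_pressure(temperature, mid) > pressure:
            low = mid    # pressure too high, so the volume must be larger
        else:
            high = mid
    return 0.5 * (low + high)


def density(molar_volume, shift=0.0):
    """Mass density (kg/m^3); a Peneloux-type shift uses v_corrected = v_EOS - c."""
    return MOLAR_MASS / (molar_volume - shift)


if __name__ == "__main__":
    v = pr_molar_volume(323.15, 10.0e6)   # 50 °C, 100 bar
    c = 2.0e-6                            # hypothetical shift, m^3/mol
    print("PR density:", density(v))
    print("volume-shifted density:", density(v, c))
```

The volume shift leaves the phase equilibrium predicted by the equation of state unchanged and only translates the molar volume, which is why it can be evaluated against measured densities separately from the underlying PR model.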
A good understanding of vapour-solid / vapour-liquid-solid / liquid-solid equilibrium of
CO2 and CO2-mixtures at low temperature is an important issue regarding the safety
assessment of CO2 pipelines and the possibility of solid or ‘dry ice’ discharge during an
accidental release or rapid decompression. The frost points of some of the above
systems were measured using the SETARAM BT 2.15 calorimeter at various pressures
- …