Developing a log file analysis tool: a machine learning approach for anomaly detection
Abstract. Log files, which record information about all events during the execution of software, are important in troubleshooting tasks. However, modern software systems produce large quantities of complex logs, and their manual inspection is laborious and time-consuming. Therefore, technologies such as machine learning have been used to automate log file analysis. Anomaly detection is an especially popular approach, since anomalies in the log files are typically caused by erroneous behaviour of the software.
In this study, open source data mining and machine learning solutions are utilized to process log files collected from devices running embedded Linux. Following the Design Science Research methodology, a Python program called sgologs is developed. The tool uses components from logparser and loglizer toolkits to pre-process the input log file, train an unsupervised machine learning model, and detect anomalies on the input file.
The loglizer tools have not been used with Linux logs in previous research, possibly because such logs are rather difficult to process automatically. This study supports that explanation, as the measured anomaly detection accuracy scores are quite modest. Nevertheless, sgologs is able to detect anomalies in the log files, with swift processing times, at least when certain factors are taken into consideration. If the user is aware of these factors, sgologs can point towards real anomalies in the Linux log files. Thus, the tool could be used in real-life settings to simplify debugging tasks whenever logs are used as a source of information.
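The pipeline described in this abstract (pre-process the log, build a model without labels, flag anomalies) can be illustrated with a minimal sketch. This is not sgologs and does not use the logparser or loglizer APIs: it swaps in a simple regex-based template extraction and a rarity score, and the function names, the window size and the `rare_share` threshold are all assumptions made for illustration.

```python
import re
from collections import Counter


def to_template(line):
    """Mask variable fields so lines from the same event share one template."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", line)  # hex addresses first
    line = re.sub(r"\d+", "<NUM>", line)             # then plain numbers
    return line.strip()


def window_counts(lines, window_size):
    """Split the log into fixed-size windows and count templates in each."""
    return [
        Counter(to_template(l) for l in lines[i:i + window_size])
        for i in range(0, len(lines), window_size)
    ]


def anomaly_scores(windows, rare_share=0.05):
    """Score each window by the fraction of its lines with a rare template.

    A template is treated as rare when it accounts for less than
    `rare_share` of all lines in the file (an illustrative threshold).
    """
    total = Counter()
    for w in windows:
        total.update(w)
    n_lines = sum(total.values())
    scores = []
    for w in windows:
        size = sum(w.values())
        rare = sum(c for t, c in w.items() if total[t] / n_lines < rare_share)
        scores.append(rare / size if size else 0.0)
    return scores
```

Windows with the highest scores are the ones to inspect first. A real tool would replace the rarity score with a trained unsupervised model, as the abstract describes.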
Genetic architecture of rainbow trout survival from egg to adult
Survival from birth to a reproductive adult is a challenge that only robust individuals resistant to a variety of mortality factors will overcome. To assess whether survival traits share genetic architecture throughout the life cycle, we estimated genetic correlations for survival within the fingerling stage, and across egg, fingerling and grow-out stages in farmed rainbow trout. Genetic parameters of survival at three life cycle stages were estimated for 249 166 individuals originating from ten year classes of a pedigreed population. Despite being an important fitness component, survival traits harboured a significant but modest amount of genetic variation (h<sup>2</sup>=0·07–0·27). Weak associations between survival during egg-fry and fingerling periods, between early and late fingerling periods (r<sub>G</sub>=0·30) and generally low genetic correlations between fingerling and grow-out survival (mean r<sub>G</sub>=0·06) suggested that life-stage specific survival traits are best regarded as separate traits. However, in the sub-set of data with detailed time of death records, positive genetic correlations between early and late fingerling survival (r<sub>G</sub>=0·89) showed that during certain years the best genotypes in the early period were also among the best in the late period. That survival across the fingerling period can be genetically the same trait was also indicated by the only slightly higher heritability (h<sup>2</sup>=0·15) estimated with the survival analysis of time to death during the fingerling period compared to the analysis treating fingerling survival as a binary character (h<sup>2</sup>=0·11). The results imply that (1) inherited resistance against unknown mortality factors exists, but (2) ranking of genotypes changes across life stages.
On the contribution of Aitken mode particles to cloud droplet populations at continental background areas – a parametric sensitivity study
Aitken mode particles are potentially an important source of cloud droplets in continental background areas. In order to find out which physico-chemical properties of Aitken mode particles are most important regarding their cloud-nucleating ability, we applied a global sensitivity method to an adiabatic air parcel model simulating the number of cloud droplets formed on Aitken mode particles, CD2. The technique propagates uncertainties in the parameters describing the properties of Aitken mode to CD2. The results show that if the Aitken mode particles do not contain molecules that are able to reduce the particle surface tension more than 30% and/or decrease the mass accommodation coefficient of water, α, below 10<sup>−2</sup>, the chemical composition and modal properties may have roughly an equal importance at low updraft velocities characterized by maximum supersaturations CD2 exhibits largest sensitivity to the particle number concentration, followed by the particle size. Also the shape of the particle mode, characterized by the geometric standard deviation (GSD), can be as important as the mode mean size at low updraft velocities. Finally, the performed sensitivity analysis revealed also that the chemistry may dominate the total sensitivity of CD2 to the considered parameters if: 1) the value of α varies at least one order of magnitude more than what is expected for pure water surfaces (10<sup>−2</sup>–1), or 2) the particle surface tension varies more than roughly 30% under conditions close to reaching supersaturation.
Cloud Condensation Nuclei properties of model and atmospheric HULIS
Humic-like substances (HULIS) have been identified as a major fraction of the organic component of atmospheric aerosols. These large multifunctional compounds of both primary and secondary sources are surface active and water soluble. Hence, it is expected that they could affect activation of organic aerosols into cloud droplets. We have compared the activation of aerosols containing atmospheric HULIS extracted from fresh, aged and pollution particles to activation of size-fractionated fulvic acid from an aquatic source (Suwannee River Fulvic Acid, SRFA), and correlated it to the estimated molecular weight and measured surface tension. A correlation was found between the CCN-activation diameter of SRFA fractions and the number average molecular weight of the fraction. The lower molecular weight fractions activated at lower critical diameters, which is explained by the greater number of solute species in the droplet with decreasing molecular weight. The three aerosol-extracted HULIS samples activated at lower diameters than any of the size-fractionated or bulk SRFA. The Köhler model was found to account for activation diameters, provided that accurate physico-chemical parameters are known.
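The link between molecular weight and activation diameter reported in this abstract follows from classical Köhler theory for a fully soluble particle. The sketch below is not the authors' model: it uses the textbook two-term Köhler equation, ln S = A/D − B/D³, and solves the critical-supersaturation condition for the dry diameter. The dry density, van 't Hoff factor, surface tension and molar masses are illustrative assumptions, not values from the study.

```python
import math

R = 8.314462618   # universal gas constant, J mol^-1 K^-1
M_W = 0.018015    # molar mass of water, kg mol^-1
RHO_W = 997.0     # density of water, kg m^-3


def kelvin_a(sigma=0.072, temp=298.15):
    """Kelvin (curvature) coefficient A in metres."""
    return 4.0 * sigma * M_W / (R * temp * RHO_W)


def activation_dry_diameter(supersat, molar_mass, density=1500.0,
                            vant_hoff=1.0, sigma=0.072, temp=298.15):
    """Dry diameter (m) that activates at a given supersaturation.

    supersat is a fraction (0.002 means 0.2 %). At the critical point
    ln(S_c)^2 = 4 A^3 / (27 B), with B = i * rho_s * M_w * d_s^3 / (rho_w * M_s).
    The solute density, van 't Hoff factor and surface tension defaults
    are assumed values for illustration only.
    """
    a = kelvin_a(sigma, temp)
    ln_s = math.log(1.0 + supersat)
    b_needed = 4.0 * a ** 3 / (27.0 * ln_s ** 2)
    d_cubed = b_needed * RHO_W * molar_mass / (vant_hoff * density * M_W)
    return d_cubed ** (1.0 / 3.0)
```

In this simplified picture the activation diameter scales with the cube root of the solute molar mass, so halving the molar mass lowers the diameter by a factor of 2^(1/3). A lower surface tension reduces it further, which is consistent with the role of surface activity mentioned in the abstract.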
Pregnancy incidence and outcome before and after cervical intraepithelial neoplasia: a retrospective cohort study
We performed a retrospective cohort study of 3530 women treated for cervical intraepithelial neoplasia (CIN) in Helsinki University Central Hospital, Finland, to investigate whether CIN treatment itself affects pregnancy incidence and outcome. We estimated the incidence of live births, miscarriages, extrauterine pregnancies, molar pregnancies, and terminations of pregnancy (TOPs) before and after CIN treatment using nationwide registers. Women were followed up until death, emigration, sterilization, or the end of 2004. The incidence of pregnancy outcomes before and after the treatment was compared by calculating hazard ratios (HRs) with conditional Poisson regression. After 76,162 woman-years of follow-up, the incidence of any pregnancy remained constant over CIN treatment (HR 1.02, 95% confidence interval (CI) 0.97-1.08), but the incidence of the first pregnancy was significantly elevated after treatment (HR 1.13, 95% CI 1.03-1.23). The incidence of live births was significantly elevated after treatment (HR 1.08, 95% CI 1.01-1.15). The overall incidence of miscarriages, TOPs, extrauterine pregnancies, and molar pregnancies was not elevated. The incidence of TOPs was significantly increased in the first pregnancy (HR 1.40, 95% CI 1.15-1.72) and after treatment by the loop electrosurgical excision procedure (LEEP) (HR 1.36, 95% CI 1.15-1.60). CIN treatment did not reduce pregnancy incidence, and women had more live births after than before CIN treatment. TOPs were more common in the first pregnancy or after treatment by LEEP. We encourage research on the psychosocial consequences of CIN treatment also in other countries and settings.
Growth of sulphuric acid nanoparticles under wet and dry conditions
New particle formation, which greatly influences the number concentrations
and size distributions of an atmospheric aerosol, is often followed by a
rapid growth of freshly formed particles. The initial growth of newly
formed aerosol is the crucial process determining the fraction of nucleated
particles growing to cloud condensation nuclei sizes, which have a
significant influence on climate. In this study, we report the laboratory
observations of the growth of nanoparticles produced by nucleation of
H<sub>2</sub>SO<sub>4</sub> and water in a laminar flow tube at temperatures of 283, 293
and 303 K, under dry (a relative humidity of 1%) and wet conditions
(relative humidity of 30%) and residence times of 30, 45, 60 and 90 s.
The initial H<sub>2</sub>SO<sub>4</sub> concentration spans the range from 2 × 10<sup>8</sup>
to 1.4 × 10<sup>10</sup> molecule cm<sup>−3</sup> and the calculated
wall losses of H<sub>2</sub>SO<sub>4</sub> were assumed to be diffusion limited. The
detected particle number concentrations, measured by the Ultrafine
Condensation Particle Counter (UCPC) and Differential Mobility Particle
Sizer (DMPS), were found to depend strongly on the residence time.
Hygroscopic particle growth, presented by growth factors, was found to be in
good agreement with the previously reported studies. The experimental growth
rates ranged from 20 nm h<sup>−1</sup> to 890 nm h<sup>−1</sup> at relative humidity (RH) 1% and from
7 nm h<sup>−1</sup> to 980 nm h<sup>−1</sup> at RH 30% and were found to increase
significantly with the increasing concentration of H<sub>2</sub>SO<sub>4</sub>.
Increases in the nucleation temperature had a slight enhancing effect on the
growth rates under dry conditions. The influence of relative humidity on
growth was not consistent – at lower H<sub>2</sub>SO<sub>4</sub> concentrations, the
growth rates were higher under dry conditions while at H<sub>2</sub>SO<sub>4</sub>
concentrations greater than 1 × 10<sup>10</sup> molecule cm<sup>−3</sup>, the
growth rates were higher under wet conditions. The growth rates show only a
weak dependence on the residence time. The experimental observations were
compared with predictions made using a numerical model, which investigates
the growth of particles with three different extents of neutralization by
ammonia, NH<sub>3</sub>: (1) pure H<sub>2</sub>SO<sub>4</sub> – H<sub>2</sub>O particles; (2)
particles formed by ammonium bisulphate, (NH<sub>4</sub>)HSO<sub>4</sub>; (3) particles
formed by ammonium sulphate, (NH<sub>4</sub>)<sub>2</sub>SO<sub>4</sub>. The highest growth
rates were found for ammonium sulphate particles. Since the model accounting
for the initial H<sub>2</sub>SO<sub>4</sub> concentration predicted the experimental
growth rates correctly, our results suggest that the commonly presumed
diffusional wall losses of H<sub>2</sub>SO<sub>4</sub> in case of long-lasting
experiments are not so significant. We therefore assume that there are not
only losses of H<sub>2</sub>SO<sub>4</sub> on the wall, but also a flux of
H<sub>2</sub>SO<sub>4</sub> molecules from the wall into the flow tube, the effect being
more profound under dry conditions and at higher temperatures of the tube
wall. Based on a comparison with the atmospheric observations, our results
indicate that sulphuric acid alone cannot explain the growth rates of
particles formed in the atmosphere.
Technical note: Analytical formulae for the critical supersaturations and droplet diameters of CCN containing insoluble material
In this paper, we consider the cloud drop activation of aerosol particles consisting of water soluble material and an insoluble core. Based on the Köhler theory, we derive analytical equations for the critical diameters and supersaturations of such particles. We demonstrate the use of the equations by comparing the critical supersaturations of particles composed of ammonium sulfate and insoluble substances with those of model organic particles with varying molecular sizes.