4 research outputs found

    The Five Factor Model of personality and evaluation of drug consumption risk

    Full text link
    The problem of evaluating an individual's risk of drug consumption and misuse is highly important. An online survey methodology was employed to collect data including Big Five personality traits (NEO-FFI-R), impulsivity (BIS-11), sensation seeking (ImpSS), and demographic information. The data set contained information on the consumption of 18 central nervous system psychoactive drugs. Correlation analysis demonstrated the existence of groups of drugs with strongly correlated consumption patterns. Three correlation pleiades were identified, each named after its central drug: the ecstasy, heroin, and benzodiazepines pleiades. An exhaustive search was performed to select the most effective subset of input features and data mining methods to classify users and non-users for each drug and pleiad. A number of classification methods were employed (decision tree, random forest, k-nearest neighbors, linear discriminant analysis, Gaussian mixture, probability density function estimation, logistic regression, and naïve Bayes) and the most effective classifier was selected for each drug. The quality of classification was surprisingly high, with sensitivity and specificity (evaluated by leave-one-out cross-validation) greater than 70% for almost all classification tasks. The best results, with sensitivity and specificity greater than 75%, were achieved for cannabis, crack, ecstasy, legal highs, LSD, and volatile substance abuse (VSA). Comment: Significantly extended report with 67 pages, 27 tables, 21 figures.
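As a rough illustration of how sensitivity and specificity can be estimated by leave-one-out cross-validation, the sketch below uses a plain k-nearest-neighbors classifier, one of the methods listed in the abstract. The function name `loo_sensitivity_specificity`, the single feature, and the toy data are hypothetical and are not taken from the study.

```python
def loo_sensitivity_specificity(features, labels, k=3):
    """Leave-one-out CV for a k-nearest-neighbors classifier (label 1 = user)."""
    tp = tn = fp = fn = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        # Hold out sample i and use every other sample as the training set.
        rest = [(f, l) for j, (f, l) in enumerate(zip(features, labels)) if j != i]
        # Sort the training samples by squared Euclidean distance to the held-out one.
        rest.sort(key=lambda fl: sum((a - b) ** 2 for a, b in zip(fl[0], x)))
        votes = sum(l for _, l in rest[:k])
        pred = 1 if 2 * votes > k else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn)  # fraction of users correctly identified
    specificity = tn / (tn + fp)  # fraction of non-users correctly identified
    return sensitivity, specificity


# Hypothetical toy data: one score per respondent; one non-user sits among the users.
features = [(0.1,), (0.2,), (0.3,), (0.4,), (0.95,), (0.9,), (1.0,), (1.1,), (1.2,)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1]

sens, spec = loo_sensitivity_specificity(features, labels, k=3)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Each respondent is classified by a model that never saw that respondent, so the two rates are not inflated by testing on the training data.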

    Handling missing data in large healthcare dataset: A case study of unknown trauma outcomes

    No full text
    Handling of missing data is one of the main tasks in data preprocessing, especially in large public service datasets. We have analysed data from the Trauma Audit and Research Network (TARN) database, the largest trauma database in Europe. For the analysis we used 165,559 trauma cases. Among them, there are 19,289 cases (11.35%) with unknown outcome. We have demonstrated that these outcomes are not missing 'completely at random' and, hence, it is impossible simply to exclude these cases from analysis despite the large amount of available data. We have developed a system of non-stationary Markov models for the handling of missing outcomes and validated these models on the data of 15,437 patients who arrived at TARN hospitals later than 24 h but within 30 days from injury. We used these Markov models for the analysis of mortality. In particular, we corrected the observed fraction of deaths. Two naïve approaches give 7.20% (available case study) or 6.36% (if we assume that all unknown outcomes are 'alive'). The corrected value is 6.78%. Following the seminal paper of Trunkey (1983 [15]), the multimodality of mortality curves has become a much discussed idea. For the whole analysed TARN dataset the coefficient of mortality monotonically decreases in time, but the stratified analysis of mortality gives a different result: for lower severities the coefficient of mortality is a non-monotonic function of the time after injury and may have maxima at the second and third weeks. The approach developed here can be applied to various healthcare datasets which experience the problem of lost patients and missing outcomes.
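The two naïve mortality estimates quoted above differ only in their denominator, which a short calculation makes explicit. This is a sketch using the case counts and the 7.20% figure from the abstract; the variable names are illustrative, and the death count is implied by the percentage rather than reported.

```python
total_cases = 165_559        # all trauma cases analysed
unknown_outcomes = 19_289    # cases with unknown outcome
available_case_mortality = 0.0720  # deaths / cases with known outcome

known_outcomes = total_cases - unknown_outcomes

# Death count implied by the available-case estimate (an inferred value,
# not a figure reported in the abstract).
implied_deaths = available_case_mortality * known_outcomes

# Treating every unknown outcome as 'alive' keeps the deaths fixed
# but divides by all cases instead of only the known ones.
all_alive_mortality = implied_deaths / total_cases

print(f"all-alive estimate: {all_alive_mortality:.2%}")
```

The corrected value of 6.78% lies between these two bounds, which is what one would expect if some, but not all, of the unknown outcomes were deaths.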

    Jet physics in electron--proton scattering

    No full text
    Hadronic jets in electron–proton collisions at HERA have been used for some considerable time as a tool for tests of the theory of strong interactions, quantum chromodynamics. Using jet final states, basic concepts like the factorisation ansatz for cross-section calculations, the perturbative approach to the cross section and the universality of the proton parton distribution functions can be examined. More concretely, jet measurements provide ready access to the strong coupling of QCD, α_s, and to the parton distributions. In this report, an overview of jet results from the HERA experiments H1 and ZEUS and their interpretation is given together with a description of the theoretical foundations of jet physics in electron–proton collisions and of the experimental environment at HERA. Special emphasis is put on extractions of α_s values and on the influence of jet data on fits of the proton parton distribution functions. Where useful, the HERA results are also discussed in the light of results from other colliders like LEP, the Tevatron or the LHC. The central message from these studies is that QCD does not only describe most of the measurements very well, but that QCD at HERA has achieved the status of a precision theory. On the other hand it is shown that further understanding of problematic issues relies critically on theoretical progress in the form of improved models or of increased precision in analytical calculations.