    Using machine learning to predict potential online gambling addicts.

    Betting addicts on gambling websites are difficult to identify because online gambling is by nature different from real gambling. This thesis attempts to identify potential gambling addicts on an online gambling website X using machine learning models. The models are based on each user's usage history on the website. The usage data is collected for each user from the site using JavaScript. The data is then analyzed and stored in a database. Machine learning models are then trained using Support Vector Machines with the data of users who are by definition problem gamblers. The system then makes a prediction for all active users based on their recent usage history. The final results include an automated system for daily learning and prediction of potential problem gamblers who show early signs of gambling addiction.

    Predicting a User's Next Cell With Supervised Learning Based on Channel States

    Knowing a user's next cell allows more efficient resource allocation and enables new location-aware services. To anticipate the cell a user will hand over to, we introduce a new machine learning based prediction system. Therein, we formulate the prediction as a classification problem based on information that is readily available in cellular networks. Using only Channel State Information (CSI) and handover history, we perform classification by embedding Support Vector Machines (SVMs) into an efficient pre-processing structure. Simulation results from a Manhattan Grid scenario and from a realistic radio map of downtown Frankfurt show that our system provides timely prediction at high accuracy. Comment: The 14th IEEE International Workshop on Signal Processing Advances for Wireless Communications (SPAWC), Darmstadt, Germany (2013)

    A robust morphological classification of high-redshift galaxies using support vector machines on seeing limited images. I Method description

    We present a new non-parametric method to quantify morphologies of galaxies based on a particular family of learning machines called support vector machines. The method, which can be seen as a generalization of the classical CAS classification but with an unlimited number of dimensions and non-linear boundaries between decision regions, is fully automated and thus particularly well adapted to large cosmological surveys. The source code is available for download at http://www.lesia.obspm.fr/~huertas/galsvm.html. To test the method, we use a seeing-limited near-infrared ($K_s$ band, $2.16\,\mu m$) sample observed with WIRCam at CFHT at a median redshift of $z\sim0.8$. The machine is trained with a simulated sample built from a local visually classified sample from the SDSS chosen in the high-redshift sample's rest-frame (i band, $0.77\,\mu m$) and artificially redshifted to match the observing conditions. We use a 12-dimensional volume, including 5 morphological parameters and other characteristics of galaxies such as luminosity and redshift. We show that a qualitative separation in two main morphological types (late type and early type) can be obtained with an error lower than 20% up to the completeness limit of the sample ($K_{AB}\sim 22$), which is more than 2 times better than what would be obtained with a classical C/A classification on the same sample and indeed comparable to space data. The method is optimized to solve a specific problem, offering an objective and automated estimate of errors that enables a straightforward comparison with other surveys. Comment: 11 pages, 7 figures, 3 tables. Submitted to A&A. High resolution images are available on request

    Predicting Pancreatic Cancer Using Support Vector Machine

    This report presents an approach to predicting pancreatic cancer using the Support Vector Machine classification algorithm. The research objective of this project is to predict pancreatic cancer using genomic data alone, clinical data alone, and a combination of genomic and clinical data. We have used real genomic data having 22,763 samples and 154 features per sample. We have also created synthetic clinical data having 400 samples and 7 features per sample in order to assess prediction accuracy on clinical data alone. To validate the hypothesis, we have combined the synthetic clinical data with a subset of features from the real genomic data. In our results, we observed that prediction accuracy, precision, and recall with just genomic data are 80.77%, 20%, and 4%. Prediction accuracy, precision, and recall with just synthetic clinical data are 93.33%, 95%, and 30%, while prediction accuracy, precision, and recall for the combination of real genomic and synthetic clinical data are 90.83%, 10%, and 5%. The combination of real genomic and synthetic clinical data decreased the accuracy since the genomic data is weakly correlated. Thus we conclude that the combination of genomic and clinical data does not improve pancreatic cancer prediction accuracy. A dataset with more significant genomic features might help to predict pancreatic cancer more accurately.
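
The abstract above reports high accuracy alongside very low recall, a pattern that is typical when the positive class is rare. The following is a minimal sketch of how the three metrics are computed from binary labels. The function name `classification_metrics` and the toy labels are assumptions made for illustration; they are not the report's code or data.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when nothing is predicted or labeled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical, imbalanced toy data: 10 positives and 90 negatives.
y_true = [1] * 10 + [0] * 90
# The classifier finds only 1 of the 10 positives and raises 1 false alarm.
y_pred = [1] + [0] * 9 + [1] + [0] * 89

accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the classifier is right on 90 of 100 samples while detecting only one of ten positives, which is why accuracy alone says little about a detector for a rare condition.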

    Investigation of gas circulator response to load transients in nuclear power plant operation

    Gas circulator units are a critical component of the Advanced Gas-cooled Reactor (AGR), one of the nuclear power plant (NPP) designs in current use within the UK. The condition monitoring of these assets is central to the safe and economic operation of the AGRs and is achieved through analysis of vibration data. Due to the dynamic nature of reactor operation, each plant item is subject to a variety of system transients, which engineers are required to identify and reason about with regard to asset health. The AGR design enables low power refueling (LPR), which results in a change in operational state for the gas circulators, with the vibration profile of each unit reacting accordingly. The changing conditions these items are subject to during LPR and other such events may impact on the assets. From these assumptions, it is proposed that useful information on gas circulator condition can be determined from the analysis of vibration response to the LPR event. This paper presents an investigation into asset vibration during an LPR. A machine learning classification approach is used in order to define each transient instance and its behavioral features statistically. Classification and reasoning about the regular transients such as the LPR represents the primary stage in modeling higher complexity events for advanced event-driven diagnostics, which may provide an enhancement to the current methodology, which uses alarm boundary limits.

    Six Noise Type Military Sound Classifier

    Blast noise from military installations often has a negative impact on the quality of life of residents living in nearby communities. This negatively impacts the military's testing & training capabilities due to restrictions, curfews, or range closures enacted to address noise complaints. In order to more directly manage noise around military installations, accurate noise monitoring has become a necessity. Although most noise monitors are simple sound level meters, more recent ones are capable of discerning blasts from ambient noise with some success. Investigators at the University of Pittsburgh previously developed a more advanced noise classifier that can discern between wind, aircraft, and blast noise, while simultaneously lowering the measurement threshold. Recent work will be presented from the development of a more advanced classifier that identifies additional classes of noise such as machine gun fire, vehicles, and thunder. Additional signal metrics were explored given the increased complexity of the classifier. By broadening the types of noise the system can accurately classify and increasing the number of metrics, a new system was developed with increased blast noise accuracy, decreased number of missed events, and significantly fewer false positives.

    Prediction of delayed graft function after kidney transplantation : comparison between logistic regression and machine learning methods

    Background: Predictive models for delayed graft function (DGF) after kidney transplantation are usually developed using logistic regression. We want to evaluate the value of machine learning methods in the prediction of DGF. Methods: 497 kidney transplantations from deceased donors at the Ghent University Hospital between 2005 and 2011 are included. A feature elimination procedure is applied to determine the optimal number of features, resulting in 20 selected parameters (24 parameters after conversion to indicator parameters) out of 55 retrospectively collected parameters. Subsequently, 9 distinct types of predictive models are fitted using the reduced data set: logistic regression (LR), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), support vector machines (SVMs; using linear, radial basis function and polynomial kernels), decision tree (DT), random forest (RF), and stochastic gradient boosting (SGB). Performance of the models is assessed by computing sensitivity, positive predictive values and area under the receiver operating characteristic curve (AUROC) after 10-fold stratified cross-validation. AUROCs of the models are pairwise compared using Wilcoxon signed-rank test. Results: The observed incidence of DGF is 12.5 %. DT is not able to discriminate between recipients with and without DGF (AUROC of 52.5 %) and is inferior to the other methods. SGB, RF and polynomial SVM are mainly able to identify recipients without DGF (AUROC of 77.2, 73.9 and 79.8 %, respectively) and only outperform DT. LDA, QDA, radial SVM and LR also have the ability to identify recipients with DGF, resulting in higher discriminative capacity (AUROC of 82.2, 79.6, 83.3 and 81.7 %, respectively), which outperforms DT and RF. Linear SVM has the highest discriminative capacity (AUROC of 84.3 %), outperforming each method, except for radial SVM, polynomial SVM and LDA. However, it is the only method superior to LR. 
    Conclusions: The discriminative capacities of LDA, linear SVM, radial SVM and LR are the only ones above 80 %. None of the pairwise AUROC comparisons between these models is statistically significant, except linear SVM outperforming LR. Additionally, the sensitivity of linear SVM to identify recipients with DGF is amongst the three highest of all models. Due to both reasons, the authors believe that linear SVM is most appropriate to predict DGF.
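
The evaluation described in this abstract rests on two building blocks: stratified k-fold splitting and the AUROC. The sketch below shows one simple way to implement each in plain Python. The helper names `stratified_folds` and `auroc`, the round-robin assignment, and the toy labels are assumptions for illustration, not the authors' pipeline. The AUROC is computed as the fraction of (positive, negative) pairs in which the positive sample receives the higher score, counting ties as one half.

```python
import random


def stratified_folds(y, k, seed=0):
    """Split sample indices into k folds that keep class proportions similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        indices = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def auroc(y_true, scores):
    """AUROC as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy labels: 10 positives among 80 samples (12.5 % incidence).
y = [1] * 10 + [0] * 70
folds = stratified_folds(y, k=10)
print([sum(y[i] for i in fold) for fold in folds])  # positives per fold

# Small worked AUROC example with made-up scores.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

With a rare outcome, stratification matters because a plain random split can leave some folds without any positive case, in which case the AUROC for that fold is undefined.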