
    Exploratory disease mapping: kriging the spatial risk function from regional count data

    BACKGROUND: There is considerable interest in the disease-mapping literature in interpolating estimates of disease occurrence or disease risk from a regional database onto a continuous surface. Among the many interpolation techniques available, the geostatistical method of kriging has been used but also criticised. RESULTS: To address these criticisms, kriging can be applied to regional estimates that have already been smoothed, where smoothing is based on empirical Bayes estimates, also known as shrinkage estimates. The empirical Bayes step shrinks the unstable and often extreme estimates towards the global or local mean and, by borrowing strength, also stabilises the variance. Negative interpolates are prevented by choosing an appropriate kriging method. The proposed mapping method is applied to the North Carolina SIDS data as well as to an example data set from veterinary epidemiology. The SIDS data are modelled without a spatial trend, and spatial interpolation is based on ordinary kriging. The second example demonstrates the method when the phenomenon under study exhibits a spatial trend and interpolation is based on universal kriging. CONCLUSION: Interpolation of the regional estimates overcomes the areal bias problem, and the resulting isopleth maps are easier to read than choropleth maps. The empirical Bayes estimate used for smoothing is related to internal standardisation in epidemiology, so the proposed concept is easily communicated to map users.
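    As a rough illustration of the two-step procedure this abstract describes, the sketch below applies a Marshall-type global empirical Bayes shrinkage to raw regional ratios and then interpolates the smoothed values by ordinary kriging with the pykrige package. The coordinates, counts, shrinkage variant and variogram choice are all assumptions for illustration, not the paper's actual data or estimator.

```python
import numpy as np
from pykrige.ok import OrdinaryKriging

# Hypothetical regional data: centroid coordinates, observed and expected counts.
x = np.array([1.0, 2.5, 3.0, 4.2, 5.1, 6.3])
y = np.array([1.2, 2.1, 3.7, 4.0, 5.5, 6.1])
observed = np.array([3, 0, 7, 2, 10, 4], dtype=float)
expected = np.array([2.1, 1.5, 4.0, 3.2, 6.5, 3.0])

# Step 1: global empirical Bayes shrinkage of the raw ratios towards the
# overall mean, which stabilises estimates for regions with small counts.
smr = observed / expected
m = observed.sum() / expected.sum()                 # global mean risk
s2 = np.average((smr - m) ** 2, weights=expected)   # weighted between-area variance
a = max(s2 - m / expected.mean(), 0.0)              # prior variance estimate
w = a / (a + m / expected)                          # shrinkage weights in [0, 1]
eb = w * smr + (1 - w) * m                          # shrunken regional estimates

# Step 2: ordinary kriging of the smoothed estimates onto a regular grid.
gridx = np.linspace(0.0, 7.0, 30)
gridy = np.linspace(0.0, 7.0, 30)
ok = OrdinaryKriging(x, y, eb, variogram_model="spherical")
risk_surface, kriging_variance = ok.execute("grid", gridx, gridy)
```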

    Use of data mining techniques to investigate disease risk classification as a proxy for compromised biosecurity of cattle herds in Wales

    Background: Biosecurity is at the forefront of the fight against infectious diseases in animal populations. Few research studies have attempted to identify and quantify the effectiveness of biosecurity against disease introduction or presence on cattle farms and, when they have, they have relied on the collection of on-farm data. Data on environmental, animal-movement, demographic/husbandry-system and density determinants of disease can be collated without additional on-farm data collection, since they have already been collected for other purposes. The aim of this study was to classify cattle herds according to their risk of disease presence, as a proxy for compromised biosecurity, in the cattle population of Wales in 2004 for risk-based surveillance purposes. Results: Three data mining methods were applied: logistic regression, classification trees and factor analysis. Using the cattle holding population in Wales, a holding was considered positive if bovine TB or at least one of the ten most frequently diagnosed infectious or transmissible non-notifiable diseases in England and Wales, according to the Veterinary Investigation Surveillance Report (VIDA), had been diagnosed in 2004. High-risk holdings can be described as open, large cattle herds located in high-density cattle areas with frequent movements off to many locations within Wales. Additional risk is associated with the holding being a dairy enterprise and with a large farming area. Conclusion: This work has demonstrated the potential of mining various livestock-relevant databases to obtain generic criteria for classifying the biosecurity risk of individual cattle herds. Despite the data and analytical constraints, the described risk profiles are highly specific and present variable sensitivity depending on the model specification. Risk profiling of farms provides a tool for designing targeted surveillance activities for endemic or emerging diseases, regardless of the prior amount of information available on biosecurity at farm level. As the delivery of practical evidence-based information and advice is one of the priorities of Defra's Animal Health and Welfare Strategy (AHWS), data-driven models derived from existing databases need to be developed that can be used to inform activities during outbreaks of endemic diseases and to help design surveillance activities.
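    A minimal sketch of one of the three named methods, logistic regression, applied to herd-level risk classification. The predictor names, the synthetic data and the label-generating rule are assumptions invented for illustration; they only mimic the kind of movement, density and husbandry determinants the abstract describes.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical herd-level records standing in for collated movement,
# demographic and density determinants.
rng = np.random.default_rng(0)
n = 500
herds = pd.DataFrame({
    "herd_size": rng.integers(10, 600, n),
    "cattle_density": rng.uniform(0.5, 4.0, n),   # local cattle density
    "moves_off": rng.integers(0, 40, n),          # movements off the holding per year
    "is_dairy": rng.integers(0, 2, n),
})

# Synthetic label: larger, denser, more open dairy herds are more often "positive".
logit = (0.004 * herds.herd_size + 0.5 * herds.cattle_density
         + 0.05 * herds.moves_off + 0.4 * herds.is_dairy - 4.0)
herds["disease_present"] = rng.uniform(size=n) < 1 / (1 + np.exp(-logit))

# Fit and evaluate a logistic regression risk classifier.
X_train, X_test, y_train, y_test = train_test_split(
    herds.drop(columns="disease_present"), herds["disease_present"], random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```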

    Integrated smoothed location model and data reduction approaches for multi variables classification

    The smoothed location model is a classification rule that handles mixtures of continuous and binary variables simultaneously. The rule discriminates between groups in a parametric form using the conditional distribution of the continuous variables given each pattern of the binary variables. To conduct a practical classification analysis, the objects must first be sorted into the cells of a multinomial table generated from the binary variables, and the parameters in each cell are then estimated from the sorted objects. In many situations, however, the estimated parameters are poor if the number of binary variables is large relative to the sample size. Many binary variables create a large number of empty multinomial cells, leading to severe sparsity and ultimately to exceedingly poor performance of the constructed rule; in the worst case, the rule cannot be constructed at all. To overcome these shortcomings, this study proposes new strategies for extracting an adequate set of variables that yields optimum performance of the rule. Combinations of two extraction techniques are introduced, namely 2PCA and PCA+MCA, with new cut-points for the eigenvalue and the total variance explained, to determine adequate extracted variables that lead to a minimum misclassification rate. The outcomes of these extraction techniques are used to construct smoothed location models, producing two new classification approaches called 2PCALM and 2DLM. Numerical evidence from simulation studies shows no significant difference in misclassification rate between the extraction techniques for normal and non-normal data. Both proposed approaches are, however, slightly affected by non-normal data and severely affected by highly overlapping groups. Investigations on some real data sets show that the two approaches are competitive with, and better than, other existing classification methods. Overall, both proposed approaches can be considered improvements to the location model and alternatives to other classification methods, particularly for handling mixed variables with a large number of binary variables.
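    A rough sketch of the "reduce each block, then classify on the scores" idea. It is not the authors' method: MCA is replaced by a plain PCA on the binary indicators as a stand-in, LDA is used as a simplified surrogate for the smoothed location model, and the 80% explained-variance cut-point and the simulated data are arbitrary assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical mixed data: 5 continuous and 8 binary variables, two groups.
rng = np.random.default_rng(1)
n = 200
cont = rng.normal(size=(n, 5))
binary = rng.integers(0, 2, size=(n, 8)).astype(float)
groups = rng.integers(0, 2, size=n)

# Reduce each block separately, keeping components until a chosen share of
# variance is explained (the paper tunes this cut-point; 0.8 here is arbitrary).
cont_scores = PCA(n_components=0.8).fit_transform(cont)
bin_scores = PCA(n_components=0.8).fit_transform(binary)   # stand-in for MCA

# Classify on the extracted scores; LDA is a simplified surrogate for the
# smoothed location model built on the reduced variables.
X = np.hstack([cont_scores, bin_scores])
rule = LinearDiscriminantAnalysis().fit(X, groups)
print("training accuracy:", rule.score(X, groups))
```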

    A review on quantification learning

    The task of quantification consists of providing an aggregate estimate (e.g. the class distribution in a classification problem) for unseen test sets, applying a model trained on a training set with a different data distribution. Several real-world applications demand methods of this kind, which do not require predictions for individual examples and instead focus on obtaining accurate estimates at an aggregate level. During the past few years, several quantification methods have been proposed from different perspectives and with different goals. This paper presents a unified review of the main approaches with the aim of serving as an introductory tutorial for newcomers in the field.
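    A small sketch of two baseline quantification methods of the kind such reviews cover: Classify & Count, and its adjusted variant that corrects the raw count with the classifier's true- and false-positive rates. The synthetic data, the shifted test distribution and the choice of logistic regression are assumptions for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Hypothetical binary setting: estimate the positive-class prevalence in a test
# set whose distribution differs from the training set's.
rng = np.random.default_rng(2)
X_train = rng.normal(size=(1000, 4))
y_train = (X_train[:, 0] + rng.normal(size=1000) > 0).astype(int)
X_test = rng.normal(loc=0.7, size=(400, 4))   # shifted test distribution

clf = LogisticRegression().fit(X_train, y_train)

# Classify & Count: raw proportion of predicted positives in the test set.
cc = clf.predict(X_test).mean()

# Adjusted Count: correct CC using tpr/fpr estimated by cross-validation.
pred_cv = cross_val_predict(clf, X_train, y_train, cv=5)
tpr = pred_cv[y_train == 1].mean()
fpr = pred_cv[y_train == 0].mean()
adjusted = np.clip((cc - fpr) / (tpr - fpr), 0, 1)
print(f"CC estimate: {cc:.3f}, adjusted estimate: {adjusted:.3f}")
```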

    Support Vector Methods for Higher-Level Event Extraction in Point Data

    Phenomena occur both in space and time. Correspondingly, the ability to model spatiotemporal behavior translates into the ability to model phenomena as they occur in reality. Given the complexity inherent in integrating spatial and temporal dimensions, however, the establishment of computational methods for spatiotemporal analysis has proven relatively elusive. Nonetheless, one method, the spatiotemporal helix, has emerged from the field of video processing. Designed to efficiently summarize and query the deformation and movement of spatiotemporal events, the spatiotemporal helix has been demonstrated as capable of describing and differentiating the evolution of hurricanes from sequences of images. Being derived from image data, the representations of events for which the spatiotemporal helix was originally created appear in areal form (e.g., a hurricane covering several square miles is represented by groups of pixels). Many sources of spatiotemporal data, however, are not in areal form and instead appear as points. Examples of spatiotemporal point data include those from an epidemiologist recording the time and location of cases of disease and environmental observations collected by a geosensor at the point of its location. As points, these data cannot be directly incorporated into the spatiotemporal helix for analysis. However, because the analytic potential of clouds of point data is limited, phenomena represented by point data are often described in terms of events. Defined as change units localized in space and time, the concept of events allows for analysis at multiple levels. For instance, lower-level events refer to occurrences of interest described by single data streams at point locations (e.g., an individual case of a certain disease or a significant change in chemical concentration in the environment) while higher-level events describe occurrences of interest derived from aggregations of lower-level events and are frequently described in areal form (e.g., a disease cluster or a pollution cloud). Considering that these higher-level events appear in areal form, they could potentially be incorporated into the spatiotemporal helix. With deformation being an important element of spatiotemporal analysis, however, at the crux of a process for spatiotemporal analysis based on point data would be accurate translation of lower-level event points into representations of higher-level areal events. A limitation of current techniques for the derivation of higher-level events is that they impose an a priori bias regarding the shape of higher-level events (e.g., elliptical, convex, linear), which could limit the description of the deformation of higher-level events over time. The objective of this research is to propose two newly developed kernel methods, support vector clustering (SVC) and support vector machines (SVMs), as means for translating lower-level event points into higher-level event areas that follow the distribution of lower-level points. SVC is suggested for the derivation of higher-level events arising in point process data while SVMs are explored for their potential with scalar field data (i.e., spatially continuous real-valued data). Developed in the field of machine learning to solve complex non-linear problems, both of these methods are capable of producing highly non-linear representations of higher-level events that may be more suitable than existing methods for spatiotemporal analysis of deformation.
To introduce these methods, this thesis first establishes context through a description of existing techniques. This discussion leads to a technical explanation of the mechanics of SVC and SVMs and to the implementation of each of the kernel methods on simulated datasets. Results from these simulations inform discussion regarding the application potential of SVC and SVMs.
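    A minimal sketch of the core idea of tracing a non-convex, higher-level event area around lower-level event points with a kernel method. Support vector clustering is not available in scikit-learn, so a one-class SVM with an RBF kernel is used here as a stand-in for that step; the point data, nu and gamma values are arbitrary assumptions.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Hypothetical lower-level event points (e.g. geocoded disease cases) forming
# two irregular clusters.
rng = np.random.default_rng(3)
points = np.vstack([
    rng.normal(loc=[0, 0], scale=0.3, size=(60, 2)),
    rng.normal(loc=[2, 1], scale=0.4, size=(40, 2)),
])

# One-class SVM with an RBF kernel: its decision boundary plays the role of the
# higher-level event outline, following the distribution of the points rather
# than a pre-imposed shape.
model = OneClassSVM(kernel="rbf", nu=0.05, gamma=2.0).fit(points)

# Evaluate the decision function on a grid; cells with a non-negative value lie
# inside the derived higher-level event area.
xx, yy = np.meshgrid(np.linspace(-1.5, 3.5, 200), np.linspace(-1.5, 2.5, 200))
grid = np.column_stack([xx.ravel(), yy.ravel()])
inside = (model.decision_function(grid) >= 0).reshape(xx.shape)
print("fraction of grid inside the higher-level event area:", inside.mean())
```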

    Risk prediction and an injectable collagen material for intervertebral disc degeneration

    This research focuses on early prediction and treatment of intervertebral disc degeneration (IVDD). In Phase 1, machine learning algorithms were evaluated for predicting patients' risk of intervertebral disc degeneration, using factors associated with IVDD taken from patient medical histories. Several classification algorithms were used to develop predictive models. Results demonstrated that machine learning algorithms can be used to predict IVDD risk and showed the potential for developing an app from these predictive models. Phase 2 focused on the development of a collagen-based, gold nanoparticle material for intervertebral disc regeneration. Gold nanoparticles were conjugated to viscoelastic collagen using a natural crosslinker, genipin. This material was then characterized to evaluate its ability to serve as a treatment for chronic back pain caused by IVDD. Results demonstrated successful attachment of the gold nanoparticles to the collagen using the genipin crosslinker. Overall, the characterization studies of the collagen composite were successful and demonstrated potential for further application in IVDD treatment.

    New approaches to using scientific data - statistics, data mining and related technologies in research and research training

    This paper surveys technological changes that affect the collection, organization, analysis and presentation of data. It considers changes or improvements that ought to influence the research process and direct the use of technology, and explores implications for graduate research training. The insights of Evidence-Based Medicine are widely relevant across many different research areas and provide a helpful context within which to discuss the use of technological change to improve the research process. Systematic data-based overview has to date received inadequate attention, both in research and in research training. Sharing of research data once results are published would both assist systematic overview and allow further scrutiny where published analyses seem deficient. Deficiencies in data collection and published data analysis are surprisingly common. Technologies that offer new perspectives on data collection and analysis include data warehousing, data mining, new approaches to data visualization and a variety of computing technologies in the tradition of knowledge engineering and machine learning. There is a large overlap of interest with statistics, which is itself changing dramatically as a result of the interplay between theoretical development and the power of new computational tools. I comment briefly on other developing mathematical science application areas, notably molecular biology. The internet offers new possibilities for cooperation across institutional boundaries, for exchange of information between researchers, and for dissemination of research results. Research training ought to equip students both to use their research skills in areas different from those in which they were immediately trained, and to respond to the challenge of steadily more demanding standards. There should be an increased emphasis on training to work cooperatively.

    Conceptual Learning: Enhancing Student Understanding of Physiology

    Students are leaving undergraduate science programs without the knowledge and skills they are expected to have. This is apparent in professional programs, such as medical and veterinary school, where students do not possess the critical thinking skills necessary to be successful. Physiology is a required discipline in these professional programs, and often beforehand as a prerequisite. Physiology classrooms are an excellent place to teach critical thinking skills because the content consists of integrated processes. Therefore, one study investigated whether focusing on physiological concepts improved student understanding of physiology in both a non-physiological science course, Invertebrate Zoology, and an undergraduate physiology course. An educational intervention was used in Invertebrate Zoology, where students were exposed to human physiology concepts similar to the comparative physiology concepts they had learned during the semester. A pre-/post-test was used to assess learning gains. In a second study, multimedia file usage was correlated with student exam scores in a physiology course, to determine whether providing additional study materials focused on specific concepts improved student understanding as assessed by exam scores. Overall, these studies indicate that encouraging assimilation of new concepts that expand upon material from lecture may help students gain a more complete understanding of a concept. The integration of these concepts into pre-existing conceptual frameworks may serve to teach students valuable critical thinking skills, such as evaluating new ideas within their current understanding and synthesizing new content with existing information. Focusing on this type of conceptual learning may enable students to apply content knowledge and think through problems. Additionally, focusing on concepts may enable students to improve their understanding of material without being overwhelmed by content.

    Stat.Edu’21 - New Perspectives in Statistics Education. Proceedings of the International Conference Stat.Edu’21

    The volume collects the papers presented at the conference “Stat.Edu’21 - New Perspectives in Statistics Education”, held at the Department of Political Sciences of the University of Naples Federico II (25-26 March 2021). The conference was the final event of “ALEAS - Adaptive LEArning in Statistics”, an ERASMUS+ project (https://aleas-project.eu) carried out from 2018 to 2021 to design and implement an adaptive learning system able to offer personalised learning paths to students, with the purpose of providing them remedial advice to deal with “statistics anxiety”. Stat.Edu’21 aimed at stimulating discussion and contributions around the central theme of ALEAS, the development of adaptive learning systems in higher education as a complementary tool for traditional courses, and at promoting a community of practice in this field. The volume collects 12 papers reporting reflections and quantitative studies covering three main topics: the assessment of the effects of anxiety, and more generally of different attitudes, on the study of statistics; tools and methods for the assessment of training paths; and technology-based learning experiences.
