11 research outputs found

    Data mining techniques as an alternative to classical multivariate statistical discrimination and classification techniques

    This paper briefly describes one of the lines of research under way in the Departamento de Matemática of the Facultad de Ciencias Exactas y Naturales at the Universidad Nacional de La Pampa, concerning multivariate discriminant and classification methods and their sensitivity and reliability when applied to various real or simulated problems. Although the study can focus on methods that may be regarded as classical and more statistical in nature, the capacity to generate and collect data has undeniably grown substantially in recent years. These enormous volumes of data contain a great deal of information that would be difficult, if not impossible, to access with classical methods. Data mining techniques make it possible to analyze these masses of data in search of patterns and predictions that yield useful information. The aim, therefore, is to compare the classical statistical techniques with data mining techniques in discrimination and classification tasks, establishing similarities and differences and analyzing the estimates they produce when applied to real or simulated problems. Track: Databases and Data Mining. Red de Universidades con Carreras en Informática (RedUNCI).

    Tight Combinatorial Generalization Bounds for Threshold Conjunction Rules

    We propose a combinatorial technique for obtaining tight data-dependent generalization bounds based on a splitting and connectivity graph (SC-graph) of the set of classifiers. We apply this approach to a parametric set of conjunctive rules and propose an algorithm for effective SC-bound computation. Experiments on 6 data sets from the UCI ML Repository show that the SC-bound helps to learn more reliable rule-based classifiers as compositions of less overfitted rules.

    Rough set methodology in meta-analysis - a comparative and exploratory analysis

    We study the applicability of the pattern recognition methodology "rough set data analysis" (RSDA) in the field of meta-analysis. We give a summary of the mathematical and statistical background and then apply the theory to a meta-analysis of empirical studies dealing with the deterrent effect introduced by Becker and Ehrlich. Results are compared with a previously devised meta-regression analysis. We find that RSDA can be used to discover information overlooked by other methods, to preprocess the data for further study, and to strengthen results previously found by other methods. Keywords: Rough Data Set, RSDA, Meta Analysis, Data Mining, Pattern Recognition, Deterrence, Criminometrics

    Implementation of decision trees for embedded systems

    This research work develops incremental learning decision tree solutions suitable for real-time embedded systems by virtue of having both a defined memory requirement and an upper bound on the computation time per training vector. In addition, the work provides embedded systems with the capability to rapidly process and train on streamed data, and adopts electronic hardware solutions to improve the performance of the developed algorithm. Two novel decision tree approaches, namely the Multi-Dimensional Frequency Table (MDFT) and the Hashed Frequency Table Decision Tree (HFTDT), represent the core of this research work. Both methods successfully incorporate a frequency table technique to produce a complete decision tree. The MDFT and HFTDT learning methods were designed with the ability to generate application-specific code for both training and classification purposes according to the requirements of the targeted application. The MDFT allows the memory architecture to be specified statically before learning takes place within a deterministic execution time. The HFTDT method is a development of the MDFT in which a reduction in the memory requirements is achieved within a deterministic execution time. The HFTDT achieved low memory usage when compared to existing decision tree methods, and hardware acceleration reduced the execution time by up to a factor of 10.
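    The frequency table idea in this abstract can be illustrated with a minimal sketch. It is not the MDFT or HFTDT algorithm from the thesis: the class name, the per-feature table layout, and the use of Gini impurity to pick a split are all assumptions made for illustration. The sketch only shows how a statically sized count table gives a fixed memory footprint and a fixed number of operations per training vector.

```python
from typing import List, Sequence


class FrequencyTableLearner:
    """Hypothetical fixed-memory learner (not the thesis's MDFT/HFTDT).

    Keeps counts[feature][bin][class] for pre-discretized features.
    """

    def __init__(self, n_features: int, n_bins: int, n_classes: int) -> None:
        self.n_features = n_features
        # The table is allocated once here and never grows during training.
        self.counts: List[List[List[int]]] = [
            [[0] * n_classes for _ in range(n_bins)] for _ in range(n_features)
        ]
        self.class_totals: List[int] = [0] * n_classes

    def update(self, x: Sequence[int], label: int) -> None:
        """Consume one training vector with exactly n_features + 1 increments."""
        for feature, bin_index in enumerate(x):
            self.counts[feature][bin_index][label] += 1
        self.class_totals[label] += 1

    def gini_after_split(self, feature: int) -> float:
        """Weighted Gini impurity if the data were split on this feature."""
        total = sum(self.class_totals)
        if total == 0:
            return 0.0
        impurity = 0.0
        for bin_counts in self.counts[feature]:
            n = sum(bin_counts)
            if n == 0:
                continue  # empty bins contribute nothing
            gini = 1.0 - sum((c / n) ** 2 for c in bin_counts)
            impurity += (n / total) * gini
        return impurity

    def best_split_feature(self) -> int:
        """Pick the feature with the lowest impurity, using only the table."""
        return min(range(self.n_features), key=self.gini_after_split)
```

    Because split decisions are computed from the counts alone, no training vectors need to be stored, which is the property that makes a table of this kind attractive for streamed data on memory-constrained devices.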

    Flood hazard hydrology: interdisciplinary geospatial preparedness and policy

    Thesis (Ph.D.), University of Alaska Fairbanks, 2017. Floods rank as the deadliest and most frequently occurring natural hazard worldwide, and in 2013 floods in the United States ranked second only to wind storms in accounting for loss of life and damage to property. While flood disasters remain difficult to predict accurately, more precise forecasts and a better understanding of the frequency, magnitude and timing of floods can help reduce the loss of life and costs associated with the impact of flood events. There is a common perception that 1) local-to-national-level decision makers do not have the accurate, reliable and actionable data and knowledge they need in order to make informed flood-related decisions, and 2) because of science-policy disconnects, critical flood and scientific analyses and insights are failing to influence policymakers in national water resource and flood-related decisions that have significant local impact. This dissertation explores these perceived information gaps and disconnects, and seeks to answer the question of whether flood data can be accurately generated, transformed into useful actionable knowledge for local flood event decision makers, and then effectively communicated to influence policy. Using an interdisciplinary mixed-methods research design, this thesis develops a methodological framework and interpretative lens for each of three distinct stages of flood-related information interaction: 1) data generation: using machine learning to estimate streamflow flood data for forecasting and response; 2) knowledge development and sharing: creating a geoanalytic visualization decision support system for flood events; and 3) knowledge actualization: using heuristic toolsets for translating scientific knowledge into policy action.
    Each stage is elaborated on in one of three distinct research papers, incorporated as chapters in this dissertation, that focus on developing practical data and methodologies that are useful to scientists, local flood event decision makers, and policymakers. Data and analytical results of this research indicate that, if certain conditions are met, it is possible to provide local decision makers and policymakers with the useful actionable knowledge they need to make timely and informed decisions.