17 research outputs found

    Advances in Data Mining Knowledge Discovery and Applications

    Advances in Data Mining Knowledge Discovery and Applications aims to help data miners, researchers, scholars, and PhD students who wish to apply data mining techniques. The primary contribution of this book is to highlight frontier fields and implementations of knowledge discovery and data mining. It may seem that the same things are repeated, but in general the same approaches and techniques can help in different fields and areas of expertise. This book presents knowledge discovery and data mining applications in two sections. Data mining is known to cover statistics, machine learning, data management and databases, pattern recognition, artificial intelligence, and other areas, and most of these are covered here through different data mining applications. The eighteen chapters are classified in two parts: Knowledge Discovery and Data Mining Applications.

    How to Think About Indiscernible Particles

    Permutation symmetries which arise in quantum mechanics pose an intriguing problem. It is not clear that particles which exhibit permutation symmetries (i.e. particles which are indiscernible, meaning that they can be swapped with each other without this yielding a new physical state) qualify as objects in any reasonable sense of the term. One solution to this puzzle, which I attribute to W.V. Quine, would have us eliminate such particles from our ontology altogether in order to circumvent the metaphysical vexations caused by permutation symmetries. In this essay I argue that Quine's solution is too rash, and in its place I suggest a novel solution based on altering some of the language of quantum mechanics. Before launching into the technical details of indiscernible particles, however, I begin this essay with some remarks on the methodology - instrumentalism - which motivates my arguments.

    Supervised ranking : from semantics to algorithms


    Choosing a discernibility measure for reject-option of individual and multiple classifiers

    A novel method for evaluating the reliability of a classifier on a pattern is proposed based on the discernibility of a pattern's class against other classes from the pattern. Three measures of discernibility are proposed and experimentally compared with each other and with more conventional techniques based on the classification scores for class labels. The classification accuracy can be significantly enhanced through discernibility measures using the most reliable - 'elite' - patterns. It can be further boosted by forming an amalgamation of the elites of different classifiers. Improved performance is achieved at the price of rejecting many patterns. There are situations in which this price is worth paying - when the non-reliable predictions, however good, lead to the need for the manual testing of very cumbersome and complex technical devices, or in the diagnosis of terminal diseases in humans. Contrary to conventional techniques for estimating reliability, the proposed measures are applicable to small datasets as well as to datasets with complex class structures on which conventional classifiers show low accuracy rates.

    Heterogeneous recognition of bioacoustic signals for human-machine interfaces

    Human-machine interfaces (HMI) provide a communication pathway between man and machine. Not only do they augment existing pathways, they can substitute or even bypass these pathways where functional motor loss prevents the use of standard interfaces. This is especially important for individuals who rely on assistive technology in their everyday life. Utilising bioacoustic activity can lead to an assistive HMI concept that is unobtrusive, minimally disruptive and cosmetically appealing to the user. However, due to the complexity of the signals, it remains relatively underexplored in the HMI field. This thesis investigates extracting and decoding volition from bioacoustic activity with the aim of generating real-time commands. The developed framework is a systemisation of various processing blocks enabling the mapping of continuous signals into M discrete classes. Class-independent extraction efficiently detects and segments the continuous signals, while class-specific extraction exemplifies each pattern set using a novel template creation process stable to permutations of the data set. These templates are utilised by a generalised single-channel discrimination model, whereby each signal is template-aligned prior to classification. The real-time decoding subsystem uses a multichannel heterogeneous ensemble architecture which fuses the output from a diverse set of these individual discrimination models. This enhances the classification performance by elevating both the sensitivity and specificity, with the increased specificity due to a natural rejection capacity based on a non-parametric majority vote. Such a strategy is useful when the signals have diverse characteristics, when false positives are prevalent and have strong consequences, and when limited training data is available. The framework has been developed with generality in mind and is widely applicable to a broad spectrum of biosignals.
    The processing system has been demonstrated on real-time decoding of tongue-movement ear pressure signals using both single- and dual-channel setups. This has included in-depth evaluation of these methods in both offline and online scenarios. During online evaluation, a stimulus-based test methodology was devised, while representative interference was used to contaminate the decoding process in a relevant and realistic fashion. The results of this research provide a strong case for the utility of such techniques in real-world applications of human-machine communication using impulsive bioacoustic signals and biosignals in general.
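The rejection behaviour of a majority-vote ensemble mentioned in this abstract can be pictured with a small sketch. It is only an illustration under stated assumptions: the abstract does not give the exact voting rule, so a strict-majority rule is assumed here, and the function name `majority_vote_with_reject` and the `min_fraction` parameter are hypothetical.

```python
from collections import Counter

def majority_vote_with_reject(labels, min_fraction=0.5):
    """Fuse per-model labels; return None (reject) without a strict majority.

    Assumption: the winning label must collect more than `min_fraction`
    of the votes. This rule is illustrative, not the thesis's own.
    """
    if not labels:
        return None  # nothing to fuse, so reject
    label, votes = Counter(labels).most_common(1)[0]
    if votes / len(labels) > min_fraction:
        return label
    return None  # the models disagree too much: reject the pattern

# Three hypothetical discrimination models voting on one segmented signal.
accepted = majority_vote_with_reject(["left", "left", "right"])
rejected = majority_vote_with_reject(["left", "right", "up"])
```

Rejecting when the models disagree trades coverage for fewer false positives, which is the kind of specificity gain the abstract attributes to the vote.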

    Derivation of forest inventory parameters from high-resolution satellite imagery for the Thunkel area, Northern Mongolia. A comparative study on various satellite sensors and data analysis techniques.

    With the demise of the Soviet Union and the transition to a market economy starting in the 1990s, Mongolia has been experiencing dramatic changes resulting in social and economic disparities and an increasing strain on its natural resources. The situation is exacerbated by a changing climate, the erosion of forestry-related administrative structures, and a lack of law enforcement activities. Mongolia’s forests have been afflicted with a dramatic increase in degradation due to human and natural impacts such as overexploitation and wildfire occurrences. In addition, forest management practices are far from being sustainable. In order to provide useful information on how to viably and effectively utilise the forest resources in the future, the gathering and analysis of forest-related data is pivotal. Although a National Forest Inventory was conducted in 2016, very little reliable and scientifically substantiated information exists at a regional or even local level. This lack of detailed information warranted a study performed in the Thunkel taiga area in 2017 in cooperation with the GIZ. In this context, we hypothesise that (i) tree species and composition can be identified utilising the aerial imagery, (ii) tree height can be extracted from the resulting canopy height model with accuracies commensurate with field survey measurements, and (iii) high-resolution satellite imagery is suitable for the extraction of tree species, the number of trees, and the upscaling of timber volume and basal area based on the spectral properties. The outcomes of this study illustrate quite clearly the potential of employing UAV imagery for tree height extraction (R² of 0.9) as well as for species and crown diameter determination. However, in a few instances, the visual interpretation of the aerial photographs was determined to be superior to the computer-aided automatic extraction of forest attributes. In addition, imagery from various satellite sensors (e.g. Sentinel-2, RapidEye, WorldView-2) proved to be excellently suited for the delineation of burned areas and the assessment of tree vigour. Furthermore, recently developed sophisticated classifying approaches such as Support Vector Machines and Random Forest appear to be well suited to tree species discrimination (Overall Accuracy of 89%). Object-based classification approaches appear highly suitable for very high-resolution imagery; at medium scale, however, pixel-based classifiers outperformed them. It is also suggested that high radiometric resolution bears the potential to easily compensate for the lack of spatial detectability in the imagery. Quite surprising was the occurrence of dark taiga species in riparian areas beyond their natural habitat range. The presented results matrix and the interpretation key have been devised as a decision tool and/or a vademecum for practitioners. In consideration of future projects and to facilitate the improvement of the forest inventory database, the establishment of permanent sampling plots in the Mongolian taigas is strongly advised.

    Identification of Data Structure with Machine Learning: From Fisher to Bayesian networks

    This thesis proposes a theoretical framework to thoroughly analyse the structure of a dataset in terms of a) metric, b) density and c) feature associations. To address the first aspect, Fisher's metric learning algorithms are the foundations of a novel manifold based on the information and complexity of a classification model. For the density aspect, Probabilistic Quantum Clustering, a Bayesian version of the original Quantum Clustering, is proposed. Its clustering results depend on local density variations, which is a desirable feature when dealing with heteroscedastic data. To address the third aspect, the constraint-based PC-algorithm, the starting point of many structure learning algorithms, is used to find feature associations by means of conditional independence tests. This is then used to select Bayesian networks, based on a regularized likelihood score. These three topics of data structure analysis were fully tested with synthetic data examples and real cases, which allowed us to unravel and discuss the advantages and limitations of these algorithms. One of the biggest challenges encountered was related to the application of these methods to a Big Data dataset that was analysed within the framework of a collaboration with a large UK retailer, where the interest was in the identification of the data structure underlying customer shopping baskets.
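A conditional independence test of the kind that constraint-based structure learning relies on can be sketched as follows. This is a minimal illustration, assuming roughly Gaussian data, a single conditioning variable, and a Fisher z-test on the partial correlation; the abstract does not say which test the thesis uses, and all function names here (`pearson`, `partial_corr`, `fisher_z_pvalue`, `simulate_chain`) are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y given one variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: (partial) correlation is zero.

    n is the sample size, k the number of conditioning variables.
    Assumes approximately Gaussian data.
    """
    r = max(min(r, 0.999999), -0.999999)      # keep the logarithm finite
    z = 0.5 * math.log((1 + r) / (1 - r))     # Fisher z-transform
    stat = math.sqrt(n - k - 3) * abs(z)
    return math.erfc(stat / math.sqrt(2))     # 2 * (1 - Phi(stat))

def simulate_chain(n, seed=0):
    """Synthetic chain x -> z -> y, so x and y are independent given z."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [a + rng.gauss(0, 1) for a in x]
    y = [b + rng.gauss(0, 1) for b in z]
    return x, z, y

x, z, y = simulate_chain(2000)
p_marginal = fisher_z_pvalue(pearson(x, y), len(x), 0)
p_given_z = fisher_z_pvalue(partial_corr(x, y, z), len(x), 1)
```

On the synthetic chain, x and y are strongly correlated marginally, yet the partial correlation given z is close to zero. A constraint-based learner would use a result like this to remove the direct edge between x and y.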

    The People Inside

    Our collection begins with an example of computer vision that cuts through time and bureaucratic opacity to help us meet real people from the past. Buried in thousands of files in the National Archives of Australia is evidence of the exclusionary “White Australia” policies of the nineteenth and twentieth centuries, which were intended to limit and discourage immigration by non-Europeans. Tim Sherratt and Kate Bagnall decided to see what would happen if they used a form of face-detection software made ubiquitous by modern surveillance systems and applied it to a security system of a century ago. What we get is a new way to see the government documents, not as a source of statistics but, Sherratt and Bagnall argue, as powerful evidence of the people affected by racism.