
    Approximation-based feature selection and application for algae population estimation

    This paper presents a data-driven approach to feature selection that addresses the common problem of high-dimensional data. Unlike many existing approaches, it can handle the real-valued nature of the domain features, which it accomplishes through the use of fuzzy-rough approximations. The paper demonstrates the effectiveness of this research by proposing an estimator of algae populations, a system that approximates the size of algae populations given certain water characteristics. This estimator significantly reduces computer time and space requirements, decreases the cost of obtaining measurements and increases runtime efficiency, making it more economically viable. By retaining only the information required for the estimation task, the system offers higher accuracy than conventional estimators. Finally, the system does not alter the domain semantics, so any distilled knowledge remains human-readable. The paper describes the problem domain, the architecture and operation of the system, and provides and discusses detailed experimentation. The results show that algae estimators using a fuzzy-rough feature selection step generally produce more accurate predictions of algae populations. Keywords: Feature evaluation and selection; Data-driven knowledge acquisition; Classification; Fuzzy-rough sets; Algae population estimation.
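    The abstract above does not give the algorithm, so the following is only a minimal sketch of one common way fuzzy-rough feature selection is formulated, under stated assumptions: crisp class labels, a range-normalised similarity per feature, the minimum t-norm to combine features, and a greedy QuickReduct-style search. The function names (`similarity`, `dependency`, `quickreduct`) and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def similarity(a: float, b: float, span: float) -> float:
    """Fuzzy similarity of two real values on one feature, in [0, 1]."""
    if span == 0:
        return 1.0  # a constant feature cannot tell objects apart
    return max(0.0, 1.0 - abs(a - b) / span)


def dependency(X: Sequence[Sequence[float]], y: Sequence[int],
               subset: Sequence[int]) -> float:
    """Fuzzy-rough dependency degree of the labels on a feature subset.

    Assumption: crisp labels, so the positive-region membership of an
    object is 1 minus its highest similarity to any object of another class.
    """
    if not subset:
        return 0.0
    n = len(X)
    spans = {j: max(r[j] for r in X) - min(r[j] for r in X) for j in subset}
    total = 0.0
    for i in range(n):
        lower = 1.0
        for k in range(n):
            if y[k] != y[i]:
                # Similarity on the whole subset: minimum over its features.
                sim = min(similarity(X[i][j], X[k][j], spans[j]) for j in subset)
                lower = min(lower, 1.0 - sim)
        total += lower
    return total / n


def quickreduct(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[int]:
    """Greedily add the feature that raises the dependency degree most."""
    n_features = len(X[0])
    full = dependency(X, y, list(range(n_features)))
    selected: List[int] = []
    best = 0.0
    while best < full:
        candidates = [(dependency(X, y, selected + [j]), j)
                      for j in range(n_features) if j not in selected]
        score, j = max(candidates, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break  # no remaining feature improves the dependency
        selected.append(j)
        best = score
    return selected


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noisy.
X = [[0.0, 0.5], [0.1, 0.9], [0.9, 0.4], [1.0, 0.8]]
y = [0, 0, 1, 1]
print(dependency(X, y, [0]), dependency(X, y, [1]))
print(quickreduct(X, y))
```

    Because the selected features are a subset of the original ones rather than a transformation of them, a sketch like this keeps the original feature meanings, which is in line with the abstract's point about preserving domain semantics.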

    A multilabel classification approach for complex human activities using a combination of emerging patterns and fuzzy sets

    In our daily lives, humans perform different Activities of Daily Living (ADL), such as cooking and studying. By nature, people perform these activities either sequentially (simple scenario) or in an overlapping fashion (complex scenario). Many research efforts have addressed simple activity recognition, but complex activity recognition remains a challenging issue. Recognition of complex activities is a multilabel classification problem, in which a test instance is assigned to multiple overlapping activities. Existing data-driven techniques for complex activity recognition can recognize at most two overlapping activities and require a training dataset of complex (i.e. multilabel) activities. In this paper, we propose a multilabel classification approach for complex activity recognition using a combination of Emerging Patterns and Fuzzy Sets. Our approach requires a training dataset of only simple (i.e. single-label) activities. First, we use a pattern mining technique to extract discriminative features called Strong Jumping Emerging Patterns (SJEPs) that exclusively represent each activity. Then, our scoring function takes the SJEPs and the fuzzy membership values of incoming sensor data and outputs the activity label(s). We validate our approach using two different datasets. Experimental results demonstrate the efficiency and superiority of our approach over other approaches.

    Characterizing urban landscapes using fuzzy sets

    Characterizing urban landscapes is important given the present and future projections of global population that favor urban growth. The definition of “urban” on a thematic map has proven to be problematic since urban areas are heterogeneous in terms of land use and land cover. Further, certain urban classes are inherently imprecise due to the difficulty in integrating various social and environmental inputs into a precise definition. Social components often include demographic patterns, transportation, building type and density, while ecological components include soils, elevation, hydrology, climate, vegetation and tree cover. In this paper, we adopt a coupled human and natural system (CHANS) integrated scientific framework for characterizing urban landscapes. We implement the framework by adopting a fuzzy sets concept of “urban characterization”, since fuzzy sets relate to classes of objects with imprecise boundaries in which membership is a matter of degree. For dynamic mapping applications, user-defined classification schemes involving rules that combine different social and ecological inputs can lead to a degree of quantification in class labeling varying from “highly urban” to “least urban”. A socio-economic perspective of urban may include threshold values for population and road network density, while a more ecological perspective of urban may utilize the ratio of natural versus built area and percent forest cover. In each case, threshold values are defined to derive the fuzzy rules of membership, and various combinations of rules offer greater flexibility to characterize the many facets of the urban landscape. We illustrate the flexibility and utility of this fuzzy inference approach, called the Fuzzy Urban Index, for the Boston Metro region with five inputs and eighteen rules. The resulting classification map shows levels of fuzzy membership ranging from highly urban to least urban or rural in the Boston study region. We validate our approach with two experts who assess the accuracy of the resulting fuzzy urban map. We discuss how our approach can be applied in other urban contexts with newly emerging descriptors of urban sustainability, urban ecology and urban metabolism.
    This research was partially supported by "Boston University Initiative on Cities Early Stage Urban Research Awards 2015-16" (Gopal & Phillips) and the Frederick S. Pardee Center for the Study of the Longer-Range Future at Boston University. We thank the anonymous reviewers for their careful reading of our manuscript and their many insightful comments and suggestions.
    https://doi.org/10.1016/j.compenvurbsys.2016.02.002
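    To make the idea of threshold-based fuzzy rules concrete, here is a small illustrative sketch. It is not the Fuzzy Urban Index itself, which the abstract says uses five inputs and eighteen rules that are not listed here. The three inputs, every threshold value, the two rules, and the function names (`ramp_up`, `ramp_down`, `urban_membership`) are hypothetical choices made only for illustration; AND is modelled with `min` and OR with `max`, which is one common convention.

```python
def ramp_up(x: float, low: float, high: float) -> float:
    """Membership rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def ramp_down(x: float, low: float, high: float) -> float:
    """Membership falling linearly from 1 at `low` to 0 at `high`."""
    return 1.0 - ramp_up(x, low, high)


def urban_membership(pop_density: float, road_density: float,
                     forest_pct: float) -> float:
    """Degree of 'urban' in [0, 1] for one map cell.

    All thresholds below are made-up placeholders, not values from the paper.
    """
    dense_population = ramp_up(pop_density, 500.0, 5000.0)  # people per km^2
    dense_roads = ramp_up(road_density, 2.0, 10.0)          # km of road per km^2
    little_forest = ramp_down(forest_pct, 10.0, 60.0)       # percent forest cover

    # Hypothetical socio-economic rule: dense population AND dense roads.
    socio_economic = min(dense_population, dense_roads)
    # Hypothetical ecological rule: little forest cover.
    ecological = little_forest
    # Combine the two perspectives with OR.
    return max(socio_economic, ecological)


print(urban_membership(10000.0, 20.0, 0.0))   # densely built cell
print(urban_membership(100.0, 1.0, 90.0))     # forested, sparsely settled cell
print(urban_membership(2750.0, 10.0, 80.0))   # intermediate cell
```

    Swapping the rule set or the thresholds changes which perspective dominates, which is the kind of flexibility the abstract attributes to combining rules.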

    On the role of pre and post-processing in environmental data mining

    The quality of discovered knowledge depends heavily on data quality. Unfortunately, real data often contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex the reality to be analyzed, the higher the risk of obtaining low-quality data. Knowledge Discovery from Databases (KDD) offers a global framework for preparing data in the right form to perform correct analyses. On the other hand, the quality of decisions taken on the basis of KDD results depends not only on the quality of the results themselves, but also on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex, and environmental users particularly require clarity in their results. This paper provides some details about how this can be achieved and discusses the role of pre- and post-processing in the whole process of Knowledge Discovery in environmental systems.
