    What Works Better? A Study of Classifying Requirements

    Classifying requirements into functional requirements (FR) and non-functional ones (NFR) is an important task in requirements engineering. However, automated classification of requirements written in natural language is not straightforward, due to the variability of natural language and the absence of a controlled vocabulary. This paper investigates how automated classification of requirements into FR and NFR can be improved and how well several machine learning approaches work in this context. We contribute an approach for preprocessing requirements that standardizes and normalizes requirements before applying classification algorithms. Further, we report on how well several existing machine learning methods perform for automated classification of NFRs into sub-categories such as usability, availability, or performance. Our study is performed on 625 requirements provided by the OpenScience tera-PROMISE repository. We found that our preprocessing improved the performance of an existing classification method. We further found significant differences in the performance of approaches such as Latent Dirichlet Allocation, Biterm Topic Modeling, or Naive Bayes for the sub-classification of NFRs. Comment: 7 pages, the 25th IEEE International Conference on Requirements Engineering (RE'17)
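
    A minimal sketch of one of the method families compared in the paper, a bag-of-words Naive Bayes classifier for NFR sub-categories. The example requirements, labels, and pipeline settings below are illustrative assumptions, not the study's actual preprocessing or the tera-PROMISE data.

```python
# Hedged sketch: TF-IDF features + multinomial Naive Bayes for NFR
# sub-classification. Requirements and labels are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

requirements = [
    "The system shall respond to any query within 2 seconds.",
    "The user interface shall be usable without prior training.",
    "The service shall be available 99.9% of the time.",
]
labels = ["performance", "usability", "availability"]  # NFR sub-categories

# TF-IDF turns each requirement into a normalized term-weight vector,
# which the Naive Bayes classifier then fits per category.
clf = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    MultinomialNB(),
)
clf.fit(requirements, labels)
print(clf.predict(["Pages shall load in under one second."]))
```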

    On the role of pre and post-processing in environmental data mining

    The quality of discovered knowledge depends heavily on data quality. Unfortunately, real data tend to contain noise, uncertainty, errors, redundancies, or even irrelevant information. The more complex the reality to be analyzed, the higher the risk of obtaining low-quality data. Knowledge Discovery from Databases (KDD) offers a global framework for preparing data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results depends not only on the quality of the results themselves, but also on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex, and environmental users particularly require clarity in their results. This paper provides some details on how this can be achieved and discusses the role of pre- and post-processing in the whole process of Knowledge Discovery in environmental systems.
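
    As a concrete illustration of the preprocessing step discussed above, the sketch below cleans a small synthetic environmental time series: removing duplicate records, screening implausible sensor readings, and filling short gaps. The column names and thresholds are illustrative assumptions only.

```python
# Hedged sketch of typical KDD-style preprocessing on environmental data:
# de-duplication, plausibility screening, and short-gap interpolation.
import pandas as pd

raw = pd.DataFrame({
    "timestamp": pd.date_range("2021-06-01", periods=6, freq="h"),
    "dissolved_oxygen_mg_l": [8.2, 8.1, None, 7.9, 95.0, 7.8],  # 95.0 is a sensor spike
})

clean = (
    raw.drop_duplicates(subset="timestamp")        # remove redundant records
       .assign(dissolved_oxygen_mg_l=lambda d:
               d["dissolved_oxygen_mg_l"]
                .where(d["dissolved_oxygen_mg_l"].between(0, 20))  # mask implausible values
                .interpolate(limit=2))             # fill short gaps only
)
print(clean)
```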

    Data mining as a tool for environmental scientists

    Over recent years a huge library of data mining algorithms has been developed to tackle a variety of problems in fields such as medical imaging and network traffic analysis. Many of these techniques are far more flexible than more classical modelling approaches and could be usefully applied to data-rich environmental problems. Certain techniques such as Artificial Neural Networks, Clustering, Case-Based Reasoning and, more recently, Bayesian Decision Networks have found application in environmental modelling, while other methods, for example classification and association rule extraction, have not yet been taken up on any wide scale. We propose that these and other data mining techniques could be usefully applied to difficult problems in the field. This paper introduces several data mining concepts and briefly discusses their application to environmental modelling, where data may be sparse, incomplete, or heterogeneous.
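
    As a small illustration of one technique family mentioned above, the sketch below clusters monitoring sites by two water-quality variables with k-means. The data values and the choice of three clusters are assumptions made purely for illustration.

```python
# Hedged sketch: k-means clustering of hypothetical environmental monitoring sites.
import numpy as np
from sklearn.cluster import KMeans

# columns: [nitrate mg/L, water temperature degC] for six invented sites
sites = np.array([
    [1.2, 14.0], [1.5, 13.5], [6.8, 19.0],
    [7.1, 18.4], [3.3, 16.2], [3.0, 15.9],
])
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(sites)
print(kmeans.labels_)  # cluster assignment for each site
```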

    AbspectroscoPY, a Python toolbox for absorbance-based sensor data in water quality monitoring

    The long-term trend of increasing natural organic matter (NOM) in boreal and north European surface waters represents an economic and environmental challenge for drinking water treatment plants (DWTPs). High-frequency measurements from absorbance-based online spectrophotometers are often used in modern DWTPs to measure the chromophoric fraction of dissolved organic matter (CDOM) over time. These data contain valuable information that can be used to optimise NOM removal at various stages of treatment and/or diagnose the causes of underperformance at the DWTP. However, automated monitoring systems generate large datasets that need careful preprocessing, followed by variable selection and signal processing before interpretation. In this work, we introduce AbspectroscoPY ("Absorbance spectroscopic analysis in Python"), a Python toolbox for processing time-series datasets collected by in situ spectrophotometers. The toolbox addresses some of the main challenges in data preprocessing by handling duplicates, systematic time shifts, baseline corrections and outliers. It contains automated functions to compute a range of spectral metrics for the time-series data, including absorbance ratios, exponential fits, slope ratios and spectral slope curves. To demonstrate its utility, AbspectroscoPY was applied to 15-month datasets from three online spectrophotometers in a drinking water treatment plant. Despite only small variations in surface water quality over the time period, variability in the spectrophotometric profiles of treated water could be identified, quantified and related to lake turnover or operational changes in the DWTP. This toolbox represents a step toward automated early warning systems for detecting and responding to potential threats to treatment performance caused by rapid changes in incoming water quality.
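
    The sketch below illustrates two of the spectral metrics mentioned above, an absorbance ratio and a spectral slope from an exponential fit, computed on a synthetic CDOM-like spectrum. It does not reproduce AbspectroscoPY's actual function names or defaults.

```python
# Hedged sketch of absorbance-ratio and spectral-slope calculations on a
# synthetic absorbance spectrum (not AbspectroscoPY's own API).
import numpy as np
from scipy.optimize import curve_fit

wavelengths = np.arange(240.0, 600.0, 2.5)                   # nm
absorbance = 30.0 * np.exp(-0.018 * (wavelengths - 240.0))   # synthetic CDOM-like decay

# Absorbance ratio, e.g. a(254 nm) / a(365 nm), a common CDOM index
a254 = np.interp(254.0, wavelengths, absorbance)
a365 = np.interp(365.0, wavelengths, absorbance)
print("a254/a365:", a254 / a365)

# Spectral slope S from fitting a(l) = a_240 * exp(-S * (l - 240))
def exp_decay(l, a_240, s):
    return a_240 * np.exp(-s * (l - 240.0))

popt, _ = curve_fit(exp_decay, wavelengths, absorbance, p0=(1.0, 0.01))
print("spectral slope S (1/nm):", popt[1])
```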

    Counterfeit Detection with Multispectral Imaging

    Multispectral imaging is becoming more practical for a variety of applications due to its ability to provide highly specific information through non-destructive analysis. Multispectral imaging cameras can detect light reflectance in different spectral bands at visible and non-visible wavelengths. Based on the differing amounts of band reflectance, information about the subject can be deduced. Counterfeit detection applications of multispectral imaging will be decomposed and analyzed in this thesis. Relations between light reflectance and objects’ features will be addressed. The analysis process will be broken down to show how this information can be used to provide more insight into the object. This technology provides valuable information that can greatly improve multiple fields. In this paper, the multispectral imaging research process for element solution concentrations and the counterfeit detection applications of multispectral imaging will be discussed. BaySpec’s OCI-M Ultra Compact Multispectral Imager is used for data collection. This camera is capable of capturing light reflectance at wavelengths of 400–1000 nm. Further research opportunities, such as developing self-automated unmanned aerial vehicles for precision agriculture and extending counterfeit detection applications, will also be explored.
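
    To make the band-reflectance comparison above concrete, the sketch below flags a suspect item by comparing its per-band reflectance signature against a genuine reference using the spectral angle. The band centres, reflectance values, and decision threshold are invented for illustration and are not OCI-M data.

```python
# Hedged sketch: comparing reflectance signatures with the spectral angle.
import numpy as np

bands_nm  = [450, 550, 650, 750, 850, 950]                    # within the 400-1000 nm range
reference = np.array([0.12, 0.35, 0.40, 0.55, 0.60, 0.58])    # genuine item
sample    = np.array([0.10, 0.30, 0.52, 0.48, 0.45, 0.44])    # item under test

def spectral_angle(a, b):
    """Angle (radians) between two reflectance vectors; 0 means identical shape."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

angle = spectral_angle(sample, reference)
print("spectral angle:", round(angle, 3), "-> suspect" if angle > 0.1 else "-> consistent")
```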

    DeepCough: A Deep Convolutional Neural Network in A Wearable Cough Detection System

    In this paper, we present a system that employs a wearable acoustic sensor and a deep convolutional neural network for detecting coughs. We evaluate the performance of our system on 14 healthy volunteers and compare it to that of other cough detection systems that have been reported in the literature. Experimental results show that our system achieves a classification sensitivity of 95.1% and a specificity of 99.5%. Comment: BioCAS-201
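
    A minimal sketch of the kind of network described above: a small convolutional classifier over fixed-size spectrogram patches labelled cough versus non-cough. The layer sizes and the 64x64 input are assumptions for illustration, not the architecture reported in the paper.

```python
# Hedged sketch: tiny CNN for cough / non-cough spectrogram patches (PyTorch).
import torch
import torch.nn as nn

class CoughCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, 2)  # two classes: cough, non-cough

    def forward(self, x):  # x: (batch, 1, 64, 64) spectrogram patches
        return self.classifier(self.features(x).flatten(1))

model = CoughCNN()
logits = model(torch.randn(4, 1, 64, 64))  # dummy batch of four patches
print(logits.shape)                         # torch.Size([4, 2])
```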

    Damage identification in structural health monitoring: a brief review from its implementation to the use of data-driven applications

    The damage identification process provides relevant information about the current state of a structure under inspection, and it can be approached from two different points of view. The first approach uses data-driven algorithms, which are usually associated with the collection of data using sensors. Data are subsequently processed and analyzed. The second approach uses models to analyze information about the structure. In the latter case, the overall performance of the approach is associated with the accuracy of the model and the information that is used to define it. Although both approaches are widely used, data-driven algorithms are preferred in most cases because they afford the ability to analyze data acquired from sensors and to provide a real-time solution for decision making; however, these approaches require high-performance processors due to their high computational cost. As a contribution to researchers working with data-driven algorithms and applications, this work presents a brief review of data-driven algorithms for damage identification in structural health-monitoring applications. This review covers damage detection, localization, classification, extension, and prognosis, as well as the development of smart structures. The literature is systematically reviewed according to the natural steps of a structural health-monitoring system. This review also includes information on the types of sensors used as well as on the development of data-driven algorithms for damage identification.
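
    As one concrete example of the data-driven algorithms surveyed above, the sketch below implements a common baseline: learn the statistics of features extracted from the healthy structure, then flag measurements whose Mahalanobis distance exceeds a baseline threshold. The synthetic features and the 99th-percentile threshold are illustrative assumptions, not a method from the review.

```python
# Hedged sketch: outlier-based damage detection via Mahalanobis distance.
import numpy as np

rng = np.random.default_rng(0)
healthy = rng.normal(0.0, 1.0, size=(500, 3))   # feature vectors from the healthy state
mean = healthy.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(healthy, rowvar=False))

def mahalanobis(x):
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

threshold = np.percentile([mahalanobis(x) for x in healthy], 99)
new_measurement = np.array([3.5, -2.8, 4.1])    # features from a possibly damaged state
print("damage indicated:", mahalanobis(new_measurement) > threshold)
```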