21,579 research outputs found

    From Review to Rating: Exploring Dependency Measures for Text Classification

    Various text analysis techniques exist that attempt to extract information from unstructured text. In this work, we explore using statistical dependence measures for text classification, representing text as word vectors. The dataset consists of student satisfaction scores on a 3-point scale and the free-text comments students wrote about university subjects. We compared two textual representations, a term-frequency word representation and word vectors, and found that word vectors provide greater accuracy. However, word vectors have a large number of features, which increases the computational cost. We therefore explored a non-linear dependence measure for feature selection, maximizing the dependence between the text reviews and the corresponding scores. Our quantitative and qualitative analysis on the student satisfaction dataset shows that this approach achieves accuracy comparable to the full feature vector while being an order of magnitude faster in testing. These text analysis and feature reduction techniques can be applied to other textual data applications such as sentiment analysis. (Comment: 8 pages)
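
    As a rough, hedged illustration of the feature-selection idea only: the abstract does not name the dependence measure, so the sketch below substitutes scikit-learn's mutual information, and the word-vector features, scores, and 30-feature cut-off are invented placeholders rather than the paper's setup.

        # Sketch: select word-vector features by their dependence on the satisfaction score.
        import numpy as np
        from sklearn.feature_selection import mutual_info_classif
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        X = rng.normal(size=(500, 300))    # placeholder word-vector features (500 reviews x 300 dims)
        y = rng.integers(1, 4, size=500)   # satisfaction scores on a 3-point scale

        # Non-linear dependence between each feature and the score (mutual information here).
        mi = mutual_info_classif(X, y, random_state=0)
        top = np.argsort(mi)[::-1][:30]    # keep the 30 most dependent features

        clf = LogisticRegression(max_iter=1000).fit(X[:, top], y)
        print("accuracy on reduced features:", clf.score(X[:, top], y))

    Classifying on the reduced feature set is what makes testing faster; the paper reports accuracy comparable to the full feature vector.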

    Parallel approach of a Galerkin-based methodology for predicting the compressive strength of the lightweight aggregate concrete

    A methodology based on the Galerkin formulation of the finite element method has been analyzed for predicting the compressive strength of lightweight aggregate concrete from ultrasonic pulse velocity. Because of the memory requirements and computational cost of this technique, parallelization is necessary for solving the problem. For this purpose, a mixed MPI/OpenMP parallel algorithm has been designed, and different approaches and data distributions have been analyzed. The Galerkin methodology has also been compared with multiple linear regression models, regression trees and artificial neural networks. Based on several measures of goodness of fit, the Galerkin methodology is shown to be effective compared with these statistical data-mining techniques. This research was supported by the Spanish Ministry of Science, Innovation and Universities, Grant RTI2018-098156-B-C54, co-financed by the European Commission (FEDER funds).
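
    A minimal sketch of the model-comparison side only, assuming synthetic data: the Galerkin finite element solver and the MPI/OpenMP parallelization are not reproduced here, and the pulse-velocity values, strength relation, and model settings below are invented placeholders rather than the study's dataset.

        # Sketch: compare regression models for compressive strength vs. ultrasonic pulse velocity.
        import numpy as np
        from sklearn.linear_model import LinearRegression
        from sklearn.tree import DecisionTreeRegressor
        from sklearn.neural_network import MLPRegressor
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import r2_score, mean_squared_error

        rng = np.random.default_rng(1)
        upv = rng.uniform(3.0, 5.0, size=(400, 1))                    # pulse velocity (km/s), synthetic
        strength = 10.0 * upv[:, 0] - 15.0 + rng.normal(0, 2, 400)    # compressive strength (MPa), synthetic

        X_tr, X_te, y_tr, y_te = train_test_split(upv, strength, random_state=1)
        models = {
            "multiple linear regression": LinearRegression(),
            "regression tree": DecisionTreeRegressor(max_depth=4, random_state=1),
            "artificial neural network": MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000, random_state=1),
        }
        for name, model in models.items():
            pred = model.fit(X_tr, y_tr).predict(X_te)
            print(f"{name}: R2={r2_score(y_te, pred):.3f}, RMSE={mean_squared_error(y_te, pred) ** 0.5:.2f}")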

    Automatic programming methodologies for electronic hardware fault monitoring

    This paper presents three variants of Genetic Programming (GP) approaches for intelligent online performance monitoring of electronic circuits and systems. Reliability modeling of electronic circuits can best be performed with the stressor-susceptibility interaction model: a circuit or system is considered to have failed once a stressor exceeds its susceptibility limit. For online prediction, validated stressor vectors may be obtained from direct measurements or sensors, which, after pre-processing and standardization, are fed into the GP models. Empirical results are compared with artificial neural networks trained using the backpropagation algorithm and with classification and regression trees. The performance of the proposed method is evaluated by comparing the experimental results with the actual failure model values. The developed model shows that GP could play an important role in future fault-monitoring systems. This research was supported by the International Joint Research Grant of the IITA (Institute of Information Technology Assessment) foreign professor invitation program of the MIC (Ministry of Information and Communication), Korea.
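
    As a toy illustration of the stressor-susceptibility failure criterion that the monitoring models are built around (the GP evolution itself is not reproduced, and the stressor names, limits, and measurements below are invented placeholders):

        # Sketch: stressor-susceptibility failure criterion on a measured stressor vector.
        susceptibility = {"temperature_C": 85.0, "supply_ripple_V": 0.5, "vibration_g": 6.0}

        def has_failed(stressors: dict) -> bool:
            # A circuit or system is considered to have failed once any stressor
            # exceeds its susceptibility limit.
            return any(stressors[name] > limit for name, limit in susceptibility.items())

        measured = {"temperature_C": 91.0, "supply_ripple_V": 0.3, "vibration_g": 4.2}
        print("failure predicted:", has_failed(measured))

    In the paper's setting, pre-processed and standardized stressor vectors like this are the inputs fed to the evolved GP models rather than to a fixed threshold rule.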

    Machine learning and its applications in reliability analysis systems

    In this thesis, we explore some aspects of Machine Learning (ML) and its application in Reliability Analysis systems (RAs). We begin by investigating several ML paradigms and their techniques, go on to discuss possible applications of ML in improving RA performance, and lastly give guidelines for the architecture of learning RAs. Our survey of ML covers both Neural Network learning and Symbolic learning. For symbolic learning, five types of learning and their applications are discussed: rote learning, learning from instruction, learning from analogy, learning from examples, and learning from observation and discovery. The Reliability Analysis systems (RAs) presented in this thesis are mainly designed for maintaining plant safety and are supported by two functions: a risk analysis function, i.e., failure mode and effect analysis (FMEA); and a diagnosis function, i.e., real-time fault location (RTFL). Three approaches to creating RAs are discussed. Based on the results of our survey, we suggest that currently the best design of an RA is to embed a model-based RA, i.e., MORA (as software), in a neural network based computer system (as hardware). However, improvements can still be made through the application of Machine Learning. By implanting a 'learning element', MORA becomes the learning MORA (La MORA) system, a learning Reliability Analysis system with the power of automatic knowledge acquisition, inconsistency checking, and more. To conclude the thesis, we propose an architecture for La MORA.

    Assessing the role of EO in biodiversity monitoring: options for integrating in-situ observations with EO within the context of the EBONE concept

    The European Biodiversity Observation Network (EBONE) is a European contribution on terrestrial monitoring to GEO BON, the Group on Earth Observations Biodiversity Observation Network. EBONE's aim is to develop a system of biodiversity observation at regional, national and European levels by assessing existing approaches in terms of their validity and applicability, starting in Europe and then expanding to regions in Africa. The objective of EBONE is to deliver: 1. A sound scientific basis for the production of statistical estimates of stock and change of key indicators; 2. The development of a system for estimating past changes and for forecasting and testing policy options and management strategies for threatened ecosystems and species; 3. A proposal for a cost-effective biodiversity monitoring system. There is a consensus that Earth Observation (EO) has a role to play in monitoring biodiversity. With its capacity to observe detailed spatial patterns and variability across large areas at regular intervals, EO could deliver the type of spatial and temporal coverage that is beyond the reach of in-situ efforts alone. Furthermore, when considering the emerging networks of in-situ observations, the prospect of enhancing the quality of the information whilst reducing cost through integration is compelling. This report gives a realistic assessment of the role of EO in biodiversity monitoring and the options for integrating in-situ observations with EO within the context of the EBONE concept (cf. EBONE-ID1.4). The assessment is mainly based on a set of targeted pilot studies. Building on this assessment, the report then presents a series of recommendations on the best options for using EO in an effective, consistent and sustainable biodiversity monitoring scheme. The issues we faced were many: 1. Integration can be interpreted in different ways: one interpretation is the combined use of independent data sets to deliver a different but improved data set; another is the use of one data set to complement another. 2. The targeted improvement will vary with the stakeholder group: some will seek more efficiency, others more reliable estimates (accuracy and/or precision), and others more detail in space and/or time, or more of everything. 3. Integration requires a link between the datasets (EO and in-situ). The strength of the link between reflected electromagnetic radiation and the habitats and their biodiversity observed in-situ is a function of many variables, for example the spatial scale of the observations, the timing of the observations, the adopted nomenclature for classification, the complexity of the landscape in terms of composition, spatial structure and the physical environment, and the habitat and land cover types under consideration. 4. The type of EO data available varies (as a function of, e.g., budget, size and location of the region, cloudiness, and national and/or international investment in airborne campaigns or space technology), which determines its capability to deliver the required output. EO and in-situ data could be combined in different ways, depending on the type of integration we wanted to achieve and the targeted improvement. We aimed for an improvement in accuracy (i.e. a reduction in the error of our indicator estimate calculated for an environmental zone). Furthermore, EO would also provide the spatial patterns for correlated in-situ data.
EBONE, in its initial development, focused on three main indicators covering: (i) the extent and change of habitats of European interest in the context of a general habitat assessment; (ii) the abundance and distribution of selected species (birds, butterflies and plants); and (iii) the fragmentation of natural and semi-natural areas. For habitat extent, we decided that it did not matter how in-situ data were integrated with EO as long as we could demonstrate that acceptable accuracies could be achieved and that precision could be consistently improved. The nomenclature used to map habitats in-situ was the General Habitat Classification. We considered the following options, in which EO and in-situ data play different roles: using in-situ samples to re-calibrate a habitat map independently derived from EO; improving the accuracy of in-situ sampled habitat statistics by post-stratification with correlated EO data; and using in-situ samples to train the classification of EO data into habitat types where the EO data deliver full coverage or a larger number of samples. For some of these cases we also considered the impact that the sampling strategy employed to deliver the samples would have on the accuracy and precision achieved. Restricted access to Europe-wide species data prevented work on the indicator ‘abundance and distribution of species’. With respect to the indicator ‘fragmentation’, we investigated ways of delivering EO-derived measures of habitat patterns that are meaningful to sampled in-situ observations.
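
    To make the post-stratification option concrete, here is a minimal sketch under invented numbers: in-situ plot values are grouped by an EO-derived stratum and recombined using the strata's known area shares. The stratum labels, area shares, and plot values are placeholders, not the report's data, and the variance calculations used to judge precision are omitted.

        # Sketch: post-stratified estimate of a habitat statistic using EO-derived strata.
        import numpy as np

        # In-situ plot measurements (e.g. fraction of each plot covered by the target habitat)
        plot_values = np.array([0.10, 0.25, 0.05, 0.60, 0.55, 0.70, 0.15, 0.20])
        # EO-derived stratum of each plot (0 = 'habitat unlikely', 1 = 'habitat likely')
        plot_stratum = np.array([0, 0, 0, 1, 1, 1, 0, 0])
        # Known area share of each stratum from the full-coverage EO map
        area_share = {0: 0.8, 1: 0.2}

        naive_mean = plot_values.mean()
        post_stratified = sum(share * plot_values[plot_stratum == s].mean()
                              for s, share in area_share.items())
        print(f"naive sample mean:        {naive_mean:.3f}")
        print(f"post-stratified estimate: {post_stratified:.3f}")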

    A Study on the Phase-asynchronous PD Diagnosis Method for Gas Insulated Switchgears

    Gas-insulated switchgear (GIS) is one of the most important power facilities and a valuable asset in a power system, providing stable and reliable electrical power. GIS has been in operation for more than 45 years owing to its high reliability and low failure rate. Although GIS has a low maintenance requirement, failures caused by partial discharge (PD) lead to considerable financial loss. The ultra-high frequency (UHF) method is an effective tool to detect insulation defects inside GIS and is widely used for on-line and on-site diagnosis. It is also less sensitive to noise and better suited to PD detection than other measurement methods. Most utilities, laboratories, and countries perform PD detection using narrow-band or wide-band frequency ranges and classify PD types with conventional methods based on the phase angle of the voltage applied to the power equipment. In many cases of on-site PD measurement in the field, however, it is difficult to classify PD types because the PD signals are phase-asynchronous. This thesis describes a new method of PD diagnosis which can classify PD types without phase information of the voltage applied to the GIS. A total of 327 cases of on-site measurement data were collected from 2003 to 2015. A statistical analysis of the collected data was performed according to voltage class, maintenance result, defect cause, and defect location. From this analysis, the most frequent PD and noise types were a floating element and external interference, respectively. To develop a new PD diagnosis method applicable to on-site diagnosis without phase synchronization, features were extracted to classify defect types using representative data from 82 cases, comprising 66 PD and 16 noise cases. The features consisted of 5 frequency parameters and 6 phase parameters. The 5 frequency parameters were the number of distribution ranges, the maximum value, the ranges of the first and second peak values, the peak difference between the first and second peak values, and the density levels. The 6 phase parameters were the number of phase groups, whether the distribution covers the overall range or not, the distribution ranges of each group, the density levels, the peak difference between the first and second groups, and the shapes. The 82 cases of representative data were selected through a review of data validation and analyzed using the 11 designed feature parameters, from which 5 effective parameters were extracted to identify the defect types using a decision-tree-based technique in 4 steps: the number of groups in the phase parameters (first step), the shapes in the phase parameters (second step), the number of distribution ranges and density levels in the frequency parameters (third step), and the ranges of the first and second peak values in the frequency parameters (fourth step). As a result, the decision-tree-based diagnosis algorithm was able to classify 6 types of PD and 4 types of noise, and 77 of the 82 cases were correctly classified. The diagnostic performance of the new method proposed in this thesis therefore had an accuracy rate of over 94%, and the method was able to diagnose almost every type of defect. The new method was also applied to on-site GIS diagnosis in South Korea and Malaysia to verify its reliability. In these two cases, portable and on-line UHF PD systems were installed without phase synchronization, and the defect cause and location inside the GIS were inspected visually by on-site engineers after the on-site PD measurement.
The two cases were analyzed with the new decision-tree-based diagnosis algorithm, and its results were identical to the results of the internal inspections. These results indicate that the new PD diagnosis method proposed in this thesis is well suited to classifying various defect types from phase-asynchronous PD signals in on-site measurements.
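
    As a schematic of the decision-tree step only: the thesis's 11 engineered frequency/phase parameters and its hand-built 4-step tree are not reproduced, so the features and defect labels below are synthetic placeholders, and the accuracy printed on them is meaningful only as a usage example.

        # Sketch: decision-tree classification of defect types from frequency/phase features.
        import numpy as np
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeClassifier

        rng = np.random.default_rng(2)
        n_cases, n_features = 82, 11                # 82 representative cases, 11 feature parameters
        X = rng.normal(size=(n_cases, n_features))  # synthetic stand-in for the extracted features
        y = rng.integers(0, 10, size=n_cases)       # 6 PD types + 4 noise types -> 10 class labels

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)
        tree = DecisionTreeClassifier(max_depth=4, random_state=2).fit(X_tr, y_tr)
        print("accuracy on synthetic hold-out data:", tree.score(X_te, y_te))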

    A Supervised STDP-based Training Algorithm for Living Neural Networks

    Neural networks have shown great potential in many applications such as speech recognition, drug discovery, image classification, and object detection. Neural network models are inspired by biological neural networks, but they are optimized to perform machine learning tasks on digital computers. The proposed work explores the possibility of using living neural networks in vitro as basic computational elements for machine learning applications. A new supervised STDP-based learning algorithm is proposed in this work, which takes neuron engineering constraints into account. A 74.7% accuracy is achieved on the MNIST benchmark for handwritten digit recognition. (Comment: 5 pages, 3 figures, accepted by ICASSP 201)
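
    For readers unfamiliar with STDP, here is a minimal sketch of the pair-based update rule that such algorithms build on; the amplitudes and time constants are generic textbook values, and the paper's supervised scheme and its neuron-engineering constraints are not reproduced here.

        # Sketch: pair-based STDP weight update.
        # Potentiate when the pre-synaptic spike precedes the post-synaptic spike, depress otherwise.
        import math

        def stdp_delta_w(t_pre: float, t_post: float,
                         a_plus: float = 0.01, a_minus: float = 0.012,
                         tau_plus: float = 20.0, tau_minus: float = 20.0) -> float:
            dt = t_post - t_pre                          # spike-time difference in ms
            if dt > 0:                                   # pre before post -> potentiation
                return a_plus * math.exp(-dt / tau_plus)
            return -a_minus * math.exp(dt / tau_minus)   # post before pre -> depression

        w = 0.5
        w += stdp_delta_w(t_pre=10.0, t_post=13.0)   # causal pairing strengthens the synapse
        w += stdp_delta_w(t_pre=25.0, t_post=21.0)   # anti-causal pairing weakens it
        print(f"updated weight: {w:.4f}")

    A supervised variant typically adds a teacher or target signal that determines which pairings are reinforced; the abstract does not detail how the paper implements this.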