Line Based Multi-Range Asymmetric Conditional Random Field For Terrestrial Laser Scanning Data Classification

Author: Luo Chao
Publication date: 20/09/2016

Terrestrial Laser Scanning (TLS) is a ground-based, active imaging method that rapidly acquires accurate, highly dense three-dimensional point clouds of object surfaces by laser range finding. To fully exploit its benefits, a robust method for classifying the many objects of interest in huge laser point clouds is urgently required. However, classifying massive TLS data faces many challenges, such as complex urban scenes and partial data acquisition caused by occlusion. To achieve automatic, accurate and robust TLS data classification, we present a line-based multi-range asymmetric Conditional Random Field algorithm. The first contribution is a line-based TLS data classification method. In this thesis, we are interested in seven classes: building, roof, pedestrian road (PR), tree, low man-made object (LMO), vehicle road (VR), and low vegetation (LV). The line-based classification is carried out within each scan profile, which follows the line-profiling nature of the laser scanning mechanism. Ten conventional local classifiers are tested, including popular generative and discriminative classifiers, and the experimental results validate that the line-based method achieves satisfactory classification performance. However, local classifiers label each line independently of its neighborhood, so their inference often suffers from similar local appearance across different object classes. The second contribution is a multi-range asymmetric Conditional Random Field (maCRF) model, which uses object context as a post-classification step to improve the performance of a local generative classifier. The maCRF incorporates appearance, a local smoothness constraint, and global scene layout regularity into a single probabilistic graphical model.
The local smoothness constraint encourages lines in a local area to share the same class label, while the scene layout term favours an asymmetric regularity in the long-range spatial arrangement of different object classes, considered in both the vertical (above-below) and horizontal (front-behind) directions. The asymmetric regularity captures the directional spatial arrangement between pairs of objects (e.g. it allows ground to be lower than building, but not vice versa). The third contribution extends the maCRF model with across-scan-profile context, yielding the Across scan profile Multi-range Asymmetric Conditional Random Field (amaCRF) model. Due to the sweeping nature of laser scanning, sequentially acquired TLS data has strong spatial dependency, and the across-scan-profile context provides additional contextual information. The final contribution is a sequential classification strategy. amaCRF models are constructed sequentially along the sweeping direction of the laser scan. By dynamically updating the posterior probability of common scan profiles, contextual information propagates through adjacent scan profiles.
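
The asymmetric pairwise term described in this abstract can be illustrated with a toy example. The sketch below is not the thesis's maCRF: the three labels, the unary costs, and the penalty values are all hypothetical, only vertically adjacent lines are coupled (no multi-range terms), and the best labeling is found by brute force rather than by proper CRF inference. It only shows how a cost that depends on label order (which label sits above which) can resolve an ambiguous local classification.

```python
from itertools import product

# Hypothetical simplified label set; the thesis uses seven classes.
LABELS = ["ground", "building", "roof"]
HEIGHT_RANK = {"ground": 0, "building": 1, "roof": 2}

# Hypothetical unary costs (lower = more likely) for four lines
# stacked from bottom (index 0) to top (index 3).
UNARY = [
    {"ground": 0.2, "building": 1.5, "roof": 2.0},
    {"ground": 1.0, "building": 0.9, "roof": 1.2},  # locally ambiguous line
    {"ground": 1.4, "building": 0.4, "roof": 1.0},
    {"ground": 1.1, "building": 1.0, "roof": 0.5},
]


def pairwise(lower, upper):
    """Asymmetric cost of label `upper` sitting directly above label `lower`."""
    if lower == upper:
        return 0.0  # smoothness: identical neighbours cost nothing
    if HEIGHT_RANK[upper] > HEIGHT_RANK[lower]:
        return 0.3  # plausible layout, e.g. building above ground
    return 2.0      # implausible layout, e.g. ground above building


def energy(labels):
    """Sum of unary costs plus pairwise costs between adjacent lines."""
    total = sum(UNARY[i][label] for i, label in enumerate(labels))
    total += sum(pairwise(labels[i], labels[i + 1])
                 for i in range(len(labels) - 1))
    return total


def map_labeling():
    """Brute-force minimum-energy labeling (fine for a toy problem only)."""
    return min(product(LABELS, repeat=len(UNARY)), key=energy)


if __name__ == "__main__":
    print(map_labeling())
```

Because `pairwise("ground", "building")` is cheaper than `pairwise("building", "ground")`, swapping the two arguments changes the cost, which is what distinguishes an asymmetric potential from a plain smoothness term.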

YorkSpace

Learning from Data Streams: An Overview and Update

Author: Read Jesse
Žliobaitė Indrė
Publication date: 03/08/2023

The literature on machine learning in the context of data streams is vast and growing. However, many of the defining assumptions regarding data-stream learning tasks are too strong to hold in practice, or are even contradictory such that they cannot be met in the context of supervised learning. Algorithms are chosen and designed based on criteria which are often not clearly stated, for problem settings not clearly defined, tested in unrealistic settings, and/or in isolation from related approaches in the wider literature. This puts into question the potential for real-world impact of many approaches conceived in such contexts, and risks propagating a misguided research focus. We propose to tackle these issues by reformulating the fundamental definitions and settings of supervised data-stream learning with regard to contemporary considerations of concept drift and temporal dependence; we take a fresh look at what constitutes a supervised data-stream learning task, and reconsider the algorithms that may be applied to tackle such tasks. Drawing on this formulation and overview, and helped by an informal survey of industrial players dealing with real-world data streams, we provide recommendations. Our main emphasis is that learning from data streams does not impose a single-pass or online-learning approach, or any particular learning regime; and any constraints on memory and time are not specific to streaming. Meanwhile, there exist established techniques for dealing with temporal dependence and concept drift in other areas of the literature. For the data streams community, we thus encourage a shift in research focus, from dealing with often-artificial constraints and assumptions on the learning mode, to issues such as robustness, privacy, and interpretability, which are increasingly relevant to learning in data streams in academic and industrial settings.

arXiv.org e-Print Archive

Intrusion Detection System Using an LSTM-Based Language Model

Author: 김규완
Publication venue: Graduate School, Seoul National University
Publication date: 01/08/2017

Thesis (Master's), Graduate School, Seoul National University, College of Engineering, Department of Electrical and Computer Engineering, 2017. 8. 윤성로. Designing a robust intrusion detection system is one of the most central and important problems in computer security. This thesis proposes a language-modeling approach to system-call sequences and branch sequences for the design of anomaly-based host intrusion detection systems. To address the high false-alarm rate that commonly arises in existing methods, we use a new ensemble method that blends multiple thresholding classifiers so that normal sequences are captured well. The language model has the advantage of learning the semantics of each system call and the interactions between calls, which existing methods handled poorly. Through various experiments on public datasets and newly generated data, we demonstrate the validity and effectiveness of the proposed method. We also show that the model is highly portable.
Contents: 1 Introduction; 2 Language Model of System Call Sequences (2.1 Model Architecture, 2.2 Baseline Classifiers, 2.3 Performance Evaluation); 3 Ensemble Method to Reduce False Alarms (3.1 Ensemble Method, 3.2 Comparison with Other Methods); 4 Interpretation to Transfer Learning (4.1 Portability of Model, 4.2 Visualization of Learned Representations); 5 Generalization to Branch Sequences (5.1 Handling Open Vocabulary Problem, 5.2 Experiments on Branch Sequences, 5.3 Discussion on Branch Language Model); 6 Future Work (6.1 Advanced Model Architecture, 6.2 Finding Anomalous Segments, 6.3 Adversarial Training, 6.4 Online Learning Framework); 7 Conclusion.
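
The idea of scoring system-call sequences with a language model and thresholding the score can be sketched as follows. This is an illustration under loud assumptions: a smoothed bigram model stands in for the thesis's LSTM, the training sequences and the threshold value are invented, and the thesis's ensemble of thresholding classifiers is not reproduced. A sequence is flagged when its average negative log-likelihood under a model of normal behaviour exceeds the threshold.

```python
import math
from collections import Counter


def train_bigram(sequences):
    """Count call-to-call transitions in normal sequences (stand-in for an LSTM)."""
    vocab = {call for seq in sequences for call in seq}
    pair_counts = Counter()
    prev_counts = Counter()
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    return vocab, pair_counts, prev_counts


def avg_nll(seq, model):
    """Average negative log-likelihood per transition, with add-one smoothing."""
    vocab, pair_counts, prev_counts = model
    slots = len(vocab) + 1  # one extra slot reserved for unseen calls
    total, steps = 0.0, 0
    for prev, nxt in zip(seq, seq[1:]):
        prob = (pair_counts[(prev, nxt)] + 1) / (prev_counts[prev] + slots)
        total -= math.log(prob)
        steps += 1
    return total / max(steps, 1)


def is_anomalous(seq, model, threshold):
    """Single thresholding classifier: high NLL means unlike normal behaviour."""
    return avg_nll(seq, model) > threshold


# Hypothetical normal traces; real data would be recorded system-call logs.
NORMAL = [
    ["open", "read", "read", "close"],
    ["open", "read", "write", "close"],
    ["open", "write", "close"],
]
MODEL = train_bigram(NORMAL)
THRESHOLD = 1.5  # hypothetical; in practice tuned on held-out normal data

if __name__ == "__main__":
    for trace in (["open", "read", "close"],
                  ["close", "execve", "open", "execve"]):
        print(trace, round(avg_nll(trace, MODEL), 3),
              is_anomalous(trace, MODEL, THRESHOLD))
```

Moving the threshold trades missed detections against false alarms, which is the tension the abstract says the ensemble is meant to ease.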

SNU Open Repository and Archive

Data-Driven Transducer Design and Identification for Internally-Paced Motor Brain Computer Interfaces: A Review

Author: Marie-Caroline Schaeffer
Tetiana Aksenova
Publication venue: Frontiers Media SA
Publication date: 01/01/2018

Brain-Computer Interfaces (BCIs) are systems that establish a direct communication pathway between the users' brain activity and external effectors. They offer the potential to improve the quality of life of motor-impaired patients. Motor BCIs aim to permit severely motor-impaired users to regain limb mobility by controlling orthoses or prostheses. In particular, motor BCI systems benefit patients if the decoded actions reflect the users' intentions with an accuracy that enables them to efficiently interact with their environment. One of the main challenges of BCI systems is to adapt the BCI's signal translation blocks to the user to reach a high decoding accuracy. This paper reviews the literature on data-driven and user-specific transducer design and identification approaches, with a focus on internally-paced motor BCIs. In particular, continuous kinematic biomimetic and mental-task decoders are reviewed. Furthermore, static and dynamic decoding approaches, linear and non-linear decoding, and offline and real-time identification algorithms are considered. The current progress and challenges related to the design of clinically compatible motor BCI transducers are additionally discussed.

Directory of Open Access Journals

Frontiers - Publisher Connector

Evaluation methods and decision theory for classification of streaming data with temporal dependence

Author: Bifet Albert
Holmes Geoffrey
Pfahringer Bernhard
Read Jesse
Žliobaitė Indrė
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Predictive modeling on data streams plays an important role in modern data analysis, where data arrives continuously and needs to be mined in real time. In the stream setting the data distribution is often evolving over time, and models that update themselves during operation are becoming the state of the art. This paper formalizes a learning and evaluation scheme for such predictive models. We theoretically analyze the evaluation of classifiers on streaming data with temporal dependence. Our findings suggest that the commonly accepted data stream classification measures, such as classification accuracy and the Kappa statistic, fail to diagnose cases of poor performance when temporal dependence is present; therefore they should not be used as sole performance indicators. Moreover, classification accuracy can be misleading if used as a proxy for evaluating change detectors with datasets that have temporal dependence. We formulate the decision theory for streaming data classification with temporal dependence and develop a new evaluation methodology for data stream classification that takes temporal dependence into account. We propose a combined measure for classification performance that takes temporal dependence into account, and we recommend using it as the main performance measure in classification of streaming data.
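
The claim that accuracy and the Kappa statistic can hide poor performance under temporal dependence can be checked on a toy stream. In the sketch below the label stream, the predictions, and the helper names are invented for illustration, and comparing against a "no-change" baseline (predict the previous true label) is an assumption about one way to expose the problem, not necessarily the combined measure the paper proposes.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of positions where prediction and label agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Standard Cohen's kappa: accuracy corrected for chance agreement."""
    n = len(y_true)
    p0 = accuracy(y_true, y_pred)
    true_freq, pred_freq = Counter(y_true), Counter(y_pred)
    pe = sum(true_freq[c] * pred_freq[c] for c in true_freq) / (n * n)
    return (p0 - pe) / (1 - pe)


def no_change_predictions(y_true, first_guess):
    """Baseline that always predicts the previously observed true label."""
    return [first_guess] + list(y_true[:-1])


def kappa_vs_no_change(y_true, y_pred, first_guess):
    """Accuracy normalised against the no-change baseline instead of chance.

    Undefined (division by zero) if the baseline is perfect.
    """
    p0 = accuracy(y_true, y_pred)
    p_nc = accuracy(y_true, no_change_predictions(y_true, first_guess))
    return (p0 - p_nc) / (1 - p_nc)


# Hypothetical stream with strong temporal dependence: long runs of one label.
Y_TRUE = [0] * 10 + [1] * 10 + [0] * 10 + [1] * 10

# Hypothetical classifier that is wrong at four positions (90% accuracy).
Y_PRED = list(Y_TRUE)
for i in (3, 13, 23, 33):
    Y_PRED[i] = 1 - Y_PRED[i]

if __name__ == "__main__":
    print("accuracy:", accuracy(Y_TRUE, Y_PRED))
    print("kappa:", cohen_kappa(Y_TRUE, Y_PRED))
    print("no-change accuracy:",
          accuracy(Y_TRUE, no_change_predictions(Y_TRUE, 0)))
    print("kappa vs no-change:", kappa_vs_no_change(Y_TRUE, Y_PRED, 0))
```

On this stream the classifier looks strong by accuracy and Kappa, yet the trivial no-change baseline is more accurate, so the baseline-normalised score comes out negative.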

Research Commons@Waikato

Speech Recognition

Publication venue: IntechOpen
Publication date: 20/04/2021

Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation of speech signals and the methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and applications that are able to operate in real-world environments, like mobile communication services and smart homes.

Directory of Open Access Books (DOAB)

Predicting Flavonoid UGT Regioselectivity with Graphical Residue Models and Machine Learning.

Author: Jackson Arthur Rhydon
Publication venue: Digital Commons @ East Tennessee State University
Publication date: 19/12/2009

Machine learning is applied to a challenging and biologically significant protein classification problem: the prediction of flavonoid UGT acceptor regioselectivity from primary protein sequence. Novel indices characterizing graphical models of protein residues are introduced. The indices are compared with existing amino acid indices and found to cluster residues appropriately. A variety of models employing the indices are then investigated by examining their performance with nearest neighbor, support vector machine, and Bayesian neural network classifiers. Improvements over nearest neighbor classifications relying on standard alignment similarity scores are reported.

East Tennessee State University