
188 research outputs found

Pattern discovery in structural databases with applications to bioinformatics

Author: Zhang Sen
Publication venue: Digital Commons @ NJIT
Publication date: 31/01/2005

Frequent structure mining (FSM) aims to discover and extract patterns frequently occurring in structural data such as trees and graphs. FSM finds many applications in bioinformatics, XML processing, Web log analysis, and so on. In this thesis, two new FSM techniques are proposed for finding patterns in unordered labeled trees. Such trees can be used to model evolutionary histories of different species, among others. The first FSM technique finds cousin pairs in the trees. A cousin pair is a pair of nodes sharing the same parent, the same grandparent, the same great-grandparent, and so on. Given a tree T, our algorithm finds all interesting cousin pairs of T in O(|T|^2) time, where |T| is the number of nodes in T. Experimental results on synthetic data and phylogenies show the scalability and effectiveness of the proposed technique. This technique has been applied to locating co-occurring patterns in multiple evolutionary trees, evaluating the consensus of equally parsimonious trees, and finding kernel trees of groups of phylogenies. The technique is also extended to undirected acyclic graphs (or free trees). The second FSM technique extends traditional MAST (maximum agreement subtree) algorithms by employing the Apriori data mining technique to find frequent agreement subtrees in multiple phylogenies. The correctness and completeness of the new mining algorithm are presented. The method is also extended to unrooted phylogenetic trees. Both FSM techniques studied in the thesis have been implemented in a toolkit, which is fully operational and accessible on the World Wide Web.
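To make the cousin-pair definition in the abstract concrete, here is a minimal Python sketch. It is not the thesis's algorithm: the function name `cousin_pairs`, the `max_distance` cutoff, the child-to-parent dictionary representation, and the restriction to nodes of the same generation are all assumptions made for illustration. It also compares ancestor chains pair by pair, so its running time depends on tree depth and does not match the O(|T|^2) bound stated above.

```python
from itertools import combinations


def cousin_pairs(parent, max_distance=2):
    """Return (a, b, distance) for same-generation node pairs.

    parent: dict mapping each node to its parent (the single root maps to None).
    distance 0 = same parent, 1 = same grandparent, 2 = same great-grandparent.
    max_distance is a hypothetical cutoff, not a parameter taken from the thesis.
    """
    def ancestor_chain(node):
        # The node itself comes first, the root comes last.
        chain = []
        while node is not None:
            chain.append(node)
            node = parent[node]
        return chain

    chains = {node: ancestor_chain(node) for node in parent}
    result = []
    for a, b in combinations(sorted(parent), 2):
        chain_a, chain_b = chains[a], chains[b]
        if len(chain_a) != len(chain_b):
            continue  # different generations are skipped in this sketch
        # Equal depth: walk both chains upward in lockstep until they meet.
        for steps, (x, y) in enumerate(zip(chain_a, chain_b)):
            if x == y:
                break
        distance = steps - 1  # one step up to a shared parent means distance 0
        if distance <= max_distance:
            result.append((a, b, distance))
    return result


# Hypothetical example tree: r has children a and b; a has c and d; b has e.
example_parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a", "e": "b"}
print(cousin_pairs(example_parent))
```

In the example tree, c and d share a parent, while c and e share only a grandparent. The sketch identifies nodes by unique ids; mining over node labels, as the abstract describes for labeled trees, would need an extra aggregation step.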

Digital Commons @ New Jersey Institute of Technology (NJIT)

A survey of frequent subgraph mining algorithms

Authors: Coenen Frans, Jiang Chuntao, Zito Michele
Publication venue: Cambridge University Press (CUP)
Publication date: 01/03/2013

University of Liverpool Repository

Characterization of Neuromuscular Disorders Using Quantitative Electromyographic Techniques

Author: Fouad Meena AbdelMaseeh Adly
Publication venue: University of Waterloo
Publication date: 01/02/2016

This thesis presents a multifaceted effort to develop a system that allows electrodiagnostic clinicians to perform a quantitative analysis of needle-detected electromyographic (EMG) signals for characterization of neuromuscular disorders. Currently, the most widely adopted practice for evaluation of patients with suspected neuromuscular disorders is based on qualitative visual and auditory assessment of EMG signals. The resulting characterizations from this qualitative assessment are criticized for being subjective and highly dependent on the skill and experience of the examiner. The proposed system can be decomposed functionally into three stages: (1) extracting relevant information from the EMG signals, (2) representing the extracted information in formats suitable for qualitative, semi-quantitative and quantitative assessment, and (3) supporting the clinical decision, i.e., characterizing the examined muscle by estimating the likelihood of it being affected by a specific category of neuromuscular disorders. The main contribution of the thesis to the extraction stage is the development of an automated decomposition algorithm specifically tailored for characterization of neuromuscular disorders. The algorithm focuses on identifying as many representative motor unit potential trains as possible in times comparable to the times needed to complete a qualitative assessment. The identified trains are shown to reliably capture important aspects of the motor unit potential morphology and morphological stability. With regard to the representation stage, the thesis proposes ten new quantitative EMG features that are shown to be discriminative among the different disease categories. Along with eight traditional features, the features can be grouped into subsets, where each subset reflects a different aspect of the underlying motor unit structure and/or function.
A muscle characterization obtained using a feature set in which every relevant aspect is included using the most representative feature is more structured, simple, and generalizable. All the investigated features are clinically relevant. An examiner can easily validate their values by visual inspection; interpret them from an anatomical, physiological, and pathological basis; and be aware of their limitations and dependence on the acquisition setup. The second main contribution to the representation stage is the evaluation of the possibility of detecting neurogenic disorders using a newly proposed set of quantitative features describing the firing patterns of the identified motor units. The last contribution to the representation stage is the development of novel methods that allow an examiner to detect contributions from fibres close to the detection surface of a needle electrode and to track them across a motor unit potential train. The work in this thesis related to the decision support stage aims at improving existing methods for obtaining transparent muscle characterization. Transparent methods not only estimate the likelihood of the muscle being affected by a specific disorder, but also induce a set of rules explaining the likelihood estimates. The results presented in this thesis show that remodelling the characterization problem using an appropriate binarization mapping can overcome the decrease in accuracy associated with quantizing features, which is used to induce transparency rules. To attain the above-mentioned objectives, different signal processing and machine learning methods are utilized and extended. These include spectral clustering, Savitzky-Golay filtering, dynamic time warping, support vector machines, classification based on event association rules and Gaussian mixture models. The performance of the proposed methods has been evaluated with four different sets of examined limb muscles (342 muscles in total).
Also, it has been evaluated using simulated EMG signals calculated using physiologically and anatomically sound models. A system capable of achieving the aforementioned objectives is expected to promote further clinical adoption of quantitative electromyographic techniques. These techniques have potential advantages over existing qualitative assessments, including resolving equivocal cases, formalizing communication and evaluating prognosis.

University of Waterloo's Institutional Repository

Foundations of Software Science and Computation Structures

Author
Publication venue: Springer Science and Business Media LLC
Publication date: 10/02/2021

This open access book constitutes the proceedings of the 22nd International Conference on Foundations of Software Science and Computational Structures, FOSSACS 2019, which took place in Prague, Czech Republic, in April 2019, held as part of the European Joint Conference on Theory and Practice of Software, ETAPS 2019. The 29 papers presented in this volume were carefully reviewed and selected from 85 submissions. They deal with foundational research with a clear significance for software science.