849 research outputs found

    A Multiple Instance Learning Approach to Electrophysiological Muscle Classification for Diagnosing Neuromuscular Disorders Using Quantitative EMG

    Neuromuscular disorder is a broad term that refers to diseases that impair muscle functionality either by affecting any part of the nerve or muscle. Electrodiagnosis of most neuromuscular disorders is based on the electrophysiological classification of involved muscles which in turn, is performed by inferring the structure and function of the muscles by analyzing electromyographic (EMG) signals recorded during low to moderate levels of contraction. The functional unit of muscle contraction is called a motor unit (MU). The morphology and physiology of the MUs of an examined muscle are inferred by extracting motor unit potentials (MUPs) from the EMG signals detected from the muscle. As such, electrophysiological muscle classification is performed by first characterizing extracted MUPs and then aggregating these characterizations. The task of classifying muscles can be represented as an instance of a multiple instance learning (MIL) problem. In the MIL paradigm, a bag of instances shares a label and the instance labels are hidden, contrary to standard supervised learning, where each training instance is labeled. In MIL-based muscle classification, the instances are the MUPs extracted from the EMG signals of the analyzed muscle and the bag is the muscle. Detecting and counting the MUPs indicating a specific category of a neuromuscular disorder can result in accurately classifying the examined muscle. As such, three major issues usually arise: how to infer MUP labels without full supervision; how the cardinality relationships between MUP labels contribute to predict the muscle label; and how the muscle as a whole entity is classified. In this thesis, these three challenges are addressed. To this end, an MIL-based muscle classification system is proposed that has five major steps: 1) MUPs are represented using morphological, stability, and novel near fiber parameters as well as spectral features extracted from wavelet coefficients. 
This representation helps to analyze MUPs from a variety of aspects. 2) MUP features are selected using the unsupervised, similarity-preserving Laplacian score, which is independent of any learning algorithm. Hence, the features selected in this work can be used in other electrophysiological muscle classification systems. 3) MUPs are clustered using a novel clustering algorithm called Neighbourhood Distance Entropy Consistency (NDEC), which helps solve the traditional problem of finding representations of MUP normality and abnormality and provides a dynamic number of MUP characterization classes to be used instead of the conventional three classes (i.e., normal, myopathic, and neurogenic). This clustering was performed to highlight the effects of disease on both fiber spatial distributions and fiber diameter distributions, which lead to a continuum of MUP characteristics. These clusters can potentially represent several concepts of MUP normality and abnormality. 4) The muscle is represented by embedding its MUP cluster associations in a feature vector. 5) The muscle is classified using support vector machines or random forests. Quantitative results obtained by applying the proposed method to four electrophysiologically different groups of muscles (proximal arm, proximal leg, distal arm, and distal leg) show the superior and stable performance of the proposed muscle classification system compared to previous works. Additionally, modelling electrophysiological muscle classification as an instance of MIL can solve the traditional problem of characterizing MUPs without full supervision. The clustering algorithm proposed in this work can be used as an effective technique in other pattern recognition and medical diagnostic systems in which discovering natural clusters within data is a necessity.
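
Step 4 of the abstract above (embedding a muscle's MUP cluster associations in a feature vector) can be illustrated with a minimal sketch. It is not the thesis's implementation: NDEC is not reproduced, so nearest-centroid assignment stands in for it as a labeled assumption, and the function names `nearest_cluster` and `embed_bag` and the toy feature values are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def nearest_cluster(instance: Sequence[float],
                    centroids: Sequence[Sequence[float]]) -> int:
    """Return the index of the closest centroid (squared Euclidean distance).

    Assumption: this nearest-centroid rule is only a stand-in for the
    cluster association that the thesis obtains from NDEC.
    """
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(centroids)), key=lambda k: sq_dist(instance, centroids[k]))


def embed_bag(bag: Sequence[Sequence[float]],
              centroids: Sequence[Sequence[float]]) -> List[float]:
    """Embed a bag (one muscle) as the fraction of its instances (MUPs) per cluster."""
    if not bag:
        raise ValueError("a bag must contain at least one instance")
    counts = Counter(nearest_cluster(inst, centroids) for inst in bag)
    return [counts.get(k, 0) / len(bag) for k in range(len(centroids))]


# Toy data: two hypothetical cluster centroids and one bag of four instances.
centroids = [[0.0, 0.0], [10.0, 10.0]]
muscle = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 9.0]]
muscle_vector = embed_bag(muscle, centroids)
```

The resulting fixed-length vector is what a standard classifier such as a support vector machine or random forest would take as input in step 5, regardless of how many MUPs each muscle contributes.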

    Scalable Topic Detection Approaches from Twitter Streams

    Real-time topic detection in Twitter streams is an important task that helps discover natural disasters in real time from users' posts and helps political parties and companies understand users' opinions and needs. In 2014, Twitter was reported to have more than 288 million active users posting around 500 million tweets daily. Detecting topics from Twitter streams in real time is therefore a challenging task that needs scalable and efficient techniques to handle this large amount of data. In this work, we scale an exemplar-based technique that detects topics from Twitter streams, where each detected topic is represented by one tweet (i.e., an exemplar). Using exemplar tweets to represent the detected topics makes these topics easier to interpret than representing them by uncorrelated terms, as other topic detection algorithms do. The approach is implemented using Apache Giraph and is extended here to efficiently support sliding windows. Experimental results on four datasets show that the optimized Giraph implementation achieves a speedup of up to nineteen times over the native implementation, while maintaining good quality of the detected topics. In addition, the Giraph exemplar-based approach achieves the best topic recall and term precision against K-means, Latent Dirichlet Allocation (LDA), Non-negative Matrix Factorization (NMF) and Latent Semantic Analysis (LSA), while maintaining good term recall and running time. The approach is also deployed for detecting topics from real-time Twitter streams and its scalability is demonstrated. Moreover, another clustering technique called Local Variance-based Clustering (LVC) is proposed in this thesis for detecting topics from Twitter streams. LVC defines the densities of data points based on their similarities. 
The proposed local variance measure is calculated from the variance of the data points' similarity histogram and is shown to distinguish well between core, border, connecting and outlier points. Experimental results show that LVC outperforms spectral clustering and affinity propagation in clustering quality on the control charts, Ecoli and images datasets, while maintaining a good running time. In addition, results show that LVC can detect topics from Twitter with 15% higher topic recall and 3% higher term precision than DBSCAN.

    Knowledge Structures.

    This paper investigates how technological distance between firms affects their network of R&D alliances. Our theoretical model assumes that the benefit of an alliance between two firms is given by their technological distance. This benefit-distance relationship determines the ego-network of each firm as well as the overall network structure. Empirical relevance is confirmed for the bio-pharmaceutical industry. Although we find that the network structure is largely explained by firm size, technological distance determines the positioning of firms in the network. Keywords: technological distance, research alliance, network formation, pharmaceutical industry.

    Discovering a Domain Knowledge Representation for Image Grouping: Multimodal Data Modeling, Fusion, and Interactive Learning

    In visually-oriented specialized medical domains such as dermatology and radiology, physicians explore interesting image cases from medical image repositories for comparative case studies to aid clinical diagnoses, educate medical trainees, and support medical research. However, general image classification and retrieval approaches fail in grouping medical images from the physicians' viewpoint. This is because fully-automated learning techniques cannot yet bridge the gap between image features and domain-specific content in the absence of expert knowledge. Understanding how experts get information from medical images is therefore an important research topic. As a prior study, we conducted data elicitation experiments, where physicians were instructed to inspect each medical image towards a diagnosis while describing image content to a student seated nearby. Experts' eye movements and their verbal descriptions of the image content were recorded to capture various aspects of expert image understanding. This dissertation aims at an intuitive approach to extracting expert knowledge, which is to find patterns in expert data elicited from image-based diagnoses. These patterns are useful to understand both the characteristics of the medical images and the experts' cognitive reasoning processes. The transformation from the viewed raw image features to interpretation as domain-specific concepts requires experts' domain knowledge and cognitive reasoning. This dissertation also approximates this transformation using a matrix factorization-based framework, which helps project multiple expert-derived data modalities to high-level abstractions. To combine additional expert interventions with computational processing capabilities, an interactive machine learning paradigm is developed to treat experts as an integral part of the learning process. 
Specifically, experts refine medical image groups presented by the learned model locally, to incrementally re-learn the model globally. This paradigm avoids the onerous expert annotations for model training, while aligning the learned model with experts' sense-making.

    Mine evaluation optimisation

    The definition of a mineral resource during exploration is a fundamental part of lease evaluation, which establishes the fair market value of the entire asset being explored in the open market. Since exact prediction of grades between sampled points is not currently possible by conventional methods, predicted grades will nearly always differ from actual grades by some error. These errors affect the evaluation of resources and thereby the characterisation of risks, financial projections and decisions about whether to carry on with further phases. The knowledge about minerals below the surface, even when it is based upon extensive geophysical analysis and drilling, is often too fragmentary to indicate with assurance where to drill, how deep to drill and what can be expected. Thus, the exploration team knows only the density of the rock and the grade along the core. The purpose of this study is to improve the process of resource evaluation in the exploration stage by increasing prediction accuracy and making an alternative assessment of the spatial characteristics of gold mineralisation. There is significant industrial interest in finding alternatives which may speed up the drilling phase, identify anomalies and worthwhile targets, and help in establishing fair market value. Recent developments in nonconvex optimisation and high-dimensional statistics have led to the idea that some engineering problems, such as predicting gold variability at the exploration stage, can be solved with the application of clusterwise linear and penalised maximum likelihood regression techniques. This thesis attempts to solve the distribution of the mineralisation in the underlying geology using clusterwise linear regression and convex Least Absolute Shrinkage and Selection Operator (LASSO) techniques. The two presented optimisation techniques compute predictive solutions within a domain using physical data provided directly from drillholes. 
The decision-support techniques attempt a useful compromise between the traditional and recently introduced methods in optimisation and regression analysis that are developed to improve exploration targeting and to predict the gold occurrences at previously unsampled locations. Doctor of Philosophy
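
The LASSO technique named in the abstract above can be illustrated with a minimal sketch. It is a generic coordinate-descent solver for the objective (1/(2n))·||y − Xβ||² + λ·||β||₁, not the thesis's formulation; the function names and the toy numbers are hypothetical and are not drillhole data.

```python
from typing import List, Sequence


def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X: Sequence[Sequence[float]],
                             y: Sequence[float],
                             lam: float,
                             n_iter: int = 100) -> List[float]:
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            # Closed-form update for one coordinate with the others held fixed.
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0
    return beta


# Toy example: the second feature is only weakly related to y and is dropped.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_coordinate_descent(X, y, lam=0.1)
```

The L1 penalty sets weak coefficients exactly to zero, which is why LASSO performs variable selection as well as shrinkage.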

    A finder and representation system for knowledge carriers based on granular computing

    In one of his publications Aristotle states "All human beings by their nature desire to know" [Kraut 1991]. This desire is initiated the day we are born and accompanies us for the rest of our life. While at a young age our parents serve as one of the principal sources of knowledge, this changes over the course of time. Technological advances, and particularly the introduction of the Internet, have given us new possibilities to share and access knowledge from almost anywhere at any given time. Being able to access and share large collections of written-down knowledge is only one part of the equation. Just as important is its internalization, which in many cases can prove difficult to accomplish. Hence, being able to request assistance from someone who holds the necessary knowledge is of great importance, as it can positively stimulate the internalization procedure. However, digitalization provides not only a larger pool of knowledge sources to choose from but also more people who can potentially be activated in a bid to receive personalized assistance with a given problem statement or question. While this is beneficial, it raises the issue that it is hard to keep track of who knows what. For this task so-called Expert Finder Systems have been introduced, which are designed to identify and suggest the most suited candidates to provide assistance. Throughout this Ph.D. thesis a novel type of Expert Finder System is introduced that is capable of capturing the knowledge that users within a community hold, from explicit and implicit data sources. This is accomplished with the use of granular computing, natural language processing and a set of metrics that have been introduced to measure and compare the suitability of candidates. Furthermore, the knowledge requirements of a problem statement or question are assessed to ensure that only the most suited candidates are recommended to provide assistance.

    Genetic population structure and dispersal of two North American woodpeckers in ephemeral habitats

    Disturbance-dependent species regularly colonize ephemeral habitat patches. In this research, I used patterns of genetic variation to estimate the dispersal dynamics of black-backed woodpeckers (Picoides arcticus), a fire specialist, and compared these patterns to hairy woodpeckers (Picoides villosus), a generalist. I then examined how frequent colonization of ephemeral habitat patches versus stable migration among static habitat patches shapes the genetic structure of species. I examined patterns of genetic variation in mtDNA and microsatellites in both black-backed and hairy woodpeckers to determine large-scale spatial structure. Black-backed woodpeckers have high genetic connectivity across the boreal forest and lower genetic connectivity among sites separated by large gaps in forested habitat. Across the boreal forest, hairy woodpeckers have low genetic differentiation in mtDNA that lacks spatial structure, but moderate genetic differentiation in an isolation by distance pattern in microsatellite data. These results suggest that large gaps in forest act as a movement barrier to black-backed woodpeckers; movement patterns of hairy woodpeckers are primarily driven by geographic distance as opposed to landscape composition. Once I understood the primary mechanisms driving large-scale patterns, I determined the fine-scale spatial structure in both species. Black-backed woodpeckers apparently disperse twice as far as hairy woodpeckers based on patterns of fine-scale genetic structure. Female black-backed woodpeckers have limited dispersal, with long-distance dispersal being male-biased. A weak pattern of female-biased dispersal was observed in hairy woodpeckers. I used simulations to evaluate how effective population size and dispersal distance interact with two models of dispersal, frequent colonization of ephemeral patches and stable migration, to shape large-scale genetic structure. 
Frequent colonization of ephemeral habitats resulted in lower spatial structure and higher genetic differentiation among patches in comparison to stable migration. Low genetic differentiation with little spatial structure occurred at an intermediate dispersal distance in the frequent colonization model, the pattern observed in black-backed woodpeckers. Stable migration with short dispersal distance results in isolation by distance, the pattern observed in hairy woodpeckers. Disturbance-dependent species have evolved with a natural mosaic of shifting habitat patches. As anthropogenic disturbance increasingly changes this mosaic, ecologists need to consider how this shift may affect connectivity for disturbance-dependent species.

    An overview on user profiling in online social networks

    Advances in online social networks are generating huge volumes of data every day and give users many opportunities to express their interests and opinions. Due to the popularity and exposure of social networks, many intruders use these platforms for illegal purposes. Identifying such users is challenging and requires extracting knowledge from the large volume of data flowing through social media. This work gives an insight into profiling users in online social networks. User profiles are established from the behavioral patterns, correlations and activities of the user, analyzed from the aggregated data using techniques such as clustering, behavioral analysis, content analysis and face detection. The mechanism used in profiling users varies with the application and purpose. Further study of other mechanisms used in profiling users is left to future work.