
    Topic-based mixture language modelling

    This paper describes an approach for constructing a mixture of language models based on simple statistical notions of semantics, using probabilistic models developed for information retrieval. The approach encapsulates corpus-derived semantic information and is able to model varying styles of text. Using such information, the corpus texts are clustered in an unsupervised manner and a mixture of topic-specific language models is automatically created. The principal contribution of this work is to characterise the document space resulting from information retrieval techniques and to demonstrate the approach for mixture language modelling. A comparison is made between manual and automatic clustering in order to elucidate how the global content information is expressed in the space. We also compare (in terms of association with manual clustering and language modelling accuracy) alternative term-weighting schemes and the effect of singular value decomposition dimension reduction (latent semantic analysis). Test set perplexity results using the British National Corpus indicate that the approach can improve the potential of statistical language modelling. Using an adaptive procedure, the conventional model may be tuned to track text data with a slight increase in computational cost.
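
The idea of a mixture of topic-specific language models can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes add-alpha smoothed unigram models (the abstract does not state the model order), two hand-made toy clusters in place of the unsupervised clustering, and fixed mixture weights. All function names are hypothetical.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def mixture_prob(word, models, weights):
    """P(word) = sum_k weights[k] * P_k(word); weights are assumed to sum to 1."""
    return sum(lam * model[word] for lam, model in zip(weights, models))


def perplexity(tokens, models, weights):
    """Test-set perplexity of the mixture model on a token sequence."""
    log_sum = sum(math.log(mixture_prob(w, models, weights)) for w in tokens)
    return math.exp(-log_sum / len(tokens))


# Toy "topic clusters" (assumed; the paper derives clusters automatically).
sport = "the team won the match".split()
finance = "the bank raised the rate".split()
vocab = sorted(set(sport) | set(finance))
models = [unigram_model(sport, vocab), unigram_model(finance, vocab)]

test_text = "the team won".split()
pp_sport_heavy = perplexity(test_text, models, [0.9, 0.1])
pp_finance_heavy = perplexity(test_text, models, [0.1, 0.9])
print(pp_sport_heavy < pp_finance_heavy)
```

Weighting the mixture toward the cluster that matches the test text lowers its perplexity, which is the effect an adaptive weighting procedure would try to exploit.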

    Bidirectional-Convolutional LSTM Based Spectral-Spatial Feature Learning for Hyperspectral Image Classification

    This paper proposes a novel deep learning framework named bidirectional-convolutional long short term memory (Bi-CLSTM) network to automatically learn spectral-spatial features from hyperspectral images (HSIs). In the network, spectral feature extraction is treated as a sequence learning problem, and a recurrent connection operator across the spectral domain is used to address it. Meanwhile, inspired by the widely used convolutional neural network (CNN), a convolution operator across the spatial domain is incorporated into the network to extract spatial features. In addition, to sufficiently capture the spectral information, a bidirectional recurrent connection is proposed. In the classification phase, the learned features are concatenated into a vector and fed to a softmax classifier via a fully-connected operator. To validate the effectiveness of the proposed Bi-CLSTM framework, we compare it with several state-of-the-art methods, including the CNN framework, on three widely used HSIs. The obtained results show that Bi-CLSTM can improve the classification performance as compared to other methods.

    Recognition and reconstruction of coherent energy with application to deep seismic reflection data

    Reflections in deep seismic reflection data tend to be visible on only a limited number of traces in a common midpoint gather. To prevent stack degeneration, any noncoherent reflection energy has to be removed. In this paper, a standard classification technique in remote sensing is presented to enhance data quality. It consists of a recognition technique to detect and extract coherent energy in both common shot gathers and final stacks. This technique uses the statistics of a picked seismic phase to obtain the likelihood distribution of its presence. Multiplication of this likelihood distribution with the original data results in a “cleaned up” section. Application of the technique to data from a deep seismic reflection experiment enhanced the visibility of all reflectors considerably. Because the recognition technique cannot produce an estimate of “missing” data, it is extended with a reconstruction method. Two methods are proposed: application of semblance weighted local slant stacks after recognition, and direct recognition in the linear tau-p domain. In both cases, the power of the stacking process to increase the signal-to-noise ratio is combined with the direct selection of only specific seismic phases. The joint application of recognition and reconstruction resulted in data images which showed reflectors more clearly than application of a single technique.

    Advanced correlation-based character recognition applied to the Archimedes Palimpsest

    The Archimedes Palimpsest is a manuscript containing the partial text of seven treatises by Archimedes that were copied onto parchment and bound in the tenth century AD. This work is aimed at providing tools that allow scholars of ancient Greek mathematics to retrieve as much information as possible from images of the remaining degraded text. A correlation pattern recognition (CPR) system has been developed to recognize distorted versions of Greek characters in problematic regions of the palimpsest imagery, which have been obscured by damage from mold and fire, overtext, and natural aging. Feature vectors for each class of characters are constructed using a series of spatial correlation algorithms and corresponding performance metrics. Principal components analysis (PCA) is employed prior to classification to remove features corresponding to filtering schemes that performed poorly for the spatial characteristics of the selected region of interest (ROI). A probability is then assigned to each class, forming a character probability distribution based on relative distances from the class feature vectors to the ROI feature vector in principal component (PC) space. However, the current CPR system does not produce a single classification decision, as is common in most target detection problems, but instead has been designed to provide intermediate results that allow the user to apply his or her own decisions (or evidence) to arrive at a conclusion. To achieve this result, a probabilistic network has been incorporated into the recognition system. A probabilistic network represents a method for modeling the uncertainty in a system, and for this application, it allows information from the existing partial transcription and contextual knowledge from the user to be an integral part of the decision-making process.
The CPR system was designed to provide a framework for future research in the area of spatial pattern recognition by accommodating a broad range of applications and the development of new filtering methods. For example, during preliminary testing, the CPR system was used to confirm the publication date of a fifteenth-century Hebrew colophon, and demonstrated success in the detection of registration markers in three-dimensional MRI breast imaging. In addition, a new correlation algorithm that exploits the benefits of linear discriminant analysis (LDA) and the inherent shift invariance of spatial correlation has been derived, implemented, and tested. Results show that this composite filtering method provides a high level of class discrimination while maintaining tolerance to within-class distortions. With the integration of this algorithm into the existing filter library, this work completes each stage of a cyclic workflow using the developed CPR system, and provides the necessary tools for continued experimentation.
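
The step of turning relative distances in PC space into a character probability distribution can be sketched as follows. The abstract does not give the formula, so this sketch assumes a simple inverse-distance weighting normalised to sum to one; the function name, the `eps` guard, and the toy two-dimensional vectors are all hypothetical.

```python
import math


def class_probabilities(class_vectors, roi_vector, eps=1e-12):
    """Map each class label to a probability that shrinks with its distance to the ROI.

    Assumption: probability is proportional to 1 / distance (not taken from the source).
    """
    dists = {label: math.dist(vec, roi_vector) for label, vec in class_vectors.items()}
    # eps avoids division by zero when the ROI coincides with a class vector.
    inverse = {label: 1.0 / (d + eps) for label, d in dists.items()}
    total = sum(inverse.values())
    return {label: value / total for label, value in inverse.items()}


# Toy class feature vectors in a two-dimensional "PC space" (illustrative values only).
classes = {"alpha": (0.0, 0.0), "beta": (3.0, 4.0), "gamma": (6.0, 8.0)}
probs = class_probabilities(classes, (0.0, 1.0))
for label in sorted(probs, key=probs.get, reverse=True):
    print(label, round(probs[label], 3))
```

Returning the full distribution rather than only the top-ranked class matches the abstract's point that the system provides intermediate results instead of a single classification decision.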

    Damage and repair classification in reinforced concrete beams using frequency domain data

    This research aims to develop a new vibration-based damage classification technique that can be applied efficiently to large volumes of real-time data. The statistical pattern recognition paradigm is well suited to building a reliable site-location damage diagnosis system. By adopting this paradigm, finite element and other inverse models, with their intensive computations, corrections and inherent inaccuracies, can be avoided. In this research, a two-stage combination of principal component analysis and the Karhunen-Loève transformation (also known as canonical correlation analysis) was proposed as a statistical damage classification technique. Vibration measurements from the frequency domain were tested as possible damage-sensitive features. The performance of the proposed system was tested and verified on real vibration measurements collected from five laboratory-scale reinforced concrete beams modelled with various ranges of defects. The results of the system helped in distinguishing between normal and damaged patterns in structural vibration data. Most importantly, the system further divided each main damage group into subgroups according to severity of damage. Its efficiency was demonstrated on data from both frequency response functions and response-only functions. The two-stage system produced realistic detection and classification and outperformed principal component analysis alone. The success of this classification model is well supported because the observed clusters come from well-controlled and known state conditions.

    An Undergraduate Intern Model for Mathematics Teacher Preparation
