436 research outputs found

    Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure

    Get PDF
    Recently, LSI (Latent Semantic Indexing) based on SVD (Singular Value Decomposition) has been proposed to overcome the problems of polysemy and homonymy in traditional lexical matching. However, it is often criticized for low discriminative power in representing documents, even though its representative quality has been validated. In this paper, SVD on clusters is proposed to improve the discriminative power of LSI. The contribution of this paper is threefold. Firstly, we survey existing linear algebra methods for LSI, including both SVD-based and non-SVD-based methods. Secondly, we propose SVD on clusters for LSI and explain theoretically that dimension expansion of document vectors and dimension projection using SVD are the two manipulations involved in SVD on clusters. Moreover, we develop updating processes to fold new documents and terms into a matrix decomposed by SVD on clusters. Thirdly, two corpora, a Chinese corpus and an English corpus, are used to evaluate the performance of the proposed methods. Experiments demonstrate that, to some extent, SVD on clusters can improve the precision of the interdocument similarity measure in comparison with other SVD-based LSI methods.
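
    As a rough illustration of the idea described above (cluster the documents first, then apply a truncated SVD within each cluster before measuring interdocument similarity), a minimal sketch follows. It is not the authors' implementation; the toy corpus, the number of clusters, and the retained rank are arbitrary choices for demonstration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy corpus with two topics, so the clustering step has something to find.
docs = ["shares rose on the stock market",
        "the market rallied as shares climbed",
        "the striker scored a late goal",
        "a goal in extra time won the match"]

X = TfidfVectorizer().fit_transform(docs)                  # document-term matrix (rows = documents)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Apply a truncated SVD (LSI) inside each cluster and compare documents there.
for c in np.unique(labels):
    idx = np.where(labels == c)[0]
    if len(idx) < 2:                                       # nothing to compare in a singleton cluster
        continue
    rank = min(2, len(idx))                                # keep the rank feasible for tiny clusters
    Z = TruncatedSVD(n_components=rank, random_state=0).fit_transform(X[idx])
    print("cluster", c, "documents", idx.tolist())
    print(cosine_similarity(Z).round(2))
```

    Documents in the same cluster are compared in that cluster's own low-rank space, which is where the extra discriminative power claimed above is supposed to come from.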

    Tetrahedral mesh improvement using moving mesh smoothing, lazy searching flips, and RBF surface reconstruction

    Get PDF
    Given a tetrahedral mesh and objective functionals measuring the mesh quality, which take into account the shape, size, and orientation of the mesh elements, our aim is to improve the mesh quality as much as possible. In this paper, we combine moving mesh smoothing, based on the integration of an ordinary differential equation derived from a given functional, with the lazy flip technique, a reversible edge removal algorithm that modifies the mesh connectivity. Moreover, we use radial basis function (RBF) surface reconstruction to improve tetrahedral meshes with curved boundary surfaces. Numerical tests show that combining these techniques into a mesh improvement framework achieves results that are comparable to, and even better than, previously reported ones.
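
    The moving-mesh component can be pictured as integrating a gradient flow of a mesh functional over the free vertex positions. The sketch below uses a deliberately simple placeholder functional (the sum of squared edge lengths) and an explicit Euler integrator; the actual functional, integrator, flips, and RBF reconstruction of the paper are not reproduced.

```python
import numpy as np

def smooth(verts, tets, fixed, steps=50, dt=0.1):
    """Move free vertices along -grad E with explicit Euler steps,
    where E = 0.5 * sum of squared edge lengths (a placeholder functional)."""
    verts = verts.astype(float).copy()
    edges = set()
    for t in tets:                                   # collect the unique mesh edges
        for i in range(4):
            for j in range(i + 1, 4):
                edges.add((min(t[i], t[j]), max(t[i], t[j])))
    edges = np.array(sorted(edges))
    free = ~np.isin(np.arange(len(verts)), fixed)
    for _ in range(steps):
        grad = np.zeros_like(verts)
        d = verts[edges[:, 0]] - verts[edges[:, 1]]  # edge vectors
        np.add.at(grad, edges[:, 0], d)              # dE/dx for each edge endpoint
        np.add.at(grad, edges[:, 1], -d)
        verts[free] -= dt * grad[free]               # one explicit Euler step
    return verts

# Octahedron split into 8 tetrahedra around one interior vertex (index 6),
# which starts off-centre; smoothing pulls it back towards the origin.
outer = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]])
verts = np.vstack([outer, [[0.4, -0.2, 0.3]]])
tets = [(a, b, c, 6) for a in (0, 1) for b in (2, 3) for c in (4, 5)]
print(smooth(verts, tets, fixed=range(6))[6])        # close to [0, 0, 0]
```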

    Tetrahedral Mesh Improvement Using Moving Mesh Smoothing and Lazy Searching Flips

    Get PDF
    In this paper we combine two new techniques, smoothing and flipping. The moving mesh smoothing is based on the integration of an ordinary differential equation derived from a given functional. The lazy flip technique is a reversible edge removal algorithm that automatically searches for flips yielding local quality improvement. On their own, these strategies already provide good mesh improvement, but their combination achieves results that have not been reported so far. The numerical examples provided show that we can obtain final tetrahedral meshes with dihedral angles between 40 and 123 degrees. We compare the new method with other publicly available mesh improvement codes.
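
    The dihedral angles quoted above are the usual quality measure for tetrahedra. The following sketch, with helper names of our own choosing, shows how the six dihedral angles of a single tetrahedron can be computed; a mesh-quality report would then take the minimum and maximum over all elements.

```python
import itertools
import numpy as np

def dihedral_angles(tet):
    """Return the six dihedral angles (degrees) of a tetrahedron given as a 4x3 array."""
    angles = []
    for i, j in itertools.combinations(range(4), 2):
        k, m = [v for v in range(4) if v not in (i, j)]
        a, b, c, d = tet[i], tet[j], tet[k], tet[m]
        e = (b - a) / np.linalg.norm(b - a)              # unit direction of the shared edge
        # Components of the two opposite vertices perpendicular to that edge.
        vc = (c - a) - np.dot(c - a, e) * e
        vd = (d - a) - np.dot(d - a, e) * e
        cos = np.dot(vc, vd) / (np.linalg.norm(vc) * np.linalg.norm(vd))
        angles.append(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
    return angles

# Regular tetrahedron: every dihedral angle should be about 70.53 degrees.
reg = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], float)
print([round(a, 2) for a in dihedral_angles(reg)])
```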

    Cumulative reports and publications

    Get PDF
    A complete list of Institute for Computer Applications in Science and Engineering (ICASE) reports is presented. Since ICASE reports are intended to be preprints of articles that will appear in journals or conference proceedings, the published reference is included when it is available. The major categories of the current ICASE research program are: applied and numerical mathematics, including numerical analysis and algorithm development; theoretical and computational research in fluid mechanics in selected areas of interest to LaRC, including acoustics and combustion; experimental research in transition, turbulence, and aerodynamics involving LaRC facilities and scientists; and computer science.

    New Fundamental Technologies in Data Mining

    Get PDF
    The progress of data mining technology and its broad public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving the problems in the subsequent chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.

    Fifth Biennial Report : June 1999 - August 2001

    No full text

    Sixth Biennial Report : August 2001 - May 2003

    No full text

    Analysis and Design of Communication Avoiding Algorithms for Out of Memory (OOM) SVD

    Get PDF
    Many applications, including big data analytics, information retrieval, gene expression analysis, and numerical weather prediction, require the singular value decomposition (SVD) of large, dense matrices. The matrices used in many of these applications are becoming too large to fit into a computer's main memory at one time, and traditional SVD algorithms that require all the matrix components to be loaded into memory before computation starts cannot be used directly. Moving data (communication) between levels of the memory hierarchy and the disk poses extra challenges in designing SVD algorithms for such big matrices, because of the exponential growth in the gap between floating-point arithmetic rate and bandwidth for many different storage devices on modern high performance computers. In this dissertation, we have analyzed communication overhead on hierarchical memory systems and disks for SVD algorithms and designed communication-avoiding (CA) Out of Memory (OOM) SVD algorithms. By Out of Memory we mean that the matrix is too big to fit in the main memory and therefore must reside in external or internal storage. We have studied the communication overhead of classical one-stage blocked SVD and two-stage tiled SVD algorithms and proposed our OOM SVD algorithm, which reduces the communication cost. We have presented theoretical analysis and strategies for designing CA OOM SVD algorithms, developed an optimized implementation of CA OOM SVD for multicore architectures, and presented its performance results. When matrices are tall, the performance of OOM SVD can be improved significantly by first carrying out a QR decomposition of the original matrix. The upper triangular matrix generated by the QR decomposition may fit in the main memory, so an in-core SVD can be used efficiently. Even if the upper triangular matrix does not fit in the main memory, OOM SVD will work on a smaller matrix. For this reason, we have analyzed communication reduction for the OOM QR algorithm, implemented an optimized OOM tiled QR for multicore systems, and shown the performance improvement of OOM SVD algorithms for tall matrices.
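
    The QR-first trick for tall matrices described at the end of the abstract can be illustrated with a small in-core sketch: factor A = QR, take the SVD of the small triangular factor R, and recover the SVD of A. The out-of-memory tiling and communication-avoiding scheduling that the dissertation develops are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100_000, 64))               # tall and skinny

Q, R = np.linalg.qr(A, mode='reduced')               # R is only 64 x 64
Ur, s, Vt = np.linalg.svd(R)                         # cheap in-core SVD of the small factor
U = Q @ Ur                                           # left singular vectors of A

# The singular values of A match those computed directly (up to round-off).
print(np.allclose(s, np.linalg.svd(A, compute_uv=False)))
```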

    New Methods for Network Traffic Anomaly Detection

    Get PDF
    In this thesis we examine the efficacy of applying outlier detection techniques to understand the behaviour of anomalies in communication network traffic. We have identified several shortcomings. Our most significant finding is that known techniques focus on characterizing either the spatial or the temporal behaviour of traffic, but rarely both. For example, DoS attacks are anomalies that violate temporal patterns, while port scans violate the spatial equilibrium of network traffic. To address this weakness, we have designed a new method for outlier detection based on spectral decomposition of the Hankel matrix. The Hankel matrix is a spatio-temporal correlation matrix and has been used in many other domains, including climate data analysis and econometrics. Using our approach we can seamlessly integrate the discovery of both spatial and temporal anomalies. Comparison with other state-of-the-art methods in the networks community confirms that our approach can discover both DoS and port scan attacks. The spectral decomposition of the Hankel matrix is closely tied to the problem of inference in Linear Dynamical Systems (LDS). We introduce a new problem, the Online Selective Anomaly Detection (OSAD) problem, to model the situation where the objective is to report new anomalies in the system and suppress known faults. For example, in the network setting an operator may be interested in triggering an alarm for malicious attacks but not for faults caused by equipment failure. To solve OSAD we combine techniques from machine learning and control theory in a unique fashion. Machine learning ideas are used to learn the parameters of an underlying data-generating system. Control theory techniques are used to model the feedback and modify the residual generated by the data-generating state model. Experiments on synthetic and real data sets confirm that the OSAD problem captures a general scenario and tightly integrates machine learning and control theory to solve a practical problem.
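
    A minimal sketch of the Hankel-based detection idea (not the thesis's OSAD formulation) is shown below: embed a traffic series in a Hankel trajectory matrix, keep its dominant singular subspace as the model of normal behaviour, and score each window by its residual. The window length, rank, and injected anomaly are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
traffic = np.sin(np.linspace(0, 20 * np.pi, 1000)) + 0.1 * rng.standard_normal(1000)
traffic[600:605] += 4.0                              # injected burst, a stand-in for a DoS-like spike

L = 50                                               # embedding (window) length
H = np.column_stack([traffic[i:i + L] for i in range(len(traffic) - L + 1)])

U, s, Vt = np.linalg.svd(H, full_matrices=False)
k = 4                                                # dominant "normal" subspace
residual = H - U[:, :k] @ np.diag(s[:k]) @ Vt[:k]    # what the low-rank model fails to explain

score = np.linalg.norm(residual, axis=0)             # one anomaly score per window
print("most anomalous window starts at t =", int(np.argmax(score)))
# Prints a start index of a window overlapping the burst injected around t = 600.
```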