
    A resource-frugal probabilistic dictionary and applications in (meta)genomics

    The genomic and metagenomic fields, which generate huge sets of short genomic sequences, have brought their own share of high-performance problems. To extract relevant pieces of information from the huge data sets produced by current sequencing techniques, one must rely on extremely scalable methods and solutions. Indexing billions of objects is considered too expensive a task, yet it is a fundamental need in this field. In this paper we propose a straightforward indexing structure that scales to billions of elements, and we propose two direct applications in genomics and metagenomics. We show that our proposal solves problem instances for which no other known solution scales up. We believe that many tools and applications could benefit from either the fundamental data structure we provide or from the applications developed on top of it. Comment: Submitted to PSC 201
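
    The abstract does not describe the structure's internals; as a hedged illustration of the general idea behind a resource-frugal probabilistic dictionary, the Python sketch below stores only a short hash fingerprint per key instead of the key itself, trading a small, quantifiable false-positive rate for a much smaller memory footprint. All names and parameters are illustrative assumptions, not the paper's design.

```python
# Hypothetical sketch of a fingerprint-based probabilistic dictionary.
# Real resource-frugal designs pair a minimal perfect hash with packed
# fingerprint arrays; a plain dict is used here only for readability.
import hashlib

class ProbabilisticDict:
    def __init__(self, fingerprint_bits=16):
        self.fingerprint_bits = fingerprint_bits
        self._table = {}  # fingerprint -> value

    def _fingerprint(self, key):
        # Short hash of the key; collisions cause false positives.
        digest = hashlib.blake2b(key.encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % (1 << self.fingerprint_bits)

    def insert(self, key, value):
        self._table[self._fingerprint(key)] = value

    def query(self, key):
        # A never-inserted key may collide with a stored fingerprint,
        # returning a spurious value with probability ~ n / 2**bits.
        return self._table.get(self._fingerprint(key))

pd = ProbabilisticDict()
pd.insert("ACGTACGT", 42)    # e.g. map a k-mer to an abundance count
print(pd.query("ACGTACGT"))  # 42
```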

    Exploratory study to explore the role of ICT in the process of knowledge management in an Indian business environment

    In the 21st century, with the emergence of a digital economy, knowledge and the knowledge-based economy are growing rapidly. The ability to understand the processes involved in creating, managing and sharing knowledge in the business environment is critical to the success of an organization. This study builds on the authors' previous research on the enablers of knowledge management by identifying the relationship between those enablers and the role played by information and communication technologies (ICT) and ICT infrastructure in a business setting. This paper provides the findings of a survey conducted in four major Indian cities (Chennai, Coimbatore, Madurai and Villupuram) regarding views and opinions about the enablers of knowledge management in a business setting. A total of 80 organizations participated in the study, with 100 participants in each city. The results show that ICT and ICT infrastructure can play a critical role in creating, managing and sharing knowledge in an Indian business environment.

    Indexing Iris Database Using Multi-Dimensional R-Trees

    The iris is one of the most widely used biometric modalities for recognition, owing to its reliability, non-invasive nature, speed and performance. Iris patterns remain stable throughout the lifetime of an individual. Because of these advantages, the application of iris biometrics is increasingly encouraged by commercial as well as government agencies. Indexing identifies and retrieves a small subset of candidate records from a large database of individuals' iris data in order to determine a possible match. Since such databases are extremely large, fast and efficient indexing methods are necessary. In this thesis, an efficient local-feature-based indexing approach is proposed using clustered scale-invariant feature transform (SIFT) keypoints, which achieves invariance to similarity transformations, illumination and occlusion. The cluster centers are used to construct R-trees for indexing; the thesis thus proposes an application of R-trees to iris database indexing. The system is tested using the publicly available BATH and CASIA-IrisV4 databases.
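
    As a rough sketch of the pipeline the abstract outlines (cluster keypoint descriptors, then index the cluster centers in a multi-dimensional R-tree), the Python fragment below uses randomly generated stand-in descriptors, scikit-learn's KMeans and the rtree package; the dimensionality, cluster count and library choices are assumptions, not the thesis's actual configuration.

```python
# Illustrative only: random vectors stand in for SIFT descriptors,
# projected to 4-D, since R-trees degrade in very high dimensions.
import numpy as np
from sklearn.cluster import KMeans
from rtree import index  # pip install rtree (libspatialindex wrapper)

rng = np.random.default_rng(0)
descriptors = rng.random((500, 4))  # stand-in keypoint descriptors

# Cluster the keypoints; the cluster centers form the compact
# signature that gets indexed instead of every raw keypoint.
centers = KMeans(n_clusters=8, n_init=10,
                 random_state=0).fit(descriptors).cluster_centers_

# Build a 4-D R-tree and insert each center as a degenerate box.
props = index.Property()
props.dimension = 4
tree = index.Index(properties=props)
for i, c in enumerate(centers):
    tree.insert(i, tuple(c) + tuple(c))  # (mins..., maxs...)

# Identification: probe with a query signature to retrieve the
# nearest stored centers, i.e. a short-list of candidate identities.
probe = centers[0] + 0.01
print(list(tree.nearest(tuple(probe) + tuple(probe), 3)))
```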

    Improvement of fingerprint retrieval by a statistical classifier

    The topics of fingerprint classification, indexing, and retrieval have been studied extensively in the past decades. One problem faced by researchers is that in all publicly available fingerprint databases, only a few fingerprint samples from each individual are available for training and testing, making it inappropriate to use sophisticated statistical methods for recognition. Hence most previous works resorted to simple k-nearest neighbor (k-NN) classification. However, the k-NN classifier has the drawbacks of being comparatively slow and less accurate. In this paper, we tackle this problem by first artificially expanding the set of training samples using our previously proposed spatial modeling technique. With the expanded training set, we are then able to employ a more sophisticated classifier such as the Bayes classifier for recognition. We apply the proposed method to the problem of one-to-N fingerprint identification and retrieval. The accuracy and speed are evaluated using the benchmark FVC 2000, FVC 2002, and NIST-4 databases, and satisfactory retrieval performance is achieved. © 2010 IEEE.
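
    To make the expand-then-classify idea concrete: the hedged Python sketch below substitutes simple Gaussian jitter for the paper's spatial modeling technique and scikit-learn's GaussianNB for its Bayes classifier; the feature vectors, sizes and noise level are all invented for illustration.

```python
# Not the paper's method: Gaussian jitter stands in for the spatial
# modeling expansion, and GaussianNB for the Bayes classifier.
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)
n_people, dim = 20, 32
templates = rng.random((n_people, dim))  # one enrolled print each

# Expand each single sample into 50 synthetic ones so a parametric
# (Bayes) classifier can be trained for every identity.
X = np.vstack([t + 0.02 * rng.standard_normal((50, dim))
               for t in templates])
y = np.repeat(np.arange(n_people), 50)
clf = GaussianNB().fit(X, y)

# One-to-N identification: rank identities by posterior probability.
probe = templates[7] + 0.02 * rng.standard_normal(dim)
posterior = clf.predict_proba([probe])[0]
print(np.argsort(posterior)[::-1][:5])  # top-5 candidate identities
```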

    A Survey to Fix the Threshold and Implementation for Detecting Duplicate Web Documents

    The drastic growth of the information accessible on the World Wide Web has made the employment of automated tools to locate information resources of interest, and to track and analyze them, a necessity. Web mining is the branch of data mining that deals with the analysis of the World Wide Web. Concepts from areas such as data mining, Internet technology and the World Wide Web, and more recently the Semantic Web, can be said to be the origins of web mining. Web mining can be defined as the procedure of discovering hidden yet potentially beneficial knowledge from the data accessible on the web. Web mining comprises the sub-areas web content mining, web structure mining, and web usage mining. Web content mining is the process of mining knowledge from web pages and other web objects. Web structure mining is the process of mining knowledge about the link structure connecting web pages and other web objects. Web usage mining is the process of mining the usage patterns created by users accessing web pages. Search engine technology has been central to the development of the World Wide Web: search engines are the chief gateways for access to information on the web, and the ability to locate content of particular interest amidst a huge heap has made businesses more profitable and productive. Search engines respond to queries by employing the process of web crawling, which populates an indexed repository of web pages. Crawler programs build a local repository of the segment of the web that they visit by navigating the web graph and retrieving pages. There are two main types of crawling, namely generic and focused crawling. Generic crawlers crawl documents and links of diverse topics, while focused crawlers limit the pages they fetch with the aid of some previously obtained specialized knowledge. The systems that index, mine, and otherwise analyze pages (such as search engines) take their inputs from the repositories of web pages built by web crawlers. The drastic growth of the Internet and the growing necessity to incorporate heterogeneous data are accompanied by the issue of near-duplicate data. Even though near-duplicate documents are not bitwise identical, they are remarkably similar. Duplicate and near-duplicate web pages increase index storage space and serving costs and slow down retrieval, annoying users and causing serious problems for web search engines. Hence it is essential to design algorithms to detect such pages.
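
    A common baseline for the detection problem this survey addresses is word shingling with a Jaccard-similarity threshold; the minimal Python sketch below illustrates it, with the shingle size and the 0.35 threshold chosen arbitrarily for the example rather than fixed by the survey.

```python
# Minimal near-duplicate check: k-word shingles + Jaccard similarity.
# The threshold is an assumed example value, not one from the survey.
def shingles(text, k=3):
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def near_duplicate(doc1, doc2, threshold=0.35):
    return jaccard(shingles(doc1), shingles(doc2)) >= threshold

d1 = "the quick brown fox jumps over the lazy dog"
d2 = "the quick brown fox leaps over the lazy dog"
print(jaccard(shingles(d1), shingles(d2)))  # 0.4: one word differs
print(near_duplicate(d1, d2))               # True at this threshold
```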

    On User Modelling for Personalised News Video Recommendation

    In this paper, we introduce a novel approach for modelling user interests. Our approach captures users' evolving information needs, identifies aspects of each need, and recommends relevant news items to users. We introduce our approach within the context of personalised news video retrieval. A news video data set is used for experimentation, and we employ a simulated-user evaluation.

    The enablers and implementation model for mobile KMS in Australian healthcare

    In this research project, the enablers of implementing mobile KMS in Australian regional healthcare will be investigated, and a validated framework and guidelines to assist healthcare organisations in implementing mobile KMS will be proposed, using both qualitative and quantitative approaches. The outcomes of this study are expected to improve the understanding of the enabling factors in implementing mobile KMS in Australian healthcare, as well as to provide better guidelines for this process.
