2 research outputs found

Discovery of error-tolerant biclusters from noisy gene expression data

Author: A Ben-Dor
A Gyenesei
A Poernomo
A Prelic
A Subramanian
A Tanay
C Becquet
C Creighton
C Yang
G Pandey
H Cheng
I Dhillon
J Besson
J Han
J Liu
J Seppänen
M Ashburner
M Zhang
Navneet Rao
R Gupta
R Rastogi
R Srikant
Rohit Gupta
S Bergmann
S Hanhijärvi
SC Madeira
T Calders
T Fukuda
T Hughes
T Mcintosh
Vipin Kumar
Y Cheng
Publication venue: BioMed Central
Publication date: 01/01/2011

An important analysis performed on microarray gene-expression data is to discover biclusters, which denote groups of genes that are coherently expressed for a subset of conditions. Various biclustering algorithms have been proposed to find different types of biclusters from these real-valued gene-expression data sets. However, these algorithms suffer from several limitations, such as an inability to explicitly handle errors/noise in the data; difficulty in discovering small biclusters due to their top-down approach; and the inability of some approaches to find overlapping biclusters, which is crucial because many genes participate in multiple biological processes. Association pattern mining also produces biclusters as its result and can naturally address some of these limitations. However, traditional association mining only finds exact biclusters, whic

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Graph Data Processing and Analysis: From Algorithms to System Development

Author: Li Shunyang
Publication venue: UNSW, Sydney
Publication date: 01/01/2022

There are many real-world application domains where data can be naturally modelled as graphs, such as social networks and computer networks. The amount of data generated and published is rapidly increasing with the explosion of information. Effective storage and querying of graph data has become a significant challenge, and graph databases are emerging to address it. Graph databases have unique advantages when handling graph data: modelling and querying complex relationships, capturing and navigating complex data relationships, and recursive path querying. In this thesis, we enhance graph databases from both system and algorithm perspectives. Firstly, we propose two systems, SQL2Cypher and FSPS, to improve the usability and efficiency of graph databases. SQL2Cypher automatically migrates data from a relational database to a graph database. This system also supports translating SQL queries into Cypher queries. FSPS is the first FPGA-based system for accelerating graph queries on massive graphs. FSPS has the following features: 1) a CPU-FPGA co-designed framework, 2) a fully pipelined FPGA execution, and 3) reduced data transfer from FPGA’s external memory. FSPS supports the two most fundamental types of graph queries, namely subgraph and path queries. Performance evaluation shows that FSPS outperforms the most popular graph database, Neo4j, by up to three orders of magnitude. All the draft demo videos can be found at https://www.youtube.com/watch?v=oSpHtJ8iVio and https://www.youtube.com/watch?v=eGaeBrVTJws. Secondly, graph databases (e.g., Neo4j and PatMat) do not widely support cohesive subgraph models. Many real-world relationships can be naturally represented as bipartite graphs, such as customer-product, user-item, and author-paper. Therefore, we investigate the bipartite hierarchy model and develop efficient construction algorithms for it.
The bipartite hierarchy is the first model to discover the hierarchical structure of bipartite graphs based on the concept of (alpha, beta)-core and graph connectivity. These algorithms can effectively identify the affected regions to limit computation scope and avoid re-building the bipartite hierarchy from scratch. Extensive experiments on 10 real-world graphs demonstrate the effectiveness of the proposed bipartite hierarchy and validate the efficiency of our hierarchy construction algorithms.
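
The abstract does not define the (alpha, beta)-core it builds on. As a rough illustration only, the sketch below uses the commonly cited definition (an assumption here, not taken from the thesis): the maximal subgraph of a bipartite graph in which every upper-layer vertex has degree at least alpha and every lower-layer vertex has degree at least beta. The function name `alpha_beta_core`, the edge-list input, and the customer-product toy data are all hypothetical, and this simple peeling procedure is not the thesis's hierarchy construction algorithm.

```python
from collections import defaultdict


def alpha_beta_core(edges, alpha, beta):
    """Return (upper, lower) vertex sets of the (alpha, beta)-core.

    edges: iterable of (u, v) pairs with u in the upper layer and v in the
    lower layer. Assumed definition: every surviving upper vertex keeps
    degree >= alpha and every surviving lower vertex keeps degree >= beta.
    """
    upper_adj = defaultdict(set)
    lower_adj = defaultdict(set)
    for u, v in edges:
        upper_adj[u].add(v)
        lower_adj[v].add(u)
    # Plain dicts so that a stray lookup cannot silently re-create a vertex.
    upper_adj, lower_adj = dict(upper_adj), dict(lower_adj)

    # Start with every vertex that already violates its degree threshold.
    stack = [("U", u) for u, nbrs in upper_adj.items() if len(nbrs) < alpha]
    stack += [("L", v) for v, nbrs in lower_adj.items() if len(nbrs) < beta]
    scheduled = set(stack)

    # Peel: removing a vertex lowers its neighbours' degrees, which may
    # push them below their threshold and schedule them for removal too.
    while stack:
        side, x = stack.pop()
        if side == "U":
            for v in upper_adj.pop(x):
                lower_adj[v].discard(x)
                if len(lower_adj[v]) < beta and ("L", v) not in scheduled:
                    scheduled.add(("L", v))
                    stack.append(("L", v))
        else:
            for u in lower_adj.pop(x):
                upper_adj[u].discard(x)
                if len(upper_adj[u]) < alpha and ("U", u) not in scheduled:
                    scheduled.add(("U", u))
                    stack.append(("U", u))
    return set(upper_adj), set(lower_adj)


# Hypothetical customer-product graph (upper layer: customers).
toy_edges = [
    ("c1", "p1"), ("c1", "p2"),
    ("c2", "p1"), ("c2", "p2"),
    ("c3", "p2"), ("c3", "p3"),
]
core_upper, core_lower = alpha_beta_core(toy_edges, alpha=2, beta=2)
```

On this toy graph, `p3` has only one customer, so it is peeled first; that leaves `c3` with a single product, so `c3` is peeled as well, and the (2, 2)-core is the block formed by `c1`, `c2`, `p1`, and `p2`. Nesting such cores across different (alpha, beta) values is presumably what gives the hierarchy its levels, but the abstract does not spell out the details.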

UNSWorks