
55 research outputs found

bdbms -- A Database Management System for Biological Data

Author: Aref Walid G.
Eltabakh Mohamed Y.
Ouzzani Mourad
Publication date: 01/12/2006

Biologists are increasingly using databases for storing and managing their data. Biological databases typically consist of a mixture of raw data, metadata, sequences, annotations, and related data obtained from various sources. Current database technology lacks several functionalities that are needed by biological databases. In this paper, we introduce bdbms, an extensible prototype database management system for supporting biological data. bdbms extends the functionalities of current DBMSs to include: (1) Annotation and provenance management, including storage, indexing, manipulation, and querying of annotation and provenance as first-class objects in bdbms, (2) Local dependency tracking to track the dependencies and derivations among data items, (3) Update authorization to support data curation via content-based authorization, in contrast to identity-based authorization, and (4) New access methods and their supporting operators that support pattern matching on various types of compressed biological data types. This paper presents the design of bdbms along with the techniques proposed to support these functionalities, including an extension to SQL. We also outline some open issues in building bdbms.

Comment: This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/). You may copy, distribute, display, and perform the work, make derivative works and make commercial use of the work, but you must attribute the work to the author and CIDR 2007. 3rd Biennial Conference on Innovative Data Systems Research (CIDR), January 7-10, 2007, Asilomar, California, US

arXiv.org e-Print Archive

CiteSeerX

Purdue E-Pubs

Blockchain for Healthcare Systems: Concepts, Applications, Challenges, and Future Trends

Author: Eltabakh Mohamed nasr, Emad Abdelrahman, Roayat Abdelfatah, Mostafa
Publication venue: Arab Journals Platform
Publication date: 12/11/2023

Electronic medical records are digital documents that contain medical data pertaining to a patient's medical care. Because electronic health records are regularly exchanged amongst stakeholders in healthcare, they are prone to a range of challenges such as data misuse and loss of privacy and security. These challenges may be solved by utilizing blockchain-based technologies in the healthcare area. Blockchain is a decentralized innovative technology that can completely transform, reshape, and reinvent how data is stored and processed in the healthcare sector. In this article, we offer an overview of the blockchain, its formation, its types, and how it works. We review the various applications of blockchain in the medical field and how blockchain has revolutionized the medical industry. We highlight previous scientific research on the application of blockchain to electronic health record systems (EHRs). Finally, we discuss the open research problems that limit the use of blockchain in the medical field.

Arab Journals Platform

The SBC-Tree: An Index for Run-Length Compressed Sequences

Author: Aref Walid G.
Eltabakh Mohamed Y.
Hon Wing-Kai
Shah Rahul
Vitter Jeffrey S.
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2005

Run-Length-Encoding (RLE) is a data compression technique that is used in various applications, e.g., biological sequence databases, multimedia, and facsimile transmission. One of the main challenges is how to operate, e.g., indexing, searching, and retrieval, on the compressed data without decompressing it. In this paper, we present the String B-tree for Compressed sequences, termed the SBC-tree, for indexing and searching RLE-compressed sequences of arbitrary length. The SBC-tree is a two-level index structure based on the well-known String B-tree and a 3-sided range query structure. The SBC-tree supports substring as well as prefix matching, and range search operations over RLE-compressed sequences. The SBC-tree has an optimal external-memory space complexity of O(N/B) pages, where N is the total length of the compressed sequences, and B is the disk page size. The insertion and deletion of all suffixes of a compressed sequence of length m take O(m log_B(N + m)) I/O operations. Substring matching, prefix matching, and range search execute in an optimal O(log_B N + |p| + T/B) I/O operations, where |p| is the length of the compressed query pattern and T is the query output size. We also present two variants of the SBC-tree: the SBC-tree that is based on an R-tree instead of the 3-sided structure, and the one-level SBC-tree that does not use a two-dimensional index. These variants do not have provable worst-case theoretical bounds for search operations, but perform well in practice. The SBC-tree index is realized inside PostgreSQL in the context of a biological protein database application. Performance results illustrate that using the SBC-tree to index RLE-compressed sequences achieves up to an order of magnitude reduction in storage, up to 30% reduction in I/Os for the insertion operations, and retains the optimal search performance achieved by the String B-tree over the uncompressed sequences.
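
As a minimal illustration of the run-length encoding that the abstract above builds on, the following Python sketch compresses a sequence into (symbol, run length) pairs and restores it. It is not the SBC-tree and does not index or search compressed data; the function names and the sample sequence are assumptions chosen for this example only.

```python
from itertools import groupby


def rle_encode(sequence):
    """Collapse each run of identical symbols into a (symbol, run_length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(sequence)]


def rle_decode(runs):
    """Expand (symbol, run_length) pairs back into the original string."""
    return "".join(symbol * length for symbol, length in runs)


# Hypothetical sample sequence with long runs, as is common in some biological data.
sample = "AAAAGGGTTTTTTC"
runs = rle_encode(sample)
print(runs)                        # four runs instead of fourteen symbols
print(rle_decode(runs) == sample)  # the encoding is lossless
```

The number of runs, rather than the number of original symbols, is what the abstract's N and m measure, which is where the storage reduction comes from when sequences contain long runs.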

CiteSeerX

Purdue E-Pubs

A database server for next-generation scientific data management

Author: Aref Walid G.
Elmagarmid Ahmed
Eltabakh Mohamed
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2010

The growth of scientific information and the increasing automation of data collection have made databases integral to many scientific disciplines including life sciences, physics, meteorology, earth and atmospheric sciences, and chemistry. These sciences pose new data management challenges to current database system technologies. This dissertation addresses the following three challenges: (1) Annotation Management: Annotations and provenance information are important metadata that go hand-in-hand with scientific data. Annotating scientific data represents a vital mechanism for scientists to share knowledge and build an interactive and collaborative environment. A major challenge is: How to manage large volumes of annotations, especially at various granularities, e.g., cell, column, and row level annotations, along with their corresponding data items. (2) Complex Dependencies Involving Real-world Activities: The processing of scientific data is a complex cycle that may involve sequences of activities external to the database system, e.g., wet-lab experiments, instrument readings, and manual measurements. These external activities may incur inherently long delays to prepare for and to conduct. Updating a database value may render parts of the database inconsistent until some external activity is executed and its output is reflected back and updated into the database. The challenge is: How to integrate these external activities within the database engine and accommodate the long delays between the updates while making the intermediate results instantly available for querying. (3) Fast Access to Scientific Data with Complex Data Types: Scientific experiments produce large volumes of data of complex types, e.g., arrays, images, long sequences, and multi-dimensional data. A major challenge is: How to provide fast access to these large pools of scientific data with non-traditional data types. 
In this dissertation, I present extensions to current database engines to address the above challenges. The proposed extensions enable scientific data to be stored and processed within their natural habitat: the database system. Experimental studies and performance analysis for all the proposed algorithms are carried out using both real-world and synthetic datasets. Our results show the applicability of the proposed extensions and their performance gains over other existing techniques and algorithms.
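
The second challenge above (an update leaving derived values inconsistent until an external activity completes) can be pictured with a small Python sketch. This is a toy model under stated assumptions, not the dissertation's design: the class `DependencyTracker`, its methods, and the item names are all hypothetical.

```python
from collections import defaultdict


class DependencyTracker:
    """Toy tracker: marks derived items as stale when a source value changes."""

    def __init__(self):
        self.dependents = defaultdict(set)  # source item -> items derived from it
        self.stale = set()                  # items awaiting an external activity

    def add_dependency(self, source, derived):
        self.dependents[source].add(derived)

    def update(self, item):
        """Record an update to `item`; everything derived from it becomes stale."""
        pending = list(self.dependents[item])
        while pending:
            derived = pending.pop()
            if derived not in self.stale:
                self.stale.add(derived)
                pending.extend(self.dependents[derived])

    def record_external_result(self, item):
        """An external activity (e.g., a wet-lab experiment) refreshed `item`."""
        self.stale.discard(item)

    def is_consistent(self, item):
        return item not in self.stale


tracker = DependencyTracker()
tracker.add_dependency("gene_sequence", "protein_sequence")
tracker.add_dependency("protein_sequence", "protein_function")
tracker.update("gene_sequence")
print(sorted(tracker.stale))
```

In this sketch, stale items stay queryable; a query layer could simply attach the `is_consistent` flag to each returned value, which mirrors the abstract's goal of keeping intermediate results available during long external delays.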

Purdue E-Pubs

Robust Security Mechanisms for Data Streams Systems

Author: Ali Mohamed
ElTabakh Mohamed
Nita-Rotaru Cristina
Publication venue: 'Purdue University (bepress)'
Publication date: 01/06/2004

Purdue E-Pubs

Hippocratic Data Streams-Concepts, Architectures and Issues

Author: Ali M. H.
Bertino Elisa
ElTabakh M. Y.
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2005

The goal of a Hippocratic DBMS is to preserve the privacy of data without sacrificing much performance. In this paper, we address the problem of developing privacy-preserving systems in the more challenging context of data streams. We contrast data streams to traditional databases and identify the new challenges posed by data streams. We discuss examples of streaming applications in which privacy and security issues are crucial. We propose a visionary architectural design of how a Hippocratic Data Stream Management System (HDSMS) may look. We identify several open research directions where privacy-preserving issues meet the data streaming paradigm.

CiteSeerX

Purdue E-Pubs

To Trie or Not to Trie? Realizing Space-partitioning Trees inside PostgreSQL: Challenges, Experiences and Performance

Author: Aref Walid G.
Eltabakh Mohamed Y.
Eltarras Ramy
Publication venue: 'Purdue University (bepress)'
Publication date: 01/04/2005

Purdue E-Pubs

The SBC-tree: An index for run-length compressed sequences

Author: Eltabakh Mohamed Y.
Publication venue: Association for Computing Machinery

Run-Length-Encoding (RLE) is a data compression technique that is used in various applications, e.g., time series, biological sequences, and multimedia databases. One of the main challenges is how to operate on (e.g., index, search, and retrieve) compressed data without decompressing it. In this paper, we introduce the String B-tree for Compressed sequences, termed the SBC-tree, for indexing and searching RLE-compressed sequences of arbitrary length. The SBC-tree is a two-level index structure based on the well-known String B-tree and a 3-sided range query structure [7]. The SBC-tree supports pattern matching queries such as substring matching, prefix matching, and range search operations over RLE-compressed sequences. The SBC-tree has an optimal external-memory space complexity of O(N/B) pages, where N is the total length of the compressed sequences, and B is the disk page size. Substring matching, prefix matching, and range search execute in an optimal O(log_B N + |p| + T/B) I/O operations, where |p| is the length of the compressed query pattern and T is the query output size. The SBC-tree is also dynamic and supports insert and delete operations efficiently. The insertion and deletion of all suffixes of a compressed sequence of length m take O(m log_B(N + m)) amortized I/O operations. The SBC-tree index is realized inside PostgreSQL. Performance results illustrate that using the SBC-tree to index RLE-compressed sequences achieves up to an order of magnitude reduction in storage, while retaining the optimal search performance achieved by the String B-tree over the uncompressed sequences.

Department: Computer Science and Information Engineering (資訊工程學)

National Tsing Hua University Institutional Repository

A database server for next-generation scientific data management

Author: Eltabakh Mohamed Y
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2010

Purdue E-Pubs

Supporting Annotated Relations

Author: Aref Walid G.
Elmagarmid Ahmed K.
Eltabakh M. Y.
Laura-Silva Y.
Ouzzani M.
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2007

Annotations and provenance data play a key role in understanding and curating scientific databases. However, current database management systems lack adequate support for managing annotations and provenance data including: (1) handling annotations at multiple granularities, i.e., at the table, tuple, column and cell levels, (2) propagating annotations along with query answers, (3) querying data based on their annotations, and (4) providing declarative ways to add, archive, and restore annotations. In this paper, we propose to treat multi-granular annotations and provenance as first-class objects inside the database. We introduce the concept of "Annotated Relations" along with new operators and extended semantics for the standard relational operators in support of annotated relations. We present an expressive and declarative extension to SQL to support the processing and querying of annotated tables. We study several schemes for storing and indexing annotations based on annotation granularity and annotation size. Extensions to PostgreSQL are introduced to support annotated relations and implementation challenges are discussed. Performance analysis illustrates the potential of annotated relations as they achieve up to an order-of-magnitude reduction in storage and I/O costs.
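
To make the idea of multi-granular annotations and their propagation with query answers more concrete, here is a small Python sketch. It is only an in-memory illustration under assumptions made for this example: the class `AnnotatedTable`, its methods, and the sample data are hypothetical, and it does not reflect the paper's SQL extension or its PostgreSQL storage schemes.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatedTable:
    """Toy relation whose annotations may cover a table, tuple, column, or cell."""
    columns: list
    rows: list
    # Each annotation is (text, row_index or None, column_name or None);
    # None means "applies to every row" or "applies to every column".
    annotations: list = field(default_factory=list)

    def annotate(self, text, row=None, column=None):
        self.annotations.append((text, row, column))

    def annotations_for(self, row, column):
        """Collect every annotation whose granularity covers the given cell."""
        return [
            text
            for text, ann_row, ann_col in self.annotations
            if ann_row in (None, row) and ann_col in (None, column)
        ]

    def select_cell(self, row, column):
        """Return a cell value together with the annotations propagated to it."""
        value = self.rows[row][self.columns.index(column)]
        return value, self.annotations_for(row, column)


genes = AnnotatedTable(
    columns=["gene", "sequence"],
    rows=[("g1", "ATG"), ("g2", "TTA")],
)
genes.annotate("curated by lab A")                           # table level
genes.annotate("sequences from source X", column="sequence")  # column level
genes.annotate("row verified", row=0)                        # tuple level
genes.annotate("possible error", row=1, column="sequence")   # cell level
print(genes.select_cell(1, "sequence"))
```

Storing one record per annotation, rather than one copy per covered cell, is the design choice this sketch highlights; the abstract's storage schemes address the same trade-off at database scale.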

CiteSeerX

Purdue E-Pubs