33 research outputs found

    A framework for integrating DNA sequenced data

    The Human Genome Project generated vast amounts of DNA sequenced data scattered across disparate data sources in a variety of formats. Integrating biological data and extracting the information held in DNA sequences are major ongoing tasks for biologists and software professionals. This thesis explored issues of finding, extracting, merging and synthesizing information from multiple disparate data sources containing DNA sequenced data, which is composed of 3 billion chemical building blocks, or bases. We proposed a biological data integration framework based on typical usage patterns to simplify these issues for biologists. The framework uses a relational database management system at the backend and provides techniques to extract, store, and manage the data. This framework was implemented, evaluated, and compared with existing biological data integration solutions.

    Beyond oracles – a critical look at real-world blockchains

    This thesis intends to provide answers to the following questions: 1) What is the oracle problem, and how do the limitations of oracles affect different real-world applications? 2) What are the characteristics of the portion of the literature that leaves the oracle problem unaddressed? 3) Who are the main contributors to solving the oracle problem, and which issues are they focusing on? 4) How can the oracle problem be overcome in real-world applications? The first chapter aims to answer the first question through a literature review of the most current papers published in the field, bringing clarity to the blockchain oracle problem by discussing its effects in some of the most promising real-world blockchain applications. Thus, the chapter investigates the sectors of Intellectual Property Rights (IPRs), healthcare, supply chains, academic records, resource management, and law. By comparing the different applications, the review reveals that heterogeneous issues arise depending on the sector. The analysis supports the view that the more trusted a system is, the less the oracle problem has an impact. The second chapter presents the results of a systematic review intended to highlight the state-of-the-art of real-world blockchain applications using the oracle problem as a lens of analysis. Academic papers proposing real-world blockchain applications were reviewed to see if the authors considered the oracle’s role in the applications and related issues. The results found that almost 90% of the inspected literature neglected the role of oracles, thereby proposing incomplete or irreproducible projects. Through a bibliometric analysis, the third chapter sheds light on the institutions and authors that are actively contributing to the literature on oracles and promoting progress and cooperation. The study shows that, although there is still a lack of collaboration worldwide, there are dedicated authors and institutions working toward a similar and beneficial cause. 
The results also make it clear that most areas of oracle research are poorly addressed, with some remaining untouched. The fourth and last chapter focuses on a case study of a dairy company operating in the northeast region of Italy. The company applied blockchain technology to support the traceability of its products worldwide, and the study investigated the benefits of this innovation from the point of view of sustainability. The study also considers the role of oracle management, as it is a critical aspect of a blockchain-based project. Thus, the relationship between the company, the blockchain oracle, and the supervising authority is discussed, offering insight into how sustainable innovations can positively impact supply chain management. This work as a whole aims to shed light on blockchain oracles as an academic area of research, explaining why the study of oracles should be considered the backbone of blockchain literature development.

    Workflow Management Systems and ERP Systems: Differences, Commonalities, and Applications

    Two important classes of information systems, Workflow Management Systems (WfMSs) and Enterprise Resource Planning (ERP) systems, have been used to support e-business process redesign, integration, and management. While both technologies can help with business process automation, data transfer, and information sharing, the technological approach and features of the solutions provided by WfMSs and ERP systems are different. Currently, there is a lack of understanding of these two classes of information systems in industry and academia, which hinders their effective application. In this paper, we present a comprehensive comparison between these two classes of systems. We discuss how the two types of systems can be used independently or together to develop intra- and inter-organizational application solutions. In particular, we also explore the roles of WfMSs and ERP systems in the next generation of IT architecture based on web services. Our findings should help businesses make better decisions in the adoption of both WfMSs and ERP systems in their e-business strategies.

    Support for taxonomic data in systematics

    The Systematics community works to increase our understanding of biological diversity through identifying and classifying organisms and using phylogenies to understand the relationships between those organisms. It has made great progress in the building of phylogenies and in the development of algorithms. However, it has insufficient provision for preserving research outcomes and making them widely accessible and queryable, and this is where database technologies can help. This thesis makes a contribution in the area of database usability by addressing the query needs present in the community, as supported by the analysis of query logs. It clearly formulates the user requirements in the area of phylogeny and classification queries. It then reports on the use of warehousing techniques in the integration of data from many sources to satisfy those requirements. It shows how to perform query expansion with synonyms and vernacular names, and how to implement hierarchical query expansion effectively. A detailed analysis of the improvements offered by those query expansion techniques is presented. This is supported by the exposition of the database techniques underlying this development, and of the user and programming interfaces (web services) which make this novel development available to both end-users and programs.
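
To make the idea of query expansion concrete, here is a minimal sketch of expanding a taxon query with alternate names and with all descendant taxa. It is not the thesis's implementation: the in-memory dictionaries, the toy taxonomy, and the names `CHILDREN`, `ALTERNATE_NAMES`, `resolve`, and `expand_query` are all assumptions made for illustration, standing in for whatever the warehouse actually stores.

```python
# Toy classification (assumed data): parent taxon -> child taxa.
CHILDREN = {
    "Felidae": ["Panthera", "Felis"],
    "Panthera": ["Panthera leo", "Panthera tigris"],
    "Felis": ["Felis catus"],
}

# Toy alternate names (assumed data): accepted name -> synonyms and vernacular names.
ALTERNATE_NAMES = {
    "Panthera leo": ["Felis leo", "lion"],
    "Panthera tigris": ["tiger"],
    "Felis catus": ["domestic cat"],
}

# Reverse index so a query on any alternate name finds the accepted name.
ACCEPTED_NAME = {
    alt.lower(): accepted
    for accepted, alts in ALTERNATE_NAMES.items()
    for alt in alts
}


def resolve(term: str) -> str:
    """Map a synonym or vernacular name to its accepted name, if known."""
    return ACCEPTED_NAME.get(term.lower(), term)


def expand_query(term: str) -> set[str]:
    """Return the term's accepted name, every descendant taxon,
    and all alternate names of each taxon visited."""
    expanded: set[str] = set()
    visited: set[str] = set()
    pending = [resolve(term)]
    while pending:
        taxon = pending.pop()
        if taxon in visited:
            continue
        visited.add(taxon)
        expanded.add(taxon)
        expanded.update(ALTERNATE_NAMES.get(taxon, []))   # synonym expansion
        pending.extend(CHILDREN.get(taxon, []))           # hierarchical expansion
    return expanded
```

Under these assumptions, `expand_query("lion")` resolves the vernacular name first and then searches on the accepted name and its synonyms, while `expand_query("Panthera")` also pulls in every species below the genus. A database-backed version would replace the dictionaries with lookups against the stored classification.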

    From Reality Keys to Oraclize. A Deep Dive into the History of Bitcoin Oracles

    Before the advent of alternative blockchains such as Ethereum, the future of decentralization was all in the hands of Bitcoin. Together with Nakamoto, early developers were trying to leverage Bitcoin's potential to decentralize traditionally centralized applications. However, since Bitcoin is a decentralized machine, the available non-trustless oracles were considered unsuitable. Therefore, strategies had to be elaborated to solve the so-called oracle problem in the newborn scenario. By interviewing early developers and crawling early forums and repositories, this paper aims to retrace and reconstruct the chain of events and contributions that gave birth to oracles on Bitcoin. The evolution of early trust models and approaches to solving the oracle problem is also outlined. After an analysis of the technical and social barriers to building oracles on Bitcoin, the transition to Ethereum is also discussed. Comment: Literature background and methodology are deliberately omitted at this stage (preprint). To improve readability for a broader audience, the content is presented more like a stor

    Mining very long sequences with PLWAPLong algorithms

    Sequential pattern mining is the process of finding inter-transaction frequent sequential patterns from a sequential database, where records consist of ordered sets of events (or items), by applying data mining techniques on such sequential databases. Discovering sequential patterns in web server logs is an example application of sequential mining, which is useful for predicting visiting patterns of web users for such purposes as targeted advertisements. The Position Coded Pre-order Linked Web Access Pattern (PLWAP) mining algorithm is one of the existing efficient web sequential pattern mining algorithms, which stores the frequently stored sequences of the entire sequential database in a compressed tree form with position coded nodes. However, for very long sequences exceeding thirty-two nodes, the number of bits an integer position code can hold, the PLWAP algorithm's performance begins to degrade because it employs linked lists to store conjunctions of long position codes, and the linked list traversals slow down the algorithm during both tree construction and mining. The PLWAP algorithm also uses each and every node in the frequent 1-item event queue to test for that event's inclusion in the suffix tree root set during mining. This is a very expensive operation since, except for one node, all other nodes that are its ancestors and descendants are not included in the root set. This thesis proposes two new algorithms, PLWAPLong1 and PLWAPLong2. Both of these new algorithms use a new position code numbering scheme where each node is assigned two numeric variables (startPosition, endPosition) instead of one. Using this scheme we can determine whether one node is an ancestor of another in O(1) time by comparing the startPosition and endPosition of the two nodes. The PLWAPLong1 algorithm also proposes transforming the linked-list-based tree to an equivalent array representation and using binary search to find the immediate descendant in a suffix tree. 
PLWAPLong2 uses the existing linked-list-based tree. Both the PLWAPLong1 and PLWAPLong2 algorithms introduce a new technique called Last Descendant to eliminate the unwanted nodes from the ancestor/descendant test when creating the suffix tree root set. Keywords: Data mining, Web Mining, Association Rule Mining, Long Sequences, PLWAP Minin
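
The (startPosition, endPosition) test described above can be illustrated with a short sketch. The abstract does not say how the two numbers are assigned, so this example assumes a common interval-labelling scheme: `start_position` is a node's pre-order visit number and `end_position` is the largest pre-order number in its subtree. The `Node` class and function names are hypothetical, not taken from the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list["Node"] = field(default_factory=list)
    start_position: int = 0
    end_position: int = 0


def assign_positions(root: Node) -> None:
    """Label every node with (start_position, end_position).

    Assumed scheme: start is the pre-order number of the node, end is the
    largest pre-order number inside its subtree. An explicit stack is used
    so that very deep trees do not hit Python's recursion limit.
    """
    counter = 0
    stack = [(root, False)]
    while stack:
        node, children_done = stack.pop()
        if children_done:
            node.end_position = counter  # last number handed out in this subtree
            continue
        counter += 1
        node.start_position = counter
        stack.append((node, True))
        for child in reversed(node.children):
            stack.append((child, False))


def is_ancestor(a: Node, b: Node) -> bool:
    """Constant-time test: a is a proper ancestor of b exactly when
    b's interval lies strictly inside a's interval."""
    return a.start_position < b.start_position and b.end_position <= a.end_position


# Small example tree: root -> (a -> c), b
c = Node("c")
a = Node("a", [c])
b = Node("b")
root = Node("root", [a, b])
assign_positions(root)
```

With this labelling, `is_ancestor(a, c)` holds and `is_ancestor(a, b)` does not, and each check costs two integer comparisons regardless of tree depth, which is the property the abstract attributes to the two-variable position code.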

    Bioinformatics Analysis and Annotation of Microtubule Binding and Associated Proteins (MAPs) - Creating a Database of MAPs

    A Thesis Submitted to the Faculty of the School of Informatics, Indiana University, Indianapolis, by Narmada Shenoy, in Partial Fulfillment of the Requirements for the Degree of Master of Science, August 2005. Microtubules have many roles in the cytoskeletal infrastructure. This infrastructure underlies vital processes of cellular life such as motility, division, morphology, and intracellular organization and transport. These different roles are carried out by the creation of different microtubule (MT) systems (such as basal bodies, centrioles, flagella, kinetochores, and mitotic spindles). The changing roles require the cytoskeleton to be both dynamic and static in nature. Guiding these processes is a network of proteins that direct cellular behavior through their ability to bind microtubules (MTs) in a spatial- and temporal-specific manner. The identification and characterization of the suite of microtubule binding and associated proteins (MAPs) involved in MT systems is important for the understanding of the biological form and function of each MT system. This research involved the analysis and annotation of four MAPs – Ensconsin in humans, Hook (homolog 3) in humans, Protein Regulator of Cytokinesis 1 (PRC1) in humans and Anaphase Spindle Elongation protein (ASE1) in yeast. A bioinformatics approach was used for the annotation and analysis. A protocol for analysis and annotation of MAPs was developed. During the process, some limitations in using bioinformatics tools and procedures were encountered. These limitations were overcome, the initial protocol was improved, and a modified protocol of analysis was developed. A database was designed and built to hold annotated information on the MAPs. We seek to disseminate this database and its functionalities as a web resource to the scientific community. It will provide an excellent forum for researchers to obtain relevant information on MT binding and associated proteins (MAPs). 
Infection by parasitic protozoa causes incalculable morbidity and mortality to humans and agricultural animals. In this research, we have also focused on MAPs in parasitic organisms of the Apicomplexan and Trypanosomatid genera. The protocol for analysis incorporates steps to analyze MAPs from these organisms as well. Malaria (a potentially life-threatening disease) is caused by Plasmodium, an Apicomplexan parasite. This parasite is transmitted to people by the female Anopheles mosquito, which feeds on human blood. African Sleeping Sickness is an acute disease caused by Trypanosoma brucei that typically leads to death within weeks or months if not treated. Microtubule-associated proteins (MAPs) and their alteration of the unique microtubule (MT) systems play major roles in these organisms throughout their life cycle and are required for their pathogenic mechanisms. Each parasite contains unique MT systems that will test our annotation process as well as prepare the DB for the addition of other novel MT systems, such as those contained within plants. Additionally, these single-cell organisms have a multistage life cycle that provides annotation challenges similar to those encountered when one considers multi-cellular organisms. Therefore, a researcher working on any MT system within the database will find useful information regardless of the organism that they are studying. This will leave us with a sub-set of MAPs from parasitic organisms in our database that are potential drug targets.

    Multimodal Content Delivery for Geo-services

    This thesis describes a body of work carried out over several research projects in the area of multimodal interaction for location-based services. Research in this area has progressed from using simulated mobile environments to demonstrate the visual modality, to the ubiquitous delivery of rich media using multimodal interfaces (geo-services). To effectively deliver these services, research focused on innovative solutions to real-world problems in a number of disciplines including geo-location, mobile spatial interaction, location-based services, rich media interfaces and auditory user interfaces. My original contributions to knowledge are made in the areas of multimodal interaction underpinned by advances in geo-location technology and supported by the proliferation of mobile device technology into modern life. Accurate positioning is a known problem for location-based services; contributions in the area of mobile positioning demonstrate a hybrid positioning technology for mobile devices that uses terrestrial beacons to trilaterate position. Information overload is an active concern for location-based applications that struggle to manage large amounts of data; contributions in the area of egocentric visibility that filter data based on field-of-view demonstrate novel forms of multimodal input. One of the more pertinent characteristics of these applications is the delivery or output modality employed (auditory, visual or tactile). Further contributions in the area of multimodal content delivery are made, where multiple modalities are used to deliver information using graphical user interfaces, tactile interfaces and more notably auditory user interfaces. 
It is demonstrated how a combination of these interfaces can be used to synergistically deliver context-sensitive rich media to users, in a responsive way, based on usage scenarios that consider the affordance of the device, the geographical position and bearing of the device and also the location of the device.
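
As a rough illustration of what trilaterating a position from terrestrial beacons involves, the sketch below solves the textbook planar case: three beacons at known coordinates and an exact distance to each. It is not the hybrid positioning technology of the thesis, whose details the abstract does not give; the function name `trilaterate`, the flat 2D coordinate system, and the noise-free ranges are all assumptions.

```python
import math


def trilaterate(b1, b2, b3):
    """Estimate (x, y) from three beacons, each given as (x, y, distance).

    Subtracting the first circle equation from the other two removes the
    quadratic terms and leaves a 2x2 linear system, solved here with
    Cramer's rule. Raises ValueError if the beacons are collinear.
    """
    x1, y1, r1 = b1
    x2, y2, r2 = b2
    x3, y3, r3 = b3

    # Linear system:  a11*x + a12*y = c1,  a21*x + a22*y = c2
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    c1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    c2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a11 * a22 - a12 * a21
    if math.isclose(det, 0.0, abs_tol=1e-12):
        raise ValueError("beacons are collinear; position is not unique")

    x = (c1 * a22 - a12 * c2) / det
    y = (a11 * c2 - c1 * a21) / det
    return x, y


# Example: beacons at three corners, device actually located at (3, 4).
estimate = trilaterate(
    (0.0, 0.0, 5.0),
    (10.0, 0.0, math.sqrt(65.0)),
    (0.0, 10.0, math.sqrt(45.0)),
)
```

Real range measurements are noisy, so a practical system would typically use more than three beacons and a least-squares fit rather than this exact solution.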