
    Detecting fish aggregations from reef habitats mapped with high resolution side scan sonar imagery

    As part of a multibeam and side scan sonar (SSS) benthic survey of the Marine Conservation District (MCD) south of St. Thomas, USVI and the seasonal closed areas in St. Croix—Lang Bank (LB) for red hind (Epinephelus guttatus) and the Mutton Snapper (MS) (Lutjanus analis) area—we extracted signals from water column targets that represent individual and aggregated fish over the various benthic habitats encountered in the SSS imagery. The survey covered a total of 18 km² throughout the federal jurisdiction fishery management areas. The complementary set of 28 habitat classification digital maps covered a total of 5,462.3 ha; MCDW (West) accounted for 45% of that area, MCDE (East) 26%, LB 17%, and MS the remaining 13%. With the exception of MS, corals and gorgonians on consolidated habitats were significantly more abundant than submerged aquatic vegetation (SAV) on unconsolidated sediments or unconsolidated sediments alone. Continuous coral habitat was the most abundant consolidated habitat for both MCDW and MCDE (41% and 43%, respectively). Consolidated habitats in LB and MS predominantly consisted of gorgonian plain habitat (95% and 83%, respectively). Coral limestone habitat was more abundant than coral patch habitat; it was found near the shelf break in MS, MCDW, and MCDE. Coral limestone and coral patch habitats covered only a minimal portion of LB. The high spatial resolution (0.15 m) of the acquired imagery allowed the detection of differing fish aggregation (FA) types. The largest FA densities were located at MCDW and MCDE over coral communities that occupy up to 70% of the bottom cover. Counts of unidentified swimming objects (USOs), likely representing individual fish, were similar among locations and occurred primarily over sand and shelf edge areas. FA school sizes were significantly smaller at MS than at the other three locations (MCDW, MCDE, and LB). This study shows the advantages of utilizing SSS in determining fish distributions and densities.
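    The reported percentages translate into absolute areas as follows (a small worked-arithmetic sketch; the 5,462.3 ha total and the per-zone shares are taken directly from the abstract):

    # Worked arithmetic from the reported habitat-map coverage.
    total_ha = 5462.3
    shares = {"MCDW": 0.45, "MCDE": 0.26, "LB": 0.17, "MS": 0.13}
    for zone, share in shares.items():
        print(f"{zone}: {share * total_ha:,.0f} ha")
    # MCDW: 2,458 ha | MCDE: 1,420 ha | LB: 929 ha | MS: 710 ha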

    Biochemical network matching and composition

    This paper looks at biochemical network matching and composition.

    Deep Space Network information system architecture study

    The purpose of this article is to describe an architecture for the Deep Space Network (DSN) information system in the years 2000-2010 and to provide guidelines for its evolution during the 1990s. The study scope is defined to be from the front-end areas at the antennas to the end users (spacecraft teams, principal investigators, archival storage systems, and non-NASA partners). The architectural vision provides guidance for major DSN implementation efforts during the next decade. A strong motivation for the study is an expected dramatic improvement in information-systems technologies, such as the following: computer processing, automation technology (including knowledge-based systems), networking and data transport, software and hardware engineering, and human-interface technology. The proposed Ground Information System has the following major features: unified architecture from the front-end area to the end user; open-systems standards to achieve interoperability; DSN production of level 0 data; delivery of level 0 data from the Deep Space Communications Complex, if desired; dedicated telemetry processors for each receiver; security against unauthorized access and errors; and highly automated monitor and control.

    SAFIR: A Simple API for Financial Information Requests

    We describe a general structure for representing, in a regular and extensible way, all of the financial data available in a research laboratory (at present, the Adaptive Computer Systems Laboratory of the Université de Montréal). After a domain analysis, we specify the XML representation of the information and introduce a C++ interface that provides access to it through a powerful query mechanism. An appendix describes a methodology for recovering option strike prices from databases that contain only prices and ticker symbols; the methodology is robust in the presence of irregular strike prices (prices that do not correspond to the tickers). Keywords: financial database, XML, DTD, C++, option strike price discovery, irregular strike prices.
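    Neither the paper's DTD nor its C++ query API is reproduced here; the short Python sketch below only illustrates the general pattern the abstract describes (financial records stored as XML and retrieved through programmatic queries), using a made-up element layout.

    # Hypothetical sketch: an invented XML layout for option quotes and a simple
    # query over it. Element and attribute names are illustrative only; they are
    # not the schema or the C++ interface described in the paper.
    import xml.etree.ElementTree as ET

    SAMPLE = """
    <quotes date="2001-03-15">
      <option ticker="IBM DF" strike="130.0" type="call"><price bid="4.10" ask="4.40"/></option>
      <option ticker="IBM DG" strike="135.0" type="call"><price bid="2.05" ask="2.30"/></option>
    </quotes>
    """

    def options_below_strike(xml_text, max_strike):
        """Return (ticker, strike) pairs for options with strike <= max_strike."""
        root = ET.fromstring(xml_text)
        return [(o.get("ticker"), float(o.get("strike")))
                for o in root.findall("option")
                if float(o.get("strike")) <= max_strike]

    print(options_below_strike(SAMPLE, 132.0))   # [('IBM DF', 130.0)]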

    High-Performance Meta-Genomic Gene Identification

    Computational Genomics, or Computational Genetics, refers to the use of computational and statistical analysis for understanding the structure and the function of genetic material in organisms. The primary focus of research in computational genomics in the past three decades has been the understanding of genomes and their functional elements by analyzing biological sequence data. The high demand for low-cost sequencing has driven the development of high-throughput sequencing technologies, next-generation sequencing (NGS), that parallelize the sequencing process, producing thousands or millions of sequences concurrently. Moore's Law is the observation that the number of transistors on integrated circuits doubles approximately every two years; correspondingly, the cost per transistor halves. The cost of DNA sequencing declines much faster, which implies that ever more DNA data will be produced. This large-scale sequence data, produced with high-throughput sequencing technologies, needs to be processed in a time-effective and cost-effective manner. In this dissertation, we present a high-performance meta-genome gene identification framework. The framework includes four modules: filter, alignment, error correction, and gene identification. The following chapters describe the proposed design and evaluation of this pipeline. The most computationally expensive kernel in the framework is the alignment procedure, so the filter module is developed to identify and skip unnecessary alignment operations. Without the filter module, the alignment module requires 1.9 hours to complete an all-to-all alignment of a test file of 512,000 sequences with an average sequence length of 750 base pairs, using ten NVIDIA Kepler K20 GPUs. When combined with the filter kernel, the total time drops to 11.3 minutes. Note that the ideal speedup is nearly 91.4 times when the new alignment kernel is run on ten GPUs (10 × 9.14). We conclude that accuracy can be achieved at the expense of more resources while operating frequency can still be maintained.
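    The filter-before-alignment idea can be pictured with a small sketch (a generic k-mer prefilter written for illustration, not the dissertation's GPU kernels): read pairs that share no k-mer are skipped, and only the remaining candidates go on to full alignment.

    from itertools import combinations

    def kmers(seq, k):
        # All substrings of length k; shared k-mers suggest a possible overlap.
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    def candidate_pairs(reads, k=12):
        """Yield index pairs of reads that share at least one k-mer."""
        profiles = [kmers(r, k) for r in reads]
        for i, j in combinations(range(len(reads)), 2):
            if profiles[i] & profiles[j]:    # only these pairs need full alignment
                yield i, j

    reads = ["ACGTACGTACGTTTGACA", "TTGACAGGCATCGATCGA", "GGGGGCCCCCAAAAATTT"]
    print(list(candidate_pairs(reads, k=6)))   # [(0, 1)]

    In practice an inverted k-mer index would replace the pairwise set intersections, since comparing all profile pairs is itself quadratic in the number of reads.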

    Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods

    Background: The prediction of human gene–abnormal phenotype associations is a fundamental step toward the discovery of novel genes associated with human disorders, especially when no genes are known to be associated with a specific disease. In this context the Human Phenotype Ontology (HPO) provides a standard categorization of the abnormalities associated with human diseases. While the problem of the prediction of gene–disease associations has been widely investigated, the related problem of gene–phenotypic feature (i.e., HPO term) associations has been largely overlooked, even though for most human genes no HPO term associations are known and despite the increasing application of the HPO to relevant medical problems. Moreover, most of the methods proposed in the literature are not able to capture the hierarchical relationships between HPO terms, resulting in inconsistent and relatively inaccurate predictions.
    Results: We present two hierarchical ensemble methods that we formally prove to provide biologically consistent predictions according to the hierarchical structure of the HPO. The modular structure of the proposed methods, which consists of a “flat” learning first step and a hierarchical combination of the predictions in the second step, allows the predictions of virtually any flat learning method to be enhanced. The experimental results show that hierarchical ensemble methods are able to predict novel associations between genes and abnormal phenotypes with results that are competitive with state-of-the-art algorithms and with a significant reduction of the computational complexity.
    Conclusions: Hierarchical ensembles are efficient computational methods that guarantee biologically meaningful predictions that obey the true path rule, and can be used as a tool to improve and make consistent the HPO term predictions produced by virtually any flat learning method. The implementation of the proposed methods is available as an R package from the CRAN repository.
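    The hierarchical combination step can be illustrated with a minimal sketch (a generic true-path-rule correction, not the specific ensemble algorithms of the paper): flat per-term scores are made consistent by forcing every ancestor term's score to be at least the largest score among its descendants.

    from collections import defaultdict

    def true_path_consistent(scores, parents):
        """scores: term -> flat score; parents: term -> list of parent terms."""
        children = defaultdict(list)
        for term, ps in parents.items():
            for p in ps:
                children[p].append(term)

        consistent = dict(scores)

        def lift(term):
            # Bottom-up pass: a term's score becomes the max over its subtree.
            for child in children[term]:
                consistent[term] = max(consistent[term], lift(child))
            return consistent[term]

        for root in (t for t in scores if not parents.get(t)):
            lift(root)
        return consistent

    # Toy ontology: HP:B is-a HP:A is-a HP:root (term IDs are made up).
    parents = {"HP:root": [], "HP:A": ["HP:root"], "HP:B": ["HP:A"]}
    scores = {"HP:root": 0.2, "HP:A": 0.3, "HP:B": 0.9}
    print(true_path_consistent(scores, parents))
    # {'HP:root': 0.9, 'HP:A': 0.9, 'HP:B': 0.9}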

    Prospects and limitations of full-text index structures in genome analysis

    The combination of incessant advances in sequencing technology producing large amounts of data and innovative bioinformatics approaches, designed to cope with this data flood, has led to new interesting results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article provides a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, the trade-offs they impose, but also their practical limitations, are explained and compared.
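    As a concrete (and deliberately naive) illustration of what such an index structure does, the sketch below builds a plain suffix array by sorting suffixes and answers exact-match queries with binary search; genome-scale tools use far more space- and time-efficient variants (enhanced suffix arrays, FM-indexes, and so on).

    from bisect import bisect_left

    def build_suffix_array(text):
        # Naive O(n^2 log n) construction; adequate for a toy example only.
        return sorted(range(len(text)), key=lambda i: text[i:])

    def find_occurrences(text, sa, pattern):
        """Return sorted start positions of pattern in text using the suffix array."""
        suffixes = [text[i:] for i in sa]          # materialized only for clarity
        lo = bisect_left(suffixes, pattern)
        hi = lo
        while hi < len(suffixes) and suffixes[hi].startswith(pattern):
            hi += 1
        return sorted(sa[lo:hi])

    text = "ACGTACGTGA"
    sa = build_suffix_array(text)
    print(find_occurrences(text, sa, "ACGT"))      # [0, 4]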

    ExplainIt! -- A declarative root-cause analysis engine for time series data (extended version)

    We present ExplainIt!, a declarative, unsupervised root-cause analysis engine that uses time series monitoring data from large complex systems such as data centres. ExplainIt! empowers operators to succinctly specify a large number of causal hypotheses to search for the causes of interesting events. ExplainIt! then ranks these hypotheses, reducing the number of causal dependencies from hundreds of thousands to a handful for human understanding. We show how a declarative language such as SQL can be effective in enumerating hypotheses that probe the structure of an unknown probabilistic graphical causal model of the underlying system. Our thesis is that databases are in a unique position to enable users to rapidly explore the possible causal mechanisms in data collected from diverse sources. We empirically demonstrate how ExplainIt! has helped us resolve over 30 performance issues in a commercial product since late 2014, and we discuss a few of these cases in detail.
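    The ranking idea (though not ExplainIt!'s actual algorithm or its SQL interface) can be sketched in a few lines: score each hypothesised cause metric by how strongly it co-varies with the symptom metric over the incident window, then sort.

    import math

    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
        sy = math.sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy) if sx and sy else 0.0

    def rank_hypotheses(symptom, candidates):
        """candidates: dict of metric name -> time series aligned with symptom."""
        scored = [(abs(pearson(series, symptom)), name)
                  for name, series in candidates.items()]
        return [name for _, name in sorted(scored, reverse=True)]

    symptom = [1, 2, 8, 9, 9, 2]                     # e.g. request latency
    candidates = {
        "disk_io_wait": [1, 2, 7, 9, 8, 1],          # tracks the symptom closely
        "cpu_user":     [5, 5, 5, 5, 5, 5],          # flat, uninformative
    }
    print(rank_hypotheses(symptom, candidates))      # ['disk_io_wait', 'cpu_user']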