17 research outputs found

    Motif-finding and Other Applications in Bioinformatics

    Preprocessing of microarray data and analysis and comparison techniques for the resulting graph structures

    During our collaborations with scientists interested in high-throughput analysis of biological data, we have made considerable progress and facilitated some interesting findings using our clique-finding tools. We have also identified ways to make these tools more efficient, but have yet to write the programs that implement them. Part of the problem is time: to be useful to us, an application must be both flexible and efficient, and programming such a tool is no small task, so we have resorted to scripted solutions geared to the specific task at hand. The first aim of this work is to produce a tool that both we and our collaborators can use to meet our common data processing needs. Our collaborations have also required new ways to find potentially interesting data within volumes of information that would be prohibitive to analyze by hand. One of our current tools takes a graph and returns all of its maximal cliques, which, on real data, can be done in a reasonable amount of time. However, the resulting list of maximal cliques is usually too long to analyze by hand. We have therefore needed new ways to sleuth out the genes and cliques that may be of most interest from a list of millions of cliques. The second aim of this work is to describe the new methods that we have been using to achieve this.
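
    As an illustration of what a tool that "takes a graph and returns all the maximal cliques" does, here is a minimal sketch using the textbook Bron-Kerbosch algorithm with pivoting. This is an assumption made for illustration only: the abstract does not say which algorithm the authors' tool implements, and the names `maximal_cliques`, `adj` and `toy_graph` are hypothetical.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph.

    adj maps each vertex to the set of its neighbours (no self-loops).
    Textbook Bron-Kerbosch with pivoting; not the authors' implementation.
    """
    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it,
        # x: vertices already explored (used to reject non-maximal cliques).
        if not p and not x:
            yield sorted(r)
            return
        # Pick the pivot with the most neighbours among the candidates,
        # so that as few branches as possible need to be explored.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Hypothetical toy graph: a triangle a-b-c with a pendant vertex d attached to c.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cliques = sorted(maximal_cliques(toy_graph))
print(cliques)
```

    On real correlation graphs the output can run to millions of cliques, which is why the abstract goes on to discuss ways of narrowing the list down.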

    A procedure for developing an acceptance test for airborne bathymetric lidar data application to NOAA charts in shallow waters

    National Oceanic and Atmospheric Administration (NOAA) hydrographic data are typically acquired using sonar systems, with a small percentage acquired via airborne lidar bathymetry for near‐shore areas. This study investigated an integrated approach for meeting NOAA’s hydrographic survey requirements for near‐shore areas of NOAA charts, using the existing topographic‐bathymetric lidar data from USACE’s National Coastal Mapping Program (NCMP). Because these existing NCMP bathymetric lidar datasets were not collected to NOAA hydrographic surveying standards, it is unclear whether, and under what circumstances, they might aid in meeting certain hydrographic surveying requirements. The NCMP’s bathymetric lidar data are evaluated through a comparison to NOAA’s Office of Coast Survey hydrographic data derived from acoustic surveys. As a result, it is possible to assess whether NCMP’s bathymetry can be used to fill the data gap shoreward of the navigable area limit line (0 to 4 meters) and whether there is potential for applying NCMP’s bathymetric lidar data to near‐shore areas deeper than 10 meters. Based on the study results, recommendations will be provided to NOAA on the site conditions where these data will provide the most benefit. Additionally, this analysis may allow the development of future operating procedures and workflows that use other topographic‐bathymetric lidar datasets to help update near‐shore areas of the NOAA charts.

    Extracting Gene Networks for Low-Dose Radiation Using Graph Theoretical Algorithms

    Genes with common functions often exhibit correlated expression levels, which can be used to identify sets of interacting genes from microarray data. Microarrays typically measure expression across genomic space, creating a massive matrix of co-expression that must be mined to extract only the most relevant gene interactions. We describe a graph theoretical approach to extracting co-expressed sets of genes, based on the computation of cliques. Unlike the results of traditional clustering algorithms, cliques are not disjoint and allow genes to be assigned to multiple sets of interacting partners, consistent with biological reality. A graph is created by thresholding the correlation matrix to include only the correlations most likely to signify functional relationships. Cliques computed from the graph correspond to sets of genes for which significant edges are present between all members of the set, representing potential members of common or interacting pathways. Clique membership can be used to infer the function of poorly annotated genes, based on the known functions of better-annotated genes with which they share clique membership (i.e., “guilt-by-association”). We illustrate our method by applying it to microarray data collected from the spleens of mice exposed to low-dose ionizing radiation. Differential analysis is used to identify sets of genes whose interactions are impacted by radiation exposure. The correlation graph is also queried independently of the cliques to extract edges that are impacted by radiation. We present several examples of multiple gene interactions that are altered by radiation exposure and thus represent potential molecular pathways that mediate the radiation response.
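
    The thresholding step described above can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' pipeline: it assumes Pearson correlation, keeps an edge when the absolute correlation reaches a fixed cutoff, and uses made-up expression values. The names `pearson`, `correlation_graph`, `expr`, the gene labels `g1` to `g4` and the cutoff 0.9 are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_graph(expr, threshold):
    """Build an undirected graph with one vertex per gene.

    Two genes are joined when |r| >= threshold (assumed rule; the source
    only says the correlation matrix is thresholded).
    """
    adj = {gene: set() for gene in expr}
    for g, h in combinations(expr, 2):
        if abs(pearson(expr[g], expr[h])) >= threshold:
            adj[g].add(h)
            adj[h].add(g)
    return adj


# Made-up expression profiles across four samples (illustrative values only).
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],   # rises with g1
    "g3": [4.0, 3.0, 2.0, 1.0],   # falls as g1 rises
    "g4": [1.0, 3.0, 1.0, 3.0],   # weakly related to the others
}
graph = correlation_graph(expr, threshold=0.9)
for gene in sorted(graph):
    print(gene, sorted(graph[gene]))
```

    In this toy data g1, g2 and g3 are mutually connected and so form a clique, while g4 stays isolated. A poorly annotated gene that lands in a clique with well-annotated genes would be a candidate for the guilt-by-association inference the abstract describes.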

    Innovative computational methods for transcriptomic data analysis

    Tools of molecular biology and the evolving tools of genomics can now be exploited to study the genetic regulatory mechanisms that control cellular responses to a wide variety of stimuli. These responses are highly complex and involve many genes and gene products. The main objectives of this paper are to describe a novel research program centered on understanding these responses by (i) developing powerful graph algorithms that exploit the innovative principles of fixed-parameter tractability in order to generate distilled gene sets; (ii) producing scalable, high-performance parallel and distributed implementations of these algorithms utilizing cutting-edge computing platforms and auxiliary resources; (iii) employing these implementations to identify gene sets suggestive of co-regulation; and (iv) performing sequence analysis and genomic data mining to examine, winnow and highlight the most promising gene sets for more detailed investigation. As a case study, we describe our work aimed at elucidating genetic regulatory mechanisms that control cellular responses to low-dose ionizing radiation (IR). A low-dose exposure, as defined here, is an exposure of at most 10 cGy (rads). While the consequences of high doses of radiation are well known, the net outcome of low-dose exposures continue

    Resulting Graph Structures

    examined the final electronic copy of this thesis for form and content and recommend that it b

    Genes Co-Expressed with Tulp4 in HSCs

    Gene expression data from HSCs [43] were used in WebQTL (webqtl.org) to identify genes most highly correlated with Tulp4. The majority of genes encode proteins involved in immune function (e.g., immunoglobulins).