121 research outputs found

    CAD Tools for DNA Micro-Array Design, Manufacture and Application

    Motivation: As the human genome project progresses and some microbial and eukaryotic genomes are recognized, numerous biotechnological processes have recently attracted an increasing number of biologists, bioengineers and computer scientists. Biotechnological processes heavily involve the production and analysis of high-throughput experimental data. Numerous sequence libraries of DNA and protein structures for a large number of micro-organisms, along with a variety of other databases related to biology and chemistry, are available. For example, microarray technology, a novel biotechnology, promises to monitor the whole genome at once, so that researchers can study the genome at the global level and get a better picture of the expression of millions of genes simultaneously. Today, it is widely used in many fields: disease diagnosis, gene classification, gene regulatory networks, and drug discovery. For example, designing an organism-specific microarray and analyzing the experimental data require combining heterogeneous computational tools that usually differ in data format, such as GeneMark for ORF extraction, Promide for DNA probe selection, Chip for probe placement on the microarray chip, BLAST for sequence comparison, MEGA for phylogenetic analysis, and ClustalX for multiple alignments. Solution: Surprisingly, despite the huge research effort invested in DNA array applications, very few works are devoted to computer-aided optimization of DNA array design and manufacturing. Current design practices are dominated by ad-hoc heuristics incorporated in proprietary tools with unknown suboptimality. This will soon become a bottleneck for the new generation of high-density arrays, such as the ones currently being designed at Perlegen [109]. The goal of the already accomplished research was to develop highly scalable tools, with predictable runtime and quality, for cost-effective, computer-aided design and manufacturing of DNA probe arrays. We illustrate the utility of our approach with a concrete example of combining the design tools of microarray technology for Herpes B virus DNA data

    A comprehensive comparison of metaheuristics for the repetition-free longest common subsequence problem

    This paper deals with an NP-hard string problem from the bio-informatics field: the repetition-free longest common subsequence problem. This problem has enjoyed an increasing interest in recent years, which has resulted in the application of several pure as well as hybrid metaheuristics. However, the literature lacks a comprehensive comparison between those approaches. Moreover, it has been shown that general purpose integer linear programming solvers are very efficient for solving many of the problem instances that were used so far in the literature. Therefore, in this work we extend the available benchmark set, adding larger instances to which integer linear programming solvers cannot be applied anymore. Moreover, we provide a comprehensive comparison of the approaches found in the literature. Based on the results we propose a hybrid between two of the best methods which turns out to inherit the complementary strengths of both methods. Peer Reviewed. Postprint (author's final draft)

    Algorithms for peptide and PTM identification using Tandem mass spectrometry

    Ph.D. DOCTOR OF PHILOSOPHY

    Fast ReRoute on Programmable Switches

    Highly dependable communication networks usually rely on some kind of Fast Re-Route (FRR) mechanism which allows traffic to be re-routed quickly upon failures, entirely in the data plane. This paper studies the design of FRR mechanisms for emerging reconfigurable switches. Our main contribution is an FRR primitive for programmable data planes, PURR, which provides low failover latency and high switch throughput by avoiding packet recirculation. PURR tolerates multiple concurrent failures and comes with minimal memory requirements, ensuring compact forwarding tables, by unveiling an intriguing connection to classic "string theory" (i.e., stringology), and in particular, the shortest common supersequence problem. PURR is well-suited for high-speed match-action forwarding architectures (e.g., PISA) and supports the implementation of a broad variety of FRR mechanisms. Our simulations and prototype implementation (on an FPGA and a Tofino switch) show that PURR improves TCAM memory occupancy by a factor of 1.5x-10.8x compared to a naïve encoding when implementing state-of-the-art FRR mechanisms. PURR also improves the latency and throughput of datacenter traffic by up to a factor of 2.8x-5.5x and 1.2x-2x, respectively, compared to approaches based on recirculating packets

    Tectonostratigraphic evolution of the Williston Basin

    In the Williston Basin, five regional seismic profiles covering ~3090 km were utilized for a comprehensive study of this complex geologic feature. 2300 km of field data were added to the existing 790 km profile. The novel seismic information, in conjunction with a sizeable number of wireline data and the incorporation of structural and isopach maps, provided a unique data environment for the development of a new, elaborate tectonostratigraphic model of this major continental depression. Standard reflection seismic processing procedures were implemented with special emphasis on regional perspectives, including "Earth curvature correction", to generate images of the basin fill. The latter helped to reveal the true nature of this large-scale cratonic basin. This novel information permitted new approaches in establishing the deformation styles in the Williston Basin. Structural studies of the newly reprocessed regional seismic profiles revealed the compressional nature of the radially arranged tectonic elements in the center of the basin, and the extensional character of the peripheral regions. The results suggest that axisymmetric deformation controlled the early stages of the Williston Basin area, and was the causal factor of the oval shape of the basin. In the first, "pre-Williston" phase, the region was uplifted by an axisymmetric lithospheric intrusion creating radial extensional signatures in the central zone and compressional structures in the surroundings. Erosion and thermal cooling and/or phase change of the mantle material led to the initiation of the basin subsidence. Consequently, in the "intracratonic phase" (Sauk - Absaroka), the pre-existing radial and circumferentially arranged structures were periodically reactivated in the opposite sense. The active periods were unrelated to global orogenic events of the continent. The exception is the Kaskaskia I (Devonian) interval, when the territory was tilted to the northwest and the axisymmetric cause of the subsidence was overprinted. The subsequent "foreland phase" (Zuni - Tejas) was dominated by lateral forces of the Sevier and Laramide orogenies. This plate-margin-related major tectonic development was associated with the NNW-SSE elliptical elongation of the basin and the related highly prevalent NE-SW/NW-SE faulting and fracturing. Additional consequences of this process were offsetting and rotation of the pre-existing radial and circumferential structural features. These radial and circumferential structural features of the Williston Basin may be recognizable in comparable cratonic environments (e.g., Michigan Basin, Paris Basin). Comprehensive seismic/sequence stratigraphy was developed throughout the basin. In the Sauk - Absaroka interval the sequence stratigraphic and the lithostratigraphic boundaries are generally identical. In the Zuni - Tejas interval, when clastic sedimentation was dominant, the two subdivisions are not identical. In these younger strata 16 sequence stratigraphic units were identified. More detailed subdivision of the interval containing the Eagle Sandstone revealed that two major sources of the terrigenous sediments are directly recognizable on the seismic profiles, beyond 500 km east of the shorelines

    DNA Chemical Reaction Network Design Synthesis and Compilation

    The advantages of biomolecular computing include 1) the ability to interface with, monitor, and intelligently protect and maintain the functionality of living systems, 2) the ability to create computational devices with minimal energy needs and hazardous waste production during manufacture and lifecycle, 3) the ability to store large amounts of information for extremely long time periods, and 4) the ability to create computation analogous to human brain function. To realize these advantages over electronics, biomolecular computing is at a watershed moment in its evolution. Computing with entire molecules presents different challenges and requirements than computing just with electric charge. These challenges have led to ad-hoc design and programming methods with high development costs and limited device performance. At present, building a device requires complete immersion in low-level detail. We address these shortcomings by creating a systems engineering process for building and programming DNA-based computing devices. Contributions of this thesis include numeric abstractions for nucleic acid sequence and secondary structure, and a set of algorithms which employ these abstractions. The abstractions and algorithms have been implemented in three artifacts: DNADL, a design description language; Pyxis, a molecular compiler and design toolset; and KCA, a simulation of DNA kinetics using a cellular automaton discretization. Our methods are applicable to other DNA nanotechnology constructions and may serve in the development of a full DNA computing model