733 research outputs found

    A Software Tool for the Exponential Power Distribution: The normalp Package

    In this paper we present the normalp package, a package for the statistical environment R that has a set of tools for dealing with the exponential power distribution. In this package there are functions to compute the density function, the distribution function and the quantiles from an exponential power distribution and to generate pseudo-random numbers from the same distribution. Moreover, methods concerning the estimation of the distribution parameters are described and implemented. It is also possible to estimate linear regression models when we assume the random errors distributed according to an exponential power distribution. A set of functions is designed to perform simulation studies to see the suitability of the estimators used. Some examples of use of this package are provided.
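As a rough illustration of the density function mentioned in the abstract above, the sketch below evaluates an exponential power density in Python. It assumes the parameterization f(x) = exp(-|x - mu|^p / (p * sigma_p^p)) / (2 * p^(1/p) * sigma_p * Gamma(1 + 1/p)), which the abstract itself does not state. The function name `ep_density` and its arguments are hypothetical and are not the package's R interface.

```python
import math


def ep_density(x, mu=0.0, sigma_p=1.0, p=2.0):
    """Exponential power density at x (assumed parameterization, see lead-in).

    mu is the location, sigma_p the scale, and p the shape parameter.
    """
    if sigma_p <= 0 or p <= 0:
        raise ValueError("sigma_p and p must be positive")
    # Normalizing constant: 2 * p^(1/p) * sigma_p * Gamma(1 + 1/p)
    norm = 2.0 * p ** (1.0 / p) * sigma_p * math.gamma(1.0 + 1.0 / p)
    return math.exp(-abs(x - mu) ** p / (p * sigma_p ** p)) / norm


# Under this parameterization p = 2 gives the normal density
# and p = 1 gives the Laplace density.
normal_at_zero = ep_density(0.0, p=2.0)
laplace_at_one = ep_density(1.0, p=1.0)
```

Under the assumed form, the shape parameter p controls the tail weight: smaller values give heavier tails than the normal case, larger values give lighter ones.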

    Effect of traffic on soil preconsolidation pressures due to eucalyptus harvest operations

    One of the limitations for reaching sustainable forest development is related to the traffic of machines and vehicles during harvest operations and wood transport, which may cause soil structure degradation. To analyze this problem, this study aimed to determine the effect of traffic from harvest operations and wood transport on the preconsolidation pressure (σp) of a Red-Yellow Latosol (Typic Acrustox) cultivated with eucalyptus. The study used undisturbed soil samples collected at the 0.10-0.125 m depth, which were subjected to uniaxial compression tests. Soil sampling consisted of two stages, before and after the mechanized harvest operations. In the dry season, the changes in σp caused by traffic indicated that the soil compaction process was neither evident nor important. In the rainy season, however, operations performed with the Harvester and Forwarder caused the greatest soil compaction, whereas operations performed with the chainsaw and manual extraction caused the least.

    Development of Multigene Expression Signature Maps at the Protein Level from Digitized Immunohistochemistry Slides

    Molecular classification of diseases based on multigene expression signatures is increasingly used for diagnosis, prognosis, and prediction of response to therapy. Immunohistochemistry (IHC) is an optimal method for validating expression signatures obtained using high-throughput genomics techniques since IHC allows a pathologist to examine gene expression at the protein level within the context of histologically interpretable tissue sections. Additionally, validated IHC assays may be readily implemented as clinical tests since IHC is performed on routinely processed clinical tissue samples. However, methods have not been available for automated n-gene expression profiling at the protein level using IHC data. We have developed methods to compute expression level maps (signature maps) of multiple genes from IHC data digitized on a commercial whole slide imaging system. Areas of cancer for these expression level maps are defined by a pathologist on adjacent, co-registered H&E slides, allowing assessment of IHC statistics and heterogeneity within the diseased tissue. This novel way of representing multiple IHC assays as signature maps will allow the development of n-gene expression profiling databases in three dimensions throughout virtual whole organ reconstructions

    RawHash: Enabling Fast and Accurate Real-Time Analysis of Raw Nanopore Signals for Large Genomes

    Nanopore sequencers generate electrical raw signals in real-time while sequencing long genomic strands. These raw signals can be analyzed as they are generated, providing an opportunity for real-time genome analysis. An important feature of nanopore sequencing, Read Until, can eject strands from sequencers without fully sequencing them, which provides opportunities to computationally reduce the sequencing time and cost. However, existing works utilizing Read Until either 1) require powerful computational resources that may not be available for portable sequencers or 2) lack scalability for large genomes, rendering them inaccurate or ineffective. We propose RawHash, the first mechanism that can accurately and efficiently perform real-time analysis of nanopore raw signals for large genomes using a hash-based similarity search. To enable this, RawHash ensures the signals corresponding to the same DNA content lead to the same hash value, regardless of the slight variations in these signals. RawHash achieves an accurate hash-based similarity search via an effective quantization of the raw signals such that signals corresponding to the same DNA content have the same quantized value and, subsequently, the same hash value. We evaluate RawHash on three applications: 1) read mapping, 2) relative abundance estimation, and 3) contamination analysis. Our evaluations show that RawHash is the only tool that can provide high accuracy and high throughput for analyzing large genomes in real-time. When compared to the state-of-the-art techniques, UNCALLED and Sigmap, RawHash provides 1) 25.8x and 3.4x better average throughput and 2) an average speedup of 32.1x and 2.1x in the mapping time, respectively. Source code is available at https://github.com/CMU-SAFARI/RawHash
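The quantize-then-hash idea described in the abstract above can be sketched in a few lines of Python. This is only a minimal illustration under assumed settings (signal values already normalized to roughly [-3, 3], 4 bits per value, and a hash formed by packing the quantized values into one integer); it is not RawHash's actual quantization or hashing scheme, and the names `quantize` and `hash_window` are hypothetical.

```python
def quantize(value, lo=-3.0, hi=3.0, bits=4):
    """Map a normalized signal value to one of 2**bits equal-width buckets."""
    levels = 1 << bits
    clipped = min(max(value, lo), hi)
    bucket = int((clipped - lo) / (hi - lo) * levels)
    return min(bucket, levels - 1)  # value == hi falls in the last bucket


def hash_window(values, bits=4):
    """Pack the quantized values of a window into a single integer key."""
    key = 0
    for v in values:
        key = (key << bits) | quantize(v, bits=bits)
    return key


# Two noisy observations of the same (hypothetical) signal, and a different one.
signal_a = [0.10, -1.20, 2.05]
signal_b = [0.15, -1.15, 2.00]
signal_c = [1.00, -1.20, 2.05]

same_key = hash_window(signal_a) == hash_window(signal_b)
different_key = hash_window(signal_a) != hash_window(signal_c)
```

In this sketch, small variations that stay inside one bucket produce identical keys, so matching reduces to an exact lookup. Values that sit close to a bucket boundary can still land in different buckets, which is one reason a real system needs a more careful quantization than this equal-width example.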

    Properties of locally checkable vertex partitioning problems in digraphs

    While locally checkable vertex subset and partitioning problems have been studied extensively for undirected graphs, the equivalent directed problems have not received nearly as much attention. We take a closer look at the relationship between undirected and directed problems with respect to hardness. We extend some properties that have already been shown for undirected graphs to directed graphs. Furthermore, we explore some of the trivialities in directed problem definitions that do not appear in undirected ones. Finally, we construct and visualize digraph coverings to achieve a deeper understanding of their structure. Master's thesis in informatics.

    RawHash2: Accurate and Fast Mapping of Raw Nanopore Signals using a Hash-based Seeding Mechanism

    Summary: Raw nanopore signals can be analyzed while they are being generated, a process known as real-time analysis. Real-time analysis of raw signals is essential to utilize the unique features that nanopore sequencing provides, enabling the early stopping of the sequencing of a read or the entire sequencing run based on the analysis. The state-of-the-art mechanism, RawHash, offers the first hash-based efficient and accurate similarity identification between raw signals and a reference genome by quickly matching their hash values. In this work, we introduce RawHash2, which provides major improvements over RawHash, including a more sensitive chaining implementation, weighted mapping decisions, frequency filters to reduce ambiguous seed hits, minimizers for hash-based sketching, and support for the R10.4 flow cell version and various data formats such as POD5. RawHash2 provides better F1 accuracy (on average by 3.44% and up to 10.32%) and better throughput (on average by 2.3x and up to 5.4x) than RawHash. Availability and Implementation: RawHash2 is available at https://github.com/CMU-SAFARI/RawHash. We also provide the scripts to fully reproduce our results on our GitHub page.
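The abstract above lists minimizers for hash-based sketching among the improvements. As a generic illustration of that technique, the Python sketch below keeps only the smallest hash value in each sliding window of `w` consecutive hashes. The function name `minimizers`, the window size, and the leftmost tie-breaking rule are assumptions made for this example and do not describe RawHash2's implementation.

```python
def minimizers(hashes, w):
    """Return (position, hash) pairs of the smallest hash in each window of w.

    Ties are broken by taking the leftmost position; a pick that repeats in
    consecutive windows is reported once.
    """
    if w <= 0:
        raise ValueError("window size must be positive")
    selected = []
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        offset = min(range(w), key=lambda i: (window[i], i))
        pick = (start + offset, window[offset])
        if not selected or selected[-1] != pick:
            selected.append(pick)
    return selected


# Six hypothetical seed hash values sketched with a window of three.
sketch = minimizers([5, 3, 8, 1, 9, 2], w=3)
```

Here six hash values reduce to two stored entries, which shows how the sketch shrinks the index while neighbouring windows still tend to agree on the same representative.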

    FLEXWAL: A computer program for predicting the wall modifications for two-dimensional, solid, adaptive-wall tunnels

    A program called FLEXWAL for calculating wall modifications for solid, adaptive-wall wind tunnels is presented. The method used is the iterative technique of NASA TP-2081 and is applicable to subsonic and transonic test conditions. The program usage, program listing, and a sample case are given.

    TargetCall: Eliminating the Wasted Computation in Basecalling via Pre-Basecalling Filtering

    Basecalling is an essential step in nanopore sequencing analysis where the raw signals of nanopore sequencers are converted into nucleotide sequences, i.e., reads. State-of-the-art basecallers employ complex deep learning models to achieve high basecalling accuracy. This makes basecalling computationally inefficient and memory-hungry, bottlenecking the entire genome analysis pipeline. However, for many applications, the majority of reads do not match the reference genome of interest (i.e., target reference) and thus are discarded in later steps in the genomics pipeline, wasting the basecalling computation. To overcome this issue, we propose TargetCall, the first fast and widely applicable pre-basecalling filter to eliminate the wasted computation in basecalling. TargetCall's key idea is to discard reads that will not match the target reference (i.e., off-target reads) prior to basecalling. TargetCall consists of two main components: (1) LightCall, a lightweight neural network basecaller that produces noisy reads; and (2) Similarity Check, which labels each of these noisy reads as on-target or off-target by matching them to the target reference. TargetCall filters out all off-target reads before basecalling, and the highly accurate but slow basecalling is performed only on the raw signals whose noisy reads are labeled as on-target. Our thorough experimental evaluations using both real and simulated data show that TargetCall 1) improves the end-to-end basecalling performance of the state-of-the-art basecaller by 3.31x while maintaining high (98.88%) sensitivity in keeping on-target reads, 2) maintains high accuracy in downstream analysis, 3) precisely filters out up to 94.71% of off-target reads, and 4) achieves better performance, sensitivity, and generality compared to prior works. We freely open-source TargetCall at https://github.com/CMU-SAFARI/TargetCall