Benchmarking database systems for Genomic Selection implementation
Motivation: With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. Making effective use of this information requires DNA extraction facilities and marker production facilities that can efficiently deploy the desired set of markers across samples with a turnaround time rapid enough to allow selection before crosses need to be made. In reality, breeders often have a short window of time to make decisions by the time they are able to collect all their phenotyping data and receive corresponding genotyping data. This presents a challenge to organize information and utilize it in downstream analyses to support decisions made by breeders. In order to implement genomic selection routinely as part of breeding programs, one would need an efficient genotyping data storage system. We selected and benchmarked six popular open-source data storage systems, including relational database management and columnar storage systems. Results: We found that data extract times are greatly influenced by the orientation in which genotype data is stored in a system. HDF5 consistently performed best, in part because it can more efficiently work with both orientations of the allele matrix
Investigation of Interactions between Rev and Microtubules: Purification of Wild-type and Mutant Rev Protein and Optimization of Microtubule Depolymerization Assays
As a logical pharmaceutical target for antiviral drugs, HIV-1 Rev is a regulatory protein essential for viral infection (Hope, 1999). The development of antiviral drugs that target Rev has been hindered by the lack of high-resolution structural information due to the protein's tendency to aggregate in solution. While searching for solution conditions rendering Rev amenable to crystallographic analyses, Watts et al. (2000) discovered a novel in vitro interaction between Rev and microtubules (MTs) whereby addition of equimolar Rev and tubulin forms bilayered rings called Rev-tubulin toroidal complexes (RTTs). RTTs are similar to those seen when MTs are mixed with certain anti-MT drugs and KinI kinesins (Watts et al., 2000). Coupled with the sequence homology that exists between KinI kinesins and Rev, I hypothesize that Rev and KinIs depolymerize MTs by a shared mechanism. This mechanism may include the binding of Rev/KinI to the end of the MT, inducing a curved conformation in the MT and thereby destabilizing it to promote depolymerization. I propose to test this hypothesis by measuring Rev-MT interactions, adapting biochemical and microscopy-based assays used to measure MT depolymerization by KinI proteins. These assays require microgram amounts of highly purified wild-type and mutant Rev proteins as well as purified tubulin from which MTs can be polymerized in vitro. To this end, I have purified wild-type Rev with no visible contaminants on Coomassie-stained gels. Rev mutants R42A and E57A can also be purified with limited visible contaminants. However, the appropriate controls can be generated from non-expressing cells to address this issue. Rev mutants A37D and R39A can also be partially purified. Using purified Rev proteins, I then applied sedimentation assays to measure Rev-stimulated MT depolymerization.
There was a statistically significant time dependence for wild-type Rev to depolymerize MTs although there was no evidence for concentration dependence. Visual assays demonstrate no significant difference in the length of MTs treated with Rev although Rev decreased the number of MTs on the coverslip over time. This could contribute to the finding that MT depolymerization by Rev is time dependent. These results demonstrate that it is possible to measure Rev-MT interactions in vitro although it is clear that these assays are deficient in certain ways, including the ability of MTs to depolymerize on their own and RTTs potentially pelleting during the depolymerization assay. However, it is likely that cycling tubulin and/or using taxol to further stabilize the MTs can limit the depolymerization of MTs on their own. EM could also be used to determine if RTTs are pelleting during the depolymerization assay. After using mutant Rev proteins in the depolymerization and visual assays, it is the long-term goal that mechanistic information about Rev will lend valuable evidence to the study of KinI kinesins. Generating structural Rev information may also be helpful in the drug design of anti-mitotic peptides
QuickCut CNC Waterjet Cutter
This design project was dedicated to improving the existing design of the QuickCut CNC Waterjet Cutter and making the machine functional. The scope of work consisted of making the machine mobile and more structurally stable as well as programming the CNC controls. The project was started in the 2018/19 school year by a former mechanical engineering student. The subsystems consist of the frame, power, motor, water pump and associated plumbing, nozzle, abrasive feed system, x/y gantry, Arduino control system and the cutting bed. Unfortunately, due to the COVID-19 pandemic, the full scope could not be completed
Sincerity, Hypocrisy, and Conspiracy Theory in the Occupied Palestinian Territory.
Concerns about lying and sincerity in politics are common in most societies, as are concerns about conspiracy theories. But in the occupied Palestinian territory, these concerns give rise to particular kinds of effects because of the conditions of Israeli occupation. Political theorists often interpret opacity claims and conspiracy theories as responses to social disorder. In occupied Palestine, disorder and instability are standard. Opacity claims and conspiracy theories therefore require a different kind of analysis. Through an examination of the semiotic ideology of sincerity, especially as it has emerged in the conflict between Fatah and Hamas, this article argues that opacity claims act as a form of nationalist pedagogy, at once reinforcing the basic principles of sincerity of action and word, and encouraging a wariness of political spin
The evolutionary constraints on angiosperm chloroplast adaptation
The chloroplast (plastid) arose via the endosymbiosis of a photosynthetic cyanobacterium by a nonphotosynthetic eukaryotic cell ∼1.5 billion years ago. Although the plastid underwent rapid evolution by genome reduction, its rate of molecular evolution is low and its genome organization is highly conserved. Here, we investigate the factors that have constrained the rate of molecular evolution of protein-coding genes in the plastid genome. Through phylogenomic analysis of 773 angiosperm plastid genomes, we show that there is substantial variation in the rate of molecular evolution between genes. We demonstrate that the distance of a plastid gene from the likely origin of replication influences the rate at which it has evolved, consistent with time and distance-dependent nucleotide mutation gradients. In addition, we show that the amino acid composition of a gene product constrains its substitution tolerance, limiting its mutation landscape and its corresponding rate of molecular evolution. Finally, we demonstrate that the mRNA abundance of a gene is a key factor in determining its rate of molecular evolution, suggesting an interaction between transcription and DNA repair in the plastid. Collectively, we show that the location, the composition, and the expression of a plastid gene can account for >50% of the variation in its rate of molecular evolution. Thus, these three factors have exerted a substantial limitation on the capacity for adaptive evolution in plastid-encoded genes and ultimately constrained the evolvability of the chloroplast
Optimising the data combination rule for seamless phase II/III clinical trials
We consider seamless Phase II/III clinical trials which compare K treatments with a common control in Phase II, then test the most promising treatment against control in Phase III. The final hypothesis test for the selected treatment can use data from both Phases, subject to controlling the familywise type I error rate. We show that the choice of method for conducting the final hypothesis test has a substantial impact on the power to demonstrate that an effective treatment is superior to control. To understand these differences in power we derive optimal decision rules, maximising power for particular configurations of treatment effects. Rules with optimal frequentist properties are found as solutions to multivariate Bayes decision problems. Although the optimal rule depends on the configuration of treatment means considered, we are able to identify two decision rules with robust efficiency: a rule using a weighted average of the Phase II and Phase III data on the selected treatment and control, and a closed testing procedure using an inverse normal combination rule and a Dunnett test for intersection hypotheses. For the first of these rules, we find the optimal division of a given total sample size between Phase II and Phase III. We also assess the value of using Phase II data in the final analysis and find that for many plausible scenarios, between 50% and 70% of the Phase II numbers on the selected treatment and control would need to be added to the Phase III sample size in order to achieve the same increase in power
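The inverse normal combination rule named in this abstract can be illustrated with a minimal sketch. It assumes two independent one-sided p-values (one per phase) and pre-specified weights proportional to the square roots of the stage sample sizes; the function name and the weight choice are illustrative assumptions, not taken from the paper, and the Dunnett test used for intersection hypotheses in the closed testing procedure is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist


def inverse_normal_combination(p1, p2, n1, n2):
    """Combine two independent one-sided p-values into one p-value.

    Hypothetical helper: the weights sqrt(n_i / (n1 + n2)) are an assumed,
    commonly used choice; their squares sum to 1, so the combined statistic
    is standard normal under the null hypothesis.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    w1 = sqrt(n1 / (n1 + n2))
    w2 = sqrt(n2 / (n1 + n2))
    # Convert each p-value to a z-score, then take the weighted sum.
    z = w1 * std_normal.inv_cdf(1 - p1) + w2 * std_normal.inv_cdf(1 - p2)
    # Convert the combined z-score back to a one-sided p-value.
    return 1 - std_normal.cdf(z)


if __name__ == "__main__":
    # Two stages of equal size, each with a one-sided p-value of 0.05.
    print(inverse_normal_combination(0.05, 0.05, 100, 100))
```

The weights must be fixed before the Phase III data are seen; that pre-specification is what allows the combined statistic to keep its null distribution after a treatment has been selected.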
Comparison of Computational Models for Assessing Conservation of Gene Expression across Species
Assessing conservation/divergence of gene expression across species is important for the understanding of gene regulation evolution. Although advances in microarray technology have provided massive high-dimensional gene expression data, the analysis of such data is still challenging. To date, assessing cross-species conservation of gene expression using microarray data has been mainly based on comparison of expression patterns across corresponding tissues, or comparison of co-expression of a gene with a reference set of genes. Because direct and reliable high-throughput experimental data on conservation of gene expression are often unavailable, the assessment of these two computational models is very challenging and has not been reported yet. In this study, we compared one corresponding tissue based method and three co-expression based methods for assessing conservation of gene expression, in terms of their pair-wise agreements, using a frequently used human-mouse tissue expression dataset. We find that 1) the co-expression based methods are only moderately correlated with the corresponding tissue based methods, 2) the reliability of co-expression based methods is affected by the size of the reference ortholog set, and 3) the corresponding tissue based methods may lose some information for assessing conservation of gene expression. We suggest that the use of either of these two computational models to study the evolution of a gene's expression may be subject to great uncertainty, and the investigation of changes in both gene expression patterns over corresponding tissues and co-expression of the gene with other genes is necessary
A jackknife-like method for classification and uncertainty assessment of multi-category tumor samples using gene expression information
Background: The use of gene expression profiling for the classification of human cancer tumors has been widely investigated. Previous studies were successful in distinguishing several tumor types in binary problems. As there are over a hundred types of cancers, and potentially even more subtypes, it is essential to develop multi-category methodologies for molecular classification for any meaningful practical application. Results: A jackknife-based supervised learning method called paired-samples test algorithm (PST), coupled with a binary classification model based on linear regression, was proposed and applied to two well known and challenging datasets consisting of 14 (GCM dataset) and 9 (NCI60 dataset) tumor types. The results showed that the proposed method improved the prediction accuracy of the test samples for the GCM dataset, especially when the t-statistic was used in the primary feature selection. For the NCI60 dataset, the application of PST improved prediction accuracy when the numbers of genes used were relatively small (100 or 200). These improvements made the binary classification method more robust to the gene selection mechanism and the number of genes used. The overall prediction accuracies were competitive in comparison to the most accurate results obtained by several previous studies on the same datasets and with other methods. Furthermore, the relative confidence R(T) provided a unique insight into the sources of the uncertainty shown in the statistical classification and the potential variants within the same tumor type. Conclusion: We proposed a novel bagging method for the classification and uncertainty assessment of multi-category tumor samples using gene expression information. The strengths were demonstrated in the application to two benchmark datasets.
Comparison of Two Output-Coding Strategies for Multi-Class Tumor Classification Using Gene Expression Data and Latent Variable Model as Binary Classifier
Multi-class cancer classification based on microarray data is described. A generalized output-coding scheme based on One Versus One (OVO) combined with Latent Variable Model (LVM) is used. Results from the proposed OVO output-coding strategy are compared with those obtained from the generalized One Versus All (OVA) method, and the efficiency of both strategies for multi-class tumor classification is studied. This comparative study used two microarray gene expression datasets: the Global Cancer Map (GCM) dataset and a brain cancer (BC) dataset. Primary feature selection was based on fold change and penalized t-statistics. Evaluation was conducted with varying feature numbers. The OVO coding strategy worked quite well with the BC data, while both OVO and OVA results seemed to be similar for the GCM data. The selection of output coding methods for combining binary classifiers for multi-class tumor classification depends on the number of tumor types considered, the discrepancies between the tumor samples used for training as well as the heterogeneity of expression within the cancer subtypes used as training data
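The difference between the two output-coding strategies can be sketched as follows. This is only an illustration of how OVO and OVA decompose a multi-class problem into binary ones; the helper names are hypothetical, majority voting is an assumed decoding rule, and the latent variable model used as the binary classifier in the abstract is not implemented.

```python
from collections import Counter
from itertools import combinations


def ova_tasks(classes):
    """One Versus All: one binary problem per class (class vs. the rest)."""
    return [(c, "rest") for c in classes]


def ovo_tasks(classes):
    """One Versus One: one binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))


def ovo_vote(pairwise_winners):
    """Decode OVO outputs by majority vote over the pairwise winners.

    Assumed decoding rule for illustration; ties go to the label that
    appears first in the input.
    """
    return Counter(pairwise_winners).most_common(1)[0][0]


if __name__ == "__main__":
    tumor_types = [f"type_{i}" for i in range(14)]  # placeholder labels
    print(len(ova_tasks(tumor_types)), len(ovo_tasks(tumor_types)))
```

For K classes, OVA trains K binary classifiers while OVO trains K(K-1)/2, each on the samples of only two classes, so the number of tumor types considered directly affects the cost and the training-set size of each strategy.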
