
    Inferring meta-covariates in classification

    This paper develops an alternative method for gene selection that combines model-based clustering and binary classification. By averaging the covariates within the clusters obtained from model-based clustering, we define “meta-covariates” and use them to build a probit regression model, thereby selecting clusters of similarly behaving genes and aiding interpretation. This simultaneous learning task is accomplished by an EM algorithm that optimises a single likelihood function which rewards good performance at both classification and clustering. We explore the performance of our methodology on a well-known leukaemia dataset and use the Gene Ontology to interpret our results.
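
    The meta-covariate idea in this abstract can be sketched roughly as follows. This is only an illustration under strong assumptions: the cluster assignment is taken as already known (the paper learns it jointly with the classifier via EM, which is not shown here), and the function names, weights and bias are hypothetical.

```python
from statistics import NormalDist


def meta_covariates(sample, cluster_of):
    """Average one sample's covariates (e.g. gene expressions) within each cluster.

    `cluster_of[i]` is the (assumed known) cluster label of covariate i.
    Returns one mean per cluster, ordered by cluster label.
    """
    sums, counts = {}, {}
    for value, cluster in zip(sample, cluster_of):
        sums[cluster] = sums.get(cluster, 0.0) + value
        counts[cluster] = counts.get(cluster, 0) + 1
    return [sums[c] / counts[c] for c in sorted(sums)]


def probit_probability(meta, weights, bias=0.0):
    """Probit regression: P(y = 1) is the standard normal CDF of a linear score."""
    score = bias + sum(w * m for w, m in zip(weights, meta))
    return NormalDist().cdf(score)


# Hypothetical usage: four genes grouped into two clusters.
meta = meta_covariates([1.0, 3.0, 2.0, 4.0], [0, 0, 1, 1])
prob = probit_probability(meta, weights=[0.5, -0.25], bias=0.1)
```

    A large weight on a meta-covariate then points to a whole cluster of genes rather than to a single gene, which is the interpretability benefit the abstract describes.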

    Solving Inventory Routing Problems Using Location Based Heuristics

    Inventory routing problems (IRPs) occur where vendor-managed inventory replenishment strategies are implemented in supply chains. These problems are characterized by the presence of both transportation and inventory considerations, either as parameters or constraints. The research presented in this paper aims at extending the IRP formulation developed on the basis of the location-based heuristics proposed by Bramel and Simchi-Levi and continued by Hanczar. In the first phase of the proposed algorithms, mixed integer programming is used to determine the partitioning of customers as well as the dates and quantities of deliveries. Then, the routes for each partition are determined by solving the traveling salesperson problem with the 2-opt algorithm. In the main part of the research, the classical formulation is extended by additional constraints (visit spacing, vehicle filling rate, driver (vehicle) consistency, and a heterogeneous fleet of vehicles), and additional criteria are discussed. The impact of each proposed extension on the solution possibilities is then evaluated. The results of computational tests are presented and discussed. The results obtained suggest that location-based heuristics should be considered when solving real-life instances of the IRP. (original abstract)
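
    The route-improvement step named in this abstract, 2-opt, can be sketched as below. This is a generic textbook version, not the paper's implementation: the Euclidean distance, the function names and the sample points are assumptions for illustration. Note that 2-opt is a local search, so it returns a tour that no single segment reversal can shorten, which is not necessarily the global optimum.

```python
import math


def tour_length(points, tour):
    """Total length of the closed tour visiting `points` in the order given by `tour`."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def two_opt(points, tour):
    """Repeatedly reverse tour segments while doing so shortens the tour.

    The first stop (e.g. the depot) is kept fixed at position 0.
    """
    best = list(tour)
    best_len = tour_length(points, best)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reverse the segment between positions i and j (inclusive).
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = tour_length(points, candidate)
                if cand_len < best_len - 1e-12:
                    best, best_len = candidate, cand_len
                    improved = True
    return best, best_len


# Hypothetical usage: four customers on a unit square, visited in a crossing order.
square = [(0, 0), (1, 1), (1, 0), (0, 1)]
route, length = two_opt(square, [0, 1, 2, 3])
```

    In the two-phase scheme the abstract describes, a routine like this would run once per customer partition produced by the mixed integer programming phase.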

    The significance of biomechanics and scaffold structure for bladder tissue engineering

    Current approaches for bladder reconstruction surgery are associated with many morbidities. Tissue engineering is considered an ideal approach to create constructs capable of restoring the function of the bladder wall. However, many constructs to date have failed to create a sufficient improvement in bladder capacity due to insufficient neobladder compliance. This review evaluates the biomechanical properties of the bladder wall and how the current reconstructive materials aim to meet this need. To date, limited data from mechanical testing and tissue anisotropy make it challenging to reach a consensus on the native properties of the bladder wall. Many of the materials whose mechanical properties have been quantified do not fall within the range of mechanical properties measured for native bladder wall tissue. Many promising new materials have yet to be mechanically quantified, which makes it difficult to ascertain their likely effectiveness. The impact of scaffold structures and the long-term effect of implanting these materials on their inherent mechanical properties are areas yet to be widely investigated that could provide important insight into the likely longevity of the neobladder construct. In conclusion, there are many opportunities for further investigation into novel materials for bladder reconstruction. Currently, the field would benefit from a consensus on the target values of key mechanical parameters for bladder wall scaffolds.

    Predicting students' emotions using machine learning techniques

    Detecting students' real-time emotions has numerous benefits, such as helping lecturers understand their students' learning behaviour and address problems like confusion and boredom, which undermine students' engagement. One way to detect students' emotions is through their feedback about a lecture. Detecting students' emotions from their feedback, however, is both demanding and time-consuming. For this purpose, we looked at several models that could be used for detecting emotions from students' feedback by training seven different machine learning techniques using real students' feedback. The models with a single emotion performed better than those with multiple emotions. Overall, the best three models were obtained with the CNB classifier for three emotions: amused, bored and excited.

    Predictive response-relevant clustering of expression data provides insights into disease processes

    This article describes and illustrates a novel method of microarray data analysis that couples model-based clustering and binary classification to form clusters of 'response-relevant' genes; that is, genes that are informative when discriminating between the different values of the response. Predictions are subsequently made using an appropriate statistical summary of each gene cluster, which we call the 'meta-covariate' representation of the cluster, in a probit regression model. We first illustrate this method by analysing a leukaemia expression dataset, before focusing closely on the meta-covariate analysis of a renal gene expression dataset in a rat model of salt-sensitive hypertension. We explore the biological insights provided by our analysis of these data. In particular, we identify a highly influential cluster of 13 genes, including three transcription factors (Arntl, Bhlhe41 and Npas2), that is implicated as being protective against hypertension in response to increased dietary sodium. Functional and canonical pathway analysis of this cluster using Ingenuity Pathway Analysis implicated transcriptional activation and circadian rhythm signalling, respectively. Although we illustrate our method using only expression data, the method is applicable to any high-dimensional dataset.

    Batch-adaptive rejection threshold estimation with application to OCR post-processing

    An OCR process is often followed by the application of a language model to find the best transformation of an OCR hypothesis into a string compatible with the constraints of the document, field or item under consideration. The cost of this transformation can be taken as a confidence value and compared to a threshold to decide if a string is accepted as correct or rejected in order to satisfy the need for bounding the error rate of the system. Widespread tools like ROC, precision-recall, or error-reject curves, are commonly used along with fixed thresholding in order to achieve that goal. However, those methodologies fail when a test sample has a confidence distribution that differs from the one of the sample used to train the system, which is a very frequent case in post-processed OCR strings (e.g., string batches showing particularly careful handwriting styles in contrast to free styles). In this paper, we propose an adaptive method for the automatic estimation of the rejection threshold that overcomes this drawback, allowing the operator to define an expected error rate within the set of accepted (non-rejected) strings of a complete batch of documents (as opposed to trying to establish or control the probability of error of a single string), regardless of its confidence distribution. The operator (expert) is assumed to know the error rate that can be acceptable to the user of the resulting data. The proposed system transforms that knowledge into a suitable rejection threshold. The approach is based on the estimation of an expected error vs. transformation cost distribution. First, a model predicting the probability of a cost to arise from an erroneously transcribed string is computed from a sample of supervised OCR hypotheses. Then, given a test sample, a cumulative error vs. cost curve is computed and used to automatically set the appropriate threshold that meets the user-defined error rate on the overall sample. The results of experiments on batches coming from different writing styles show very accurate error rate estimations where fixed thresholding clearly fails. An original procedure to generate distorted strings from a given language is also proposed and tested, which allows the use of the presented method in tasks where no real supervised OCR hypotheses are available to train the system.
    Navarro Cerdan, JR.; Arlandis Navarro, JF.; Llobet Azpitarte, R.; Perez-Cortes, J. (2015). Batch-adaptive rejection threshold estimation with application to OCR post-processing. Expert Systems with Applications. 42(21):8111-8122. doi:10.1016/j.eswa.2015.06.022
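
    The batch-level thresholding described in this abstract can be sketched as follows. This is a simplified reading, not the authors' algorithm: the error model `p_error` (the probability that a given transformation cost comes from a wrongly transcribed string) is assumed to be already fitted, and the toy model and all names used here are hypothetical.

```python
def estimate_threshold(costs, p_error, target_rate):
    """Pick a rejection threshold for one batch of transformation costs.

    Strings with cost <= threshold are accepted. Walking through the batch in
    order of increasing cost, the expected number of errors among the accepted
    strings is accumulated; the largest cost at which the expected error rate
    stays within `target_rate` is returned, or None if no cost qualifies.
    """
    ordered = sorted(costs)
    threshold = None
    expected_errors = 0.0
    for k, cost in enumerate(ordered, start=1):
        expected_errors += p_error(cost)
        # Evaluate only at the last item of a run of equal costs, because
        # accepting one of them means accepting all of them.
        is_last_of_tie = k == len(ordered) or ordered[k] != cost
        if is_last_of_tie and expected_errors / k <= target_rate:
            threshold = cost
    return threshold


def toy_error_model(cost):
    """Toy stand-in for a fitted model: error probability grows with cost."""
    return min(1.0, cost / 10.0)


batch = [1, 2, 3, 8, 9]
threshold = estimate_threshold(batch, toy_error_model, target_rate=0.25)
```

    Because the curve is recomputed for every batch, the threshold follows that batch's own cost distribution, which is the property the abstract contrasts with fixed thresholding.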

    Evaluation Method, Dataset Size or Dataset Content: How to Evaluate Algorithms for Image Matching?

    Most vision papers have to include some evaluation work in order to demonstrate that the algorithm proposed is an improvement on existing ones. Generally, these evaluation results are presented in tabular or graphical forms. Neither of these is ideal because there is no indication as to whether any performance differences are statistically significant. Moreover, the size and nature of the dataset used for evaluation will obviously have a bearing on the results, and neither of these factors is usually discussed. This paper evaluates the effectiveness of commonly used performance characterization metrics for image feature detection and description for matching problems and explores the use of statistical tests such as McNemar’s test and ANOVA as better alternatives.
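
    To illustrate the kind of significance test this abstract recommends, here is a minimal sketch of the exact (binomial) form of McNemar's test for two algorithms evaluated on the same test cases. The function name and the counts in the usage lines are made up for illustration and are not results from the paper.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: number of cases where algorithm A succeeds and algorithm B fails.
    c: number of cases where algorithm A fails and algorithm B succeeds.
    Cases where both succeed or both fail carry no information and are ignored.
    Under the null hypothesis, each discordant case is a fair coin flip.
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts: A beats B on 9 discordant image pairs, B beats A on 1.
p_value = mcnemar_exact(b=9, c=1)
significant = p_value < 0.05
```

    Only the discordant cases enter the test, so two algorithms with similar headline accuracy can still differ significantly, or not, depending on how their failures overlap.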

    Genomic Insights into Methanotrophy: The Complete Genome Sequence of Methylococcus capsulatus (Bath)

    Methanotrophs are ubiquitous bacteria that can use the greenhouse gas methane as a sole carbon and energy source for growth, thus playing major roles in global carbon cycles, and in particular, substantially reducing emissions of biologically generated methane to the atmosphere. Despite their importance, and in contrast to organisms that play roles in other major parts of the carbon cycle such as photosynthesis, no genome-level studies have been published on the biology of methanotrophs. We report the first complete genome sequence to our knowledge from an obligate methanotroph, Methylococcus capsulatus (Bath), obtained by the shotgun sequencing approach. Analysis revealed a 3.3-Mb genome highly specialized for a methanotrophic lifestyle, including redundant pathways predicted to be involved in methanotrophy and duplicated genes for essential enzymes such as the methane monooxygenases. We used phylogenomic analysis, gene order information, and comparative analysis with the partially sequenced methylotroph Methylobacterium extorquens to detect genes of unknown function likely to be involved in methanotrophy and methylotrophy. Genome analysis suggests the ability of M. capsulatus to scavenge copper (including a previously unreported nonribosomal peptide synthetase) and to use copper in regulation of methanotrophy, but the exact regulatory mechanisms remain unclear. One of the most surprising outcomes of the project is evidence suggesting the existence of previously unsuspected metabolic flexibility in M. capsulatus, including an ability to grow on sugars, oxidize chemolithotrophic hydrogen and sulfur, and live under reduced oxygen tension, all of which have implications for methanotroph ecology. The availability of the complete genome of M. capsulatus (Bath) deepens our understanding of methanotroph biology and its relationship to global carbon cycles. We have gained evidence for greater metabolic flexibility than was previously known, and for genetic components that may have biotechnological potential.