54 research outputs found

    Assessment of brain cancer atlas maps with multimodal imaging features.

    Get PDF
    BACKGROUND: Glioblastoma Multiforme (GBM) is a fast-growing and highly aggressive brain tumor that invades the nearby brain tissue and presents secondary nodular lesions across the whole brain but generally does not spread to distant organs. Without treatment, GBM can result in death in about 6 months. The challenges are known to depend on multiple factors: brain localization, resistance to conventional therapy, disrupted tumor blood supply inhibiting effective drug delivery, complications from peritumoral edema, intracranial hypertension, seizures, and neurotoxicity. MAIN TEXT: Imaging techniques are routinely used to obtain accurate detections of lesions that localize brain tumors. Especially magnetic resonance imaging (MRI) delivers multimodal images both before and after the administration of contrast, which results in displaying enhancement and describing physiological features as hemodynamic processes. This review considers one possible extension of the use of radiomics in GBM studies, one that recalibrates the analysis of targeted segmentations to the whole organ scale. After identifying critical areas of research, the focus is on illustrating the potential utility of an integrated approach with multimodal imaging, radiomic data processing and brain atlases as the main components. The templates associated with the outcome of straightforward analyses represent promising inference tools able to spatio-temporally inform on the GBM evolution while being generalizable also to other cancers. CONCLUSIONS: The focus on novel inference strategies applicable to complex cancer systems and based on building radiomic models from multimodal imaging data can be well supported by machine learning and other computational tools potentially able to translate suitably processed information into more accurate patient stratifications and evaluations of treatment efficacy

    Semantic Biclustering

    Get PDF
    Tato disertační práce se zaměřuje na problém hledání interpretovatelných a prediktivních vzorů, které jsou vyjádřeny formou dvojshluků, se specializací na biologická data. Prezentované metody jsou souhrnně označovány jako sémantické dvojshlukování, jedná se o podobor dolování dat. Termín sémantické dvojshlukování je použit z toho důvodu, že zohledňuje proces hledání koherentních podmnožin řádků a sloupců, tedy dvojshluků, v 2-dimensionální binární matici a zárove ň bere také v potaz sémantický význam prvků v těchto dvojshlucích. Ačkoliv byla práce motivována biologicky orientovanými daty, vyvinuté algoritmy jsou obecně aplikovatelné v jakémkoli jiném výzkumném oboru. Je nutné pouze dodržet požadavek na formát vstupních dat. Disertační práce představuje dva originální a v tomto ohledu i základní přístupy pro hledání sémantických dvojshluků, jako je Bicluster enrichment analysis a Rule a tree learning. Jelikož tyto metody nevyužívají vlastní hierarchické uspořádání termů v daných ontologiích, obecně je běh těchto algoritmů dlouhý čin může docházet k indukci hypotéz s redundantními termy. Z toho důvodu byl vytvořen nový operátor zjemnění. Tento operátor byl včleněn do dobře známého algoritmu CN2, kde zavádí dvě redukční procedury: Redundant Generalization a Redundant Non-potential. Obě procedury pomáhají dramaticky prořezat prohledávaný prostor pravidel a tím umožňují urychlit proces indukce pravidel v porovnání s tradičním operátorem zjemnění tak, jak je původně prezentován v CN2. Celý algoritmus spolu s redukčními metodami je publikován ve formě R balííčku, který jsme nazvali sem1R. Abychom ukázali i možnost praktického užití metody sémantického dvojshlukování na reálných biologických problémech, v disertační práci dále popisujeme a specificky upravujeme algoritmus sem1R pro dv+ úlohy. Zaprvé, studujeme praktickou aplikaci algoritmu sem1R v analýze E-3 ubikvitin ligázy v trávicí soustavě s ohledem na potenciál regenerace tkáně. Zadruhé, kromě objevování dvojshluků v dat ech genové exprese, adaptujeme algoritmus sem1R pro hledání potenciálne patogenních genetických variant v kohortě pacientů.This thesis focuses on the problem of finding interpretable and predic tive patterns, which are expressed in the form of biclusters, with an orientation to biological data. The presented methods are collectively called semantic biclustering, as a subfield of data mining. The term semantic biclustering is used here because it reflects both a process of finding coherent subsets of rows and columns in a 2-dimensional binary matrix and simultaneously takes into account a mutual semantic meaning of elements in such biclusters. In spite of focusing on applications of algorithms in biological data, the developed algorithms are generally applicable to any other research field, there are only limitations on the format of the input data. The thesis introduces two novel, and in that context basic, approaches for finding semantic biclusters, as Bicluster enrichment analysis and Rule and tree learning. Since these methods do not exploit the native hierarchical order of terms of input ontologies, the run-time of algorithms is relatively long in general or an induced hypothesis might have terms that are redundant. For this reason, a new refinement operator has been invented. The refinement operator was incorporated into the well-known CN2 algorithm and uses two reduction procedures: Redundant Generalization and Redundant Non-potential, both of which help to dramatically prune the rule space and consequently, speed-up the entire process of rule induction in comparison with the traditional refinement operator as is presented in CN2. The reduction procedures were published as an R package that we called sem1R. To show a possible practical usage of semantic biclustering in real biological problems, the thesis also describes and specifically adapts the algorithm for two real biological problems. Firstly, we studied a practical application of sem1R algorithm in an analysis of E-3 ubiquitin ligase in the gastrointestinal tract with respect to tissue regeneration potential. Secondly, besides discovering biclusters in gene expression data, we adapted the sem1R algorithm for a different task, concretely for finding potentially pathogenic genetic variants in a cohort of patients

    Fundamentals

    Get PDF
    Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, their summary and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to resource requirements and how to enhance scalability on diverse computing architectures ranging from embedded systems to large computing clusters

    Local Geometry Processing for Deformations of Non-Rigid 3D Shapes

    Get PDF
    Geometry processing and in particular spectral geometry processing deal with many different deformations that complicate shape analysis problems for non-rigid 3D objects. Furthermore, pointwise description of surfaces has increased relevance for several applications such as shape correspondences and matching, shape representation, shape modelling and many others. In this thesis we propose four local approaches to face the problems generated by the deformations of real objects and improving the pointwise characterization of surfaces. Differently from global approaches that work simultaneously on the entire shape we focus on the properties of each point and its local neighborhood. Global analysis of shapes is not negative in itself. However, having to deal with local variations, distortions and deformations, it is often challenging to relate two real objects globally. For this reason, in the last decades, several instruments have been introduced for the local analysis of images, graphs, shapes and surfaces. Starting from this idea of localized analysis, we propose both theoretical insights and application tools within the local geometry processing domain. In more detail, we extend the windowed Fourier transform from the standard Euclidean signal processing to different versions specifically designed for spectral geometry processing. Moreover, from the spectral geometry processing perspective, we define a new family of localized basis for the functional space defined on surfaces that improve the spatial localization for standard applications in this field. Finally, we introduce the discrete time evolution process as a framework that characterizes a point through its pairwise relationship with the other points on the surface in an increasing scale of locality. The main contribute of this thesis is a set of tools for local geometry processing and local spectral geometry processing that could be used in standard useful applications. The overall observation of our analysis is that localization around points could factually improve the geometry processing in many different applications

    Fundamentals

    Get PDF
    Volume 1 establishes the foundations of this new field. It goes through all the steps from data collection, their summary and clustering, to different aspects of resource-aware learning, i.e., hardware, memory, energy, and communication awareness. Machine learning methods are inspected with respect to resource requirements and how to enhance scalability on diverse computing architectures ranging from embedded systems to large computing clusters

    Mathematical Methods and Operation Research in Logistics, Project Planning, and Scheduling

    Get PDF
    In the last decade, the Industrial Revolution 4.0 brought flexible supply chains and flexible design projects to the forefront. Nevertheless, the recent pandemic, the accompanying economic problems, and the resulting supply problems have further increased the role of logistics and supply chains. Therefore, planning and scheduling procedures that can respond flexibly to changed circumstances have become more valuable both in logistics and projects. There are already several competing criteria of project and logistic process planning and scheduling that need to be reconciled. At the same time, the COVID-19 pandemic has shown that even more emphasis needs to be placed on taking potential risks into account. Flexibility and resilience are emphasized in all decision-making processes, including the scheduling of logistic processes, activities, and projects

    Large-scale Machine Learning in High-dimensional Datasets

    Get PDF

    Frontiers in Nonparametric Statistics

    Get PDF
    The goal of this workshop was to discuss recent developments of nonparametric statistical inference. A particular focus was on high dimensional statistics, semiparametrics, adaptation, nonparametric bayesian statistics, shape constraint estimation and statistical inverse problems. The close interaction of these issues with optimization, machine learning and inverse problems has been addressed as well

    Evaluation of statistical correlation and validation methods for construction of gene co-expression networks

    Get PDF
    High-throughput technologies such as microarrays have led to the rapid accumulation of large scale genomic data providing opportunities to systematically infer gene function and co-expression networks. Typical steps of co-expression network analysis using microarray data consist of estimation of pair-wise gene co-expression using some similarity measure, construction of co-expression networks, identification of clusters of co-expressed genes and post-cluster analyses such as cluster validation. This dissertation is primarily concerned with development and evaluation of approaches for the first and the last steps – estimation of gene co-expression matrices and validation of network clusters. Since clustering methods are not a focus, only a paraclique clustering algorithm will be used in this evaluation. First, a novel Bayesian approach is presented for combining the Pearson correlation with prior biological information from Gene Ontology, yielding a biologically relevant estimate of gene co-expression. The addition of biological information by the Bayesian approach reduced noise in the paraclique gene clusters as indicated by high silhouette and increased homogeneity of clusters in terms of molecular function. Standard similarity measures including correlation coefficients from Pearson, Spearman, Kendall’s Tau, Shrinkage, Partial, and Mutual information, and Euclidean and Manhattan distance measures were evaluated. Based on quality metrics such as cluster homogeneity and stability with respect to ontological categories, clusters resulting from partial correlation and mutual information were more biologically relevant than those from any other correlation measures. Second, statistical quality of clusters was evaluated using approaches based on permutation tests and Mantel correlation to identify significant and informative clusters that capture most of the covariance in the dataset. Third, the utility of statistical contrasts was studied for classification of temporal patterns of gene expression. Specifically, polynomial and Helmert contrast analyses were shown to provide a means of labeling the co-expressed gene sets because they showed similar temporal profiles
    corecore