
    Reliably Capture Local Clusters in Noisy Domains From Parallel Universes

    Get PDF
    When searching for small local patterns, it is difficult to distinguish incidental agglomerations of noisy points from true local patterns. We propose a new approach that addresses this problem by exploiting the temporal information contained in most business data sets. The algorithm detects local patterns in noisy data sets more reliably than when the temporal information is ignored. This is achieved by exploiting the fact that noise does not reproduce its incidental structure over time, whereas even small true patterns do. In particular, we developed a method to track clusters over time based on an optimal match of data partitions between time periods.
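    The abstract does not spell out the matching step; a minimal sketch of one plausible realization, assuming clusters are represented as sets of point IDs and using Jaccard overlap with Hungarian assignment (scipy's linear_sum_assignment is a real function; the data layout and threshold are assumptions):

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def jaccard(a, b):
        """Jaccard similarity of two clusters given as sets of point IDs."""
        return len(a & b) / len(a | b) if (a | b) else 0.0

    def match_clusters(prev, curr, min_sim=0.2):
        """Link clusters of period t-1 to clusters of period t by an optimal
        one-to-one assignment that maximizes total Jaccard overlap."""
        sim = np.array([[jaccard(p, c) for c in curr] for p in prev])
        rows, cols = linear_sum_assignment(-sim)  # negate to maximize
        # Matches below the threshold are discarded: clusters that find no
        # persistent counterpart are treated as incidental noise.
        return [(i, j) for i, j in zip(rows, cols) if sim[i, j] >= min_sim]

    Under this reading, clusters that persist across several periods are promising candidates for true local patterns, while one-off agglomerations drop out.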

    Immunology as a metaphor for computational information processing : fact or fiction?

    Get PDF
    The biological immune system exhibits powerful information processing capabilities, and is therefore of great interest to the computer scientist. A rapidly expanding research area has attempted to model many of the features inherent in the natural immune system in order to solve complex computational problems. This thesis examines the metaphor in detail, in an effort to understand and capitalise on those features of the metaphor which distinguish it from other existing methodologies. Two problem domains are considered: scheduling and data clustering. It is argued that these domains exhibit characteristics similar to the environment in which the biological immune system operates, and that they are therefore suitable candidates for application of the metaphor. For each problem domain, two distinct models are developed, incorporating a variety of immunological principles. The models are tested on a number of artificial benchmark datasets. The success of the models on the problems considered confirms the utility of the metaphor.
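    The thesis's concrete models are not reproduced in the abstract; as a flavour of the metaphor, here is a minimal clonal-selection step (CLONALG-style, a standard artificial-immune-system building block, not necessarily the variant used in the thesis; real-vector antibodies are an assumption):

    import random

    def clonal_selection_step(population, fitness, n_select=5, clone_factor=3, mut_scale=0.1):
        """One CLONALG-style iteration over real-vector antibodies:
        select the highest-affinity antibodies, clone them in proportion
        to their rank, and hypermutate clones inversely to affinity."""
        ranked = sorted(population, key=fitness, reverse=True)[:n_select]
        clones = []
        for rank, antibody in enumerate(ranked):
            n_clones = clone_factor * (n_select - rank)  # better rank -> more clones
            rate = mut_scale * (rank + 1) / n_select     # better rank -> less mutation
            for _ in range(n_clones):
                clones.append([x + random.gauss(0.0, rate) for x in antibody])
        # Survivor selection keeps the population size constant.
        return sorted(clones + ranked, key=fitness, reverse=True)[:len(population)]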

    Learning with Graphs using Kernels from Propagated Information

    Get PDF
    Traditional machine learning approaches are designed to learn from independent vector-valued data points. The assumption that instances are independent, however, is not always true. On the contrary, there are numerous domains where data points are cross-linked, for example social networks, where persons are linked by friendship relations. These relations among data points make traditional machine learning difficult and often insufficient. Furthermore, data points themselves can have complex structure, for example molecules or proteins constructed from various bindings of different atoms. Networked and structured data are naturally represented by graphs, and for learning we aim to exploit their structure to improve upon non-graph-based methods. However, graphs encountered in real-world applications often come with rich additional information. This naturally implies many challenges for representation and learning: node information is likely to be incomplete, leading to partially labeled graphs; information can be aggregated from multiple sources and can therefore be uncertain; and additional information on nodes and edges can be derived from complex sensor measurements, and is thus naturally continuous. Although learning with graphs is an active research area, learning with structured data, which essentially models structural similarities of graphs, mostly assumes fully labeled graphs of reasonable size with discrete and certain node and edge information, while learning with networked data, which naturally deals with missing information and huge graphs, mostly assumes homophily and ignores structural similarity. To close these gaps, we present a novel paradigm for learning with graphs that exploits the intermediate results of iterative information propagation schemes on graphs. Originally developed for within-network relational and semi-supervised learning, these propagation schemes have two desirable properties: they capture structural information, and they can naturally adapt to the aforementioned issues of real-world graph data. Additionally, information propagation can be efficiently realized by random walks, leading to fast, flexible, and scalable feature and kernel computations. Further, by considering intermediate random walk distributions, we can model structural similarity for learning with structured and networked data. We develop several approaches based on this paradigm. In particular, we introduce propagation kernels for learning on the graph level, and coinciding walk kernels and Markov logic sets for learning on the node level. Finally, we present two application domains where kernels from propagated information successfully tackle real-world problems.
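    To make the paradigm concrete, here is a strongly simplified propagation-kernel sketch: label distributions are diffused by a random-walk transition matrix and quantized into bins (a crude stand-in for the locality-sensitive hashing used by full propagation kernels); the binning width and iteration count are assumptions, not the thesis's settings:

    from collections import Counter
    import numpy as np

    def propagation_kernel(A1, L1, A2, L2, t_max=3, bin_width=0.1):
        """Simplified propagation kernel between two graphs.
        A1, A2: adjacency matrices; L1, L2: rows are node label
        distributions (one-hot where labels are known, uniform where
        missing). Each step diffuses the distributions, buckets them,
        and counts nodes of both graphs that land in the same bucket."""
        def transition(A):
            deg = A.sum(axis=1, keepdims=True)
            return A / np.maximum(deg, 1e-12)

        T1, T2 = transition(np.asarray(A1, float)), transition(np.asarray(A2, float))
        P1, P2 = np.asarray(L1, float), np.asarray(L2, float)
        k = 0.0
        for _ in range(t_max):
            c1 = Counter(tuple(r) for r in np.floor(P1 / bin_width).astype(int).tolist())
            c2 = Counter(tuple(r) for r in np.floor(P2 / bin_width).astype(int).tolist())
            k += sum(c1[b] * c2[b] for b in c1.keys() & c2.keys())  # common-bucket count
            P1, P2 = T1 @ P1, T2 @ P2  # one information propagation step
        return k

    Because the kernel is built from the intermediate propagation states rather than only the fixed point, it retains structural information while tolerating partially labeled nodes.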

    Perturbative quantum simulation

    Full text link
    Approximations based on perturbation theory are the basis for most of the quantitative predictions of quantum mechanics, whether in quantum field theory, many-body physics, chemistry or other domains. Quantum computing provides an alternative to the perturbation paradigm, but the tens of noisy qubits currently available in state-of-the-art quantum processors are of limited practical utility. In this article, we introduce perturbative quantum simulation, which combines the complementary strengths of the two approaches, enabling the solution of large practical quantum problems using noisy intermediate-scale quantum hardware. The use of a quantum processor eliminates the need to identify a solvable unperturbed Hamiltonian, while the introduction of perturbative coupling permits the quantum processor to simulate systems larger than the available number of physical qubits. After introducing the general perturbative simulation framework, we present an explicit example algorithm that mimics the Dyson series expansion. We then numerically benchmark the method for interacting bosons, fermions, and quantum spins in different topologies, and study different physical phenomena on systems of up to 48 qubits, such as information propagation, charge-spin separation and magnetism. In addition, we use 5 physical qubits on the IBMQ cloud to experimentally simulate the 8-qubit Ising model using our algorithm. The result verifies the noise robustness of our method and illustrates its potential for benchmarking large quantum processors with smaller ones.
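    For reference, the Dyson series that the example algorithm mimics is the standard interaction-picture expansion (textbook form, with hbar = 1; not quoted from the paper itself):

    U_I(t) = \mathcal{T}\exp\!\Big(-i\!\int_0^t V_I(t')\,\mathrm{d}t'\Big)
           = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int_0^t\!\mathrm{d}t_1\cdots\int_0^t\!\mathrm{d}t_n\;\mathcal{T}\big[V_I(t_1)\cdots V_I(t_n)\big],

    where V_I is the perturbative coupling in the interaction picture and \mathcal{T} is the time-ordering operator. Truncating the sum at finite n gives the perturbative approximation that the quantum algorithm is designed to reproduce term by term.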

    Fuzzy-Granular Based Data Mining for Effective Decision Support in Biomedical Applications

    Get PDF
    Due to the complexity of biomedical problems, adaptive and intelligent knowledge discovery and data mining systems are highly needed to help humans understand the inherent mechanisms of diseases. For biomedical classification problems, it is typically impossible to build a perfect classifier with 100% prediction accuracy. Hence a more realistic target is to build an effective Decision Support System (DSS). In this dissertation, a novel adaptive Fuzzy Association Rules (FARs) mining algorithm, named FARM-DS, is proposed to build such a DSS for binary classification problems in the biomedical domain. Empirical studies show that FARM-DS is competitive with state-of-the-art classifiers in terms of prediction accuracy. More importantly, FARs can provide strong decision support for disease diagnosis due to their easy interpretability. This dissertation also proposes a fuzzy-granular method to select informative and discriminative genes from huge microarray gene expression data sets. With fuzzy granulation, information loss in the process of gene selection is decreased; as a result, more informative genes for cancer classification are selected and more accurate classifiers can be modeled. Empirical studies show that the proposed method is more accurate than traditional algorithms for cancer classification, and hence we expect the selected genes to be more helpful for further biological studies.
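    The FARM-DS algorithm itself is not specified in the abstract; as background, a minimal sketch of the fuzzy support and confidence measures that fuzzy association rule mining generally builds on, assuming the min t-norm for conjunction (function names and the t-norm choice are assumptions):

    import numpy as np

    def fuzzy_support(memberships):
        """Fuzzy support of an itemset: mean over records of the minimum
        membership degree across the itemset's fuzzy items (min t-norm).
        memberships: (n_records, n_items) array with values in [0, 1]."""
        return float(np.min(memberships, axis=1).mean())

    def fuzzy_confidence(antecedent, consequent):
        """Confidence of a fuzzy rule A -> C: support(A and C) / support(A),
        with the conjunction again realized by the min t-norm."""
        sup_a = np.min(antecedent, axis=1)
        sup_ac = np.minimum(sup_a, np.min(consequent, axis=1))
        return float(sup_ac.mean() / sup_a.mean()) if sup_a.mean() > 0 else 0.0

    Rules whose fuzzy support and confidence exceed chosen thresholds are the kind of interpretable artifacts that make such a system usable as decision support.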

    PERICLES Deliverable 4.3:Content Semantics and Use Context Analysis Techniques

    Get PDF
    The current deliverable summarises the work conducted within task T4.3 of WP4, focusing on the extraction and subsequent analysis of semantic information from digital content, which is imperative for its preservability. More specifically, the deliverable defines content semantic information from a visual and textual perspective, explains how this information can be exploited in long-term digital preservation, and proposes novel approaches for extracting this information in a scalable manner. Additionally, the deliverable discusses novel techniques for retrieving and analysing the context of use of digital objects. Although this topic has not been extensively studied in the existing literature, we believe use context is vital in augmenting the semantic information and maintaining the usability and preservability of digital objects, as well as their ability to be accurately interpreted as initially intended.

    Probabilistic techniques in semantic mapping for mobile robotics

    Get PDF
    Semantic maps are representations of the world that allow a robot to understand not only the spatial aspects of its workspace, but also the meaning of its elements (objects, rooms, etc.) and how humans interact with them (e.g. functionalities, events and relations). To achieve this, a semantic map adds to purely spatial representations, such as geometric or topological maps, meta-information about the types of elements and relations that can be found in the working environment. This meta-information, called semantic or common-sense knowledge, is typically encoded in Knowledge Bases. An example of this kind of information could be: "fridges are large objects, with a rectangular shape, usually located in kitchens, which can contain perishable food and medication". Encoding and handling this semantic knowledge allows the robot to reason about the information gathered from a given workspace, as well as to infer new information in order to efficiently execute high-level tasks such as "hey robot! please take the medication to grandma". This thesis proposes the use of probabilistic techniques to build and maintain semantic maps, which offers three main advantages over traditional approaches: i) it copes with uncertainty (coming from the robot's imprecise sensors and from the models employed), ii) it yields coherent representations of the environment by exploiting, from a holistic point of view, the contextual relations among the observed elements (e.g. fridges are usually found in kitchens), and iii) it produces certainty values that reflect how accurate the robot's understanding of its environment is. Specifically, the contributions presented can be grouped into two main topics. The first set of contributions addresses the problem of object and/or room recognition, since semantic mapping systems require reliable recognition algorithms to build valid representations. To that end, the thesis explores the use of Probabilistic Graphical Models (PGMs) to exploit the contextual relations among objects and/or rooms while handling the uncertainty inherent in the recognition problem, and the use of Knowledge Bases to improve their performance in different ways, e.g., detecting incoherent results, providing a priori information, reducing the complexity of probabilistic inference algorithms, generating synthetic training samples, enabling learning from past experiences, etc. The second group of contributions accommodates the probabilistic results of the developed recognition algorithms in a new semantic representation, called Multiversal Semantic Map (MvSmap). This map manages multiple interpretations of the robot's workspace, called universes, which are annotated with the probability of being the correct one according to the robot's current knowledge. Thus, this approach provides a grounded belief about the accuracy of the robot's understanding of its environment, which allows it to operate in a more efficient and coherent way.
    The proposed probabilistic algorithms have been thoroughly tested and compared with other current, innovative approaches using state-of-the-art datasets. Additionally, this thesis also contributes two datasets, UMA-Offices and Robot@Home, which contain sensory information captured in different office and home environments, as well as two software tools, the Undirected Probabilistic Graphical Models in C++ library (UPGMpp) and the Object Labeling Toolkit (OLT), for working with Probabilistic Graphical Models and for processing datasets, respectively.
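    The MvSmap machinery is not detailed in the abstract; a toy sketch of its core idea of keeping several annotated interpretations ("universes") and re-weighting them as evidence arrives (the data layout and function names are hypothetical, not from the thesis):

    def update_universes(universes, likelihood):
        """universes: {universe_id: (interpretation, probability)}.
        likelihood(interpretation) -> p(new observation | interpretation).
        Returns the universes re-weighted by Bayes' rule and normalized,
        so the map always carries a belief over its interpretations."""
        weighted = {uid: (interp, prob * likelihood(interp))
                    for uid, (interp, prob) in universes.items()}
        total = sum(prob for _, prob in weighted.values())
        if total == 0.0:  # observation incompatible with every universe
            return universes
        return {uid: (interp, prob / total) for uid, (interp, prob) in weighted.items()}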

    Application and Optimization of Contact-Guided Replica Exchange Molecular Dynamics

    Get PDF
    Proteins are complex macromolecules that fulfil a great variety of important tasks in living organisms. Proteins can, for example, regulate genes, stabilize structure, transmit cell signals, transport substances, and much more. Typically, comprehensive knowledge of a protein's structure and dynamics is required to fully understand its physiological function and interaction mechanisms. The insights gained are essential for the life sciences and can be applied in many areas, e.g. drug design or disease treatment. Despite the tremendous progress of experimental techniques, determining a protein structure still remains a challenging task. Moreover, experiments can only provide partial information, and measured data can be ambiguous and hard to interpret. For this reason, computer simulations are frequently performed to provide further insight and to close the gap between theory and experiment. Today, many in-silico methods are capable of producing accurate protein structure models, whether by a de novo approach or by refining an initial model with the help of experimental data. In this dissertation, I explore the capabilities of replica exchange molecular dynamics (REX MD) as a physics-based approach for generating physically meaningful protein structures. I focus on obtaining structures that are as native-like as possible, and I examine the strengths and weaknesses of the applied method. I extend the standard application by integrating a contact-based bias potential to improve the performance and the final outcome of REX. Incorporating native contact pairs, which can be derived from both theoretical and experimental sources, drives the simulation towards the desired conformations and accordingly reduces the required computational effort. During my work I carried out several studies aimed at maximizing the enrichment of native-like structures, thereby optimizing the end-to-end process of guided REX MD. Each study investigates and improves important aspects of the method: 1) I study the effects of different selections of bias contacts, in particular their range dependence and the negative influence of erroneous contacts. This allows me to determine which kind of bias leads to a significant enrichment of native-like conformations compared to regular REX. 2) I perform a parameter optimization of the applied bias potential. Comparing results from REX simulations using different sigmoid-shaped potentials reveals meaningful parameter ranges, from which I derive an ideal bias potential for the general use case. 3) I present a de novo folding method that can generate many unique starting structures for REX as quickly as possible. I examine the performance of this method in detail and compare two different approaches for selecting the starting structures. The outcome of REX is strongly improved if the starting structures already cover a wide range of conformational space while having a small distance to the targeted state.
    4) I investigate four complex algorithm chains capable of extracting representative structures from the large biomolecular ensembles generated by REX. I study their robustness and reliability, compare them with each other, and numerically assess their performance. 5) Based on my experience with guided REX MD, I developed a Python package to automate and simplify REX projects. It allows a user to design, run, analyze, and visualize a REX project in an interactive and user-friendly environment.
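    The dissertation compares several sigmoid-shaped bias potentials; one illustrative functional form (the specific expression and its parameters are an assumption for exposition, not the thesis's final choice) is a switched well on each biased contact distance r_ij:

    V_{\text{bias}}(r_{ij}) = -\frac{\varepsilon}{1 + e^{\,\alpha\,(r_{ij} - r_0)}},

    where r_0 is the switching distance, \alpha the steepness, and \varepsilon the well depth. The potential is approximately -\varepsilon when the contact is formed (r_ij << r_0) and smoothly vanishes when it is broken, so an erroneous bias contact can distort the energy landscape by at most a bounded amount \varepsilon, which is one reason such shapes are attractive for guiding REX.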

    Informed Segmentation Approaches for Studying Time-Varying Functional Connectivity in Resting State fMRI

    Full text link
    The brain is a complex dynamical system that is never truly “at rest”. Even in the absence of explicit task demands, the brain still manifests a stream of conscious thought, varying levels of vigilance and arousal, as well as a number of postulated ongoing “under the hood” functions such as memory consolidation. Over the past decade, the field of time-varying functional connectivity (TVFC) has emerged as a means of detecting dynamic reconfigurations of the network structure in the resting brain, as well as uncovering the relevance of these changing connectivity patterns with respect to cognition, behavior, and psychopathology. Since the nature and timescales of the underlying resting dynamics are unknown, methodologies that can detect changing temporal patterns in connectivity without imposing arbitrary timescales are required. Moreover, as the study of TVFC is still in its infancy, rigorous evaluation of new and existing methodologies is critical to better understand their behavior when applied in resting data, which lacks ground truth temporal landmarks against which accuracy can be assessed. In this dissertation, I contribute to the methodological component of the TVFC discourse. I propose two distinct, yet related, approaches for identifying TVFC using an informed segmentation framework. This data-driven framework bridges instantaneous and windowed approaches for studying TVFC, in an attempt to mitigate the limitations of each while simultaneously leveraging the advantages of both. I also present a comprehensive, head-to-head comparative analysis of several of the most promising TVFC methodologies proposed to date, which does not exist in the current body of literature.
    PhD, Bioinformatics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/170046/1/marlenad_1.pd
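    The dissertation's two specific approaches are not reproduced in the abstract; a minimal sketch of the general informed-segmentation idea, using sharp changes in frame-wise co-fluctuation to place boundaries and then computing per-segment correlations (the change measure, threshold, and minimum segment length are assumptions):

    import numpy as np

    def informed_segmentation_fc(X, z_thresh=2.0, min_len=3):
        """X: (time, regions) BOLD series, assumed z-scored per region.
        Instantaneous step: per-frame co-fluctuation (upper triangle of
        each frame's outer product with itself). Boundaries are placed
        where this pattern changes sharply; full correlation matrices
        are then computed within each data-driven segment, so no fixed
        window length is imposed on the dynamics."""
        T, R = X.shape
        iu = np.triu_indices(R, k=1)
        cofluct = np.array([np.outer(x, x)[iu] for x in X])        # (T, n_pairs)
        change = np.linalg.norm(np.diff(cofluct, axis=0), axis=1)  # (T-1,)
        cuts = np.where(change > change.mean() + z_thresh * change.std())[0] + 1
        bounds = [0, *cuts.tolist(), T]
        segments = [(a, b) for a, b in zip(bounds[:-1], bounds[1:]) if b - a >= min_len]
        return [(a, b, np.corrcoef(X[a:b].T)) for a, b in segments]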