
34 research outputs found

Associative Pattern Recognition for Biological Regulation Data

Author: Xiao Yiou
Publication venue: SURFACE at Syracuse University
Publication date: 22/12/2017

In the last decade, bioinformatics data has been accumulated at an unprecedented rate, thanks to the advancement in sequencing technologies. Such rapid development poses both challenges and promising research topics. In this dissertation, we propose a series of associative pattern recognition algorithms in biological regulation studies. In particular, we emphasize efficiently recognizing associative patterns between genes, transcription factors, histone modifications and functional labels using heterogeneous data sources (numeric, sequences, time series data and textual labels). In protein-DNA associative pattern recognition, we introduce an efficient algorithm for affinity test by searching for over-represented DNA sequences using a hash function and modulo addition calculation. This substantially improves the efficiency of next generation sequencing data analysis. In gene regulatory network inference, we propose a framework for refining weak networks based on transcription factor binding sites, thus improving the precision of predicted edges by up to 52%. In histone modification code analysis, we propose an approach to genome-wide combinatorial pattern recognition for histone-code-to-function associative pattern recognition, and achieve an improvement of up to 38.1%. We also propose a novel shape-based modification pattern analysis approach, and use it to successfully predict sub-classes of genes in the flowering-time category. We also propose a combination-to-combination associative pattern recognition approach, which achieves better performance than multi-label classification and bidirectional associative memory methods. Our proposed approaches recognize associative patterns from different types of data efficiently, and provide a useful toolbox for biological regulation analysis. This dissertation presents a road map to associative pattern recognition at the genome-wide level.

Syracuse University Research Facility and Collaborative Environment

29th International Symposium on Theoretical Aspects of Computer Science: STACS '12, February 29th to March 3rd, 2012, Paris, France

Author: Schloss Dagstuhl Leibniz-Zentrum für Informatik
Publication venue: Schloss Dagstuhl - Leibniz-Center for Informatics
Publication date: 01/02/2012

Digitale Bibliothek Thüringen

Fundamental Approaches to Software Engineering

Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/04/2022

This open access book constitutes the proceedings of the 25th International Conference on Fundamental Approaches to Software Engineering, FASE 2022, which was held during April 4-5, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 17 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. The proceedings also contain 3 contributions from the Test-Comp Competition. The papers deal with the foundations on which software engineering is built, including topics like software engineering as an engineering discipline, requirements engineering, software architectures, software quality, model-driven development, software processes, software evolution, AI-based software engineering, and the specification, design, and implementation of particular classes of systems, such as (self-)adaptive, collaborative, AI, embedded, distributed, mobile, pervasive, cyber-physical, or service-oriented applications

Directory of Open Access Books (DOAB)

Paths and walks, forests and planes : arcadian algorithms and complexity

Author: Brand Cornelius
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2019

This dissertation is concerned with new results in the area of parameterized algorithms and complexity. We develop a new technique for hard graph problems that generalizes and unifies established methods such as Color-Coding, representative families, labelled walks and algebraic fingerprinting. At the heart of the approach lies an algebraic formulation of the problems, which is effected by means of a suitable exterior algebra. This allows us to estimate the number of simple paths of given length in directed graphs faster than before. Additionally, we give fast deterministic algorithms for finding paths of given length if the input graph contains only few such paths. Moreover, we develop faster deterministic algorithms to find spanning trees with few leaves. We also consider the algebraic foundations of our new method. Additionally, we investigate the fine-grained complexity of determining the precise number of forests with a given number of edges in a given undirected graph. To wit, this happens in two ways. Firstly, we complete the complexity classification of the Tutte plane, assuming the exponential time hypothesis. Secondly, we prove that counting forests with a given number of edges is at least as hard as counting cliques of a given size.

Cluster of Excellence (Multimodal Computing and Interaction)

Universaar


Fundamental Approaches to Software Engineering

Publication venue: 'Springer Science and Business Media LLC'

OAPEN Library

LIPIcs, Volume 274, ESA 2023, Complete Volume

Author: Farach-Colton Martin, Herman Grzegorz, Puglisi Simon J.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st Annual European Symposium on Algorithms (ESA 2023)
Publication date: 01/01/2023

LIPIcs, Volume 274, ESA 2023, Complete Volume

Dagstuhl Research Online Publication Server


Fast, Scalable, and Accurate Algorithms for Time-Series Analysis

Author: Paparrizos Ioannis
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2018

Time is a critical element for the understanding of natural processes (e.g., earthquakes and weather) or human-made artifacts (e.g., stock market and speech signals). The analysis of time series, the result of sequentially collecting observations of such processes and artifacts, is becoming increasingly prevalent across scientific and industrial applications. The extraction of non-trivial features (e.g., patterns, correlations, and trends) in time series is a critical step for devising effective time-series mining methods for real-world problems and the subject of active research for decades. In this dissertation, we address this fundamental problem by studying and presenting computational methods for efficient unsupervised learning of robust feature representations from time series. Our objective is to (i) simplify and unify the design of scalable and accurate time-series mining algorithms; and (ii) provide a set of readily available tools for effective time-series analysis. We focus on applications operating solely over time-series collections and on applications where the analysis of time series complements the analysis of other types of data, such as text and graphs. For applications operating solely over time-series collections, we propose a generic computational framework, GRAIL, to learn low-dimensional representations that natively preserve the invariances offered by a given time-series comparison method. GRAIL represents a departure from classic approaches in the time-series literature where representation methods are agnostic to the similarity function used in subsequent learning processes. GRAIL relies on the attractive idea that once we construct the data-to-data similarity matrix most time-series mining tasks can be trivially solved. 
To overcome scalability issues associated with approaches relying on such matrices, GRAIL exploits time-series clustering to construct a small set of landmark time series and learns representations to reduce the data-to-data matrix to a data-to-landmark points matrix. To demonstrate the effectiveness of GRAIL, we first present domain-independent, highly accurate, and scalable time-series clustering methods to facilitate exploration and summarization of time-series collections. Then, we show that GRAIL representations, when combined with suitable methods, significantly outperform, in terms of efficiency and accuracy, state-of-the-art methods in major time-series mining tasks, such as querying, clustering, classification, sampling, and visualization. Overall, GRAIL rises as a new primitive for highly accurate, yet scalable, time-series analysis. For applications where the analysis of time series complements the analysis of other types of data, such as text and graphs, we propose generic, simple, and lightweight methodologies to learn features from time-varying measurements. Such applications often organize operations over different types of data in a pipeline such that one operation provides input, in the form of feature vectors, to subsequent operations. To reason about the temporal patterns and trends in the underlying features, we need to (i) track the evolution of features over different time periods; and (ii) transform these time-varying features into actionable knowledge (e.g., forecasting an outcome). To address this challenging problem, we propose principled approaches to model time-varying features and study two large-scale, real-world applications. Specifically, we first study the problem of predicting the impact of scientific concepts through temporal analysis of characteristics extracted from the metadata and full text of scientific articles.
Then, we explore the promise of harnessing temporal patterns in behavioral signals extracted from web search engine logs for early detection of devastating diseases. In both applications, combinations of features with time-series relevant features yielded a greater impact than any other indicator considered in our analysis. We believe that our simple methodology, along with the interesting domain-specific findings that our work revealed, will motivate new studies across different scientific and industrial settings.

Columbia University Academic Commons

LIPIcs, Volume 261, ICALP 2023, Complete Volume

Author: Etessami Kousha, Feige Uriel, Puppis Gabriele
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)
Publication date: 01/01/2023

LIPIcs, Volume 261, ICALP 2023, Complete Volume

Dagstuhl Research Online Publication Server

30th International Symposium on Theoretical Aspects of Computer Science: STACS '13, February 27th to March 2nd, 2013, Kiel, Germany

Author: STACS <30 2013, Kiel>
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Publication date: 01/02/2013

Digitale Bibliothek Thüringen

35th Symposium on Theoretical Aspects of Computer Science: STACS 2018, February 28-March 3, 2018, Caen, France

Author: STACS
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 01/02/2018

Digitale Bibliothek Thüringen