68 research outputs found

Equivalence of the filament and overlap graphs of subtrees of limited trees

Author: Enright Jessica
Stewart Lorna
Publication date: 14/06/2017

The overlap graphs of subtrees of a tree are equivalent to subtree filament graphs, the overlap graphs of subtrees of a star are cocomparability graphs, and the overlap graphs of subtrees of a caterpillar are interval filament graphs. In this paper, we show the equivalence of many more classes of subtree overlap and subtree filament graphs, and equate them to classes of complements of cochordal-mixed graphs. Our results generalize the previously known results mentioned above.

arXiv.org e-Print Archive

Stirling Online Research Repository (RIOXX)

Episciences.org

Enlighten

Stirling Online Research Repository

Recognising the overlap graphs of subtrees of restricted trees is hard

Author: Enright Jessica
Pergel Martin
Publication venue: Comenius University, Faculty of Mathematics, Physics and Informatics
Publication date: 29/07/2019

The overlap graphs of subtrees in a tree (SOGs) generalise many other graph classes with set representation characterisations. The complexity of recognising SOGs is open. The complexities of recognising many subclasses of SOGs are known. We consider several subclasses of SOGs obtained by restricting the underlying tree. For a fixed integer $k \geq 3$, we consider: (1) the overlap graphs of subtrees in a tree where that tree has $k$ leaves; (2) the overlap graphs of subtrees in trees that can be derived from a given input tree by subdivision and have at least 3 leaves; (3) the overlap and intersection graphs of paths in a tree where that tree has maximum degree $k$. We show that the recognition problems of these classes are NP-complete. For all other parameters we get circle graphs, well known to be polynomially recognizable.
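
As a rough illustration of the objects in this abstract, the sketch below builds the overlap graph of a family of subtrees, using the standard definition that two sets overlap when they intersect and neither contains the other. The host tree, the subtree names and the helper functions are hypothetical and are not taken from the paper; each subtree is given by its vertex set, and connectivity is assumed rather than checked.

```python
from itertools import combinations


def overlaps(a, b):
    """Two vertex sets overlap if they intersect and neither contains the other."""
    a, b = set(a), set(b)
    return bool(a & b) and not a <= b and not b <= a


def overlap_graph(subtrees):
    """Return the edge set of the overlap graph of the named vertex sets."""
    return {
        (u, v)
        for u, v in combinations(sorted(subtrees), 2)
        if overlaps(subtrees[u], subtrees[v])
    }


# Hypothetical host tree: a star with centre 0 and leaves 1, 2, 3.
subtrees = {
    "A": {0, 1},     # centre plus leaf 1
    "B": {0, 2},     # centre plus leaf 2
    "C": {0, 1, 2},  # contains both A and B
    "D": {3},        # disjoint from the others
}
edges = overlap_graph(subtrees)
```

Here only A and B overlap: C contains both of them and D is disjoint from everything, so neither contributes an edge.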

Enlighten

A fast approximate skeleton with guarantees for any cloud of points in a Euclidean space

Author: Elkin Yury
Kurlin Vitaliy
Liu Di
Publication date: 17/07/2020

The tree reconstruction problem is to find an embedded straight-line tree that approximates a given cloud of unorganized points in $\mathbb{R}^m$ up to a certain error. A practical solution to this problem will accelerate the discovery of new colloidal products with desired physical properties such as viscosity. We define the Approximate Skeleton of any finite point cloud $C$ in a Euclidean space with theoretical guarantees. The Approximate Skeleton $\mathrm{ASk}(C)$ always belongs to a given offset of $C$, i.e. the maximum distance from $C$ to $\mathrm{ASk}(C)$ can be a given maximum error. The number of vertices in the Approximate Skeleton is close to the minimum number in an optimal tree, up to a factor of 2. The new Approximate Skeleton of any unorganized point cloud $C$ is computed in near-linear time in the number of points in $C$. Finally, the Approximate Skeleton outperforms past skeletonization algorithms on the size and accuracy of reconstruction for a large dataset of real micelles and random clouds.
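
The error measure mentioned in this abstract, the maximum distance from the cloud to a straight-line tree, can be sketched as follows. This only evaluates that distance for a tree that is already given; it is not the Approximate Skeleton algorithm, and the function names and sample data are assumptions made for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment ab in R^m."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(x * x for x in ab)
    # Clamp the projection parameter to [0, 1]; a degenerate segment is a point.
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [ai + t * x for ai, x in zip(a, ab)]
    return math.dist(p, closest)


def max_distance_to_tree(cloud, vertices, edges):
    """Largest distance from any cloud point to its nearest tree edge."""
    return max(
        min(point_segment_distance(p, vertices[i], vertices[j]) for i, j in edges)
        for p in cloud
    )


# Hypothetical straight-line tree in the plane: an L-shaped path with two edges.
vertices = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
edges = [(0, 1), (1, 2)]
cloud = [(1.0, 0.5), (2.5, 1.0), (0.0, -0.25)]
error = max_distance_to_tree(cloud, vertices, edges)
```

For this toy input every cloud point lies within 0.5 of the tree, so the cloud is contained in the 0.5-offset of the tree.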

arXiv.org e-Print Archive

University of Liverpool Repository

Comparative Genomics in Distant Taxa: Generating Total Orders of Digraphs

Author: Gärtner Fabian
Publication date: 11/03/2020

Qucosa - Publikationsserver der Universität Leipzig

Local properties of graphs with large chromatic number

Author: Davies James
Publication venue: University of Waterloo
Publication date: 24/08/2022

This thesis deals with problems concerning the local properties of graphs with large chromatic number in hereditary classes of graphs. We construct intersection graphs of axis-aligned boxes and of lines in $\mathbb{R}^3$ that have arbitrarily large girth and chromatic number. We also prove that the maximum chromatic number of a circle graph with clique number at most $\omega$ is equal to $\Theta(\omega \log \omega)$. Lastly, extending the $\chi$-boundedness of circle graphs, we prove a conjecture of Geelen that every proper vertex-minor-closed class of graphs is $\chi$-bounded.

University of Waterloo's Institutional Repository

On the Chromatic Number of Disjointness Graphs of Curves

Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 35th International Symposium on Computational Geometry (SoCG 2019)
Publication date: 01/01/2019

Let $\omega(G)$ and $\chi(G)$ denote the clique number and chromatic number of a graph $G$, respectively. The disjointness graph of a family of curves (continuous arcs in the plane) is the graph whose vertices correspond to the curves and in which two vertices are joined by an edge if and only if the corresponding curves are disjoint. A curve is called x-monotone if every vertical line intersects it in at most one point. An x-monotone curve is grounded if its left endpoint lies on the y-axis. We prove that if $G$ is the disjointness graph of a family of grounded x-monotone curves such that $\omega(G) = k$, then $\chi(G) \leq \binom{k+1}{2}$. If we only require that every curve is x-monotone and intersects the y-axis, then we have $\chi(G) \leq \frac{k+1}{2}\binom{k+2}{3}$. Both of these bounds are best possible. The construction showing the tightness of the last result settles a 25-year-old problem: it yields that there exist $K_k$-free disjointness graphs of x-monotone curves such that any proper coloring of them uses at least $\Omega(k^{4})$ colors. This matches the upper bound up to a constant factor.
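
To get a feel for how the two bounds in this abstract grow, the short sketch below tabulates them for small k. It assumes the second bound is meant as (k+1)/2 times the binomial coefficient of k+2 over 3, which is consistent with the quartic growth stated in the abstract; the function names are illustrative only.

```python
from math import comb


def grounded_bound(k):
    """Upper bound binom(k+1, 2) for grounded x-monotone curves."""
    return comb(k + 1, 2)


def crossing_y_axis_bound(k):
    """Upper bound (k+1)/2 * binom(k+2, 3), read as an assumption from the abstract.

    The product equals k (k+1)^2 (k+2) / 12, so the integer division is exact.
    """
    return (k + 1) * comb(k + 2, 3) // 2


# Tabulate both bounds for a few clique numbers k.
table = [(k, grounded_bound(k), crossing_y_axis_bound(k)) for k in range(2, 6)]
for k, first, second in table:
    print(f"k={k}: grounded <= {first}, crossing the y-axis <= {second}")
```

The first bound is quadratic in k, while the second grows like k^4 / 12, which agrees with the Omega(k^4) lower bound up to a constant factor.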

Dagstuhl Research Online Publication Server

Contextual Analysis of Gene Expression Data

Author: Sohler Florian
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 20/07/2006

Although the measurement of gene expression using microarrays has become a standard high-throughput method in molecular biology, the analysis of gene expression data is still a very active area of research in bioinformatics and statistics. Despite some issues in quality and reproducibility of microarray and derived data, they are still considered one of the most promising experimental techniques for the understanding of complex molecular mechanisms. This work approaches the problem of expression data analysis using contextual information. While all analyses must be based on sound statistical data processing, it is also important to include biological knowledge to arrive at biologically interpretable results. After giving an introduction and some biological background, in chapter 2 some standard methods for the analysis of microarray data including normalization, computation of differentially expressed genes, and clustering are reviewed. The first source of context information that is used to aid in the interpretation of the data is functional annotation of genes. Such information is often represented using ontologies such as the Gene Ontology (GO). GO annotations are provided by many gene and protein databases and have been used to find functional groups that are significantly enriched in differentially expressed, or otherwise conspicuous genes. In gene clustering approaches, functional annotations have been used to find enriched functional classes within each cluster. In chapter 3, a clustering method for the samples of an expression data set is described that uses GO annotations during the clustering process in order to find functional classes that imply a particularly strong separation of the samples. The resulting clusters can be interpreted more easily in terms of GO classes. The clustering method was developed in joint work with Henning Redestig. More complex biological information that covers interactions between biological objects is contained in networks.
Such networks can be obtained from public databases of metabolic pathways, signaling cascades, transcription factor binding sites, or high-throughput measurements for the detection of protein-protein interactions such as yeast two-hybrid experiments. Furthermore, networks can be inferred using literature mining approaches or network inference from expression data. The information contained in such networks is very heterogeneous with respect to the type, the quality and the completeness of the contained data. ToPNet, a software tool for the interactive analysis of networks and gene expression data, has been developed in cooperation with Daniel Hanisch. The basic analysis and visualization methods as well as some important concepts of this tool are described in chapter 4. In order to access the heterogeneous data represented as networks with annotated experimental data and functions, it is important to provide advanced querying functionality. Pathway queries allow the formulation of network templates that can include functional annotations as well as expression data. The pathway search algorithm finds all instances of the template in a given network. In order to do so, a special case of the well-known subgraph isomorphism problem has to be solved. Although the algorithm has exponential running time in the size of the template in the worst case, some implementation tricks make it run fast enough for practical purposes. Often, a pathway query has many matching instances, and it is important to assess the statistical significance of the individual instances with respect to expression data or other criteria. In chapter 5 the pathway query language and the pathway search algorithm are described in detail and some theoretical properties are derived. Furthermore, some scoring methods that have been implemented are described. The possibility of combining different scoring schemes for different parts of the query results in very flexible scoring capabilities.
In chapter 6, some applications of the methods are described, using public data sets as well as data sets from research projects. On the basis of the well-studied public data sets, it is demonstrated that the methods yield biologically meaningful results. The other analyses show how new hypotheses can be generated in more complex biological systems, but the validation of these hypotheses can only be provided by new experiments. Finally, an outlook is given on how the presented methods can contribute to ongoing research efforts in the area of expression data analysis, their applicability to other types of data (such as proteomics data) and their possible extensions.
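
The idea of a pathway query, finding all instances of an annotated network template in a larger network, can be illustrated with a plain backtracking search. This is a generic sketch and not the ToPNet pathway search algorithm; the function name, the data layout and the toy network are all hypothetical.

```python
def find_instances(t_nodes, t_edges, network, annotation, required):
    """Enumerate injective maps from template nodes to network nodes that
    preserve every template edge and satisfy the required annotations."""
    results = []

    def consistent(t, n, mapping):
        # The annotation constraint on template node t, if there is one.
        if required.get(t) is not None and annotation.get(n) != required[t]:
            return False
        # Every template edge to an already mapped node must exist in the network.
        for a, b in t_edges:
            if a == t and b in mapping and mapping[b] not in network[n]:
                return False
            if b == t and a in mapping and mapping[a] not in network[n]:
                return False
        return True

    def extend(i, mapping):
        if i == len(t_nodes):
            results.append(dict(mapping))
            return
        t = t_nodes[i]
        for n in network:
            if n not in mapping.values() and consistent(t, n, mapping):
                mapping[t] = n
                extend(i + 1, mapping)
                del mapping[t]  # backtrack

    extend(0, {})
    return results


# Hypothetical undirected network: a path g1 - g2 - g3 - g4 with toy annotations.
network = {"g1": {"g2"}, "g2": {"g1", "g3"}, "g3": {"g2", "g4"}, "g4": {"g3"}}
annotation = {"g1": "kinase", "g2": "tf", "g3": "kinase", "g4": "tf"}

# Template: a kinase x adjacent to a transcription factor y.
instances = find_instances(
    ["x", "y"], [("x", "y")], network, annotation, {"x": "kinase", "y": "tf"}
)
```

The search tries every network node for each template node in turn, so its worst-case running time is exponential in the size of the template, in line with the remark in the abstract. Checking constraints as soon as a node is assigned prunes most branches early on small templates.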

Digitale Hochschulschriften der LMU

Immunoinformatics: towards an understanding of species-specific protein evolution using phylogenomics and network theory

Author: Webb Andrew Edward
Publication venue: Dublin City University. School of Biotechnology
Publication date: 01/03/2015

In immunology, the mouse is unquestionably the predominant model organism. However, an increasing number of reports suggest that mouse models do not always mimic human innate immunology. To better understand this discordance at the molecular level, we are investigating two mechanisms of gene evolution: positive selection and gene remodeling by introgression/domain shuffling. We began by creating a bioinformatic pipeline for large-scale evolutionary analyses. We next investigated bowhead genomic data to test our pipeline and to determine whether there is lineage-specific positive selection in particular whale lineages. Positive selection is a molecular signature of adaptation and, therefore, of potential protein functional divergence. Once we had troubleshot the pipeline using the low-quality bowhead data, we moved on to test our innate immune dataset for lineage-specific selective pressures. When possible, we applied population genomics theory to identify potential false positives and to date putative positive selection events in humans. The final phase of our analysis uses network (graph) theory to identify genes remodeled by domain shuffling/introgression and to identify species-specific introgressive events. Introgressive events potentially impart novel function and may also alter interactions within a protein network. By identifying genes displaying evidence of positive selection or introgression, we may begin to understand the molecular underpinnings of phenotypic discordance between human and mouse immune systems.

DCU Online Research Access Service