8 research outputs found

Descoberta da topologia de rede

Author: Oliveira Olga Margarida Fajarda
Publication venue: Universidade de Aveiro
Publication date: 01/01/2017

Doctorate in Mathematics

Monitoring and evaluating the performance of a network is essential to detect and resolve network failures. To carry out this monitoring, it is essential to know the topology of the network, which is often unknown. Many of the techniques used to discover the topology require the cooperation of all network devices, which is almost impossible to obtain due to security issues and policies. It is therefore necessary to use techniques that collect, passively and without the cooperation of intermediate devices, the information needed to infer the network topology. This can be done using tomography techniques, which use end-to-end measurements such as packet delays. In this thesis, we use integer linear programming methods to solve the problem of inferring a network topology using only end-to-end measurements. We present two compact mixed integer linear programming (MILP) formulations to solve the problem. Computational results show that as the number of end devices grows, the time needed by the two compact MILP formulations to solve the problem also grows rapidly. We therefore developed two heuristics based on the Feasibility Pump and Local Branching methods. Since packet delay measurements have associated errors, we developed two robust approaches, one to control the maximum number of deviations and the other to reduce the risk of high cost. We also created a system that measures the packet delays between computers on a network and displays the topology of that network.

Repositório Institucional da Universidade de Aveiro

Árvores filogenéticas e o problema da evolução mínima

Author: Oliveira Olga Margarida Fajarda
Publication venue: Universidade de Aveiro
Publication date: 01/01/2009

Master's in Mathematics and Applications

Phylogenetic trees help us understand the evolutionary history of species and can assist in the development of vaccines and the study of biodiversity. There are several criteria for selecting a phylogenetic tree from among the many possible ones, one of which is minimum evolution. This dissertation studies several methods for constructing phylogenetic trees and several formulations for solving the minimum evolution problem. It also presents an alternative formulation that was implemented in XPRESS.

Repositório Institucional da Universidade de Aveiro

Merging microarray studies to identify a common gene expression signature to several structural heart diseases

Authors: Duarte-Pereira Sara; Fajarda Olga; Oliveira José Luís; Silva Raquel M.
Publication venue: Springer Science and Business Media LLC
Publication date: 08/07/2020

Background: Heart disease is the leading cause of death worldwide. Knowing a gene expression signature in heart disease can lead to the development of more efficient diagnosis and treatments that may prevent premature deaths. A large amount of microarray data is available in public repositories and can be used to identify differentially expressed genes. However, most microarray datasets contain a small number of samples, so several datasets have to be merged to obtain more reliable results, which is a challenging task. Differentially expressed genes are commonly identified using statistical methods. Nonetheless, these methods rely on an arbitrary threshold to select the differentially expressed genes, and there is no consensus on the values that should be used. Results: Nine publicly available microarray datasets from studies of different heart diseases were merged to form a dataset composed of 689 samples and 8354 features. Subsequently, the adjusted p-value and fold change were determined, and by combining a set of adjusted p-value cutoffs with a list of different fold change thresholds, 12 sets of differentially expressed genes were obtained. The random forest algorithm was used to select the set of differentially expressed genes with the best accuracy in distinguishing samples from patients with heart diseases from samples from patients with no heart condition. A set of 62 differentially expressed genes with a classification accuracy of approximately 95% was identified. Conclusions: We identified a gene expression signature common to different cardiac diseases and supported our findings by showing the involvement of these genes in the pathophysiology of the heart. The approach used in this study is suitable for the identification of gene expression signatures and can be extended to different diseases.
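
The threshold-combination step described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `select_deg_sets`, the gene statistics, the use of log2 fold change, and the particular cutoff values (3 p-value cutoffs and 4 fold change thresholds, chosen only so that 12 sets result) are all hypothetical. The random forest step that ranks the candidate sets is not shown.

```python
from itertools import product


def select_deg_sets(stats, p_cutoffs, fc_thresholds):
    """Build one candidate gene set per (p-value cutoff, fold change threshold) pair.

    stats maps a gene name to (adjusted p-value, log2 fold change).
    A gene enters a set when its adjusted p-value is below the cutoff and the
    absolute value of its log2 fold change reaches the threshold.
    """
    deg_sets = {}
    for p_cut, fc_cut in product(p_cutoffs, fc_thresholds):
        deg_sets[(p_cut, fc_cut)] = {
            gene
            for gene, (adj_p, log_fc) in stats.items()
            if adj_p < p_cut and abs(log_fc) >= fc_cut
        }
    return deg_sets


# Hypothetical per-gene statistics and cutoffs (assumptions, not from the paper).
stats = {
    "G1": (0.0005, 1.8),
    "G2": (0.02, 0.6),
    "G3": (0.2, 2.5),
    "G4": (0.004, -1.1),
}
p_cutoffs = [0.05, 0.01, 0.001]
fc_thresholds = [0.5, 1.0, 1.5, 2.0]

deg_sets = select_deg_sets(stats, p_cutoffs, fc_thresholds)
print(len(deg_sets))  # one candidate set per combination of cutoffs
```

Each candidate set would then be scored by a classifier, and the set with the best classification accuracy kept as the signature.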

Repositório Institucional da Universidade Católica Portuguesa

NAPRT expression regulation mechanisms: novel functions predicted by a bioinformatics approach

Authors: Duarte-Pereira Sara; Fajarda Olga; Matos Sérgio; Oliveira José Luís; Silva Raquel Monteiro
Publication venue: MDPI AG
Publication date: 01/12/2021

The nicotinate phosphoribosyltransferase (NAPRT) gene has gained relevance in the research of cancer therapeutic strategies because it encodes a key NAD biosynthetic enzyme. NAD metabolism is an attractive target for the development of anti-cancer therapies, given the high energy requirements of proliferating cancer cells and NAD-dependent signaling. A few studies have shown that NAPRT expression varies in different cancer types, making it imperative to assess NAPRT expression and functionality status prior to the application of therapeutic strategies targeting NAD. In addition, the recent finding of an extracellular form of NAPRT (eNAPRT) suggested the involvement of NAPRT in inflammation and signaling. However, the mechanisms regulating NAPRT gene expression have never been thoroughly addressed. In this study, we searched for NAPRT gene expression regulatory mechanisms in transcription factor (TF), RNA binding protein (RBP) and microRNA (miRNA) databases. We identified several potential regulators of NAPRT transcription activation, downregulation and alternative splicing, and performed GO and expression analyses. The results of the functional analysis of TFs, RBPs and miRNAs suggest new, unexpected functions for the NAPRT gene in cell differentiation, development and neuronal biology.

Directory of Open Access Journals

PubMed Central

Repositório Institucional da Universidade Católica Portuguesa

Methodology to identify a gene expression signature by merging microarray datasets

Authors: Almeida João Rafael; Duarte-Pereira Sara; Fajarda Olga; Oliveira José Luís; Silva Raquel M.
Publication venue: Elsevier BV
Publication date: 01/06/2023

A vast number of microarray datasets have been produced as a way to identify differentially expressed genes and gene expression signatures. A better understanding of these biological processes can help in the diagnosis and prognosis of diseases, as well as in the therapeutic response to drugs. However, most of the available datasets contain a small number of samples, leading to low statistical, predictive and generalization power. One way to overcome this problem is to merge several microarray datasets into a single dataset, which is typically a challenging task. Statistical methods or supervised machine learning algorithms are usually used to determine gene expression signatures. Nevertheless, statistical methods require an arbitrary threshold to be defined, and supervised machine learning methods can be ineffective when applied to high-dimensional datasets like microarrays. We propose a methodology to identify gene expression signatures by merging microarray datasets. This methodology uses statistical methods to obtain several sets of differentially expressed genes and uses supervised machine learning algorithms to select the gene expression signature. The methodology was validated using two distinct research applications: one using heart failure and the other using autism spectrum disorder microarray datasets. For the first, we obtained a gene expression signature composed of 117 genes, with a classification accuracy of approximately 98%. For the second, we obtained a gene expression signature composed of 79 genes, with a classification accuracy of approximately 82%. The methodology was implemented in the R language and is available, under the MIT licence, at https://github.com/bioinformatics-ua/MicroGES.

Repositório Institucional da Universidade Católica Portuguesa

GTO : A toolkit to unify pipelines in genomic and proteomic research

Authors: Almeida Joao R.; Fajarda Olga; Oliveira Jose L.; Pinho Armando J.; Pratas Diogo
Publication venue
Publication date: 01/01/2020

Next-generation sequencing triggered the production of a massive volume of publicly available data and the development of new specialised tools. These tools are dispersed over different frameworks, making the management and analyses of the data a challenging task. Additionally, new targeted tools are needed, given the dynamics and specificities of the field. We present GTO, a comprehensive toolkit designed to unify pipelines in genomic and proteomic research, which combines specialised tools for analysis, simulation, compression, development, visualisation, and transformation of the data. This toolkit combines novel tools with a modular architecture, being an excellent platform for experimental scientists, as well as a useful resource for teaching bioinformatics enquiry to students in life sciences. GTO is implemented in C language and is available, under the MIT license, at https://bioinformatics.ua.pt/gto. © 2020 The Authors. Published by Elsevier B.V. Peer reviewed.

Helsingin yliopiston digitaalinen arkisto

Portuguese Twitter dataset on COVID-19

Authors: Fajarda Olga; Jonker Richard A.A.; Lopes Rui Pedro; Matos Sérgio; Oliveira José Luís; Poudel Roshan
Publication venue: IEEE
Publication date: 01/01/2022

Over the last two years, the COVID-19 pandemic has affected hundreds of millions of people around the world. As in many crises, people turn to social media platforms, like Twitter, to communicate and share information. Twitter datasets have been used over the years in many research studies to extract valuable information, and several large COVID-19 Twitter datasets have been released over the last two years. However, none of these datasets contains only Portuguese Tweets, despite Portuguese being reported as one of the top five languages used on Twitter. In this paper, we present the first large-scale Portuguese COVID-19 Twitter dataset. The dataset contains over 19 million Tweets spanning 2020 and 2021, allowing the entire pandemic to be analyzed. We also conducted a sentiment analysis on the dataset and correlated the spikes in Tweet count and sentiment scores with news articles and government announcements in Portugal and Brazil. The dataset is available at: https://github.com/bioinformaticsua/Portuguese-Covid19-Dataset. This work was supported by FCT – Fundação para a Ciência e Tecnologia within project DSAIPA/AI/0088/2020.

Biblioteca Digital do IPB

MIP model-based heuristics for the minimum weighted tree reconstruction problem

Authors: Fajarda Olga; Requejo Cristina
Publication venue: Springer Science and Business Media LLC
Publication date: 31/07/2023

We consider the Minimum Weighted Tree Reconstruction (MWTR) problem and two matheuristic methods to obtain optimal or near-optimal solutions: the Feasibility Pump heuristic and the Local Branching heuristic. These matheuristics are based on a Mixed Integer Programming (MIP) model used to find feasible solutions. We discuss the applicability and effectiveness of the matheuristics to obtain solutions to the MWTR problem. The purpose of the MWTR problem is to find a minimum weighted tree connecting a set of leaves in such a way that the length of the path between each pair of leaves is greater than or equal to a given distance between that pair of leaves. The Feasibility Pump matheuristic starts with the Linear Programming solution, then iteratively fixes the values of some variables and solves the corresponding problem until a feasible solution is achieved. The Local Branching matheuristic, in turn, improves a feasible solution by using a local search. Computational results using two different sets of instances, one from the phylogenetic area and another from the telecommunications area, show that these matheuristics are quite effective in finding feasible solutions with small gap values. Each matheuristic can be used independently; however, the best results are obtained when they are used together. For instances of the problem having up to 17 leaves, the feasible solution obtained by the Feasibility Pump heuristic is improved by the Local Branching heuristic. Notably, since existing model-based approaches solve instances having up to 15 leaves, the matheuristics increase the size of the instances that can be solved.
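
The MWTR feasibility condition stated in this abstract (every leaf-to-leaf path must be at least as long as the given distance) can be checked with a short Python sketch. It is a minimal illustration under stated assumptions, not the authors' MIP model or either matheuristic: the function names, the edge-list representation, and the three-leaf instance are all hypothetical.

```python
def leaf_path_lengths(edges, leaves):
    """Return {(u, v): path length} for every leaf pair with u < v in a weighted tree.

    edges is a list of (node, node, weight) triples describing the tree.
    """
    adjacency = {}
    for u, v, w in edges:
        adjacency.setdefault(u, []).append((v, w))
        adjacency.setdefault(v, []).append((u, w))

    def distances_from(source):
        # Depth-first traversal; in a tree each node is reached by a unique path.
        dist = {source: 0.0}
        stack = [source]
        while stack:
            node = stack.pop()
            for neighbour, weight in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + weight
                    stack.append(neighbour)
        return dist

    lengths = {}
    for u in leaves:
        dist = distances_from(u)
        for v in leaves:
            if u < v:
                lengths[(u, v)] = dist[v]
    return lengths


def is_feasible(edges, leaves, required, tol=1e-9):
    """True when every leaf-to-leaf path is at least the required distance."""
    lengths = leaf_path_lengths(edges, leaves)
    return all(lengths[pair] + tol >= d for pair, d in required.items())


def total_weight(edges):
    """Objective value of a candidate tree: the sum of its edge weights."""
    return sum(w for _, _, w in edges)


# Hypothetical instance: three leaves joined through one internal node "x".
edges = [("a", "x", 1.0), ("b", "x", 2.0), ("c", "x", 3.0)]
leaves = ["a", "b", "c"]
required = {("a", "b"): 3.0, ("a", "c"): 4.0, ("b", "c"): 4.5}

print(is_feasible(edges, leaves, required), total_weight(edges))
```

A checker like this only verifies a candidate tree; finding a minimum weight tree among all feasible ones is the optimisation problem the abstract addresses.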

Repositório Institucional da Universidade de Aveiro