13 research outputs found

A new approach for the construction of historical databases—NoSQL Document-oriented databases: the example of AtlantoCracies

Author: Díaz-Ordóñez Manuel
Rodríguez Baena Domingo S.
Yun-Casalilla Bartolomé
Publication venue: Oxford
Publication date: 01/01/2023

This article proposes and justifies the use of document-oriented databases as a flexible, easy-to-use, and powerful digital tool for historical research. First, it studies the reasons that have made relational databases the predominant instrument among historians and details the problems involved in their use. Next, it explains how historians have tried to address these problems with other digital tools, and the limitations of those tools. A case study of European aristocratic networks in early modern times then shows that document-oriented databases offer notable advantages and have greater explanatory power for the historian's work. Thanks to their flexibility, they are better adapted to the often-unpredictable nature of historical sources without diminishing their ease of use or their analytical potential. Funding: Junta de Andalucía UPO-1264973; Junta de Andalucía HUM 100

idUS. Depósito de Investigación Universidad de Sevilla

Clustering Main Concepts from e-Mails

Author: Aguilar Ruiz Jesús Salvador
Cohen Paul R.
Riquelme Santos José Cristóbal
Rodríguez Baena Domingo S.
Publication venue
Publication date: 01/01/2003

E-mail is one of the most common ways to communicate, accounting in some cases for up to 75% of a company's communication, with every employee spending about 90 minutes a day on e-mail tasks such as filing and deleting. This paper deals with the generation of clusters of relevant words from e-mail texts. Our approach applies text mining techniques and then data mining techniques to obtain related concepts extracted from sent and received messages. We have developed a new neighborhood-based clustering algorithm, which takes into account the similarity values among words obtained in the text mining phase. The potential of these applications is enormous, and only a few companies, mainly large organizations, have invested in this kind of project so far, taking advantage of employees' knowledge in future decisions.

idUS. Depósito de Investigación Universidad de Sevilla

Ensemble and Greedy Approach for the Reconstruction of Large Gene Co-Expression Networks

Author: Delgado-Chaves Fernando M
Federico Davinia
García-Torres Miguel
Gómez-Vela Francisco
Rodríguez-Baena Domingo S
Publication venue: Biosaia: Revista de los másteres de Biotecnología Sanitaria y Biotecnología Ambiental, Industrial y Alimentaria
Publication date: 19/03/2020

In recent years, the vast amount of genetic information generated by high-throughput approaches has led to the need for new data handling methods. The integrative analysis of gene information of diverse nature could provide a much-sought overview for studying complex biological systems and processes. In this sense, Co-expression Gene Networks (CGN) have become a powerful tool in the comprehensive analysis of gene expression. Such networks represent relationships between genes (or gene products) by means of a graph composed of nodes and edges, where nodes represent genes and edges the relationships among them. Among the main features of CGN, sparseness and scale-free topology may notably affect subsequent network analysis. Within this framework, structure optimization techniques are also important for reducing the size of the networks, not only improving their topology but also keeping a positive prediction ratio. On the other hand, ensemble strategies have significantly improved the precision of results by combining different measures or methods. In this work, we present Ensemble and Greedy networks (EnGNet), a novel two-step method for CGN inference. First, EnGNet uses an ensemble strategy for co-expression network generation. The final score is estimated by majority voting among three different methods, i.e. the Spearman and Kendall coefficients and Normalized Mutual Information. Second, a greedy algorithm optimizes both the size and the topological features of the network. The results show not only that this method is able to obtain reliable networks, but also that it significantly improves the topology of the networks. Moreover, the usefulness of the method is demonstrated by an application to a human dataset on post-traumatic stress disorder (PTSD), revealing an innate immunity-mediated response to this pathology in accordance with previous studies.
These results are indicative of the potential of CGN, and EnGNet in particular, for unveiling the genetic causes of complex diseases. Finally, the implications of CGN for biomarker discovery could lead research towards earlier detection and effective treatment of these diseases.

Revistas UPO (Universidad Pablo de Olavide)
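
The ensemble step described in the EnGNet abstract above (majority voting among the Spearman and Kendall coefficients and Normalized Mutual Information) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.7 threshold, the equal-width binning used for mutual information, and the use of Kendall's tau-a are all assumptions made for illustration, and the second (greedy optimization) step is not shown.

```python
from collections import Counter
from itertools import combinations
from math import log, sqrt


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    pairs = list(combinations(range(len(x)), 2))
    score = 0
    for i, j in pairs:
        s = (x[i] - x[j]) * (y[i] - y[j])
        score += (s > 0) - (s < 0)
    return score / len(pairs) if pairs else 0.0


def discretize(values, bins):
    """Equal-width binning (an assumption; other schemes are possible)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def entropy(labels):
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())


def nmi(x, y, bins=3):
    """Mutual information normalized by sqrt(H(X) * H(Y))."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    hx, hy = entropy(dx), entropy(dy)
    mi = hx + hy - entropy(list(zip(dx, dy)))
    return mi / sqrt(hx * hy) if hx and hy else 0.0


def majority_vote_edges(expr, threshold=0.7):
    """Keep a gene pair when at least two of the three measures agree."""
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        x, y = expr[g1], expr[g2]
        votes = sum(abs(m(x, y)) >= threshold for m in (spearman, kendall, nmi))
        if votes >= 2:
            edges.append((g1, g2))
    return edges


# Hypothetical toy expression profiles (one list of samples per gene).
toy = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [2, 4, 6, 8, 10, 12],
    "c": [3, 6, 1, 5, 2, 4],
}
print(majority_vote_edges(toy))
```

In this toy input, genes a and b are perfectly monotonically related, so all three measures vote for the edge, while c fails both rank correlations against a and b and is left unconnected.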

CarGene: Characterisation of sets of genes based on metabolic pathways analysis

Author: Aguilar Ruiz Jesús Salvador
Díaz Díaz Norberto
Nepomuceno Chamorro Isabel de los Ángeles
Rodríguez Baena Domingo S.
Publication venue: Inderscience Publishers
Publication date: 01/01/2011

The large amount of available biological information provides scientists with an incomparable framework for testing the results of new algorithms. Several tools have been developed for analysing gene enrichment, and most of them are Gene Ontology-based. We developed a Kyoto Encyclopedia of Genes and Genomes (KEGG)-based tool that provides a friendly graphical environment for analysing gene enrichment. The tool integrates two statistical corrections and simultaneously analyses the information about many groups of genes in both visual and textual form. We tested the usefulness of our approach on a previous analysis (Huttenshower et al.). Furthermore, our tool is freely available (http://www.upo.es/eps/bigs/cargene.html). Funding: Ministerio de Ciencia y Tecnología TIN2007-68084-C02-00; Ministerio de Ciencia e Innovación PCI2006-A7-0575; Junta de Andalucía P07-TIC-02611; Junta de Andalucía TIC-20

idUS. Depósito de Investigación Universidad de Sevilla

Discovering α–patterns from gene expression data

Author: Aguilar Ruiz Jesús Salvador
Díaz Díaz Norberto
Nepomuceno Chamorro Isabel de los Ángeles
Rodríguez Baena Domingo S.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

Biclustering techniques aim to find subsets of genes that show similar activity patterns under a subset of conditions. In this paper we characterize a specific type of pattern, which we call the α–pattern, and present a new biclustering algorithm specifically designed to find α–patterns, in which the gene expression values evolve across the experimental conditions showing similar behavior inside a band that ranges from 0 up to a predefined threshold called α. The α value guarantees co-expression among genes. We have tested our method on the Yeast dataset and compared the results to the biclustering algorithms of Cheng & Church (2000) and Aguilar & Divina (2005). The results show that the algorithm finds interesting biclusters, grouping genes with similar behavior while maintaining a very low mean squared residue.

idUS. Depósito de Investigación Universidad de Sevilla
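
The band condition in the α–pattern abstract above can be illustrated with a small check. The abstract does not give the formal definition, so this sketch assumes one plausible reading: a set of genes forms an α–pattern over a set of conditions when, at every one of those conditions, the spread (maximum minus minimum) of their expression values is at most α. The function name and the toy values are hypothetical, and the search for such biclusters (the algorithm itself) is not shown.

```python
def is_alpha_pattern(expr, genes, conditions, alpha):
    """Return True if, at every chosen condition, the expression values of
    the chosen genes stay within a band of width alpha (assumed reading)."""
    for c in conditions:
        column = [expr[g][c] for g in genes]
        if max(column) - min(column) > alpha:
            return False
    return True


# Hypothetical toy data: one list of expression values per gene,
# indexed by condition.
toy_expr = {
    "g1": [1.0, 2.0, 5.0],
    "g2": [1.2, 2.3, 9.0],
    "g3": [1.1, 1.9, 5.2],
}

# The three genes stay within 0.5 of each other on conditions 0 and 1,
# but g2 leaves the band on condition 2.
print(is_alpha_pattern(toy_expr, ["g1", "g2", "g3"], [0, 1], alpha=0.5))
print(is_alpha_pattern(toy_expr, ["g1", "g2", "g3"], [0, 1, 2], alpha=0.5))
```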

Neighborhood-Based Clustering of Gene-Gene Interactions

Author: Aguilar Ruiz Jesús Salvador
Díaz Díaz Norberto
Nepomuceno Chamorro Isabel de los Ángeles
Rodríguez Baena Domingo S.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2006

In this work, we propose a new greedy clustering algorithm to identify groups of related genes. Clustering algorithms usually analyze genes in order to group those with similar behavior. Our approach instead groups pairs of genes that present similar positive and/or negative interactions. It has some interesting properties. For instance, the user can specify how the range of each gene is to be segmented into labels, some of which will mean expressed or inhibited (depending on the gradation). A function then transforms each pair of labels, across all label combinations, into a new label that identifies the type of interaction. From these pairs of genes and their interactions we build clusters in a greedy, iterative fashion, treating two pairs of genes as similar if they have the same number of relevant interactions. Initial two-gene clusters grow iteratively based on their neighborhood until the set of clusters no longer changes. The algorithm allows the researcher to modify all the criteria (the discretization mapping function, the gene-gene mapping function, and the filtering function) and provides considerable flexibility to obtain clusters at the level of precision needed. The performance of our approach is experimentally tested on the yeast dataset. The final number of clusters is low, and the genes within them show a significant level of cohesion, as shown graphically in the experiments.

idUS. Depósito de Investigación Universidad de Sevilla

Análisis de datos de expresión genética

Author: Díaz Díaz Norberto
Giráldez Raúl
Pontes Balanza Beatriz
Rodríguez Baena Domingo S.
Publication venue: Editorial Universidad de Almeria
Publication date: 01/01/2006

Gene expression data analysis is one of the fundamental tasks in bioinformatics. Carrying out this analysis requires the application of data mining techniques. Clustering techniques have proven very useful for discovering groups of genes that take part in the same cellular function or are regulated in the same way. Recently, biclustering has been proposed as a method for discovering specific behavior patterns in which the expression value of a subgroup of genes evolves in the same way across a subgroup of laboratory conditions. This article reviews the different techniques used in gene expression data analysis, examines biclustering-based methods in depth, and discusses the different validation methods for evaluating the models generated by the various proposals.

idUS. Depósito de Investigación Universidad de Sevilla

El trabajo autónomo como herramienta didáctica

Author: Aguilar-Ruiz Jesús S.
Barranco Carlos D.
Divina Federico
Rodríguez Baena Domingo Savio
Publication venue
Publication date: 01/01/2012

The aim of this article is to present three case studies, from three courses of the degree in Ingeniería Técnica en Informática de Gestión (Technical Engineering in Management Computing) at the Universidad Pablo de Olavide, in which students' autonomous work was the tool used to address the problems caused by the reduction in class hours resulting from the implementation of the European Higher Education Area (EEES in Spanish). These problems were more severe in the blended-learning mode of the degree, in which students, who are usually in active employment, have the required attendance hours reduced by 50% to make it easier to combine study and work. The results, in terms of pass rates and dropout percentages, show an improvement in the outcomes of the courses, corroborating the usefulness of well-designed autonomous work. Peer-reviewed article.

Repositorio Institucional Olavide

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

A Deterministic Model to Infer Gene Networks from Microarray Data

Author: Aguilar Ruiz Jesús Salvador
Díaz Díaz Norberto
García Gutiérrez Jorge
Nepomuceno Chamorro Isabel de los Ángeles
Rodríguez Baena Domingo S.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

Microarray experiments help researchers to construct the structure of gene regulatory networks, i.e., networks representing relationships among different genes. Filtering and knowledge extraction processes are necessary in order to handle the huge amount of data produced by microarray technologies. We propose regression tree techniques as a method to identify gene networks. Regression trees are a very useful technique for estimating the numerical values of the target outputs. They are very often more precise than linear regression models because they can fit different linear regressions to separate areas of the search space. In our approach, we generate a single regression tree for each gene in a set of genes, taking the remaining genes as input, and finally build a graph from all the relationships among output and input genes. In this paper, we simplify the approach by setting a single seed, the gene ARN1, and building the graph around it. The final model might give some clues for understanding the dynamics, the regulation, or the topology of the gene network from one (or several) seeds, since it gathers relevant genes with accurate connections. The performance of our approach is experimentally tested on the yeast Saccharomyces cerevisiae dataset (Rosetta compendium).

idUS. Depósito de Investigación Universidad de Sevilla