A Machine Learning Approach to Reduce Dimensional Space in Large Datasets
Computing over large datasets is both a research problem and a major challenge: massive amounts of data must be mined and processed because they constitute a valuable source of information across different and overlapping domains, and therefore an irreplaceable opportunity. Moreover, the growing number of environments that rely on data-intensive computation requires calculations more complex than those applied to grid-based infrastructures. This paper therefore analyzes the algorithms most commonly applied to the problem of handling large datasets, where part of the research effort focuses on reducing the dimensional space. We then present a novel machine learning method that reduces the dimensional space of large datasets. The approach is carried out in several phases: merging all datasets into a single one, performing the Extract, Transform and Load (ETL) process, applying the Principal Component Analysis (PCA) algorithm together with machine learning techniques, and finally displaying the results by means of dashboards. The major contribution of this paper is a novel five-phase architecture that implements a hybrid machine learning method for reducing dimensional space in large datasets. To verify the correctness of our proposal, we present a case study on a complex dataset, specifically an epileptic seizure recognition database. The experiments carried out are very promising, with encouraging results that could be applied to a great number of different domains. This work was partially funded by Grant RTI2018-094283-B-C32, ECLIPSE-UA (Spanish Ministry of Education and Science), and in part by the Lucentia AGI Grant.
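The paper does not publish its implementation; as a minimal sketch, the PCA phase it describes can be reproduced with plain NumPy (all variable names and the toy data below are illustrative assumptions):

```python
import numpy as np

def reduce_dimensions(X, n_components):
    """Project rows of X onto the top principal components.

    A plain-NumPy PCA: center the data, take the SVD, and keep the
    leading right singular vectors as the projection basis.
    """
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)          # PCA requires centered data
    # Rows of Vt are the principal axes, ordered by explained variance
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T  # scores in the reduced space

# Toy example: 100 samples with 10 correlated features -> 2 components
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 10))
Z = reduce_dimensions(X, 2)
print(Z.shape)  # (100, 2)
```

Because the singular values come out in descending order, the first retained component always carries at least as much variance as the second, which is the property the dimensionality-reduction phase relies on.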
This work was partially funded by the GENDER-NET Plus Joint Call on Gender and UN Sustainable Development Goals (European Commission - Grant Agreement 741874), funded in Spain by the “La Caixa” Foundation (ID 100010434) under code LCF/PR/DE18/52010001 to MTH
An Ontology-Oriented Architecture for Dealing With Heterogeneous Data Applied to Telemedicine Systems
Current trends in medicine regarding accessibility, the quantity and quality of information, and quality of service differ greatly from those of former decades. The current state requires new methods for dealing with the enormous amounts of data present and growing on the Web and in other heterogeneous sources such as sensors, social networks and unstructured data, normally referred to as big data. Traditional approaches are not enough, at least on their own, although they were frequently used in hybrid architectures in the past. In this paper, we propose an architecture to process big data, including heterogeneous sources of information. We have defined an ontology-oriented architecture in which a core ontology serves as a knowledge base and enables the integration of different heterogeneous sources. We use natural language processing and artificial intelligence methods to process and mine data in the health sector and uncover the knowledge hidden in diverse data sources. Our approach has been applied to the field of personalized medicine (the study, diagnosis and treatment of diseases customized for each patient) and has been used in a telemedicine system. A case study focused on diabetes is presented to prove the validity of the proposed model. This work was supported in part by the Spanish Ministry of Economy and Competitiveness (MINECO) under Project SEQUOIA-UA (TIN2015-63502-C3-3-R) and Project RESCATA (TIN2015-65100-R), and in part by the Spanish Research Agency (AEI) and the European Regional Development Fund (FEDER) under Project CloudDriver4Industry (TIN2017-89266-R)
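The paper's core ontology is not reproduced in the abstract; the toy sketch below only illustrates the general idea of ontology-driven integration, mapping fields from heterogeneous health sources onto shared concepts. The field names and concept map are invented for illustration:

```python
# Toy sketch of ontology-driven data integration: heterogeneous source
# fields are normalised to canonical concepts before merging.
# CONCEPT_MAP and all field names below are illustrative assumptions.

CONCEPT_MAP = {
    # source field -> canonical ontology concept
    "glucose_mgdl": "blood_glucose",
    "bg":           "blood_glucose",
    "hr":           "heart_rate",
    "pulse_bpm":    "heart_rate",
}

def integrate(records):
    """Merge records from different sources under canonical concept names."""
    merged = {}
    for source, fields in records.items():
        for field, value in fields.items():
            concept = CONCEPT_MAP.get(field)
            if concept is not None:
                merged.setdefault(concept, []).append((source, value))
    return merged

sensor = {"sensor_feed": {"bg": 110, "pulse_bpm": 72}}
ehr = {"ehr_export": {"glucose_mgdl": 105}}
merged = integrate({**sensor, **ehr})
print(merged)
```

A real system would express CONCEPT_MAP as ontology axioms rather than a Python dict, but the shape of the integration step is the same: distinct source vocabularies collapse onto one shared one.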
An IoT-Based Computational Framework for Healthcare Monitoring in Mobile Environments
The new Internet of Things paradigm allows small devices with sensing, processing and communication capabilities to be designed, enabling the development of sensors, embedded devices and other ‘things’ ready to understand the environment. In this paper, a distributed framework based on the Internet of Things paradigm is proposed for monitoring human biomedical signals in activities involving physical exertion. The main advantage and novelty of the proposed system is the flexibility of computing the health application using resources from the devices available inside the user's body area network. The proposed framework can be applied to other mobile environments, especially those with intensive data acquisition and high processing needs. Finally, we present a case study to validate our proposal, which consists of monitoring footballers' heart rates during a football match. The real-time data acquired by these devices serve a clear social objective: predicting not only situations of sudden death but also possible injuries. This work has been partially funded by the Spanish Ministry of Economy and Competitiveness (MINECO/FEDER) under the granted Project SEQUOIA-UA (Management requirements and methodology for Big Data analytics) TIN2015-63502-C3-3-R, by the University of Alicante, within the program of support for research, under project GRE14-10, and by the Conselleria de Educación, Investigación, Cultura y Deporte, Comunidad Valenciana, Spain, within the program of support for research, under project GV/2016/087. This work has also been partially funded by Vicerrectorado de Innovación, University of Alicante, Spain (Vigrob)
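The abstract does not specify how abnormal readings are detected; a minimal sketch of the kind of on-device monitoring it describes might keep a sliding window of heart-rate samples and flag sudden deviations. The window size, limits and thresholds below are hypothetical:

```python
from collections import deque

# Illustrative sketch (not the paper's framework): a body-area-network
# node keeps a sliding window of heart-rate samples and raises an alert
# when a reading is out of range or deviates sharply from the recent mean.

class HeartRateMonitor:
    def __init__(self, window=10, bpm_limits=(40, 200), jump=40):
        self.samples = deque(maxlen=window)
        self.low, self.high = bpm_limits
        self.jump = jump  # max plausible change vs. the window mean

    def push(self, bpm):
        """Return an alert string for an abnormal reading, else None."""
        alert = None
        if not self.low <= bpm <= self.high:
            alert = "out_of_range"
        elif self.samples:
            mean = sum(self.samples) / len(self.samples)
            if abs(bpm - mean) > self.jump:
                alert = "sudden_change"
        self.samples.append(bpm)
        return alert

m = HeartRateMonitor()
alerts = [m.push(r) for r in [70, 72, 71, 140, 73]]
print(alerts)  # the spike to 140 triggers "sudden_change"
```

Running such a check locally on a device inside the body area network, rather than in the cloud, is what gives the framework its real-time character.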
Coordination and monitoring of blended teaching in the Master's Degree in Computer Engineering
The Master's Degree in Computer Engineering at the University of Alicante is regulated according to the recommendations established for the organization of Master's studies in the field of Computer Engineering. It offers advanced training in computing technologies that qualifies students for the preparation, planning, direction and coordination of projects, as well as their technical and economic management, in all areas of computer engineering, following quality and environmental criteria. The main purpose of this teaching-research work is the monitoring and coordination of blended teaching in the subjects of the Master's Degree in Computer Engineering, covering the teaching methodology as well as the materials and the students' workload. Since blended delivery was introduced for the first time this academic year, coordination among all subjects and monitoring of academic progress are especially important in order to detect and solve any problems that may arise and to establish an improvement plan that allows continuous improvement of the degree. To this end, coordination meetings were held with all the subject coordinators of the Master's programme, together with meetings with the students to check academic progress throughout the year
Blended teaching in the Master's Degree in Computer Engineering
This article describes the work carried out by the university teaching research network called “Blended teaching in the Master's Degree in Computer Engineering”, whose aim has been to work on the different subjects of the Master's Degree in Computer Engineering at the University of Alicante in order to give them a blended character in a coordinated and integrated way. A working group was created within the academic committee of the Master's programme, and close collaboration was promoted among the coordinators of all its subjects in using the mechanisms needed to give the respective subjects a blended character. The support received from the ICE in this regard has been very important, for example through the request for and delivery of a specific course on bLearning
Time to Switch to Second-line Antiretroviral Therapy in Children With Human Immunodeficiency Virus in Europe and Thailand.
Background: Data on the durability of first-line antiretroviral therapy (ART) in children with human immunodeficiency virus (HIV) are limited. We assessed time to switch to second-line therapy in 16 European countries and Thailand. Methods: Children aged <18 years initiating combination ART (≥2 nucleoside reverse transcriptase inhibitors [NRTIs] plus a nonnucleoside reverse transcriptase inhibitor [NNRTI] or boosted protease inhibitor [PI]) were included. Switch to second-line was defined as (i) a change across drug class (PI to NNRTI or vice versa) or within the PI class plus a change of ≥1 NRTI; (ii) a change from single to dual PI; or (iii) the addition of a new drug class. Cumulative incidence of switch was calculated with death and loss to follow-up as competing risks. Results: Of 3668 children included, the median age at ART initiation was 6.1 (interquartile range [IQR], 1.7-10.5) years. Initial regimens were 32% PI based, 34% nevirapine (NVP) based, and 33% efavirenz based. Median duration of follow-up was 5.4 (IQR, 2.9-8.3) years. Cumulative incidence of switch at 5 years was 21% (95% confidence interval, 20%-23%), with significant regional variations. Median time to switch was 30 (IQR, 16-58) months; two-thirds of switches were related to treatment failure. In multivariable analysis, older age, severe immunosuppression and higher viral load (VL) at ART start, and NVP-based initial regimens were associated with an increased risk of switch. Conclusions: One in 5 children switched to a second-line regimen by 5 years of ART, two-thirds of them for treatment failure. Advanced HIV, older age, and NVP-based regimens were associated with an increased risk of switch
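No analysis code accompanies the abstract; as a sketch, cumulative incidence with competing risks, as used in the study, can be estimated with the Aalen-Johansen approach. The toy cohort below is invented for illustration:

```python
def cumulative_incidence(times, events, target=1):
    """Aalen-Johansen cumulative incidence for one event type.

    events: 0 = censored, 1 = event of interest (e.g. switch),
            2 = competing event (e.g. death / loss to follow-up).
    Returns [(time, CIF)] at each distinct event/censoring time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0   # overall event-free survival just before t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_target = d_any = removed = 0
        while i < len(data) and data[i][0] == t:  # handle tied times
            ev = data[i][1]
            d_target += ev == target
            d_any += ev != 0
            removed += 1
            i += 1
        cif += surv * d_target / at_risk   # target hazard weighted by S(t-)
        surv *= 1 - d_any / at_risk        # any-event Kaplan-Meier update
        at_risk -= removed
        curve.append((t, cif))
    return curve

# Toy cohort: a switch at t=1, a competing death at t=2,
# a censoring at t=3, and a switch at t=4.
curve = cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1])
print(curve)
```

Unlike a naive Kaplan-Meier on the target event alone, this estimator keeps children who die or are lost to follow-up from inflating the switch probability, which is why the study treats those outcomes as competing risks.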
A knowledge-based textual entailment approach applied to the QA answer validation at CLEF 2006
The Answer Validation Exercise (AVE) is a pilot track within the Cross-Language Evaluation Forum (CLEF) 2006. The AVE competition provides an evaluation framework for answer validation in Question Answering (QA). For our participation in AVE, we propose a system initially used for another task, Recognising Textual Entailment (RTE). The aim of our participation is to evaluate the improvement our system brings to QA. Moreover, because these two tasks (AVE and RTE) share the same main idea, namely finding semantic implications between two fragments of text, our system could be applied directly to the AVE competition. Our system is based on representing the texts by means of logic forms and computing a semantic comparison between them. This comparison is carried out using two different approaches: the first is guided by a deeper study of the WordNet relations, while the second uses the measure defined by Lin to compute the semantic similarity between the logic-form predicates. Moreover, we have also designed a voting strategy between our system and the MLEnt system, also presented by the University of Alicante, with the aim of obtaining a joint execution of the two systems developed there. Although the results obtained have not been very high, we consider them quite promising, which supports the view that there is still much research to be done on every kind of textual entailment. This research has been partially funded by the Spanish Government under CICyT project number TIC2003-07158-C04-01
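Lin's measure, used in the second approach, is defined as 2·IC(lcs)/(IC(a)+IC(b)), where IC is information content and lcs is the most informative common ancestor. A self-contained sketch with an invented toy taxonomy (the paper computes this over WordNet):

```python
import math

# Minimal sketch of Lin's similarity measure.  The tiny taxonomy and
# corpus counts below are invented for illustration only.

PARENT = {"dog": "mammal", "cat": "mammal", "mammal": "entity",
          "entity": None}
COUNT = {"dog": 10, "cat": 10, "mammal": 40, "entity": 100}
TOTAL = 100

def ic(concept):
    """Information content: -log of the concept's corpus probability."""
    return -math.log(COUNT[concept] / TOTAL)

def ancestors(c):
    out = []
    while c is not None:
        out.append(c)
        c = PARENT[c]
    return out

def lin(a, b):
    """Lin similarity: 2*IC(lcs) / (IC(a) + IC(b))."""
    common = [c for c in ancestors(a) if c in ancestors(b)]
    lcs = max(common, key=ic)          # most informative shared ancestor
    return 2 * ic(lcs) / (ic(a) + ic(b))

print(round(lin("dog", "cat"), 3))  # 0.398
```

Identical concepts score 1.0 (the lcs is the concept itself), and the score falls as the shared ancestor becomes more generic, which is what makes the measure usable for comparing logic-form predicates.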
Applying logic forms and statistical methods to CL-SR performance
This paper describes a CL-SR system that employs two different techniques: the first is based on NLP rules that apply logic forms to topic processing, while the second consists of applying the IR-n statistical search engine to the spoken document collection. Applying logic forms to the topics makes it possible to increase the weight of topic terms according to a set of syntactic rules. These term weights are then used by the IR-n system in the information retrieval process. This work has been partially supported by the Spanish Government (CICYT) with grant TIC2003-07158-C04-01
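The paper's syntactic rules are not reproduced in the abstract; as a hypothetical sketch, the weighting idea amounts to boosting each topic term by its role in the parsed logic form before handing it to the search engine. The roles and boost factors below are invented:

```python
# Hypothetical sketch of logic-form-based term weighting: each topic
# term is boosted by its syntactic role before retrieval.  The roles
# and boost factors are illustrative, not the paper's actual rules.

ROLE_BOOST = {"head_noun": 2.0, "verb": 1.5, "modifier": 1.2}

def weight_topic(parsed_terms, base=1.0):
    """Map each (term, role) pair to a boosted retrieval weight."""
    return {term: base * ROLE_BOOST.get(role, 1.0)
            for term, role in parsed_terms}

topic = [("war", "head_noun"), ("describe", "verb"), ("cold", "modifier")]
print(weight_topic(topic))  # {'war': 2.0, 'describe': 1.5, 'cold': 1.2}
```

The engine then treats these weights as per-term multipliers in its ranking function, so syntactically central terms dominate retrieval.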
Applying NLP-based techniques to the treatment of medical questions in Question Answering
Nowadays there is a growing research interest in question answering over restricted domains. This paper details the question-analysis module of our question answering system for the medical domain. The module is based on sophisticated NLP techniques and uses the UMLS Metathesaurus as its knowledge source. Its main NLP technique is the computational treatment of the logic form of the question, combined with pattern matching
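As a simplified sketch of the question-analysis step, pattern matching can classify a medical question by its expected answer type. The patterns and categories below are hypothetical; the paper works over logic forms and the UMLS Metathesaurus rather than raw regexes:

```python
import re

# Illustrative sketch of question analysis: regex patterns classify a
# medical question by expected answer type.  Patterns and category
# names are invented for illustration.

PATTERNS = [
    (r"^what (is|are)\b.*\b(symptom|sign)s?\b", "SYMPTOM"),
    (r"^what (is|are)\b.*\btreatment",          "TREATMENT"),
    (r"^(what|which) drug",                     "DRUG"),
    (r"^what (is|are)\b",                       "DEFINITION"),
]

def classify(question):
    q = question.lower().strip()
    for pattern, category in PATTERNS:
        if re.search(pattern, q):
            return category  # first matching pattern wins
    return "UNKNOWN"

print(classify("What are the symptoms of diabetes?"))  # SYMPTOM
print(classify("What is the treatment for asthma?"))   # TREATMENT
```

Ordering the patterns from most to least specific matters: the generic DEFINITION pattern would otherwise swallow the SYMPTOM and TREATMENT questions.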
Architecture of a multi-modal dialogue system oriented to multilingual question-answering
In this paper, a proposal for a multi-modal dialogue system oriented to multilingual question answering is presented. The system includes the following means of access: voice, text, avatar, gestures and sign language. The proposal is oriented to the question-answering task as a user-interaction mechanism. The proposal presented here is in the early stages of development, and the architecture is presented for the first time, building on previously developed experience in question answering and dialogue systems. The main objective of this research work is the development of a solid platform that will permit the modular integration of the proposed architecture. This poster has been partially supported by the TABIMED Project (TABIMED 03)
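The modular integration the proposal aims for can be sketched as a registry of per-modality handlers that normalise every input to text before it reaches the question-answering core. Handler names and behaviour below are invented for illustration:

```python
# Hypothetical sketch of modular modality integration: each input
# modality registers a handler that converts its payload to text.

HANDLERS = {}

def modality(name):
    """Decorator registering a handler for one input modality."""
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register

@modality("text")
def handle_text(payload):
    return payload.strip()

@modality("voice")
def handle_voice(payload):
    # A real system would run speech recognition here.
    return f"<transcribed:{payload}>"

def to_question(name, payload):
    """Dispatch a raw input to its modality handler."""
    return HANDLERS[name](payload)

print(to_question("text", "  Who discovered penicillin?  "))
```

New modalities (avatar, gestures, sign language) would plug in as additional registered handlers without touching the question-answering core, which is the sense in which the integration is modular.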