3 research outputs found

Statistical Algorithms for Ontology-based Annotation of Scientific Literature

Author: Chakrabarti Chayan
Jones Thomas B.
Liard Angela R.
Luger George F.
Turner Jessica
Turner Matthew
Xu Jiawei F.
Publication venue: ScholarWorks @ Georgia State University
Publication date: 01/01/2014

Background: Ontologies encode relationships within a domain in robust data structures that can be used to annotate data objects, including scientific papers, in ways that ease tasks such as search and meta-analysis. However, the annotation process requires significant time and effort when performed by humans. Text mining algorithms can facilitate this process, but their analysis relies mainly on keyword, synonym, and semantic matching; they do not leverage information embedded in an ontology’s structure. Methods: We present a probabilistic framework that facilitates the automatic annotation of literature by indirectly modeling the restrictions among the different classes in the ontology. Our research focuses on annotating human functional neuroimaging literature within the Cognitive Paradigm Ontology (CogPO). We use an approach that combines the stochastic simplicity of naïve Bayes with the formal transparency of decision trees. Our data structure is easily modifiable to reflect changing domain knowledge. Results: We compare our results across naïve Bayes, Bayesian Decision Tree, and Constrained Decision Tree classifiers that keep a human expert in the loop, using the micro-averaged F1 score as the quality measure. Conclusions: Unlike traditional text mining algorithms, our framework can model the knowledge encoded by the dependencies in an ontology, albeit indirectly. We successfully exploit the fact that CogPO has explicitly stated restrictions, as well as implicit dependencies in the form of patterns in the expert-curated annotations.
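The abstract above reports classifier quality as a micro-averaged F1 score. As a minimal sketch of how that measure is usually computed for multi-label annotation, the following pools true positives, false positives, and false negatives over all documents before taking the harmonic mean of precision and recall. The function name `micro_f1` and the label strings are hypothetical placeholders, not CogPO terms or the authors' code.

```python
from typing import List, Set


def micro_f1(gold: List[Set[str]], predicted: List[Set[str]]) -> float:
    """Micro-averaged F1 for multi-label annotations.

    gold[i] and predicted[i] are the label sets for document i.
    Counts are pooled across all documents and labels before the
    score is computed, which is what distinguishes micro from
    macro averaging.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # labels correctly assigned
        fp += len(p - g)   # labels assigned but not in the gold set
        fn += len(g - p)   # gold labels that were missed

    denominator = 2 * tp + fp + fn
    # With no labels at all there is nothing to score; return 0.0 by convention.
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical label names, for illustration only.
gold_labels = [{"label_a", "label_b"}, {"label_c"}]
predicted_labels = [{"label_a"}, {"label_c", "label_b"}]
score = micro_f1(gold_labels, predicted_labels)
```

Because counts are pooled, frequent labels weigh more heavily in the result than rare ones; a macro average would instead compute F1 per label and then average.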

Crossref

ScholarWorks @ Georgia State University

PubMed Central

DigitalCommons@Florida International University

Reaaliaikaiset ennustukset verkkopalveluissa (Real-time predictions in web services)

Author: Sailaranta Liisa
Publication date: 13/02/2017

In this Master's thesis, a real-time analytics pipeline is built to serve predictions to users based on the usage data and operational data of a Web service. The data of the service is analyzed and a predictive model is built using statistical learning methods. The pipeline is set up to serve the predictions in real time using components from Amazon Cloud Services. The aim is to show users a prediction of how long it will take until they receive a verdict on their application from the service. Additional goals are to study the dataset and its possibilities and to investigate the suitability of the Amazon Machine Learning service for real-time predictions in a Web context. The features for the predictive model are selected by exploring the dataset and using the Amazon Machine Learning service to evaluate the features. The Amazon Machine Learning service is also used to build the predictive machine learning model. The real-time analytics pipeline is built using Amazon components and following the Lambda Architecture guidelines. The best model performed better than the baseline model, though only moderately. The data lacked some information vital to the prediction target, such as information about the personnel handling the applications. Implementing the pipeline with Amazon components was considered straightforward. The Lambda Architecture worked well for the problem. The Amazon Machine Learning service was found to be easy to use, but its machine learning capabilities and user interface are limited. The thesis emphasizes that it is essential to explore and understand the dataset before designing or building the pipeline, as the pipeline design depends heavily on the data and on the use case.
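The abstract says the pipeline follows the Lambda Architecture, in which a batch view computed from historical data is combined at query time with a speed view that absorbs recent events. The following is a minimal, generic sketch of that merge step only. It does not reproduce the thesis pipeline and uses no Amazon service API; the per-category running mean stands in for the actual predictive model, and all names (`RunningMean`, `build_batch_view`, `predict_days`, the category strings) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class RunningMean:
    """Sum and count, so that two partial aggregates can be merged exactly."""
    total: float = 0.0
    count: int = 0

    def add(self, value: float) -> None:
        self.total += value
        self.count += 1

    def merged(self, other: "RunningMean") -> "RunningMean":
        return RunningMean(self.total + other.total, self.count + other.count)

    def mean(self) -> Optional[float]:
        return self.total / self.count if self.count else None


def build_batch_view(history: Iterable[Tuple[str, float]]) -> Dict[str, RunningMean]:
    """Batch layer: recompute the aggregate from the full historical dataset."""
    view: Dict[str, RunningMean] = {}
    for category, days_to_verdict in history:
        view.setdefault(category, RunningMean()).add(days_to_verdict)
    return view


def predict_days(
    category: str,
    batch_view: Dict[str, RunningMean],
    speed_view: Dict[str, RunningMean],
) -> Optional[float]:
    """Serving layer: merge batch and speed views when a query arrives."""
    combined = batch_view.get(category, RunningMean()).merged(
        speed_view.get(category, RunningMean())
    )
    return combined.mean()


# Historical records go through the batch layer.
batch = build_batch_view([("type_a", 10.0), ("type_a", 20.0)])

# Speed layer: events that arrived after the last batch run, updated incrementally.
speed: Dict[str, RunningMean] = {}
speed.setdefault("type_a", RunningMean()).add(30.0)

estimate = predict_days("type_a", batch, speed)
```

Storing sum and count rather than the mean itself is what makes the two views mergeable without reprocessing the history; after each batch run, the speed view would be reset.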

Aaltodoc Publication Archive

Fuzzy model predictive control. Complexity reduction by functional principal component analysis

Author: Escaño González Juan Manuel
Publication date: 10/12/2015

In Model-based Predictive Control, the controller runs a real-time optimisation to obtain the best solution for the control action. An optimisation problem is solved to identify the control action that minimises a cost function related to the process predictions. Because of the computational load of the algorithms, predictive control subject to constraints is not suitable to run on just any hardware platform. Predictive control techniques have been well known in the process industry for decades. The application of advanced model-based control techniques is becoming increasingly attractive in many other fields such as building automation, smartphones, and wireless sensor networks, where the hardware platforms have never been known for high computing power. The main purpose of this thesis is to establish a methodology to reduce the computational complexity of applying nonlinear model-based predictive control subject to constraints on hardware systems with low computational power, allowing a realistic implementation based on industry standards. The methodology is based on applying functional principal component analysis, which provides a mathematically elegant approach to reducing the complexity of rule-based systems such as fuzzy and piecewise affine systems. This in turn reduces the computational load of model-based predictive control systems, whether or not they are subject to constraints.
The idea of using fuzzy inference systems, in addition to allowing the modelling of nonlinear or complex systems, provides a formal structure that enables implementation of the aforementioned complexity reduction technique. In addition to its theoretical contributions, this thesis describes work done with real plants on which modelling and fuzzy control tasks have been carried out. One of the objectives covered during the research and development of the thesis has been experimentation with fuzzy systems, their simplification, and their application to industrial systems. The thesis provides a practical knowledge framework, based on experience.
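The abstract describes a controller that solves an optimisation problem at every step and notes the resulting computational load. The following is a minimal sketch of that receding-horizon loop, under strong simplifying assumptions: a scalar linear model x[k+1] = a*x[k] + b*u[k], a quadratic cost, and an input constraint enforced by enumerating a finite grid of admissible inputs. It does not implement the thesis's fuzzy models or its functional principal component analysis reduction; all names and numeric values are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def mpc_step(
    x0: float,
    a: float,
    b: float,
    candidates: Sequence[float],
    horizon: int,
    q: float = 1.0,
    r: float = 0.1,
) -> Tuple[float, float]:
    """Return (first input of the best sequence, its predicted cost).

    Every input sequence of length `horizon` drawn from `candidates`
    is simulated, so the work grows as len(candidates) ** horizon.
    """
    best_cost = float("inf")
    best_first = candidates[0]
    for sequence in product(candidates, repeat=horizon):
        x = x0
        cost = 0.0
        for u in sequence:
            x = a * x + b * u              # model prediction
            cost += q * x * x + r * u * u  # penalise state and control effort
        if cost < best_cost:
            best_cost, best_first = cost, sequence[0]
    return best_first, best_cost


def simulate(x0: float, steps: int, a: float, b: float,
             candidates: Sequence[float], horizon: int) -> List[float]:
    """Closed loop: re-optimise at every step, apply only the first input."""
    trajectory = [x0]
    x = x0
    for _ in range(steps):
        u, _ = mpc_step(x, a, b, candidates, horizon)
        x = a * x + b * u
        trajectory.append(x)
    return trajectory


# Assumed example: an unstable plant (a > 1) with inputs limited to [-1, 1].
input_grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
states = simulate(x0=2.0, steps=10, a=1.2, b=1.0, candidates=input_grid, horizon=3)
```

Even in this toy setting, each control step evaluates 5 ** 3 = 125 sequences, and the count grows exponentially with the horizon, which illustrates why complexity reduction matters on low-power hardware.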

idUS. Depósito de Investigación Universidad de Sevilla