10 research outputs found

    Developing a Generic Predictive Computational Model using Semantic data Pre-Processing with Machine Learning Techniques and its application for Stock Market Prediction Purposes

    In this paper, we present a Generic Predictive Computational Model (GPCM) and apply it by building a Use Case for FTSE 100 index forecasting. This involves the mining of heterogeneous data based on semantic methods (ontology), graph-based methods (knowledge graphs, graph databases) and advanced Machine Learning methods. The main focus of our research is data pre-processing aimed at a more efficient selection of input features. The GPCM pipeline’s cycles involve the propagation of the (initially raw) data to the Graph Database structured by an ontology and regular updates of the features’ weights in the Graph Database by the feedback loop from the Machine Learning Engine. The Graph Database queries output the most valuable features that, in turn, serve as the input for the Machine Learning-based prediction. The end-product of this process is fed back to the Graph Database to update the weights. We report on practical experiments evaluating the effectiveness of the GPCM application in forecasting the FTSE 100 index. The underlying dataset contains multiple parameters related to predicting time-series data, where Long Short-Term Memory (LSTM) is known to be one of the most efficient machine learning methods. The most challenging task here has been to overcome the known restrictions of LSTM, which is capable of analysing one input parameter only. We solved this problem by combining several parallel LSTMs, a Concatenation unit, which merges the LSTMs’ outputs (into a time-series matrix), and a Linear Regression Unit, which produces the final result
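The data flow described in this abstract (one branch per input parameter, a concatenation step, then a linear unit) can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain tanh recurrent cell stands in for each LSTM, the feature names and all weights are hypothetical and untrained, and feeding only the last row of the matrix to the linear unit is a simplification rather than the authors' design.

```python
import math


def recurrent_branch(series, w_in, w_rec):
    """Plain tanh recurrent cell standing in for one LSTM branch.

    Processes a single input parameter and returns one output per time step.
    """
    h, outputs = 0.0, []
    for x in series:
        h = math.tanh(w_in * x + w_rec * h)
        outputs.append(h)
    return outputs


def concatenate(branch_outputs):
    """Merge per-branch sequences into a time-series matrix (rows = time steps)."""
    return [list(row) for row in zip(*branch_outputs)]


def linear_unit(row, weights, bias):
    """Linear combination of one matrix row (assumption: the last time step)."""
    return sum(w * v for w, v in zip(weights, row)) + bias


# Hypothetical toy inputs: two parameters, four time steps each.
inputs = {
    "close": [0.10, 0.20, 0.15, 0.30],
    "volume": [0.50, 0.40, 0.60, 0.70],
}
# Hypothetical (w_in, w_rec) pairs, one per branch.
branch_params = {"close": (0.9, 0.5), "volume": (0.7, 0.3)}

matrix = concatenate(
    [recurrent_branch(inputs[name], *branch_params[name]) for name in inputs]
)
prediction = linear_unit(matrix[-1], weights=[0.6, 0.4], bias=0.05)
```

Each branch sees exactly one parameter, so adding a parameter means adding a branch and one more column in the matrix; the linear unit's weight vector grows accordingly.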

    The Digital Health Evidence Generator


    Live Demonstration of the PITHIA e-Science Centre

    PITHIA-NRF (Plasmasphere Ionosphere Thermosphere Integrated Research Environment and Access services: a Network of Research Facilities) is a four-year project funded by the European Commission’s H2020 programme to integrate data, models and physical observing facilities for further advancing European research capacity in this area. A central point of PITHIA-NRF is the PITHIA e-Science Centre (PeSC), a science gateway that provides access to distributed data sources and prediction models to support scientific discovery. As the project reached its halfway point in March 2023, the first official prototype of the e-Science Centre was released. This live demonstration will provide an overview of the current status and capabilities of the PeSC, highlighting the underlying ontology and metadata structure, the registration process for models and datasets, the ontology-based search functionalities and the interaction methods for executing models and processing data. One of the main objectives of the PeSC is to enable scientists to register their Data Collections, which can be raw or higher-level datasets as well as prediction models, using a standard metadata format and a domain ontology. For these purposes, PITHIA builds on the results of the ESPAS FP7 project by adopting and modifying its ontology and metadata specification. The project utilises the ISO 19156 standard on Observations and Measurements (O&M) to describe Data Collections in an XML format that is widely used within the research community. Following the standard, Data Collections refer to other XML documents, such as Computations that a model used to derive the results, Acquisitions describing how the data was collected, Instruments that were used during the data collection process, or Projects that were responsible for the data/model. Within the XML documents, specific keywords of the Space Physics ontology can be used to describe the various elements. 
For example, Observed Property can be Field, Particle, Wave, or Mixed at the top level. When preparing the XML metadata file, only these values are accepted for validation. Once described in XML format, Data Collections can be published in the PeSC and searched using the ontology-based search engine. Besides large and typically changing/growing Data Collections, PeSC also supports the registration of Catalogues. These are smaller sets of data, originating from a Data Collection and related to specific events, e.g. volcanic eruptions. Catalogue Data Subsets can be assigned DOIs to be referenced in publications and provide a permanent set of data for reproducibility. In addition to publication and search, the PeSC also provides several mechanisms for interacting with Data Collections, e.g. executing a model or downloading subsets of the data. In the current version, two of the four planned interaction methods are implemented: accessing the Data Collection by a direct link and interacting with it via an API and an automatically generated GUI. Data Collections can either be hosted by the local provider or be deployed on EGI cloud computing resources. The development of the PeSC is still a work in progress. Authentication and authorisation are currently being implemented using EGI Checkin and the PERUN Attribute Management System. Further interaction mechanisms enabling local execution and dynamic deployment in the cloud will also be added in the near future. The main screen of the PeSC is illustrated in Figure 1. The source code is open and available on GitHub
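The vocabulary check on Observed Property mentioned in this abstract can be illustrated with a small sketch. The four accepted terms come from the abstract; the element name `observedProperty` and the flat document layout are hypothetical simplifications for illustration and are not the actual PITHIA or ISO 19156 schema.

```python
import xml.etree.ElementTree as ET

# Top-level Observed Property terms named in the abstract.
ALLOWED_OBSERVED_PROPERTIES = {"Field", "Particle", "Wave", "Mixed"}


def invalid_observed_properties(xml_text: str) -> list[str]:
    """Return observedProperty values that are not accepted top-level terms."""
    root = ET.fromstring(xml_text)
    # "observedProperty" is a hypothetical element name used only in this sketch.
    values = [(el.text or "").strip() for el in root.iter("observedProperty")]
    return [v for v in values if v not in ALLOWED_OBSERVED_PROPERTIES]


# Hypothetical, heavily simplified metadata document.
SAMPLE = """
<DataCollection>
  <name>Example collection</name>
  <observedProperty>Wave</observedProperty>
  <observedProperty>Plasma</observedProperty>
</DataCollection>
"""

rejected = invalid_observed_properties(SAMPLE)
```

A document would pass this check only when `rejected` is empty; here the second value falls outside the four accepted terms.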

    Semantic Data Pre-Processing for Machine Learning Based Bankruptcy Prediction Computational Model

    This paper studies a Bankruptcy Prediction Computational Model (BPCM), a comprehensive methodology for evaluating companies’ bankruptcy level, which combines storing, structuring and pre-processing of raw financial data using semantic methods with machine learning analysis techniques. Raw financial data are interconnected, diverse, often potentially inconsistent, and open to duplication. The main goal of our research is to develop data pre-processing techniques, where ontologies play a central role. We show how ontologies are used to extract and integrate information from different sources, prepare data for further processing, and enable communication in natural language. Using ontology, we give meaning to the disparate and raw business data, build logical relationships between data in various formats and sources and establish relevant context. Our Ontology of Bankruptcy Prediction (OBP Ontology), which provides a conceptual framework for companies’ financial analysis, is built in the widely established Protégé environment. The OBP Ontology can be effectively described with a graph database. A graph database expands the capabilities of traditional databases, tackling the interconnected nature of economic data and providing graph-based structures to store information, allowing the effective selection of the most relevant input features for the machine learning algorithm. To create and manage the BPCM Graph Database (Graph DB), we use the Neo4j environment and the Neo4j query language, Cypher, to perform feature selection of the structured data. Selected key features are used for the Machine Learning Engine, a supervised MLP Neural Network with a Sigmoid activation function. The programming of this component is performed in Python. We illustrate the approach and advantages of semantic data pre-processing by applying it to a representative use case
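As a rough illustration of the Machine Learning Engine mentioned in this abstract, the forward pass of a small MLP with sigmoid activations might look like the sketch below. The layer sizes, feature values and weights are hypothetical and untrained; the abstract does not specify the network's actual architecture or training procedure.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer followed by the sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(features, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: input features -> hidden layer -> single sigmoid output."""
    hidden = dense(features, hidden_w, hidden_b)
    return dense(hidden, [out_w], [out_b])[0]


# Hypothetical, untrained parameters: three input features, two hidden units.
HIDDEN_W = [[0.5, -1.0, 0.25], [-0.75, 0.5, 1.0]]
HIDDEN_B = [0.0, 0.1]
OUT_W = [1.5, -2.0]
OUT_B = 0.2

# Hypothetical feature vector, e.g. three selected financial ratios.
score = mlp_forward([0.8, 0.3, 0.5], HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
```

Because the output unit is also a sigmoid, `score` always lies strictly between 0 and 1, which makes it convenient to read as a risk-like level once the weights have been trained.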

    EnAbled: A Psychology Profile based Academic Compass to Build and Navigate Students' Learning Paths

    In the modern educational environment, students are faced with a plethora of different options in their learning journey during the University years. To help them make optimal choices among all these options, choices that best correspond to their individuality, we have conducted a research project “Enabled: Educational Network Amplifying Learning Experience” (EnAbled). The project aims at “mapping” these choices to personal preferences and individual learning styles. We allow students to either self-assess their profiles or use the Lumina Psychological Traits of Behavioral Preferences tests. We argue that this approach will be beneficial not only to the students but also to the academics, assisting them in the preparation and delivery of modules and providing them with more insight into what and how teaching is delivered

    Science Gateways with Embedded Ontology-based E-learning Support

    Science gateways are widely utilised in a range of scientific disciplines to provide user-friendly access to complex distributed computing infrastructures. The traditional approach in science gateway development is to concentrate on this simplified resource access and provide scientists with a graphical user interface to conduct their experiments and visualise the results. However, as user communities behind these gateways are growing and opening their doors to less experienced scientists or even to the general public as “citizen scientists”, there is an emerging need to extend these gateways with training and learning support capabilities. This paper describes a novel approach showing how science gateways can be extended with embedded e-learning support using an ontology-based learning environment called Knowledge Repository Exchange and Learning (KREL). The paper also presents a prototype implementation of a science gateway for analysing earthquake data and demonstrates how the KREL can extend this gateway with ontology-based embedded e-learning support

    Enabled: Educational Network Amplifying Learning Experience (EnAbled)

    Nowadays, students are faced with a plethora of different options in their learning journey during the University years. From the classical approach of attending lectures and tutorials to learning from online videos or participating in the many fora and tutorials that are freely available online, there are so many possible paths and approaches that it is becoming difficult for students to make an informed and optimal choice among available resources. To help students make optimal choices among all these options, we have conducted a research project “Enabled: Educational Network Amplifying Learning Experience” (EnAbled). In this paper, we present and discuss the results of this project. The project aims at “mapping” these choices to personal preferences and individual learning styles. We define students’ individual learning profiles based on the Lumina Psychological Traits of Behavioral Preferences. We argue that this approach will be beneficial not only to the students but also to the academics, assisting them in the preparation and delivery of modules and providing them with more insight into what and how teaching is delivered. Alongside the theoretical investigation, the project aims at creating an “Academic Compass”, a digital platform that brings together students and lecturers. Students use the platform to navigate across the various learning options, guided by their preferences, learning styles and previous learning experience, whilst lecturers use the platform to model their modules, encompassing the various options available at each learning stage. During the project, we collect the data required to build students’ individual learning profiles and utilize them in the definition of a Use Case that models the revision sessions for a module: “Mathematics for Computing”

    Sharing Data Collections and Models for Ionosphere, Thermosphere and Plasmasphere Research

    PITHIA-NRF (Plasmasphere Ionosphere Thermosphere Integrated Research Environment and Access services: a Network of Research Facilities) is a project funded by the European Commission’s H2020 programme to build a distributed network of observing facilities, data processing tools and prediction models dedicated to ionosphere, thermosphere and plasmasphere research. One of the core components of PITHIA-NRF is the PITHIA e-Science Centre that supports access to distributed data resources and facilitates the execution of various models on local infrastructures and remote cloud computing resources. There are two major types of resources to be registered with the e-Science Centre: Data Collections and Models. Data Collections are either generated as a direct outcome of an observation facility (e.g. radars, radio telescopes, meteor cameras, etc.) or generated by various scientific Models. Models are scientific applications that take either raw or cleaned data from observation facilities and produce higher-level datasets with predicted characteristics to facilitate further scientific research. Both Data Collections and Models are registered with the PITHIA e-Science Centre using a rich set of metadata that is based on the ISO 19156 standard on Observations and Measurements (O&M), and specifically augmented and tailored for the requirements of space physics. The metadata structure and the related ontology were originally developed in the FP7 ESPAS project [1] and are currently being modified for the specific requirements of PITHIA. PITHIA-NRF decided to describe and register data collections only, instead of centrally registering every individual data granule, as in previous projects such as ESPAS. Such simplification enables easier management of the e-Science Centre and can lead to longer-term sustainability with a feasible amount of maintenance effort. 
On the other hand, local searchability of individual data pieces still remains, so scientists can still access the required details at the necessary granularity. When it comes to the execution of models, the PITHIA e-Science Centre supports three types of model execution and access scenarios, all provided from a single entry point. Models can be executed on local resources of the various PITHIA nodes (institutions sharing Data Collections and Models). Additionally, some Models can be deployed and executed on cloud computing resources on demand. Finally, nodes can also offer Models to be downloaded and executed on the users’ own resources. Model providers can select the most suitable execution mechanism, based on the specific characteristics of the models and the resources (both human and computational) they have. The implementation of the PITHIA e-Science Centre is a work in progress. This presentation will report on the current state of this development work. The ESPAS metadata structure and ontology, tailored for the specific requirements of the project, have already been demonstrated to the research community on the example of some Data Collections and Models. Based on this metadata structure, work is currently ongoing to enable the registration and the ontology-based search facility of both Models and Data Collections. Proof-of-concept implementations [2] of the various Model access and execution mechanisms have also been developed and demonstrated to the research community. Acknowledgement: This work was funded by the PITHIA-NRF - Plasmasphere Ionosphere Thermosphere Integrated Research Environment and Access services: a Network of Research Facilities (No. 101007599) EU H2020 project. Keywords: e-Science Centre, ontology, metadata, Data Collection, Model execution. 
REFERENCES [1] Anna Belehaki, Sarah James, Mike Hapgood, Spiros Ventouras, Ivan Galkin, Antonis Lembesis, Ioanna Tsagouri, Anna Charisi, Luca Spogli, Jens Berdermann, Ingemar Häggström: The ESPAS e-infrastructure: Access to data from near-Earth space, Advances in Space Research, Volume 58, Issue 7, 2016, Pages 1177-1200, ISSN 0273-1177, https://doi.org/10.1016/j.asr.2016.06.014. [2] Gabriele Pierantoni, Tamas Kiss, Alexander Bolotov, Dimitrios Kagialis, James DesLauriers, Amjad Ullah, Huankai Chen, David Chan You Fee, Hai-Van Dang, Jozsef Kovacs, Anna Belehaki, Themistocles Herekakis, Ioanna Tsagouri, Sandra Gesing: Towards a Reference Architecture based Science Gateway Framework with Embedded E-Learning Support, Concurrency and Computation: Practice and Experience, Wiley, 2022, https://doi.org/10.1002/cpe.687

    SMARTEST - knowledge and learning repository

    SMARTEST is a knowledge repository that assists and facilitates learning. It represents knowledge and learning activities as graphs, which present information in a clear, visual format that is easy to follow and understand. Nodes (coloured circles) contain content such as instructions or concepts relevant to the subject being studied; the lines connecting two nodes (edges) show the relationship between them. Students can follow instructions using these graphs and visually see the links between the concepts or entities represented by the nodes. The nodes can then be colour-coded by students depending on their understanding of what each node represents. If a student has had difficulty understanding a particular concept, they can simply choose the colour that best represents their situation and level of understanding. This allows teachers to get clear feedback from students on specific parts of a subject and enables them to help that student better understand the topic at hand. Furthermore, if several students are having a problem with a certain topic, this is communicated to the teacher, who can then tackle the problem with the group. There are two types of graphs: learning paths and ontologies. A learning path sets out steps for students to go through to acquire knowledge for their subject and build on the already acquired knowledge to complete the next steps. It allows students to see what steps they will need to take to achieve the final goal and creates a visual sense of accomplishment as students get closer and closer to the end of the learning path. An ontology is a set of concepts and categories in a subject area or domain that shows their properties and the relations between them. SMARTEST has been developed within a project undertaken at the University of Westminster and is sponsored by the Quintin Hogg Trust
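The learning-path graph and colour-coded feedback described in this abstract could be modelled along the following lines. This is a minimal sketch, not SMARTEST's implementation: the class name, the colour labels ("red", "green"), the threshold of two students and the example steps are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LearningPath:
    """Ordered learning steps (nodes) with per-student colour codes."""

    steps: list[str]
    # student name -> {node name -> colour chosen by that student}
    ratings: dict[str, dict[str, str]] = field(default_factory=dict)

    def edges(self) -> list[tuple[str, str]]:
        """Edges link each step to the step that follows it."""
        return list(zip(self.steps, self.steps[1:]))

    def rate(self, student: str, node: str, colour: str) -> None:
        """Record the colour a student assigns to a node."""
        if node not in self.steps:
            raise KeyError(node)
        self.ratings.setdefault(student, {})[node] = colour

    def nodes_needing_attention(self, colour: str = "red", min_students: int = 2) -> list[str]:
        """Nodes that at least `min_students` students marked with `colour`."""
        counts = Counter(
            node
            for marks in self.ratings.values()
            for node, chosen in marks.items()
            if chosen == colour
        )
        return [node for node in self.steps if counts[node] >= min_students]


# Hypothetical learning path and feedback.
path = LearningPath(steps=["Sets", "Relations", "Functions"])
path.rate("alice", "Relations", "red")
path.rate("bob", "Relations", "red")
path.rate("carol", "Functions", "red")
path.rate("carol", "Sets", "green")

group_topics = path.nodes_needing_attention()
```

Here only "Relations" reaches the two-student threshold, so it is the topic a teacher would be prompted to revisit with the group, while "Functions" remains an individual difficulty.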

    Toward a reference architecture based science gateway framework with embedded e-learning support

    Science gateways have been widely utilised by a large number of user communities to simplify access to complex distributed computing infrastructures. While science gateways are becoming increasingly popular and the number of user communities is growing, the fast and efficient creation of new science gateways and the flexibility to deploy these gateways on demand on heterogeneous computational resources remain a challenge. Additionally, the increase in the number of users, especially with very different backgrounds, requires intuitive embedded e-learning tools that help all stakeholders find related learning material and that guide the learning process. This paper introduces a novel science gateway framework that addresses these challenges. The framework supports the creation, publication, selection and deployment of cloud-based Reference Architectures that can be automatically instantiated and executed even by non-technical users. The framework also incorporates a Knowledge Repository Exchange and Learning module that provides embedded e-learning support. To demonstrate the feasibility of the proposed solution, two scientific case studies are presented based on the requirements of the plasmasphere, ionosphere and thermosphere research communities