9 research outputs found

    Semantic Systems. In the Era of Knowledge Graphs

    This open access book constitutes the refereed proceedings of the 16th International Conference on Semantic Systems, SEMANTiCS 2020, held in Amsterdam, The Netherlands, in September 2020. The conference was held virtually due to the COVID-19 pandemic.

    Source associations for the virtual observatory

    This thesis presents investigations into different methods of associating astronomical sources detected at different wavelengths, and describes the development of a tool for AstroGrid to enable users to associate sources in a fully automated manner. At present, when associating sources at different wavelengths, it is common for astronomers to select IDs by eye, or at least to verify probabilistically determined counterparts by eye. With the new trend for large surveys, this is no longer practical, as datasets may contain millions of objects. Previous work on association algorithms has focussed on case-specific techniques which typically only match a restricted number of objects with counterparts, and often only those with small positional errors. This thesis addresses the issue that these methods are not adequate in the general case, where datasets may be enormous and source error ellipses large. In such situations, matching based purely on spatial proximity is deficient, since there may be hundreds of candidate counterparts within a source error ellipse. We therefore investigate the likelihood ratio as an association technique, as this allows incorporation of data such as object magnitudes as well as positions, and prove its applicability in the (difficult association) case of the FIRBACK survey. We also develop the application of a machine learning technique, the EM algorithm, and test it against the likelihood ratio method. We determine that it may be effectively applied to find IDs in surveys with a magnitude distribution of unrestricted shape. These different association methods are successfully developed into a tool for AstroGrid to enable users to associate sources in a fully automated manner. We describe detailed analysis of the likelihood ratio method through the association of a population of far-infrared sources from the FIRBACK survey with optical counterparts from the INT Wide Field Survey. 
This is a challenging association problem, since the far-infrared sources have a large positional error due to the poor resolution of the instrument and the relatively long wavelength. We compare two different variants of the likelihood ratio method in detail, and use the better one to derive optical counterparts for the far-infrared sources. This proves the applicability of the likelihood ratio method in the case of large source error ellipses where there are numerous candidates to choose between. The scientific benefits of associating multiwavelength data are illustrated via deducing, for the first time, the nature of the FIRBACK sources. These are identified not only with an optical counterpart but also with data at up to nine further wavelengths. Their properties are examined through the comparison of their observed spectral energy distributions with predictions from radiative transfer models which simulate the emission from both cirrus and starburst components. The far-infrared sources are found to be 80 per cent star-bursting galaxies with their starburst component at a high optical depth. It is a common situation in astronomy to wish to investigate a source population for which we have no prior knowledge about the properties of the source counterparts expected at another wavelength, for example through observations with a new instrument. In such a case it is necessary to estimate the counterpart magnitude distribution to use the likelihood ratio association method. Since little was known about the FIRBACK sources prior to our research, their optical magnitude distribution had to be estimated in order to assign them optical IDs. To alleviate this problem we develop a new astronomical application of a machine learning technique known as the EM algorithm, which is used in the field of informatics. This is able to 'learn' the source magnitude distribution iteratively. 
The algorithm is tested on the FIRBACK sources and also radio sources from the HI Parkes All-Sky Survey (HIPASS) catalogue and is found to be a very effective association method in the HIPASS case where the background magnitude distribution is of unrestricted shape. We use the FIRBACK survey far-infrared sources as a test-bed for several different association methods. The value of bringing together multiwavelength observations is illustrated through the insights that are gained into the nature of the sources. This work culminates in the development of an association tool for AstroGrid, the UK Virtual Observatory project, offering three different association methods: the Poisson method, the likelihood ratio method and the EM algorithm. This tool is able to return a user-specified number of possible counterparts along with a figure of merit for their match with a source. We also implement the AstroDAS system to store resulting object pairs in a database for future use. This prevents the same cross-association tasks being carried out numerous times by different users. The Virtual Observatory aims to link diverse datasets from across the globe. The extra knowledge available from these may only be extracted after establishing links between detections in these datasets. Our AstroGrid association tool is therefore vital to the success of the Virtual Observatory.

    Using interpretable machine learning for indoor CO₂ level prediction and occupancy estimation

    Management and monitoring of rooms’ environmental conditions is a good step towards achieving energy efficiency and a healthy indoor environment. However, studies indicate that some of the current methods used in environmental room monitoring face challenges such as high cost and lack of privacy. As a result, there is a need for a method that is simple, reliable, affordable and free of privacy issues. Therefore, the aims of this thesis were: (i) to predict future CO₂ levels using environmental sensor data, (ii) to determine room occupancy using environmental sensor data and (iii) to create a prototype dashboard for possible future room management based on the models developed for room occupancy and CO₂ prediction. The machine learning methods used were a Gradient Boosting ensemble model (GB), a Long Short-Term Memory recurrent neural network model (LSTM) and the Facebook Prophet model for time series (Prophet). The sensor data were recorded from three different office locations (two test sites at a university and a real-world commercial office in Glasgow, Scotland, UK). The results of the analysis show that with the LSTM method, a Root Mean Square Error (RMSE) (the absolute fit of the model results to the observed data) of 0.0682 could be achieved for CO₂ prediction over a two-hour time interval, and with GB, an accuracy of 82% could be achieved for the proposed room occupancy estimation. Furthermore, as model understanding was raised as a key issue, interpretable machine learning methods (SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME)) were used to interpret the room occupancy results obtained by the GB model. In addition, a dashboard was designed and prototyped to show room environmental data, predicted CO₂ levels and estimated room occupancy based on what the sensor data and models might provide for people managing rooms in different settings. 
The dashboard designed in this research was evaluated by interested participants, and their responses show that it could potentially offer inputs to building management towards the control of heating, ventilation and air-conditioning (HVAC) systems. This in turn could lead to improved energy efficiency, better planning of shared spaces in buildings, potentially reduced energy and operational costs, and improved environmental conditions for room occupants, potentially leading to improved health, reduced risks, enhanced comfort and improved productivity. It is advised that further studies should be conducted at multiple locations to demonstrate generalisation of the results of the proposed model. In addition, the end benefits of the model could be assessed through applying its outputs to enhance the control of HVAC systems, room management systems and safety systems. The health and productivity of the occupants could be monitored in detail to identify whether resulting environmental improvements deliver improvements in health and productivity. The findings of this research contribute new knowledge that could be used to achieve reliable results in room occupancy estimation using a machine learning approach.