2,071 research outputs found

    Detection of Chocolate Properties Using Near-Infrared Spectrophotometry †

    Presented at the 4th XoveTIC Conference, A Coruña, Spain, 7–8 October 2021. [Abstract] Knowing the chemical composition of a substance provides valuable information about it, which is why numerous techniques have been developed to determine it. One of them is near-infrared (NIR) spectrometry, a non-destructive technique that analyzes the electromagnetic spectrum in search of waves of specific wavelengths. The aim of this project is to combine this technology with machine learning techniques to detect the presence of milk, as well as the level of cocoa, in an ounce of chocolate. The results were satisfactory in both cases, which suggests that combining these techniques offers great possibilities. The authors would like to thank the RNASA-IMEDIR group for its support.

    DoME: A Deterministic Technique for Equation Development and Symbolic Regression

    Funded for open-access publication: Universidade da Coruña/CISUG. [Abstract] Based on a solid mathematical background, this paper proposes a method for Symbolic Regression that enables the extraction of mathematical expressions from a dataset. Unlike other approaches, such as Genetic Programming, the proposed method is deterministic and, consequently, does not require the creation of a population of initial solutions. Instead, a simple expression is grown until it fits the data. The method has been compared with four well-known Symbolic Regression techniques on a large number of datasets. On average, the proposed method performs better than the other techniques, with the advantage of returning mathematical expressions that can be easily used by different systems. Additionally, the method makes it possible to set a threshold on the complexity of the generated expressions, i.e., the system can return mathematical expressions that are easily analyzed by the user, as opposed to other techniques that return very large expressions. This study is partially supported by Instituto de Salud Carlos III, grant number PI17/01826 (Collaborative Project in Genomic Data Integration (CICLOGEN) funded by the Instituto de Salud Carlos III from the Spanish National Plan for Scientific and Technical Research and Innovation 2013–2016 and the European Regional Development Funds (FEDER), “A way to build Europe”). It was also partially supported by different grants and projects from the Xunta de Galicia [ED431D 2017/23; ED431D 2017/16; ED431G/01; ED431C 2018/49; IN845D-2020/03]. The authors thank the CyTED, Spain and each National Organism for Science and Technology for funding the IBEROBDIA project (P918PTE0409). In this regard, Spain specifically thanks the Ministry of Economy and Competitiveness for the financial support for this project through the State Program of I+D+I Oriented to the Challenges of Society 2017–2020 (International Joint Programming 2018), project PCI2018-093284. Funding for open access charge: Universidade da Coruña/CISUG.

    Convolutional Neural Networks for Sleep Stage Scoring on a Two-Channel EEG Signal

    This is a pre-print of an article published in Soft Computing. The final authenticated version is available online at: https://doi.org/10.1007/s00500-019-04174-1 [Abstract] Sleep disorders have become a major health problem worldwide. To tackle this issue, the basic tool used by specialists is the polysomnogram, a collection of different signals recorded during sleep. After the recording, the specialists have to score the signals according to one of the standard guidelines. This process is carried out manually, which can be highly time-consuming and very prone to annotation errors. Therefore, over the years, many approaches have been explored in an attempt to support the specialists in this task. In this paper, an approach based on convolutional neural networks is presented, with an in-depth comparison to determine whether it is worthwhile to use more than one signal simultaneously as input. Additionally, the models were used as parts of an ensemble model to check whether processing a single signal at a time extracts any useful information that the dual-signal model cannot identify. Tests were performed on a well-known dataset, expanded sleep-EDF, which is the most commonly used benchmark for this problem. The tests were carried out with a leave-one-out cross-validation over the patients, which ensures that there is no possible contamination between training and testing. The resulting proposal is a network smaller than previously published ones that nevertheless outperforms all previous models on the same dataset. The best result shows an accuracy of 92.67% and a Cohen’s Kappa value over 0.84 compared to human experts. Funding: Instituto de Salud Carlos III (PI17/01826); Xunta de Galicia (ED431D 2017/23, ED431D 2017/16, ED431G/0
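
The evaluation protocol described in this abstract, leave-one-out cross-validation over patients scored with Cohen's kappa, can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code; the function names `leave_one_subject_out` and `cohen_kappa` and the toy labels are assumptions made for the example.

```python
from collections import Counter


def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding out one subject per fold.

    subject_ids[i] is the (hypothetical) patient identifier of sample i, so
    every sample of the held-out patient lands in the test fold and none of
    them can leak into training.
    """
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of positions where both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both raters labelled independently at their own rates.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one single, identical label
    return (observed - expected) / (1.0 - expected)
```

Splitting by patient rather than by epoch is what prevents epochs from the same recording from appearing on both sides of the split.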

    Estimation of the Alcoholic Degree in Beers through Near Infrared Spectrometry Using Machine Learning

    [Abstract] Non-destructive measurement technologies have gained a lot of attention over the years. Among them, NIR technology allows the analysis of the electromagnetic spectrum in search of carbon-bond interactions. It analyzes the band between 700 nm and 2500 nm, which lies very close to the visible spectrum. Traditionally, the measuring devices have been very expensive and bulky. That is why this project focused on a portable spectrophotometer to take the measurements. This device is smaller and cheaper than a common spectrophotometer, although at the cost of a lower resolution. In this work, the device was combined with machine learning to detect whether a beer contains alcohol or can be labeled as a non-alcoholic drink. Funding: Xunta de Galicia (ED431G/0

    Population Subset Selection for the Use of a Validation Dataset for Overfitting Control in Genetic Programming

    [Abstract] Genetic Programming (GP) is a technique able to solve different problems through the evolution of mathematical expressions. However, its tendency to overfit the data is one of the main obstacles to applying it. The use of a validation dataset is a common way to prevent overfitting in many Machine Learning (ML) techniques, including GP. There is, however, one key point that differentiates GP from other ML techniques: instead of training a single model, GP evolves a population of models. Therefore, the validation dataset can be used in several ways, because any of those evolved models could be evaluated. This work explores the possibility of using the validation dataset not only on the training-best individual but also on a subset containing the training-best individuals of the population. The study was conducted with 5 well-known databases for regression or classification tasks. In most cases, the results point to an improvement when the validation dataset is used on a subset of the population instead of only on the training-best individual, which also reduces the number of nodes and, consequently, the complexity of the expressions. Funding: Xunta de Galicia (ED431G/01, ED431D 2017/16, ED431C 2018/49, ED431D 2017/23); Instituto de Salud Carlos III (PI17/0182
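
One way to read the selection idea in this abstract is the sketch below: rank the population by training error, keep the best few, and let the validation error decide among them. The abstract does not give the exact procedure, so the function `select_by_validation`, its parameters, and the toy population are hypothetical and only illustrate the principle.

```python
def select_by_validation(population, train_error, val_error, subset_size):
    """Pick the individual with the lowest validation error among the
    `subset_size` individuals with the lowest training error.

    `train_error` and `val_error` are callables mapping an individual to a
    number (lower is better). With subset_size=1 this reduces to the usual
    scheme that only considers the training-best individual.
    """
    if subset_size < 1:
        raise ValueError("subset_size must be at least 1")
    ranked = sorted(population, key=train_error)   # best training error first
    candidates = ranked[:subset_size]              # the training-best subset
    return min(candidates, key=val_error)          # validation decides


# Toy population: (name, training error, validation error), values made up.
population = [("a", 0.10, 0.40), ("b", 0.12, 0.20), ("c", 0.30, 0.05)]
best_of_one = select_by_validation(population, lambda ind: ind[1], lambda ind: ind[2], 1)
best_of_two = select_by_validation(population, lambda ind: ind[1], lambda ind: ind[2], 2)
```

In the toy data, widening the subset from one to two individuals swaps an overfitted model ("a") for one that generalizes better ("b").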

    Application of Artificial Neural Networks for the Monitoring of Episodes of High Toxicity by DSP in Mussel Production Areas in Galicia

    [Abstract] This study seeks to support, through the use of Artificial Neural Networks (ANN), the decision to perform closings after days without sampling in the Vigo estuary. The opening and closing of the mussel production areas are based on the toxicity analysis of this bivalve’s meat. Sometimes it is not possible to obtain the data needed for an effective closing. If there is evidence of an increase in toxicity levels, “Precautionary Closings” of mussel extraction are applied. A small error in the forecast of the state of the areas could mean serious losses for the mussel industry and a huge risk for public health. Unlike previous studies, which focused on predicting harmful algal blooms, this study aims to manage the state of the mussel production areas. With a test sensitivity of 67.40% and a test accuracy of 83.00%, these results may lead to new research aimed at obtaining more accurate models that can be integrated into a support system. Funding: Xunta de Galicia (ED431D 2017/23, ED431D 2017/16, ED431G/01, ED431C 2018/4

    A Public Domain Dataset for Real-Life Human Activity Recognition Using Smartphone Sensors

    [Abstract] In recent years, human activity recognition has become a hot topic within the scientific community. It is under the spotlight because of its direct application in multiple domains, such as healthcare or fitness. Additionally, the current worldwide use of smartphones makes it particularly easy to get this kind of data from people in a non-intrusive and cheaper way, without the need for other wearables. In this paper, we introduce our orientation-independent, placement-independent and subject-independent human activity recognition dataset. The dataset contains measurements from the accelerometer, gyroscope, magnetometer, and GPS of the smartphone. Each measurement is associated with one of the four registered activities: inactive, active, walking and driving. This work also proposes a support vector machine (SVM) model to perform some preliminary experiments on the dataset. Because this dataset, unlike others, was collected from smartphones in actual use, developing a good model on such data is an open problem and a challenge for researchers. Solving it would close the gap between the model and a real-life application. This research was partially funded by Xunta de Galicia/FEDER-UE (ConectaPeme, GEMA: IN852A 2018/14), MINECO-AEI/FEDER-UE (Flatcity: TIN2016-77158-C4-3-R) and Xunta de Galicia/FEDER-UE (AXUDAS PARA A CONSOLIDACION E ESTRUTURACION DE UNIDADES DE INVESTIGACION COMPETITIVAS. GRC: ED431C 2017/58 and ED431C 2018/49).

    New machine learning approaches for real-life human activity recognition using smartphone sensor-based data

    Funded for open-access publication: Universidade da Coruña/CISUG. [Abstract] In recent years, mainly due to the use of smartphones in this area, research in human activity recognition (HAR) has shown continuous and steady growth. Thanks to their wide range of sensors, size, ease of use, low price and applicability in many other fields, smartphones are a highly attractive option for researchers. However, the vast majority of studies carried out so far focus on laboratory settings rather than real-life environments. This work, unlike other papers, seeks progress on the latter point. To do so, a dataset already published for this purpose was used. This dataset was collected using the sensors of the smartphones of different individuals in their daily life, with almost total freedom. To exploit these data, numerous experiments were carried out with various machine learning techniques, each with different hyperparameters. These experiments showed that, in this case, tree-based models, such as Random Forest, outperform the rest. The final result shows a large improvement in the accuracy of the best model found to date for this purpose, from 74.39% to 92.97%. This research was partially funded by MCIN/AEI/10.13039/501100011033, NextGenerationEU/PRTR, FLATCITY-POC, Spain [grant number PDC2021-121239-C31]; MCIN/AEI/10.13039/501100011033 MAGIST, Spain [grant number PID2019-105221RB-C41]; Xunta de Galicia/FEDER-UE, Spain [grant numbers ED431G 2019/01, ED481A 2020/003, ED431C 2022/46, ED431C 2018/49 and ED431C 2021/53]. Funding for open access charge: Universidade da Coruña/CISUG.

    Classification of Signals by Means of Genetic Programming

    [Abstract] This paper describes a new technique for signal classification by means of Genetic Programming (GP). The novelty of this technique is that no prior knowledge of the signals is needed to extract the features. Instead, GP is able to extract the most relevant features needed for classification. The technique has been applied to a well-known problem: the classification of EEG signals from epileptic and healthy patients. In this problem, signals obtained from EEG recordings must be correctly classified into their corresponding class. The aim is to show that the technique described here, with its automatic extraction of features, can return better results than the classical techniques based on manual extraction of features. For this purpose, the results obtained with this technique are compared with other results reported in the literature on the same database. The comparison shows that this technique improves on those results. Funding: Instituto de Salud Carlos III (RD07/0067/0005); Xunta de Galicia (10SIN105004P

    Automated Early Detection of Drops in Commercial Egg Production Using Neural Networks

    [Abstract] 1. The purpose of this work was to support decision-making in poultry farms by performing automatic early detection of anomalies in egg production. 2. Unprocessed data were collected from a commercial egg farm on a daily basis over 7 years. Records from a total of 24 flocks, each with approximately 20 000 laying hens, were studied. 3. Other similar works have required a prior feature extraction by a poultry expert, and this method is dependent on time and expert knowledge. 4. The present approach reduces the dependency on time and expert knowledge because of the automatic selection of relevant features and the use of artificial neural networks capable of cost-sensitive learning. 5. The optimum configuration of features and parameters in the proposed model was evaluated on unseen test data obtained by a repeated cross-validation technique. 6. The accuracy, sensitivity, specificity and positive predictive value are presented and discussed at 5 forecasting intervals. The accuracy of the proposed model was 0.9896 for the day before a problem occurs. Funding: Galicia. Consellería de Cultura, Educación e Ordenación Universitaria (GRC2014/04)
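
The four measures named in point 6 (accuracy, sensitivity, specificity and positive predictive value) all derive from the counts of a binary confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification measures from confusion counts.

    tp/fp/tn/fn are the numbers of true positives, false positives, true
    negatives and false negatives. A ratio with a zero denominator is
    returned as NaN instead of raising.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),   # all correct / all cases
        "sensitivity": ratio(tp, tp + fn),               # detected problems / real problems
        "specificity": ratio(tn, tn + fp),               # correct "no problem" / real "no problem"
        "ppv": ratio(tp, tp + fp),                       # real problems / raised alarms
    }


# Made-up counts, for illustration only.
metrics = binary_metrics(tp=8, fp=2, tn=85, fn=5)
```

When problem days are rare, accuracy can be high even if sensitivity is modest, which is one reason to report all four measures together.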