
    Predictive geospatial analytics using principal component regression

    With geospatial data growing exponentially worldwide, geospatial data analytics deserves attention as a way to handle large volumes of geodata that arrive in various forms and at high velocity. Dimensionality reduction also plays a key role for high-dimensional big data sets, including spatial data sets, which keep growing in both observations and features (dimensions). In this paper, predictive analytics on geospatial big data is implemented on a distributed, parallel big data processing platform using Principal Component Regression (PCR), that is, a traditional Multiple Linear Regression (MLR) model improved with Principal Component Analysis (PCA). The main objective of the system is to improve the predictive power of the MLR model by combining it with PCA, which removes insignificant and irrelevant variables or dimensions from the model. The paper also shows how data mining and machine learning approaches can be used efficiently in predictive geospatial data analytics. For experimentation, OpenStreetMap (OSM) data is used to develop a one-way road prediction for the city of Yangon, Myanmar. Experimental results show that the hybrid approach of PCA and MLR can be used efficiently both for road prediction with OSM data and for improving the traditional MLR model.
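The PCR idea described in this abstract (PCA to reduce the predictors, then linear regression on the retained components) can be sketched as follows. This is a minimal single-machine NumPy illustration, not the authors' distributed implementation; the function names `fit_pcr` and `predict_pcr` are hypothetical and chosen only for this example.

```python
import numpy as np


def fit_pcr(X, y, n_components):
    """Principal component regression: PCA on X, then least squares on the scores.

    Hypothetical helper for illustration; X is (n_samples, n_features), y is 1-D.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    components = vt[:n_components]
    # Project the centered data onto the retained directions.
    scores = Xc @ components.T
    # Ordinary least squares of the centered response on the scores.
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    # Express the fitted coefficients in the original feature space.
    beta = components.T @ gamma
    return x_mean, y_mean, beta


def predict_pcr(model, X):
    """Predict with a model returned by fit_pcr."""
    x_mean, y_mean, beta = model
    return (X - x_mean) @ beta + y_mean
```

When every component is retained, PCR coincides with ordinary least squares; choosing a smaller `n_components` is what discards the low-variance directions.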

    Regulation of Microclimatic Conditions inside Native Beehives and Its Relationship with Climate in Southern Spain

    In this study, the Wbee Sensor System was used to record data from 10 Iberian beehives for two years in southern Spain. These data were used to identify climatic factors that potentially condition the hive's internal regulatory behavior and its weight. Categorical principal components analysis (CATPCA) was used to determine the minimum number of factors able to capture the maximum percentage of variability in the recorded data. Then, categorical regression (CATREG) was used to select the factors that were linearly related to hive internal humidity, temperature and weight, in order to derive predictive regression equations for Iberian bees. Average relative humidity was 51.7% ± 10.4 in the brood nest and 54.2% ± 11.7 in the food area, while average temperatures were 34.3 °C ± 1.5 in the brood nest and 29.9 °C ± 5.8 in the food area. Average beehive weight was 38.2 kg ± 13.6. Some of our data, especially those related to humidity, contrast with previously published results from studies of bees in central and northern Europe. In conclusion, certain combinations of climatic factors may condition within-hive humidity, temperature and hive weight. The capacity of southern Iberian honeybees to regulate brood nest humidity could be lower than their capacity to regulate brood nest temperature, which they maintain at values close to 34 °C, even in dry conditions.

    Design of experiment and analysis techniques for fuel consumption data using heavy-duty diesel vehicles and on-road testing

    Chassis dynamometer and on-road testing are usually employed to test vehicle operation. Testing on a chassis dynamometer reduces data variability compared to on-road testing because of the controlled environment, but it does not account for other important variables that affect real-world vehicle operation. This study used on-road testing to investigate the differences between two test fuels under real-world conditions. Three heavy-duty diesel vehicles were driven on different routes for a period of three months. Each vehicle was instrumented with flow meters to gather fuel consumption data, which was then compared to the fuel rate broadcast by the engine control unit (ECU). Additionally, the driveshaft torque was measured using a strain gage and a torque transmitter to confirm that the output torque was correlated with the vehicle's fuel consumption. Data from both the ECU and the sensors were stored on a portable activity measurement system (PAMS), which also collected global positioning system (GPS) data and ambient conditions. The experimental procedure was based on SAE J1321. Due to the proprietary nature of the data, specific results of the study are not shown. However, the thesis details the design of experiments, including the selection, installation, benefits, and limitations of using additional sensors to improve data analysis. It also discusses the data storage and the methods used to analyze the considerably large data sets obtained in the study. For example, while ~4.5 million data points were collected for each vehicle and each month of testing, more than 55% of the data points were discarded due to idling, engine cutoff during downhill operation, and adverse weather conditions. With respect to data analysis, principal component analysis (PCA) identified the variables that caused the most variability in the datasets. PCA and data binning were used to compare datasets and determine the differences between them.
The results show that the route with the most interstate data supplied the highest number of usable data points. Moreover, the ECU fuel consumption was consistent with the flow meter data, with an average percent error of 2.5%. Measuring the engine torque with a torque meter can be difficult in on-road testing because of the excessive vibration experienced by the sensor.
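The abstract reports an average percent error between ECU and flow-meter fuel readings but does not state the formula. One plausible reading, sketched below, is the mean absolute percent error of the ECU values relative to the flow-meter values, skipping samples where the meter reads zero (for example idling or fuel cutoff). The function name and the sample numbers are assumptions made for illustration, not data from the thesis.

```python
def mean_percent_error(ecu, meter):
    """Mean absolute percent error of ECU readings relative to flow-meter readings.

    Hypothetical helper: pairs whose meter reading is not positive are skipped,
    since a relative error is undefined there.
    """
    pairs = [(e, m) for e, m in zip(ecu, meter) if m > 0]
    if not pairs:
        raise ValueError("no usable samples")
    return 100.0 * sum(abs(e - m) / m for e, m in pairs) / len(pairs)


# Made-up readings in arbitrary fuel-rate units; the third pair is dropped.
ecu_rate = [10.2, 19.5, 0.1]
meter_rate = [10.0, 20.0, 0.0]
error_pct = mean_percent_error(ecu_rate, meter_rate)
```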

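    Decision support framework for the stock market based on deep reinforcement learning (original title: Framework de apoio à tomada de decisão no mercado de ações baseado em aprendizado por reforço profundo)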

    Master's dissertation, Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Mecânica, 2021. In stock markets, investors adopt different strategies to identify a sequence of investment decisions that maximizes their profits. To support investors' decisions, machine learning (ML) software is being applied. In particular, deep learning (DL) approaches are attractive because the stock market behaves in a highly nonlinear way and DL techniques can track both short-term and long-term variations. In contrast to supervised ML techniques, deep reinforcement learning (DRL) combines the benefits of DL with real-time adaptation and improvement of the machine learning model.
In this work, we propose a decision support framework for the stock market based on DRL. By learning the trading rules, our framework recognizes patterns, maximizes the profit obtained and provides recommendations to investors. The proposed DRL framework outperforms the state of the art, with an F1 score of 86% for buy operations and 88% for sell operations when evaluating the positioning strategy.

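    Classification of Tuberculosis DNA Based on K-mers Using Support Vector Machine (SVM) and Variable Neighborhood Search (VNS) (original title: Klasifikasi DNA Tuberkulosis Berdasarkan K-Mer Menggunakan Support Vector Machine (SVM) Dan Variable Neighborhood Search (VNS))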

    Tuberculosis is a disease caused by Mycobacterium tuberculosis and is one of the top 10 causes of death worldwide. More accurate detection is therefore needed so that appropriate treatment can be given. Detection errors sometimes occur because the disease resembles other lung diseases. This study applies machine learning algorithms to detect tuberculosis using DNA data, since all organisms have a DNA structure. The method used is a support vector machine (SVM) optimized with variable neighborhood search (VNS). The SVM is used for classification and VNS is used to optimize the SVM parameters. SVM was chosen because it generalizes well. Before being used as input to the SVM, the DNA data must be preprocessed: k-mers are used to extract DNA substrings, which are then converted to numeric data, and dimensionality reduction is applied because the data have many features. SVM performance depends on choosing the right parameters, so it is optimized with VNS; the VNS used is a modified variant, nested RVNS. The best k-mer length in this study is k = 5. The final results after optimization are accuracy = 0.995708, precision = 0.995765, recall = 0.995708, F-measure = 0.995557, and MCC = 0.992659. This accuracy is better than the value before optimization, 0.927039. Nested RVNS runs 2.5 times faster than basic VNS when searching for the optimal SVM parameters.
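The k-mer preprocessing step mentioned in this abstract (extracting DNA substrings of length k and turning them into numeric features) can be sketched as a count vector over all possible k-mers. This is a minimal illustration under assumptions: the abstract does not say whether counts, frequencies or another encoding were used, and the function name `kmer_counts` is hypothetical.

```python
from itertools import product


def kmer_counts(sequence, k):
    """Count overlapping k-mers of a DNA sequence over the alphabet ACGT.

    Returns (vocab, counts), where vocab lists all 4**k k-mers in a fixed
    order and counts[i] is the number of occurrences of vocab[i].
    """
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(vocab)}
    counts = [0] * len(vocab)
    seq = sequence.upper()
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        # Windows containing ambiguous bases such as N are skipped.
        if kmer in index:
            counts[index[kmer]] += 1
    return vocab, counts


# Toy sequence; the windows are AC, CG, GT, TA, AC.
vocab, counts = kmer_counts("ACGTAC", 2)
```

With k = 5, the value the abstract reports as best, this encoding yields 4^5 = 1024 features per sequence, which is consistent with the abstract's remark that dimensionality reduction was needed.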

    Symmetry-Adapted Machine Learning for Information Security

    Symmetry-adapted machine learning has shown encouraging ability to mitigate the security risks in information and communication technology (ICT) systems. It is a subset of artificial intelligence (AI) that relies on the principles of processing future events by learning past events or historical data. The autonomous nature of symmetry-adapted machine learning supports effective data processing and analysis for security detection in ICT systems without the interference of human authorities. Many industries are developing machine-learning-adapted solutions to support security for smart hardware, distributed computing, and the cloud. In our Special Issue book, we focus on the deployment of symmetry-adapted machine learning for information security in various application areas. This security approach can support effective methods to handle the dynamic nature of security attacks by extraction and analysis of data to identify hidden patterns of data. The main topics of this Issue include malware classification, an intrusion detection system, image watermarking, color image watermarking, battlefield target aggregation behavior recognition model, IP camera, Internet of Things (IoT) security, service function chain, indoor positioning system, and crypto-analysis

    Quality analysis of available information for patients with different pathologies on social media video platforms

    The Internet has expanded worldwide to become the most important means of spreading information (Kocyigit et al., 2019). Searching for health information on the Internet has become progressively more common, and people often use the Internet as a primary source of health information (Baker et al., 2021; Kocyigit et al., 2019; Sui et al., 2022). According to Amante et al. (2015), nearly fifty per cent of adults in the United States (US) get health-related information on the Internet. As a popular video sharing site, YouTube is used extensively around the world to watch and share videos (Amante et al., 2015; Baran & Yilmaz Baran, 2021; Lewis et al., 2012; Oydanich et al., 2022; Shun Zhang et al., 2020). Because its video content is free and easily reaches the population, YouTube can be considered an effective resource for obtaining and disseminating health-related information. Consequently, it can also be used as a useful tool for patient education (Chang & Park, 2021; Jessen et al., 2022; Katz & Nandi, 2021; Warren, Wisener, et al., 2021).
However, there are doubts about the quality, reliability and content of the videos (Piskin et al., 2021; Shun Zhang et al., 2020). Given YouTube's philosophy, under which anyone can upload videos without prior check or scrutiny and videos may serve promotional ends, it is necessary to verify the quality, content and accuracy of the information shared in the uploaded videos (McMahon et al., 2022; Yildiz & Toros, 2021). YouTube videos may therefore raise concerns about the risk of providing misleading health-related information (Baran & Yilmaz Baran, 2021; Culha et al., 2021; Dubey et al., 2014; Madathil et al., 2015; Patel et al., 2022). A previous systematic review of eighteen studies found that YouTube can display misleading and conflicting health-related information but can also share high-quality health-related information (Chang & Park, 2021; Oydanich et al., 2022; Yildiz & Toros, 2021). Digital platforms are also a promising way to support physical activity levels and may have provided an alternative for people to maintain their activity while at home (Güloğlu et al., 2022; Kadakia et al., 2022; McDonough et al., 2022; Parker et al., 2021). Therefore, the aim of this doctoral thesis was to assess the quality of YouTube videos that any internet user could access regarding recommended exercises related to topics of major importance for the general population. In this regard, exercises related to the most common types of cancer in women and men (breast cancer and prostate cancer, respectively) were considered, as well as those recommended for periods of home confinement due to the global COVID-19 pandemic.
Article 1. The prolonged immobilization suggested after breast cancer (BC) surgery causes morbidity. Patients search the Internet, especially social networks, for recommended exercises. The aim of this observational study was to assess the quality of YouTube videos, accessible to any patient, about exercises after BC surgery (Rodriguez-Rodriguez, Blanco-Diaz, Lopez-Diaz, de la Fuente-Costa, Duenas, et al., 2021).
Article 2. Prostate cancer (PC) is a major cause of disease and mortality among men. Surgical treatment involving the removal of the prostate may result in temporary or permanent erectile dysfunction (ED) and urinary incontinence (UI), with considerable impact on quality of life (QoL). Pelvic floor muscle training (PFMT) is one of the recommended techniques for the prevention, treatment, and rehabilitation of postoperative complications. The aim of this observational study was to assess the quality of YouTube videos related to exercises after prostatectomy surgery (Rodriguez-Rodriguez, Blanco-Diaz, Lopez-Diaz, de la Fuente-Costa, Sousa-Fraguas, et al., 2021).
Article 3. The world has been experiencing a pandemic caused by COVID-19. Insufficient physical activity can increase the risk of illness. The aim of this study was to evaluate the quality of YouTube videos related to home exercises during lockdown and their adherence to WHO recommendations, after replicating a simple search process that any individual internet user could perform (Rodriguez-Rodriguez et al., 2022).

    Study of the colony-environment relationship in domestic bee populations (Apis mellifera L.) by implementing electronic remote monitoring systems

    Pollination is the main contribution of the domestic bee (Apis mellifera L.) to terrestrial ecosystems, and it is also essential for the success of many crops. Without bees, the viability of many plant species could be seriously compromised. However, bee populations are suffering significant losses and are decreasing due to various factors that are not well identified, although climate change has been proposed as one of them. Understanding how bees respond to new climate scenarios is therefore essential to address this decline, especially in the most sensitive bioclimatic zones, such as the Mediterranean area. It is necessary to obtain as much information as possible on how bees interact with environmental conditions and how they regulate these conditions inside the hive, using the least intrusive methods possible so as to avoid modifying natural conditions and to obtain more realistic data.
With this objective, we have designed a remote monitoring system, which we have called WBee, based on Waspmote technology and designed as a hierarchical model with three levels: a wireless node, a local data server, and a cloud data server. WBee is easily adaptable in terms of the number and type of sensors, the number of hives and their geographical distribution. WBee saves the data at each level in case of communication failures, and the nodes also include backup batteries, which makes it possible to continue collecting data in the event of a power outage. Currently the system is equipped with sensors that monitor the temperature and relative humidity of the colony at three different points, as well as the weight of the hive. All the data collected can be consulted in real time over the internet. Once the system was implemented, we studied, based on the data obtained, the relationship of bees with the environment in three situations. In the first, we monitored the three variables (weight, temperature and relative humidity) over a month in 20 hives, coinciding with a commercial sunflower flowering. This allowed us to understand the evolution of the colonies during a flowering period, to record the production of honey in the hives and to estimate the optimal moment for its extraction, in addition to verifying the correct functioning of the WBee system. In the second, we evaluated the influence of episodes of extreme temperatures on the hives during the flowering period in the 2016 and 2017 beekeeping seasons. In this trial we used the changes in hive weight as an indicator of the evolution of the colonies, complemented with exhaustive assessments at three critical moments (beginning, middle and end) of the flowering period, determining the population of adult bees, brood, and pollen and honey reserves.
The results showed that flowering was shortened by three weeks in 2017 compared to 2016, as adverse conditions significantly affected the normal evolution of bee populations and of pollen and honey reserves, increasing the nutritional stress of the bees. This also affected the pollen spectrum and the commercial characteristics of the honey. In the third, the weight, humidity and temperature data of 10 hives of Iberian bees were recorded during the same two full years. These data were used to identify climatic factors that potentially affect internal regulatory behavior in the hives and hive weight. A categorical principal components analysis (CATPCA) was carried out on these data to determine the minimum number of factors capable of explaining the maximum percentage of the variability recorded in the data. Next, a categorical regression (CATREG) was used to select the factors that were linearly related to hive internal humidity, temperature and weight, in order to derive predictive regression equations for Iberian bees. The results obtained, especially those related to relative humidity, contrast with those previously published in other studies of bees in central and northern Europe, and can help to plan more efficient beekeeping and to understand the effect of climate change on bees. Finally, the results do not concern bees alone, since the system can be a useful tool for studying what happens in the environment, using bee colonies as bioindicators.