Predictive geospatial analytics using principal component regression
With geospatial data growing exponentially worldwide, geospatial data analytics deserves attention as a way to handle voluminous geodata that arrives in various forms and at high velocity. Dimensionality reduction also plays a key role for high-dimensional big data sets, including spatial data sets, which keep growing not only in observations but also in features or dimensions. In this paper, predictive analytics on geospatial big data is implemented on a distributed, parallel big data processing platform using Principal Component Regression (PCR), that is, a traditional Multiple Linear Regression (MLR) model improved with Principal Component Analysis (PCA). The main objective of the system is to improve the predictive power of the MLR model by combining it with PCA, which removes insignificant and irrelevant variables or dimensions from the model. The paper also shows how data mining and machine learning approaches can be used efficiently in predictive geospatial data analytics. For experimentation, OpenStreetMap (OSM) data is used to develop a one-way road prediction model for the city of Yangon, Myanmar. Experimental results show that the hybrid approach of PCA and MLR can be used efficiently not only for road prediction with OSM data but also to improve the traditional MLR model
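The PCA-then-MLR idea described in this abstract can be sketched as follows. This is a minimal single-machine illustration using NumPy, not the paper's distributed implementation; the function names `fit_pcr` and `predict_pcr`, the synthetic data, and the choice of four retained components are all assumptions made for the example.

```python
import numpy as np

def fit_pcr(X, y, n_components):
    """Principal component regression: PCA on X, then least squares on the scores."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are principal directions, ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T              # shape (n_features, n_components)
    scores = Xc @ V                      # data projected onto retained components
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V @ gamma                     # coefficients in the original feature space
    intercept = y_mean - x_mean @ beta
    return beta, intercept

def predict_pcr(X, beta, intercept):
    return X @ beta + intercept

# Synthetic data (hypothetical): feature 4 is almost a copy of feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 4] = X[:, 0] + 0.01 * rng.normal(size=200)
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=200)

# Keep 4 of 5 components, dropping the lowest-variance (redundant) direction.
beta, intercept = fit_pcr(X, y, n_components=4)
pred = predict_pcr(X, beta, intercept)
```

Keeping all components reproduces ordinary least squares, so the number of retained components is the only knob that distinguishes PCR from plain MLR in this sketch.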
Regulation of Microclimatic Conditions inside Native Beehives and Its Relationship with Climate in Southern Spain
In this study, the Wbee Sensor System was used to record data from 10 Iberian beehives for two years in southern Spain. These data were used to identify potential conditioning climatic factors of the internal regulatory behavior of the hive and its weight. Categorical principal components analysis (CATPCA) was used to determine the minimum number of those factors able to capture the maximum percentage of variability in the data recorded. Then, categorical regression (CATREG) was used to select the factors that were linearly related to hive internal humidity, temperature and weight in order to derive predictive regression equations for Iberian bees. Average relative humidity was 51.7% ± 10.4 in the brood nest and 54.2% ± 11.7 in the food area, while average temperatures were 34.3 °C ± 1.5 in the brood nest and 29.9 °C ± 5.8 in the food area. Average beehive weight was 38.2 kg ± 13.6. Some of our data, especially those related to humidity, contrast with previously published results from studies of bees in central and northern Europe. In conclusion, certain combinations of climatic factors may condition within-hive humidity, temperature and hive weight. Southern Iberian honeybees’ capacity to regulate brood nest humidity could be lower than their capacity to regulate brood nest temperature, which stays close to 34 °C even in dry conditions
Design of experiment and analysis techniques for fuel consumption data using heavy-duty diesel vehicles and on-road testing
Chassis dynamometer and on-road testing are usually employed to test vehicle operation. Testing on a chassis dynamometer reduces data variability compared to on-road testing because of the controlled environment, but it does not account for other important variables that affect real-world vehicle operation. This study used on-road testing to investigate the differences between two test fuels under real-world conditions. Three heavy-duty diesel vehicles were driven on different routes for a period of three months. Each vehicle was instrumented with flow meters to gather fuel consumption data, which was then compared to the fuel rate broadcast by the engine control unit (ECU). Additionally, the driveshaft torque was measured using a strain gage and a torque transmitter, and this measurement was used to confirm that the output torque was correlated with the vehicle’s fuel consumption. Data from both the ECU and the sensors were stored on a portable activity measurement system (PAMS), which also collected global positioning system (GPS) data and ambient conditions. The experimental procedure was based on SAE J1321. Due to the proprietary nature of the data, specific results of the study are not shown. However, the thesis details the design of experiments, including the selection, installation, benefits, and limitations of using additional sensors to improve data analysis. It also discusses the data storage and the methods used for data analysis with the considerably large data sets obtained in the study. For example, while ~4.5 million data points were collected for each vehicle and each month of testing, more than 55% of the data points were discarded due to idling, engine cutoff during downhill operation, and adverse weather conditions. With respect to data analysis, principal component analysis (PCA) identified the variables that caused the most variability in the datasets. PCA and data binning were used to compare datasets and determine the differences between them. 
The results show that the route with the most interstate data supplied the highest number of usable data points. Moreover, the ECU fuel consumption was consistent with the flow meter data with an average percent error of 2.5%. Measuring the engine torque using a torque meter can be difficult for on-road testing due to the excessive vibration experienced by the sensor
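The discard-then-compare step described above can be illustrated with a short sketch. The thesis does not state its exact filtering rules or error formula, so the `Sample` fields, the speed and fuel thresholds, and the use of mean absolute percent error relative to the flow meter are all assumptions; the sample values are toy numbers, not study data.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    speed_kmh: float        # vehicle speed
    ecu_fuel_lph: float     # fuel rate broadcast by the ECU (litres per hour)
    meter_fuel_lph: float   # fuel rate measured by the flow meter

def usable(s, min_speed_kmh=1.0, min_fuel_lph=0.1):
    """Hypothetical filter: drop idling (near-zero speed) and fuel-cutoff samples."""
    return s.speed_kmh >= min_speed_kmh and s.meter_fuel_lph >= min_fuel_lph

def mean_percent_error(samples):
    """Mean absolute percent error of the ECU rate, with the flow meter as reference."""
    errors = [abs(s.ecu_fuel_lph - s.meter_fuel_lph) / s.meter_fuel_lph * 100.0
              for s in samples]
    return sum(errors) / len(errors)

# Toy values for illustration only.
raw = [
    Sample(0.0, 2.0, 2.1),     # idling, discarded
    Sample(80.0, 0.0, 0.0),    # downhill fuel cutoff, discarded
    Sample(90.0, 21.0, 20.0),  # kept
    Sample(85.0, 19.0, 20.0),  # kept
]
kept = [s for s in raw if usable(s)]
mpe = mean_percent_error(kept)
```

Filtering before computing the error matters here because fuel-cutoff samples have a reference value near zero, which would otherwise inflate or break the percent error.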
Decision support framework for the stock market based on deep reinforcement learning
Dissertation (master's), Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Mecânica, 2021. In stock markets, investors adopt different strategies to identify a sequence of
profitable investment decisions to maximize their profits. To support the decisions of investors,
machine learning (ML) software is being applied. In particular, deep learning (DL)
approaches are attractive since the stock market exhibits highly nonlinear
behavior, and since DL techniques can track short-term and long-term variations.
In contrast to supervised ML techniques, deep reinforcement learning (DRL) gathers DL’s
benefits and adds real-time adaptation and improvement of the machine learning
model. In this work, we propose a decision support framework for the stock market
based on DRL. By learning the trading rules, our framework recognizes patterns,
maximizes the profit obtained and provides recommendations to investors. The
proposed DRL framework outperforms the state-of-the-art framework with an F1 score of 86 %
for buy operations and 88 % for sell operations in terms of evaluating the
positioning strategy
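For readers unfamiliar with the metric, a per-class F1 score such as the ones quoted for buy and sell operations can be computed as sketched below. The label names, the toy label sequences, and the helper `f1_for_class` are assumptions for illustration; they are not taken from the dissertation.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score for one class, treating `label` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2.0 * precision * recall / (precision + recall)

# Toy trading labels for illustration only.
y_true = ["buy", "buy", "sell", "hold", "sell", "buy"]
y_pred = ["buy", "sell", "sell", "hold", "sell", "buy"]

f1_buy = f1_for_class(y_true, y_pred, "buy")
f1_sell = f1_for_class(y_true, y_pred, "sell")
```

Reporting the score separately per operation, as the abstract does, keeps a strong result on one class from masking a weak result on the other.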
Tuberculosis DNA Classification Based on K-Mer Using Support Vector Machine (SVM) and Variable Neighborhood Search (VNS)
Tuberculosis is a disease caused by Mycobacterium
tuberculosis and is one of the 10 leading causes of death
worldwide. More accurate detection is therefore needed so that
appropriate treatment can be given. Detection errors sometimes
occur because the disease resembles other lung diseases. This study
applies machine learning algorithms to detect
tuberculosis using DNA data, since all organisms have
a DNA structure. The method used is a support vector machine (SVM)
optimized with variable neighborhood search (VNS). SVM is used for
classification and VNS is used to optimize the SVM parameters. SVM was chosen
because it generalizes well. Before being used as
input to the SVM, the DNA data must be preprocessed with
k-mers to extract DNA substrings, which are then
converted into numerical data and subjected to dimensionality reduction
because of the large number of features. SVM performance depends on choosing
the right parameters, so the parameters are optimized with VNS; the VNS
used is a modified variant, namely nested RVNS. The best k-mer
in this study has k = 5. The final results after optimization are
accuracy = 0.995708, precision = 0.995765, recall = 0.995708, F-measure = 0.995557,
and MCC = 0.992659. This accuracy is better than the accuracy before optimization,
which was 0.927039. Nested RVNS runs 2.5 times
faster than basic VNS in finding the optimal SVM parameters
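The k-mer preprocessing step can be sketched as follows. The abstract does not say how substrings are turned into numbers, so the normalized-frequency encoding, the fixed A/C/G/T alphabet, and the function name `kmer_vector` are assumptions made for this example.

```python
from collections import Counter
from itertools import product

def kmer_vector(seq, k):
    """Normalized counts of every k-mer over A, C, G, T, in lexicographic order."""
    seq = seq.upper()
    # Slide a window of width k over the sequence and count each substring.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in vocab)   # ignores k-mers with other symbols
    return [counts[m] / total if total else 0.0 for m in vocab]

# Toy sequence for illustration only: 2-mers are AC, CG, GT, TA, AC.
vec = kmer_vector("ACGTAC", 2)
```

With k = 5 this encoding yields 4^5 = 1024 features per sequence, which is consistent with the abstract's remark that dimensionality reduction is needed before classification.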
Symmetry-Adapted Machine Learning for Information Security
Symmetry-adapted machine learning has shown encouraging ability to mitigate the security risks in information and communication technology (ICT) systems. It is a subset of artificial intelligence (AI) that relies on the principles of processing future events by learning past events or historical data. The autonomous nature of symmetry-adapted machine learning supports effective data processing and analysis for security detection in ICT systems without the interference of human authorities. Many industries are developing machine-learning-adapted solutions to support security for smart hardware, distributed computing, and the cloud. In our Special Issue book, we focus on the deployment of symmetry-adapted machine learning for information security in various application areas. This security approach can support effective methods to handle the dynamic nature of security attacks by extraction and analysis of data to identify hidden patterns of data. The main topics of this Issue include malware classification, an intrusion detection system, image watermarking, color image watermarking, battlefield target aggregation behavior recognition model, IP camera, Internet of Things (IoT) security, service function chain, indoor positioning system, and crypto-analysis
Quality analysis of available information for patients with different pathologies on social media video platforms
The Internet has expanded worldwide to become the most important means of spreading
information in the world (Kocyigit et al., 2019). Searching and finding health information
using the Internet has become progressively common while people often use the Internet as
a source of health information (Baker et al., 2021; Kocyigit et al., 2019; Sui et al., 2022).
According to Amante et al. (2015), nearly fifty per cent of adults in the United States (US) get
health-related information on the Internet (Amante et al., 2015). As a popular video sharing
site, YouTube is extensively used all around the world for users to watch and share videos
(Amante et al., 2015; Baran & Yilmaz Baran, 2021; Lewis et al., 2012; Oydanich et al., 2022; Shun
Zhang et al., 2020). Due to the free content of its videos and its ease of reach to the population,
YouTube can be considered an effective resource for obtaining and disseminating health-related
information. Consequently, it can also be used as a useful tool for patient education
(Chang & Park, 2021; Jessen et al., 2022; Katz & Nandi, 2021; Warren, Wisener, et al., 2021).
However, there are doubts about the quality, reliability and content of the videos (Piskin et
al., 2021; Shun Zhang et al., 2020). Especially in view of YouTube’s philosophy, where anyone
can upload videos without prior check or scrutiny, and which could be used for promotional
ends, it is necessary to verify the quality, content and accuracy of the information shared in
the uploaded videos (McMahon et al., 2022; Yildiz & Toros, 2021). This means that YouTube
videos may raise concerns about the risks of providing misleading health-related
information in videos available on this platform (Baran & Yilmaz Baran, 2021; Culha et al.,
2021; Dubey et al., 2014; Madathil et al., 2015; Patel et al., 2022).
A previous systematic review investigating eighteen pieces of research found that YouTube
can display misleading and conflicting health-related information, but, on the other hand, it
can also share high quality health-related information (Chang & Park, 2021; Oydanich et al.,
2022; Yildiz & Toros, 2021). So, digital platforms are a promising way to support physical
activity levels and may have provided an alternative for people to maintain their activity while
at home (Güloğlu et al., 2022; Kadakia et al., 2022; McDonough et al., 2022; Parker et al., 2021).
Therefore, the aim of this study was to assess the quality of YouTube videos that any internet
user could access regarding recommended exercises related to topics of major
importance for the general population. In this regard, exercises related to the most common
types of cancer in women and men (breast cancer and prostate cancer, respectively) have
been considered, as well as those recommended for periods of confinement due to the global
pandemic caused by CoVid-19.
Article 1
The prolonged immobilization suggested after breast cancer (BC) surgery causes morbidity.
Patients search the Internet, especially social networks, for recommended exercises. The aim
of this observational study was to assess the quality of YouTube videos, accessible for any
patient, about exercises after BC surgery (Rodriguez-Rodriguez, Blanco-Diaz, Lopez-Diaz,
de la Fuente-Costa, Duenas, et al., 2021).
Article 2
Prostate cancer (PC) is a major cause of disease and mortality among men. Surgical treatment
involving the removal of the prostate may result in temporary or permanent erectile
dysfunction (ED) and urinary incontinence (UI), with considerable impact on quality of life
(QoL). Pelvic floor muscle training (PFMT) is one of the recommended techniques for the
prevention, treatment, and rehabilitation of postoperative complications. The aim of this
observational study was to assess the quality of YouTube videos related to exercises after
prostatectomy surgery (Rodriguez-Rodriguez, Blanco-Diaz, Lopez-Diaz, de la Fuente-Costa,
Sousa-Fraguas, et al., 2021).
Article 3
The world has been experiencing a pandemic caused by COVID-19. Insufficient physical
activity can increase the risk of illness. The aim of this study was to evaluate the quality of
YouTube videos related to home exercises during lockdown and their adherence to WHO
recommendations after replicating a simple search process that could be performed by any
individual internet user (Rodriguez-Rodriguez et al., 2022)
Study of the colony-environment relationship in domestic bee populations (Apis mellifera L.) by implementing electronic remote monitoring systems
Pollination is the main contribution of the domestic bee (Apis mellifera L.) to terrestrial ecosystems, and it is also essential for the success of many crops. Without bees, the viability of many plant species could be seriously compromised. However, bee populations are suffering significant losses and are decreasing due to various factors that are not well identified, although climate change has been proposed as one of them. Therefore, understanding how bees respond to new climate scenarios is essential to address this decline, especially in sensitive bioclimatic zones such as the Mediterranean area. To that end, it is necessary to obtain as much information as possible on how bees interact with environmental conditions and how they are able to regulate these conditions inside the hive, using the least intrusive methods possible so as to avoid modifying natural conditions and to obtain more realistic data. 
With this objective, we have designed a remote monitoring system, which we have called WBee, based on Waspmote technology and designed as a hierarchical model with three levels: a wireless node, a local data server, and a cloud data server. WBee is easily adaptable in terms of the number and type of sensors, the number of hives and their geographical distribution. WBee stores the data at each level in case communication fails, and the nodes also have backup batteries, which makes it possible to continue collecting data in the event of a power outage. Currently the system is equipped with sensors that allow it to monitor the temperature and relative humidity of the colony at three different points, as well as the weight of the hive. All the data collected can be consulted in real time over the internet. Once the system was implemented, we studied, based on the data obtained, the relationship of bees with the environment in three situations. In the first, we monitored the three variables (weight, temperature and relative humidity) over a month in 20 hives, coinciding with a commercial sunflower flowering. This allowed us to understand the evolution of the colonies during a flowering period, to record the production of honey in the hives and to estimate the optimal moment for its extraction, in addition to verifying the correct functioning of the WBee system. In the second, we evaluated the influence of episodes of extreme temperatures on the hives during the flowering period in the 2016 and 2017 beekeeping seasons. In this study we used the changes in hive weight as an indicator of the evolution of the colonies, and we complemented it with exhaustive assessments at three critical moments (beginning, middle and end) of the flowering period, determining the population of adult bees, brood, and pollen and honey reserves. 
The results showed that flowering was reduced by three weeks in 2017 compared to 2016, since the normal evolution of bee populations and of pollen and honey reserves was significantly affected by adverse conditions, increasing the nutritional stress of the bees. This also affected the pollen spectrum and the commercial characteristics of the honey. In the third, the weight, humidity and temperature data of 10 hives of Iberian bees were recorded during the same two full years. These data were used to identify climatic factors that potentially affect the internal regulatory behavior of the hives and their weight. A categorical principal components analysis (CATPCA) was carried out on these data to determine the minimum number of factors capable of explaining the maximum percentage of the variability recorded in the data. Next, a categorical regression (CATREG) was used to select the factors that were linearly related to hive internal humidity, temperature and weight in order to derive predictive regression equations for Iberian bees. The results obtained, especially those related to relative humidity, contrast with those previously published in other studies of bees in central and northern Europe, and can help to plan more efficient beekeeping, as well as to understand the effect of climate change on bees. Finally, the results do not only concern bees, since the system can be a useful tool to study what happens in the environment, using bee colonies as bioindicators