11,074 research outputs found

    Qluster: An easy-to-implement generic workflow for robust clustering of health data

    The exploration of health data by clustering algorithms makes it possible to better describe populations of interest by seeking the sub-profiles that compose them. This reinforces medical knowledge, whether about a disease or a targeted population in real life. Nevertheless, in contrast to conventional biostatistical methods, for which numerous guidelines exist, the standardization of data science approaches in clinical research remains little discussed. This results in significant variability in the execution of data science projects, both in the algorithms used and in the reliability and credibility of the designed approach. Taking the path of parsimonious and judicious choice of both algorithms and implementations at each stage, this article proposes Qluster, a practical workflow for performing clustering tasks. This workflow strikes a compromise between (1) genericity of applications (e.g. usable on small or big data, on continuous, categorical or mixed variables, on databases of high dimensionality or not), (2) ease of implementation (need for few packages, few algorithms, few parameters, ...), and (3) robustness (e.g. use of proven algorithms and robust packages, evaluation of the stability of clusters, management of noise and multicollinearity). The workflow can be easily automated and/or routinely applied to a wide range of clustering projects. It can be useful both for data scientists with little experience in the field, to make data clustering easier and more robust, and for more experienced data scientists who are looking for a straightforward and reliable solution to routinely perform preliminary data mining. A synthesis of the literature on data clustering, as well as the scientific rationale supporting the proposed workflow, is also provided. Finally, a detailed application of the workflow to a concrete use case is provided, along with a practical discussion for data scientists. An implementation on the Dataiku platform is available upon request from the authors.

    What do new performance metrics, VeDBA and Dynamic yaw, tell us about energy-intensive activities in whale sharks?

    During oscillatory dives, whale sharks (Rhincodon typus) expend varying levels of energy in active ascent and passive descent. They are expected to minimise movement costs by travelling at optimum speed unless they have reason to move faster, for example during feeding or evasion of danger. A proxy for power, dynamic body acceleration (DBA), has previously been used to identify whale shark movement patterns but has not yet been used to identify occasions where power is elevated above minimum requirements. 59 hours of biologging data from 13 juvenile whale sharks (Ningaloo Reef, Western Australia), including depth, body pitch angle, magnetometry and DBA, were analysed to investigate minimum power requirements for dives and identify events of elevated power. Dynamic yaw (the rate of change of heading), a new proxy for power, was introduced to determine its effectiveness compared to the already-established DBA. The relationship between pitch angle and these two proxies was investigated to determine which had the stronger relationship. Dynamic yaw produced a poor relationship with pitch angle compared to DBA, and thus DBA was selected as the focus proxy for the remainder of the study. DBA was used to produce a minimum power trend versus body pitch angle using a convex hull analysis, which allowed the identification of proxy for power utilisation above the minimum (PAM). 16 instances of PAM were identified in 59 hours of data, all of which could be considered instances where energy minimisation is not prioritised, such as feeding or avoidance. The PAM method was capable of identifying instances where energy minimisation is not prioritised, and therefore has future implications for investigations of location-specific behaviours in relation to feeding and anthropogenic disturbance.
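The two proxies named in this abstract can be illustrated with a minimal Python sketch. It assumes the common definition of VeDBA (the vector norm of raw acceleration minus a running-mean "static" component) and treats dynamic yaw as the absolute rate of change of heading, as the abstract describes it. The function names, the window length and the sampling interval are hypothetical choices for illustration, not the study's actual processing pipeline.

```python
import math


def dynamic_yaw(headings_deg, dt_s):
    """Absolute rate of change of heading, in degrees per second.

    Each difference is wrapped into [-180, 180) so that a turn across
    north (e.g. 350 -> 10 degrees) counts as 20 degrees, not 340.
    """
    rates = []
    for prev, cur in zip(headings_deg, headings_deg[1:]):
        turn = (cur - prev + 180.0) % 360.0 - 180.0
        rates.append(abs(turn) / dt_s)
    return rates


def running_mean(values, half_window):
    """Centred running mean; the window is truncated at both ends."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def vedba(ax, ay, az, half_window):
    """Vectorial DBA: norm of raw acceleration minus its running mean.

    half_window is an assumed smoothing parameter (in samples).
    """
    static = [running_mean(axis, half_window) for axis in (ax, ay, az)]
    return [
        math.sqrt((x - sx) ** 2 + (y - sy) ** 2 + (z - sz) ** 2)
        for x, y, z, sx, sy, sz in zip(ax, ay, az, *static)
    ]
```

A constant acceleration signal yields a VeDBA of zero at every sample, since the running mean removes the whole signal; only deviations from the smoothed posture component contribute.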

    Countermeasures for the majority attack in blockchain distributed systems

    Blockchain technology is considered one of the most important computing paradigms since the Internet, owing to unique characteristics that make it ideal for recording, verifying and managing information about different transactions. Despite this, Blockchain faces several security problems, and the 51% attack, or majority attack, is among the most important. In this attack, one or more miners take control of at least 51% of the mined hash or of the computing power in a network, so that a miner can arbitrarily manipulate and modify the information recorded with this technology. This work focused on designing and implementing strategies for detecting and mitigating majority attacks (51% attacks) in a Blockchain distributed system, based on characterizing the behaviour of miners. To achieve this, the hash rate / share of Bitcoin and Crypto Ethereum miners was analysed and evaluated, followed by the design and implementation of a consensus protocol to control the computing power of miners. Subsequently, Machine Learning models were explored and evaluated for detecting malicious software of the Cryptojacking type. Doctorate: Doctor en Ingeniería de Sistemas y Computació
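As a rough illustration of the hash-share analysis mentioned above, the following Python sketch computes each miner's share of the total hash rate and flags any miner at or above a threshold. The function names, the dictionary input format and the 0.51 default are assumptions made for this example. It is not the detection strategy or the consensus protocol developed in the thesis.

```python
def hash_shares(hash_rates):
    """Map each miner (or pool) to its fraction of the total hash rate."""
    total = sum(hash_rates.values())
    if total <= 0:
        raise ValueError("total hash rate must be positive")
    return {miner: rate / total for miner, rate in hash_rates.items()}


def majority_risk(hash_rates, threshold=0.51):
    """Return the miners whose share meets or exceeds the threshold."""
    shares = hash_shares(hash_rates)
    return sorted(m for m, s in shares.items() if s >= threshold)
```

For example, with hypothetical hash rates `{"a": 60.0, "b": 30.0, "c": 10.0}`, miner `a` holds a 0.6 share and is the only one flagged. A fuller check would also consider coalitions of miners, since the attack can involve more than one party.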

    Learning Spiking Neural Systems with the Event-Driven Forward-Forward Process

    We develop a novel credit assignment algorithm for information processing with spiking neurons without requiring feedback synapses. Specifically, we propose an event-driven generalization of the forward-forward and the predictive forward-forward learning processes for a spiking neural system that iteratively processes sensory input over a stimulus window. As a result, the recurrent circuit computes the membrane potential of each neuron in each layer as a function of local bottom-up, top-down, and lateral signals, facilitating a dynamic, layer-wise parallel form of neural computation. Unlike spiking neural coding, which relies on feedback synapses to adjust neural electrical activity, our model operates purely online and forward in time, offering a promising way to learn distributed representations of sensory data patterns with temporal spike signals. Notably, our experimental results on several pattern datasets demonstrate that the event-driven forward-forward (ED-FF) framework works well for training a dynamic recurrent spiking system capable of both classification and reconstruction.

    DEIR: Efficient and Robust Exploration through Discriminative-Model-Based Episodic Intrinsic Rewards

    Exploration is a fundamental aspect of reinforcement learning (RL), and its effectiveness crucially decides the performance of RL algorithms, especially when facing sparse extrinsic rewards. Recent studies showed the effectiveness of encouraging exploration with intrinsic rewards estimated from novelty in observations. However, there is a gap between the novelty of an observation and an exploration in general, because the stochasticity in the environment as well as the behavior of an agent may affect the observation. To estimate exploratory behaviors accurately, we propose DEIR, a novel method where we theoretically derive an intrinsic reward from a conditional mutual information term that principally scales with the novelty contributed by agent explorations, and materialize the reward with a discriminative forward model. We conduct extensive experiments in both standard and hardened exploration games in MiniGrid to show that DEIR quickly learns a better policy than baselines. Our evaluations in ProcGen demonstrate both generalization capabilities and the general applicability of our intrinsic reward. Comment: Accepted as a conference paper to the 32nd International Joint Conference on Artificial Intelligence (IJCAI-23).

    A Decision Support System for Economic Viability and Environmental Impact Assessment of Vertical Farms

    Vertical farming (VF) is the practice of growing crops or animals using the vertical dimension via multi-tier racks or vertically inclined surfaces. In this thesis, I focus on the emerging industry of plant-specific VF. Vertical plant farming (VPF) is a promising and relatively novel practice that can be conducted in buildings with environmental control and artificial lighting. However, the nascent sector has experienced challenges in economic viability, standardisation, and environmental sustainability. Practitioners and academics call for a comprehensive financial analysis of VPF, but efforts are stifled by a lack of valid and available data. A review of economic estimation and horticultural software identifies a need for a decision support system (DSS) that facilitates risk-empowered business planning for vertical farmers. This thesis proposes an open-source DSS framework to evaluate business sustainability through financial risk and environmental impact assessments. Data from the literature, alongside lessons learned from industry practitioners, would be centralised in the proposed DSS using imprecise data techniques. These techniques have been applied in engineering but are seldom used in financial forecasting. This could benefit complex sectors which only have scarce data to predict business viability. To begin the execution of the DSS framework, VPF practitioners were interviewed using a mixed-methods approach. Learnings from over 19 shuttered and operational VPF projects provide insights into the barriers inhibiting scalability and identifying risks to form a risk taxonomy. Labour was the most commonly reported top challenge. Therefore, research was conducted to explore lean principles to improve productivity. A probabilistic model representing a spectrum of variables and their associated uncertainty was built according to the DSS framework to evaluate the financial risk for VF projects. 
This enabled flexible computation without precise production or financial data to improve economic estimation accuracy. The model assessed two VPF cases (one in the UK and another in Japan), demonstrating the first risk and uncertainty quantification of VPF business models in the literature. The results highlighted measures to improve economic viability and the viability of the UK and Japan cases. The environmental impact assessment model was developed, allowing VPF operators to evaluate their carbon footprint compared to traditional agriculture using life-cycle assessment. I explore strategies for net-zero carbon production through sensitivity analysis. Renewable energies, especially solar, geothermal, and tidal power, show promise for reducing the carbon emissions of indoor VPF. Results show that renewably-powered VPF can reduce carbon emissions compared to field-based agriculture when considering the land-use change. The drivers for DSS adoption have been researched, showing a pathway of compliance and design thinking to overcome the ‘problem of implementation’ and enable commercialisation. Further work is suggested to standardise VF equipment, collect benchmarking data, and characterise risks. This work will reduce risk and uncertainty and accelerate the sector’s emergence.

    Learning disentangled speech representations

    A variety of informational factors are contained within the speech signal and a single short recording of speech reveals much more than the spoken words. The best method to extract and represent informational factors from the speech signal ultimately depends on which informational factors are desired and how they will be used. In addition, sometimes methods will capture more than one informational factor at the same time such as speaker identity, spoken content, and speaker prosody. The goal of this dissertation is to explore different ways to deconstruct the speech signal into abstract representations that can be learned and later reused in various speech technology tasks. This task of deconstructing, also known as disentanglement, is a form of distributed representation learning. As a general approach to disentanglement, there are some guiding principles that elaborate what a learned representation should contain as well as how it should function. In particular, learned representations should contain all of the requisite information in a more compact manner, be interpretable, remove nuisance factors of irrelevant information, be useful in downstream tasks, and independent of the task at hand. The learned representations should also be able to answer counter-factual questions. In some cases, learned speech representations can be re-assembled in different ways according to the requirements of downstream applications. For example, in a voice conversion task, the speech content is retained while the speaker identity is changed. And in a content-privacy task, some targeted content may be concealed without affecting how surrounding words sound. While there is no single-best method to disentangle all types of factors, some end-to-end approaches demonstrate a promising degree of generalization to diverse speech tasks. 
This thesis explores a variety of use-cases for disentangled representations including phone recognition, speaker diarization, linguistic code-switching, voice conversion, and content-based privacy masking. Speech representations can also be utilised for automatically assessing the quality and authenticity of speech, such as automatic MOS ratings or detecting deep fakes. The meaning of the term "disentanglement" is not well defined in previous work, and it has acquired several meanings depending on the domain (e.g. image vs. speech). Sometimes the term "disentanglement" is used interchangeably with the term "factorization". This thesis proposes that disentanglement of speech is distinct, and offers a viewpoint of disentanglement that can be considered both theoretically and practically

    Multiscale structural optimisation with concurrent coupling between scales

    A robust three-dimensional multiscale topology optimisation framework with concurrent coupling between scales is presented. Concurrent coupling ensures that only the microscale data required to evaluate the macroscale model during each iteration of optimisation is collected and results in considerable computational savings. This represents the principal novelty of the framework and permits a previously intractable number of design variables to be used in the parametrisation of the microscale geometry, which in turn enables accessibility to a greater range of mechanical point properties during optimisation. Additionally, the microscale data collected during optimisation is stored in a re-usable database, further reducing the computational expense of subsequent iterations or entirely new optimisation problems. Application of this methodology enables structures with precise functionally-graded mechanical properties over two-scales to be derived, which satisfy one or multiple functional objectives. For all applications of the framework presented within this thesis, only a small fraction of the microstructure database is required to derive the optimised multiscale solutions, which demonstrates a significant reduction in the computational expense of optimisation in comparison to contemporary sequential frameworks. The derivation and integration of novel additive manufacturing constraints for open-walled microstructures within the concurrently coupled multiscale topology optimisation framework is also presented. Problematic fabrication features are discouraged through the application of an augmented projection filter and two relaxed binary integral constraints, which prohibit the formation of unsupported members, isolated assemblies of overhanging members and slender members during optimisation. 
Through the application of these constraints, it is possible to derive self-supporting, hierarchical structures with varying topology, suitable for fabrication through additive manufacturing processes.

    A Framework to Support Continuous Range Queries over Multi-Attribute Trajectories

    Desarrollo de una herramienta integral de gestión de gases de efecto invernadero para la toma de decisión contra el cambio climático a nivel regional y local en la Comunitat Valenciana

    Thesis by compendium. Currently, regional and local decision-makers lack tools for producing greenhouse gas (GHG) emission inventories with enough scientific and technical rigour, accuracy and completeness to prioritise and invest the available resources efficiently in the measures needed to fight climate change. This thesis therefore presents the development of a territorial and sectoral information system (SITE) to monitor GHG emissions as a tool for local and regional climate governance. SITE combines the advantages of top-down and bottom-up methodological approaches to achieve an innovative hybrid approach for accounting for and managing GHG emissions efficiently. The thesis defines the methodological developments, both general and specific to key sectors of the Intergovernmental Panel on Climate Change (IPCC) (buildings, transport, forestry, etc.); a software development for the server-side part of SITE, referred to as the back-end of the system; and seven implementations as representative case studies at different scales, with the different methodological approaches and applied to different sectors. This is described in six chapters. These implementations at different scales and in different sectors demonstrate the potential of the system as a decision-support tool against climate change at the regional and local level. The implementations in representative pilot cases, both at the regional level in the Comunitat Valenciana and at the local level in large (València) and medium-sized municipalities (Quart de Poblet and Llíria), show the tool's potential for territorial and sectoral adaptation. The methodologies developed for the specific sectors of road traffic, buildings and forestry offer quantifications at a spatial resolution with great capacity to optimise local and regional policies. The tool therefore has great potential for scalability and for continuous improvement through the inclusion of new methodological approaches, the adaptation of methodologies to data availability, specific methodologies for key sectors, and updating to the best available methodologies derived from the research activities of the scientific community.
    Lorenzo Sáez, E. (2022). Desarrollo de una herramienta integral de gestión de gases de efecto invernadero para la toma de decisión contra el cambio climático a nivel regional y local en la Comunitat Valenciana [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/181662