19 research outputs found

    Machine learning and data-driven techniques for verification and synthesis of cyber-physical systems

    Safety and performance are the most important requirements in the design and manufacture of complex life-critical systems. Consider a self-driving car that lacks certain safety functionalities: it can cause fatal accidents, severe injuries, or serious damage to the environment. Hence, rigorous analysis is required to ensure the correctness of functionalities in many safety-critical applications. Model-based approaches for satisfying such requirements have been studied extensively in the literature. Unfortunately, a precise model of the system is not available in many practical scenarios. Hence, in this thesis we focus on data-driven methods and machine learning techniques to tackle this challenge. First, we assume that only an incomplete parameterized model of the system is available. The main goal is to study formal verification of linear time-invariant systems with respect to a fragment of temporal logic specifications when only partial knowledge of the model is available, i.e., a parameterized model of the system is known but the exact values of the parameters are unknown. We provide a probabilistic measure for the satisfaction of the specification by trajectories of the system under the influence of uncertainty. We assume that these specifications are expressed as signal temporal logic formulae and provide an approach that relies on gathering input-output data from the system. We employ Bayesian inference on the collected data to associate a notion of confidence with the satisfaction of the specification. Second, we assume that we have no knowledge of the model of the system and only have access to input-output data. We study verification and synthesis problems for safety specifications over unknown discrete-time stochastic systems. When a model of the system is available, the notion of barrier certificates has been successfully applied to ensure the satisfaction of safety specifications. 
Here, we formulate the computation of barrier certificates as a robust convex program (RCP). Solving the resulting RCP is difficult in general because the model of the system, which appears in one of the constraints of the RCP, is unknown. We propose a data-driven approach that replaces the uncountable number of constraints in the RCP with a finite number of constraints by taking finitely many random samples from the trajectories of the system. We thus replace the original RCP with a scenario convex program (SCP) and show how to relate their optimizers. We guarantee that the solution of the SCP is a solution of the RCP with an a priori guaranteed confidence when the number of samples is larger than a specific value. This provides a lower bound on the safety probability of the original unknown system, together with a controller in the case of synthesis. Lastly, to address the high demand for data in our data-driven barrier-based approach, we propose three remedies. First, a wait-and-judge approach that checks a condition on the optimal value of the SCP using a fixed number of samples, ensuring a lower bound on the probability of satisfying the safety specification with the desired confidence. Second, a repetition-based scenario framework that iteratively solves the SCP with samples, checking feasibility until the desired violation error is achieved; if a safety condition is verified, a lower bound on the probability of safety satisfaction can be computed. Third, a wait, judge, and repeat framework that solves the SCP iteratively until a feasibility condition, based on the computed support constraints, is met; if the safety condition is satisfied, the system is declared safe with a lower bound probability determined using the optimizer of the successful iteration.
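    The sampling step of the scenario approach can be sketched in a few lines. The sample-size formula below is one classical explicit bound from the scenario-optimization literature, N ≥ (2/ε)(ln(1/β) + d) for violation level ε, confidence 1 − β, and d decision variables; the toy robust program and all names are purely illustrative, not the formulation used in the thesis:

```python
import math
import random

def scenario_sample_size(eps, beta, d):
    """One classical explicit scenario bound: with N samples, the scenario
    solution violates the robust constraint with probability at most eps,
    with confidence at least 1 - beta (d = number of decision variables)."""
    return math.ceil((2.0 / eps) * (math.log(1.0 / beta) + d))

def solve_scenario(g, sampler, n):
    """Toy robust program: minimize x subject to x >= g(delta) for ALL delta.
    The uncountable constraint family is replaced by n sampled constraints,
    so the scenario optimizer is simply the max of g over the samples."""
    return max(g(sampler()) for _ in range(n))

random.seed(0)
n = scenario_sample_size(eps=0.05, beta=1e-6, d=1)
x_star = solve_scenario(g=lambda d: d * d,
                        sampler=lambda: random.uniform(-1.0, 1.0), n=n)
print(n, round(x_star, 3))
```

The robust optimum here is 1 (the worst case of d² over [−1, 1]); the scenario solution approaches it from below as the number of samples grows.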

    Convolutional Neural Network in Pattern Recognition

    Since the convolutional neural network (CNN) was first implemented by Yann LeCun et al. in 1989, CNNs and their variants have been widely applied to numerous pattern recognition problems and are considered among the most important techniques in artificial intelligence and computer vision. This dissertation not only demonstrates the implementation aspects of CNNs, but also emphasizes the methodology of neural network (NN) based classifiers. The general pipeline of an NN-based classifier can be divided into three stages: pre-processing, inference by models, and post-processing. To demonstrate the importance of pre-processing techniques, this dissertation presents how to model practical problems in medical pattern recognition and image processing by introducing conceptual abstraction and fuzzification. In particular, a transformer based on the self-attention mechanism, namely the beat-rhythm transformer, benefits greatly from correct R-peak detection results and conceptual fuzzification. The recently proposed self-attention mechanism has proven to be a top performer in computer vision and natural language processing. Despite the accuracy and precision it achieves, it usually consumes substantial computational resources. Therefore, a real-time global attention network is proposed to strike a better trade-off between efficiency and performance for the task of image segmentation. To illustrate the inference stage further, we also propose models to detect polyps via Faster R-CNN, one of the most popular CNN-based 2D detectors, as well as a 3D object detection pipeline, powered by CNNs, for regressing 3D bounding boxes from LiDAR points and stereo image pairs. The goal of the post-processing stage is to refine artifacts inferred by models. 
For the semantic segmentation task, the dilated continuous random field is proposed as a better fit for CNN-based models than the widely implemented fully-connected continuous random field. The proposed approaches can be further integrated into a reinforcement learning architecture for robotics.
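    As a small illustration of the inference stage described above: the core operation of every CNN layer is a sliding-window cross-correlation over the input. The NumPy sketch below is illustrative only and is not code from the dissertation:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation -- the core operation a CNN layer
    applies (deep-learning "convolution" is usually cross-correlation,
    i.e. the kernel is not flipped)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A Sobel-like vertical-edge filter responding to left-to-right intensity steps
img = np.zeros((5, 5))
img[:, 3:] = 1.0  # dark left half, bright right half
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
print(conv2d(img, sobel_x))
```

In a real CNN the kernel entries are learned rather than fixed, and the same windowed operation is applied across many channels in parallel.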

    Enabling Scalable Neurocartography: Images to Graphs for Discovery

    In recent years, advances in technology have enabled researchers to ask new questions predicated on the collection and analysis of big datasets that were previously too large to study. More specifically, many fundamental questions in neuroscience require studying brain tissue at a large scale to discover emergent properties of neural computation, consciousness, and etiologies of brain disorders. A major challenge is to construct larger, more detailed maps (e.g., structural wiring diagrams) of the brain, known as connectomes. Although raw data exist, obstacles remain in both algorithm development and scalable image analysis to enable access to the knowledge within these data volumes. This dissertation develops, combines, and tests state-of-the-art algorithms to estimate graphs and glean other knowledge across six orders of magnitude, from millimeter-scale magnetic resonance imaging to nanometer-scale electron microscopy. This work enables scientific discovery across the community and contributes to the tools and services offered by NeuroData and the Open Connectome Project. Contributions include creating, optimizing, and evaluating the first known fully-automated brain graphs in electron microscopy data and magnetic resonance imaging data; pioneering approaches to generate knowledge from X-ray tomography imaging; and identifying and solving a variety of image analysis challenges associated with building graphs suitable for discovery. These methods were applied across diverse datasets to answer questions at scales not previously explored.
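    As a toy illustration of the images-to-graphs idea (not the dissertation's actual pipeline): once synaptic contacts between segmented neurons have been detected in an image volume, the connectome reduces to a weighted directed graph whose edge weights count the contacts between each neuron pair. All names below are hypothetical:

```python
from collections import defaultdict

def synapses_to_graph(synapses):
    """Collapse a list of detected synaptic contacts, given as
    (pre_neuron_id, post_neuron_id) pairs, into a weighted directed
    connectivity graph: weight = number of synapses from pre onto post."""
    graph = defaultdict(lambda: defaultdict(int))
    for pre, post in synapses:
        graph[pre][post] += 1
    # freeze into plain nested dicts for downstream use
    return {u: dict(v) for u, v in graph.items()}

# e.g. three contacts detected in a segmented electron-microscopy volume
edges = synapses_to_graph([(1, 2), (1, 2), (2, 3)])
print(edges)
```

The hard part in practice is, of course, producing reliable (pre, post) detections from raw imagery at scale; the graph-building step itself is simple bookkeeping.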

    Data-Intensive Computing in Smart Microgrids

    Microgrids have recently emerged as the building block of a smart grid, combining distributed renewable energy sources, energy storage devices, and load management in order to improve power system reliability, enhance sustainable development, and reduce carbon emissions. At the same time, rapid advancements in sensor and metering technologies, wireless and network communication, as well as cloud and fog computing are leading to the collection and accumulation of large amounts of data (e.g., device status data, energy generation data, consumption data). The application of big data analysis techniques (e.g., forecasting, classification, clustering) on such data can optimize the power generation and operation in real time by accurately predicting electricity demands, discovering electricity consumption patterns, and developing dynamic pricing mechanisms. An efficient and intelligent analysis of the data will enable smart microgrids to detect and recover from failures quickly, respond to electricity demand swiftly, supply more reliable and economical energy, and enable customers to have more control over their energy use. Overall, data-intensive analytics can provide effective and efficient decision support for all of the producers, operators, customers, and regulators in smart microgrids, in order to achieve holistic smart energy management, including energy generation, transmission, distribution, and demand-side management. This book contains an assortment of relevant novel research contributions that provide real-world applications of data-intensive analytics in smart grids and contribute to the dissemination of new ideas in this area.
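    As a minimal illustration of the forecasting techniques mentioned above (illustrative only, not drawn from any chapter of the book), here is a least-squares linear-trend predictor for an electricity load series:

```python
import numpy as np

def forecast_demand(history, horizon=1):
    """Least-squares linear-trend forecast of electricity demand --
    one of the simplest instances of the forecasting techniques the
    text mentions. `history` is an hourly (or daily) load series."""
    t = np.arange(len(history))
    slope, intercept = np.polyfit(t, history, deg=1)  # fit load = a*t + b
    future = np.arange(len(history), len(history) + horizon)
    return slope * future + intercept

load = np.array([10.0, 12.0, 14.0, 16.0])  # perfectly linear toy series in kW
print(forecast_demand(load, horizon=2))    # extrapolates the fitted trend
```

Production-grade demand forecasting would account for daily/weekly seasonality, weather covariates, and holidays; a linear trend is only the baseline such models are measured against.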

    A Transparency Index Framework for Machine Learning powered AI in Education

    The increase in the use of AI systems in our daily lives brings calls for more ethical AI development from different sectors, including finance, the judiciary, and, to an increasing extent, education. A number of AI ethics checklists and frameworks have been proposed, focusing on different dimensions of ethical AI such as fairness, explainability, and safety. However, the abstract nature of these existing ethical AI guidelines often makes them difficult to operationalise in real-world contexts. The situation is further complicated by the paucity of work on developing transparent machine learning powered AI systems for the real world. This is particularly true for AI applied in education and training. In this thesis, a Transparency Index Framework is presented as a tool to foreground the importance of transparency and aid the contextualisation of ethical guidance for the education and training sector. The framework was developed in three iterative phases. In phase one, an extensive literature review of real-world AI development pipelines was conducted. In phase two, an AI-powered tool for use in an educational and training setting was developed, after which the initial version of the Transparency Index Framework was prepared. In phase three, a revised version of the framework was co-designed, integrating the learning from phases one and two. The co-design process engaged a range of AI in education stakeholders, including educators, ed-tech experts, and AI practitioners. The Transparency Index Framework presented in this thesis maps the transparency requirements of different categories of AI in education stakeholders, and shows how transparency considerations can be ingrained throughout the AI development process, from initial data collection to deployment in the world, including continuing iterative improvements. 
Transparency is shown to enable the implementation of other ethical AI dimensions, such as interpretability, accountability, and safety. The optimisation of transparency from the perspective of end-users and of the ed-tech companies developing AI systems is discussed, and the importance of conceptualising transparency when developing AI-powered ed-tech products is highlighted. In particular, the potential for transparency to bridge the gap between the machine learning and learning science communities is noted, for example through the use of datasheets, model cards, and factsheets adapted and contextualised for education through a range of stakeholder perspectives, including educators, ed-tech experts, and AI practitioners.

    Laser-Cooled Ion Beams and Strongly Coupled Plasmas for Precision Experiments

    The first part of this thesis summarizes the results of laser cooling of relativistic C3+ ion beams at the ESR/GSI. It is shown that laser cooling at high beam energies is feasible and that momentum spreads much smaller than those observed for electron cooling can be achieved. Results indicate that space-charge-dominated beams have been observed, reaching the regime of strong coupling, which is an essential prerequisite for beam crystallization. Moderate electron cooling was employed to create three-dimensionally cold beams. With the laser-cooled beams it was possible to perform precision VUV spectroscopy of the cooling transition. In the second part, results of large-scale realistic simulations of the stopping of highly charged ions in a laser-cooled one-component plasma of 24Mg+ ions confined in a harmonic potential are presented. It is shown that cooling times short enough for cooling unstable nuclei can be achieved and that fast recooling of the plasma is possible. With this cooling scheme, highly charged ions can be delivered for precision experiments such as mass spectrometry in Penning traps at millikelvin temperatures.

    Modelling of the dielectric properties of soil. Application in the design of sensors for control systems in precision agriculture

    [SPA] Esta tesis doctoral se presenta bajo la modalidad de compendio de publicaciones. El agua es una sustancia clave para el desarrollo de la vida en La Tierra. Es por ello que la búsqueda de oportunidad de vida en otros planetas y satélites se basa en la presencia de agua en los mismos. La gestión ecológica del agua es necesaria para la sostenibilidad de los ecosistemas. Uno de los ecosistemas más amplios y donde el agua juega un papel más importante es el suelo, que alberga multitud de variedades de microorganismos cuya actividad, en parte resultante en la generación de nutrientes para el desarrollo de las especies vegetales, es totalmente dependiente del contenido de agua en el suelo. En zonas áridas y semiáridas, como es el caso de la cuenca Mediterránea, la escasez de agua supone un grave problema a la hora de gestionar los pocos recursos hídricos disponibles. En este caso, donde las condiciones geográficas son idóneas para el desarrollo de la agricultura, las soluciones pasan por una optimización de las técnicas de riego y un mayor control sobre los recursos hídricos. En este sentido, las técnicas de riego deficitario controlado se han mostrado exitosas en la reducción de la dotación hídrica a los cultivos en fases no críticas. Sin embargo, para realizar una aplicación prudente y eficiente de las mismas, resulta necesario monitorizar el estado hídrico de los cultivos, con el objetivo de que éstos no alcancen situaciones de estrés irreversible en términos de producción o estado vegetativo. Los indicadores que mayor información aportan sobre el estado hídrico de la planta suelen estar relacionados con variables medibles a partir de la propia planta, pero que son difícilmente automatizables debido a las operaciones de manejo asociadas. Este es el caso del potencial hídrico de tallo a mediodía medido con cámara de presión, considerado hasta la fecha como el indicador más fiable del estado hídrico de los cultivos en general. 
Es por ello que, para lograr una monitorización continua de esta variable, se busquen otras variables del continuo suelo-planta-atmósfera que puedan estar relacionadas y a partir de las cuales obtener una estimación indirecta. El suelo es la matriz de donde la planta adquiere la mayor parte del agua y los nutrientes que necesita para realizar la fotosíntesis. La relación entre el estado hídrico del suelo y el estado hídrico de los cultivos está más que demostrada. Sin embargo, la precisión alcanzada en los modelos de correlación entre ambos estados requiere de una mejora considerable para hacer un uso realmente fiable de los mismos, y esta mejora no solo pasa por encontrar mejores métodos de correlación, sino también por mejorar la precisión de las medidas obtenidas del suelo. Para monitorizar el estado hídrico del suelo, existen diversas metodologías que ofrecen parámetros medibles como el contenido de agua. El método de medida más extendido para monitorizar el contenido de agua en el suelo es a través del uso de sensores dieléctricos. Sin embargo, la precisión de los mismos está sujeta a diversos factores, entre ellos las características propias del suelo donde se instalan y su coste, relativamente alto para el pequeño y mediano agricultor, condicionando una implantación extensiva de la Agricultura de Precisión y limitando a veces la aplicación de algunos desarrollos únicamente a trabajos de investigación. 
Esta tesis, elaborada bajo la modalidad de compendio de publicaciones, aborda a través de cuatro artículos científicos la propuesta de soluciones accesibles para la medida del estado hídrico del suelo, con especial enfoque en el contenido de agua; explora las limitaciones y retos asociados con la calibración de los sensores dieléctricos de suelo; participa en la generación de nuevos conocimientos y propuestas para un mejor entendimiento del comportamiento del agua en el suelo y de su interacción con las ondas electromagnéticas; y establece nuevos enfoques y modelos que mejoran la predicción del estado hídrico de los cultivos a partir de medidas indirectas y automatizables en suelo y atmósfera. [ENG] This doctoral dissertation has been presented in the form of thesis by publication. Water is a fundamental substance for the development of life on Earth. That is why the search for life on other planets and satellites is based on the presence of water on them. Ecological water management is necessary for the sustainability of ecosystems. One of the most extensive ecosystems where water plays a major role is soil, which hosts a large variety of micro-organisms whose activity, partly resulting in the generation of nutrients for the development of plant species, is totally dependent on the water content of the soil. In arid and semi-arid regions, as is the case in the Mediterranean basin, water scarcity is a serious problem when it comes to managing the few water resources available. In this case, where the geographical conditions are ideal for the development of agriculture, the solutions involve optimization of irrigation techniques and greater control over water resources. In this sense, regulated deficit irrigation strategies have proven to be successful in reducing the water supply to crops in non-critical periods. 
However, in order to apply them prudently and efficiently, it is necessary to monitor the water status of the crops, so that they do not reach irreversible stress situations in terms of yield or vegetative state. The indicators that provide the most information on the water status of the plant are usually related to variables that can be measured from the plant itself, but which are difficult to automate because of the labor- and time-consuming operations involved. This is the case of the midday stem water potential measured with a pressure chamber, considered to date the most reliable indicator of crop water status in general. In order to achieve continuous monitoring of this variable, other variables of the soil-plant-atmosphere continuum that may be related to it are sought, from which an indirect estimate can be obtained. Soil is the matrix from which the plant acquires most of the water and nutrients it needs for photosynthesis. The relationship between soil water status and crop water status is well established. However, the accuracy achieved by the correlation models between the two requires considerable improvement before they can be used reliably, and this improvement involves not only finding better correlation methods, but also improving the accuracy of the measurements obtained from the soil. To monitor soil water status, there are several methodologies that provide measurable parameters such as water content. The most widespread method for monitoring soil water content is the use of dielectric sensors. However, the accuracy of these sensors depends on various factors, including the characteristics of the soil where they are installed, and their cost is relatively high for small and medium-sized farmers, hindering the extensive implementation of precision agriculture and sometimes limiting the application of some developments to research work only. 
This thesis, elaborated as a compendium of publications, addresses through four scientific articles the proposal of affordable solutions for measuring soil water status, with a special focus on water content; explores the limitations and challenges associated with the calibration of soil dielectric sensors; contributes new insights and proposals for a better understanding of the behavior of water in soil and of its interaction with electromagnetic waves; and establishes new approaches and models that improve the prediction of crop water status from indirect, automatable measurements in soil and atmosphere. The thesis comprises a total of four articles:
    Article I. González-Teruel, J.D., Torres-Sánchez, R., Blaya-Ros, P.J., Toledo-Moreo, A.B., Jiménez-Buendía, M., Soto-Valles, F., 2019. Design and Calibration of a Low-Cost SDI-12 Soil Moisture Sensor. Sensors, 19, 491. DOI: 10.3390/s19030491
    Article II. González-Teruel, J.D., Jones, S.B., Soto-Valles, F., Torres-Sánchez, R., Lebron, I., Friedman, S.P., Robinson, D.A., 2020. Dielectric Spectroscopy and Application of Mixing Models Describing Dielectric Dispersion in Clay Minerals and Clayey Soils. Sensors, 20, 6678. DOI: 10.3390/s20226678
    Article III. González-Teruel, J.D., Jones, S.B., Robinson, D.A., Giménez-Gallego, J., Zornoza, R., Torres-Sánchez, R., 2022. Measurement of the broadband complex permittivity of soils in the frequency domain with a low-cost Vector Network Analyzer and an Open-Ended coaxial probe. Computers and Electronics in Agriculture, 195, 106847. DOI: 10.1016/J.COMPAG.2022.106847
    Article IV. González-Teruel, J.D., Ruiz-Abellon, M.C., Blanco, V., Blaya-Ros, P.J., Domingo, R., Torres-Sánchez, R., 2022. Prediction of Water Stress Episodes in Fruit Trees Based on Soil and Weather Time Series Data. Agronomy, 12, 1422. DOI: 10.3390/agronomy12061422
    Escuela Internacional de Doctorado de la Universidad Politécnica de Cartagena. Universidad Politécnica de Cartagena. Programa de Doctorado en Tecnologías Industriales
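    As background to the sensing problem the thesis addresses: a widely used empirical baseline for converting a dielectric sensor's measured apparent relative permittivity into volumetric water content is the polynomial of Topp et al. (1980). The sketch below implements that published formula for illustration only; it is not one of the thesis's own calibration models:

```python
def topp_vwc(eps_r):
    """Topp et al. (1980) empirical polynomial: volumetric water content
    (m^3/m^3) of a mineral soil from its apparent relative permittivity.
    This is the classic baseline against which soil-specific dielectric
    sensor calibrations are commonly compared."""
    return (-5.3e-2
            + 2.92e-2 * eps_r
            - 5.5e-4 * eps_r ** 2
            + 4.3e-6 * eps_r ** 3)

# A permittivity around 20 is typical of a moist mineral soil
print(round(topp_vwc(20.0), 3))  # -> 0.345 m^3/m^3
```

Soil-specific effects (clay mineralogy, salinity, temperature, measurement frequency) are exactly why such a universal calibration is often insufficient, which motivates the dielectric-dispersion modelling pursued in the articles above.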

    Structure determination of a yeast ribosomal protein L30 and pre-mRNA binding site complex by NMR spectroscopy

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Chemistry, 1998. Includes bibliographical references (p. 342-353).
    The yeast (Saccharomyces cerevisiae) ribosomal protein L30 and its auto-regulatory pre-mRNA binding site provide one of the best examples of the critical role of protein-RNA interactions in the regulation of RNA processing and the control of gene translation. A model system for this interaction, which includes the ribosomal L30 protein and the phylogenetically conserved RNA segment for auto-regulation, was studied using nuclear magnetic resonance (NMR) spectroscopy. The L30 protein recognizes and binds tightly to the stem-internal loop-stem RNA, the recognition elements of which lie mostly on the conserved two-plus-five asymmetric purine-rich internal loop. NMR characterizations were carried out on both the free and bound forms of the protein and the RNA. Detailed analyses of the protein revealed that the main architecture, a four-stranded β-sheet sandwiched between four α-helices, is present both in the free and in the bound form. There are, however, substantial local perturbations that accompany RNA binding, the largest of which have been mapped onto the loops connecting Strand A and Helix 2, Strand B and Helix 3, and Helix 4 and Strand D. In contrast to the protein, the internal loop of the RNA undergoes significant changes upon complex formation, the most distinct of which is the formation of the G11-G56 reverse Hoogsteen mismatch pair. Structure modeling using simulated annealing in restrained molecular dynamics was carried out in X-PLOR. Detailed analyses of the complex structure reveal that the protein recognizes the RNA mostly along one side of the internal loop with five purines. The interactions are divided further into two sections. One region consists mostly of aromatic stacking and hydrophobic contacts from Leu25, Phe85, and Val87 of the protein to G56 of the RNA. 
The other region consists mostly of specific contacts, which include the recognition of A57 by Asn48 and of G58 by Arg52. The L30 protein-RNA complex structure thus determined using NMR spectroscopy not only provides detailed insight for understanding the structure-function relationship in yeast auto-regulation, it also further demonstrates the important role of protein-RNA interactions in controlling RNA processing and gene translation.
    by Hongyuan Mao. Ph.D.