37 research outputs found

    Reinforcement learning-based cell selection in sparse mobile crowdsensing

    Sparse Mobile Crowdsensing (MCS) is a novel MCS paradigm in which mobile devices collect sensing data from only a small subset of cells (sub-areas) in the target sensing area, while the data of the remaining cells are intelligently inferred with a quality guarantee. Since collecting sensed data from different cell sets will probably lead to different levels of inference quality, cell selection (i.e., choosing which cells in the target area to collect sensed data from participants) is a critical issue that impacts the total amount of data that needs to be collected (i.e., the data collection cost) to ensure a certain level of data quality. To address this issue, this paper proposes a reinforcement learning-based cell selection algorithm for Sparse MCS. First, we model the key concepts in reinforcement learning, including state, action, and reward, and then propose a Q-learning-based cell selection algorithm. To deal with the large state space, we employ a deep Q-network to learn the Q-function, which helps decide which cell is a better choice in a given state during cell selection. Then, we extend the Q-network to a deep recurrent Q-network with LSTM to capture temporal patterns and handle partial observability. Furthermore, we leverage transfer learning techniques to reduce the dependence on large amounts of training data. Experiments on various real-life sensing datasets verify the effectiveness of the proposed algorithms over state-of-the-art mechanisms in Sparse MCS, reducing the number of sensed cells by up to 20% under the same data inference quality guarantee.
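The Q-learning formulation above can be illustrated with a minimal tabular sketch (the paper itself uses deep and recurrent Q-networks; the cell count, bitmask state encoding, and reward below are illustrative assumptions, not the paper's):

```python
import numpy as np

# Hypothetical toy setting: 9 cells; the agent chooses which cell to sense next.
# The state is the set of already-sensed cells, encoded as a bitmask.
n_cells = 9
n_states = 2 ** n_cells
rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_cells))

alpha, gamma, eps = 0.1, 0.9, 0.2  # learning rate, discount, exploration rate

def reward(state, action):
    # Stand-in reward: bonus for sensing a new cell, penalty for a repeat.
    return 1.0 if not (state >> action) & 1 else -1.0

state = 0
for step in range(5000):
    # Epsilon-greedy action selection
    if rng.random() < eps:
        action = int(rng.integers(n_cells))
    else:
        action = int(np.argmax(Q[state]))
    r = reward(state, action)
    next_state = state | (1 << action)
    # Standard Q-learning update toward the bootstrapped target
    Q[state, action] += alpha * (r + gamma * Q[next_state].max() - Q[state, action])
    # Restart the episode once every cell has been sensed
    state = 0 if next_state == n_states - 1 else next_state
```

The deep Q-network in the paper replaces the table `Q` with a learned function approximator, which is what makes the large state space tractable.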

    Assessment of Freeway Traffic Parameters Leading to Lane-Change Related Collisions

    This study aims at ‘predicting’ the occurrence of lane-change related freeway crashes using traffic surveillance data collected from a pair of dual loop detectors. The approach adopted here involves developing classification models using historical crash data and corresponding information on real-time traffic parameters obtained from loop detectors. The historical crash and loop detector data used to calibrate the neural network models (corresponding to crash and non-crash cases, to set up a binary classification problem) were collected from the Interstate-4 corridor in the Orlando (FL) metropolitan area. Through a careful examination of the crash data, it was concluded that all sideswipe collisions, and the angle crashes that occur on the inner lanes (leftmost and center lanes) of the freeway, may be attributed to lane-changing maneuvers. These crashes are referred to as lane-change related crashes in this study. The factors explored as independent variables include parameters formulated to capture the overall measure of lane-changing and between-lane variations of speed, volume, and occupancy at the station located upstream of crash locations. A classification tree based variable selection procedure showed that average speeds upstream and downstream of the crash location, the difference in occupancy on adjacent lanes, and the standard deviations of volume and speed downstream of the crash location were significantly associated with the binary variable (crash versus non-crash). The classification models based on the data mining approach achieved satisfactory classification accuracy over the validation dataset. The results indicate that these models may be applied to identify real-time traffic conditions prone to lane-change related crashes.
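The tree-based variable screening described above can be sketched as follows; the feature names, the synthetic crash-generating rule, and all data are illustrative stand-ins, not the study's loop-detector measurements:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(6)
n = 1000

# Hypothetical stand-ins for loop-detector features
avg_speed_up = rng.normal(60, 8, n)    # average speed upstream (mph)
occ_diff = rng.normal(0, 2, n)         # occupancy difference on adjacent lanes
std_vol_down = rng.normal(5, 1.5, n)   # std. dev. of volume downstream

# Invented rule: crashes more likely at low speed and large occupancy gaps;
# std_vol_down is deliberately uninformative here.
logits = -0.15 * (avg_speed_up - 60) + 0.8 * np.abs(occ_diff)
crash = (logits + rng.normal(0, 1, n)) > 1.5

X = np.column_stack([avg_speed_up, occ_diff, std_vol_down])
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, crash)

# Impurity-based importances rank the candidate variables
importance = dict(zip(["avg_speed_up", "occ_diff", "std_vol_down"],
                      tree.feature_importances_))
```

Variables with near-zero importance would be screened out before fitting the neural network classifiers.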

    Machine learning in the analysis of biomolecular simulations

    Machine learning has rapidly become a key method for the analysis and organization of large-scale data in all scientific disciplines. In the life sciences, machine learning techniques are particularly appealing, since the enormous capacity of computational infrastructures generates terabytes of data through millisecond-scale simulations of atomistic and molecular biomolecular systems. Due to this explosion of data, the automation, reproducibility, and objectivity provided by machine learning methods are highly desirable features in the analysis of complex systems. In this review, we focus on the use of machine learning in biomolecular simulations. We discuss the main categories of machine learning tasks, such as dimensionality reduction, clustering, regression, and classification, used in the analysis of simulation data. We then introduce the most popular classes of techniques involved in these tasks for the purposes of enhanced sampling, coordinate discovery, and structure prediction. Wherever possible, we explain the scope and limitations of machine learning approaches, and we discuss examples of their application.

    Machine learning methods for sign language recognition: a critical review and analysis.

    Sign language is an essential tool to bridge the communication gap between hearing and hearing-impaired people. However, the diversity of over 7000 present-day sign languages, with variability in motion, hand shape, and position of body parts, makes automatic sign language recognition (ASLR) complex. In order to overcome such complexity, researchers are investigating better ways of developing ASLR systems to seek intelligent solutions, and have demonstrated remarkable success. This paper analyses the research published on intelligent systems in sign language recognition over the past two decades. A total of 649 publications related to decision support and intelligent systems for sign language recognition (SLR) were extracted from the Scopus database and analysed. The extracted publications are analysed with the bibliometric software VOSviewer to (1) obtain the temporal and regional distributions of the publications and (2) map the cooperation networks between affiliations and authors and identify productive institutions in this context. Moreover, reviews of techniques for vision-based sign language recognition are presented, and the various feature extraction and classification techniques used in SLR to achieve good results are discussed. The literature review presented in this paper shows the importance of incorporating intelligent solutions into sign language recognition systems and reveals that a perfect intelligent system for sign language recognition is still an open problem. Overall, it is expected that this study will facilitate knowledge accumulation and the creation of intelligent-based SLR, and provide readers, researchers, and practitioners a roadmap to guide future directions.

    Renewable Energy Integration in Distribution System with Artificial Intelligence

    With the increasing attention to renewable energy development in distribution power systems, artificial intelligence (AI) can play an indispensable role. In this thesis, a series of artificial intelligence based methods are studied and implemented to further enhance the performance of power system operation and control. Due to the large volume of heterogeneous data provided by both the customer and the grid side, a big data visualization platform is built to bring out hidden, useful knowledge for smart grid (SG) operation, control, and situation awareness. An open-source cluster computing framework, Apache Spark, is used to discover the information hidden in the big data. The data are transmitted over an Open Systems Interconnection (OSI) model to the visualization platform via a high-speed communication architecture. Google Earth and a Geographic Information System (GIS) are used to design the visualization platform and present the results. The platform above addresses the external manifestation of the data; in the following work, I try to understand the internal hidden information in the data. A short-term load forecasting approach is designed based on support vector regression (SVR) to provide higher-accuracy load forecasts for network reconfiguration. The nonconvex three-phase balanced optimal power flow is relaxed to an optimal power flow (OPF) problem with a second-order cone program (SOCP). The alternating direction method of multipliers (ADMM) is used to compute the optimal power flow in a distributed manner. To reflect the reality of distribution systems, a three-phase unbalanced distribution system is built, which consists of hourly operation scheduling at the substation level and minute-scale power flow operation at the feeder level. The operation cost of the system with renewable generation is minimized at the substation level.
The stochastic distribution of renewable generation is modeled with a chance constraint, and the derived deterministic form is modeled with a Gaussian Mixture Model (GMM) using genetic algorithm-based expectation-maximization (GAEM). The system cost is further reduced with OPF in real-time (RT) scheduling. Semidefinite programming (SDP) is used to relax the nonconvex three-phase unbalanced distribution system into a convex problem, which helps achieve the globally optimal result, and ADMM is applied in a parallel manner to obtain the results in a short time. Clouds have a large impact on solar energy forecasting. Firstly, a convolutional neural network based method is used to estimate the solar irradiance; secondly, the regression results are collected to predict the renewable generation. After that, a novel approach is proposed to capture the global horizontal irradiance (GHI) conveniently and accurately. Considering the nonstationary property of the GHI on cloudy days, GHI capturing is cast as an image regression problem. In traditional approaches, the image regression problem is treated as two separate parts, feature extraction and regression, which are optimized independently with no interconnection. Given its nonlinear regression capability, a convolutional neural network (CNN) based image regression approach is proposed to provide an end-to-end solution to the cloudy-day GHI capturing problem. For data cleaning, a Gaussian mixture model with Bayesian inference is employed to detect and eliminate anomalous data in a nonparametric manner. The purified data are used as input to the proposed image regression approach. The numerical results demonstrate the feasibility and effectiveness of the proposed approach.
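The GMM-based data-cleaning step can be sketched as follows; a plain maximum-likelihood mixture via scikit-learn stands in for the thesis's Bayesian-inference variant, and the irradiance-like data and the 2% threshold are illustrative assumptions:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Synthetic stand-in data: plausible GHI readings plus injected sensor faults
normal = rng.normal(500, 50, size=(980, 1))      # typical values (W/m^2)
anomalies = rng.uniform(-100, 0, size=(20, 1))   # physically impossible readings
data = np.vstack([normal, anomalies])

# Fit a two-component mixture and score each sample's log-likelihood
gmm = GaussianMixture(n_components=2, random_state=0).fit(data)
log_lik = gmm.score_samples(data)

# Flag the lowest-likelihood 2% of samples as anomalous and drop them
threshold = np.quantile(log_lik, 0.02)
mask = log_lik >= threshold
clean = data[mask]
```

The cleaned array would then serve as input to the downstream image regression model.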

    THE APPLICATION OF PRINCIPAL COMPONENT ANALYSIS IN PRODUCTION FORECASTING

    Current methods of production forecasting, such as Decline Curve Analysis and Rate Transient Analysis, require years of production data, and their accuracy is affected by the subjective choice of model parameters. Unconventional resources, which usually lack long-term production history and have hard-to-determine model parameters, challenge traditional methods. This paper proposes a new method using Principal Component Analysis (PCA) to estimate production with reasonable certainty. PCA is a statistical tool that unveils hidden patterns of production by reducing high-dimensional rate-time data into a linear combination of only a few principal components. This paper establishes a PCA-based predictive model which makes predictions using information from the first few months of a well's production data. Its efficacy has been examined with both simulated and field data. This study also shows that the K-means clustering technique can enhance the predictive model's performance and give a reasonably certain estimate of the future production range based on historical data.
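A minimal sketch of the PCA step, assuming synthetic exponential-decline rate-time curves in place of real field data:

```python
import numpy as np

# Toy rate-time matrix: rows = wells, columns = monthly production rates.
# Synthetic exponential declines stand in for field data (illustrative only).
rng = np.random.default_rng(1)
t = np.arange(36)
wells = np.array([q0 * np.exp(-d * t) for q0, d in
                  zip(rng.uniform(50, 150, 40), rng.uniform(0.02, 0.1, 40))])
wells += rng.normal(0, 1.0, wells.shape)  # measurement noise

# PCA via SVD on the mean-centred data
mean = wells.mean(axis=0)
U, S, Vt = np.linalg.svd(wells - mean, full_matrices=False)

# A few principal components capture most of the variance
k = 3
explained = (S[:k] ** 2).sum() / (S ** 2).sum()

# Each well's full profile is reconstructed from just k component scores,
# which is what lets early-month data constrain the whole curve
scores = (wells - mean) @ Vt[:k].T
recon = mean + scores @ Vt[:k]
```

In the predictive setting, the component scores would be estimated from only the first few months of a new well's data and then used to extrapolate its full profile.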

    Information technologies for pain management

    Millions of people around the world suffer from pain, acute or chronic, and this raises the importance of its screening, assessment, and treatment. The importance of pain is attested by the fact that it is considered the fifth vital sign for indicating basic bodily functions, health, and quality of life, together with the four other vital signs: blood pressure, body temperature, pulse rate, and respiratory rate. However, while these four signals represent objective physical parameters, the occurrence of pain expresses an emotional state that happens inside the mind of each individual; it is therefore highly subjective, which makes its management and evaluation difficult. For this reason, self-report is considered the most accurate pain assessment method, wherein patients are asked to periodically rate their pain severity and related symptoms. Thus, in recent years, computerised systems based on mobile and web technologies have become increasingly used to enable patients to report their pain, which has led to the development of electronic pain diaries (ED). This approach may give health care professionals (HCP) and patients the ability to interact with the system anywhere and at any time, thoroughly changing the coordinates of time and place and offering invaluable opportunities for healthcare delivery. However, most of these systems were designed to interact directly with patients without the presence of a healthcare professional, and without evidence of reliability and accuracy. In fact, observation of the existing systems revealed a lack of integration with mobile devices, limited use of web-based interfaces, and reduced interaction with patients in terms of obtaining and viewing information. In addition, the reliability and accuracy of computerised systems for pain management are rarely proved, and their effects on HCP and patient outcomes remain understudied.
This thesis focuses on technology for pain management and proposes a monitoring system with ubiquitous interfaces specifically oriented to either patients or HCP, using mobile devices and the Internet, so as to allow decisions based on the knowledge obtained from the analysis of the collected data. With interoperability and cloud computing technologies in mind, this system uses web services (WS) to manage data, which are stored in a Personal Health Record (PHR). A Randomised Controlled Trial (RCT) was implemented to determine the effectiveness of the proposed computerised monitoring system. The six-week RCT evidenced the advantages of ubiquitous access for HCP and patients, who were able to interact with the system anywhere and at any time using WS to send and receive data. In addition, the collected data were stored in a PHR, which offers integrity and security as well as permanent online accessibility to both patients and HCP. The study evidenced not only that the majority of participants recommend the system, but also that they recognize its suitability for pain management without requiring advanced skills or experienced users. Furthermore, the system enabled the definition and management of patient-oriented treatments with reduced therapist time. The study also revealed that the guidance of HCP at the beginning of the monitoring is crucial to patients' satisfaction with, and experience of, the system, as evidenced by the high correlation between the recommendation of the application and its suitability to improve pain management and to provide medical information. There were no significant differences in improvements in the quality of pain treatment between the intervention group and the control group. Based on the data collected during the RCT, a clinical decision support system (CDSS) was developed to offer tailored alarms, reports, and clinical guidance.
This CDSS, called the Patient Oriented Method of Pain Evaluation System (POMPES), is based on the combination of several statistical models (one-way ANOVA, Kruskal-Wallis, and Tukey-Kramer) with an imputation model based on linear regression. The decisions suggested by the system fully agreed with the medical diagnoses, revealing its suitability for managing pain. Lastly, inspired by the capability of aerospace systems to deal with complex data sources of varied complexity and accuracy, an innovative model was proposed. This model combines a qualitative analysis stemming from a data fusion method with a quantitative model based on comparing standard deviations together with mathematical expectations. The model was used to compare the effects of technological and pen-and-paper systems when applied to different dimensions of pain, such as pain intensity, anxiety, catastrophizing, depression, disability, and interference. It was observed that pen-and-paper and technology produced equivalent effects on anxiety, depression, interference, and pain intensity; in contrast, technology evidenced favourable effects in terms of catastrophizing and disability. The proposed method proved to be suitable, intelligible, easy to implement, and undemanding in time and resources. Further work is needed to evaluate the proposed system with participants followed up over longer periods of time, including a complementary RCT encompassing patients with chronic pain symptoms. Finally, additional studies should be conducted to determine the economic effects, not only for patients but also for the healthcare system.
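The statistical layer of POMPES combines one-way ANOVA and Kruskal-Wallis tests; a minimal sketch on synthetic pain-score samples (the groups and scores are invented, not the RCT's data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Hypothetical 0-10 pain-score samples from three patient groups
g1 = rng.normal(3.0, 1.0, 30)   # mild pain
g2 = rng.normal(3.2, 1.0, 30)   # similar to g1
g3 = rng.normal(6.5, 1.0, 30)   # clearly higher pain

# Parametric comparison of group means
f_stat, p_anova = stats.f_oneway(g1, g2, g3)

# Non-parametric counterpart on ranks, robust to non-normality
h_stat, p_kw = stats.kruskal(g1, g2, g3)
```

A small p-value from either test would flag a between-group difference; a post hoc procedure such as Tukey-Kramer would then identify which pairs of groups differ.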

    An evolutionary approach to optimising neural network predictors for passive sonar target tracking

    Object tracking is important in autonomous robotics, military applications, financial time-series forecasting, and mobile systems. In order to track correctly through clutter, algorithms that predict the next value in a time series are essential. The ability of standard machine learning techniques to produce bearing prediction estimates was examined. The results show that the classification-based algorithms produce more accurate estimates than the state-of-the-art statistical models. Artificial Neural Networks (ANNs) and K-Nearest Neighbour were used, demonstrating that this technique is not specific to a single classifier. [Continues.]
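Framing one-step-ahead prediction as supervised learning over sliding windows can be sketched as follows; a simple k-nearest-neighbour regressor stands in for the thesis's classifiers, and the bearing-like signal is synthetic:

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic stand-in for a bearing time series: a noisy sinusoid
series = np.sin(np.linspace(0, 20, 400)) + rng.normal(0, 0.05, 400)

# Turn the series into (window, next value) supervised pairs
w = 8  # window length
X = np.array([series[i:i + w] for i in range(len(series) - w)])
y = series[w:]

X_train, X_test = X[:300], X[300:]
y_train, y_test = y[:300], y[300:]

def knn_predict(x, k=5):
    # Predict the next value as the mean target of the k closest windows
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:k]
    return y_train[idx].mean()

preds = np.array([knn_predict(x) for x in X_test])
rmse = np.sqrt(np.mean((preds - y_test) ** 2))
```

Any regressor or classifier can be dropped in place of `knn_predict`, which is the sense in which the approach is not specific to a single learner.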

    Tensor-variate machine learning on graphs

    Traditional machine learning algorithms are facing significant challenges as the world enters the era of big data, with a dramatic expansion in the volume and range of applications and an increase in the variety of data sources. The large- and multi-dimensional nature of data often increases the computational costs associated with their processing and raises the risk of model over-fitting, a phenomenon known as the curse of dimensionality. To this end, tensors have become a subject of great interest in the data analytics community, owing to their remarkable ability to super-compress high-dimensional data into a low-rank format while retaining the original data structure and interpretability. This leads to a significant reduction in computational costs, from an exponential complexity to a linear one in the data dimensions. An additional challenge when processing modern big data is that they often reside on irregular domains and exhibit relational structures, which violates the regular-grid assumptions of traditional machine learning models. To this end, there has been an increasing amount of research in generalizing traditional learning algorithms to graph data. This allows for the processing of graph signals while accounting for the underlying relational structure, such as user interactions in social networks, vehicle flows in traffic networks, transactions in supply chains, chemical bonds in proteins, and trading data in financial networks, to name a few. Although promising results have been achieved in these fields, there is a void in the literature when it comes to the conjoint treatment of tensors and graphs for data analytics. Solutions in this area are increasingly urgent, as modern big data is both large-dimensional and irregular in structure. To this end, the goal of this thesis is to explore machine learning methods that can fully exploit the advantages of both tensors and graphs.
In particular, the following approaches are introduced: (i) a graph-regularized tensor regression framework for modelling high-dimensional data while accounting for the underlying graph structure; (ii) a tensor-algebraic approach for computing efficient convolution on graphs; (iii) a graph tensor network framework for designing neural learning systems that is both general enough to describe most existing neural network architectures and flexible enough to model large-dimensional data on any and many irregular domains. The considered frameworks were employed in several real-world applications, including air quality forecasting, protein classification, and financial modelling. Experimental results validate the advantages of the proposed methods, which achieved better or comparable performance against state-of-the-art models. Additionally, these methods benefit from increased interpretability and reduced computational costs, which are crucial for tackling the challenges posed by the era of big data.
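The idea behind graph regularization (framework (i) above) can be sketched in the simpler matrix case: penalize the coefficients by the graph Laplacian so that connected nodes receive similar weights. The chain graph, problem sizes, and data below are illustrative, not from the thesis:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 200, 4  # samples, coefficients (one per graph node)

# Chain graph over the p coefficients: 0-1-2-3
A = np.zeros((p, p))
for i in range(p - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A  # graph Laplacian

# Synthetic data with a ground truth that is smooth over the chain
X = rng.normal(size=(n, p))
w_true = np.array([1.0, 1.1, 0.9, 1.0])
y = X @ w_true + rng.normal(0, 0.1, n)

# Closed-form solution of  min_w ||y - Xw||^2 + lam * w^T L w
# (the Laplacian term penalizes differences between connected coefficients)
lam = 1.0
w_hat = np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```

The tensor version in the thesis applies the same Laplacian penalty to a low-rank tensor of regression coefficients instead of a vector.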