
    Scalable Teacher Forcing Network for Semi-Supervised Large Scale Data Streams

    Full text link
    The large-scale data stream problem refers to a high-speed information flow which cannot be processed in a scalable manner under a traditional computing platform. This problem also imposes an expensive labelling cost, making the deployment of fully supervised algorithms unfeasible. On the other hand, the problem of semi-supervised large-scale data streams is little explored in the literature, because most works are designed for traditional single-node computing environments while also being fully supervised approaches. This paper offers the Weakly Supervised Scalable Teacher Forcing Network (WeScatterNet) to cope with the scarcity of labelled samples and large-scale data streams simultaneously. WeScatterNet is crafted under the distributed computing platform of Apache Spark, with a data-free model fusion strategy for model compression after the parallel computing stage. It features an open network structure to address the global and local drift problems, while integrating a data augmentation, annotation and auto-correction (DA³) method for handling partially labelled data streams. The performance of WeScatterNet is numerically evaluated on six large-scale data stream problems with only 25% label proportions. It shows highly competitive performance even when compared with fully supervised learners with 100% label proportions. Comment: This paper has been accepted for publication in Information Sciences
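
    The parallel-training-then-fusion workflow described above can be illustrated with a short sketch. This is a minimal, hypothetical example assuming PySpark and scikit-learn: one local model is trained per data partition (a plain logistic regression stands in for the scalable teacher forcing network) and the results are fused by naive parameter averaging. It does not reproduce WeScatterNet's actual data-free fusion strategy, open network structure, or drift handling.

```python
# Hedged sketch: train one model per Spark partition, then fuse the local
# models on the driver. All models, data, and the averaging-based fusion
# are illustrative stand-ins, not the paper's method.
import numpy as np
from pyspark.sql import SparkSession
from sklearn.linear_model import LogisticRegression

spark = SparkSession.builder.appName("parallel_fusion_sketch").getOrCreate()
sc = spark.sparkContext

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))               # synthetic stream snapshot
y = (X[:, 0] - X[:, 5] > 0).astype(int)       # synthetic labels
records = list(zip(X.tolist(), y.tolist()))

def train_partition(rows):
    """Train a local classifier on one partition and emit its parameters."""
    rows = list(rows)
    if not rows:
        return []
    Xp = np.array([r[0] for r in rows])
    yp = np.array([r[1] for r in rows])
    clf = LogisticRegression(max_iter=500).fit(Xp, yp)
    return [(clf.coef_.ravel(), clf.intercept_)]

local_models = (sc.parallelize(records, numSlices=4)
                  .mapPartitions(train_partition)
                  .collect())

# Naive fusion step: average coefficients and intercepts across partitions.
fused_coef = np.mean([c for c, _ in local_models], axis=0)
fused_intercept = np.mean([b for _, b in local_models], axis=0)
print("fused coefficients:", np.round(fused_coef, 3))
spark.stop()
```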

    Supporting System for Detecting Pathologies

    Get PDF
    CGH arrays make it possible to test patients for mutations in chromosomal regions. Detecting these mutations allows diagnoses to be made and sequencing studies of relevant DNA regions to be completed. The analysis of CGH arrays requires mechanisms that facilitate data processing by specialized personnel, since traditionally a segmentation process is needed and, starting from the segmented data, a visual analysis of the information is carried out to select the relevant segments. In this study, a CBR system is presented as a supporting system for the extraction of relevant information from CGH arrays that facilitates the analysis process and its interpretation

    Modelling the Longevity of Dental Restorations by means of a CBR system

    Get PDF
    The lifespan of dental restorations is limited. Longevity depends on the material used and on the characteristics of the dental piece. However, the best and longest-lasting material is not always used, since patients may prefer different treatments according to how noticeable the material is. Over the last 100 years, the most commonly used material has been silver amalgam, which, while very durable, is somewhat aesthetically displeasing. Our study is based on the collection of data from the charts, notes, and radiographic information of restorative treatments performed by Dr. Vera in 1993, the analysis of that information with artificial intelligence techniques to determine the most appropriate restoration, and the monitoring of the evolution of the dental restoration. The data will be treated confidentially according to Organic Law 15/1999 of 13 December on the Protection of Personal Data. This paper also presents a clustering technique capable of identifying the most significant cases with which to instantiate the case base. In order to classify the cases, a mixture of experts is used which incorporates a Bayesian network and a multilayer perceptron; the combination of both classifiers is performed with a neural network
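
    As a rough illustration of the mixture-of-experts classification described above, the sketch below combines two expert classifiers through a small neural-network combiner. It assumes scikit-learn, uses Gaussian naive Bayes as a stand-in for the Bayesian network, and synthetic data in place of the dental records; it is not the paper's implementation.

```python
# Hedged sketch: two "expert" classifiers whose outputs are combined by a
# small neural network, in the spirit of the mixture of experts above.
# GaussianNB stands in for the Bayesian network; data are synthetic.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 8))                 # synthetic restoration features
y = (X[:, 0] + X[:, 3] > 0).astype(int)       # synthetic longevity class

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

expert_bn = GaussianNB().fit(X_tr, y_tr)                       # Bayesian-style expert
expert_mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000,
                           random_state=0).fit(X_tr, y_tr)     # multilayer perceptron expert

def expert_outputs(X):
    """Stack the two experts' class probabilities as combiner inputs."""
    return np.hstack([expert_bn.predict_proba(X), expert_mlp.predict_proba(X)])

# The combiner is itself a small neural network trained on the experts' outputs.
combiner = MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000,
                         random_state=0).fit(expert_outputs(X_tr), y_tr)

print("combined accuracy:", combiner.score(expert_outputs(X_te), y_te))
```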

    Pattern recognition beyond classification: An abductive framework for time series interpretation

    Get PDF
    Time series interpretation aims to provide an explanation of what is observed in terms of its underlying processes. The present work is based on the assumption that the common classification-based approaches to time series interpretation suffer from a set of inherent weaknesses, whose ultimate cause lies in the monotonic nature of the deductive reasoning paradigm. In this thesis we propose a new approach to this problem, based on the initial hypothesis that abductive reasoning properly accounts for the human ability to identify and characterize the patterns appearing in a time series. The result of this interpretation is a set of conjectures in the form of observations, organized into an abstraction hierarchy and explaining what has been observed. A knowledge-based framework and a set of algorithms for the interpretation task are provided, implementing a hypothesize-and-test cycle guided by an attentional mechanism. As a representative application domain, interpretation of the electrocardiogram allows us to highlight the strengths of the present approach in comparison with traditional classification-based approaches

    Machine Learning

    Get PDF
    Machine learning can be defined in various ways, but it broadly refers to a scientific domain concerned with the design and development of theoretical and implementation tools that allow building systems with some human-like intelligent behavior. More specifically, machine learning addresses the ability to improve automatically through experience

    Fuzzy Logic

    Get PDF
    The capability of Fuzzy Logic in the development of emerging technologies is introduced in this book. The book consists of sixteen chapters showing various applications in the fields of Bioinformatics, Health, Security, Communications, Transportation, Financial Management, Energy and Environment Systems. This book is a major reference source for all those concerned with applied intelligent systems. The intended readers are researchers, engineers, medical practitioners, and graduate students interested in fuzzy logic systems

    Development of Machine Learning Based Analytical Tools for Pavement Performance Assessment and Crack Detection

    Get PDF
    Pavement Management System (PMS) analytical tools mainly consist of pavement condition investigation and evaluation tools, pavement condition rating and assessment tools, pavement performance prediction tools, and treatment prioritization and implementation tools. The effectiveness of a PMS highly depends on the efficiency and reliability of its pavement condition evaluation tools. Traditionally, pavement condition investigation and evaluation practices are based on manual distress surveys and performance level assessments, which have been criticized for low efficiency and low reliability. Such manual surveys are labor-intensive and unsafe due to proximity to live traffic. Meanwhile, accuracy can suffer due to the subjective nature of the evaluators. Considering these factors, semi-automated and automated pavement condition evaluation tools have been under development for several years. In recent years, highly advanced computerized technologies have undoubtedly produced successful applications in diverse engineering fields. If these techniques are incorporated into pavement condition evaluation and distress detection, the resulting analytical tools can improve the performance of existing PMSs. Hence, this research aims to bridge the gap between highly advanced Machine Learning Techniques (MLTs) and the existing analytical tools of current PMSs. The research outputs are intended to provide pavement condition evaluation tools that meet the requirements of high efficiency, accuracy, and reliability. To achieve the objectives of this research, six pavement damage condition and performance evaluation methodologies are developed. The roughness condition of the pavement surface directly influences the riding quality experienced by users. The International Roughness Index (IRI) is used worldwide by research institutions and by pavement condition evaluation and management agencies to evaluate the roughness condition of the pavement. IRI is a time-dependent variable that generally increases over the pavement's service life. With this in mind, an IRI prediction model based on multi-granularity fuzzy time series analysis is developed, and the Particle Swarm Optimization (PSO) method is used for model optimization to obtain satisfactory IRI prediction results. Historical IRI data extracted from the InfoPave website are used for training and testing the model, and the experimental results demonstrate the effectiveness of the method. Automated pavement condition evaluation tools can provide overall performance indices, which can then be used for treatment planning. Calculating those performance indices requires both surface distress level and roughness condition evaluations. However, pavement surface roughness conditions are hard to obtain from surface image indicators. With this in mind, image-indicator-based pavement roughness and overall performance prediction tools are developed. The state-of-the-art machine learning technique XGBoost is utilized as the main method for model training, validation, and testing. In order to find the dominant image indicators that influence pavement roughness and overall performance conditions, the comprehensive pavement performance evaluation data collected by ARAN 900 are analyzed. A Back Propagation Neural Network (BPNN) is used to develop the performance prediction models.
On this basis, the mean impact values (MIVs) for each input factor are calculated to evaluate the contributions of the input indicators. It is observed that indicators of wheel path cracking have the highest MIVs, which emphasizes the importance of cracking-focused maintenance treatments. It is also found that current automated pavement condition evaluation systems only analyze pavement surface distresses, without considering the structural capacity of the actual pavement. Hence, structural-performance-based pavement performance prediction tools are developed using Support Vector Machines (SVMs). To guarantee the overall performance of the proposed methodologies, heuristic methods including the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) are selected to optimize the models. The experimental results show a promising future for machine-learning-based pavement structural performance prediction. Automated pavement condition analyzers usually detect pavement surface distress from collected pavement surface images; distress types, severities, quantities, and other parameters are then calculated for the overall performance index. Cracks are among the most important pavement surface distresses to be quantified, and traditional approaches are less accurate and efficient in locating, counting, and quantifying the various types of cracks initiated on the pavement surface. An integrated Crack Deep Net (CrackDN) is therefore developed based on deep learning technologies. Through model training, validation, and testing, it is shown that CrackDN can detect pavement surface cracks against complex backgrounds with high accuracy. Moreover, combining box-level crack localization with pixel-level crack calculation enables comprehensive crack analysis, so that more effective maintenance treatments can be assigned. Hence, a pixel-level crack detection methodology called CrackU-net is proposed. CrackU-net is composed of several convolutional, max-pooling, and up-convolutional layers, and the model is developed based on innovations in deep-learning-based segmentation. Pavement crack data are collected by multiple devices, including automated pavement condition survey vehicles, smartphones, and action cameras. The proposed CrackU-net is tested on a separate crack image set that has not been used for training the model, and the results demonstrate its promise for use in PMSs. Finally, the proposed toolboxes are validated through comparative experiments in terms of accuracy (precision, recall, and F-measure) and error levels. The accuracies of all the models are higher than 0.9 and the errors are lower than 0.05. Meanwhile, the findings of this research suggest that wheel path cracking should be a priority when planning maintenance activities. Benefiting from highly advanced machine learning technologies, pavement roughness condition and overall performance levels can plausibly be predicted from extracted image indicators, and deep learning methods can be utilized to achieve both box-level and pixel-level pavement crack detection with satisfactory performance. Therefore, it is suggested that these state-of-the-art toolboxes be integrated into current PMSs to upgrade their service levels
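
    To illustrate the image-indicator-based roughness prediction step described above, the sketch below trains an XGBoost regressor on synthetic distress indicators. The feature names, data, and hyperparameters are illustrative assumptions, not the dissertation's actual configuration or dataset.

```python
# Hedged sketch: predict a roughness index from image-derived distress
# indicators with XGBoost, loosely following the workflow described above.
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
n = 1000
# Hypothetical image indicators: wheel-path cracking, transverse cracking,
# rutting, and patching extents (all normalized to [0, 1]).
X = rng.uniform(0.0, 1.0, size=(n, 4))
# Synthetic IRI-like target dominated by wheel-path cracking, plus noise.
y = 1.0 + 2.5 * X[:, 0] + 0.8 * X[:, 1] + 0.4 * X[:, 2] + rng.normal(0, 0.1, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

model = xgb.XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X_tr, y_tr)

pred = model.predict(X_te)
print("MAE:", mean_absolute_error(y_te, pred))
# Feature importances give a rough analogue of the indicator-contribution
# analysis (MIVs) discussed above.
print("feature importances:", np.round(model.feature_importances_, 3))
```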

    Data-Driven Simulation Modeling of Construction and Infrastructure Operations Using Process Knowledge Discovery

    Get PDF
    Within the architecture, engineering, and construction (AEC) domain, simulation modeling is mainly used to facilitate decision-making by enabling the assessment of different operational plans and resource arrangements that are otherwise difficult (if not impossible), expensive, or time-consuming to evaluate in real-world settings. The accuracy of such models directly affects their reliability as a basis for important decisions such as project completion time estimation and resource allocation. Compared to other industries, this is particularly important in construction and infrastructure projects due to the high resource costs and societal impacts of these projects. Discrete event simulation (DES) is a decision-making tool that can benefit the design, control, and management of construction operations. Despite recent advancements, most DES models used in construction are created during the early planning and design stage, when the lack of factual information from the project prohibits the use of realistic data in simulation modeling. The resulting models, therefore, are often built using rigid (subjective) assumptions and design parameters (e.g. precedence logic, activity durations). In all such cases, and in the absence of an inclusive methodology to incorporate real field data as the project evolves, modelers rely on information from previous projects (a.k.a. secondary data), expert judgments, and subjective assumptions to generate simulations that predict future performance. These and similar shortcomings have to a large extent limited the use of traditional DES tools to preliminary studies and long-term planning of construction projects. In the realm of business process management, process mining is a relatively new research domain that seeks to automatically discover a process model by observing activity records and extracting information about processes. The research presented in this Ph.D. dissertation was in part inspired by the prospect of construction process mining using sensory data collected from field agents, which enables the extraction of the operational knowledge necessary to generate and maintain the fidelity of simulation models. A preliminary study was conducted to demonstrate the feasibility and applicability of data-driven, knowledge-based simulation modeling, with a focus on data collection using a wireless sensor network (WSN) and a rule-based taxonomy of activities. The resulting knowledge-based simulation models performed very well in predicting key performance measures of real construction systems. Next, a pervasive mobile data collection and mining technique was adopted, and an activity recognition framework for construction equipment and worker tasks was developed. Data were collected from construction entities using smartphone accelerometers and gyroscopes to generate significant statistical time- and frequency-domain features. The extracted features served as the input to different types of machine learning algorithms applied to various construction activities. The trained predictive algorithms were then used to extract activity durations and calculate probability distributions to be fused into the corresponding DES models. Results indicated that the generated data-driven, knowledge-based simulation models outperform static models created from engineering assumptions and estimations with regard to the compatibility of performance measure outputs with reality
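
    The activity-recognition step described above can be sketched briefly: windowed accelerometer signals are reduced to time- and frequency-domain features and fed to a classifier, whose labels could then be segmented into activity durations for the DES model. The window length, feature set, random-forest choice, and synthetic signals below are illustrative assumptions, not the dissertation's pipeline.

```python
# Hedged sketch: time- and frequency-domain features from windowed
# accelerometer signals, used to train an activity classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def window_features(window):
    """Basic statistical and spectral features for one signal window."""
    spectrum = np.abs(np.fft.rfft(window))
    return np.array([
        window.mean(), window.std(), window.min(), window.max(),
        np.sqrt(np.mean(window ** 2)),   # RMS (time domain)
        spectrum.argmax(),               # dominant frequency bin
        spectrum.mean(),                 # average spectral magnitude
    ])

# Synthetic stand-in for labelled accelerometer windows (idle vs. active).
rng = np.random.default_rng(1)
windows, labels = [], []
for _ in range(400):
    idle = rng.normal(0.0, 0.05, 128)                               # low vibration
    active = np.sin(np.linspace(0, 20 * np.pi, 128)) + rng.normal(0.0, 0.2, 128)
    windows += [idle, active]
    labels += [0, 1]

X = np.array([window_features(w) for w in windows])
y = np.array(labels)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)
clf = RandomForestClassifier(n_estimators=100, random_state=1).fit(X_tr, y_tr)
print("activity recognition accuracy:", clf.score(X_te, y_te))

# Recognized activity labels over time could then be segmented into durations
# and fitted to probability distributions that feed the DES model.
```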