Machine Learning for Fluid Mechanics
The field of fluid mechanics is rapidly advancing, driven by unprecedented
volumes of data from field measurements, experiments and large-scale
simulations at multiple spatiotemporal scales. Machine learning offers a wealth
of techniques to extract information from data that could be translated into
knowledge about the underlying fluid mechanics. Moreover, machine learning
algorithms can augment domain knowledge and automate tasks related to flow
control and optimization. This article presents an overview of past history,
current developments, and emerging opportunities of machine learning for fluid
mechanics. It outlines fundamental machine learning methodologies and discusses
their uses for understanding, modeling, optimizing, and controlling fluid
flows. The strengths and limitations of these methods are addressed from the
perspective of scientific inquiry that considers data as an inherent part of
modeling, experimentation, and simulation. Machine learning provides a powerful
information processing framework that can enrich, and possibly even transform,
current lines of fluid mechanics research and industrial applications.
Comment: To appear in the Annual Reviews of Fluid Mechanics, 202
Feature Selection via Binary Simultaneous Perturbation Stochastic Approximation
Feature selection (FS) has become an indispensable task in dealing with
today's highly complex pattern recognition problems with massive numbers of
features. In this study, we propose a new wrapper approach for FS based on
binary simultaneous perturbation stochastic approximation (BSPSA). This
pseudo-gradient descent stochastic algorithm starts with an initial feature
vector and moves toward the optimal feature vector via successive iterations.
In each iteration, the current feature vector's individual components are
perturbed simultaneously by random offsets from a qualified probability
distribution. We present computational experiments on datasets with numbers of
features ranging from a few dozen to thousands using three widely used
classifiers as wrappers: nearest neighbor, decision tree, and linear support
vector machine. We compare our methodology against the full set of features as
well as a binary genetic algorithm and sequential FS methods using
cross-validated classification error rate and AUC as the performance criteria.
Our results indicate that features selected by BSPSA compare favorably to
alternative methods in general and BSPSA can yield superior feature sets for
datasets with tens of thousands of features by examining an extremely small
fraction of the solution space. We are not aware of any other wrapper FS
methods that are computationally feasible with good convergence properties for
such large datasets.
Comment: This is the Istanbul Sehir University Technical Report
#SHR-ISE-2016.01. A short version of this report has been accepted for
publication at Pattern Recognition Letter
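The iteration described in the abstract above (perturb every component of the feature vector at once, compare the two perturbed evaluations, step toward the better one) can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's implementation: the symmetric ±1 perturbation, the constant gain, the clipping to [0, 1], the 0.5 rounding threshold, and the names `bspsa_sketch`, `toy_loss`, and `TARGET` are all hypothetical choices. A real wrapper would replace `toy_loss` with a cross-validated classifier error.

```python
import random


def bspsa_sketch(loss, n_features, iterations=300, gain=0.05, c=0.3, seed=0):
    """SPSA-style search over a relaxed binary feature vector (illustrative only).

    loss: callable taking a list of 0/1 flags and returning a value to minimize.
    gain, c: assumed constant step size and perturbation size.
    """
    rng = random.Random(seed)

    def clamp(x):
        return min(1.0, max(0.0, x))

    def to_mask(vector):
        # Round the relaxed vector to a concrete feature subset.
        return [1 if x >= 0.5 else 0 for x in vector]

    w = [0.5] * n_features  # relaxed feature vector in [0, 1]
    best_mask = to_mask(w)
    best_loss = loss(best_mask)

    for _ in range(iterations):
        # Perturb all components simultaneously with random +/-1 offsets.
        delta = [rng.choice((-1, 1)) for _ in range(n_features)]
        mask_plus = to_mask([clamp(x + c * d) for x, d in zip(w, delta)])
        mask_minus = to_mask([clamp(x - c * d) for x, d in zip(w, delta)])
        loss_plus, loss_minus = loss(mask_plus), loss(mask_minus)

        # Pseudo-gradient from two evaluations; 1/d equals d when d is +/-1.
        scale = (loss_plus - loss_minus) / (2.0 * c)
        w = [clamp(x - gain * scale * d) for x, d in zip(w, delta)]

        # Remember the best subset evaluated so far.
        for cand_mask, cand_loss in ((mask_plus, loss_plus), (mask_minus, loss_minus)):
            if cand_loss < best_loss:
                best_mask, best_loss = cand_mask, cand_loss

    return best_mask, best_loss


# Toy stand-in for a wrapper criterion: fraction of flags that differ from
# a hidden "ideal" subset (hypothetical data, for demonstration only).
TARGET = [1, 0, 1, 0, 0, 1, 0, 0]


def toy_loss(mask):
    return sum(m != t for m, t in zip(mask, TARGET)) / len(TARGET)


mask, value = bspsa_sketch(toy_loss, len(TARGET))
print(mask, value)
```

Each iteration costs two loss evaluations regardless of the number of features, which is the property that makes this family of methods attractive when every evaluation means training a classifier.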
Machine Learning and Integrative Analysis of Biomedical Big Data
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
A Survey on Feature Selection Algorithms
One major component of machine learning is feature analysis, which comprises mainly two processes: feature selection and feature extraction. Due to its applications in several areas, including data mining, soft computing and big data analysis, feature selection has gained reasonable importance. This paper presents an introductory concept of feature selection with various inherent approaches. The paper surveys historic developments reported in feature selection with supervised and unsupervised methods. Recent developments and the state of the art in ongoing feature selection algorithms, including their hybridizations, are also summarized.
DOI: 10.17762/ijritcc2321-8169.16043
A novel Big Data analytics and intelligent technique to predict driver's intent
The modern age offers great potential for automatically predicting the driver's intent through the increasing miniaturization of computing technologies, rapid advancements in communication technologies and continuous connectivity of heterogeneous smart objects. Inside the cabin and engine of modern cars, dedicated computer systems need to possess the ability to exploit the wealth of information generated by heterogeneous data sources with different contextual and conceptual representations. Processing and utilizing this diverse and voluminous data involves many challenges concerning the design of the computational technique used to perform this task. In this paper, we investigate the various data sources available in the car and the surrounding environment, which can be utilized as inputs in order to predict the driver's intent and behavior. As part of investigating these potential data sources, we conducted experiments on e-calendars for a large number of employees, and have reviewed a number of available georeferencing systems. Through the results of a statistical analysis and by computing location recognition accuracy results, we explored in detail the potential utilization of calendar location data to detect the driver's intentions. In order to exploit the numerous diverse data inputs available in modern vehicles, we investigate the suitability of different Computational Intelligence (CI) techniques, and propose a novel fuzzy computational modelling methodology. Finally, we outline the impact of applying advanced CI and Big Data analytics techniques in modern vehicles on the driver and society in general, and discuss ethical and legal issues arising from the deployment of intelligent self-learning cars.
Combinatorial optimization and metaheuristics
Today, combinatorial optimization is one of the youngest and most active areas of discrete mathematics. It is a branch of optimization in applied mathematics and computer science, related to operational research, algorithm theory and computational complexity theory. It sits at the intersection of several fields, including artificial intelligence, mathematics and software engineering. Interest in it is growing because a large number of scientific and industrial problems can be formulated as abstract combinatorial optimization problems, through graphs and/or (integer) linear programs. Some of these problems have polynomial-time (“efficient”) algorithms, while most of them are NP-hard, i.e., it has not been proved that they can be solved in polynomial time. In practice, this means that finding an exact solution within a reasonable time cannot be guaranteed, and one has to settle for an approximate solution with known performance guarantees. Indeed, the goal of approximate methods is to find “quickly” (reasonable run-times), with “high” probability, provably “good” solutions (low error from the real optimal solution). In the last 20 years, a new kind of algorithm, commonly called metaheuristics, has emerged in this class; these basically try to combine heuristics in high-level frameworks aimed at efficiently and effectively exploring the search space. This report briefly outlines the components, concepts, advantages and disadvantages of different metaheuristic approaches from a conceptual point of view, in order to analyze their similarities and differences. The two very significant forces of intensification and diversification, which mainly determine the behavior of a metaheuristic, will be pointed out. The report concludes by exploring the importance of hybridization and integration methods.
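The interplay of intensification and diversification mentioned in the abstract above can be illustrated with a small iterated local search on a 0/1 knapsack instance. This is a hedged sketch rather than a method taken from the report: the instance data (`VALUES`, `WEIGHTS`, `CAPACITY`), the zero score for infeasible selections, and all function names are assumptions made for illustration.

```python
import random

# Hypothetical toy instance, for demonstration only.
VALUES = [10, 7, 4, 3, 8]
WEIGHTS = [5, 4, 2, 1, 6]
CAPACITY = 10


def score(bits):
    """Total value of the selected items; infeasible selections score zero."""
    weight = sum(w for b, w in zip(bits, WEIGHTS) if b)
    if weight > CAPACITY:
        return 0
    return sum(v for b, v in zip(bits, VALUES) if b)


def local_search(bits):
    """Intensification: keep applying improving single-bit flips."""
    bits = list(bits)
    improved = True
    while improved:
        improved = False
        for i in range(len(bits)):
            before = score(bits)
            bits[i] ^= 1
            if score(bits) > before:
                improved = True
            else:
                bits[i] ^= 1  # undo a flip that did not improve the score
    return bits


def perturb(bits, rng, flips=2):
    """Diversification: flip a few random bits to leave the local optimum."""
    bits = list(bits)
    for i in rng.sample(range(len(bits)), flips):
        bits[i] ^= 1
    return bits


def iterated_local_search(n_items, rounds=50, seed=0):
    rng = random.Random(seed)
    best = local_search([0] * n_items)
    for _ in range(rounds):
        candidate = local_search(perturb(best, rng))
        if score(candidate) > score(best):
            best = candidate
    return best


best = iterated_local_search(len(VALUES))
print(best, score(best))
```

Raising `flips` pushes the search toward diversification, while more thorough local moves (for example, swapping one item for another) would strengthen intensification; tuning that balance is the design question the report highlights.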