Roadmap for Reliable Ensemble Forecasting of the Sun-Earth System
The authors of this report met on 28-30 March 2018 at the New Jersey
Institute of Technology, Newark, New Jersey, for a 3-day workshop that brought
together a group of data providers, expert modelers, and computer and data
scientists in the solar discipline. Their objective was to identify challenges
in the path towards building an effective framework to achieve transformative
advances in the understanding and forecasting of the Sun-Earth system from the
upper convection zone of the Sun to the Earth's magnetosphere. The workshop
aimed to develop a research roadmap that targets the scientific challenge of
coupling observations and modeling with emerging data-science research to
extract knowledge from the large volumes of data (observed and simulated) while
stimulating computer science with new research applications. The desire among
the attendees was to promote future trans-disciplinary collaborations and
identify areas of convergence across disciplines. The workshop combined a set
of plenary sessions featuring invited introductory talks and workshop progress
reports, interleaved with a set of breakout sessions focused on specific topics
of interest. Each breakout group generated short documents, listing the
challenges identified during their discussions in addition to possible ways of
attacking them collectively. These documents were combined into this
report, wherein a list of prioritized activities has been collated, shared, and
endorsed. Comment: Workshop Report
History and trends in solar irradiance and PV power forecasting: A preliminary assessment and review using text mining
Text mining is an emerging topic that advances the review of academic literature. This paper presents a preliminary study on how to review solar irradiance and photovoltaic (PV) power forecasting (both topics combined as “solar forecasting” for short) using text mining, which serves as the first part of a forthcoming series of text mining applications in solar forecasting. This study contains three main contributions: (1) establishing the technological infrastructure (authors, journals & conferences, publications, and organizations) of solar forecasting via the top 1000 papers returned by a Google Scholar search; (2) consolidating the frequently-used abbreviations in solar forecasting by mining the full texts of 249 ScienceDirect publications; and (3) identifying key innovations in recent advances in solar forecasting (e.g., shadow camera, forecast reconciliation). As most of the steps involved in the above analysis are automated via an application programming interface, the presented method can be transferred to other solar engineering topics, or any other scientific domain, by changing the search word. The authors acknowledge that text mining, at its present stage, serves as a complement to, but not a replacement for, conventional review papers.
Machine Learning for the Geosciences: Challenges and Opportunities
Geosciences is a field of great societal relevance that requires solutions to
several urgent problems facing humanity and the planet. As geosciences
enters the era of big data, machine learning (ML), which has been widely
successful in commercial domains, offers immense potential to contribute to
problems in geosciences. However, problems in geosciences have several unique
challenges that are seldom found in traditional applications, requiring novel
problem formulations and methodologies in machine learning. This article
introduces researchers in the machine learning (ML) community to these
challenges offered by geoscience problems and the opportunities that exist for
advancing both machine learning and geosciences. We first highlight typical
sources of geoscience data and describe their properties that make it
challenging to use traditional machine learning techniques. We then describe
some of the common categories of geoscience problems where machine learning can
play a role, and discuss some of the existing efforts and promising directions
for methodological development in machine learning. We conclude by discussing
some of the emerging research themes in machine learning that are applicable
across all problems in the geosciences, and the importance of a deep
collaboration between machine learning and geosciences for synergistic
advancements in both disciplines. Comment: Under review at IEEE Transactions on Knowledge and Data Engineering
Big Data Analytics for Dynamic Energy Management in Smart Grids
The smart electricity grid enables a two-way flow of power and data between
suppliers and consumers in order to facilitate the power flow optimization in
terms of economic efficiency, reliability and sustainability. This
infrastructure permits the consumers and the micro-energy producers to take a
more active role in the electricity market and the dynamic energy management
(DEM). The most important challenge in a smart grid (SG) is how to take
advantage of the users' participation in order to reduce the cost of power.
However, effective DEM depends critically on load and renewable production
forecasting. This calls for intelligent methods and solutions for the real-time
exploitation of the large volumes of data generated by a vast number of smart
meters. Hence, robust data analytics, high performance computing, efficient
data network management, and cloud computing techniques are critical to
the optimized operation of SGs. This research aims to highlight the big data
issues and challenges faced by the DEM employed in SG networks. It also
provides a brief description of the most commonly used data processing methods
in the literature, and proposes a promising direction for future research in
the field. Comment: Published in ELSEVIER Big Data Research
From Digitalization to Data-Driven Decision Making in Container Terminals
With the new opportunities emerging from the current wave of digitalization,
terminal planning and management need to be revisited by taking a data-driven
perspective. Business analytics, as a practice of extracting insights from
operational data, assists in reducing uncertainties using predictions and helps
to identify and understand causes of inefficiencies, disruptions, and anomalies
in intra- and inter-organizational terminal operations. Despite the growing
complexity of data within and around container terminals, a lack of data-driven
approaches in the context of container terminals can be identified. In this
chapter, the concept of business analytics for supporting terminal planning and
management is introduced. The chapter specifically focuses on data mining
approaches and provides a comprehensive overview on applications in container
terminals and related research. As such, we aim to establish a data-driven
perspective on terminal planning and management, complementing the traditional
optimization perspective. Comment: 20 pages, 5 figures, book chapter
Deep Learning on Traffic Prediction: Methods, Analysis and Future Directions
Traffic prediction plays an essential role in intelligent transportation
systems. Accurate traffic prediction can assist route planning, guide vehicle
dispatching, and mitigate traffic congestion. This problem is challenging due
to the complicated and dynamic spatio-temporal dependencies between different
regions in the road network. Recently, a significant amount of research effort
has been devoted to this area, especially to deep learning methods, greatly
advancing traffic prediction abilities. The purpose of this paper is to provide
a comprehensive survey on deep learning-based approaches in traffic prediction
from multiple perspectives. Specifically, we first summarize the existing
traffic prediction methods, and give a taxonomy. Second, we list the
state-of-the-art approaches in different traffic prediction applications.
Third, we comprehensively collect and organize widely used public datasets in
the existing literature to facilitate other researchers. Furthermore, we give
an evaluation and analysis by conducting extensive experiments to compare the
performance of different methods on a real-world public dataset. Finally, we
discuss open challenges in this field. Comment: to be published in IEEE Transactions on Intelligent Transportation
Systems
Reliability and Sharpness in Border Crossing Traffic Interval Prediction
Short-term traffic volume prediction models have been extensively studied in
the past few decades. However, most of the previous studies only focus on
single-value prediction. Considering the uncertain and chaotic nature of the
transportation system, an accurate and reliable prediction interval with upper
and lower bounds may be better than a single point value for transportation
management. In this paper, we introduce a neural network model called Extreme
Learning Machine (ELM) for interval prediction of short-term traffic volume and
improve it with the heuristic particle swarm optimization algorithm (PSO). The
hybrid PSO-ELM model can generate the prediction intervals under different
confidence levels and guarantee the quality by minimizing a multi-objective
function which considers two criteria: reliability and interval sharpness. The
PSO-ELM models are built based on an hourly traffic dataset and compared with
ARMA and Kalman Filter models. The results show that ARMA models are the worst
for all confidence levels, and the PSO-ELM models are comparable with Kalman
Filter from the aspects of reliability and narrowness of the intervals,
although the parameters of PSO-ELM are fixed once the training is done while
Kalman Filter is updated in an online approach. Additionally, only the PSO-ELMs
are able to produce intervals with coverage probabilities higher than or equal
to the confidence levels. The points that fall outside the prediction intervals
given by the PSO-ELMs lie very close to the bounds. Comment: Presented at 2017 TRB Annual Meeting
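The two criteria named in this abstract, reliability and sharpness, are commonly measured as the share of observations that fall inside their interval (coverage probability) and the average interval width. The following is a minimal sketch under that assumption; the function names and the hourly volumes are hypothetical, and the paper's actual multi-objective function is not reproduced here.

```python
from typing import Sequence


def coverage_probability(
    observed: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> float:
    """Reliability: fraction of observations lying inside their interval."""
    inside = sum(1 for y, lo, hi in zip(observed, lower, upper) if lo <= y <= hi)
    return inside / len(observed)


def mean_interval_width(lower: Sequence[float], upper: Sequence[float]) -> float:
    """Sharpness: average distance between the upper and lower bounds."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Hypothetical hourly traffic volumes and prediction bounds (illustration only).
observed = [120, 135, 150, 160, 140]
lower = [110, 130, 155, 150, 130]
upper = [130, 145, 170, 175, 150]

picp = coverage_probability(observed, lower, upper)  # 4 of 5 points are covered
width = mean_interval_width(lower, upper)
print(f"coverage={picp:.2f}, mean width={width:.1f}")
```

An interval set is reliable at a given confidence level when its coverage is at least that level, and among reliable sets a narrower mean width is preferable.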
Using Social Media to Predict the Future: A Systematic Literature Review
Social media (SM) data provides a vast record of humanity's everyday
thoughts, feelings, and actions at a resolution previously unimaginable.
Because user behavior on SM is a reflection of events in the real world,
researchers have realized they can use SM in order to forecast, making
predictions about the future. The advantage of SM data is its relative ease of
acquisition, large quantity, and ability to capture socially relevant
information, which may be difficult to gather from other data sources.
Promising results exist across a wide variety of domains, but one will find
little consensus regarding best practices in either methodology or evaluation.
In this systematic review, we examine relevant literature over the past decade,
tabulate mixed results across a number of scientific disciplines, and identify
common pitfalls and best practices. We find that SM forecasting is limited by
data biases, noisy data, lack of generalizable results, a lack of
domain-specific theory, and underlying complexity in many prediction tasks. But
despite these shortcomings, recurring findings and promising results continue
to galvanize researchers and demand continued investigation. Based on the
existing literature, we identify research practices which lead to success,
citing specific examples in each case and making recommendations for best
practices. These recommendations will help researchers take advantage of the
exciting possibilities offered by SM platforms.
PowerNet: Neural Power Demand Forecasting in Smart Grid
Power demand forecasting is a critical task for achieving efficiency and
reliability in power grid operation. Accurate forecasting allows grid operators
to better maintain the balance of supply and demand as well as to optimize
operational cost for generation and transmission. This article proposes a novel
neural network architecture, PowerNet, which can incorporate multiple
heterogeneous features, such as historical energy consumption data, weather
data, and calendar information, for the power demand forecasting task. Compared
to two recent works based on Gradient Boosting Tree (GBT) and Support Vector
Regression (SVR), PowerNet demonstrates a decrease of 33.3% and 14.3% in
forecasting error, respectively. We further provide empirical results on two
operational considerations that are crucial when using PowerNet in practice,
i.e., how far in the future the model can forecast with a decent accuracy and
how often we should re-train the forecasting model to retain its modeling
capability. Finally, we briefly discuss a multilayer anomaly detection approach
based on PowerNet.
A Data Mining Approach Combining K-Means Clustering with Bagging Neural Network for Short-term Wind Power Forecasting
Wind power forecasting (WPF) is important for effectively guiding grid
dispatching and wind farm production planning. The intermittency and
volatility of wind, which lead to diversity in the training samples, have a
major impact on the forecasting accuracy. In this paper, to deal with the
dynamics of the training samples and improve the forecasting accuracy, a data mining
approach consisting of K-means clustering and bagging neural network is
proposed for short-term WPF. Based on the similarity among historical days,
K-means clustering is used to classify the samples into several categories,
which contain the information of meteorological conditions and historical power
data. In order to overcome the overfitting and instability problems of
conventional networks, a bagging-based ensemble approach is integrated into the
back propagation neural network. To confirm the effectiveness, the proposed
data mining approach is examined on real wind generation data traces. The
simulation results show that it can obtain better forecasting accuracy than
other baseline and existing short-term WPF approaches.
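The cluster-then-bag idea described in this abstract can be sketched as follows. This is an illustration under loud assumptions, not the authors' implementation: clustering is done on wind speed alone rather than on whole historical days, a least-squares line stands in for the back-propagation neural network as the bagged base learner, and all names and data are hypothetical.

```python
import random
from statistics import mean


def nearest(centroids, v):
    """Index of the centroid closest to scalar v."""
    return min(range(len(centroids)), key=lambda i: abs(centroids[i] - v))


def kmeans_1d(values, k, iters=20):
    """Lloyd's algorithm on scalars with a deterministic, spread-out init."""
    svals = sorted(values)
    centroids = [svals[(2 * i + 1) * len(svals) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(centroids, v)].append(v)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in learner)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def fit_bagged(xs, ys, n_models, rng):
    """Bagging: fit one base learner per bootstrap resample of the data."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagged(models, x):
    """Ensemble prediction: average of the base learners' outputs."""
    return mean(a * x + b for a, b in models)


# Hypothetical data: a low-wind regime (power rises with speed) and a
# high-wind regime (power flat at a rated value of 100).
speeds = [3 + 0.5 * i for i in range(8)] + [10 + 0.5 * i for i in range(8)]
power = [10 * s if s < 8 else 100.0 for s in speeds]

rng = random.Random(0)
centroids = kmeans_1d(speeds, k=2)
ensembles = []
for c in range(len(centroids)):
    members = [i for i, s in enumerate(speeds) if nearest(centroids, s) == c]
    ensembles.append(
        fit_bagged([speeds[i] for i in members], [power[i] for i in members], 10, rng)
    )


def forecast(speed):
    """Route the query to its cluster's ensemble, then average the learners."""
    return predict_bagged(ensembles[nearest(centroids, speed)], speed)


low_forecast = forecast(5.25)
high_forecast = forecast(12.25)
```

Training a separate ensemble per cluster lets each one specialize in a single regime, while averaging over bootstrap resamples is what reduces the variance of an unstable base learner.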