52,268 research outputs found
Efficient Forecasting for Hierarchical Time Series
Forecasting is used as the basis for business planning in many application areas such as energy, sales and traffic management. Time series data used in these areas is often hierarchically organized and thus, aggregated along the hierarchy levels based on their dimensional features. Calculating forecasts in these environments is very time consuming, due to ensuring forecasting consistency between hierarchy levels. To increase the forecasting efficiency for hierarchically organized time series, we introduce a novel forecasting approach that takes advantage of the hierarchical organization. There, we reuse the forecast models maintained on the lowest level of the hierarchy to almost instantly create already estimated forecast models on higher hierarchical levels. In addition, we define a hierarchical communication framework, increasing the communication flexibility and efficiency. Our experiments show significant runtime improvements for creating a forecast model at higher hierarchical levels, while still providing a very high accuracy
Tiresias: Online Anomaly Detection for Hierarchical Operational Network Data
Operational network data, management data such as customer care call logs and
equipment system logs, is a very important source of information for network
operators to detect problems in their networks. Unfortunately, there is lack of
efficient tools to automatically track and detect anomalous events on
operational data, causing ISP operators to rely on manual inspection of this
data. While anomaly detection has been widely studied in the context of network
data, operational data presents several new challenges, including the
volatility and sparseness of data, and the need to perform fast detection
(complicating application of schemes that require offline processing or
large/stable data sets to converge).
To address these challenges, we propose Tiresias, an automated approach to
locating anomalous events on hierarchical operational data. Tiresias leverages
the hierarchical structure of operational data to identify high-impact
aggregates (e.g., locations in the network, failure modes) likely to be
associated with anomalous events. To accommodate different kinds of operational
network data, Tiresias consists of an online detection algorithm with low time
and space complexity, while preserving high detection accuracy. We present
results from two case studies using operational data collected at a large
commercial IP network operated by a Tier-1 ISP: customer care call logs and
set-top box crash logs. By comparing with a reference set verified by the ISP's
operational group, we validate that Tiresias can achieve >94% accuracy in
locating anomalies. Tiresias also discovered several previously unknown
anomalies in the ISP's customer care cases, demonstrating its effectiveness
Enhanced news sentiment analysis using deep learning methods
We explore the predictive power of historical news sentiments based on financial market performance to forecast financial news sentiments. We define news sentiments based on stock price returns averaged over one minute right after a news article has been released. If the stock price exhibits positive (negative) return, we classify the news article released just prior to the observed stock return as positive (negative). We use Wikipedia and Gigaword five corpus articles from 2014 and we apply the global vectors for word representation method to this corpus to create word vectors to use as inputs into the deep learning TensorFlow network. We analyze high-frequency (intraday) Thompson Reuters News Archive as well as the high-frequency price tick history of the Dow Jones Industrial Average (DJIA 30) Index individual stocks for the period between 1/1/2003 and 12/30/2013. We apply a combination of deep learning methodologies of recurrent neural network with long short-term memory units to train the Thompson Reuters News Archive Data from 2003 to 2012, and we test the forecasting power of our method on 2013 News Archive data. We find that the forecasting accuracy of our methodology improves when we switch from random selection of positive and negative news to selecting the news with highest positive scores as positive news and news with highest negative scores as negative news to create our training data set.Published versio
- …