3 research outputs found

    Enhancing portfolio return based on sentiment-of-topic

    © 2017 While time-series analysis is commonly used in financial forecasting, a key source of market sentiment is often omitted. Financial news is known to have a persuasive impact on the markets. Without this additional source of signals, only sub-optimal predictions can be made. This paper proposes the notion of sentiment-of-topic (SoT) to address the problem. It is achieved by considering sentiment-linked topics, which are retrieved from time series with heterogeneous dimensions (i.e., numbers and text). Using this approach, we successfully improve the prediction accuracy of a proprietary trade recommendation platform. Unlike traditional sentiment analysis and unsupervised topic modeling methods, topics associated with different sentiment levels are used to quantify market conditions. In particular, sentiment levels are learned from historical market performance and commentaries instead of from subjective interpretations of human expressions. By capturing the domain knowledge of the respective industries and markets, an impressive double-digit improvement in portfolio return is obtained, as shown in our experiments.

    Hybrid intelligence for data mining

    Today, enormous amounts of data are being recorded in all kinds of activities. This sheer volume provides an excellent opportunity for data scientists to retrieve valuable information using data mining techniques. Due to the complexity of data in many neoteric problems, one-size-fits-all solutions are seldom able to provide satisfactory answers. Although the study of data mining has been active, hybrid techniques are rarely scrutinized in detail. Currently, few techniques can handle time-varying properties while performing their core functions, nor do they retrieve and combine information from heterogeneous dimensions, e.g., textual and numerical data. This thesis summarizes our investigations on hybrid methods that provide data mining solutions to problems involving non-trivial datasets, such as trajectories, microblogs, and financial data. First, time-varying dynamic Bayesian networks are extended to consider both causal and dynamic regularization requirements. Combined with density-based clustering, the enhancements overcome the difficulties in modeling spatial-temporal data where heterogeneous patterns, data sparseness, and distribution skewness are common. Secondly, topic-based methods are proposed for emerging outbreak and virality predictions on microblogs. Complicated models that consider structural details are popular, while others make overly simplified assumptions that sacrifice accuracy for efficiency. Our proposed virality prediction solution delivers the best of both worlds: it considers the important characteristics of a structure without the burden of fine details, which reduces complexity. Thirdly, the proposed topic-based approach for microblog mining is extended to sentiment prediction problems in finance. Sentiment-of-topic models are learned from both commentaries and prices for better risk management. Moreover, the previously proposed supervised topic model provides an avenue to associate market volatility with financial news, yet it displays poor resolution in extreme regions. To overcome this problem, an extreme topic model is proposed to predict volatility in financial markets by using supervised learning. By mapping extreme events into Poisson point processes, volatile regions are magnified to reveal their hidden volatility-topic relationships. Lastly, some of the proposed hybrid methods are applied to service computing to verify that they are sufficiently generic for wider applications.