
    Practical Bayesian support vector regression for financial time series prediction and market condition change detection

    Support vector regression (SVR) has long been proven to be a successful tool for predicting financial time series. The core idea of this study is to outline an automated framework that achieves a faster and easier parameter selection process and, at the same time, generates useful prediction uncertainty estimates, in order to tackle real-world financial time series prediction problems flexibly. A Bayesian approach to SVR is discussed and implemented. It is found that a direct implementation of the probabilistic framework of Gao et al. returns unsatisfactory results in our experiments. A novel enhancement is proposed: a new kernel scaling parameter is added to overcome the difficulties encountered. In addition, a multi-armed bandit Bayesian optimization technique is applied to automate the parameter selection process. Our framework is then tested on financial time series of various asset classes (equity indices, credit default swap spreads, bond yields, and commodity futures) to ensure its flexibility. It is shown that the generalization performance of this parameter selection process can match, and sometimes surpass, that of the computationally expensive cross-validation procedure. An adaptive calibration process is also described that allows practical use of the prediction uncertainty estimates to assess the quality of predictions. It is shown that the machine-learning approach discussed in this study can be developed into a very useful pricing tool and, potentially, a market condition change detector. A further extension is possible by taking the prediction uncertainties into consideration when building a financial portfolio.
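
    As a rough illustration of the parameter-selection idea, the sketch below tunes SVR hyperparameters with a Gaussian-process surrogate and an expected-improvement rule. It is not the authors' framework: their bandit component and kernel scaling parameter are not reproduced, and the toy data and all parameter ranges are illustrative assumptions.

```python
# Minimal sketch: Bayesian optimization of SVR hyper-parameters with a
# Gaussian-process surrogate and expected improvement (EI). Not the paper's
# multi-armed bandit method; data, ranges, and budgets are invented.
import numpy as np
from scipy.stats import norm
from sklearn.svm import SVR
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Toy "time series" regression task: five lagged values predict the next one.
series = np.cumsum(rng.normal(size=500))
X = np.column_stack([series[i:-5 + i] for i in range(5)])
y = series[5:]
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, shuffle=False)

def loss(log_params):
    """Validation MSE of an SVR fitted with the given log10 hyper-parameters."""
    C, gamma, eps = 10.0 ** np.asarray(log_params)
    model = SVR(C=C, gamma=gamma, epsilon=eps).fit(X_tr, y_tr)
    return np.mean((model.predict(X_va) - y_va) ** 2)

bounds = np.array([[-2, 3], [-4, 1], [-3, 0]])  # log10(C), log10(gamma), log10(eps)

# Initial random design, then sequential expected-improvement proposals.
P = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, 3))
L = np.array([loss(p) for p in P])
gp = GaussianProcessRegressor(normalize_y=True)
for _ in range(20):
    gp.fit(P, L)
    cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(256, 3))
    mu, sd = gp.predict(cand, return_std=True)
    best = L.min()
    z = (best - mu) / np.maximum(sd, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)   # EI for minimization
    p_next = cand[np.argmax(ei)]
    P = np.vstack([P, p_next])
    L = np.append(L, loss(p_next))

print("best log10(C, gamma, eps):", P[np.argmin(L)], "MSE:", L.min())
```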

    1-factorisation of the Composition of Regular Graphs

    1-factorability of the composition of graphs is studied. The following sufficient conditions are proved: G[H] is 1-factorable if G and H are regular and at least one of the following holds: (i) G and H both contain a 1-factor, (ii) G is 1-factorable, or (iii) H is 1-factorable. It is also shown that the tensor product G ⊗ H is 1-factorable if at least one of the two graphs is 1-factorable. This result in turn implies that the strong tensor product G ⊗′ H is 1-factorable if G is 1-factorable.
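
    The tensor-product case can be checked concretely on a small instance. The sketch below (an illustration assuming networkx, not the paper's proof) uses the fact that K2 ⊗ H is regular and bipartite, so perfect matchings can be peeled off greedily until the edge set is exhausted. K2 is 1-factorable, so the theorem predicts K2 ⊗ H is too, even for H the Petersen graph, which is not itself 1-factorable.

```python
# Illustrative check: peel perfect matchings off K2 (x) Petersen.
import networkx as nx

H = nx.petersen_graph()                          # 3-regular, itself not 1-factorable
T = nx.tensor_product(nx.complete_graph(2), H)   # K2 (x) H: bipartite and 3-regular

factors, R = [], T.copy()
while R.number_of_edges():
    # A maximum matching in a regular bipartite graph is perfect (Konig/Hall),
    # and removing it leaves the graph regular and bipartite, so peeling works.
    M = nx.max_weight_matching(R, maxcardinality=True)
    assert 2 * len(M) == T.number_of_nodes()
    factors.append(M)
    R.remove_edges_from(M)

print(f"K2 (x) Petersen splits into {len(factors)} edge-disjoint perfect matchings")
```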

    A Balanced Route Design for Min-Max Multiple-Depot Rural Postman Problem (MMMDRPP): a police patrolling case

    Providing distributed services on road networks is an essential concern in many applications, such as mail delivery, logistics, and police patrolling. Designing effective and balanced routes for these applications is challenging, especially when multiple postmen operate from distinct depots. In this research, we formulate this routing problem as a Min-Max Multiple-Depot Rural Postman Problem (MMMDRPP). To solve it, we develop an efficient tabu-search-based algorithm and propose three novel lower bounds to evaluate the routes. To demonstrate its practical usefulness, we show how to formulate the route design for police patrolling in London as an MMMDRPP and generate balanced routes using the proposed algorithm. Furthermore, the algorithm is tested on multiple adapted benchmark problems. The results demonstrate the efficiency of the algorithm in generating balanced routes.
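
    To convey the tabu-search flavour, the toy sketch below balances a min-max objective across several depots by repeatedly moving one street segment out of the longest route, with a short-term tabu list forbidding immediate reversals. It is a deliberately simplified stand-in, not the paper's algorithm: real MMMDRPP moves operate on a road graph and use the proposed lower bounds, and all coordinates and parameters here are invented.

```python
# Toy tabu search for min-max route balancing across multiple depots.
import math
import random

random.seed(1)
SEGMENTS = [(random.random() * 10, random.random() * 10) for _ in range(30)]
DEPOTS = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def route_cost(depot, segs):
    """Nearest-neighbour tour from the depot over assigned segment midpoints."""
    cost, pos, todo = 0.0, depot, list(segs)
    while todo:
        nxt = min(todo, key=lambda s: dist(pos, s))
        cost += dist(pos, nxt)
        pos = nxt
        todo.remove(nxt)
    return cost + dist(pos, depot)

def makespan(assign):
    return max(route_cost(DEPOTS[k], [SEGMENTS[i] for i in assign[k]])
               for k in range(len(DEPOTS)))

# Initial assignment: each segment goes to its nearest depot.
assign = [set() for _ in DEPOTS]
for i, s in enumerate(SEGMENTS):
    assign[min(range(len(DEPOTS)), key=lambda k: dist(DEPOTS[k], s))].add(i)

best_cost, tabu = makespan(assign), {}
for it in range(200):
    # Shift one segment out of the currently longest route (best non-tabu move).
    worst = max(range(len(DEPOTS)),
                key=lambda k: route_cost(DEPOTS[k], [SEGMENTS[i] for i in assign[k]]))
    moves = [(i, k) for i in assign[worst] for k in range(len(DEPOTS))
             if k != worst and tabu.get((i, k), -1) < it]
    if not moves:
        break
    def try_move(m):
        i, k = m
        assign[worst].discard(i); assign[k].add(i)
        c = makespan(assign)
        assign[k].discard(i); assign[worst].add(i)
        return c
    i, k = min(moves, key=try_move)
    assign[worst].discard(i); assign[k].add(i)
    tabu[(i, worst)] = it + 10          # forbid moving it straight back for a while
    best_cost = min(best_cost, makespan(assign))

print("balanced makespan:", round(best_cost, 2))
```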

    Computational analysis of stochastic heterogeneity in PCR amplification efficiency revealed by single molecule barcoding

    The polymerase chain reaction (PCR) is one of the most widely used techniques in molecular biology. In combination with high-throughput sequencing (HTS), PCR is widely used to quantify transcript abundance for RNA-seq and in the analysis of T- and B-cell receptor repertoires. In this study, we combine DNA barcoding with HTS to quantify the PCR output from individual target molecules. We develop computational tools that simulate both the PCR branching process itself and the subsequent subsampling that typically occurs during HTS. We explore the influence of different types of heterogeneity on sequencing output and compare them to experimental results in which the efficiency of amplification is measured by barcodes uniquely identifying each molecule of starting template. Our results demonstrate that the PCR process introduces substantial amplification heterogeneity, independent of primer sequence and bulk experimental conditions. This heterogeneity can be attributed both to inherited differences between template DNA molecules and to the inherent stochasticity of the PCR process. The results demonstrate that PCR heterogeneity arises even when reaction and substrate conditions are kept as constant as possible, and that single-molecule barcoding is therefore essential to derive reproducible quantitative results from any protocol combining PCR with HTS.
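
    The kind of simulation the study describes can be sketched in a few lines. The toy model below (an illustration, not the paper's tool) gives each barcoded template its own inherited efficiency, duplicates copies stochastically each cycle, and then subsamples reads in proportion to final copy number; the beta prior, cycle count, and read depth are all invented parameters.

```python
# Toy PCR branching simulation with per-molecule efficiency and HTS subsampling.
import numpy as np

rng = np.random.default_rng(42)
n_templates, n_cycles = 1000, 20

# Inherited heterogeneity: each starting molecule draws its own efficiency.
eff = rng.beta(20, 5, size=n_templates)          # mean efficiency ~0.8 per cycle

# Stochastic branching: in each cycle, every copy duplicates with prob = eff.
copies = np.ones(n_templates, dtype=np.int64)
for _ in range(n_cycles):
    copies += rng.binomial(copies, eff)

# HTS subsampling: reads are drawn from the pool in proportion to copy number.
total_reads = 200_000
reads = rng.multinomial(total_reads, copies / copies.sum())

cv = reads.std() / reads.mean()
print(f"mean reads/barcode: {reads.mean():.1f}, coefficient of variation: {cv:.2f}")
```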

    Localized lasso for high-dimensional regression

    We introduce the localized Lasso, which learns models that are both interpretable and highly predictive in problems with high dimensionality d and small sample size n. More specifically, we consider a function defined by local sparse models, one at each data point. We introduce sample-wise network regularization to borrow strength across the models, and sample-wise exclusive group sparsity (a.k.a. the ℓ1,2 norm) to introduce diversity into the choice of feature sets in the local models. The local models are interpretable in terms of the similarity of their sparsity patterns. The cost function is convex, and thus has a globally optimal solution. Moreover, we propose a simple yet efficient iterative least-squares based optimization procedure for the localized Lasso, which does not need a tuning parameter and is guaranteed to converge to a globally optimal solution. The solution is empirically shown to outperform alternatives on both simulated and genomic personalized/precision medicine data.
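
    The general shape of such an objective can be written down explicitly. The formulation below is a reconstruction for illustration only (the paper's exact notation and weighting may differ): each sample i carries its own weight vector w_i, a network term with link weights R_ij ties neighbouring samples' models together, and a squared-ℓ1 (exclusive) term keeps each local model sparse.

```latex
% Illustrative reconstruction of a network-regularized, exclusively sparse
% objective; \lambda_1, \lambda_2 and the link weights R_{ij} are assumptions.
\min_{\mathbf{w}_1,\dots,\mathbf{w}_n}\;
  \sum_{i=1}^{n} \bigl(y_i - \mathbf{w}_i^{\top}\mathbf{x}_i\bigr)^{2}
  \;+\; \lambda_1 \sum_{i,j} R_{ij}\,\lVert \mathbf{w}_i - \mathbf{w}_j \rVert_{2}
  \;+\; \lambda_2 \sum_{i=1}^{n} \lVert \mathbf{w}_i \rVert_{1}^{2}
```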

    Disease classification from capillary electrophoresis-mass spectrometry

    We investigate the possibility of using pattern recognition techniques to classify various disease types from data produced by a new form of rapid mass spectrometry. The data format has several advantages over other high-throughput technologies, and as such could become a useful diagnostic tool. We investigate the binary and multi-class performance obtained using standard classifiers as the number of features is varied, conclude that the technique shows potential, and suggest research directions that would improve performance.
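
    The experimental pattern described, sweeping the number of retained features across standard classifiers, looks roughly like the sketch below. It is a generic stand-in, not the paper's pipeline: the synthetic data, the univariate feature filter, and the two classifiers are all assumptions.

```python
# Generic sketch: classifier accuracy as the number of retained features varies.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for high-dimensional spectra with few samples.
X, y = make_classification(n_samples=200, n_features=500, n_informative=25,
                           n_classes=3, random_state=0)

for k in (10, 50, 200, 500):
    for name, clf in (("logreg", LogisticRegression(max_iter=2000)),
                      ("svm", SVC())):
        pipe = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=k), clf)
        acc = cross_val_score(pipe, X, y, cv=5).mean()
        print(f"k={k:4d}  {name:6s}  accuracy={acc:.3f}")
```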

    Probabilistic map-matching for low-frequency GPS trajectories

    The ability to infer the routes taken by vehicles from sparse and noisy GPS data is of crucial importance in many traffic applications. The task, known as map-matching, can be approached accurately by a popular technique known as ST-Matching. The algorithm is computationally efficient and has been shown to outperform more traditional map-matching approaches, especially on low-frequency GPS data. Its major drawback is the lack of confidence scores associated with its outputs, which are particularly useful when GPS data quality is low. In this paper, we propose a probabilistic adaptation of ST-Matching that equips it with the ability to express map-matching certainty using probabilities. The adaptation, called probabilistic ST-Matching (PST-Matching), is inspired by similarities between ST-Matching and probabilistic approaches to map-matching based on a hidden Markov model. We validate the proposed algorithm on GPS trajectories of varied quality and show that it is similar to ST-Matching in terms of accuracy and computational efficiency, with the added benefit of a measure of confidence associated with its outputs.
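
    The hidden-Markov-model view that PST-Matching draws on can be sketched compactly: candidate road segments are hidden states, GPS error drives emissions, route plausibility drives transitions, and Viterbi decoding picks the most likely segment sequence. The sketch below is that generic HMM baseline, not PST-Matching itself; the noise scales, candidate lists, and detour costs are invented.

```python
# Generic HMM map-matching baseline with Viterbi decoding.
import math

SIGMA = 20.0   # assumed GPS noise (metres)
BETA = 200.0   # assumed detour scale (metres)

def emission(d):
    """Gaussian likelihood of a GPS fix at distance d from a candidate segment."""
    return math.exp(-0.5 * (d / SIGMA) ** 2)

def transition(detour):
    """Penalise routes much longer than the straight line between fixes."""
    return math.exp(-abs(detour) / BETA)

def viterbi(candidates, detour_cost):
    """candidates[t] = [(segment_id, dist_to_fix), ...]; detour_cost(a, b) in metres."""
    V = [{seg: math.log(emission(d) + 1e-12) for seg, d in candidates[0]}]
    back = []
    for t in range(1, len(candidates)):
        V.append({}); back.append({})
        for seg, d in candidates[t]:
            prev, score = max(
                ((p, V[t - 1][p] + math.log(transition(detour_cost(p, seg)) + 1e-12))
                 for p, _ in candidates[t - 1]),
                key=lambda x: x[1])
            V[t][seg] = score + math.log(emission(d) + 1e-12)
            back[-1][seg] = prev
    path = [max(V[-1], key=V[-1].get)]
    for t in range(len(back) - 1, -1, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

# Tiny worked example: three fixes with two candidate segments each.
cands = [[("a", 5.0), ("b", 30.0)],
         [("a", 25.0), ("c", 8.0)],
         [("c", 6.0), ("b", 40.0)]]
detour = lambda p, q: 0.0 if p == q else 50.0   # hypothetical detour metres
print(viterbi(cands, detour))                    # -> ['a', 'c', 'c']
```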

    Probabilistic Map-matching using Particle Filters

    The increasing availability of vehicle GPS data has created potentially transformative opportunities for traffic management, route planning and other location-based services. Critical to the utility of the data is their accuracy. Map-matching is the process of improving that accuracy by aligning GPS data with the road network. In this paper, we propose a purely probabilistic approach to map-matching based on a sequential Monte Carlo algorithm known as the particle filter. The approach performs map-matching by producing a range of candidate solutions, each with an associated probability score. We outline implementation details and thoroughly validate the technique on GPS data of varied quality.
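
    The predict-weight-resample loop at the heart of a particle filter can be shown on a one-dimensional toy road. The sketch below illustrates the general technique, not the paper's implementation: a position along a single road stands in for the road network, every numeric parameter is invented, and the spread of surviving particles plays the role of the probability score attached to candidate solutions.

```python
# Toy 1-D particle filter: track a vehicle's position along a road from noisy fixes.
import numpy as np

rng = np.random.default_rng(7)
N = 1000                      # number of particles
true_pos, speed = 0.0, 10.0   # metres, metres per step
gps_sigma = 15.0              # assumed GPS noise

particles = rng.normal(0.0, 5.0, N)   # initial position hypotheses
for step in range(20):
    true_pos += speed
    gps_fix = true_pos + rng.normal(0.0, gps_sigma)

    # Predict: move every particle, with process noise for the unknown speed.
    particles += speed + rng.normal(0.0, 3.0, N)

    # Weight by how well each particle explains the GPS fix, then resample.
    w = np.exp(-0.5 * ((gps_fix - particles) / gps_sigma) ** 2)
    w /= w.sum()
    particles = rng.choice(particles, size=N, p=w)

est, spread = particles.mean(), particles.std()
print(f"estimate {est:.1f} m vs truth {true_pos:.1f} m  (+/- {spread:.1f} m)")
```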