
    Integration of feature subset selection methods for sentiment analysis

    Feature selection is one of the main challenges in sentiment analysis: finding an optimal feature subset from a real-world domain. The complexity of optimal feature subset selection grows exponentially with the number of features, which makes analysing and organizing data in high-dimensional spaces difficult. To overcome this problem, this study enhanced feature subset selection in high-dimensional data by removing irrelevant and redundant features using filter and wrapper approaches. Initially, a filter method based on the dispersion of samples in feature space, known as the mutual standard deviation method, was developed to minimize intra-class distances and maximize inter-class distances. Filter-based methods have several advantages: they scale easily to high-dimensional datasets and are computationally simple and fast. However, they depend only on the feature space and ignore the hypothesis (model) space. Hence, the next step of this study developed a new feature ranking approach that integrates various filter methods; both ordinal-based and frequency-based integration schemes were developed (a minimal sketch of the ordinal scheme follows this abstract). Finally, a hybrid harmony search-based strategy was developed to enhance feature subset selection and to overcome the problem of ignoring the dependency of feature selection on the classifier. To this end, a search strategy on the feature space integrating filter and wrapper approaches was introduced to find a semantic relationship between model selection and the searched feature subsets. Comparative experiments were performed on five sentiment datasets: movie, music, book, electronics, and kitchen reviews. A sizeable performance improvement was noted: the proposed integration-based feature subset selection method achieved 98.32% accuracy in sentiment classification using POS-based features on movie reviews. Finally, a statistical test on the accuracies showed significant differences between the proposed methods and the baseline methods in almost all comparisons under k-fold cross-validation. The findings show that the mutual standard deviation and integration-based feature subset selection methods outperformed the baseline methods in terms of accuracy.
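    As a toy illustration of the ordinal-based integration idea, the Python sketch below scores features under several off-the-shelf scikit-learn filter criteria and sums the per-method ranks. The specific filters are stand-ins; the paper's mutual standard deviation filter is not reproduced here.

```python
# A minimal sketch of ordinal-based integration of filter rankings.
# The filter functions used (chi2, f_classif, mutual_info_classif) are
# stand-ins for the methods combined in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import chi2, f_classif, mutual_info_classif

X, y = make_classification(n_samples=200, n_features=50, random_state=0)
X = np.abs(X)  # chi2 requires non-negative feature values

# Score every feature with several filter methods.
scores = [
    chi2(X, y)[0],
    f_classif(X, y)[0],
    mutual_info_classif(X, y, random_state=0),
]

# Ordinal integration: convert each score vector to ranks
# (0 = best) and sum the ranks across methods.
ranks = [np.argsort(np.argsort(-s)) for s in scores]
combined = np.sum(ranks, axis=0)

k = 10  # keep the k features with the best (lowest) combined rank
selected = np.argsort(combined)[:k]
print("selected feature indices:", selected)
```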

    Optimizing gravitational-wave searches for a population of coalescing binaries: Intrinsic parameters

    We revisit the problem of searching for gravitational waves from inspiralling compact binaries in Gaussian coloured noise. For binaries with quasicircular orbits and non-precessing component spins, considering dominant mode emission only, if the intrinsic parameters of the binary are known then the optimal statistic for a single detector is the well-known two-phase matched filter. However, the matched filter signal-to-noise ratio is not in general an optimal statistic for an astrophysical population of signals, since their distribution over the intrinsic parameters will almost certainly not mirror that of noise events, which is determined by the (Fisher) information metric. Instead, the optimal statistic for a given astrophysical distribution will be the Bayes factor, which we approximate using the output of a standard template matched filter search. We then quantify the possible improvement in the number of signals detected for various populations of non-spinning binaries: for a distribution of signals uniform in volume and with component masses distributed uniformly over the range $1 \leq m_{1,2}/M_\odot \leq 24$, $(m_1 + m_2)/M_\odot \leq 25$ at fixed expected SNR, we find $\gtrsim 20\%$ more signals at a false alarm threshold of $10^{-6}$ Hz in a single detector. The method may easily be generalized to binaries with non-precessing spins.
    Comment: Version accepted by Phys. Rev.
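    The reweighting idea can be sketched as follows: approximate the Bayes factor by a sum over the template bank of the per-template likelihood ratio, roughly exp(rho^2/2) for matched-filter SNR rho, weighted by the ratio of the astrophysical prior density to the bank (noise-metric) density at each template. The weights and SNR values below are illustrative placeholders, not the paper's population model.

```python
# Minimal sketch: rank candidate events by an approximate Bayes factor
# rather than by peak matched-filter SNR. The weights are placeholder
# assumptions standing in for p_astro(theta_i) / rho_bank(theta_i).
import numpy as np

def approx_log_bayes_factor(snr_per_template, weights):
    """Approximate log B, with B = sum_i w_i * Lambda(theta_i) and the
    per-template likelihood ratio Lambda ~ exp(rho_i^2 / 2)."""
    rho2 = np.asarray(snr_per_template) ** 2
    # Work in log space to avoid overflow for loud signals.
    log_terms = np.log(weights) + rho2 / 2.0
    m = log_terms.max()
    return m + np.log(np.sum(np.exp(log_terms - m)))

# Toy example: three templates with candidate SNRs and normalized
# prior-to-bank-density weights (made up for illustration).
snrs = [5.2, 7.9, 6.1]
weights = np.array([0.5, 0.2, 0.3])
print("log B ~", approx_log_bayes_factor(snrs, weights))
```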

    Discriminative Scale Space Tracking

    Accurate scale estimation of a target is a challenging research problem in visual object tracking. Most state-of-the-art methods employ an exhaustive scale search to estimate the target size. The exhaustive search strategy is computationally expensive and struggles when faced with large scale variations. This paper investigates the problem of accurate and robust scale estimation in a tracking-by-detection framework. We propose a novel scale-adaptive tracking approach that learns separate discriminative correlation filters for translation and scale estimation. The explicit scale filter is learned online using the target appearance sampled at a set of different scales. Contrary to standard approaches, our method directly learns the appearance change induced by variations in the target scale. Additionally, we investigate strategies to reduce the computational cost of our approach. Extensive experiments are performed on the OTB and VOT2014 datasets. Compared to the standard exhaustive scale search, our approach achieves a gain of 2.5% in average overlap precision on the OTB dataset. Additionally, our method is computationally efficient, operating at a 50% higher frame rate than the exhaustive scale search. Our method obtains the top rank in performance by outperforming 19 state-of-the-art trackers on OTB and 37 state-of-the-art trackers on VOT2014.
    Comment: To appear in TPAMI. This is the journal extension of the VOT2014-winning DSST tracking method.
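    In the spirit of the translation/scale factorization, the sketch below trains a 1D discriminative correlation filter over features sampled at a set of scales, using the standard MOSSE/DSST-style closed-form Fourier solution. The feature extraction is mocked with random vectors; this is a schematic reading of the approach, not the authors' implementation.

```python
# Minimal sketch of a 1D scale correlation filter. Each training sample
# is a feature vector extracted from the target at one of S scales; the
# filter is learned in the Fourier domain against a Gaussian label.
import numpy as np

def train_scale_filter(scale_feats, sigma=1.0, lam=1e-2):
    """scale_feats: (S, D) array, one row of features per sampled scale."""
    S = scale_feats.shape[0]
    # Desired response: Gaussian peaked at the current (centre) scale.
    shifts = np.arange(S) - S // 2
    g = np.exp(-0.5 * (shifts / sigma) ** 2)
    G = np.fft.fft(g)
    F = np.fft.fft(scale_feats, axis=0)          # FFT along the scale axis
    A = np.conj(G)[:, None] * F                  # per-dimension numerator
    B = np.sum(F * np.conj(F), axis=1) + lam     # regularized denominator
    return A, B

def detect_scale(A, B, scale_feats):
    """Return the index of the scale with maximal filter response."""
    Z = np.fft.fft(scale_feats, axis=0)
    resp = np.real(np.fft.ifft(np.sum(np.conj(A) * Z, axis=1) / B))
    return int(np.argmax(resp))

# Toy usage with random features at S=17 scales, D=32 feature dims.
rng = np.random.default_rng(0)
feats = rng.normal(size=(17, 32))
A, B = train_scale_filter(feats)
print("estimated scale index:", detect_scale(A, B, feats))
```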

    A 2D based Partition Strategy for Solving Ranking under Team Context (RTC)

    In this paper, we propose a 2D-based partition method for solving the problem of Ranking under Team Context (RTC) on datasets without a priori knowledge. We first map the data into 2D space using each tuple's minimum and maximum values across all dimensions. We then construct window queries that take the current team context into account. In addition, during the query mapping procedure, we can pre-prune tuples that cannot be top-ranked; this pre-classification step defers processing of those tuples and saves cost while still yielding correct solutions (a minimal sketch of the 2D mapping and pruning follows). Experiments show that our algorithm performs well, especially on large datasets, while maintaining correctness.
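    A minimal sketch of the 2D mapping step, assuming a higher-is-better dominance rule for pre-pruning (the paper's actual window-query construction is not reproduced): each tuple is projected to its (min, max) over all dimensions, and tuples dominated in this 2D space are deferred.

```python
# Minimal sketch of the 2D mapping idea: project each tuple onto
# (min over dimensions, max over dimensions) and prune tuples that are
# dominated in this 2D space. The pruning rule is an illustrative
# assumption, not the paper's exact construction.
from typing import List, Tuple

def to_2d(tuples: List[Tuple[float, ...]]) -> List[Tuple[float, float]]:
    """Map each d-dimensional tuple to its (min, max) across dimensions."""
    return [(min(t), max(t)) for t in tuples]

def prune_dominated(points: List[Tuple[float, float]]) -> List[int]:
    """Keep indices of points not dominated by another point (assuming
    higher is better on both coordinates); dominated tuples are deferred."""
    kept = []
    for i, (lo, hi) in enumerate(points):
        dominated = any(
            lo2 >= lo and hi2 >= hi and (lo2 > lo or hi2 > hi)
            for j, (lo2, hi2) in enumerate(points) if j != i
        )
        if not dominated:
            kept.append(i)
    return kept

data = [(3.0, 7.0, 5.0), (1.0, 2.0, 1.5), (6.0, 6.5, 6.2)]
points = to_2d(data)
print("candidate tuple indices:", prune_dominated(points))
```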