41 research outputs found

    Online Deception Detection Refueled by Real World Data Collection

    The lack of large realistic datasets presents a bottleneck in online deception detection studies. In this paper, we apply a data collection method based on social network analysis to quickly identify high-quality deceptive and truthful online reviews from Amazon. The dataset contains more than 10,000 deceptive reviews and is diverse in product domains and reviewers. Using this dataset, we explore effective general features for online deception detection that perform well across domains. We demonstrate that with generalized features (advertising speak and writing complexity scores), deception detection performance can be further improved by adding deceptive reviews from assorted domains in training. Finally, reviewer-level evaluation gives an interesting insight into different deceptive reviewers' writing styles. Comment: 10 pages, accepted to Recent Advances in Natural Language Processing (RANLP) 201

    Optimal-k difference sequence in nonparametric regression

    Difference-based methods have been attracting increasing attention in nonparametric regression, in particular for estimating the residual variance. To implement the estimation, one needs to choose an appropriate difference sequence, mainly between the optimal difference sequence and the ordinary difference sequence. The difference sequence selection is a fundamental problem in nonparametric regression, and it has remained a controversial issue for over three decades. In this paper, we propose to tackle this challenging issue from a new perspective, namely by introducing a new difference sequence called the optimal-k difference sequence. The new difference sequence not only provides a better bias-variance trade-off, but also dramatically enlarges the existing family of difference sequences, which includes the optimal and ordinary difference sequences as two important special cases. We further demonstrate, by both theoretical and numerical studies, that the optimal-k difference sequence pushes the boundaries of our knowledge in difference-based methods in nonparametric regression, and it always performs the best in practical situations.
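
For orientation, here is a minimal sketch of a difference-based estimate of the residual variance. It uses only the classical first-order ordinary difference sequence, not the optimal-k sequence proposed in the paper. It assumes the observations are ordered along an equidistant design and that the mean function is smooth, so that neighbouring differences are dominated by noise. The function name `first_order_difference_variance` is a hypothetical label chosen for this illustration.

```python
def first_order_difference_variance(y):
    """Estimate the residual variance from successive differences.

    Assumption: y is ordered by the design points and the regression
    function changes little between neighbours, so y[i+1] - y[i] is
    approximately the difference of two independent noise terms, whose
    variance is twice the residual variance.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    # Sum of squared first-order differences.
    squared_diffs = sum((y[i + 1] - y[i]) ** 2 for i in range(n - 1))
    # Divide by 2 * (n - 1) to undo the doubling of the variance.
    return squared_diffs / (2 * (n - 1))
```

Higher-order difference sequences replace the pair of weights (1, -1) with longer weight vectors; choosing those weights is the selection problem the abstract refers to.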

    Global Depths for Irregularly Observed Multivariate Functional Data

    Two frameworks for multivariate functional depth based on multivariate depths are introduced in this paper. The first framework is multivariate functional integrated depth, and the second framework involves multivariate functional extremal depth, which is an extension of the extremal depth for univariate functional data. In each framework, global and local multivariate functional depths are proposed. The properties of population multivariate functional depths and consistency of finite sample depths to their population versions are established. In addition, finite sample depths under irregularly observed time grids are estimated. As a by-product, the simplified sparse functional boxplot and simplified intensity sparse functional boxplot are proposed for visualization without data reconstruction. A simulation study demonstrates the advantages of global multivariate functional depths over local multivariate functional depths in outlier detection and running time for big functional data. An application of our frameworks to cyclone tracks data demonstrates the excellent performance of our global multivariate functional depths. Comment: 29 pages, 6 figures

    A New Functional Clustering Method with Combined Dissimilarity Sources and Graphical Interpretation

    Clustering is an essential task in functional data analysis. In this study, we propose a framework for a clustering procedure based on functional rankings or depth. Our methods naturally combine various types of between-cluster variation equally, which caters to various discriminative sources of functional data; for example, they combine raw data with transformed data or various components of multivariate functional data with their covariance. Our methods also enhance the clustering results with a visualization tool that allows intrinsic graphical interpretation. Finally, our methods are model-free and nonparametric and hence are robust to heavy-tailed distributions and potential outliers. The implementation and performance of the proposed methods are illustrated with a simulation study and three real-world applications.

    Time Reversal Enabled Fiber-Optic Time Synchronization

    Over the past few decades, fiber-optic time synchronization (FOTS) has provided fundamental support for the efficient operation of modern society. Looking toward future beyond fifth-generation/sixth-generation (B5G/6G) scenarios and very large radio telescope arrays, developing high-precision, low-complexity and scalable FOTS technology is crucial for building a large-scale time synchronization network. However, the traditional two-way FOTS method needs a data layer to exchange time delay information. This increases the complexity of the system and makes it impossible to realize multiple-access time synchronization. In this paper, a time reversal enabled FOTS method is proposed. It measures the clock difference between two locations without involving a data layer, which reduces the complexity of the system. Moreover, it can also achieve multiple-access time synchronization along the fiber link. Tests over a 230 km fiber link have been carried out to demonstrate the high performance of the proposed method.
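
To illustrate why the traditional two-way scheme needs the two ends to exchange timestamps, here is a minimal sketch of the generic two-way offset computation. It is not the time reversal method proposed in the paper. It assumes the propagation delay is the same in both directions, and the function name `two_way_offset` and the timestamp names `t1` to `t4` are hypothetical labels for this illustration.

```python
def two_way_offset(t1, t2, t3, t4):
    """Generic two-way time transfer between sites A and B.

    t1: A sends a pulse (read on A's clock)
    t2: B receives it   (read on B's clock)
    t3: B sends a pulse (read on B's clock)
    t4: A receives it   (read on A's clock)

    Assumption: the one-way delay is identical in both directions.
    Returns (offset, delay), where offset is how far B's clock is
    ahead of A's clock.
    """
    forward = t2 - t1    # delay + offset
    backward = t4 - t3   # delay - offset
    offset = (forward - backward) / 2
    delay = (forward + backward) / 2
    return offset, delay
```

Because `t2` and `t3` are read at one site and `t1` and `t4` at the other, one site must send its readings to the other before the offset can be computed, which is the role of the data layer mentioned in the abstract.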

    BPTF promotes tumor growth and predicts poor prognosis in lung adenocarcinomas.

    BPTF, a subunit of NURF, is well known to be involved in the development of eukaryotic cells, but little is known about its roles in cancers, especially in non-small-cell lung cancer (NSCLC). Here we showed that BPTF was specifically overexpressed in NSCLC cell lines and lung adenocarcinoma tissues. Knockdown of BPTF by siRNA significantly inhibited cell proliferation, induced cell apoptosis and arrested cell cycle progression from G1 to S phase. We also found that BPTF knockdown downregulated the expression of the phosphorylated Erk1/2, PI3K and Akt proteins and induced the cleavage of caspase-8, caspase-7 and PARP proteins, thereby inhibiting MAPK and PI3K/AKT signaling and activating the apoptotic pathway. BPTF knockdown by siRNA also upregulated cell cycle inhibitors such as p21 and p18 but inhibited the expression of cyclin D, phospho-Rb and phospho-cdc2 in lung cancer cells. Moreover, BPTF knockdown by its specific shRNA inhibited lung cancer growth in vivo in xenografts of A549 cells, accompanied by the suppression of VEGF, p-Erk and p-Akt expression. Immunohistochemical assay of lung tumor tissue microarrays showed that BPTF overexpression predicted a poor prognosis in patients with lung adenocarcinomas. Therefore, our data indicate that BPTF plays an essential role in cell growth and survival by targeting multiple signaling pathways in human lung cancers.

    Optimal Estimation of Derivatives in Nonparametric Regression

    We propose a simple framework for estimating derivatives without fitting the regression function in nonparametric regression. Unlike most existing methods that use the symmetric difference quotients, our method is constructed as a linear combination of observations. It is hence very flexible and applicable to both interior and boundary points, including most existing methods as special cases of ours. Within this framework, we define the variance-minimizing estimators for any order derivative of the regression function with a fixed bias-reduction level. For the equidistant design, we derive the asymptotic variance and bias of these estimators. We also show that our new method will, for the first time, achieve the asymptotically optimal convergence rate for difference-based estimators. Finally, we provide an effective criterion for selection of tuning parameters and demonstrate the usefulness of the proposed method through extensive simulation studies of the first- and second-order derivative estimators.
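
As a point of reference, the symmetric difference quotient that the abstract contrasts with can be sketched as follows. This is the baseline, not the variance-minimizing estimator proposed in the paper. The function name `symmetric_difference_quotient` and its arguments are hypothetical labels for this illustration.

```python
def symmetric_difference_quotient(x, y, i, j=1):
    """First-derivative estimate at x[i] from the points j steps away.

    Assumption: x is sorted in increasing order. The quotient needs
    observations on both sides of x[i], so it is only defined for
    interior indices with j <= i <= len(x) - 1 - j.
    """
    if j < 1 or i - j < 0 or i + j >= len(x):
        raise IndexError("symmetric quotient needs j points on each side")
    return (y[i + j] - y[i - j]) / (x[i + j] - x[i - j])
```

The index check makes the boundary limitation explicit: near either end of the design there are no symmetric neighbours, which is the situation the abstract says a general linear combination of observations can handle.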