3 research outputs found

    SURVEY OF E-MAIL CLASSIFICATION: REVIEW AND OPEN ISSUES

    Email is an economical means of communication whose importance continues to grow despite the availability of alternatives such as electronic messaging, social networks, and phone applications. The business arena depends heavily on email, which calls for proper email management in the face of disruptive factors such as spam, phishing emails, and multi-folder categorization. This study reviews the email studies published during 2016-2020, analyzing their problem descriptions in terms of datasets, application areas, classification techniques, and feature sets. In addition, other areas involving email classification were identified and comprehensively reviewed. The results indicate four email application areas, and open issues and research directions in email classification are identified for further investigation.

    Understanding comparative questions and retrieving argumentative answers

    Making decisions is an integral part of everyday life, yet it can be a difficult and complex process. While people's wants and needs are unlimited, resources are often scarce, making it necessary to research the possible alternatives and weigh the pros and cons before making a decision. Nowadays, the Internet has become the main source of information for comparing alternatives, making search engines the primary means of collecting new information. However, relying only on term matching is not sufficient to adequately address requests for comparisons. Therefore, search systems should go beyond this approach to effectively address comparative information needs. In this dissertation, I explore from different perspectives how search systems can respond to comparative questions. First, I examine approaches to identifying comparative questions and study their underlying information needs. Second, I investigate a methodology to identify important constituents of comparative questions, such as the to-be-compared options, and to detect the stance of answers towards these comparison options. Third, I address ambiguous comparative search queries by studying an interactive clarification search interface. Finally, to answer comparative questions, I investigate retrieval approaches that consider not only the topical relevance of potential answers but also the presence of arguments towards the comparison options mentioned in the questions. By addressing these facets, I aim to provide a comprehensive understanding of how to effectively satisfy the information needs of searchers seeking to compare different alternatives.

    Analyzing intentions from big data traces of human activities

    The rapid growth of big data formed by human activities makes research on intention analysis both challenging and rewarding. We study multifaceted problems in analyzing intentions from big data traces of human activities; these problems span machine learning, optimization, and security and privacy. We show that analyzing intentions from industry-scale human activity big data can effectively improve the accuracy of computational models. Specifically, we take query auto-completion as a case study. We identify two hitherto-undiscovered problems: adaptive query auto-completion and mobile query auto-completion. We develop two computational models by analyzing intentions from big data traces of human activities on search interface interactions and on mobile application usage, respectively. Solving the large-scale optimization problems in the proposed query auto-completion models drives deeper studies of the solvers. Hence, we consider generalized machine learning problem settings and focus on developing lightweight stochastic algorithms, with theoretical guarantees, as solvers for large-scale convex optimization problems. For optimizing strongly convex objectives, we design an accelerated stochastic block coordinate descent method with optimal sampling; for optimizing non-strongly convex objectives, we design a stochastic variance reduced alternating direction method of multipliers with the doubling trick. Because human activities are inherently human-centric, research on them can inform security and privacy. On the one hand, intention analysis research from human activities can be motivated by a security perspective. For instance, to reduce false alarms about medical service providers' suspicious accesses to electronic health records, we discover potential de facto diagnosis specialties that reflect such providers' genuine and permissible intentions of accessing records with certain diagnoses. On the other hand, we examine the privacy risk in anonymized heterogeneous information networks representing large-scale human activities, such as in social networking. Such data are released for external researchers to improve the prediction accuracy for users' online social networking intentions on the publishers' microblogging site. We show a negative result that makes a compelling argument: privacy must be a central goal for sensitive human activity data publishers.