97,814 research outputs found

    Relation Discovery from Web Data for Competency Management

    Get PDF
    This paper describes a technique for automatically discovering associations between people and expertise from an analysis of very large data sources (including web pages, blogs and emails), using a family of algorithms that perform accurate named-entity recognition, assign different weights to terms according to an analysis of document structure, and access distances between terms in a document. My contribution is to add a social networking approach called BuddyFinder which relies on associations within a large enterprise-wide "buddy list" to help delimit the search space and also to provide a form of 'social triangulation' whereby the system can discover documents from your colleagues that contain pertinent information about you. This work has been influential in the information retrieval community generally, as it is the basis of a landmark system that achieved overall first place in every category in the Enterprise Search Track of TREC2006

    Social Search with Missing Data: Which Ranking Algorithm?

    Get PDF
    Online social networking tools are extremely popular, but can miss potential discoveries latent in the social 'fabric'. Matchmaking services which can do naive profile matching with old database technology are too brittle in the absence of key data, and even modern ontological markup, though powerful, can be onerous at data-input time. In this paper, we present a system called BuddyFinder which can automatically identify buddies who can best match a user's search requirements specified in a term-based query, even in the absence of stored user-profiles. We deploy and compare five statistical measures, namely, our own CORDER, mutual information (MI), phi-squared, improved MI and Z score, and two TF/IDF based baseline methods to find online users who best match the search requirements based on 'inferred profiles' of these users in the form of scavenged web pages. These measures identify statistically significant relationships between online users and a term-based query. Our user evaluation on two groups of users shows that BuddyFinder can find users highly relevant to search queries, and that CORDER achieved the best average ranking correlations among all seven algorithms and improved the performance of both baseline methods

    Neurocognitive Informatics Manifesto.

    Get PDF
    Informatics studies all aspects of the structure of natural and artificial information systems. Theoretical and abstract approaches to information have made great advances, but human information processing is still unmatched in many areas, including information management, representation and understanding. Neurocognitive informatics is a new, emerging field that should help to improve the matching of artificial and natural systems, and inspire better computational algorithms to solve problems that are still beyond the reach of machines. In this position paper examples of neurocognitive inspirations and promising directions in this area are given

    Discovering conversational topics and emotions associated with Demonetization tweets in India

    Full text link
    Social media platforms contain great wealth of information which provides us opportunities explore hidden patterns or unknown correlations, and understand people's satisfaction with what they are discussing. As one showcase, in this paper, we summarize the data set of Twitter messages related to recent demonetization of all Rs. 500 and Rs. 1000 notes in India and explore insights from Twitter's data. Our proposed system automatically extracts the popular latent topics in conversations regarding demonetization discussed in Twitter via the Latent Dirichlet Allocation (LDA) based topic model and also identifies the correlated topics across different categories. Additionally, it also discovers people's opinions expressed through their tweets related to the event under consideration via the emotion analyzer. The system also employs an intuitive and informative visualization to show the uncovered insight. Furthermore, we use an evaluation measure, Normalized Mutual Information (NMI), to select the best LDA models. The obtained LDA results show that the tool can be effectively used to extract discussion topics and summarize them for further manual analysis.Comment: 6 pages, 11 figures. arXiv admin note: substantial text overlap with arXiv:1608.02519 by other authors; text overlap with arXiv:1705.08094 by other author
    corecore