5,991 research outputs found

    An Efficient Method of Summarizing Documents Using Impression Measurements

    Automatic generic document summarization based on unsupervised schemes is a very useful approach because it does not require training data. Although techniques using latent semantic analysis (LSA) and non-negative matrix factorization (NMF) have been applied to determine the topics of documents, no research has addressed reducing the matrix size or speeding up the computation of the NMF method. To address this, this paper uses generic impressive expressions from newspapers to extract important sentences as a summary; the method therefore requires no stemming and no stop-word filtering. Novels are typical documents that leave a sentimental impression on readers. Newspapers, however, deliver different impressions tied to new knowledge, because they inform readers about current events, informative articles and diverse features. The proposed method introduces impressive expressions for newspapers and applies their measurements to the NMF method. Experiments on 100 KB of text data show that the proposed method reduces the matrix size by 80% and makes the NMF computation 7 times faster than the original method, without degrading the relevance of the extracted sentences.
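
    The idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the vocabulary of impression expressions, the function names, and the sentence-scoring rule (topic weights times sentence loadings) are all assumptions made for illustration. The factorization uses the standard multiplicative update rules for NMF. Restricting the rows of the term-sentence matrix to a small vocabulary is what keeps the matrix small, and no stemming or stop-word filtering is applied.

```python
import random
import re

def build_matrix(sentences, vocabulary):
    """Rows are vocabulary terms, columns are sentences (raw counts).
    Only the (hypothetical) impression vocabulary is counted, so the
    matrix stays small; no stemming or stop-word filtering is done."""
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    return [[float(tokens.count(term)) for tokens in tokenized]
            for term in vocabulary]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(a, rank, iters=200, seed=0, eps=1e-9):
    """Factor non-negative a (terms x sentences) into w (terms x rank)
    and h (rank x sentences) with multiplicative updates."""
    rng = random.Random(seed)
    n_rows, n_cols = len(a), len(a[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T A) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(n_cols)] for i in range(rank)]
        # W <- W * (A H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(rank)] for i in range(n_rows)]
    return w, h

def summarize(sentences, vocabulary, rank=2, top_n=2):
    """Return the top_n sentences in document order.
    Scoring rule (an assumption): each sentence's loading on a topic,
    weighted by that topic's share of the total loading."""
    a = build_matrix(sentences, vocabulary)
    _, h = nmf(a, rank)
    total = sum(map(sum, h)) or 1.0
    topic_weight = [sum(row) / total for row in h]
    scores = [sum(topic_weight[k] * h[k][j] for k in range(rank))
              for j in range(len(sentences))]
    best = sorted(range(len(sentences)),
                  key=lambda j: scores[j], reverse=True)[:top_n]
    return [sentences[j] for j in sorted(best)]
```

    Calling `summarize(sentences, vocabulary)` with a list of sentences and a short list of impression terms returns the highest-scoring sentences in their original order. Sentences that contain no vocabulary term receive a score of zero under this rule.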

    Exploiting E-mail structure to improve summarization

    Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 77-81). This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. For this thesis, I designed and implemented a system to summarize e-mail messages. The system exploits two aspects of e-mail, thread reply chains and commonly-found features, to generate summaries. The system uses existing software designed to summarize single text documents. Such software typically performs best on well-authored, formal documents. E-mail messages, however, are typically neither well-authored, nor formal. As a result, existing summarization software typically gives a poor summary of e-mail messages. To remedy this poor performance, the system's approach preprocesses e-mail messages to synthesize new input to this software, so that it will output more useful summaries of e-mail. This pre-processing involves a lightweight, heuristics-based approach to filtering e-mail to remove e-mail signatures, header fields, and quoted parent messages. I also present a heuristics-based approach to identifying and reporting names, dates, and companies found in e-mail messages. Lastly, I discuss conclusions from a pilot user study of my summarization system, and conclude with areas for further investigation. By Derek Scott Lam. M.Eng.
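
    The filtering step described in this abstract (removing signatures, header fields, and quoted parent messages) might look roughly like the sketch below. It is not taken from the thesis: the function name `clean_email`, the header list, and every pattern are assumptions chosen for illustration, and real mail would need more robust handling.

```python
import re

# Hypothetical heuristics; the thesis's actual rules are not given in the abstract.
HEADER_RE = re.compile(
    r"^(From|To|Cc|Bcc|Subject|Date|Reply-To|Message-ID):", re.IGNORECASE)
QUOTE_RE = re.compile(r"^\s*>")                 # quoted parent-message line
ATTRIBUTION_RE = re.compile(r"^On .+ wrote:\s*$")  # "On ..., X wrote:" lead-in

def clean_email(raw):
    """Strip leading header fields, quoted text, and the signature block,
    returning only the author's own body text."""
    kept = []
    in_headers = True
    for line in raw.splitlines():
        if in_headers:
            if HEADER_RE.match(line):
                continue                # drop a header field
            in_headers = False          # header block ends here
            if not line.strip():
                continue                # drop the blank separator line
        if line == "-- " or line.strip() == "--":
            break                       # conventional signature delimiter
        if QUOTE_RE.match(line) or ATTRIBUTION_RE.match(line):
            continue                    # drop quoted parent message
        kept.append(line)
    return "\n".join(kept).strip()
```

    The cleaned text would then be passed to an existing single-document summarizer, which is the role the abstract assigns to the preprocessing stage.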

    Journalistic image access : description, categorization and searching

    The quantity of digital imagery continues to grow, creating a pressing need to develop efficient methods for organizing and retrieving images. Knowledge of user behavior in image description and search is required for creating effective and satisfying searching experiences. The nature of visual information and journalistic images creates challenges in representing and matching images with user needs. The goal of this dissertation was to understand the processes in journalistic image access (description, categorization, and searching), and the effects of contextual factors on preferred access points. These were studied using multiple data collection and analysis methods across several studies. Image attributes used to describe journalistic imagery were analyzed based on description tasks and compared to a typology developed through a meta-analysis of literature on image attributes. Journalistic image search processes and query types were analyzed through a field study and multimodal image retrieval experiment. Image categorization was studied via sorting experiments leading to a categorization model. Advances to research methods concerning search tasks and categorization procedures were implemented. Contextual effects on image access were found related to organizational contexts, work, and search tasks, as well as publication context. Image retrieval in a journalistic work context was contextual at the level of image needs and search process. While text queries, together with browsing, remained the key access mode to journalistic imagery, participants also used visual access modes in the experiment, constructing multimodal queries. Assigned search task type and searcher expertise had an effect on query modes utilized. Journalistic images were mostly described and queried on the semantic level, but syntactic attributes were also used. Constraining the description led to more abstract descriptions. Image similarity was evaluated mainly based on generic semantics. However, functionally oriented categories were also constructed, especially by domain experts. Availability of page context promoted thematic rather than object-based categorization. The findings increase our understanding of user behavior in image description, categorization, and searching, and have implications for future solutions in journalistic image access. The contexts of image production, use, and search merit more interest in research as these could be leveraged for supporting annotation and retrieval. Multiple access points should be created for journalistic images based on image content and function. Support for multimodal query formulation should also be offered. The contributions of this dissertation may be used to create evaluation criteria for journalistic image access systems.

    Enhanced web-based summary generation for search.

    After a user types in a search query on a major search engine, they are presented with a number of search results. Each search result is made up of a title, brief text summary and a URL. It is then the user's job to select documents for further review. Our research aims to improve the accuracy of users selecting relevant documents by improving the way these web pages are summarized. Improvements in accuracy will lead to time improvements and user experience improvements. We propose ReClose, a system for generating web document summaries. ReClose generates summary content through combining summarization techniques from query-biased and query-independent summary generation. Query-biased summaries generally provide query terms in context. Query-independent summaries focus on summarizing documents as a whole. Combining these summary techniques led to a 10% improvement in user decision making over Google-generated summaries. Color-coded ReClose summaries provide keyword usage depth at a glance and also alert users to topic departures. Color-coding further enhanced ReClose results and led to a 20% improvement in user decision making over Google-generated summaries. Many online documents include structure and multimedia of various forms such as tables, lists, forms and images. We propose to include this structure in web page summaries. We found that the expert user was insignificantly slowed in decision making while the majority of average users made decisions more quickly using summaries including structure without any decrease in decision accuracy. We additionally extended ReClose for use in summarizing large numbers of tweets in tracking flu outbreaks in social media. The resulting summaries have variable length and are effective at summarizing flu-related trends. Users of the system obtained an accuracy of 0.86 labeling multi-tweet summaries. This showed that the basis of ReClose is effective outside of web documents and that variable-length summaries can be more effective than fixed-length ones. Overall the ReClose system provides unique summaries that contain more informative content than current search engines produce, highlight the results in a more meaningful way, and add structure when meaningful. The applications of ReClose extend far beyond search and have been demonstrated in summarizing pools of tweets.

    Natural language in multimedia / multimodal systems


    Framework analysis: a worked example of a study exploring young people’s experiences of depression

    Framework analysis is an approach to qualitative research which is being increasingly used across multiple disciplines, including psychology, social policy and nursing research. The stages of framework analysis have been described in published work, but the literature is lacking in articles describing how to conduct it in practice, particularly in the field of psychology, where researchers may be working as part of a team. Having used framework analysis on a study exploring adolescents' experiences of depression, we faced various challenges along the way and learned from experience how to use this approach to qualitative analysis. In this reflective article, we describe a worked example of using framework, which we hope will assist other researchers in deciding if this approach is suitable for their own research, and will provide guidance on how one might go about conducting framework analysis when working as part of a research team. We conclude that framework is a valuable contribution to qualitative methods in psychology, offering a pragmatic, flexible and rigorous approach to data analysis

    Confessing in the Human Voice: A Defense of the Privilege Against Self-Incrimination

    By Andrew E. Taslitz. The privilege against self-incrimination has fallen on hard times. Miranda rights shrink, as do those more traditional “core” aspects of the privilege. Partly this is due to an implicit skepticism by the courts about the value of the privilege, despite their occasional explicit words of praise for its role in our constitutional scheme. Scholars largely, though not uniformly, agree that the privilege cannot be justified as a philosophical matter, viewing it as an unfortunate burden we are stuck with because of its presence in the Constitution. This article bucks the dominant trend by articulating a new defense of the privilege against self-incrimination, one rooted in cognitive psychology and linguistic theory. In doing so, the article tries to resurrect in a completely new form the supposedly discredited “mental privacy” justification for the privilege. Specifically, this article maintains that one of the two primary purposes of the privilege is to protect individuals against the compelled expression of their “literal voice,” by which I mean both the content of their words and the paralinguistic cues that accompany them. (The second purpose of the privilege, which is to protect the individual’s “metaphorical voice” – the voice of his counsel – I address in a companion piece.) The privilege thus protects not so much the privacy of our thoughts as of our words. Control over our words matters for two inter-related reasons: first, compelled speech alters our thoughts, feelings, and character, making us into persons other than what we might choose; second, once those words leave our mouths, they expose us to social mis-definition and mis-judgment in ways that harm our sense of individual uniqueness and violate the boundaries that define us as a person. Part I summarizes the sorry state of the privilege today. Part II draws on recent work in cognitive psychology to explain why each human becomes unique and deeply wants to be judged as unique by others, to be known for the fullness of who we truly are. Part III first explores psychological and linguistic research on those features of language that lead listeners to make judgments about speakers’ essential character, leading to praise or condemnation, including the reasons for the frequent inaccuracy of those judgments. Next, Part III explains why similar principles hold for written and internet communications, even though they are different in important ways from the paradigm case of spoken speech. Finally, Part III explores the special dangers of mis-judgment by the criminal justice system, society’s ultimate vehicle for expressing condemnation of the person. Part IV explores empirical and philosophical work on how compelled expression actually changes our fundamental nature, while Part V, the conclusion, sums up the preceding argument. This article does not pretend to resolve all the puzzles created by the current version of the privilege. But it does lay the foundation for doing so by defending a neglected justification for the central importance of the privilege to human flourishing, suggesting that cramped interpretations of the privilege work a grave injustice that calls for correction.