
    Pattern-Based Approach to Table Extraction

    In this paper, we address a client-driven approach to automatically extracting the information content of tables in document images. We start with a graph-based representation of a set of key fields selected by clients and perform graph mining in a document to learn them and produce a model. Such models are intended to extract information content in the absence of clients. To avoid the NP-hard general graph matching problem, our graph matching is based on relation assignment, which checks whether pairs of nodes are semantically identical. We have validated the concept on a real-world industrial problem.

    Document Information Extraction and its Evaluation based on Client's Relevance

    In this paper, we present a model-based approach to extracting document information content and evaluate it in depth based on clients' relevance. Real-world users, i.e., clients, first provide a set of key fields from the document image that they consider important. These fields are represented as a graph in which nodes (i.e., fields) are labelled with dynamic semantics and other features, and edges are attributed with spatial relations. Such an attributed relational graph (ARG) is then used to mine similar graphs from a document image; each extracted graph is used to reinforce or update the initial graph iteratively, in order to produce a model. Models can therefore be employed in the absence of clients. We have validated the concept and evaluated its scientific impact on a real-world industrial problem, where table extraction is found to be the best-suited application.
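
As a rough illustration of the attributed relational graph described in this abstract, the sketch below builds a graph whose nodes are key fields and whose edges carry a coarse spatial relation. It is not the authors' implementation: the `Field` class, the four-relation vocabulary, and the coordinates are all assumptions made for illustration, and the "dynamic semantics" node labels are reduced to a plain string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """A key field selected by a client (hypothetical structure)."""
    label: str
    x: float  # left edge of the field's bounding box
    y: float  # top edge of the bounding box (y grows downward)


def spatial_relation(a: Field, b: Field) -> str:
    """Coarse position of b relative to a (assumed relation vocabulary)."""
    dx, dy = b.x - a.x, b.y - a.y
    if abs(dx) >= abs(dy):
        return "right_of" if dx > 0 else "left_of"
    return "below" if dy > 0 else "above"


def build_arg(fields):
    """Map every ordered pair of distinct fields to its spatial relation."""
    return {
        (a.label, b.label): spatial_relation(a, b)
        for a in fields
        for b in fields
        if a is not b
    }


# Made-up coordinates for three fields of an invoice-like table.
fields = [Field("Item", 10, 100), Field("Price", 200, 100), Field("Total", 200, 300)]
arg = build_arg(fields)
```

Under these assumptions, two field pairs could be treated as candidates for a match when their labels and edge relations agree, which is one simple reading of checking pairs of nodes rather than solving general graph matching.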

    Will they buy?

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 127-137). The proliferation of inexpensive video recording hardware and enormous storage capacity has enabled the collection of retail customer behavior at an unprecedented scale. The vast majority of this data is used for theft prevention and never used to better understand the customer. In what ways can this huge corpus be leveraged to improve the experience of the customer and the performance of the store? This thesis presents MIMIC, a system that processes video captured in a retail store into predictions about customer proclivity to purchase. MIMIC relies on the observation that the aggregate patterns of all of a store's patrons (the gestalt) capture behavior indicative of an imminent transaction. Video is distilled into a homogeneous feature vector that captures the activity distribution by first tracking the locations of customers, then discretizing their movements into a feature vector using a collection of functional locations, that is, areas of the store relevant to the tasks of patrons and employees. A time series of these feature vectors can then be classified as predictive-of-transaction using a Hidden Markov Model. MIMIC is evaluated on a small operational retail store located in the Mall of America near Minneapolis, Minnesota. Its performance is characterized across a wide cross-section of the model's parameters. Through manipulation of the training data supplied to MIMIC, the behavior of customers in the store can be examined at fine levels of detail without forgoing the potential afforded by big data. MIMIC enables a suite of valuable tools. For ethnographic researchers, it offers a technique for identifying key moments in hundreds or thousands of hours of raw video. Retail managers gain a fine-grained metric to evaluate the performance of their stores, and interior designers acquire a critical component in a store layout optimization framework. By Rony Daniel Kubat. Ph.D.
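
The discretization step described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's code: the rectangular `LOCATIONS`, their names, and the sample positions are hypothetical, and the Hidden Markov Model classification stage is not shown.

```python
from collections import Counter

# Hypothetical functional locations as axis-aligned rectangles (x0, y0, x1, y1).
LOCATIONS = {
    "entrance": (0, 0, 2, 2),
    "display": (2, 0, 6, 4),
    "register": (6, 0, 8, 2),
}


def locate(x, y):
    """Return the name of the functional location containing (x, y), or None."""
    for name, (x0, y0, x1, y1) in LOCATIONS.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None


def feature_vector(positions):
    """Fraction of located customers in each functional location, in LOCATIONS order."""
    counts = Counter(locate(x, y) for x, y in positions)
    total = sum(counts[name] for name in LOCATIONS)
    return [counts[name] / total if total else 0.0 for name in LOCATIONS]


# Tracked customer positions for one time step (made-up values).
frame = [(1, 1), (3, 2), (4, 3), (7, 1), (9, 9)]
vec = feature_vector(frame)
```

Computing one such vector per time step would give the kind of time series that the abstract says is passed to a sequence classifier; positions outside every location, such as (9, 9) here, are simply ignored in this sketch.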

    Embedded online: Iraq war documentaries in the online public sphere

    This study assesses the democratic and pedagogical roles of Iraq War documentaries in the online public sphere by synthesizing critical perspectives on war media and documentary film. The 2003 invasion and occupation of Iraq gave rise to an unprecedented profusion of war documentaries, many of which are now freely available on dedicated documentary-viewing websites. These websites function as knowledge resources archiving content produced over the course of the occupation and as transnational reception spheres allowing the claims of individual films to be contested or endorsed from multiple perspectives. Consequently, the traditional functions of the war documentary - as advocacy, reportage, and critique - are challenged and reframed in a transnational context. Within the “new war media ecology” (Hoskins & O’Loughlin 2010), documentary-viewing websites also call into question certain foundational assumptions of war media research such as the critical opposition between mainstream and alternative content and associated claims about the impact of mainstream media framing on public opinion. To examine these issues, three levels of analysis are employed: a content analysis of eleven documentary-viewing websites establishes which Iraq War documentaries are circulating online; a textual analysis of six prominent films critiques their public sphere roles in reference to their thematic, ideological, and aesthetic constructions; and, finally, a reception analysis of user comments on the largest documentary-viewing website, Top Documentary Films, evaluates how users contest or endorse the credibility of individual films and filmmakers. Although most of the Iraq War documentaries found online are highly critical of the war, this opposition is manifest in complex ways and relies on varying textual strategies for remediating war representations. 
With an emphasis on electoral politics, activist films articulate a narrow form of war opposition by appealing to the victimisation of the American subject under the Bush administration. In conjunction with transnational user comments, however, these films also support a foundational reflection on patriotism during wartime. Documentary war reports call on the evidential power of on-the-ground footage to frame fragments of the unfolding conflict for Western viewers. The online archiving of these piecemeal perspectives then undermines institutional efforts to commemorate the war in a particular way. Documentaries about war information and media utilise leaked and suppressed information to set different modes of war mediation against each other. While this strategy allows filmmakers to challenge official accounts of the war, it also reflects practices found in amateur conspiracy films. The study finds that viewers’ prior convictions, along with their pre-established trust in particular filmmakers and institutions, play a significant role in their willingness to accept the credibility of individual films. In this way, the transnational reception sphere frequently challenges the assumptions of film representations and brings together diverging perspectives on war. However, in the absence of editorial oversight, users are left to make their own distinctions between competing documentary claims. Consequently, documentary-viewing websites have an ambiguous relationship with documentary’s status as a “discourse of sobriety” (Nichols 1991). In an accelerated and highly partisan war media environment, the inherent tension between the free flow of content in the public sphere and the quality and veracity of this content calls for continued reflection on the dynamic relationship between traditional media content and emergent media practices.