474 research outputs found

    Adaptive Algorithms for Automated Processing of Document Images

    Large-scale document digitization projects continue to motivate interesting document understanding technologies such as script and language identification, page classification, segmentation and enhancement. Typically, however, solutions are still limited to narrow domains or regular formats such as books, forms, articles or letters, and operate best on clean documents scanned in a controlled environment. More general collections of heterogeneous documents challenge the basic assumptions of state-of-the-art technology regarding quality, script, content and layout. Our work explores the use of adaptive algorithms for the automated analysis of noisy and complex document collections. We first propose, implement and evaluate an adaptive clutter detection and removal technique for complex binary documents. Our distance-transform-based technique aims to remove irregular and independent unwanted foreground content while leaving text content untouched. The novelty of this approach lies in its determination of the best approximation to the clutter-content boundary with text-like structures. Second, we describe a page segmentation technique called Voronoi++ for complex layouts, which builds upon the state-of-the-art method proposed by Kise [Kise1999]. Our approach does not assume structured text zones and is designed to handle multi-lingual text in both handwritten and printed form. Voronoi++ is a dynamically adaptive and contextually aware approach that combines components' separation features with Docstrum-based [O'Gorman1993] angular and neighborhood features to form provisional zone hypotheses. These provisional zones are then verified based on the context built from local separation and high-level content features. Finally, our research proposes a generic model to segment and recognize characters for any complex syllabic or non-syllabic script, using font models.
This concept is based on the fact that font files contain all the information necessary to render text, and thus a model for how to decompose it. Instead of script-specific routines, this work is a step towards a generic character segmentation and recognition scheme for both Latin and non-Latin scripts.
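
The abstract does not say how the distance transform is used, so the following is only a rough sketch of the general idea, not the thesis's method. It assumes a binary image stored as nested lists (1 = foreground, 0 = background) and uses a hypothetical threshold, `max_half_width`, to flag foreground pixels that lie deeper inside a blob than a text stroke would allow. The function names are illustrative only.

```python
def cityblock_distance_transform(img):
    """City-block distance from each foreground pixel (1) to the nearest
    background pixel (0) inside the image, via the classic two-pass scan."""
    h, w = len(img), len(img[0])
    far = h + w  # larger than any in-image city-block distance
    dist = [[far if img[r][c] else 0 for c in range(w)] for r in range(h)]

    # Forward pass: propagate distances from the top and left neighbours.
    for r in range(h):
        for c in range(w):
            if dist[r][c]:
                if r > 0:
                    dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
                if c > 0:
                    dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)

    # Backward pass: propagate distances from the bottom and right neighbours.
    for r in range(h - 1, -1, -1):
        for c in range(w - 1, -1, -1):
            if dist[r][c]:
                if r < h - 1:
                    dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
                if c < w - 1:
                    dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist


def thick_core_mask(img, max_half_width):
    """Mark foreground pixels whose distance to the background exceeds
    max_half_width (an assumed, hand-picked threshold). Thin strokes never
    reach that depth, so only the cores of thick blobs are marked."""
    dist = cityblock_distance_transform(img)
    return [[d > max_half_width for d in row] for row in dist]
```

On a page with a one-pixel-wide stroke and a 5x5 blob, `thick_core_mask(img, 1)` marks only the inner 3x3 of the blob. A real system would still have to grow that core back to the clutter boundary without eating into adjacent text, which is the part the abstract identifies as novel.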

    Information Preserving Processing of Noisy Handwritten Document Images

    Many pre-processing techniques that normalize artifacts and clean noise induce anomalies due to discretization of the document image. Important information that could be used at later stages may be lost. A proposed composite-model framework takes into account pre-printed information, user-added data, and digitization characteristics. Its benefits are demonstrated by experiments with statistically significant results. Separating pre-printed ruling lines from user-added handwriting shows how ruling lines impact people's handwriting and how they can be exploited for identifying writers. Ruling line detection based on multi-line linear regression reduces the mean error in counting ruling lines from 0.10 to 0.03, 6.70 to 0.06, and 0.13 to 0.02, compared to an HMM-based approach on three standard test datasets, thereby reducing human correction time by 50%, 83%, and 72% on average. On 61 page images from 16 rule-form templates, the precision and recall of form cell recognition are increased by 2.7% and 3.7%, compared to a cross-matrix approach. Compensating for and exploiting ruling lines during feature extraction rather than pre-processing raises the writer identification accuracy from 61.2% to 67.7% on a 61-writer noisy Arabic dataset. Similarly, counteracting page-wise skew by subtracting it or transforming contours in a continuous coordinate system during feature extraction improves the writer identification accuracy. An implementation study of contour-hinge features reveals that utilizing the full probability distribution function matrix improves the writer identification accuracy from 74.9% to 79.5%.
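
The abstract names "multi-line linear regression" without giving the model. One plausible reading, shown here purely as an assumption, is a single least-squares fit in which all ruling lines share one slope and are equally spaced: y = slope * x + offset + k * spacing for the k-th line. The function names and the (x, y, k) input format are hypothetical.

```python
def solve_linear(a, b):
    """Solve a small dense system a @ x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def fit_parallel_lines(points):
    """Jointly fit equally spaced parallel lines to (x, y, k) samples,
    where k is the integer index of the ruling line a sample belongs to.
    Returns [slope, offset, spacing] from the normal equations."""
    ata = [[0.0] * 3 for _ in range(3)]
    aty = [0.0] * 3
    for x, y, k in points:
        row = (float(x), 1.0, float(k))  # design-matrix row [x, 1, k]
        for i in range(3):
            aty[i] += row[i] * y
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return solve_linear(ata, aty)
```

Fitting every line at once lets the well-preserved lines constrain the slope and spacing of lines that are faint or partly covered by handwriting. Assigning each sample its line index k is assumed to happen in an earlier step not shown here.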

    Optical Character Recognition of Printed Persian/Arabic Documents

    Texts are an important representation of language. Due to the volume of texts generated and the historical value of some documents, it is imperative to use computers to read generated texts and make them editable and searchable. This task, however, is not trivial. Recreating human perception capabilities, such as reading documents, in artificial systems is one of the major goals of pattern recognition research. After decades of research and improvements in computing capabilities, humans' ability to read typed or handwritten text is hardly matched by machine intelligence. Although classical applications of Optical Character Recognition (OCR), such as reading machine-printed addresses in a mail-sorting machine, are considered solved, more complex scripts or handwritten texts push the limits of the existing technology. Moreover, many of the existing OCR systems are language dependent. Therefore, improvements in OCR technologies have been uneven across different languages. For Persian in particular, research has been limited. Despite the need to process many Persian historical documents and to use OCR in a variety of applications, few Persian OCR systems achieve a good recognition rate. Consequently, the task of automatically reading Persian typed documents with close-to-human performance is still an open problem and the main focus of this dissertation. In this dissertation, after a literature survey of the existing technology, we propose new techniques for two important preprocessing steps in any OCR system: skew detection and page segmentation. Then, rather than the usual practice of character segmentation, we propose segmentation of Persian documents into sub-words. Sub-word segmentation is chosen to avoid the challenges of segmenting highly cursive Persian texts into isolated characters. For feature extraction, we propose a hybrid scheme combining three commonly used methods, and finally use a nonparametric classification method.
A large number of papers and patents advertise recognition rates near 100%. Such claims give the impression that automation problems have been solved. Although OCR is widely used, its accuracy today is still far from a child's reading skills. Failures of some real applications show that performance problems still exist on composite and degraded documents and that there is still room for progress.

    Commemoration, Memory and the Process of Display: Negotiating the Imperial War Museum's First World War Exhibitions, 1964 - 2014

    This thesis explores the key permanent and temporary First World War exhibitions held at the Imperial War Museum in London over a fifty-year period. In so doing, it examines the theoretical, political and intellectual considerations that inform exhibition-making. It thus illuminates the possibilities, challenges and difficulties of displaying the 'War to End All Wars'. Furthermore, by situating these displays within their respective social, economic and cultural contexts, it produces a critical analysis of past and present practices of display. A study of these public presentations of the First World War enables discussion of the Museum’s primary agendas, and its role as a national public institution. In considering this alongside the broader effect of generational shifts and the ever-changing impact of the War’s cultural memory on this institution, the thesis investigates how the Imperial War Museum has consistently reinvented itself to produce engaging portrayals of the conflict for changing audiences. Arts and Humanities Research Council. Collaborative Doctoral Award in partnership with Imperial War Museums.

    Derry, Londonderry, Legenderry - a city in transition

    This thesis looks at the transition process of the segregated city of Derry/Londonderry in Northern Ireland from a spatial perspective. The centuries-old conflict between the Protestant/unionist/loyalist population and the Catholic/nationalist/republican population has created a deeply segregated public space, which is seen as a major inhibitor of Northern Irish society’s peacebuilding aspirations. Applying Henri Lefebvre’s concept of space, which understands space as a triadic dialectic between the physical dimension of space that can be perceived by the senses, the conceived, discursive dimension of space, and the dimension of symbols and meanings, this thesis tries to portray the interconnection of space and societal processes. Moreover, it investigates how visual transformations of space contribute to improving the tense community relations between the antagonistic groups, in order to pursue the question of whether and how the production of a “shared space” is possible. The thesis focuses on the transformation of the territorialised residential areas in Derry/Londonderry through the removal or replacement of contentious and divisive visual displays in the form of murals, kerb paintings, graffiti and flags. Four case studies give an insight into the negotiation processes that take place when dominant symbols of collective memory and identity/alterity are questioned. The case studies show that a successful re-imaging process of territorialised residential estates can free people from the dominant identity ascriptions, which in Northern Ireland are often based on spatial categories.
This can contribute to the development of new identifications outside the Protestant/Catholic antagonism. In this re-imaging process social relations are renegotiated and spatial practices of public collective commemoration are transformed. Moreover, the case studies reveal what kinds of new concepts of space are introduced into society and how they are perceived by various actors.

    Hustadt, Inshallah : Learning from a participatory art project in a trans-local neighbourhood

    My PhD dissertation investigates relationships between contemporary art and spatial practices. It emphasises the creation of platforms for public participation as interventions into urban regeneration processes. The project has two essential objectives: a. To identify the potential within contemporary art for a critical analysis of an urban development process from the location, in dialogue with people, and through direct participatory spatial action; b. To propose a scenario for future operation that can instigate inclusive change within our everyday environment and the wider domain of spatial practice. The PhD research results from my own practice and gives a detailed analysis of a three-year case study: Hustadt Project. The majority of Hustadt Project took place on location (Bochum, Germany) within a suburban setting. In cooperation with local inhabitants we established the context for and ultimately constructed a new Community Pavilion, a structural platform for participatory exchange. As my art practice is situated in-between architecture and design, sociology and urban studies, the process of the PhD research has been to emphasise my personal involvement and subjective observations, utilising and transforming methods from these fields along with inventing new tools and strategies. I develop these methods from the context and the situation as a reaction to the process, and therefore they are non-prescriptive, improvised, and reactive. In order to construct an argument in negotiation with local politicians, I introduce the form of spatial action by constructing performative events and inviting people to participate in them. More than accumulating knowledge, I’m interested in analysing it, using it, and transforming it into a project where the result produces new relations with people onsite.
It is important to emphasise that by using video and photo cameras as the main tools within the research process, the research strategies I use transform my position from an observer of the situation to that of being observed. The research questions I focus upon are therefore: 1. The relationship between contemporary art production and the urban regeneration process: What are the contemporary art and architecture references that have shaped my own practice? What is the process of activating public participation through an art practice? 2. The position that the artist occupies when becoming involved in the process of urban regeneration: What is the role of the artist working in the process of urban regeneration? How can an artist work within the urban regeneration process and keep her/his critical position? 3. The knowledge contribution to the critical spatial practice produced within the contemporary art discourse: What are the methods and strategies of my artistic research that differ from other disciplines? The final doctoral submission comprises (1) a textual part with some graphic and photo material, (2) Hustadt Episodaire, a narrative visual documentation, (3) the Hustadt blog, which was produced while following the case study project, and (4) an exhibition presenting the Hustadt Project Archive.

    Historiography in modern poetry: text, imagination and authority in the work of David Jones, Geoffrey Hill and Ian Duhig; and King Harold: a long poem in three parts

    This thesis explores how modern poetry is shaped by its relationships with academic and historical texts. Occasioned by creative writers’ increasing involvement in the academy, it considers the consequences of this relationship for contemporary poetry praxis. Through close readings of David Jones’ Anathemata, Geoffrey Hill’s Mercian Hymns and Ian Duhig’s The Speed of Dark, it explores how imaginative conceptions of poems relate to and are affected by their material presentations as texts. In so doing, there is a particular focus on how paratexts translate academic models of authoritative writing into their poems. This thesis addresses a number of key questions: how do modern poets express ideas about the past? How do their borrowings from academic and scholarly texts shape this expression? Do readers’ past experiences have an impact? Taking the work of critics Jerome J. McGann and Linda Hutcheon as its starting point, it develops new approaches to these questions by synthesising their ideas and applying them to the particulars of poetry composition. It opens new avenues of relevance to modern poets, connecting contemporary poetry criticism with textual studies. The creative component of this thesis makes a parallel treatment of these critical issues in King Harold, a long poem on the multiple literary lives of Harold Godwinson, the last Anglo-Saxon king. My poem dramatises the tensions explored in the critical component, creating an exciting and original bricolage of academic and historical paratexts. Both the critical component and the creative writing element of this thesis illustrate the impact of academic textual production on modern poetry.

    Social work with airports passengers

    Social work at the airport consists in offering social services to passengers. The main methodological position is that people are under stress, which is characterized by a particular set of features in appearance and behavior. In such circumstances a passenger's actions attract attention. Only a person whom the passenger trusts can help with documents or provide psychological support.