
    A Study On The Effects Of Noise Level, Cleaning Method, And Vectorization Software On The Quality Of Vector Data.

    In this paper we study different factors that affect vector quality. Noise level, cleaning method, and vectorization software are three factors that may influence the resulting vector data. Real scanned images from the GREC'03 contest are used in the experiment. Three different levels of salt-and-pepper noise (5%, 10%, and 15%) are used. Noisy images are cleaned by six cleaning algorithms, and then three different commercial raster-to-vector software packages are used to vectorize the cleaned images. The Vector Recovery Index (VRI) is the performance evaluation criterion used in this study to judge the quality of the resulting vectors compared to their ground-truth data. Statistical analysis of the VRI values shows that the vectorization software has the biggest influence on the quality of the resulting vectors.
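    The noise-injection step of the experiment is simple enough to sketch. The snippet below is not the authors' code; the function and variable names are illustrative, and the cleaning, vectorization, and VRI computation are performed by the external tools described above. It only shows how a binary raster can be corrupted with salt-and-pepper noise at the three levels used in the study.

        import numpy as np

        def add_salt_and_pepper(image, noise_level, rng=None):
            """Corrupt a fraction `noise_level` of pixels, half to salt (1), half to pepper (0)."""
            rng = np.random.default_rng() if rng is None else rng
            noisy = image.copy()
            corrupt = rng.random(image.shape) < noise_level   # pixels selected for corruption
            salt = rng.random(image.shape) < 0.5              # salt vs. pepper decision
            noisy[corrupt & salt] = 1
            noisy[corrupt & ~salt] = 0
            return noisy

        # Example: a clean binary drawing corrupted at the three contest levels.
        clean = np.zeros((256, 256), dtype=np.uint8)
        clean[100:110, 20:230] = 1                            # a horizontal bar
        for level in (0.05, 0.10, 0.15):
            noisy = add_salt_and_pepper(clean, level)
            # Observed change rate; slightly below the nominal level because some
            # corrupted pixels happen to keep their original value.
            print(level, (noisy != clean).mean())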

    Electronic Publishing: the evolution and economics of a hybrid journal.

    The technical, social and economic issues of electronic publishing are examined by using as a case study the evolution of the journal Electronic Publishing -- Origination, Dissemination and Design (EP-odd), which is published by John Wiley Ltd. The journal is a 'hybrid' one, in the sense that it appears in both electronic and paper form, and is now in its ninth year of publication. The author of this paper is the journal's Editor-in-Chief. The first eight volumes of EP-odd have been distributed via the conventional subscription method, but a new method, from volume 9 onwards, is now under discussion whereby accepted papers will first be published on the EP-odd web site, with the printed version appearing later as a once-per-volume operation. Later sections of the paper lead on from the particular experiences with EP-odd into a more general discussion of peer review and the acceptability of e-journals in universities, the changing role of libraries, the sustainability of traditional subscription pricing, and the prospects for 'per paper' sales as micro-payment technologies become available.

    Adobe's Acrobat -- the Electronic Journal Catalyst?

    Adobe's Acrobat software, released in June 1993, is based around a new Portable Document Format (PDF) which offers the possibility of being able to view and exchange electronic documents, independent of the originating software, across a wide variety of supported hardware platforms (PC, Macintosh, Sun UNIX etc.). The principal features of Acrobat are reviewed and its importance for libraries is discussed in the context of experience already gained from the CAJUN project (CD-ROM Acrobat Journals Using Networks). This two-year project, funded by two well-known journal publishers, is investigating the use of Acrobat software for the electronic dissemination of journals, on CD-ROM and over networks.

    Retrieval from an image knowledge base

    With advances in computer technology, images and image databases are becoming increasingly important. Retrieval of images in current image database systems has been designed around keyword searches. These carefully designed and handcrafted systems are very efficient for the application domain they are built for. Unfortunately, they are not adaptable to other domains, not expandable for other uses of the existing information, and not very forgiving to their users. The appearance of full-text search provides for a more general search over textual documents. However, pictorial images contain a vast amount of information that is difficult to catalog in a general way. Further, this classification needs to be dynamic, providing for flexible searching capability. The searching should allow for more than a pre-programmed set of search parameters, as exact searches make the image database quite useless for a search that was not designed into the original database. Further, the incorporation of knowledge along with the images is difficult. Development of an image knowledge base along with content-based retrieval techniques is the focus of this thesis. Using an artificial intelligence technique called case-based reasoning, images can be retrieved with a degree of flexibility. Each image is classified by user-entered attributes about the image called descriptors. Each descriptor also has a degree-of-importance parameter, which indicates the relative importance or certainty of that descriptor. These descriptors are collected as the case for the image and stored in frames. Each image can vary as to the amount of attribute information it contains. Retrieval of an image from the knowledge base begins with the entry of new descriptors for the desired image, each with its own degree-of-importance parameter indicating how strongly the desired image must match that descriptor. Again, a variable number of descriptors can be entered. After all criteria are entered, the system searches for cases that have any level of matching. The system uses the degree-of-importance both in the knowledge base about the candidate image(s) and on the search criteria to order the images. The ordering process uses weighted summations to present a relatively small list of candidate images. To demonstrate and validate the concepts outlined, a prototype of the system has been developed. This prototype includes the primary architectural components of a potentially real product. Architectural areas addressed are: the storage of the knowledge, storage of and access to a large number of high-resolution images, means of searching or interrogating the knowledge base, and the actual display of images. The prototype is called the Smart Photo Album. It is an electronic filing system for 35mm pictures taken by anyone from the average photographer up to the photojournalist. It allows for multiple ways of indexing pictures of any subject matter. Retrieval from the knowledge base provides relative matches to the given search criteria. Although this application is relatively simple, the basis of the system can be easily extended to include a more sophisticated knowledge base and reasoning process as, for example, would be used for a medical diagnostic application in the field of dermatology.
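    The weighted-summation ordering described above can be illustrated with a short sketch. Everything in the snippet below is hypothetical (the descriptor names, weights, and normalisation choice are assumptions, not taken from the thesis); it only shows how degree-of-importance values attached to both the stored case and the query can be combined into a single ranking score.

        def match_score(case_descriptors, query_descriptors):
            """Both arguments map descriptor name -> degree of importance in [0, 1]."""
            score = 0.0
            for descriptor, query_weight in query_descriptors.items():
                case_weight = case_descriptors.get(descriptor, 0.0)
                score += query_weight * case_weight      # reward shared descriptors
            total = sum(query_descriptors.values()) or 1.0
            return score / total                         # normalise by total query importance

        # Hypothetical knowledge base of cases and a query with its own weights.
        cases = {
            "img_001.jpg": {"beach": 0.9, "sunset": 0.7, "people": 0.3},
            "img_002.jpg": {"mountain": 0.8, "snow": 0.9},
        }
        query = {"beach": 1.0, "sunset": 0.5}
        ranked = sorted(cases, key=lambda name: match_score(cases[name], query), reverse=True)
        print(ranked)   # candidate images, best match first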

    CD-ROM Acrobat Journals Using Networks

    The available technologies for publishing journals electronically are surveyed. They range from abstract representations, such as SGML, concerned largely with the structure of the document, to formats such as PostScript which faithfully model the layout and the appearance. The issues are discussed in the context of choosing a format for electronically publishing the journal Electronic Publishing -- Origination, Dissemination and Design. PostScript is neither widely enough available nor standardised enough to be suitable; a bitmapped-pages approach suffers from being resolution-dependent in terms of the visual quality achievable. Reasons are put forward for the final choice of Adobe's new PDF document standard for creating electronic versions of the journal.

    Text skimming as a part in paper document understanding

    In our document understanding project ALV we analyse incoming paper mail in the domain of single-sided German business letters. These letters are scanned, and after several analysis steps the text is recognized. The result may contain gaps, word alternatives, and even illegal words. The subject of this paper is the subsequent phase, which concerns the extraction of important information predefined in our "message type model". An expectation-driven partial text skimming analysis is proposed, focussing on the kernel module, the so-called "predictor". In contrast to traditional text skimming, the following aspects are important in our approach. Basically, the input data are fragmentary texts. Rather than having only one text analysis module ("substantiator"), our predictor controls a set of different and partially alternative substantiators. With respect to the usually proposed three working phases of a predictor -- start, discrimination, and instantiation -- the following differences are noteworthy. The starting problem of text skimming is solved by applying specialized substantiators for classifying a business letter into message types. In order to select appropriate expectations within the message type hypotheses, a twofold discrimination is performed: a coarse discrimination reduces the number of message type alternatives, and a fine discrimination chooses one expectation within one or a few previously selected message types. According to the expectation selected, the appropriate substantiators are activated. Several rules are applied both for the verification of the substantiator results and for error recovery if the results are insufficient.
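    The predictor's control regime can be summarised in a short sketch. The structure below is a hypothetical reconstruction from the description above (names, thresholds, and data shapes are assumptions, not the ALV implementation): a specialised substantiator proposes message types, coarse discrimination prunes the alternatives, fine discrimination selects an expectation, and the substantiator it calls for is run and its result verified.

        def predictor(fragmentary_text, message_type_model, substantiators):
            # Start: a specialised substantiator hypothesises message types
            # directly from the fragmentary recognition result.
            hypotheses = substantiators["classify_message_type"](fragmentary_text)

            # Coarse discrimination: reduce the number of message type alternatives.
            hypotheses = [h for h in hypotheses if h["score"] > 0.5]   # threshold is an assumption

            extracted = {}
            for hypothesis in hypotheses:
                # Fine discrimination: walk the expectations defined for this
                # message type and pursue each in turn.
                for expectation in message_type_model[hypothesis["type"]]:
                    substantiate = substantiators[expectation["substantiator"]]
                    result = substantiate(fragmentary_text)
                    if expectation["verify"](result):
                        extracted[expectation["slot"]] = result        # verified result
                    # else: error recovery, e.g. activate an alternative substantiator
            return extracted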