
    The NASA Astrophysics Data System: Data Holdings

    Since its inception in 1993, the ADS Abstract Service has become an indispensable research tool for astronomers and astrophysicists worldwide. In those seven years, much effort has been directed toward improving both the quantity and the quality of references in the database. From the original database of approximately 160,000 astronomy abstracts, our dataset has grown almost tenfold to approximately 1.5 million references covering astronomy, astrophysics, planetary sciences, physics, optics, and engineering. We collect and standardize data from approximately 200 journals and present the resulting information in a uniform, coherent manner. With the cooperation of journal publishers worldwide, we have been able to place scans of full journal articles on-line back to the first volumes of many astronomical journals, and we are able to link to the current versions of articles, abstracts, and datasets for essentially all of the current astronomy literature. The trend toward electronic publishing in the field, the use of electronic submission of abstracts for journal articles and conference proceedings, and the increasingly prominent use of the World Wide Web to disseminate information have enabled the ADS to build a database unparalleled in other disciplines. The ADS can be accessed at http://adswww.harvard.edu
    Comment: 24 pages, 1 figure, 6 tables, 3 appendices

    Special Libraries, December 1966

    Volume 57, Issue 10

    Special Libraries, February 1966

    Volume 57, Issue 2

    The demands of users and the publishing world: printed or online, free or paid for?


    An Evaluation of the Need and Cost of Selected Trade Facilitation Measures in Bangladesh: Implications for the WTO Negotiations on Trade Facilitation

    With the ongoing customs reforms in Bangladesh, possible future negotiations on trade facilitation in the WTO will have a profound impact on Bangladesh, as well as on other LDCs and developing countries. These countries will benefit greatly from new trade facilitation initiatives; at the same time, they may face enormous challenges in implementing their commitments in this area. It is thus imperative for these countries to closely monitor the Doha negotiations on trade facilitation and be prepared to formulate their negotiating strategies. They should also continue with customs administration reform and trade facilitation capacity-building programs in order to develop their own capacity.
    Keywords: WTO, Trade Facilitation, Bangladesh

    Airport code/spaces

    Nearly all aspects of passenger air travel – from booking a ticket to checking in, passing through security screening, buying goods in duty free, baggage handling, flying, air traffic control, and customs and immigration checks – are now mediated by software and multiple information systems. Airports, as we have previously argued (Dodge and Kitchin 2004), consist of complex, overlapping assemblages that are to varying degrees dependent on a myriad of software systems to function, designed to smooth and increase passenger flows through various ‘contact’ points in the airport (as illustrated in Figure 1) and to enable pervasive surveillance to monitor potential security threats. Airport spaces – the check-in areas, security checkpoints, shopping areas, departure lounges, baggage reclaim, the immigration hall, the air traffic control room, even the plane itself – constitute coded space or code/space. Coded space is a space that uses software in its production, but where code is not essential to that production (code simply makes the production more efficient or productive). Code/space, in contrast, is a space dependent on software for its production – without code, the space will not function as intended, with processes failing because there are no manual alternatives (or the legacy ‘fall-back’ procedures are unable to handle the material flows, so the process fails through congestion). Air travel increasingly consists of transit through code/spaces, wherein if the code ‘fails’, passage is halted. For example, if the check-in computers crash there is no other way of checking passengers in; manual check-in has been discontinued, in part, due to new security procedures. Check-in areas are thus dependent on code to operate, and without it they are simply waiting rooms with no hope of onward passage until the problem is resolved.
In these cases, a dyadic relationship exists between software and space (hence the slash conjoining code/space), so that spatiality is the product of code and code exists in order to produce spatiality.

    Advanced document data extraction techniques to improve supply chain performance

    In this thesis, a novel machine learning technique to extract text-based information from scanned images has been developed. This information extraction is performed in the context of scanned invoices and bills used in financial transactions. These financial transactions contain a considerable amount of data that must be extracted, refined, and stored digitally before it can be used for analysis. Converting this data into a digital format is often a time-consuming process. Automation and data optimisation show promise as methods for reducing the time and cost of Supply Chain Management (SCM) processes, especially Supplier Invoice Management (SIM), Financial Supply Chain Management (FSCM), and Supply Chain procurement processes. This thesis uses a cross-disciplinary approach involving Computer Science and Operational Management to explore the benefit of automated invoice data extraction in business and its impact on SCM. The study adopts a multimethod approach based on empirical research, surveys, and interviews conducted with selected companies.
The expert system developed in this thesis focuses on two distinct areas of research: Text/Object Detection and Text Extraction. For Text/Object Detection, the Faster R-CNN model was analysed. While this model yields outstanding results in object detection, its performance degrades when image quality is low. The Generative Adversarial Network (GAN) model is proposed in response to this limitation: a generator network implemented with the help of the Faster R-CNN model and a discriminator based on PatchGAN. The output of the GAN model is text data with bounding boxes.
For text extraction from the bounding boxes, a novel data extraction framework was designed, consisting of XML processing (where an existing OCR engine is used), bounding-box pre-processing, text clean-up, OCR error correction, spell check, type check, pattern-based matching, and, finally, a learning mechanism for automating future data extraction. Whichever fields the system extracts successfully are provided in key-value format.
The efficiency of the proposed system was validated using existing datasets such as SROIE and VATI. Real-time data was validated using invoices collected by two companies that provide invoice automation services in various countries. Currently, these scanned invoices are sent to an OCR system such as OmniPage, Tesseract, or ABBYY FRE to extract text blocks, and a rule-based engine is then used to extract the relevant data. While this methodology is robust, the companies surveyed were not satisfied with its accuracy and sought new, optimised solutions. To confirm the results, the engines were used to return XML files containing the identified text and metadata. The output XML data was then fed into the new system for information extraction. This system uses both the existing OCR engine and a novel, self-adaptive, learning-based OCR engine based on the GAN model for better text identification. Experiments were conducted on various invoice formats to further test and refine its extraction capabilities. For cost optimisation and the analysis of spend classification, additional data were provided by another company in London with expertise in reducing its clients' procurement costs. This data was fed into the system to obtain a deeper level of spend classification and categorisation, which helped the company reduce its reliance on human effort and allowed for greater efficiency compared with performing similar tasks manually using Excel sheets and Business Intelligence (BI) tools.
The intention behind the development of this novel methodology was twofold: first, to develop and test a solution that does not depend on any specific OCR technology; second, to increase information extraction accuracy over that of existing methodologies. Finally, the thesis evaluates the real-world need for the system and the impact it would have on SCM. The newly developed method is generic and can extract text from any given invoice, making it a valuable tool for optimising SCM. In addition, the system uses a template-matching approach to ensure the quality of the extracted information.
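The OCR error correction and pattern-based matching stages described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the thesis's actual implementation: the field names, regular expressions, and character-confusion table are all assumptions for the example.

```python
import re

# Illustrative sketch of pattern-based field extraction with simple OCR
# error correction. Field names and regexes are assumed for this example.

# Common OCR character confusions in numeric fields (O->0, l->1, S->5).
OCR_FIXES = str.maketrans({"O": "0", "l": "1", "S": "5"})

FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s*(?:No\.?|#)\s*[:\-]?\s*([\w\-]+)", re.I),
    "date": re.compile(r"Date\s*[:\-]?\s*(\d{2}[/\-]\d{2}[/\-]\d{4})", re.I),
    "total": re.compile(r"Total\s*[:\-]?\s*\$?\s*([\dOlS]+\.[\dOlS]{2})", re.I),
}

def extract_fields(ocr_text: str) -> dict:
    """Return whichever fields matched, in key-value format."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            value = match.group(1)
            if name == "total":  # numeric clean-up only where digits are expected
                value = value.translate(OCR_FIXES)
            fields[name] = value
    return fields

sample = "Invoice No: INV-4821\nDate: 12/03/2021\nTotal: $1O4.5O"
print(extract_fields(sample))
```

Fields the patterns fail to match are simply omitted from the result, mirroring the abstract's point that only successfully extracted fields are returned in key-value format.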
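The type-check and template-matching validation the abstract mentions might look like the following hedged sketch, where each extracted field must satisfy the type its template expects before being accepted. The field names and expected formats are assumptions for illustration.

```python
import re
from datetime import datetime

# Sketch of template-driven type checking: each field is validated against
# the format its (assumed) invoice template expects.

def _is_date(value: str) -> bool:
    try:
        datetime.strptime(value, "%d/%m/%Y")
        return True
    except ValueError:
        return False

def _is_amount(value: str) -> bool:
    try:
        float(value)
        return True
    except ValueError:
        return False

TEMPLATE_CHECKS = {
    "invoice_number": lambda v: re.fullmatch(r"[A-Za-z0-9\-]+", v) is not None,
    "date": _is_date,
    "total": _is_amount,
}

def validate(fields: dict) -> dict:
    """Keep only the fields that pass their template's type check."""
    return {k: v for k, v in fields.items()
            if k in TEMPLATE_CHECKS and TEMPLATE_CHECKS[k](v)}

# An impossible date (31/02/2021) is rejected; the other fields survive.
print(validate({"invoice_number": "INV-4821", "date": "31/02/2021", "total": "104.50"}))
```

Dropping fields that fail their checks (rather than passing them through) is one plausible way such a system could ensure the quality of the extracted information before it reaches downstream SCM analysis.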