
    Retrieval System for Patent Images

    Patent information and images play important roles in describing the novelty of an invention. However, current patent collections do not support image retrieval, and patent images have become almost unsearchable. This paper presents a short review of existing research and open challenges in the patent image retrieval domain. The review finds that the feature extraction step is critical to matching query and database images successfully. To improve feature extraction in patent image retrieval, we propose a patent image retrieval approach based on the Affine-SIFT technique. We compare the proposed approach against existing feature extraction techniques to assess its potential.
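
    As a rough illustration of the affine-simulation idea behind Affine-SIFT, the sketch below detects SIFT features over a handful of synthetically warped views of a patent drawing and matches query and database images with a ratio test. It assumes OpenCV; the tilt and rotation sampling, the file names, and the 0.75 ratio threshold are illustrative choices, not parameters taken from the paper.

    ```python
    # Illustrative sketch of Affine-SIFT (ASIFT) style matching with OpenCV.
    # The tilt/rotation sampling below is a simplified version of the affine
    # view simulation scheme; all parameter values here are assumptions.
    import cv2
    import numpy as np

    def affine_sift_features(gray):
        """Detect SIFT features over a small set of simulated affine views."""
        sift = cv2.SIFT_create()
        keypoints, descriptors = [], []
        h, w = gray.shape
        for tilt in (1.0, 2.0):            # 1.0 = original view
            for angle in (0, 45, 90, 135): # in-plane rotations in degrees
                m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
                view = cv2.warpAffine(gray, m, (w, h))
                if tilt > 1.0:             # simulate out-of-plane tilt by squeezing one axis
                    view = cv2.resize(view, (w, int(h / tilt)))
                kps, descs = sift.detectAndCompute(view, None)
                if descs is not None:
                    keypoints.extend(kps)
                    descriptors.append(descs)
        return keypoints, np.vstack(descriptors)

    # Match a query patent drawing against a database image with a ratio test.
    query = cv2.imread("query_drawing.png", cv2.IMREAD_GRAYSCALE)   # hypothetical files
    db_img = cv2.imread("db_drawing.png", cv2.IMREAD_GRAYSCALE)
    _, d1 = affine_sift_features(query)
    _, d2 = affine_sift_features(db_img)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(d1, d2, k=2) if m.distance < 0.75 * n.distance]
    print(f"{len(good)} putative matches")
    ```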

    A Convolutional Neural Network-based Patent Image Retrieval Method for Design Ideation

    The patent database is often searched for inspirational stimuli that open innovative design opportunities because of its large size, extensive variety, and the rich design information in patent documents. However, most patent mining research focuses only on textual information and ignores visual information. Here, we propose a convolutional neural network (CNN)-based patent image retrieval method. The core of this approach is a novel neural network architecture named Dual-VGG, which performs two tasks: visual material type prediction and international patent classification (IPC) class label prediction. The trained network then provides deep features, in the form of image embedding vectors, that can be used for patent image retrieval and visual mapping. We evaluate the accuracy of both training tasks and the quality of the patent image embedding space to demonstrate the model's performance, and illustrate the approach in a case study of robot arm design retrieval. Compared with traditional keyword-based searching and Google image searching, the proposed method discovers more useful visual information for engineering design. (11 pages, 11 figures)
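
    A minimal PyTorch sketch of the dual-task idea: a shared VGG backbone feeding two classification heads, with the shared embedding reused for similarity-based retrieval. The class counts, embedding dimension, and choice of embedding layer are assumptions for illustration; the paper's Dual-VGG may differ in detail.

    ```python
    # Sketch of a dual-task VGG: one shared backbone, two classification heads
    # (visual material type and IPC class), and a shared embedding for retrieval.
    import torch
    import torch.nn as nn
    from torchvision import models

    class DualVGG(nn.Module):
        def __init__(self, n_material_types=6, n_ipc_classes=8, embed_dim=512):
            super().__init__()
            vgg = models.vgg16(weights=None)          # shared convolutional backbone
            self.features = vgg.features
            self.pool = nn.AdaptiveAvgPool2d((7, 7))
            self.embed = nn.Sequential(               # shared embedding used for retrieval
                nn.Flatten(), nn.Linear(512 * 7 * 7, embed_dim), nn.ReLU())
            self.material_head = nn.Linear(embed_dim, n_material_types)
            self.ipc_head = nn.Linear(embed_dim, n_ipc_classes)

        def forward(self, x):
            z = self.embed(self.pool(self.features(x)))
            return self.material_head(z), self.ipc_head(z), z

    model = DualVGG()
    imgs = torch.randn(4, 3, 224, 224)
    mat_logits, ipc_logits, embeddings = model(imgs)

    # Retrieval: rank database images by cosine similarity of embeddings.
    db = torch.randn(100, 512)                        # stand-in for precomputed embeddings
    sims = torch.nn.functional.cosine_similarity(embeddings[:1], db)
    print(sims.topk(5).indices)                       # indices of the 5 nearest images
    ```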

    Invention through Form and Function Analogy

    Invention through Form and Function Analogy is an invention book for teachers and other leaders who involve youth in the invention process. The book consists of an introduction and a set of nine learning-cycle-formatted lessons for teaching the principles of invention through the science and engineering design principles of form and function. An appendix contains sets of color illustrated cards to be printed onto cardstock and used for sorting and sequencing activities during the lessons. The lessons were field-tested with elementary and middle school students by teachers and improved through peer review. The introduction addresses metaphors, analogies, and the use of form and function analogies in problem solving and innovation; human needs related to invention and cultural universals are also discussed. The lessons address: 1) identifying forms and functions of objects; 2) forms and functions of the human hand; 3) forms and functions of the body extended with tools; 4) extending the body to serve basic human needs; 5) tools related to forms and functions of the mouth; 6) a historical perspective on inventions; 7) animal form and function analogies; 8) inventors inspired by form and function; and 9) combining SCAMPER with form and function to spur invention. Many charts give teachers ideas for supporting student discussions. Teachers who field-tested the book with elementary and middle school students were uniformly positive about the lessons. (25 tables, 20 figures, 7 card sets in the appendix, 35 references)

    Adaptive Methods for Robust Document Image Understanding

    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding are a stringent necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we address each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation, and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions, and give specific guidelines for future investigation. We address some of the identified issues with novel algorithmic solutions, putting special focus on generality, computational efficiency, and the exploitation of all available sources of information. More specifically, we introduce the following original methods: fully automatic detection of color reference targets in digitized material; accurate foreground extraction from color historical documents; font enhancement for hot metal typeset prints; a theoretically optimal solution to the document binarization problem from both the computational-complexity and threshold-selection points of view; layout-independent skew and orientation detection; a robust and versatile page segmentation method; a semi-automatic front page detection algorithm; and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets of real-life, heterogeneous document scans. The results show that a document understanding system combining these modules can robustly process a wide variety of documents with good overall accuracy.
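
    The thesis's own binarization and skew algorithms are not spelled out in this abstract; the sketch below shows only the standard building blocks those stages rest on, namely global Otsu thresholding and a projection-profile skew search, assuming OpenCV and a hypothetical input file name.

    ```python
    # Generic document binarization (standard Otsu) and projection-profile skew
    # search. These are textbook building blocks, not the thesis's own methods.
    import cv2
    import numpy as np

    def binarize(gray):
        """Global Otsu threshold; returns a binary image with ink as white."""
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
        return binary

    def estimate_skew(binary, max_angle=5.0, step=0.1):
        """Pick the rotation whose horizontal projection profile is sharpest."""
        h, w = binary.shape
        best_angle, best_score = 0.0, -1.0
        for angle in np.arange(-max_angle, max_angle + step, step):
            m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
            rotated = cv2.warpAffine(binary, m, (w, h))
            profile = rotated.sum(axis=1).astype(np.float64)
            score = np.var(profile)   # aligned text lines => high profile variance
            if score > best_score:
                best_angle, best_score = angle, score
        return best_angle

    gray = cv2.imread("page_scan.png", cv2.IMREAD_GRAYSCALE)  # hypothetical scan
    binary = binarize(gray)
    print(f"estimated skew: {estimate_skew(binary):.1f} degrees")
    ```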

    An Analysis of Computer Systems for the Secure Creation and Verification of User Instructions

    The ongoing digitisation of previously analogue systems through the Fourth Industrial Revolution is transforming modern societies. Almost every citizen, and businesses operating in most parts of the economy, are increasingly dependent on the ability of computer systems to accurately execute people's commands. This requires efficient data processing capabilities and effective data input methods that can accurately capture and process instructions given by a user. This thesis analyses state-of-the-art technologies for reliable data input through three case studies. In the first case study, we analyse the user interfaces of Windows 10 and macOS 10.14 for their ability to capture accurate input from users intending to erase data. We find several shortcomings in how both operating systems support users in identifying and selecting operations that match their intentions, and propose several improvements. The second study investigates the use of transaction authentication technology in online banking to preserve the integrity of transaction data in the presence of financial malware. We find a complex interplay of personal and sociotechnical factors that affect whether people successfully secure their transactions, derive representative personas, and propose a novel transaction authentication mechanism that ameliorates some of these factors. In the third study, we analyse the Security Code AutoFill feature in iOS and macOS and its interactions with the security processes of remote servers that require users to handle security codes delivered via SMS. We find novel security risks arising from this feature's design and propose amendments, some of which were implemented by Apple. From these case studies, we derive general insights on latent failures as causes of human error, extending the Swiss Cheese model of human error to non-work environments. These findings consequently extend the Human Factors Analysis and Classification System and can be applied to investigations of human error incidents.

    Advancing automation and robotics technology for the Space Station and for the US economy, volume 2

    In response to Public Law 98-371, dated July 18, 1984, the NASA Advanced Technology Advisory Committee has studied automation and robotics for use in the Space Station. This Technical Report, Volume 2, provides background information on automation and robotics technologies and their potential, and documents: the relevant aspects of Space Station design; representative examples of automation and robotics applications; the state of the technology and advances needed; and considerations for technology transfer to U.S. industry and for space commercialization.

    Data bases and data base systems related to NASA's aerospace program. A bibliography with indexes

    This bibliography lists 1778 reports, articles, and other documents introduced into the NASA scientific and technical information system from 1975 through 1980.

    Action Recognition in Still Images: Confluence of Multilinear Methods and Deep Learning

    Motion information is missing from a single image, yet it is a valuable cue for action recognition; its absence makes action recognition in still images an inherently challenging problem in computer vision. In this dissertation, we show that both spatial and temporal patterns provide crucial information for recognizing human actions: recognition depends not only on spatially salient pixels, but also on their temporal patterns. To address the absence of temporal information in a single image, we introduce five effective action classification methodologies along with a new still image action recognition dataset. These include (1) a new Spatial-Temporal Convolutional Neural Network, STCNN, trained by fine-tuning a CNN model, pre-trained on appearance-based classification only, over a novel latent space-time domain named Ranked Saliency Map and Predicted Optical Flow, or RankSM-POF for short; (2) a novel unsupervised zero-shot approach based on low-rank tensor decomposition, named ZTD; (3) the concept of a temporal image, a compact representation of a hypothetical sequence of images, used to design a new hierarchical deep learning network, TICNN, for still image action recognition; (4) UCF-STAR, a dataset for STill image Action Recognition containing over 1M images across 50 human body-motion action categories. UCF-STAR is the largest dataset in the literature for action recognition in still images, exposing the intrinsic difficulty of the task through its realistic scene and action complexity. Moreover, TSSTN, a two-stream spatiotemporal network, is introduced to model the latent temporal information in a single image and use it as prior knowledge in a two-stream deep network; and (5) a parallel heterogeneous meta-learning method that combines STCNN and ZTD through stacking into an ensemble of the proposed heterogeneous base classifiers. Altogether, this work demonstrates the benefits of UCF-STAR as a large-scale still image dataset and shows the role of latent motion information in recognizing human actions in still images, presenting approaches that predict temporal information and yield higher accuracy on widely used datasets.
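
    To make the two-stream idea concrete, here is a hedged PyTorch sketch in which one stream sees the RGB image and a second stream sees a predicted optical-flow map standing in for the missing motion cue, fused by concatenation before classification. The backbones, the fusion scheme, and the 2-channel flow input are simplifying assumptions, not the dissertation's exact TSSTN architecture.

    ```python
    # Two-stream still-image action recognition sketch: appearance stream on RGB,
    # motion stream on flow predicted from the same image, late fusion by concat.
    import torch
    import torch.nn as nn
    from torchvision import models

    class TwoStreamStillImage(nn.Module):
        def __init__(self, n_actions=50):
            super().__init__()
            self.spatial = models.resnet18(weights=None)     # appearance stream
            self.spatial.fc = nn.Identity()
            self.temporal = models.resnet18(weights=None)    # predicted-flow stream
            self.temporal.conv1 = nn.Conv2d(2, 64, 7, stride=2, padding=3, bias=False)
            self.temporal.fc = nn.Identity()
            self.classifier = nn.Linear(512 + 512, n_actions)

        def forward(self, rgb, predicted_flow):
            feats = torch.cat([self.spatial(rgb), self.temporal(predicted_flow)], dim=1)
            return self.classifier(feats)

    model = TwoStreamStillImage()
    rgb = torch.randn(2, 3, 224, 224)
    flow = torch.randn(2, 2, 224, 224)   # 2-channel (dx, dy) flow predicted from the image
    print(model(rgb, flow).shape)        # torch.Size([2, 50])
    ```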

    Application of Machine Learning within Visual Content Production

    We live in an era where digital content is produced at a dazzling pace. The heterogeneity of contents and contexts is so varied that numerous applications have been created to respond to people's and market demands. The visual content production pipeline generalises the process that allows a content editor to create and evaluate a product such as a video, an image, or a 3D model. Such data is then displayed on one or more devices, such as TVs, PC monitors, virtual reality head-mounted displays, tablets, mobiles, or even smartwatches. Content creation can be as simple as clicking a button to film a video and share it on a social network, or as complex as managing a dense user interface full of parameters, using keyboard and mouse, to generate a realistic 3D model for a VR game. In this second example, such sophistication results in a steep learning curve for beginner-level users, while expert users regularly need to refine their skills via expensive lessons, time-consuming tutorials, or experience. Thus, user interaction plays an essential role in the diffusion of content creation software, primarily when it is targeted at untrained people. In particular, the fast spread of virtual reality devices into the consumer market has created new opportunities for designing reliable and intuitive interfaces. Such interactions need to take a step beyond the point-and-click interaction typical of the 2D desktop environment: they must be smart, intuitive, and reliable enough to interpret 3D gestures, and therefore more accurate pattern recognition algorithms are needed. In recent years, machine learning, and in particular deep learning, has achieved outstanding results in many branches of computer science, such as computer graphics and human-computer interaction, outperforming algorithms that were considered state of the art; however, there have been only fleeting efforts to translate this into virtual reality. In this thesis, we apply deep learning models to two areas of the content production pipeline: advanced methods for user interaction and visual quality assessment. First, we focus on 3D sketching to retrieve models from an extensive database of complex geometries and textures while the user is immersed in a virtual environment. We explore both 2D and 3D strokes as tools for model retrieval in VR and implement a novel system for improving accuracy in searching for a 3D model. We contribute an efficient method to describe models through 3D sketches via iterative descriptor generation, focusing on both accuracy and user experience, and design a user study to compare different interactions for sketch generation. Second, we explore the combination of sketch input and vocal description to correct and fine-tune the search for 3D models in a database containing fine-grained variation, analysing sketch and speech queries and identifying a way to incorporate both into our system's interaction loop. Third, in the context of the visual content production pipeline, we present a detailed study of visual metrics and propose a novel method for detecting rendering-based artefacts in images, exploiting deep learning algorithms analogous to those used when extracting features from sketches.
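
    A minimal sketch of the embedding-and-rank pattern that underlies this kind of sketch-based retrieval: encode the user's stroke input into a unit-length descriptor and rank precomputed model descriptors by cosine similarity. The placeholder CNN encoder and all dimensions are assumptions; the thesis's iterative descriptor generation is not reproduced here.

    ```python
    # Embedding-based retrieval sketch: a toy encoder maps a rasterized sketch
    # to a descriptor; 3D models are ranked by cosine similarity to the query.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SketchEncoder(nn.Module):
        def __init__(self, embed_dim=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, embed_dim))

        def forward(self, x):
            return F.normalize(self.net(x), dim=1)   # unit-length descriptor

    encoder = SketchEncoder()
    sketch = torch.randn(1, 1, 128, 128)             # rasterized stroke input
    query = encoder(sketch)

    model_db = F.normalize(torch.randn(500, 128), dim=1)  # stand-in model descriptors
    scores = model_db @ query.squeeze(0)             # cosine similarity via dot product
    print(scores.topk(5).indices)                    # 5 best-matching 3D models
    ```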