167 research outputs found

    A Convolutional Neural Network-based Patent Image Retrieval Method for Design Ideation

    The patent database is often searched for inspirational stimuli for innovative design opportunities because of its large size, extensive variety, and the rich design information in patent documents. However, most patent mining research focuses only on textual information and ignores visual information. Herein, we propose a convolutional neural network (CNN)-based patent image retrieval method. The core of this approach is a novel neural network architecture named Dual-VGG that is designed to accomplish two tasks: visual material type prediction and international patent classification (IPC) class label prediction. In turn, the trained neural network provides the deep features in the image embedding vectors that can be utilized for patent image retrieval and visual mapping. The accuracy of both training tasks and the patent image embedding space are evaluated to show the performance of our model. This approach is also illustrated in a case study of robot arm design retrieval. Compared to traditional keyword-based searching and Google image searching, the proposed method discovers more useful visual information for engineering design. Comment: 11 pages, 11 figures
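
The retrieval step described in this abstract (ranking patent images by how close their embedding vectors are to a query embedding) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which similarity measure is used, so cosine similarity is an assumption here, and the names `cosine_similarity`, `retrieve`, the image identifiers, and the three-dimensional toy vectors are all hypothetical stand-ins for embeddings that a trained network would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve(query, index, top_k=2):
    """Return the top_k (image_id, score) pairs most similar to the query."""
    scored = [(image_id, cosine_similarity(query, vec))
              for image_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real ones would come from the trained network.
index = {
    "patent_A_fig1": [0.9, 0.1, 0.0],
    "patent_B_fig3": [0.0, 1.0, 0.2],
    "patent_C_fig2": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]
results = retrieve(query, index)
for image_id, score in results:
    print(image_id, round(score, 3))
```

With these toy vectors, the two images whose embeddings point in nearly the same direction as the query are ranked first, which is the behavior an embedding-based retrieval system relies on.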

    Data-Driven Design-by-Analogy: State of the Art and Future Directions

    Design-by-Analogy (DbA) is a design methodology wherein new solutions, opportunities or designs are generated in a target domain based on inspiration drawn from a source domain; it can benefit designers in mitigating design fixation and improving design ideation outcomes. Recently, the increasingly available design databases and rapidly advancing data science and artificial intelligence technologies have presented new opportunities for developing data-driven methods and tools for DbA support. In this study, we survey existing data-driven DbA studies and categorize individual studies according to the data, methods, and applications in four categories, namely, analogy encoding, retrieval, mapping, and evaluation. Based on both nuanced organic review and structured analysis, this paper elucidates the state of the art of data-driven DbA research to date and benchmarks it with the frontier of data science and AI research to identify promising research opportunities and directions for the field. Finally, we propose a future conceptual data-driven DbA system that integrates all propositions. Comment: A preprint version

    Classification of Visualization Types and Perspectives in Patents

    Due to the swift growth of patent applications each year, information and multimedia retrieval approaches that facilitate patent exploration and retrieval are of utmost importance. Different types of visualizations (e.g., graphs, technical drawings) and perspectives (e.g., side view, perspective) are used to visualize details of innovations in patents. The classification of these images enables a more efficient search and allows for further analysis. So far, datasets for image type classification lack some important visualization types for patents. Furthermore, related work does not make use of recent deep learning approaches including transformers. In this paper, we adopt state-of-the-art deep learning methods for the classification of visualization types and perspectives in patent images. We extend the CLEF-IP dataset for image type classification in patents to ten classes and provide manual ground truth annotations. In addition, we derive a set of hierarchical classes from a dataset that provides weakly-labeled data for image perspectives. Experimental results have demonstrated the feasibility of the proposed approaches. Source code, models, and dataset will be made publicly available. Comment: Accepted in International Conference on Theory and Practice of Digital Libraries (TPDL) 2023 (they have the copyright to publish the camera-ready version of this work)

    Deep Learning for Technical Document Classification

    In large technology companies, the requirements for managing and organizing technical documents created by engineers and managers have increased dramatically in recent years, which has led to a higher demand for more scalable, accurate, and automated document classification. Prior studies have focused only on processing text for classification, whereas technical documents often contain multimodal information. To leverage multimodal information and improve classification performance, this paper presents a novel multimodal deep learning architecture, TechDoc, which utilizes three types of information: natural language texts, descriptive images within documents, and the associations among the documents. The architecture synthesizes the convolutional neural network, recurrent neural network, and graph neural network through an integrated training process. We applied the architecture to a large multimodal technical document database and trained the model for classifying documents based on the hierarchical International Patent Classification system. Our results show that TechDoc achieves higher classification accuracy than the unimodal methods and other state-of-the-art benchmarks. The trained model can potentially be scaled to millions of real-world multimodal technical documents, which is useful for data and knowledge management in large technology companies and organizations. Comment: 16 pages, 8 figures, 9 tables

    Patent Data for Engineering Design: A Critical Review and Future Directions

    Patent data have long been used for engineering design research because of their large and expanding size and the widely varying, massive amount of design information contained in patents. Recent advances in artificial intelligence and data science present unprecedented opportunities to develop data-driven design methods and tools, as well as advance design science, using the patent database. Herein, we survey and categorize the patent-for-design literature based on its contributions to design theories, methods, tools, and strategies, as well as the types of patent data and data-driven methods used in respective studies. Our review highlights promising future research directions in patent data-driven design research and practice. Comment: Accepted by JCIS

    Generative Transformers for Design Concept Generation

    Generating novel and useful concepts is essential during the early design stage to explore a large variety of design opportunities, which usually requires advanced design thinking ability and a wide range of knowledge from designers. A growing body of work on computer-aided tools has explored the retrieval of knowledge and heuristics from design data. However, these tools only provide stimuli that inspire designers from limited aspects. This study explores recent advances in natural language generation (NLG) techniques from the artificial intelligence (AI) field to automate early-stage design concept generation. Specifically, a novel approach utilizing the generative pre-trained transformer (GPT) is proposed to leverage the knowledge and reasoning from textual data and transform them into new concepts in understandable language. Three concept generation tasks are defined to leverage different knowledge and reasoning: domain knowledge synthesis, problem-driven synthesis, and analogy-driven synthesis. The experiments with both human and data-driven evaluation show good performance in generating novel and useful concepts. Comment: Accepted by J. Comput. Inf. Sci. En

    Multi-modal Machine Learning in Engineering Design: A Review and Future Directions

    In the rapidly advancing field of multi-modal machine learning (MMML), the convergence of multiple data modalities has the potential to reshape various applications. This paper presents a comprehensive overview of the current state, advancements, and challenges of MMML within the sphere of engineering design. The review begins with a deep dive into five fundamental concepts of MMML: multi-modal information representation, fusion, alignment, translation, and co-learning. Following this, we explore the cutting-edge applications of MMML, placing a particular emphasis on tasks pertinent to engineering design, such as cross-modal synthesis, multi-modal prediction, and cross-modal information retrieval. Through this comprehensive overview, we highlight the inherent challenges in adopting MMML in engineering design, and proffer potential directions for future research. To spur on the continued evolution of MMML in engineering design, we advocate for concentrated efforts to construct extensive multi-modal design datasets, develop effective data-driven MMML techniques tailored to design applications, and enhance the scalability and interpretability of MMML models. MMML models, as the next generation of intelligent design tools, hold a promising future to impact how products are designed.

    Natural Language Processing in-and-for Design Research

    We review the scholarly contributions that utilise Natural Language Processing (NLP) methods to support the design process. Using a heuristic approach, we collected 223 articles published in 32 journals in the period 1991-present. We present state-of-the-art NLP in-and-for design research by reviewing these articles according to the type of natural language text sources: internal reports, design concepts, discourse transcripts, technical publications, consumer opinions, and others. Upon summarizing and identifying the gaps in these contributions, we utilise an existing design innovation framework to identify the applications that are currently being supported by NLP. We then propose a few methodological and theoretical directions for future NLP in-and-for design research.

    A Concept for Deployment and Evaluation of Unsupervised Domain Adaptation in Cognitive Perception Systems

    Recent developments in deep learning enable perception systems to acquire knowledge about a predefined operating range, a so-called domain, in a data-driven way. These supervised learning methods are driven by the emergence of large-scale annotated datasets and increasingly powerful processors, and they show unmatched performance on perception tasks across a wide range of application areas. However, neural networks trained in a supervised manner are limited by the amount of annotated data available, which in turn results in a limited operating range. Supervised learning also relies heavily on manual data annotation. Given the steadily growing availability of large amounts of unannotated data, the use of unsupervised domain adaptation is crucial. Methods for unsupervised domain adaptation are mostly not suited to ensuring the required deployment of the neural network in an additional domain. Moreover, existing metrics are often insufficient for an application-oriented validation of domain-adapted neural networks. The main contribution of this dissertation consists of new concepts for unsupervised domain adaptation. Based on a categorization of domain shifts and on knowledge representations available a priori from a neural network trained in a supervised manner, unsupervised domain adaptation on unannotated data is made possible. To address the continuous deployment of neural networks for perception applications, novel methods were developed specifically for the unsupervised extension of a neural network's operating range. Exemplary use cases in automotive vision show how the novel methods, combined with newly developed metrics, contribute to the continuous deployment of neural networks on unannotated data. In addition, the implementations of all developed methods and algorithms are presented and made publicly available. In particular, the novel methods were successfully applied to unsupervised domain adaptation from daytime to nighttime object detection in automotive vision.

    First steps in the study of cyber-psycho-cognitive operations

    Master's dissertation, Universidade de Brasília, Instituto de Relações Internacionais, Programa de Pós-Graduação em Relações Internacionais, 2019. This is an analysis of the ICT-based mechanisms involved in the articulation of lifeworlds that are strategically oriented to foster, prevent or undermine the development of psycho-cognitive conditions adequate for the construction or sustainability of an authority's or a political action's rational legitimacy. While grounding theory in a historical context, the application of Foucauldian "archeological" instruments to the study of the political narratives giving rise to and springing from "Russiagate" also served to validate the premised convergence and incorporation of common agenda-setting trends and practices typical of traditional psychological operations. However, the effects of both the commercial availability of deep-learning ICTs and the cognition-based structuration afforded by their ubiquity and economic centrality set this "dispositif" apart, thereby deserving a unique conceptualization and research framework. This study is a contribution to such an endeavor.