
236 research outputs found

Semantic Exploration of Text Documents with Multi-Faceted Metadata Employing Word Embeddings: The Patent Landscaping Use Case

Author: Skripnikova Tatyana
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2019

The volume of publications documenting scientific progress grows continuously. This calls for technological tools that support efficient analysis of these works. Such documents are characterized not only by their textual content but also by a set of metadata attributes of many kinds, including relationships between documents. This complexity makes developing a visualization approach that supports the exploration of written works a necessary and challenging task. Patents exemplify the problem because companies examine them in large numbers to gain competitive advantages or to steer their own research and development. An approach to exploratory visualization is proposed that is based on metadata and semantic embeddings of patent content. Word embeddings from a pretrained Word2vec model are used to determine similarities between documents. In addition, hierarchical clustering methods help to offer several levels of semantic detail through extracted relevant keywords. To date, the present visualization approach is presumably the first to combine semantic embeddings with hierarchical clustering while supporting diverse interaction types based on metadata attributes. The approach employs user interaction techniques such as brushing and linking, focus plus context, details on demand, and semantic zoom. This makes it possible to discover relationships that arise from the interplay of (1) the distributions of metadata values and (2) positions in the semantic space. The visualization concept was shaped by user interviews and evaluated in a think-aloud study with patent experts.
During the evaluation, the presented approach was compared with a baseline approach based on TF-IDF vectors. The usability study found that the visualization metaphors and interaction techniques were chosen appropriately. It also showed that the user interface played a considerably larger role in the participants' impressions than the way the patents were placed and clustered. In fact, both approaches yielded very similar extracted cluster keywords. Nevertheless, the semantic approach placed the clusters more intuitively and separated them more clearly. The proposed visualization layout, as well as the interaction techniques and semantic methods, can also be extended to other kinds of written works, e.g. scientific publications. Other embedding methods such as Paragraph2vec [61] or BERT [32] can also be used to exploit contextual dependencies in the text beyond the word level.
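
As a rough illustration of the embedding-based document similarity mentioned in the abstract above, the following Python sketch compares documents by the cosine similarity of their averaged word vectors. The toy three-dimensional vectors, the averaging step, the choice of cosine similarity, and all function names are assumptions made for illustration. The abstract states only that embeddings from a pretrained Word2vec model are used to determine similarities between documents.

```python
import math

# Toy 3-dimensional "word embeddings". These values are hypothetical and
# do not come from a real pretrained Word2vec model.
TOY_EMBEDDINGS = {
    "battery": [0.9, 0.1, 0.0],
    "electrode": [0.8, 0.2, 0.1],
    "engine": [0.1, 0.9, 0.2],
    "piston": [0.0, 0.8, 0.3],
}


def document_vector(tokens, embeddings):
    """Average the embeddings of all known tokens (an assumed pooling choice)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Two documents about batteries and one about engines.
doc_a = document_vector(["battery", "electrode"], TOY_EMBEDDINGS)
doc_b = document_vector(["electrode", "battery", "unknown"], TOY_EMBEDDINGS)
doc_c = document_vector(["engine", "piston"], TOY_EMBEDDINGS)

sim_ab = cosine_similarity(doc_a, doc_b)  # same known words, so close to 1
sim_ac = cosine_similarity(doc_a, doc_c)  # different topic, so clearly lower
```

With real embeddings, a pairwise similarity matrix built this way could serve as input to a hierarchical clustering step. Whether the thesis pools word vectors by averaging is not stated in the abstract.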

KITopen

Visual Analysis of High-Dimensional Point Clouds using Topological Abstraction

Author: Oesterling Patrick
Publication date: 14/04/2016

This thesis is about visualizing a kind of data that is trivial to process by computers but difficult to imagine by humans because nature does not allow for intuition with this type of information: high-dimensional data. Such data often result from representing observations of objects under various aspects or with different properties. In many applications, a typical, laborious task is to find related objects or to group those that are similar to each other. One classic solution for this task is to imagine the data as vectors in a Euclidean space with object variables as dimensions. Utilizing Euclidean distance as a measure of similarity, objects with similar properties and values accumulate to groups, so-called clusters, that are exposed by cluster analysis on the high-dimensional point cloud. Because similar vectors can be thought of as objects that are alike in terms of their attributes, the point cloud's structure and individual cluster properties, like their size or compactness, summarize data categories and their relative importance. The contribution of this thesis is a novel analysis approach for visual exploration of high-dimensional point clouds without suffering from structural occlusion. The work is based on implementing two key concepts: The first idea is to discard those geometric properties that cannot be preserved and, thus, lead to the typical artifacts. Topological concepts are used instead to shift the focus away from a point-centered view on the data to a more structure-centered perspective. The advantage is that topology-driven clustering information can be extracted in the data's original domain and be preserved without loss in low dimensions. The second idea is to split the analysis into a topology-based global overview and a subsequent geometric local refinement.
The occlusion-free overview enables the analyst to identify features and to link them to other visualizations that permit analysis of those properties not captured by the topological abstraction, e.g. cluster shape or value distributions in particular dimensions or subspaces. The advantage of separating structure from data point analysis is that restricting local analysis only to data subsets significantly reduces artifacts and the visual complexity of standard techniques. That is, the additional topological layer enables the analyst to identify structure that was hidden before and to focus on particular features by suppressing irrelevant points during local feature analysis. This thesis addresses the topology-based visual analysis of high-dimensional point clouds for both the time-invariant and the time-varying case. Time-invariant means that the points do not change in their number or positions. That is, the analyst explores the clustering of a fixed and constant set of points. The extension to the time-varying case implies the analysis of a varying clustering, where clusters newly appear, merge or split, or vanish. Especially for high-dimensional data, both tracking (that is, relating features over time) and visualizing changing structure are difficult problems to solve.

Qucosa - Publikationsserver der Universität Leipzig


Essays on Probabilistic Machine Learning for Economics

Author: Kuhlen Nikolas
Publication venue: University of Cambridge
Publication date: 06/07/2021

This thesis consists of three essays that explore the use of probabilistic machine learning techniques in combination with information-theoretic concepts to answer economic questions. Over the past years, economists have started applying machine learning methods to a wide range of topics. Probabilistic methods in the context of unsupervised learning represent one particular modelling approach at the intersection of computer science and statistics. While widely used in applied statistics, these models, however, do not necessarily provide relevant and interpretable outputs from an economist's perspective. In this thesis, I appeal to information-theoretic methods to summarise the probabilistic information inferred from such models and construct economically meaningful measures. Nikolas Kuhlen gratefully acknowledges the financial support of The Alan Turing Institute under research award No. TU/C/000030.

Apollo (Cambridge)

Patent data driven innovation logic

Author: Dewulf Simon
Publication venue: Dyson School of Design Engineering, Imperial College London
Publication date: 01/11/2020

Innovation research is conventionally conducted with creativity techniques such as TRIZ, Mind Mapping, Brainstorming, etc. (Dewulf, Baillie 1998). Patent research is typically used to research novelty or prior art, and for legal studies. This thesis is at the intersection of creativity techniques and patent data analysis. It describes how to utilise patent data for distilling Innovation Logic and conducting innovation research. Using the patent research tool PatentInspiration (© AULIVE Software NV), the 4 different stages of the Innovation Logic approach have been subjected to text analysis in patent literature. The specific text patterns were identified and documented in several case studies, with one case study running across the whole thesis: the toothbrush. The opportunities and limitations of Patent Data Driven Innovation Research have been documented and discussed. This methodology has been demonstrated within a proposed structural approach to problem solving, technology marketing and innovation research. Furthermore, the potential of artificial idea generation and artificial creativity was examined and debated for the purpose of computer aided creativity. This thesis examines and confirms three claims. Claim 1: properties and functions can be adjectives and verbs in patent literature. Claim 2: patent data analysis augments the full Innovation Logic process. Claim 3: artificial innovation methods can be fueled by patent data. Patent data can be text mined, acting as a global brain consisting of over 100 million invention documents. It is possible to use this existing data to reverse engineer thinking methodologies, allowing scientists and engineers to solve new problems, invent new products or processes, or find new markets for existing technologies.
Patent Data Driven Innovation Logic will demonstrate a systematic innovation approach that combines the force of contemporary data mining methods on patent literature with a structured innovation research methodology. Open Access

Spiral - Imperial College Digital Repository

Edge Based RGB-D SLAM and SLAM Based Navigation

Author: Bose Laurie N
Publication date: 07/03/2017

Explore Bristol Research

Assessing Low-Carbon Fuel Technology Innovation Through a Technology Innovation System Approach

Author: Kessler Jeff
Publication venue: eScholarship, University of California
Publication date: 01/12/2015

While the existing work on technology innovation is abundant, the innovation process largely remains a “black box,” shrouded in mystery. Energy models that incorporate innovation concepts, such as experience curves, fail to consider the fundamental processes that drive innovation. This dissertation research establishes a set of methodological approaches to better break into this innovation black box, aiding in the quantification of the more qualitative approaches to innovation. These methods are applied to better examine low-carbon technology innovation in transportation. Specifically, this dissertation looks at biofuel innovation and the more recent diffusion of electric vehicles. Patent trends, one traditional approach for quantifying innovations, are used to provide a point of comparison for the novel methodologies employed. This research shows that the innovation narrative and conclusions that can be drawn from patent data are largely dependent on how patents are classified. Employing statistical models in conjunction with computational linguistics and machine-learning algorithms, it is possible to classify large bodies of text. This methodology is applied to a large selection of patents to better classify biofuel technologies. Additionally, this method is applied to a large repository of textual media, such as newspaper articles and trade journals, to select for specific technologies, and to classify articles by the type of information they convey. This Technology Innovation System (TIS) database is believed to adequately proxy the flow of information over time, due to the large number of documents collected. The innovation trends captured in the TIS database align well with the biofuel narrative established in literature. There is also good alignment between patent data classified through this methodology and the TIS database.
Through use of the TIS database in conjunction with deployment data and policy data, this dissertation demonstrates several applications for assessing technology innovation. Results can be used to provide suggestions, supported by the data, which may foster improved innovation outcomes for low-carbon transportation technologies.

Ezid

eScholarship - University of California

Misc. Pub. 91-1

Publication venue: Agricultural and Forestry Experiment Station, School of Agriculture and Land Resources Management, University of Alaska Fairbanks
Publication date: 31/12/1990

I submit herewith the annual report of the Agricultural and Forestry Experiment Station, School of Agriculture and Land Resources Management, University of Alaska Fairbanks, for the period ending December 31, 1990. This is done in accordance with an act of the Congress, approved March 2, 1887, entitled "An act to establish Agricultural Experiment Stations, in connection with the Agricultural Colleges established in the several states under the provisions of an act approved July 2, 1862, and under the acts supplementary thereto," and also of the act of the Alaska Territorial Legislature, approved March 12, 1935, accepting the provisions of the act of Congress. James V. Drew, Director
Statement of Purpose -- Plant and Animal Sciences -- Forest Sciences -- Resources Management -- Financial Statement -- Publications -- Staff

ScholarWorks@UA

Media Infrastructures and the Politics of Digital Time

Publication venue: Amsterdam University Press
Publication date: 08/09/2021

Digital media inscribe new patterns of time every day, promising instant communication, synchronous collaboration, intricate time management, and profound new advantages in speed. The essays in this volume reconsider these outward interfaces of convenience by calling attention to their supporting infrastructures, the networks of digital time that exert pressures of conformity and standardization on the temporalities of lived experience and have important ramifications for social relations, stratifications of power, practices of cooperation, and ways of life. Interdisciplinary in method and international in scope, the volume draws together insights from media and communication studies, cultural studies, and science and technology studies while staging an important encounter between two distinct approaches to the temporal patterning of media infrastructures: a North American strain emphasizing the social and cultural experiences of lived time, and a European tradition, prominent especially in Germany, focusing on technological time and time-critical processes.

Directory of Open Access Books (DOAB)

Products and Services

Publication venue: IntechOpen
Publication date: 20/04/2021

Today's global economy offers more opportunities, but is also more complex and competitive than ever before. This fact leads to a wide range of research activity in different fields of interest, especially in the so-called high-tech sectors. This book is a result of widespread research and development activity from many researchers worldwide, covering the aspects of development activities in general, as well as various aspects of the practical application of knowledge.

Directory of Open Access Books (DOAB)

Spinoff 1997: 25 Years of Reporting Down-to-Earth Benefits

Publication date: 01/01/1997

The 25th annual issue of NASA's report on technology transfer and research and development (R&D) from its ten field centers is presented. The publication is divided into three sections. Section 1 comprises a summary of R&D over the last 25 years. Section 2 presents details of the mechanisms NASA uses to transfer technology to private industry as well as the assistance NASA provides in commercialization efforts. Section 3, which is the focal point of the publication, features success stories of manufacturers and entrepreneurs in developing commercial products and services that improve the economy and life in general.

NASA Technical Reports Server