A hybrid similarity measure method for patent portfolio analysis

Author: Huang L
Lu J
Porter AL
Shang L
Zhang G
Zhang Y
Zhu D
Publication venue: 'Elsevier BV'
Publication date: 01/11/2016

Similarity measures are fundamental tools for identifying relationships within or across patent portfolios. Many bibliometric indicators are used to determine similarity measures; for example, bibliographic coupling, citation and co-citation, and co-word distribution. This paper aims to construct a hybrid similarity measure method based on multiple indicators to analyze patent portfolios. Two models are proposed: categorical similarity and semantic similarity. The categorical similarity model emphasizes international patent classifications (IPCs), while the semantic similarity model emphasizes textual elements. We introduce fuzzy set routines to translate the rough technical (sub-)categories of IPCs into defined numeric values, and we calculate the categorical similarities between patent portfolios using membership grade vectors. In parallel, we identify and highlight core terms in a 3-level tree structure and compute the semantic similarities by comparing the tree-based structures. A weighting model is designed to consider: 1) the bias that exists between the categorical and semantic similarities, and 2) the weighting or integrating strategy for a hybrid method. A case study measuring the technological similarities between selected firms in China's medical device industry is used to demonstrate the reliability of our method, and the results indicate the practical meaning of our method in a broad range of informetric applications.
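The abstract does not give the formulas, so the following is only an illustrative sketch of how a categorical score and a semantic score might be combined. It assumes (these are assumptions, not the paper's method) that the categorical similarity is the cosine of two membership grade vectors and that the hybrid score is a simple convex combination controlled by a hypothetical `weight` parameter.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def hybrid_similarity(grades_a, grades_b, semantic_sim, weight=0.5):
    """Blend a categorical and a semantic similarity score.

    grades_a, grades_b: membership grade vectors over the same IPC
        categories (assumed representation).
    semantic_sim: a semantic similarity in [0, 1], computed elsewhere.
    weight: hypothetical share given to the categorical score.
    """
    categorical_sim = cosine(grades_a, grades_b)
    return weight * categorical_sim + (1 - weight) * semantic_sim


if __name__ == "__main__":
    # Two portfolios with made-up membership grades over three categories.
    print(hybrid_similarity([0.8, 0.2, 0.0], [0.6, 0.4, 0.0], semantic_sim=0.5))
```

Changing `weight` shifts the balance between the two models; how the paper actually sets or learns this balance is not stated in the abstract.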

OPUS - University of Technology Sydney

An Information Retrieval Approach for Automatically Constructing Software Libraries

Author: Berry Daniel M.
Kaiser Gail E.
Maarek Yoelle S.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1990

Although software reuse presents clear advantages for programmer productivity and code reliability, it is not practiced enough. One of the reasons for the only moderate success of reuse is the lack of software libraries that facilitate the actual locating and understanding of reusable components. This paper describes a technology for automatically assembling large software libraries which promote software reuse by helping the user locate the components closest to her/his needs. Software libraries are automatically assembled from a set of unorganized components by using information retrieval techniques. The construction of the library is done in two steps. First, attributes are automatically extracted from natural language documentation by using a new indexing scheme based on the notions of lexical affinities and quantity of information. Then a hierarchy for browsing is automatically generated using a clustering technique which draws only on the information provided by the attributes. Thanks to the free-text indexing scheme, tools following this approach can accept free-style natural language queries. This technology has been implemented in the GURU system, which has been applied to construct an organized library of AIX utilities. An experiment was conducted to evaluate the retrieval effectiveness of GURU as compared to INFOEXPLORER, a hypertext library system for AIX 3 on the IBM RISC System/6000 series. We followed the usual evaluation procedure used in information retrieval, based upon recall and precision measures, and determined that our system performs 15% better on a random test set, while being much less expensive to build than INFOEXPLORER.
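The recall and precision measures mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard set-based definitions, not the GURU evaluation code; the function name and the example component names are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: items the system returned.
    relevant: items judged relevant for the query.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: share of retrieved items that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant items that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical query result and relevance judgments.
    print(precision_recall(["cp", "mv", "ls", "rm"], ["cp", "ls", "ln"]))
```

In an evaluation of this kind, both values would be averaged over a set of test queries for each system being compared.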

Columbia University Academic Commons

Detecting and predicting the topic change of Knowledge-based Systems: A topic-based bibliometric analysis from 1991 to 2016

Author: Chen H
Lu J
Zhang G
Zhang Y
Publication venue: 'Elsevier BV'
Publication date: 01/10/2017

The journal Knowledge-based Systems (KnoSys) has been published for over 25 years, during which time its main foci have been extended to a broad range of studies in computer science and artificial intelligence. Answering the questions “What is the KnoSys community interested in?” and “How does such interest change over time?” is important to both the editorial board and the audience of KnoSys. This paper conducts a topic-based bibliometric study to detect and predict the topic changes of KnoSys from 1991 to 2016. A Latent Dirichlet Allocation model is used to profile the hotspots of KnoSys and predict possible future trends from a probabilistic perspective. A model of scientific evolutionary pathways applies a learning-based process to detect the topic changes of KnoSys in sequential time slices. Six main research areas of KnoSys are identified, i.e., expert systems, machine learning, data mining, decision making, optimization, and fuzzy, and the results also indicate that the interest of the KnoSys community in the area of computational intelligence has risen, and that the ability to construct practical systems through knowledge use and accurate prediction models is highly emphasized. Such empirical insights can be used as a guide for KnoSys submissions.

OPUS - University of Technology Sydney

A heuristic information retrieval study: an investigation of methods for enhanced searching of distributed data objects exploiting bidirectional relevance feedback

Author: Petratos Panagiotis
Publication venue: University of Bedfordshire
Publication date: 01/01/2004

A thesis submitted for the degree of Doctor of Philosophy of the University of Luton. The primary aim of this research is to investigate methods of improving the effectiveness of current information retrieval systems. This aim can be achieved by accomplishing numerous supporting objectives. A foundational objective is to introduce a novel bidirectional, symmetrical fuzzy logic theory which may prove valuable to information retrieval, including internet searches of distributed data objects. A further objective is to design, implement and apply the novel theory to an experimental information retrieval system called ANACALYPSE, which automatically computes the relevance of a large number of unseen documents from expert relevance feedback on a small number of documents read. A further objective is to define a methodology used in this work as an experimental information retrieval framework consisting of multiple tables including various formulae which allow a plethora of syntheses of similarity functions, term weights, relative term frequencies, document weights, bidirectional relevance feedback and history-adjusted term weights. The evaluation of bidirectional relevance feedback reveals a better correspondence between system ranking of documents and users' preferences than feedback-free system ranking. The assessment of similarity functions reveals that the Cosine and Jaccard functions perform significantly better than the DotProduct and Overlap functions. The evaluation of history tracking of the documents visited from a root page reveals better system ranking of documents than tracking-free information retrieval. The assessment of stemming reveals that system information retrieval performance remains unaffected, while stop word removal does not appear to be beneficial and can sometimes be harmful. The overall evaluation of the experimental information retrieval system, in comparison to a leading-edge commercial information retrieval system and also in comparison to the expert's gold standard of judged relevance according to established statistical correlation methods, reveals enhanced system information retrieval effectiveness.
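The four similarity functions compared in the abstract can be illustrated with their common set-based textbook forms. This is a sketch under that assumption; the thesis may define them over weighted term vectors, and the function names below are hypothetical.

```python
import math


def dot_product(a, b):
    """Number of terms shared by the two term sets."""
    return len(a & b)


def cosine(a, b):
    """Shared terms normalized by the geometric mean of the set sizes."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def jaccard(a, b):
    """Shared terms divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap(a, b):
    """Shared terms divided by the size of the smaller set."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


if __name__ == "__main__":
    query = {"fuzzy", "relevance", "feedback"}
    document = {"relevance", "feedback", "ranking", "history"}
    for fn in (dot_product, cosine, jaccard, overlap):
        print(fn.__name__, fn(query, document))
```

Unlike the dot product, the cosine and Jaccard forms normalize by the sizes of both sets, so long documents are not favored merely for containing more terms.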

OpenGrey Repository

University of Bedfordshire Repository

Information retrieval (Part I): Introduction

Author: Paijmans J.J.
Publication venue: Institute for Language Technology and Artificial Intelligence, Tilburg University
Publication date: 01/01/1992

Tilburg University Repository

The intellectual structure and substance of the knowledge utilization field: A longitudinal author co-citation analysis, 1945 to 2004

Author: A Kothari
A Zuccala
AG Gandhi
AL Cochrane
AW Blackman
AW Gouldner
B Cronin
B Latour
B Ryan
B Sarter
B Seely
C Van den Bulte
C Weiss
CA Cottrill
CA Estabrooks
Carole A Estabrooks
CH Weiss
Connie Winther
CW Churchman
D Crane
DJ de Solla Price
DL Sackett
E Katz
E Mansfield
E Rogers
E West
EA Lindquist
EA McGlynn
EM Glaser
EM Rogers
ES Lang
Evidence-Based Medicine Working Group
G Farley
G Salton
G Tarde
G Walker
G Zaltman
GH Guyatt
GR Baker
H Nowotny
H Zuckerman
HD White
HF Lionberger
HF Moed
HG Small
HM Collins
J Beall
J Coleman
J Grimshaw
J Lomas
J Shaperman
Joanne Profetto-McGrath
John N Lavis
JS Coleman
JW Duncan
K Knorr Cetina
K Nilsson
KW McCain
L Fitzgerald
L Hamers
L Leydesdorff
LA Brown
LA Lievrouw
Lars Wallin
Linda Derksen
M Callon
M Gibbons
M Gmür
M Haider
M Mark
M Van de Vall
MA Schuster
ME Loomis
Mitroff II
N Stehr
NS Caplan
O Persson
P Allmark
PR Orszag
R Grol
R Rich
R Serpa
RB Haynes
RF Rich
RG Havelock
RK Merton
RK Yin
S Cooney
S Shapin
S Sunesson
Shannon D Scott
ST Koerner
T Hägerstrand
T Kuhn
TE Backer
TJ Allen
TW Valente
V Mahajan
WB Hart
WF Ogburn
WN Dunn
Y Okubo
Z Griliches
Publication venue: BioMed Central
Publication date: 01/11/2008

Background: It has been argued that science and society are in the midst of a far-reaching renegotiation of the social contract between science and society, with society becoming a far more active partner in the creation of knowledge. On the one hand, new forms of knowledge production are emerging, and on the other, both science and society are experiencing a rapid acceleration in new forms of knowledge utilization. Concomitantly since the Second World War, the science underpinning the knowledge utilization field has had exponential growth. Few in-depth examinations of this field exist, and no comprehensive analyses have used bibliometric methods. Methods: Using bibliometric analysis, specifically first author co-citation analysis, our group undertook a domain analysis of the knowledge utilization field, tracing its historical development between 1945 and 2004. Our purposes were to map the historical development of knowledge utilization as a field, and to identify the changing intellectual structure of its scientific domains. We analyzed more than 5,000 articles using citation data drawn from the Web of Science®. Search terms were combinations of knowledge, research, evidence, guidelines, ideas, science, innovation, technology, information theory and use, utilization, and uptake. Results: We provide an overview of the intellectual structure and how it changed over six decades. The field does not become large enough to represent with a co-citation map until the mid-1960s. Our findings demonstrate vigorous growth from the mid-1960s through 2004, as well as the emergence of specialized domains reflecting distinct collectives of intellectual activity and thought. Until the mid-1980s, the major domains were focused on innovation diffusion, technology transfer, and knowledge utilization. Beginning slowly in the mid-1980s and then growing rapidly, a fourth scientific domain, evidence-based medicine, emerged. The field is dominated in all decades by one individual, Everett Rogers, and by one paradigm, innovation diffusion. Conclusion: We conclude that the received view that social science disciplines are in a state where no accepted set of principles or theories guide research (i.e., that they are pre-paradigmatic) could not be supported for this field. Second, we document the emergence of a new domain within the knowledge utilization field, evidence-based medicine. Third, we conclude that Everett Rogers was the dominant figure in the field and, until the emergence of evidence-based medicine, his representation of the general diffusion model was the dominant paradigm in the field.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Use of Web mining for an actualized and coherent chatterbot dialogue

Author: Dosquet Benjamin
Magnant Xavier
Publication date: 01/01/2004

Repository of the University of Namur

Proceedings of the International Workshop on Web Information Systems Modeling: WISM 2006

Author: Frasincar Flavius
Houben Geert-Jan
Thiran Philippe
Publication date: 01/01/2006

Repository of the University of Namur

Cluster Analysis of Legal Documents

Author: Boreham J
Publication date: 22/02/2022

Single-link cluster analysis has been used to provide classifications of several collections of legal documents, based on various characteristics of the text. Each document was represented in terms of the chosen characteristics by a vector whose elements were the frequencies of occurrence of the characteristics in that document. The values of similarity between documents were determined by calculating the cosine of the angle between each pair of document vectors. The clustering algorithm then operated on these similarity coefficients to group documents which were most similar. A suite of computer programs was written to perform the classification. Four programs were required to (a) select the document descriptors from the full text of the documents, (b) construct document vectors, (c) calculate similarity coefficients, and (d) perform single-link clustering. Three classification experiments were performed. The first classified the full text of both the English and French versions of the Treaties of the Council of Europe. The words of the full text, taken singly and in pairs, were used to describe the treaties, and the two cases of including and excluding the 'common' words were investigated. The best classification was based on single words with common words excluded. Since each treaty was a lengthy collection of non-homogeneous clauses, it was thought that a classification of the individual articles would be more useful. In this case the formal and non-formal clauses clustered separately, whereas before the formal clauses, present in every treaty, had caused semantically unrelated treaties to be brought together. During the course of this study an opportunity arose to investigate the use of cluster analysis to test the trustworthiness of certain oral confessions presented as evidence in criminal proceedings. The common or function words, which are generally agreed to characterise the style of an author, were used as document descriptors for two sets of statements, one which the defendant admitted, the other which he was alleged to have made but which he denied. The two sets of statements clustered separately, indicating a difference in style. On the basis of this and other comparative tests it was possible to say that the disputed statements were unlikely to have been made by the defendant. The third experiment involved the use of the marginal citations in Statutes as document descriptors. Statutes were regarded as semantically related if they cited the same Acts. The Public General Acts of Parliament for the three years 1973 - 1975 were successfully clustered into groups of related Acts.
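The pipeline in steps (b) to (d) can be sketched in a few lines. This is an illustrative reconstruction, not the original suite of programs: it assumes whitespace-tokenized word frequencies as descriptors and a hypothetical similarity `threshold`, and it relies on the fact that single-link clusters at a given similarity level are the connected components of the graph linking every pair of documents at or above that level.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def single_link_clusters(docs, threshold):
    """Group document indices by single-link clustering at one level."""
    # (b) Build a frequency vector for each document.
    vectors = [Counter(d.lower().split()) for d in docs]
    parent = list(range(len(docs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # (c) + (d) Merge any two clusters joined by one sufficiently
    # similar pair of documents (the single-link criterion).
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Made-up document titles, for illustration only.
    docs = [
        "treaty on extradition",
        "extradition treaty protocol",
        "fisheries act amendment",
        "fisheries act",
    ]
    print(single_link_clusters(docs, threshold=0.5))
```

Running the function at several thresholds, from high to low, would trace out the levels of the single-link hierarchy.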

Kent Academic Repository

Integrating and conceptualizing heterogeneous ontologies on the web

Author: GOH HAI KIAT
Publication date: 21/12/2006

Master's thesis, Master of Science

ScholarBank@NUS