Search CORE

85 research outputs found

The Impact of Name-Matching and Blocking on Author Disambiguation

Author: Davidson Ian
De Carvalho Ana Paula
Galvez Carmen
Ioannidis Yannis E.
Kardes Hakan
Kurien Biji T.
Publication venue: New York
Publication date: 01/01/2018

In this work, we address the problem of blocking in the context of author name disambiguation. We describe a framework that formalizes different ways of name-matching to determine which names could potentially refer to the same author. We focus on name variations that arise from specifying a name with different degrees of completeness (i.e., full first name or only an initial). We extend this framework with a simple way to define traditional, new, and custom blocking schemes. Then, we evaluate different old and new schemes in the Web of Science. In this context, we define and compare a new type of blocking scheme. Based on these results, we discuss the question of whether name-matching can be used in blocking evaluation as a replacement for annotated author identifiers. Finally, we argue that blocking can have a strong impact on the application and evaluation of author disambiguation.
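The idea of matching names of different completeness, and of grouping them into blocks, can be illustrated with a minimal sketch. This is not the framework from the paper: the "Last, First" input format, the prefix rule in `names_match`, and the function names are all assumptions made for illustration. The blocking key used here (last name plus first initial) is one common traditional scheme.

```python
from collections import defaultdict


def parse(name):
    """Split a 'Last, First' string into lowercase (last, first) parts.

    The 'Last, First' format is an assumption of this sketch.
    A trailing period is dropped, so 'Y.' becomes 'y'.
    """
    last, _, first = name.partition(",")
    return last.strip().lower(), first.strip().rstrip(".").lower()


def names_match(a, b):
    """Hypothetical matching rule: same last name, and one first-name
    string is a prefix of the other (e.g. 'y' versus 'yannis e')."""
    last_a, first_a = parse(a)
    last_b, first_b = parse(b)
    if last_a != last_b:
        return False
    return first_a.startswith(first_b) or first_b.startswith(first_a)


def block_by_last_name_first_initial(names):
    """Group names into blocks keyed by (last name, first initial)."""
    blocks = defaultdict(list)
    for name in names:
        last, first = parse(name)
        blocks[(last, first[:1])].append(name)
    return dict(blocks)


if __name__ == "__main__":
    names = ["Ioannidis, Yannis E.", "Ioannidis, Y.", "Howe, Bill"]
    print(names_match(names[0], names[1]))
    print(block_by_last_name_first_initial(names))
```

Only names that land in the same block would later be compared by a disambiguation step, which is why the choice of blocking key affects both cost and recall.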

Crossref

ZENODO

SSOAR - Social Science Open Access Repository

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

The GMAP: a versatile tool for physical data independence

Author: Marvin H. Solomon
Odysseas G. Tsatalos
Yannis E. Ioannidis
Publication venue: 'Springer Science and Business Media LLC'

Crossref

QueryVis: Logic-based diagrams help users understand complicated SQL queries faster

Author: Ahadi Alireza
Chatzopoulou Gloria
Fish Andrew
Howe Bill
Howse John
Ioannidis Yannis E.
Jaakkola Hannu
Khoussainova Nodira
Sakia RM
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 23/04/2020

Understanding the meaning of existing SQL queries is critical for code maintenance and reuse. Yet SQL can be hard to read, even for expert users or the original creator of a query. We conjecture that it is possible to capture the logical intent of queries in automatically-generated visual diagrams that can help users understand the meaning of queries faster and more accurately than SQL text alone. We present initial steps in that direction with visual diagrams that are based on the first-order logic foundation of SQL and can capture the meaning of deeply nested queries. Our diagrams build upon a rich history of diagrammatic reasoning systems in logic and were designed using a large body of human-computer interaction best practices: they are minimal in that no visual element is superfluous; they are unambiguous in that no two queries with different semantics map to the same visualization; and they extend previously existing visual representations of relational schemata and conjunctive queries in a natural way. An experimental evaluation involving 42 users on Amazon Mechanical Turk shows that with only a 2-3 minute static tutorial, participants could interpret queries meaningfully faster with our diagrams than when reading SQL alone. Moreover, we have evidence that our visual diagrams result in participants making fewer errors than with SQL. We believe that more regular exposure to diagrammatic representations of SQL can give rise to a pattern-based and thus more intuitive use and re-use of SQL. All details on the experimental study, the evaluation stimuli, raw data, and analyses, and source code are available at https://osf.io/mycr2. Comment: Full version of paper appearing in SIGMOD 202

arXiv.org e-Print Archive

Crossref

Query Optimization

Author: Yannis E. Ioannidis
Publication date: 01/01/1996

Imagine yourself standing in front of an exquisite buffet filled with numerous delicacies. Your goal is to try them all out, but you need to decide in what order. What exchange of tastes will maximize the overall pleasure of your palate? Although much less pleasurable and subjective, that is the type of problem that query optimizers are called to solve. Given a query, there are many plans that a database management system (DBMS) can follow to process it and produce its answer. All plans are equivalent in terms of their final output but vary in their cost, i.e., the amount of time that they need to run. What is the plan that needs the least amount of time? Such query optimization is absolutely necessary in a DBMS. The cost difference between two alternatives can be enormous. For example, consider the following database schema, which will be..
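The claim that equivalent plans can differ enormously in cost can be illustrated with a toy calculation. The sketch below is not the schema or cost model the abstract goes on to describe: the relation names (`emp`, `dept`, `job`), their cardinalities, the join selectivities, and the cost measure (sum of intermediate result sizes for a left-deep join order) are all hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical relation sizes (number of rows).
CARD = {"emp": 10_000, "dept": 1_000, "job": 500}

# Hypothetical join selectivities; pairs not listed have no join
# predicate, so joining them is a cross product (selectivity 1).
SEL = {
    frozenset(("emp", "dept")): Fraction(1, 1_000),
    frozenset(("emp", "job")): Fraction(1, 500),
}


def plan_cost(order):
    """Cost of a left-deep join order, taken here to be the sum of the
    estimated sizes of all intermediate (and final) join results."""
    joined = [order[0]]
    size = CARD[order[0]]
    cost = 0
    for rel in order[1:]:
        sel = Fraction(1)
        for prev in joined:
            sel *= SEL.get(frozenset((prev, rel)), Fraction(1))
        size = size * CARD[rel] * sel
        cost += size
        joined.append(rel)
    return cost


if __name__ == "__main__":
    for order in permutations(CARD):
        print(" JOIN ".join(order), "->", plan_cost(order))
```

Under these assumptions every order produces the same final result size, but orders that start with the `dept` and `job` cross product build a far larger intermediate result than orders that join through `emp` first.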

CiteSeerX

Commutativity and its Role in the Processing of Linear Recursion

Author: Ioannidis Yannis E
Publication venue: University of Wisconsin-Madison Department of Computer Sciences
Publication date: 01/01/1988

Minds@University of Wisconsin