82 research outputs found

Which Conference Is That? A Case Study in Computer Science

Author: Demetrescu C.
Finocchi I.
Ribichini A.
Schaerf M.
Publication venue: Association for Computing Machinery
Publication date: 01/01/2022

Conferences play a major role in some disciplines such as computer science and are often used in research quality evaluation exercises. Unlike journals and books, for which ISSN and ISBN codes provide unambiguous keys, recognizing the conference series in which a paper was published is a rather complex endeavor: there is no unique code assigned to conferences, and the way their names are written may vary greatly across years and catalogs. In this article, we propose a technique for the entity resolution of conferences based on the analysis of different semantic parts of their names. We present the results of an investigation of our technique on a dataset of 42,395 distinct computer science conference names excerpted from the DBLP computer science repository, which we automatically link to different authority files. With suitable data cleaning, the precision of our record linkage algorithm can be as high as 94%. A comparison with results obtainable using state-of-the-art general-purpose record linkage algorithms rounds off the article, showing that our ad hoc solution largely outperforms them in terms of the quality of the results.
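To make the idea of "semantic parts" of a conference name more concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the stopword list, the ordinal and year patterns, the Jaccard comparison, and the names `split_name` and `same_series` are all assumptions chosen for illustration.

```python
import re

# Hypothetical patterns and stopwords, chosen for illustration only.
ORDINAL = re.compile(r"^\d+(st|nd|rd|th)$")
YEAR = re.compile(r"^(19|20)\d{2}$")
STOPWORDS = {"the", "of", "on", "in", "and", "for",
             "proceedings", "annual", "international"}


def split_name(name):
    """Split a conference name into rough parts: year, ordinal, keywords."""
    parts = {"year": None, "ordinal": None, "keywords": []}
    for token in re.findall(r"[A-Za-z0-9]+", name.lower()):
        if YEAR.match(token):
            parts["year"] = token
        elif ORDINAL.match(token):
            parts["ordinal"] = token
        elif token not in STOPWORDS:
            parts["keywords"].append(token)
    return parts


def same_series(a, b, threshold=0.6):
    """Guess whether two names denote the same series.

    Compares only the keyword parts (Jaccard similarity), so that the
    edition number and year do not affect the decision.
    """
    ka = set(split_name(a)["keywords"])
    kb = set(split_name(b)["keywords"])
    if not ka or not kb:
        return False
    return len(ka & kb) / len(ka | kb) >= threshold
```

For example, `same_series("Proceedings of the 12th International Conference on Data Engineering, 1996", "13th Intl. Conference on Data Engineering 1997")` compares the keyword sets `{conference, data, engineering}` and `{intl, conference, data, engineering}`. A real linker would also need to handle acronyms and abbreviations, which this sketch ignores.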

Archivio della ricerca- Università di Roma La Sapienza

FlashProfile: A Framework for Synthesizing Data Profiles

Author: Gulwani Sumit
Jain Prateek
Millstein Todd
Padhi Saswat
Perelman Daniel
Polozov Oleksandr
Publication venue: Association for Computing Machinery (ACM)
Publication date: 24/10/2018

We address the problem of learning a syntactic profile for a collection of strings, i.e. a set of regex-like patterns that succinctly describe the syntactic variations in the strings. Real-world datasets, typically curated from multiple sources, often contain data in various syntactic formats. Thus, any data processing task is preceded by the critical step of data format identification. However, manual inspection of data to identify the different formats is infeasible in standard big-data scenarios. Prior techniques are restricted to a small set of pre-defined patterns (e.g. digits, letters, words, etc.), and provide no control over granularity of profiles. We define syntactic profiling as a problem of clustering strings based on syntactic similarity, followed by identifying patterns that succinctly describe each cluster. We present a technique for synthesizing such profiles over a given language of patterns, that also allows for interactive refinement by requesting a desired number of clusters. Using a state-of-the-art inductive synthesis framework, PROSE, we have implemented our technique as FlashProfile. Across 153 tasks over 75 large real datasets, we observe a median profiling time of only approximately 0.7 s. Furthermore, we show that access to syntactic profiles may allow for more accurate synthesis of programs, i.e. using fewer examples, in programming-by-example (PBE) workflows such as FlashFill.

Comment: 28 pages, SPLASH (OOPSLA) 201
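As a rough illustration of what a syntactic profile looks like, the following Python sketch groups strings by a coarse pattern built from two fixed token classes. It is not FlashProfile and does not use PROSE. It is closer to the pre-defined-pattern baselines the abstract mentions, and the names `shape` and `profile` are hypothetical.

```python
import re
from collections import defaultdict

# One token is a run of ASCII digits, a run of ASCII letters,
# or any other single character (kept as a literal).
TOKEN = re.compile(
    r"(?P<digits>[0-9]+)|(?P<letters>[A-Za-z]+)|(?P<other>.)", re.DOTALL
)


def shape(s):
    """Return a regex describing the coarse syntactic shape of s."""
    parts = []
    for m in TOKEN.finditer(s):
        if m.lastgroup == "digits":
            parts.append("[0-9]+")
        elif m.lastgroup == "letters":
            parts.append("[A-Za-z]+")
        else:
            parts.append(re.escape(m.group()))
    return "".join(parts)


def profile(strings):
    """Cluster strings by shape; each key is a pattern matching its cluster."""
    clusters = defaultdict(list)
    for s in strings:
        clusters[shape(s)].append(s)
    return dict(clusters)


if __name__ == "__main__":
    data = ["2018-10-24", "2019-01-01", "24/10/2018", "N/A"]
    for pattern, members in profile(data).items():
        print(pattern, members)
```

Because the token classes are fixed, this sketch gives no control over granularity: it cannot merge or split clusters on request, which is the limitation the abstract says FlashProfile addresses.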

arXiv.org e-Print Archive

eScholarship - University of California

Symbolic verification of message passing interface programs

Author: Bouajjani Ahmed
Cadar C.
Cimatti Alessandro
Droste Alexander
Forum MPI
Fu Xianjin
Inverso Omar
Jiang Ke
Kragl Bernhard
Krammer Bettina
Li Hongbo
Luo Ziqing
Siegel Stephen F
Yin Liangze
Publication date: 01/05/2020

Temporal Stream Logic: Synthesis beyond the Bools

Author: A Pnueli
A Solar-Lezama
B Finkbeiner
Carsten Gerstacker
EL Post
EM Clarke
H Liu
H Liu
MT Vechev
P Faymonville
R Bloem
R Dimitrova
R Ehlers
Roderick Bloem
S Lindley
T Wongpiromsarn
V Kuncak
W Jeltsch
Z Manna
Publication date: 01/01/2019

Reactive systems that operate in environments with complex data, such as mobile apps or embedded controllers with many sensors, are difficult to synthesize. Synthesis tools usually fail for such systems because the state space resulting from the discretization of the data is too large. We introduce TSL, a new temporal logic that separates control and data. We provide a CEGAR-based synthesis approach for the construction of implementations that are guaranteed to satisfy a TSL specification for all possible instantiations of the data processing functions. TSL provides an attractive trade-off for synthesis. On the one hand, synthesis from TSL, unlike synthesis from standard temporal logics, is undecidable in general. On the other hand, synthesis from TSL is scalable, because it is independent of the complexity of the handled data. Among other benchmarks, we have successfully synthesized a music player Android app and a controller for an autonomous vehicle in the Open Race Car Simulator (TORCS).

arXiv.org e-Print Archive