Conformance Checking for Pushdown Reactive Systems based on Visibly Pushdown Languages
Testing pushdown reactive systems is important to guarantee a precise and robust software development process. Such systems can be specified in the formalism of Input/Output Visibly Pushdown Labeled Transition Systems (IOVPTSs), where the interaction with the environment is regulated by a pushdown memory. Conformance checking can then be applied in a testing process to verify whether an implementation complies with a specification under an appropriate conformance relation. In this work we establish a novel conformance relation based on Visibly Pushdown Languages (VPLs) that can model sets of desirable and undesirable behaviors of systems. Further, we show that test suites with complete fault coverage can be generated using this conformance relation for pushdown reactive systems.
Comment: arXiv admin note: substantial text overlap with arXiv:2107.1142
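As a rough illustration of the visibly pushdown setting this abstract builds on: in a VPL the alphabet is partitioned into call, return, and internal symbols, and the stack is driven entirely by that partition. The sketch below uses a hypothetical alphabet and a single-state acceptor, not the paper's IOVPTS construction, to show the stack discipline.

```python
# Minimal visibly pushdown automaton sketch (hypothetical alphabet).
# Calls push, returns pop, internals leave the stack alone -- the
# partition of the alphabet fully determines the stack discipline.
CALLS, RETURNS, INTERNALS = {"<"}, {">"}, {"a", "b"}

def vpa_accepts(word):
    stack = []
    for sym in word:
        if sym in CALLS:
            stack.append(sym)
        elif sym in RETURNS:
            if not stack:           # unmatched return: reject
                return False
            stack.pop()
        elif sym not in INTERNALS:  # symbol outside the alphabet
            return False
    return not stack                # accept iff every call was matched

print(vpa_accepts("<a<b>>"))  # True: calls and returns nest properly
print(vpa_accepts("<a>>"))    # False: extra return
```

Because the input symbol alone decides whether the stack is pushed or popped, VPLs retain the closure and decidability properties that make complete fault-coverage arguments tractable.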
Characterizing Faults on Real-Time Systems Based on Grid Automata
Real-time systems are, in general, critical systems that interact with the environment through input and output events regulated by time constraints. The testing activity on systems of this nature requires rigorous approaches due to their critical aspects. Model-based testing approaches rely on formalisms that provide more reliability to testing activities. However, a model-based testing approach for real-time systems depends on techniques that can appropriately handle the continuous evolution of time. Several testing approaches apply discretization techniques in order to represent the continuous behavior of timed models. Test suites can then be extracted from discretized models to support conformance testing between specifications and their respective implementations. Therefore, evaluating test suites with respect to fault coverage is an important task, but one rarely addressed by model-based testing approaches for real-time systems. In this work we propose a systematic strategy to identify faults in TIOA models based on their corresponding discretized models. We precisely define a fault model to support model-based testing activities such as coverage analysis and test case generation.
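To give a flavor of the discretization step the abstract refers to: a common construction samples clock valuations on a uniform grid of some granularity g, so each continuous guard is represented by finitely many grid points. The sketch below is illustrative only (the guard bounds and granularity are made up, not taken from the paper); it enumerates the grid valuations of one clock satisfying a guard 1 <= x <= 2.

```python
from fractions import Fraction

def grid_points(lo, hi, g):
    """Grid valuations in [lo, hi] with granularity g (exact arithmetic)."""
    lo, hi, g = Fraction(lo), Fraction(hi), Fraction(g)
    k = -(-lo // g)  # ceil(lo / g): index of the first multiple of g >= lo
    points = []
    while k * g <= hi:
        points.append(k * g)
        k += 1
    return points

# Guard 1 <= x <= 2 sampled with granularity 1/2 (illustrative values):
# yields the three grid points 1, 3/2, and 2.
print(grid_points(1, 2, Fraction(1, 2)))
```

Exact rational arithmetic matters here: floating-point sampling can miss or duplicate boundary valuations, which would distort the fault coverage computed on the discretized model.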
In Defense of Cross-Encoders for Zero-Shot Retrieval
Bi-encoders and cross-encoders are widely used in many state-of-the-art retrieval pipelines. In this work we study the generalization ability of these two types of architectures across a wide range of parameter counts, in both in-domain and out-of-domain scenarios. We find that the number of parameters and early query-document interactions of cross-encoders play a significant role in the generalization ability of retrieval models. Our experiments show that increasing model size results in marginal gains on in-domain test sets, but much larger gains in new domains never seen during fine-tuning. Furthermore, we show that cross-encoders largely outperform bi-encoders of similar size on several tasks. On the BEIR benchmark, our largest cross-encoder surpasses a state-of-the-art bi-encoder by more than 4 average points. Finally, we show that using bi-encoders as first-stage retrievers provides no gains over a simpler retriever such as BM25 on out-of-domain tasks. The code is available at https://github.com/guilhermemr04/scaling-zero-shot-retrieval.git
Comment: arXiv admin note: substantial text overlap with arXiv:2206.0287
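The bi-encoder/cross-encoder distinction the abstract hinges on can be sketched with toy scorers: a bi-encoder encodes query and document independently and compares the two fixed representations, while a cross-encoder reads the pair jointly and can exploit query-document term interactions. The encoders below are deliberately simplistic stand-ins (bag-of-words counts and a toy proximity bonus), not the trained transformers from the paper.

```python
from collections import Counter

def embed(text):
    # Stand-in encoder: bag-of-words term counts instead of a trained model.
    return Counter(text.lower().split())

def bi_encoder_score(query, doc):
    # Query and document are encoded independently; the only interaction
    # is the final similarity between the two fixed representations.
    q, d = embed(query), embed(doc)
    return sum(q[t] * d[t] for t in q)

def cross_encoder_score(query, doc):
    # The model sees the pair jointly, so it can condition on exact
    # query-document term interactions (here: a toy adjacency bonus).
    score = bi_encoder_score(query, doc)
    tokens = doc.lower().split()
    q_terms = set(query.lower().split())
    for a, b in zip(tokens, tokens[1:]):
        if a in q_terms and b in q_terms:  # adjacent query terms in the doc
            score += 1
    return score

q = "zero shot retrieval"
d = "zero shot retrieval with rerankers"
print(bi_encoder_score(q, d), cross_encoder_score(q, d))  # prints "3 5"
```

The extra signal the cross-encoder gets from joint reading is exactly the "early query-document interaction" the abstract credits for better out-of-domain generalization; the price is that scores cannot be precomputed per document.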
InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval
Recently, InPars introduced a method to efficiently use large language models
(LLMs) in information retrieval tasks: via few-shot examples, an LLM is induced
to generate relevant queries for documents. These synthetic query-document
pairs can then be used to train a retriever. However, InPars and, more
recently, Promptagator, rely on proprietary LLMs such as GPT-3 and FLAN to
generate such datasets. In this work we introduce InPars-v2, a dataset
generator that uses open-source LLMs and existing powerful rerankers to select
synthetic query-document pairs for training. A simple BM25 retrieval pipeline
followed by a monoT5 reranker finetuned on InPars-v2 data achieves new
state-of-the-art results on the BEIR benchmark. To allow researchers to further
improve our method, we open source the code, synthetic data, and finetuned
models: https://github.com/zetaalphavector/inPars/tree/master/tp
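The selection step described above — keep only the synthetic query-document pairs that a strong reranker scores highly — can be sketched as plain top-k filtering. The scoring function below is a placeholder (simple term overlap); in the paper this role is played by a trained reranker, and the data, thresholds, and examples here are invented for illustration.

```python
def filter_pairs(pairs, reranker_score, keep_top_k):
    """Keep the top-k synthetic (query, doc) pairs by reranker score."""
    scored = sorted(pairs, key=lambda p: reranker_score(*p), reverse=True)
    return scored[:keep_top_k]

# Placeholder reranker: term overlap (the paper uses a trained reranker).
def toy_score(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

synthetic = [
    ("what is bm25", "BM25 is a ranking function used in retrieval"),
    ("unrelated query", "BM25 is a ranking function used in retrieval"),
]
kept = filter_pairs(synthetic, toy_score, keep_top_k=1)
print(kept[0][0])  # prints "what is bm25": the query that matches its document
```

Filtering LLM-generated pairs this way trades dataset size for label quality: a noisy generator is acceptable as long as the reranker used for selection is reliable.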
No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval
Recent work has shown that small distilled language models are strong competitors to models that are orders of magnitude larger and slower in a wide range of information retrieval tasks. Due to latency constraints, this has made distilled and dense models the go-to choice for deployment in real-world retrieval applications. In this work, we question this practice by showing that the number of parameters and early query-document interaction play a significant role in the generalization ability of retrieval models. Our experiments show that increasing model size results in marginal gains on in-domain test sets, but much larger gains in new domains never seen during fine-tuning. Furthermore, we show that rerankers largely outperform dense retrievers of similar size on several tasks. Our largest reranker reaches the state of the art on 12 of the 18 datasets of the Benchmark-IR (BEIR) and surpasses the previous state of the art by 3 average points. Finally, we confirm that in-domain effectiveness is not a good indicator of zero-shot effectiveness. Code is available at https://github.com/guilhermemr04/scaling-zero-shot-retrieval.gi