Rich Counter-Examples for Temporal-Epistemic Logic Model Checking
Model checking verifies that a model of a system satisfies a given property,
and otherwise produces a counter-example explaining the violation. The verified
properties are formally expressed in temporal logics. Some temporal logics,
such as CTL, are branching: they can express facts about the whole
computation tree of the model, rather than about each single linear computation.
This branching aspect is even more critical when dealing with multi-modal
logics, i.e. logics expressing facts about systems with several transition
relations. A prominent example is CTLK, a logic that reasons about temporal and
epistemic properties of multi-agent systems. In general, model checkers produce
linear counter-examples for failed properties, composed of a single computation
path of the model. But some branching properties are only poorly and partially
explained by a linear counter-example.
This paper proposes richer counter-example structures called tree-like
annotated counter-examples (TLACEs), for properties in Action-Restricted CTL
(ARCTL), an extension of CTL quantifying paths restricted in terms of actions
labeling transitions of the model. These counter-examples have a branching
structure that supports a more complete description of property violations.
Elements of these counter-examples are annotated with parts of the property to
give a better understanding of their structure. Visualization and browsing of
these richer counter-examples become a critical issue, as the number of
branches and states can grow exponentially for deeply-nested properties.
This paper formally defines the structure of TLACEs, characterizes adequate
counter-examples w.r.t. models and failed properties, and gives a generation
algorithm for ARCTL properties. It also illustrates the approach with examples
in CTLK, using a reduction of CTLK to ARCTL. The proposed approach has been
implemented, first by extending the NuSMV model checker to generate and export
branching counter-examples, secondly by providing an interactive graphical
interface to visualize and browse them.
Comment: In Proceedings IWIGP 2012, arXiv:1202.422
On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law
Out-of-distribution (OOD) testing is increasingly popular for evaluating a
machine learning system's ability to generalize beyond the biases of a training
set. OOD benchmarks are designed to present a different joint distribution of
data and labels between training and test time. VQA-CP has become the standard
OOD benchmark for visual question answering, but we discovered three troubling
practices in its current use. First, most published methods rely on explicit
knowledge of the construction of the OOD splits. They often rely on
"inverting" the distribution of labels, e.g. answering mostly 'yes' when the
common training answer is 'no'. Second, the OOD test set is used for model
selection. Third, a model's in-domain performance is assessed after retraining
it on in-domain splits (VQA v2) that exhibit a more balanced distribution of
labels. These three practices defeat the objective of evaluating
generalization, and put into question the value of methods specifically
designed for this dataset. We show that embarrassingly-simple methods,
including one that generates answers at random, surpass the state of the art on
some question types. We provide short- and long-term solutions to avoid these
pitfalls and realize the benefits of OOD evaluation.
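To make the "inverting" practice concrete, here is a minimal sketch of a baseline that ignores the image entirely and answers with the least frequent training answer for each question type. It is an illustration only, not a method from the paper: the function names, the (question_type, answer) input format, and the toy data are all assumptions.

```python
from collections import Counter, defaultdict


def fit_answer_priors(train_examples):
    """Count training answers per question type.

    train_examples: iterable of (question_type, answer) pairs
    (a hypothetical format chosen for this sketch).
    """
    priors = defaultdict(Counter)
    for question_type, answer in train_examples:
        priors[question_type][answer] += 1
    return priors


def inverted_prior_answer(priors, question_type):
    """Return the least frequent training answer for a question type.

    Ties are broken alphabetically so the result is deterministic.
    Returns None for a question type never seen in training.
    """
    counts = priors.get(question_type)
    if not counts:
        return None
    return min(counts, key=lambda answer: (counts[answer], answer))


# Toy training split in which "no" and "white" dominate.
train = [
    ("is the", "no"), ("is the", "no"), ("is the", "yes"),
    ("what color", "white"), ("what color", "white"), ("what color", "red"),
]
priors = fit_answer_priors(train)
```

A baseline like this can only score well when the test answer distribution is known to be roughly the reverse of the training one, which is the kind of knowledge about the split construction that the abstract criticizes.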
PERFORMANCE INDICATORS OF BUDGET MANAGEMENT AT UFSC IN THE PERIOD 2010 TO 2013
The objective of this work is to examine the variation in the budget management performance of the Universidade Federal de Santa Catarina (UFSC), based on indicators for the period 2010 to 2013. The research is qualitative, quantitative, descriptive, and applied in character, and draws on bibliographic and documentary sources and a case study. From secondary data collected in documents obtained from UFSC, it can be observed that the performance indicators varied and that these variations, although not significant, considerably influenced the budget. Such variation in the management of the financial resources of a public higher-education institution of UFSC's size must be monitored continuously so that appropriate actions, if necessary, are taken in a timely manner and corrections are made, thus avoiding greater losses.
A study on like-attracts-like versus elitist selection criterion for human-like social behavior of memetic multiagent systems
Memetic multiagent systems have emerged as an enhanced version of multiagent systems, built on meme-inspired computational agents. They aim to evolve human-like behavior of multiple agents by exploiting Dawkins' notion of a meme and Universal Darwinism. Previous research has developed a computational framework in which a series of memetic operations have been designed for implementing human-like agents. This paper focuses on improving the human-like behavior of multiple agents when they are engaged in social interactions. The improvement mainly concerns how an agent should learn from others and adapt its behavior in a complex dynamic environment. In particular, we design a new mechanism that governs how an agent selects one of the other agents to learn from. The selection is a trade-off between the elitist and like-attracts-like principles. We demonstrate the desirable interactions of multiple agents in two problem domains.
T-DepExp: Simulating transitive dependence based coalition formation
In this paper, we introduce the T-DepExp system to simulate transitive dependence based coalition formation (CF). It is a multi-agent based simulation (MABS) tool that aims to enhance cooperation between agents through transitive dependence. Transitive dependence was previously introduced by An and his colleagues to express the indirect dependence between agents in their cooperation, but it did not receive much attention. Although it has a few problems that need to be addressed, we propose our own mechanism to increase the efficiency of transitive dependence based CF. To simulate dependence relationships in a multi-agent system (MAS), we have included two fundamental dependence relationships in this MABS tool: AND-Dependence and OR-Dependence. In addition, the architecture of the T-DepExp system is presented and discussed. It allows possible integration of other features such as a budget mechanism and a trust model. Subsequently, the hypotheses for the experiments and the experimental setup are explained. The functionality of the overall system is demonstrated and the experimental results are discussed.
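As a rough illustration of how AND-Dependence and OR-Dependence might combine along a transitive chain, the sketch below checks whether an agent can achieve its goal within a candidate coalition. The semantics are an assumption made for this example (AND requires every listed agent, OR requires at least one), and the data layout and the function name can_achieve are hypothetical; none of this is taken from T-DepExp itself.

```python
def can_achieve(agent, deps, available, _visiting=frozenset()):
    """Check whether `agent` can achieve its goal inside a coalition.

    deps: dict mapping an agent to a list of (kind, [agents]) entries,
          where kind is "AND" or "OR" (assumed semantics, see lead-in).
    available: set of agents in the candidate coalition.
    An agent succeeds if it is available and each of its dependence
    entries is satisfied, recursively (i.e. transitively).
    """
    # Unavailable agents fail; revisiting an agent means a circular
    # dependence, which this sketch treats as unsatisfiable.
    if agent not in available or agent in _visiting:
        return False
    visiting = _visiting | {agent}
    for kind, others in deps.get(agent, []):
        results = (can_achieve(o, deps, available, visiting) for o in others)
        satisfied = all(results) if kind == "AND" else any(results)
        if not satisfied:
            return False
    return True


# "a" needs both "b" and "c"; "b" in turn needs either "d" or "e".
deps = {
    "a": [("AND", ["b", "c"])],
    "b": [("OR", ["d", "e"])],
}
```

Here "a" depends on "d" only indirectly, through "b", which is the transitive case the abstract refers to. Treating cycles as failures is a simplification for this sketch.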