Search CORE

1,822 research outputs found

In Search of the Long-Tail: Systematic Generation of Long-Tail Knowledge via Logical Rule Guided Search

Author: Brahman Faeze
Choi Yejin
Li Huihan
Li Xiang Lorraine
Liao Zeyi
Lu Ximing
Ning Yuting
Ren Xiang
Wang Siyuan
Zhao Wenting
Publication date: 13/11/2023

Since large language models have approached human-level performance on many tasks, it has become increasingly hard for researchers to find tasks that still challenge the models. Failure cases usually come from the long-tail distribution: data to which an oracle language model would assign a probability at the lower end of its distribution. Current methodologies such as prompt engineering and crowdsourcing are insufficient for creating long-tail examples because humans are constrained by cognitive bias. We propose a Logic-Induced-Knowledge-Search (LINK) framework for systematically generating long-tail knowledge statements. Grounded by a symbolic rule, we search for long-tail values for each variable of the rule by first prompting an LLM, then verifying the correctness of the values with a critic, and lastly pushing for the long-tail distribution with a reranker. With this framework we construct a dataset, Logic-Induced-Long-Tail (LINT), consisting of 200 symbolic rules and 50K knowledge statements spanning four domains. Human annotations find that 84% of the statements in LINT are factually correct. In contrast, ChatGPT and GPT4 struggle with directly generating long-tail statements under the guidance of logic rules, getting only 56% and 78% of their statements correct, respectively. Moreover, their "long-tail" generations in fact fall into the higher likelihood range, and thus are not really long-tail. Our findings suggest that LINK is effective for generating data in the long-tail distribution while enforcing quality. LINT can be useful for systematically evaluating LLMs' capabilities in the long-tail distribution. We challenge the models with a simple entailment classification task using samples from LINT. We find that ChatGPT and GPT4's capability in identifying incorrect knowledge drops by ~3% in the long-tail distribution compared to the head distribution.

arXiv.org e-Print Archive

Fractional embeddings and stochastic time

Author: Cresson Jacky
Inizan Pierre
Publication date: 01/07/2008

As a model problem for the study of chaotic Hamiltonian systems, we look for the effects of a long-tail distribution of recurrence times on a fixed Hamiltonian dynamics. We follow Stanislavsky's approach of Hamiltonian formalism for fractional systems. We prove that his formalism can be retrieved from the fractional embedding theory. We deduce that the fractional Hamiltonian systems of Stanislavsky stem from a particular least action principle, said to be causal. In this case, the fractional embedding becomes coherent. Comment: 11 pages

arXiv.org e-Print Archive

HAL-INSU

HAL-OBSPM

Personalized Federated Learning on Long-Tailed Data via Adversarial Feature Augmentation

Author: Huang Gang
Lu Yang
Qian Pinxin
Wang Hanzi
Publication date: 27/03/2023

Personalized Federated Learning (PFL) aims to learn a personalized model for each client based on the knowledge across all clients in a privacy-preserving manner. Existing PFL methods generally assume that the underlying global data across all clients are uniformly distributed, without considering the long-tail distribution. The joint problem of data heterogeneity and long-tail distribution in the FL environment is more challenging and severely affects the performance of personalized models. In this paper, we propose a PFL method called Federated Learning with Adversarial Feature Augmentation (FedAFA) to address this joint problem. FedAFA optimizes the personalized model for each client by producing a balanced feature set to enhance the local minority classes. The local minority class features are generated by transferring the knowledge from the local majority class features extracted by the global model in an adversarial example learning manner. The experimental results on benchmarks under different settings of data heterogeneity and long-tail distribution demonstrate that FedAFA significantly improves the personalized performance of each client compared with the state-of-the-art PFL algorithm. The code is available at https://github.com/pxqian/FedAFA. Comment: Accepted by ICASSP 202

arXiv.org e-Print Archive

Shelf space strategy in long-tail markets

Author: Anderson
Bai
Barabási
Bentley
Chen
Evans
Gillespie
Hahn
Herzog
Hwang
Lim
Mark E. Madsen
Newman
Nowak
Ormerod
Pareto
Paul Ormerod
R. Alexander Bentley
Salganik
Schelling
Schweitzer
Simon
Stigler
Tintner
Watts
Yang
Publication venue: 'Elsevier BV'
Publication date: 01/01/2008

The Internet is known to have had a powerful impact on on-line retailer strategies in markets characterised by a long-tail distribution of sales. Such retailers can exploit the long tail of the market, since they are effectively without physical limit on the number of choices on offer. Here we examine two extensions of this phenomenon. First, we introduce turnover into the long-tail distribution of sales. Although the distribution over any given period, such as a week or a month, is right-skewed and often power-law distributed, there is considerable turnover over time in the sales rankings of individual products. Second, we establish some initial results on the implications for shelf-space strategy of physical retailers in such markets. Comment: 10 pages, 3 figures

arXiv.org e-Print Archive

CiteSeerX

Crossref

A Comment on Nonextensive Statistical Mechanics

Author: KAFRI Oded
Publication venue: Journal of Economics Library
Publication date: 18/12/2016

Abstract. There is a conception that Boltzmann-Gibbs statistics cannot yield the long-tail distribution. This is the justification for the intensive research on nonextensive entropies (i.e. Tsallis entropy and others). Here the error that caused this misconception is explained, and it is shown that a long-tail distribution has existed in equilibrium thermodynamics for more than a century. Keywords. Long-tail distribution, Power Law, Zipf Law, Tsallis Entropy. JEL. C62

KSP Journals

Journal of Economics Library

M|G|∞ queue busy period length with PME distribution analysis through Laplace transform

Author: Ferreira M. A. M.
Publication venue: Universum Research E-Center
Publication date: 01/01/2018

In this article it is shown that if the busy period of an M|G|∞ queue system is PME distributed, the respective service time is a random variable with a long-tail distribution. The result is obtained through Laplace transform analysis.

Repositório Institucional do ISCTE-IUL