16,178 research outputs found

Optimization of transport protocols with path-length constraints in complex networks

Author: Boettcher Stefan
de la Lama Marta S.
Lopez Eduardo
Ramasco Jose J.
Publication venue: 'American Physical Society (APS)'
Publication date: 03/06/2010

We propose a protocol optimization technique that is applicable to both weighted and unweighted graphs. Our aim is to explore by how much a small variation around the Shortest Path or Optimal Path protocols can enhance protocol performance. Such an optimization strategy can be necessary because even though some protocols can achieve very high traffic tolerance levels, this is commonly done by enlarging the path-lengths, which may jeopardize scalability. We use ideas borrowed from Extremal Optimization to guide our algorithm, which proves to be an effective technique. Our method exploits the degeneracy of the paths or their close-weight alternatives, which significantly improves the scalability of the protocols in comparison to Shortest Paths or Optimal Paths protocols, while keeping the length or weight of the paths almost intact. This characteristic ensures that the optimized routing protocols are composed of paths that are quick to traverse, avoiding negative effects in data communication due to path-length increases that can become especially relevant when information losses are present.

Comment: 8 pages, 8 figures

arXiv.org e-Print Archive

Digital.CSIC

Word forms are structured for efficient use

Author: Atkinson
Baayen
Beddor
Bergem
Brown
Brysbaert
Chen
Coady
Dautriche
Edward Gibson
Fedzechkina
Ferreira
Ferrer-i-Cancho
Frank
Frauenfelder
Gahl
Hills
Hume
Isabelle Dautriche
Jaeger
Jusczyk
Kanwal
Kawasaki
Kyle Mahowald
Landauer
Levy
Lindblom
Luce
Magnuson
Mahowald
Manin
New
Ngon
Ohala
Pate
Piantadosi
Sadat
Shannon
Smith
Stemberger
Steven T. Piantadosi
Storkel
Swingley
Vitevitch
Zipf
Publication venue: 'Wiley'
Publication date: 01/08/2018

Zipf famously stated that, if natural language lexicons are structured for efficient communication, the words that are used the most frequently should require the least effort. This observation explains the famous finding that the most frequent words in a language tend to be short. A related prediction is that, even within words of the same length, the most frequent word forms should be the ones that are easiest to produce and understand. Using orthographics as a proxy for phonetics, we test this hypothesis using corpora of 96 languages from Wikipedia. We find that, across a variety of languages and language families and controlling for length, the most frequent forms in a language tend to be more orthographically well-formed and have more orthographic neighbors than less frequent forms. We interpret this result as evidence that lexicons are structured by language usage pressures to facilitate efficient communication.

Keywords: Lexicon; Word frequency; Phonology; Communication; Efficiency

National Science Foundation (Grant ES/N0174041/1)

DSpace@MIT

Crossref

Edinburgh Research Explorer

Collective emotions online and their influence on community life

Author: A Chmiel
A Czaplicka
A Kappas
A Tumasjan
A-L Barabási
AJ Gerber
Anna Chmiel
Arvid Kappas
Attila Szolnoki
B Kujawski
B Pang
BA Huberman
C Castellano
C Darwin
C Macdonald
F Radicchi
F Schweitzer
F Sebastiani
G Paltoglou
Georgios Paltoglou
H Rheingold
J Posner
J Suler
J Walther
J-P Onnela
Janusz A. Hołyst
Julian Sienkiewicz
Kevan Buckley
LA Feldman
M Gamon
M Mitrović
M Skowron
M Szell
M Taboada
Mike Thelwall
NH Frijda
P Krapivsky
P Sobkowicz
PJ Lang
PS Dodds
R Reisenzein
RB Zajonc
Riloff E
RIM Dunbar
S Gobron
SH Hemenover
T Wilson
W James
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 13/07/2011

E-communities, social groups interacting online, have recently become an object of interdisciplinary research. As with face-to-face meetings, Internet exchanges may not only include factual information but also emotional information - how participants feel about the subject discussed or other group members. Emotions are known to be important in affecting interaction partners in offline communication in many ways. Could emotions in Internet exchanges affect others and systematically influence quantitative and qualitative aspects of the trajectory of e-communities? The development of automatic sentiment analysis has made large scale emotion detection and analysis possible using text messages collected from the web. It is not clear if emotions in e-communities primarily derive from individual group members' personalities or if they result from intra-group interactions, and whether they influence group activities. We show the collective character of affective phenomena on a large scale as observed in 4 million posts downloaded from Blogs, Digg and BBC forums. To test whether the emotions of a community member may influence the emotions of others, posts were grouped into clusters of messages with similar emotional valences. The frequency of long clusters was much higher than it would be if emotions occurred at random. Distributions for cluster lengths can be explained by preferential processes because conditional probabilities for consecutive messages grow as a power law with cluster length. For BBC forum threads, average discussion lengths were higher for larger values of absolute average emotional valence in the first ten comments and the average amount of emotion in messages fell during discussions. Our results prove that collective emotional states can be created and modulated via Internet communication and that emotional expressiveness is the fuel that sustains some e-communities.

Comment: 23 pages including Supporting Information, accepted to PLoS ONE
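
The clustering step mentioned in this abstract (grouping messages with similar emotional valence) can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the abstract: each message carries a discrete valence label (-1, 0 or 1), "similar" means "identical label", and a cluster is a run of consecutive messages in one thread. The function names are hypothetical.

```python
from collections import Counter
from itertools import groupby


def cluster_lengths(valences):
    """Lengths of runs of consecutive messages that share a valence label.

    `valences` is a sequence of labels such as -1, 0, 1 (an assumed encoding).
    """
    return [sum(1 for _ in run) for _, run in groupby(valences)]


def length_histogram(valences):
    """Count how many clusters of each length occur in the thread."""
    return Counter(cluster_lengths(valences))


# One hypothetical thread: three positive posts, one negative,
# two neutral, two positive.
thread = [1, 1, 1, -1, 0, 0, 1, 1]
print(cluster_lengths(thread))
print(length_histogram(thread))
```

Comparing such a histogram with the one obtained after shuffling the labels would be one way to check whether long clusters are more frequent than chance predicts.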

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

PubMed Central

Decoding billions of integers per second through vectorization

Author: Aksyonoff A
Büttcher S
Jones DM
Witten IH
Publication venue: 'Wiley'
Publication date: 01/01/2015

In many important applications -- such as search engines and relational database systems -- data is stored in the form of arrays of integers. Encoding and, most importantly, decoding of these arrays consumes considerable CPU time. Therefore, substantial effort has been made to reduce costs associated with compression and decompression. In particular, researchers have exploited the superscalar nature of modern processors and SIMD instructions. Nevertheless, we introduce a novel vectorized scheme called SIMD-BP128 that improves over previously proposed vectorized approaches. It is nearly twice as fast as the previously fastest schemes on desktop processors (varint-G8IU and PFOR). At the same time, SIMD-BP128 saves up to 2 bits per integer. For even better compression, we propose another new vectorized scheme (SIMD-FastPFOR) that has a compression ratio within 10% of a state-of-the-art scheme (Simple-8b) while being two times faster during decoding.

Comment: For software, see https://github.com/lemire/FastPFor. For data, see http://boytsov.info/datasets/clueweb09gap
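
The general idea behind binary packing, storing every integer of a block with only as many bits as the largest value in that block needs, can be illustrated with a scalar Python sketch. This is not the SIMD-BP128 layout and uses no SIMD instructions. The block size of 128 is an assumption taken from the scheme's name, and all function names are hypothetical.

```python
def bit_width(values):
    """Number of bits needed by the largest value in the block."""
    return max(values, default=0).bit_length()


def pack(values, width):
    """Pack non-negative integers into one Python int, `width` bits each."""
    packed = 0
    for i, v in enumerate(values):
        if v < 0 or v.bit_length() > width:
            raise ValueError(f"{v} does not fit in {width} bits")
        packed |= v << (i * width)
    return packed


def unpack(packed, width, count):
    """Recover `count` integers of `width` bits each from a packed int."""
    mask = (1 << width) - 1
    return [(packed >> (i * width)) & mask for i in range(count)]


def encode_blocks(values, block_size=128):
    """Split the input into blocks; each block gets its own bit width."""
    blocks = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        width = bit_width(block)
        blocks.append((width, len(block), pack(block, width)))
    return blocks


def decode_blocks(blocks):
    """Inverse of encode_blocks."""
    out = []
    for width, count, packed in blocks:
        out.extend(unpack(packed, width, count))
    return out


data = list(range(300))
assert decode_blocks(encode_blocks(data)) == data
```

Because the width is chosen per block, one large value only inflates the block that contains it.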

arXiv.org e-Print Archive

R-libre

Crossref

The placement of the head that maximizes predictability. An information theoretic approach

Author: Ferrer-i-Cancho Ramon
Publication date: 01/01/2017

The minimization of the length of syntactic dependencies is a well-established principle of word order and the basis of a mathematical theory of word order. Here we complete that theory from the perspective of information theory, adding a competing word order principle: the maximization of predictability of a target element. These two principles are in conflict: to maximize the predictability of the head, the head should appear last, which maximizes the costs with respect to dependency length minimization. The implications of such a broad theoretical framework to understand the optimality, diversity and evolution of the six possible orderings of subject, object and verb are reviewed.

Comment: in press in Glottometrics

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

The optimality of word lengths. Theoretical foundations and an empirical study

Author: Bentz Christian
Casas-i-Muñoz Antoni
Cluet-i-Martinell Jordi
Ferrer-i-Cancho Ramon
Petrini Sonia
Wang Mengxue
Publication date: 05/04/2023

Zipf's law of abbreviation, namely the tendency of more frequent words to be shorter, has been viewed as a manifestation of compression, i.e. the minimization of the length of forms -- a universal principle of natural communication. Although the claim that languages are optimized has become trendy, attempts to measure the degree of optimization of languages have been rather scarce. Here we present two optimality scores that are dually normalized, namely, they are normalized with respect to both the minimum and the random baseline. We analyze the theoretical and statistical pros and cons of these and other scores. Harnessing the best score, we quantify for the first time the degree of optimality of word lengths in languages. This indicates that languages are optimized to 62 or 67 percent on average (depending on the source) when word lengths are measured in characters, and to 65 percent on average when word lengths are measured in time. In general, spoken word durations are more optimized than written word lengths in characters. Our work paves the way to measure the degree of optimality of the vocalizations or gestures of other species, and to compare them against written, spoken, or signed human languages.

Comment: On the one hand, the article has been reduced: analyses of the law of abbreviation and some of the methods have been moved to another article; appendix B has been reduced. On the other hand, various parts have been rewritten for clarity; new figures have been added to ease the understanding of the scores; new citations added. Many typos have been corrected
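
The abstract does not state the formula of its scores, so the following Python sketch shows only one plausible way to normalize a mean word length against both a minimum and a random baseline. The form (L_rand - L) / (L_rand - L_min), the choice of baselines, and the function names are assumptions made for illustration.

```python
def mean_length(probs, lengths):
    """Mean word length weighted by word probability."""
    return sum(p * l for p, l in zip(probs, lengths))


def dual_normalized_score(freqs, lengths):
    """Score that is 1 at the minimum baseline and 0 at the random baseline.

    Assumed form: (L_rand - L) / (L_rand - L_min).
    """
    total = sum(freqs)
    probs = [f / total for f in freqs]
    observed = mean_length(probs, lengths)
    # Minimum baseline: give the shortest lengths to the most frequent words.
    minimum = mean_length(sorted(probs, reverse=True), sorted(lengths))
    # Random baseline: expected mean length when the same lengths are
    # assigned to words uniformly at random, i.e. the plain average length.
    random_baseline = sum(lengths) / len(lengths)
    if random_baseline == minimum:
        raise ValueError("baselines coincide; the score is undefined")
    return (random_baseline - observed) / (random_baseline - minimum)


# Hypothetical toy lexicon: frequencies and word lengths in characters.
print(dual_normalized_score([4, 2, 1], [1, 2, 3]))  # shortest word is most frequent
print(dual_normalized_score([4, 2, 1], [3, 2, 1]))  # longest word is most frequent
```

Under this assumed form, a negative value means the observed lengths are worse than the random baseline.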

arXiv.org e-Print Archive