15 research outputs found

    Using sounds and sonifications for astronomy outreach

    Good astronomy pictures, like those of the HST, play an important and well-known role in astronomy outreach, triggering curiosity and interest. The same aim can also be achieved by means of sounds. Here we present the use of astronomy-related sounds and data sonifications in astronomy outreach. These sounds, which people are unlikely to hear in the normal course of things, are a good tool for stimulating interest when teaching astronomy. In our case, sounds are successfully used in “The sounds of science,” a weekend science-dissemination program heard on the principal national radio station, Radio Nacional de España (RNE). But teachers can also easily make use of these sounds in the classroom, since only a simple cassette player is needed.

    Scaling and universality in the human voice

    Speech is a distinctive complex feature of human capabilities. In order to understand the physics underlying speech production, in this work we empirically analyse the statistics of large human speech datasets spanning several languages. We first show that during speech the energy is unevenly released and power-law distributed, reporting a universal robust Gutenberg–Richter-like law in speech. We further show that such ‘earthquakes in speech’ show temporal correlations, as the interevent statistics are again power-law distributed. As this feature takes place in the intraphoneme range, we conjecture that the process responsible for this complex phenomenon is not cognitive, but resides in the physiological (mechanical) mechanisms of speech production. Moreover, we show that these waiting time distributions are scale invariant under a renormalization group transformation, suggesting that the process of speech generation is indeed operating close to a critical point. These results contrast with current paradigms in speech processing, which point towards low-dimensional deterministic chaos as the origin of nonlinear traits in speech fluctuations. As these latter fluctuations are indeed the aspects that humanize synthetic speech, these findings may have an impact on future speech synthesis technologies. Results are robust and independent of the communication language or the number of speakers, pointing towards a universal pattern and yet another hint of complexity in human speech.

    Phase transitions in number theory: from the birthday problem to Sidon sets

    In this work, we show how number theoretical problems can be fruitfully approached with the tools of statistical physics. We focus on g-Sidon sets, which describe sequences of integers whose pairwise sums are different, and propose a random decision problem which addresses the probability that a random set of k integers is g-Sidon. First, we provide numerical evidence showing that there is a crossover between satisfiable and unsatisfiable phases which converts to an abrupt phase transition in a properly defined thermodynamic limit. Initially assuming independence, we then develop a mean-field theory for the g-Sidon decision problem. We further improve the mean-field theory, which is only qualitatively correct, by incorporating deviations from independence, yielding results in good quantitative agreement with the numerics both for finite systems and in the thermodynamic limit. Connections between the generalized birthday problem in probability theory, the number theory of Sidon sets and the properties of q-Potts models in condensed matter physics are briefly discussed.
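
    The random decision problem described in this abstract can be illustrated with a small Monte Carlo sketch. It rests on assumptions that the abstract does not confirm: a set is treated as g-Sidon when no pairwise sum a + b (with a <= b) occurs more than g times, and the k integers are drawn without replacement from 1..m. The names `is_g_sidon` and `estimate_sidon_probability` are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations_with_replacement


def is_g_sidon(values, g=1):
    """True if no pairwise sum a + b (a <= b) occurs more than g times.

    Assumed convention: g = 1 is the classical Sidon condition.
    """
    pairs = combinations_with_replacement(sorted(values), 2)
    counts = Counter(a + b for a, b in pairs)
    return all(c <= g for c in counts.values())


def estimate_sidon_probability(k, m, g=1, trials=2000, seed=0):
    """Fraction of random k-subsets of {1, ..., m} that are g-Sidon."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = rng.sample(range(1, m + 1), k)  # distinct integers (assumption)
        hits += is_g_sidon(sample, g)
    return hits / trials


if __name__ == "__main__":
    # Sweep k at fixed m to see the satisfiable-to-unsatisfiable crossover.
    for k in range(2, 12):
        print(k, estimate_sidon_probability(k, m=200, trials=500))
```

    Sweeping k at fixed m, as in the usage loop, is one simple way to look for the crossover between phases that the abstract mentions; it is not the mean-field theory developed in the paper.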

    To the sun and beyond

    Tell Bartolo Luque and Fernando Ballesteros how far the Sun is from the Earth, and they will tell you the size of the Universe.

    Is speech a self-organized critical signal?

    Throughout the twentieth century, studies in quantitative linguistics have shown the emergence of power laws in languages, first in written texts and later in speech. These laws appear ubiquitous and robust, but why do they appear in language? Are they spurious results due to the arbitrariness of word segmentation, or are they truly universals of complex communication? Can we investigate the presence of these laws in other animal communication systems whose code we do not know? Interdisciplinary and transdisciplinary approaches in linguistics and in the study of communication systems seem essential. By way of example, we present two recent studies carried out on acoustic corpora of up to sixteen languages using a general signal segmentation method (the threshold method). We explore the possibility that the statistical laws that emerge in language are the product of a self-organized critical system, like other phenomena present in Nature. The threshold method presented here makes it possible to analyze any type of signal without knowing its encoding or segmentation. This opens new avenues in linguistic research, making it possible, among other things, to carry out comparative studies between human language and other animal communication systems.
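
    The abstract does not spell out the threshold method, so the following is only a sketch under stated assumptions: the instantaneous energy of a sample is taken to be its square, an event is a maximal run of samples whose energy exceeds a fixed threshold, and waiting times are the gaps between consecutive events. The function name `threshold_events` is hypothetical.

```python
def threshold_events(signal, threshold):
    """Segment a signal into events using an energy threshold.

    Assumptions (not taken from the source): energy of a sample is x * x,
    and an event is a maximal run of samples with energy above `threshold`.

    Returns (events, waiting_times). Each event is (start, end, energy) with
    `end` exclusive; waiting_times are the sample gaps between events.
    """
    events = []
    start = None
    energy = 0.0
    for i, x in enumerate(signal):
        e = x * x
        if e > threshold:
            if start is None:
                # A new event begins at this sample.
                start, energy = i, 0.0
            energy += e
        elif start is not None:
            # The run above threshold has just ended.
            events.append((start, i, energy))
            start = None
    if start is not None:
        # The signal ended while an event was still open.
        events.append((start, len(signal), energy))
    waiting_times = [b[0] - a[1] for a, b in zip(events, events[1:])]
    return events, waiting_times


if __name__ == "__main__":
    toy_signal = [0, 2, 3, 0, 0, 1, 4, 0]
    print(threshold_events(toy_signal, threshold=1.5))
```

    Because the segmentation uses only the signal amplitude, it needs no knowledge of words or phonemes, which is the property the abstract emphasizes.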

    Log-normal distribution in acoustic linguistic units

    In this work we verify that acoustically transcribed durations of linguistic units at several scales (phonemes, words and breath groups) follow a log-normal distribution. To do this we have used a well-known corpus which contains conversational speech by native English speakers, comprising approximately 3 × 10^5 words with time-aligned phonetic labels. Secondly, we explain this log-normal distribution using a new model: a Non-interacting Cascade Approach (NICA) model. The NICA model can explain the emergence of log-normal distributions across linguistic levels (words, breath groups) solely based on the assumption that phoneme durations are also log-normal. We find an extremely good quantitative agreement between the NICA and the experimental duration distributions for phonemes and words, but the agreement is less spectacular in the case of breath groups. Finally, we discuss our results and justify our recommendation to work with medians instead of mean values (which assume a Gaussian distribution) to avoid biases and erroneous conclusions in statistical learning studies based on acoustic elements with long-tailed distributions.
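
    The recommendation to use medians can be illustrated numerically. For a log-normal variable with parameters mu and sigma, the median is exp(mu) while the mean is exp(mu + sigma^2 / 2), so the mean is pulled upward by the long tail. The sketch below uses illustrative parameter values that are not taken from the paper, and the function name `lognormal_summary` is hypothetical; it does not implement the NICA model.

```python
import math
import random
import statistics


def lognormal_summary(mu, sigma, n=20000, seed=0):
    """Compare sample mean and median of simulated log-normal durations.

    mu and sigma are the parameters of the underlying normal distribution;
    the values used below are illustrative assumptions only.
    """
    rng = random.Random(seed)
    samples = [rng.lognormvariate(mu, sigma) for _ in range(n)]
    return {
        "sample_mean": statistics.fmean(samples),
        "sample_median": statistics.median(samples),
        "theory_mean": math.exp(mu + sigma ** 2 / 2),
        "theory_median": math.exp(mu),
    }


if __name__ == "__main__":
    for name, value in lognormal_summary(mu=-2.5, sigma=0.6).items():
        print(f"{name}: {value:.4f}")
```

    The gap between mean and median grows with sigma, which is why a mean-based summary can misrepresent the typical duration of long-tailed acoustic units.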

    Linguistic laws in Catalan

    In this work, we explore and review linguistic laws in the case of Catalan, going from prelinguistic to higher linguistic levels and addressing both speech and writing. We show evidence supporting the theory that linguistic laws are universal patterns in human language that are more robust in the oral corpus than in writing. This reinforces the “physical hypothesis,” which argues that linguistic laws could have a physiological and biophysical origin, and that they are reflected in written texts as a consequence of speech symbolization. However, future work is necessary to increase empirical evidence by deeply analyzing other language corpora, propose new cognitive and physical models that clarify the mathematical formulation of some statistical patterns, find more evidence to explain the relationship between prelinguistic and higher linguistic levels, and understand the results reported to date from a global interdisciplinary perspective of language theory. I.G.T. and A.H.-F. were supported by the project PRO2020-S03 (RCO03080 Lingüística Quantitativa) and PRO2021-S03HERNANDEZ by Institut d’Estudis Catalans. A.H.-F. was also supported by the grant TIN2017-89244-R (MACDA) (Ministerio de Economía, Industria y Competitividad, Gobierno de España).

    A statistical model from information theory to explain Zipf's law of brevity

    Brevity and frequency are two crucial factors in the processes of statistical learning. The compression principle has previously been used to explain the origin of Zipf’s law for the frequency of words. Here we use a model from information theory to also explain Zipf’s law of abbreviation, the statistical tendency of more frequent elements in language to be shorter (in characters in the case of written language, and in time duration for oral communication). As far as we know, we show for the first time that Zipf’s law of abbreviation is a global speech process that holds in words regardless of the linguistic units of study. In addition, the model derived from information theory allows us to fit empirical linguistic data considering both acoustic elements (phonemes, words and sentences) and their transcripts. This suggests that the processes measured in units of written text are a byproduct of spontaneous speech patterns. The more a word is used, the greater the compression effort that makes it shorter; but also, the shorter it is, the more often it will statistically be used. This work paves the way for new experimental approaches to the study of statistical learning.