73 research outputs found
The localization spread and polarizability of rings and periodic chains
The localization spread gives a criterion to decide between metallic and insulating behavior of a material. It is defined as the second moment cumulant of the many-body position operator, divided by the number of electrons. Different operators are used for systems treated with open or periodic boundary conditions. In particular, in the case of periodic systems, we use the complex position definition, which was already used in similar contexts for the treatment of both classical and quantum situations. In this study, we show that the localization spread evaluated on a finite ring system of radius R with open boundary conditions leads, in the large R limit, to the same formula derived by Resta and co-workers [C. Sgiarovello, M. Peressi, and R. Resta, Phys. Rev. B 64, 115202 (2001)] for 1D systems with periodic Born-von Kármán boundary conditions. A second formula, alternative to Resta’s, is also given based on the sum-over-states formalism, allowing for an interesting generalization to polarizability and other similar quantities.
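As a reading aid, the verbal definition in the abstract above (second moment cumulant of the many-body position operator, divided by the number of electrons) can be written out for the open-boundary, one-dimensional case. The symbols are assumptions introduced here and are not taken from the paper: N is the number of electrons, Ψ the many-body wavefunction, and X the sum of the single-electron position operators.

```latex
\[
\lambda^2 \;=\; \frac{1}{N}\left( \langle \Psi | \hat{X}^2 | \Psi \rangle
  - \langle \Psi | \hat{X} | \Psi \rangle^2 \right),
\qquad
\hat{X} \;=\; \sum_{i=1}^{N} \hat{x}_i .
\]
```

This sketch covers only the open-boundary definition; the periodic case described in the abstract replaces the position operator with a complex position and is not reproduced here.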
Controlling the accuracy of the density matrix renormalization group method: The Dynamical Block State Selection approach
We have applied the momentum space version of the Density Matrix Renormalization Group method (k-DMRG) in quantum chemistry in order to study the accuracy of the algorithm in the new context. We have shown numerically that it is possible to determine the desired accuracy of the method in advance of the calculations by dynamically controlling the truncation error and the number of block states using a novel protocol which we dubbed Dynamical Block State Selection (DBSS). The relationship between the real error and truncation error has been studied as a function of the number of orbitals and the fraction of filled orbitals. We have calculated the ground state of the molecules CH2, H2O, and F2 as well as the first excited state of CH2. Our largest calculations were carried out with 57 orbitals, the largest number of block states was 1500 to 2000, and the largest dimension of the Hilbert space of the superblock configuration was 800,000 to 1,200,000.
Actividades extracurriculares en el aprendizaje de una lengua extranjera
Núcleo (journal of the Universidad Central de Venezuela), special issue
Making Way in Corpus-based Interpreting Studies
The idea of editing a volume entirely focused on corpus-based interpreting studies was first discussed following the First Forlì International Workshop on Corpus-based interpreting studies: The State of the Art, which was held at the Forlì Campus of the University of Bologna on May 7th and 8th 2015. This event gathered more than 100 scholars from different parts of the world with the aim of sharing their corpus-based research endeavors, ranging from studies that exploited fully machine-readable corpora to small collections of texts or transcripts for manual analysis. This volume serves a dual purpose. On the one hand, it aims at promoting the understanding of the interpretation process and product based not on anecdotal observations or small case studies, but on comparatively large datasets of professional interpretations, mostly stored and queried according to standard corpus linguistics methodologies. The volume showcases descriptions of and studies on major interpreting corpora available to date: the EPIC Corpus and its offspring EPTIC (which also includes translations) developed at the University of Bologna, EPICG from the University of Ghent (Belgium) and the TIC Corpus from the University of Poznań (Poland); the 2249i Corpus, the DIRSI Corpus and the IMITES Corpus, again from the University of Bologna (Italy); the CorIT Corpus from the University of Trieste (Italy); the FOOTIE Corpus created at UNINT University in Rome (Italy); the NAIST Corpus from the Nara Institute of Science and Technology (Japan) and the CEIPPC Corpus, which was built at the Guangdong University of Foreign Studies (China). On the other hand, the volume is also intended as a renewed call (after Miriam Shlesinger’s first call in 1998) to the research community to further develop the field of corpus-based interpreting studies by offering scholars more corpus-based data and methodologies to compile their own corpora according to their research designs.
European Parliament Interpreting Corpus (EPIC): Methodological issues and preliminary results on lexical patterns in SI
The European Parliament Interpreting Corpus (EPIC) is one of the first machine-readable corpora available in the field of Interpreting Studies. It was created in 2004/2006 by the Directionality Research Group, based at the University of Bologna at Forlì (Italy), and consists of 9 sub-corpora in total: three sub-corpora of source language speeches (Italian, English and Spanish) and six sub-corpora of simultaneously interpreted speeches, thus comprising all possible directions and combinations of the three languages involved. The present paper focuses on two main areas of Corpus-based Interpreting Studies: methodology and applied research. The first part addresses some of the main methodological issues that arise when creating a machine-readable corpus of simultaneous interpreting (SI) material, particularly in data collection and corpus design. The second part presents the main results of one of the studies carried out on EPIC material so far, namely a study of lexical patterns that draws on Laviosa’s study on lexical density and lexical variety in source and target texts of English narrative prose (Laviosa 1998b). The same methodology is applied to all EPIC material, analysed from both a comparable and a parallel perspective. The results thus obtained shed light on the role played by translation mode (written translation vs. simultaneous interpreting), language combination and language direction.
Looking for Lexical Patterns in a Trilingual Corpus of Source and Interpreted Speeches: Extended Analysis of EPIC (European Parliament Interpreting Corpus)
The research described in the present article is an extensive study of the European Parliament Interpreting Corpus (EPIC), a collection of 9 sub-corpora containing transcripts of source speeches and corresponding interpreted versions in three languages (English, Italian and Spanish). The authors investigated lexical patterns in speeches originally delivered in Spanish and speeches interpreted into Spanish (from English and Italian), by focusing on lexical density (expressed as the ratio of lexical words over the total number of running words in each sub-corpus) and lexical variety (expressed as the percentage of each sub-corpus accounted for by the 100 most frequent words). This methodology was used by Laviosa to study lexical patterns in English written (original and translated) texts. The results obtained from our Spanish sub-corpora are compared with Laviosa’s results and with the results of a previous study of ours conducted on the English and Italian materials of EPIC. Thus, EPIC is analysed as both a comparable and a parallel corpus. The complex lexical patterns emerging from the present study provide insights into the role played by mode of translation (written translation vs. simultaneous interpreting), language combination and language direction.
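The two measures named in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tiny `FUNCTION_WORDS` stop list, and the whitespace tokenisation are all assumptions made for illustration, and lexical density is reported here as a percentage rather than a raw ratio.

```python
from collections import Counter

# Hypothetical, deliberately tiny function-word list; a real study would use
# a full, language-specific list (or POS tags) to separate lexical words.
FUNCTION_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "that", "it"}


def lexical_density(tokens, function_words=FUNCTION_WORDS):
    """Percentage of running words that are lexical (non-function) words."""
    if not tokens:
        return 0.0
    lexical = sum(1 for t in tokens if t.lower() not in function_words)
    return 100.0 * lexical / len(tokens)


def top_n_coverage(tokens, n=100):
    """Percentage of running words covered by the n most frequent word forms."""
    if not tokens:
        return 0.0
    counts = Counter(t.lower() for t in tokens)
    covered = sum(count for _, count in counts.most_common(n))
    return 100.0 * covered / len(tokens)


# Toy sub-corpus: 10 running words, 5 of them function words.
tokens = "the interpreter renders the speech and the speech is short".split()
print(lexical_density(tokens))      # 50.0
print(top_n_coverage(tokens, n=2))  # 50.0 ("the" x3 + "speech" x2 out of 10)
```

With `n=100`, `top_n_coverage` corresponds to the lexical variety measure as the abstract describes it: a higher value means that fewer distinct word forms account for more of the sub-corpus.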
European Parliament Interpreting Corpus (EPIC)
The EPIC corpus is the first parallel corpus of European Parliament speeches and their corresponding simultaneous interpretations. This corpus includes source speeches in Italian, English and Spanish and interpreted speeches in all possible combinations and directions (from English into Italian and Spanish; from Italian into English and Spanish; and from Spanish into Italian and English). It contains a total of 357 speeches (177,295 words).
The EPIC corpus includes video clips of each source language speaker, audio clips of the corresponding interpreted target speeches and transcripts of all the clips. The corpus has been orthographically transcribed. Annotation includes paralinguistic features (truncated, mispronounced words, ...) and metadata (a header at the beginning of each transcript and information about the speaker and the speech). The transcripts are POS (part-of-speech) tagged and lemmatised. Non-tagged transcripts in text format are also available.
Size of the nine subcorpora in the EPIC corpus:
sub-corpus / number of speeches / total word count / % of EPIC
ORG-EN (source) / 81 / 42,705 / 25
INT-EN-IT (interpretation) / 81 / 35,765 / 20
INT-EN-ES (interpretation) / 81 / 38,066 / 21
ORG-IT (source) / 17 / 6,765 / 4
INT-IT-EN (interpretation) / 17 / 6,708 / 4
INT-IT-ES (interpretation) / 17 / 7,052 / 4
ORG-ES (source) / 21 / 14,406 / 8
INT-ES-IT (interpretation) / 21 / 12,833 / 7
INT-ES-EN (interpretation) / 21 / 12,995 / 7
TOTAL / 357 / 177,295 / 100
The EPIC corpus was developed by a multidisciplinary research group based at the Department of Interdisciplinary Studies in Translation, Languages and Cultures (University of Bologna at Forlì), involving interpreting scholars, corpus linguists and IT technicians: Mariachiara Russo (coordinator), Claudio Bendazzoli, Cristina Monti, Annalisa Sandrelli, Marco Baroni, Silvia Bernardini, Gabriele Mack, Lorenzo Piccioni, Eros Zanchetta, Elio Ballardini, Peter Mead.
Applications
Existing applications : Speech recognition ; Automatic speech recognition ; Automatic person recognition
Technical Information
Distribution medium : DVD
speech corpus
Language(s) : English >>>> Italian ; Italian >>>> English ; Spanish, Castilian >>>> English ; English >>>> Spanish, Castilian ; Spanish, Castilian >>>> Italian ; Italian >>>> Spanish, Castilian
Quantisation : 8-bit
Clipping rate percentage : 32 kHz
Source Channel : Microphone
Sound type annotation : Mispronunciation ; Truncation
Transcription entries : Orthographic
Annotation coverage : Full
Annotation level : Orthographic
Annotation language : XML
Video
Number of languages : Parallel
The localization tensor for the H2 molecule: Closed formulae for the Heitler-London and related wavefunctions and comparison with full configuration interaction
A closed analytical formula for the localization tensor of the Heitler-London and related wavefunctions of the hydrogen molecule is given. For the wavefunctions with a well-defined nature, the various contributions of the analytical expressions can be interpreted in simple terms. The results are then compared with full configuration interaction calculations, showing that the main contributions to the localization tensor for the ground state wavefunction are captured by the very simple wavefunctions considered here.