    Introducing a corpus of conversational stories. Construction and annotation of the Narrative Corpus

    Although widely seen as critical both in terms of its frequency and its social significance as a prime means of encoding and perpetuating moral stance and configuring self and identity, conversational narrative has received little attention in corpus linguistics. In this paper we describe the construction and annotation of a corpus that is intended to advance the linguistic theory of this fundamental mode of everyday social interaction: the Narrative Corpus (NC). The NC contains narratives extracted from the demographically-sampled sub-corpus of the British National Corpus (BNC) (XML version). It includes more than 500 narratives, socially balanced in terms of participant sex, age, and social class. We describe the extraction techniques, selection criteria, and sampling methods used in constructing the NC. Further, we describe four levels of annotation implemented in the corpus: speaker (social information on speakers), text (text IDs, title, type of story, type of embedding, etc.), textual components (pre-/post-narrative talk, narrative, and narrative-initial/final utterances), and utterance (participation roles, quotatives, and reporting modes). A brief rationale is given for each level of annotation, and possible avenues of research facilitated by the annotation are sketched out.

    Alternating gaze in multi-party storytelling

    We present a single case study on gaze alternation in three-party storytelling. The study makes use of the XML method, a ‘combinatorial approach’ (Haugh & Musgrave 2019) involving multimodal CA transcription converted into XML syntax. We approach gaze alternation via (i) the addressee-status hypothesis, (ii) the texturing hypothesis, and (iii) the acceleration hypothesis. Hypothesis (i) proposes that the storyteller looks alternately at the recipients not only when their addressee status is symmetrical but also when it is asymmetrical. Hypothesis (ii) predicts that gaze alternation ‘textures’ the telling by occurring when the storytelling progresses from one segment to another. Hypothesis (iii) states that gaze alternation accelerates toward Climax and decelerates in Post-completion sequences. The analyses support the hypotheses. They suggest that alternating gaze works against the danger of exclusion caused by the dyadic structure of conversation. It further partakes in story organization as it occurs at points of transition from one story section to another. Finally, accelerated gaze alternation constitutes an indexical process drawing the recipients’ attention to the immediate relevance of stance display (Stivers 2008). We conclude that the three hypotheses warrant further investigation to determine their generalizability across speakers and speech situations.
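One way to make the acceleration hypothesis concrete is to count gaze shifts between the two recipients per second in each story segment. The sketch below is only an illustration under stated assumptions: the interval format, the recipient labels `B` and `C`, the segment name `Background`, and all timings are invented here and are not taken from the study.

```python
# Hypothetical gaze intervals of the storyteller: (start_s, end_s, target).
# "B" and "C" are the two recipients; "away" is gaze directed elsewhere.
GAZES = [
    (0, 4, "B"), (4, 9, "C"), (9, 10, "away"),
    (10, 11, "B"), (11, 12, "C"), (12, 13, "B"), (13, 14, "C"),
    (14, 22, "B"), (22, 24, "B"),
]

# Hypothetical story segments: name -> (start_s, end_s).
SEGMENTS = {
    "Background": (0, 10),
    "Climax": (10, 14),
    "Post-completion": (14, 24),
}


def alternation_rates(gazes, segments, recipients=("B", "C")):
    """Return gaze alternations per second for each segment.

    An alternation is counted when a recipient-directed gaze targets a
    different recipient than the previous recipient-directed gaze. It is
    assigned to the segment in which the new gaze starts.
    """
    counts = {name: 0 for name in segments}
    previous = None
    for start, _end, target in gazes:
        if target not in recipients:
            continue  # gaze away from both recipients is ignored
        if previous is not None and target != previous:
            for name, (seg_start, seg_end) in segments.items():
                if seg_start <= start < seg_end:
                    counts[name] += 1
                    break
        previous = target
    return {
        name: counts[name] / (seg_end - seg_start)
        for name, (seg_start, seg_end) in segments.items()
    }


rates = alternation_rates(GAZES, SEGMENTS)
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} alternations per second")
```

With real data, the intervals would come from the gaze annotation in the transcript, and the per-segment rates could then be compared across segments.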

    Conversation Analysis and the XML method

    In this paper we introduce the XML method, a trio of technologies that can benefit conversation-analytic research. Specifically, we make a case for converting the centerpiece of CA research, the Jeffersonian transcript, into the format of the eXtensible Mark-up Language (XML). XML essentially turns documents into hierarchically ordered networks of nodes. As a network, an XML document can be exhaustively searched and any node or node set it contains can be extracted. We argue that the main benefit of formatting CA transcriptions in XML lies in the quantifiability that the format facilitates: CA-as-XML can provide precise "numbers and statistics" (Robinson 2007:65), thus helping to efficiently quantify observations and statistically substantiate claims about the 'generalizability' of observed practices of social action. We also introduce XPath and XQuery, two related query languages designed to exploit the XML format. Further, we describe XTranscript, a free online tool developed to convert completed CA transcripts to XML. Central to our approach is that the methodology be accessible to linguists of varying levels of technical experience. Therefore, we also describe how this, and common concerns relating to the treatment of spoken data, have shaped our work in this area thus far.
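To illustrate how a transcript in XML can be searched and counted, the following minimal sketch runs a few XPath-style queries with Python's standard library. The element and attribute names (`transcript`, `turn`, `w`, `pause`, `overlap`, `speaker`, `dur`) are assumptions made for this example; they are not the schema produced by XTranscript. Note that `xml.etree.ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented toy transcript; the markup scheme is hypothetical.
SAMPLE = """
<transcript>
  <turn speaker="A"><w>so</w><pause dur="0.4"/><w>yesterday</w><overlap><w>right</w></overlap></turn>
  <turn speaker="B"><overlap><w>mhm</w></overlap></turn>
  <turn speaker="A"><w>we</w><w>left</w><pause dur="1.2"/></turn>
</transcript>
"""


def count_turns_by_speaker(root):
    """Count <turn> nodes per value of the speaker attribute."""
    return Counter(turn.get("speaker") for turn in root.findall(".//turn"))


def pauses_for(root, speaker):
    """Return the durations of all pauses inside one speaker's turns."""
    path = f".//turn[@speaker='{speaker}']/pause"
    return [float(pause.get("dur")) for pause in root.findall(path)]


root = ET.fromstring(SAMPLE)
turns = count_turns_by_speaker(root)
a_pauses = pauses_for(root, "A")
# Turns that contain at least one <overlap> child element.
overlap_turns = len(root.findall(".//turn[overlap]"))

print(dict(turns))
print(a_pauses)
print(overlap_turns)
```

Because every query returns a node set, the same pattern extends from counting to extraction: the matched nodes can be exported for statistical analysis.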

    Word frequency and cognitive effort in turns-at-talk: turn structure affects processing load in natural conversation

    Frequency distributions are known to widely affect psycholinguistic processes. The effects of word frequency in turns-at-talk, the nucleus of social action in conversation, have, by contrast, been largely neglected. This study probes into this gap by applying corpus-linguistic methods to the conversational component of the British National Corpus (BNC) and the Freiburg Multimodal Interaction Corpus (FreMIC). The latter includes continuous pupil size measures of participants in the recorded conversations, allowing for a systematic investigation of patterns in the contained speech and language on the one hand and their relation to concurrent processing costs they may incur in speakers and recipients on the other hand. We test a first hypothesis in this vein, analyzing whether word frequency distributions within turns-at-talk are correlated with interlocutors' processing effort during the production and reception of these turns. Turns are found to generally show a regular distribution pattern of word frequency, with highly frequent words in turn-initial positions, mid-range frequency words in turn-medial positions, and low-frequency words in turn-final positions. Speakers' pupil size tends to increase during the course of a turn at talk, reaching a climax toward the turn end. Notably, the observed decrease in word frequency within turns is inversely correlated with the observed increase in pupil size in speakers, but not in recipients, with steeper decreases in word frequency going along with steeper increases in pupil size in speakers. We discuss the implications of these findings for theories of speech processing, turn structure, and information packaging. Crucially, we propose that the intensification of processing effort in speakers during a turn at talk is due to an informational climax, which entails a progression from high-frequency, low-information words through intermediate levels to low-frequency, high-information words. At least in English conversation, interlocutors seem to make use of this pattern as one way to achieve efficiency in conversational interaction, creating a regularly recurring distribution of processing load across speaking turns, which aids smooth turn transitions, content prediction, and effective information transfer.
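The turn-position pattern described above can be sketched as a simple binning exercise: assign each word to the initial, medial, or final third of its turn and average the log frequency per bin. This is only an illustrative sketch; the binning into thirds, the toy turns, and the frequency values are assumptions made here and do not reproduce the study's procedure or data.

```python
import math
from collections import defaultdict

# Invented corpus frequencies (occurrences per some fixed corpus size).
FREQ = {
    "i": 10000, "saw": 100, "herons": 1,
    "we": 10000, "like": 1000, "kayaking": 10,
}

# Invented turns, each given as a list of word tokens.
TURNS = [
    ["I", "saw", "herons"],
    ["we", "like", "kayaking"],
]


def position_bin(index, length):
    """Map a word index to the initial, medial, or final third of a turn."""
    rel = index / length  # relative position, 0 <= rel < 1
    if rel < 1 / 3:
        return "initial"
    if rel < 2 / 3:
        return "medial"
    return "final"


def mean_log_freq_by_position(turns, freq):
    """Average log10 word frequency per turn-position bin."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for turn in turns:
        for i, word in enumerate(turn):
            f = freq.get(word.lower())
            if f is None:
                continue  # skip words missing from the frequency list
            b = position_bin(i, len(turn))
            sums[b] += math.log10(f)
            counts[b] += 1
    return {b: sums[b] / counts[b] for b in sums}


profile = mean_log_freq_by_position(TURNS, FREQ)
for name in ("initial", "medial", "final"):
    print(f"{name}: {profile[name]:.2f}")
```

A falling profile from the initial to the final bin corresponds to the reported pattern. Relating it to pupil size would additionally require a per-turn measure of pupil change, which is not modelled here.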

    Conversational Grammar- Feminine Grammar? A Sociopragmatic Corpus Study

    One area in language and gender research that has so far received little attention is the extent to which the sexes make use of what recent corpus research has termed “conversational grammar.” The author’s initial findings have suggested that the majority of features distinctive of conversational grammar may be used predominantly by female speakers. This article reports on a study designed to test the hypothesis that conversational grammar is “feminine grammar” in the sense that women’s conversational language is more adapted to the conversational situation than men’s. Based on data from the conversational subcorpus of the British National Corpus and following the situational framework for the description of conversational features elaborated in the author’s previous research, features distinctive of conversational grammar are grouped into five functional categories and their normed frequencies compared across the sexes. The functional categories distinguish features that can be seen as adaptations to constraints set by the situational factors of (1) Shared Context, (2) Co-Construction, (3) Real-Time Processing, (4) Discourse Management, and (5) Relation Management. The study’s results, described in detail in relation to the biological category of speaker sex and cultural notions of gender, suggest that the feminine grammar hypothesis is valid.
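Comparing normed frequencies means scaling raw counts by the number of words each group produced, so that groups with different amounts of speech become comparable. The sketch below shows that arithmetic only. The feature names, counts, word totals, and the norming base of 1,000 words are all invented for illustration and are not the study's figures.

```python
def normed_frequency(raw_count, total_words, per=1000):
    """Scale a raw count to occurrences per `per` words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return raw_count / total_words * per


def compare_by_sex(feature_counts, word_totals, per=1000):
    """Return normed frequencies per feature and speaker sex."""
    return {
        feature: {
            sex: normed_frequency(counts[sex], word_totals[sex], per)
            for sex in counts
        }
        for feature, counts in feature_counts.items()
    }


# Invented word totals and raw feature counts (hypothetical values).
WORD_TOTALS = {"female": 200_000, "male": 100_000}
FEATURE_COUNTS = {
    "tag question": {"female": 400, "male": 150},
    "discourse marker": {"female": 3000, "male": 1200},
}

normed = compare_by_sex(FEATURE_COUNTS, WORD_TOTALS)
for feature, by_sex in normed.items():
    print(feature, {sex: round(value, 2) for sex, value in by_sex.items()})
```

In this toy example the female group has the higher raw counts, but only the normed values show whether that reflects more frequent use or simply more speech.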

    Network-based quantitative trait linkage analysis of microbiome composition in inflammatory bowel disease families

    Introduction: Inflammatory bowel disease (IBD) is characterized by a dysbiosis of the gut microbiome that results from the interaction of the constituting taxa with one another, and with the host. At the same time, host genetic variation is associated with both IBD risk and microbiome composition. Methods: In the present study, we defined quantitative traits (QTs) from modules identified in microbial co-occurrence networks to measure the inter-individual consistency of microbial abundance and subjected these QTs to a genome-wide quantitative trait locus (QTL) linkage analysis. Results: Four microbial network modules were consistently identified in two cohorts of healthy individuals, but three of the corresponding QTs differed significantly between IBD patients and unaffected individuals. The QTL linkage analysis was performed in a sub-sample of the Kiel IBD family cohort (IBD-KC), an ongoing study of 256 German families comprising 455 IBD patients and 575 first- and second-degree, non-affected relatives. The analysis revealed five chromosomal regions linked to one of three microbial module QTs, namely on chromosomes 3 (spanning 10.79 cM) and 11 (6.69 cM) for the first module, chr9 (0.13 cM) and chr16 (1.20 cM) for the second module, and chr13 (19.98 cM) for the third module. None of these loci have been implicated in a microbial phenotype before. Discussion: Our study illustrates the benefit of combining network and family-based linkage analysis to identify novel genetic drivers of microbiome composition in a specific disease context.

    Paternal chronic colitis causes epigenetic inheritance of susceptibility to colitis

    Inflammatory bowel disease (IBD) arises by unknown environmental triggers in genetically susceptible individuals. Epigenetic regulation of gene expression may integrate internal and external influences and may thereby modulate disease susceptibility. Epigenetic modification may also affect the germ-line and in certain contexts can be inherited by offspring. This study investigates epigenetic alterations consequent to experimental murine colitis induced by dextran sodium sulphate (DSS), and their paternal transmission to offspring. Genome-wide methylome- and transcriptome-profiling of intestinal epithelial cells (IECs) and sperm cells of males of the F0 generation, which received either DSS and consequently developed colitis (F0(DSS)), or non-supplemented tap water (F0(Ctrl)) and hence remained healthy, and of their F1 offspring was performed using reduced representation bisulfite sequencing (RRBS) and RNA-sequencing (RNA-Seq), respectively. Offspring of F0(DSS) males exhibited aberrant methylation and expression patterns of multiple genes, including Igf1r and Nr4a2, which are involved in energy metabolism. Importantly, DSS colitis in F0(DSS) mice was associated with decreased body weight at baseline of their F1 offspring, and these F1 mice exhibited increased susceptibility to DSS-induced colitis compared to offspring from F0(Ctrl) males. This study hence demonstrates epigenetic transmissibility of metabolic and inflammatory traits resulting from experimental colitis. This study was carried out as part of the Research Training Group “Genes, Environment and Inflammation”, supported by the Deutsche Forschungsgemeinschaft (RTG 1743/1) of which A.F. is the spokesperson, the European Research Council under the European Community’s Seventh Framework Programme (FP7/2007–2013)/ERC Grant agreement no. 260961 (A.K.), the Austrian Science Fund and Ministry of Science P21530-B18 and START Y446-B18 (A.K.), the Wellcome Trust (investigator award 106260/Z/14/Z) to A.K., the Cambridge Biomedical Research Centre (A.K.), a fellowship from the European Crohn’s and Colitis Organisation (M.T. and T.E.A.) and a DOC fellowship from the Austrian Academy of Sciences (J.K.). This is the final version of the article. It first appeared from Nature Publishing Group via http://dx.doi.org/10.1038/srep3164

    Detailed stratified GWAS analysis for severe COVID-19 in four European populations

    Given the highly variable clinical phenotype of Coronavirus disease 2019 (COVID-19), a deeper analysis of the host genetic contribution to severe COVID-19 is important to improve our understanding of underlying disease mechanisms. Here, we describe an extended genome-wide association meta-analysis of a well-characterized cohort of 3255 COVID-19 patients with respiratory failure and 12 488 population controls from Italy, Spain, Norway and Germany/Austria, including stratified analyses based on age, sex and disease severity, as well as targeted analyses of chromosome Y haplotypes, the human leukocyte antigen region and the SARS-CoV-2 peptidome. By inversion imputation, we traced a reported association at 17q21.31 to a ~0.9-Mb inversion polymorphism that creates two highly differentiated haplotypes and characterized the potential effects of the inversion in detail. Our data, together with the 5th release of summary statistics from the COVID-19 Host Genetics Initiative including non-Caucasian individuals, also identified a new locus at 19q13.33, including NAPSA, a gene which is expressed primarily in alveolar cells responsible for gas exchange in the lung. S.E.H. and C.A.S. partially supported genotyping through a philanthropic donation. A.F. and D.E. were supported by a COVID-19 grant from the German Federal Ministry of Education and Research (BMBF; ID: 01KI20197); A.F., D.E. and F.D. were supported by the Deutsche Forschungsgemeinschaft Cluster of Excellence ‘Precision Medicine in Chronic Inflammation’ (EXC2167). D.E. was supported by the German Federal Ministry of Education and Research (BMBF) within the framework of the Computational Life Sciences funding concept (CompLS grant 031L0165). D.E., K.B. and S.B. acknowledge the Novo Nordisk Foundation (NNF14CC0001 and NNF17OC0027594). T.L.L., A.T. and O.Ö. were funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project numbers 279645989; 433116033; 437857095. M.W. and H.E. are supported by the German Research Foundation (DFG) through the Research Training Group 1743, ‘Genes, Environment and Inflammation’. L.V. received funding from: Ricerca Finalizzata Ministero della Salute (RF-2016-02364358), Italian Ministry of Health ‘CV PREVITAL’ – strategie di prevenzione primaria cardiovascolare primaria nella popolazione italiana; The European Union (EU) Programme Horizon 2020 (under grant agreement No. 777377) for the project LITMUS and for the project ‘REVEAL’; Fondazione IRCCS Ca’ Granda ‘Ricerca corrente’, Fondazione Sviluppo Ca’ Granda ‘Liver-BIBLE’ (PR-0391), Fondazione IRCCS Ca’ Granda ‘5permille’ ‘COVID-19 Biobank’ (RC100017A). A.B. was supported by a grant from Fondazione Cariplo to Fondazione Tettamanti: ‘Bio-banking of Covid-19 patient samples to support national and international research (Covid-Bank)’. This research was partly funded by an MIUR grant to the Department of Medical Sciences, under the program ‘Dipartimenti di Eccellenza 2018–2022’. This study makes use of data generated by the GCAT-Genomes for Life, a cohort study of the Genomes of Catalonia, Fundació IGTP (The Institute for Health Science Research Germans Trias i Pujol). IGTP is part of the CERCA Program/Generalitat de Catalunya. GCAT is supported by Acción de Dinamización del ISCIII-MINECO and the Ministry of Health of the Generalitat of Catalunya (ADE 10/00026); the Agència de Gestió d’Ajuts Universitaris i de Recerca (AGAUR) (2017-SGR 529). M.M. received research funding from grant PI19/00335 Acción Estratégica en Salud, integrated in the Spanish National RDI Plan and financed by ISCIII-Subdirección General de Evaluación and the Fondo Europeo de Desarrollo Regional (European Regional Development Fund (FEDER), ‘Una manera de hacer Europa’). B.C. is supported by national grants PI18/01512. X.F. is supported by the VEIS project (001-P-001647) (co-funded by the European Regional Development Fund (ERDF), ‘A way to build Europe’). Additional data included in this study were obtained in part by the COVICAT Study Group (Cohort Covid de Catalunya) supported by IsGlobal and IGTP, European Institute of Innovation & Technology (EIT), a body of the European Union, COVID-19 Rapid Response activity 73A and SR20-01024 La Caixa Foundation. A.J. and S.M. were supported by the Spanish Ministry of Economy and Competitiveness (grant numbers: PSE-010000-2006-6 and IPT-010000-2010-36). A.J. was also supported by national grant PI17/00019 from the Acción Estratégica en Salud (ISCIII) and the European Regional Development Fund (FEDER). The Basque Biobank, a hospital-related platform that also involves all Osakidetza health centres, the Basque government’s Department of Health and Onkologikoa, is operated by the Basque Foundation for Health Innovation and Research-BIOEF. M.C. received Grants BFU2016-77244-R and PID2019-107836RB-I00 funded by the Agencia Estatal de Investigación (AEI, Spain) and the European Regional Development Fund (FEDER, EU). M.R.G., J.A.H., R.G.D. and D.M.M. are supported by the ‘Spanish Ministry of Economy, Innovation and Competition, the Instituto de Salud Carlos III’ (PI19/01404, PI16/01842, PI19/00589, PI17/00535 and GLD19/00100) and by the Andalusian government (Proyectos Estratégicos-Fondos Feder PE-0451-2018, COVID-Premed, COVID GWAs). The position held by Itziar de Rojas Salarich is funded by grant FI20/00215, PFIS Contratos Predoctorales de Formación en Investigación en Salud. Enrique Calderón’s team is supported by CIBER of Epidemiology and Public Health (CIBERESP), ‘Instituto de Salud Carlos III’. J.C.H. reports grants from Research Council of Norway grant no. 312780 during the conduct of the study. E.S. reports grants from Research Council of Norway grant no. 312769. The BioMaterialBank Nord is supported by the German Center for Lung Research (DZL), Airway Research Center North (ARCN). The BioMaterialBank Nord is a member of the popgen 2.0 network (P2N). P.K.
Bergisch Gladbach, Germany and the Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases, University of Cologne, Cologne, Germany. He is supported by the German Federal Ministry of Education and Research (BMBF). O.A.C. is supported by the German Federal Ministry of Research and Education and is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy (CECAD, EXC 2030–390661388). The COMRI cohort is funded by Technical University of Munich, Munich, Germany. This work was supported by grants of the Rolf M. Schwiete Stiftung, the Saarland University, BMBF and The States of Saarland and Lower Saxony. K.U.L. is supported by the German Research Foundation (DFG, LU-1944/3-1). Genotyping for the BoSCO study is funded by the Institute of Human Genetics, University Hospital Bonn. F.H. was supported by the Bavarian State Ministry for Science and Arts. Part of the genotyping was supported by a grant to A.R. from the German Federal Ministry of Education and Research (BMBF, grant: 01ED1619A, European Alzheimer DNA BioBank, EADB) within the context of the EU Joint Programme – Neurodegenerative Disease Research (JPND). Additional funding was derived from the German Research Foundation (DFG) grant: RA 1971/6-1 to A.R. P.R. is supported by the DFG (CCGA Sequencing Centre and DFG ExC2167 PMI and by SH state funds for COVID19 research). F.T. is supported by the Clinician Scientist Program of the Deutsche Forschungsgemeinschaft Cluster of Excellence ‘Precision Medicine in Chronic Inflammation’ (EXC2167). C.L. and J.H. are supported by the German Center for Infection Research (DZIF). T.B., M.M.B., O.W. and A.H. are supported by the Stiftung Universitätsmedizin Essen. M.A.-H. was supported by Juan de la Cierva Incorporacion program, grant IJC2018-035131-I funded by MCIN/AEI/10.13039/501100011033. E.C.S. is supported by the Deutsche Forschungsgemeinschaft (DFG; SCHU 2419/2-1).

