112 research outputs found

A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data

Author: Brom Timothy H.
Brown C. Titus
Howe Adina
Pyrkosz Alexis B.
Zhang Qingpeng
Publication date: 21/05/2012

Deep shotgun sequencing and analysis of genomes, transcriptomes, amplified single-cell genomes, and metagenomes has enabled investigation of a wide range of organisms and ecosystems. However, sampling variation in short-read data sets and high sequencing error rates of modern sequencers present many new computational challenges in data interpretation. These challenges have led to the development of new classes of mapping tools and de novo assemblers. These algorithms are challenged by the continued improvement in sequencing throughput. We here describe digital normalization, a single-pass computational algorithm that systematizes coverage in shotgun sequencing data sets, thereby decreasing sampling variation, discarding redundant data, and removing the majority of errors. Digital normalization substantially reduces the size of shotgun data sets and decreases the memory and time requirements for de novo sequence assembly, all without significantly impacting content of the generated contigs. We apply digital normalization to the assembly of microbial genomic data, amplified single-cell genomic data, and transcriptomic data. Our implementation is freely available for use and modification.

arXiv.org e-Print Archive

CiteSeerX

Representation Bias of Adolescents in AI: A Bilingual, Bicultural Study

Author: Dangol Aayushi
Hiniker Alexis
Howe Bill
Wolfe Robert
Publication date: 04/08/2024

Popular and news media often portray teenagers with sensationalism, as both a risk to society and at risk from society. As AI begins to absorb some of the epistemic functions of traditional media, we study, for teenagers in two countries speaking two languages, 1) how they are depicted by AI, and 2) how they would prefer to be depicted. Specifically, we study the biases about teenagers learned by static word embeddings (SWEs) and generative language models (GLMs), comparing these with the perspectives of adolescents living in the U.S. and Nepal. We find English-language SWEs associate teenagers with societal problems, and more than 50% of the 1,000 words most associated with teenagers in the pretrained GloVe SWE reflect such problems. Given prompts about teenagers, 30% of outputs from GPT2-XL and 29% from LLaMA-2-7B GLMs discuss societal problems, most commonly violence, but also drug use, mental illness, and sexual taboo. Nepali models, while not free of such associations, are less dominated by social problems. Data from workshops with N=13 U.S. adolescents and N=18 Nepalese adolescents show that AI presentations are disconnected from teenage life, which revolves around activities like school and friendship. Participant ratings of how well 20 trait words describe teens are decorrelated from SWE associations, with Pearson's r=.02, n.s. in English FastText and r=.06, n.s. in GloVe; and r=.06, n.s. in Nepali FastText and r=-.23, n.s. in GloVe. U.S. participants suggested AI could fairly present teens by highlighting diversity, while Nepalese participants centered positivity. Participants were optimistic that, if it learned from adolescents, rather than media sources, AI could help mitigate stereotypes. Our work offers an understanding of the ways SWEs and GLMs misrepresent a developmentally vulnerable group and provides a template for less sensationalized characterization. Comment: Accepted at Artificial Intelligence, Ethics, and Society 202

arXiv.org e-Print Archive

ML-EAT: A Multilevel Embedding Association Test for Interpretable and Transparent Social Science

Author: Hiniker Alexis
Howe Bill
Wolfe Robert
Publication date: 27/08/2024

This research introduces the Multilevel Embedding Association Test (ML-EAT), a method designed for interpretable and transparent measurement of intrinsic bias in language technologies. The ML-EAT addresses issues of ambiguity and difficulty in interpreting the traditional EAT measurement by quantifying bias at three levels of increasing granularity: the differential association between two target concepts with two attribute concepts; the individual effect size of each target concept with two attribute concepts; and the association between each individual target concept and each individual attribute concept. Using the ML-EAT, this research defines a taxonomy of EAT patterns describing the nine possible outcomes of an embedding association test, each of which is associated with a unique EAT-Map, a novel four-quadrant visualization for interpreting the ML-EAT. Empirical analysis of static and diachronic word embeddings, GPT-2 language models, and a CLIP language-and-image model shows that EAT patterns add otherwise unobservable information about the component biases that make up an EAT; reveal the effects of prompting in zero-shot models; and can also identify situations when cosine similarity is an ineffective metric, rendering an EAT unreliable. Our work contributes a method for rendering bias more observable and interpretable, improving the transparency of computational investigations into human minds and societies. Accepted at Artificial Intelligence, Ethics, and Society 202

arXiv.org e-Print Archive

Dataset Scale and Societal Consistency Mediate Facial Impression Bias in Vision-Language AI

Author: Dangol Aayushi
Hiniker Alexis
Howe Bill
Wolfe Robert
Publication date: 27/08/2024

Multimodal AI models capable of associating images and text hold promise for numerous domains, ranging from automated image captioning to accessibility applications for blind and low-vision users. However, uncertainty about bias has in some cases limited their adoption and availability. In the present work, we study 43 CLIP vision-language models to determine whether they learn human-like facial impression biases, and we find evidence that such biases are reflected across three distinct CLIP model families. We show for the first time that the degree to which a bias is shared across a society predicts the degree to which it is reflected in a CLIP model. Human-like impressions of visually unobservable attributes, like trustworthiness and sexuality, emerge only in models trained on the largest dataset, indicating that a better fit to uncurated cultural data results in the reproduction of increasingly subtle social biases. Moreover, we use a hierarchical clustering approach to show that dataset size predicts the extent to which the underlying structure of facial impression bias resembles that of facial impression bias in humans. Finally, we show that Stable Diffusion models employing CLIP as a text encoder learn facial impression biases, and that these biases intersect with racial biases in Stable Diffusion XL-Turbo. While pretrained CLIP models may prove useful for scientific studies of bias, they will also require significant dataset curation when intended for use as general-purpose models in a zero-shot setting. Accepted at Artificial Intelligence, Ethics, and Society 202

arXiv.org e-Print Archive

Sensory Communication

Author: Aviles Walter A.
Basdogan Cagatay
Beauregard G. Lee
Birtolo Dylan J.
Braida Louis D.
Brantley Merry A.
Brughera Andrew R.
Brungart Douglas S.
Chen Frederick W.
Chen Jyh-Shing
De Suvranu
Delhorne Lorraine A.
Desloge Joseph G.
DiFranco David E.
Duchnowski Paul
Durlach Nathaniel I.
Farel Alexis E.
Frisbie Joseph A.
Garnett Rebecca L.
Goldman Susan L.
Graaf Isaac
Grant Kenneth W.
Greenberg Julie E.
Hall Dorrie
Hall Seth M.
Held Richard M.
Ho Chih-Hao
Hou I-Chun A.
Howe Robert D.
Karason Steingrimur P.
Karmacharya Rabi
Kassem Salim F.
Keller Matthew B.
Kincy Bryan D.
Kjolaas Kari Anne H.
Koh Glenn
Krause Jean C.
LaMotte Robert H.
Liao Jung-Chi
Lin Gregory G.
Lum David S.
Manowitz David H.
Mansour Sharieff A.
Masaki Kinuko
Molnar Lajos
Mwanyoha Sadiki P.
Myers Amanda S.
O'Connell Michael P.
Park John
Payton Karen L.
Pfautz Jonathan D.
Plant Geoffrey L.
Power Matthew H.
Rabinowitz William M.
Raju Balasundar I.
Rankovic Christine M.
Reed Charlotte M.
Rhoads Deborah P.
Salisbury J. Kenneth
Santos Jonathan R.
Schloerb David W.
Schlueter Steven J.
Sekiyama Kaoru
Sexton Matthew G.
Shinn-Cunningham Barbara G.
Slaughter Adrienne H.
Srinivasan Mandayam A.
Sroka Jason
Stachowiak Maciej
Takeuchi Anne H.
Tambe Prasanna B.
Tan Hong Z.
Tassa Coral D.
Taylor Francis G.
Voss Kimberley J.
Vyzas Elias A.
Wiegand Thomas E. v.
Wies Evan F.
Yellin Elron A.
Zeltzer David
Zurek Patrick M.
Publication venue: Research Laboratory of Electronics (RLE) at the Massachusetts Institute of Technology (MIT)

Contains table of contents for Section 2, an introduction and reports on fourteen research projects.
National Institutes of Health Grant RO1 DC00117
National Institutes of Health Grant RO1 DC02032
National Institutes of Health/National Institute on Deafness and Other Communication Disorders Grant R01 DC00126
National Institutes of Health Grant R01 DC00270
National Institutes of Health Contract N01 DC52107
U.S. Navy - Office of Naval Research/Naval Air Warfare Center Contract N61339-95-K-0014
U.S. Navy - Office of Naval Research/Naval Air Warfare Center Contract N61339-96-K-0003
U.S. Navy - Office of Naval Research Grant N00014-96-1-0379
U.S. Air Force - Office of Scientific Research Grant F49620-95-1-0176
U.S. Air Force - Office of Scientific Research Grant F49620-96-1-0202
U.S. Navy - Office of Naval Research Subcontract 40167
U.S. Navy - Office of Naval Research/Naval Air Warfare Center Contract N61339-96-K-0002
National Institutes of Health Grant R01-NS33778
U.S. Navy - Office of Naval Research Grant N00014-92-J-184

DSpace@MIT

Meta-analysis of SHANK Mutations in Autism Spectrum Disorders: A Gradient of Severity in Cognitive Impairments.

Author: A Bremer
A Denayer
A Guilmatre
A Shcheglovitov
A Wischmeijer
Agnès Rastetter
Alexandra Afenjar
Alexandre Mathieu
Alexis Brice
AM Grabrucker
Anne Claude Tabet
Anne Polge
Anthony P. Monaco
Aurélia Jacquette
AY Hung
B Chilian
Beatrice Regnault
BJ O'Roak
BM Neale
Brigitte Assouline
C Betancur
C Lord
C Schluth-Bolard
Caroline Nava
Caroline Schluth-Bolard
Catalina Betancur
Christel Depienne
Christelle M. Durand
Christopher Gillberg
CJ Clopper
Claire S. Leblond
CM Durand
Coline Stordeur
CP Schaaf
CR Marshall
CS Leblond
D Pinto
D Sato
Daisuke Sato
Dalila Pinto
Damien Sanlaville
Delphine Heron
DH Geschwind
Diana Zelenika
Dominique Bonneau
Elena Maestrini
Elodie Ey
Fabienne Giuliano
Fanny Laffargue
François Rivier
Françoise Devillard
Frédérique Amsellem
G Huguet
GM Cooper
Gregory S. Barsh
Gudrun A. Rappold
Guillaume Huguet
Guy A. Rouleau
H Won
Hugo Peyre
I. Carina Gillberg
J Gauthier
J Peca
J Sebat
James Lespinasse
Jean Chiesa
Jennifer Howe
Jessica Guibert
JO Friedrich
JP Higgins
JT Glessner
Julie Gauthier
K Phelan
Kevin Mouzat
L Boccuto
L Soorya
L Wing
Laurence Perrin
M Wohr
M Yang
Marc Delepine
Maria Rastam
Marion Leboyer
Mark Lathrop
Mary Coleman
MC Bonaglia
MH Arons
Michael J. Schmeisser
MJ Schmeisser
N Krumm
Nathalie Lemière
P Szatmari
Patrick Edery
Peter Szatmari
Pilar Galan
PW Lane
R Delorme
R Moessner
R Toro
Richard Delorme
Richard Holt
Roberto Toro
S Berkel
S Jamain
Serge Lumbroso
SJ Sanders
Stephen W. Scherer
TC Sudhof
Thomas Bourgeron
Tobias M. Boeckers
X Wang
YH Jiang
Publication venue: Public Library of Science
Publication date: 01/01/2014

SHANK genes code for scaffold proteins located at the post-synaptic density of glutamatergic synapses. In neurons, SHANK2 and SHANK3 have a positive effect on the induction and maturation of dendritic spines, whereas SHANK1 induces the enlargement of spine heads. Mutations in SHANK genes have been associated with autism spectrum disorders (ASD), but their prevalence and clinical relevance remain to be determined. Here, we performed a new screen and a meta-analysis of SHANK copy-number and coding-sequence variants in ASD. Copy-number variants were analyzed in 5,657 patients and 19,163 controls, coding-sequence variants were ascertained in 760 to 2,147 patients and 492 to 1,090 controls (depending on the gene), and individuals carrying de novo or truncating SHANK mutations underwent an extensive clinical investigation. Copy-number variants and truncating mutations in SHANK genes were present in ∼1% of patients with ASD: mutations in SHANK1 were rare (0.04%) and present in males with normal IQ and autism; mutations in SHANK2 were present in 0.17% of patients with ASD and mild intellectual disability; mutations in SHANK3 were present in 0.69% of patients with ASD and up to 2.12% of the cases with moderate to profound intellectual disability. In summary, mutations of the SHANK genes were detected in the whole spectrum of autism with a gradient of severity in cognitive impairment. Given the rare frequency of SHANK1 and SHANK2 deleterious mutations, the clinical relevance of these genes remains to be ascertained. In contrast, the frequency and the penetrance of SHANK3 mutations in individuals with ASD and intellectual disability (more than 1 in 50) warrant its consideration for mutation screening in clinical practice.

A Week in Guatemala: Assorted Mental Souvenirs

Author: Bertranou Eleonora
Campbell Bruce
Diedrich Ernest
Echavez-Solano Nelsy
Hemmesch Joy
Howe Alexis
Livingston Michael
Sanchez-Mora Elena
Shouse-Tourino Corey
Supalla Cheri
Publication venue: DigitalCommons@CSB/SJU
Publication date: 10/02/2012

College of Saint Benedict and Saint John’s University: DigitalCommons@CSB/SJU

Ripe to be Heard: Worker Voice in the Fair Food Programme

Author: Archon Fung
Bair Jennifer
Brudney James J
De Vaus David
Gereffi Gary
Gianoni Silvia
Guild Alexis
Hayter Susan
Howe Joanna
Iosifides Theodoros
Johnston Elizabeth
Lenard Patti Tamara
Martin Phil
Mason Jennifer
Seidman Gay
Smith Annie
Tan Eugene
Vosko Leah
Wilkinson Adrian
Publication venue: Wiley
Publication date: 22/03/2021

The Fair Food Program (FFP) provides a mechanism through which agricultural workers’ collective voice is expressed, heard and responded to within global value chains. The FFP's model of worker-driven social responsibility presents an alternative to traditional corporate social responsibility. This article identifies the FFP's key components and demonstrates its resilience by identifying the ways in which the issues faced by a new group of migrant workers – recruited through a “guest-worker” scheme – were incorporated and dealt with. This case study highlights the important potential presented by the programme to address labour abuses across transnationalized labour markets while considering early replication possibilities.

Durham Research Online

Crossref

Morphological Diversity between Culture Strains of a Chlorarachniophyte, Lotharella globosa

Author: Alexis Howe
C Dietz
DJ Hibberd
Erick R. James
F Kasai
GH Gile
GI McFadden
K Ishida
MB Rogers
Patrick J. Keeling
PR Gilson
Rosemary J. Redfield
S Ota
Yoshihisa Hirakawa
Ø Moestrup
Publication venue: Public Library of Science
Publication date: 15/08/2011

Chlorarachniophytes are marine unicellular algae that possess secondary plastids of green algal origin. Although chlorarachniophytes are a small group (the phylum of Chlorarachniophyta contains 14 species in 8 genera), they have variable and complex life cycles that include amoeboid, coccoid, and/or flagellate cells. The majority of chlorarachniophytes possess two or more cell types in their life cycles, and which cell types are found is one of the principal morphological criteria used for species descriptions. Here we describe an unidentified chlorarachniophyte that was isolated from an artificial coral reef that calls this criterion into question. The life cycle of the new strain includes all three major cell types, but DNA barcoding based on the established nucleomorph ITS sequences showed it to share 100% sequence identity with Lotharella globosa. The type strain of L. globosa was also isolated from a coral reef, but is defined as completely lacking an amoeboid stage throughout its life cycle. We conclude that L. globosa possesses morphological diversity between culture strains, and that the new strain is a variety of L. globosa, which we describe as Lotharella globosa var. fortis var. nov. to include the amoeboid stage in the formal description of L. globosa. This intraspecies variation suggests that gross morphological stages may be lost rather rapidly, and specifically that the type strain of L. globosa has lost the ability to form the amoeboid stage, perhaps recently. This in turn suggests that even major morphological characters used for taxonomy of this group may be variable in natural populations, and therefore misleading.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The Francis Crick Institute

A New Chicken Genome Assembly Provides Insight into Avian Genome Structure

Author: Abrahamsen Mitch
Bouk Nathan
Brown C Titus
Burt David W
Cheng Hans H
Chow William
Dodgson Jerry B
Fillon Valerie
Fulton Janet E
Graves Tina
Hawken Rachel
Hillier LaDeana W
Howe Kerstin
Kremitzki Milinn
Kuo Richard
Lovell Peter
Mansour Tamer A
Markovic Chris
Mason Andrew S
Mello Claudio V
Miller Marcia M
Minx Patrick
Morisson Mireille
Pruitt Kim D
Pyrkosz Alexis B
Schneider Valerie
Thibaud-Nissen Francoise
Tomlinson Chad
Vignal Alain
Warren Wesley C
Wirthlin Morgan
Zimin Aleksey
Publication venue: Genetics Society of America
Publication date: 14/11/2016

The importance of the Gallus gallus (chicken) as a model organism and agricultural animal merits a continuation of sequence assembly improvement efforts. We present a new version of the chicken genome assembly (Gallus_gallus-5.0; GCA_000002315.3), built from combined long single molecule sequencing technology, finished BACs, and improved physical maps. In overall assembled bases, we see a gain of 183 Mb, including 16.4 Mb in placed chromosomes with a corresponding gain in the percentage of intact repeat elements characterized. Of the 1.21 Gb genome, we include three previously missing autosomes, GGA30, 31, and 33, and improve sequence contig length 10-fold over the previous Gallus_gallus-4.0. Despite the significant base representation improvements made, 138 Mb of sequence is not yet located to chromosomes. When annotated for gene content, Gallus_gallus-5.0 shows an increase of 4679 annotated genes (2768 noncoding and 1911 protein-coding) over those in Gallus_gallus-4.0. We also revisited the question of what genes are missing in the avian lineage, as assessed by the highest quality avian genome assembly to date, and found that a large fraction of the original set of missing genes are still absent in sequenced bird species. Finally, our new data support a detailed map of MHC-B, encompassing two segments: one with a highly stable gene copy number and another in which the gene copy number is highly variable. The chicken model has been a critical resource for many other fields of study, and this new reference assembly will substantially further these efforts.

Crossref

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

eScholarship - University of California

White Rose Research Online

ProdInra

HAL: Hyper Article en Ligne

UQ eSpace (University of Queensland)