1,204 research outputs found

    Crowdsourcing Without a Crowd: Reliable Online Species Identification Using Bayesian Models to Minimize Crowd Size

    We present an incremental Bayesian model that resolves key issues of crowd size and data quality for consensus labeling. We evaluate our method using data collected from a real-world citizen science program, BeeWatch, which invites members of the public in the United Kingdom to classify (label) photographs of bumblebees as one of 22 possible species. The biological recording domain poses two key and hitherto unaddressed challenges for consensus models of crowdsourcing: (1) the large number of potential species makes classification difficult, and (2) this is compounded by limited crowd availability, stemming from both the inherent difficulty of the task and the lack of relevant skills among the general public. We demonstrate that consensus labels can be reliably found in such circumstances with very small crowd sizes of around three to five users (i.e., through group sourcing). Our incremental Bayesian model, which minimizes crowd size by re-evaluating the quality of the consensus label following each species identification solicited from the crowd, is competitive with a Bayesian approach that uses a larger but fixed crowd size and outperforms majority voting. These results have important ecological applicability: biological recording programs such as BeeWatch can sustain themselves when resources such as taxonomic experts to confirm identifications by photo submitters are scarce (as is typically the case), and feedback can be provided to submitters in a timely fashion. More generally, our model provides benefits to any crowdsourced consensus labeling task where there is a cost (financial or otherwise) associated with soliciting a label
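The incremental mechanism described above can be sketched in a few lines, under a deliberately simplified annotator model: every user is assumed correct with the same fixed probability, whereas the paper estimates per-user quality. All numbers below are illustrative.

```python
import math

def incremental_consensus(labels, n_species=22, p_correct=0.7, threshold=0.95):
    """Solicit labels one at a time and stop once the posterior for the
    leading species exceeds `threshold`.  Assumes every annotator is
    correct with the same probability `p_correct` and errs uniformly
    otherwise (a simplification of the paper's per-user quality model)."""
    wrong = (1.0 - p_correct) / (n_species - 1)
    log_post = {s: 0.0 for s in range(n_species)}   # uniform prior
    best, n_used, conf = None, 0, 1.0 / n_species
    for n_used, lab in enumerate(labels, start=1):
        for s in log_post:
            log_post[s] += math.log(p_correct if s == lab else wrong)
        m = max(log_post.values())
        z = sum(math.exp(v - m) for v in log_post.values())
        best = max(log_post, key=log_post.get)
        conf = math.exp(log_post[best] - m) / z
        if conf >= threshold:
            break                                   # crowd is large enough
    return best, n_used, conf

# Two agreeing identifications already push the posterior past 0.95 here,
# so the consensus label is accepted with a crowd of two.
label, crowd_size, confidence = incremental_consensus([4, 4, 4])
```

The stopping rule is what keeps the crowd small: the posterior is re-evaluated after every solicited label, so easy photographs terminate after a couple of agreements while contentious ones keep recruiting users.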

    Speech Processes for Brain-Computer Interfaces

    Speech interfaces have become widely used and are integrated in many applications and devices. However, speech interfaces require the user to produce intelligible speech, which might be hindered by loud environments, concern about bothering bystanders, or a general inability to produce speech due to disabilities. Decoding a user's imagined speech instead of actual speech would solve this problem. Such a Brain-Computer Interface (BCI) based on imagined speech would enable fast and natural communication without the need to speak out loud. These interfaces could provide a voice to otherwise mute people. This dissertation investigates BCIs based on speech processes using functional Near Infrared Spectroscopy (fNIRS) and Electrocorticography (ECoG), two brain activity imaging modalities on opposing ends of an invasiveness scale. Brain activity data have low signal-to-noise ratio and complex spatio-temporal and spectral coherence. To analyze these data, techniques from the areas of machine learning, neuroscience, and Automatic Speech Recognition are combined in this dissertation to facilitate robust classification of detailed speech processes while simultaneously illustrating the underlying neural processes. fNIRS is an imaging modality based on cerebral blood flow. It only requires affordable hardware and can be set up within minutes in a day-to-day environment. Therefore, it is ideally suited for convenient user interfaces. However, the hemodynamic processes measured by fNIRS are slow in nature, and the technology therefore offers poor temporal resolution. We investigate speech in fNIRS recordings and demonstrate classification of speech processes for BCIs based on fNIRS. ECoG provides ideal signal properties by invasively measuring electrical potentials artifact-free directly on the brain surface. High spatial resolution and temporal resolution down to millisecond sampling provide localized information with accurate enough timing to capture the fast processes underlying speech production. This dissertation presents the Brain-to-Text system, which harnesses automatic speech recognition technology to decode a textual representation of continuous speech from ECoG. This could allow users to compose messages or issue commands through a BCI. While the decoding of a textual representation is unparalleled for device control and typing, direct communication is even more natural if the full expressive power of speech, including emphasis and prosody, could be conveyed. For this purpose, a second system is presented, which directly synthesizes neural signals into audible speech and could enable conversation with friends and family through a BCI. Up to now, both the Brain-to-Text and the synthesis system operate on audibly produced speech. To bridge the gap to the final frontier of neural prostheses based on imagined speech processes, we investigate the differences between audibly produced and imagined speech and present first results towards BCIs based on imagined speech processes. This dissertation demonstrates for the first time the use of speech processes as a BCI paradigm. Speech processes offer a fast and natural interaction paradigm which will help patients and healthy users alike to communicate efficiently with computers and with friends and family through BCIs.

    Brain-to-text: Decoding spoken phrases from phone representations in the brain

    It has long been speculated whether communication between humans and machines based on natural speech related cortical activity is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones or one of a few isolated words. However, until now it remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-To-Text that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity while speaking into the corresponding textual representation. Our results demonstrate that our system can achieve word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones. In conclusion, the Brain-To-Text system described in this paper represents an important step toward human-machine communication based on imagined speech
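As a toy illustration of the phone-based decoding idea (nothing like the paper's actual pipeline, which uses HMM alignment and a language model over a large vocabulary), hand-invented frame-wise phone log-likelihoods can be scored against a small lexicon; the two words and all numbers below are made up for illustration.

```python
import math

# Hypothetical two-word lexicon: each word is a sequence of phones.
LEXICON = {"bat": ["b", "ae", "t"], "pat": ["p", "ae", "t"]}

def decode_word(frame_loglik):
    """frame_loglik: one dict of phone -> log-likelihood per frame.
    Assumes frames align one-to-one with phones, a strong simplification
    of real ASR decoding, which searches over alignments."""
    best_word, best_score = None, -math.inf
    for word, phones in LEXICON.items():
        if len(phones) != len(frame_loglik):
            continue
        # total log-likelihood of this word's phone sequence
        score = sum(frame[p] for frame, p in zip(frame_loglik, phones))
        if score > best_score:
            best_word, best_score = word, score
    return best_word

# Invented "neural" evidence: frame 1 favors /b/ over /p/, so the
# decoder should prefer "bat" over "pat".
frames = [{"b": -0.2, "p": -1.8, "ae": -5.0, "t": -5.0},
          {"b": -5.0, "p": -5.0, "ae": -0.1, "t": -4.0},
          {"b": -6.0, "p": -6.0, "ae": -4.0, "t": -0.1}]
```

The reported word and phone error rates come from a far richer model, but the core idea is the same: per-phone evidence extracted from ECoG is combined over time to pick the most likely word sequence.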

    Classification of the Structural Behavior of Tall Buildings with a Diagrid Structure: A Machine Learning-Based Approach

    We study the relationship between the architectural form of tall buildings and their structural response to a conventional seismic load. A series of models are generated by varying the top and bottom plan geometries of the buildings, and a steel diagrid structure is mapped onto their skin. A supervised machine learning approach is then adopted to learn the features of this relationship. Six different classifiers, namely k-nearest neighbour, support vector machine, decision tree, ensemble method, discriminant analysis, and naive Bayes, are adopted for this purpose, targeting the structural response as the building drift, i.e., the lateral displacement at the top under the considered external excitation. By focusing on the classification of the structural response, it is shown that some classifiers, such as decision tree, k-nearest neighbour, and the ensemble method, can learn the structural behavior well and can therefore help design teams select more efficient structural solutions.
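A minimal stand-in for this classification setup, using synthetic data: the geometric features (top and bottom plan taper) and drift classes below are invented for illustration, and a 1-nearest-neighbour classifier, one of the six algorithms compared, is sketched in pure Python.

```python
import math

def knn_predict(train, query, k=1):
    """k-nearest-neighbour classification by majority vote.
    train: list of (feature_tuple, label); query: feature tuple."""
    nearest = sorted(train, key=lambda fl: math.dist(fl[0], query))
    votes = [label for _, label in nearest[:k]]
    return max(set(votes), key=votes.count)

# Hypothetical training set: (top taper, bottom taper) -> drift class
# (0 = low drift, 1 = high drift).  Values are illustrative only.
train = [((0.2, 0.8), 0), ((0.3, 0.7), 0),
         ((0.9, 0.1), 1), ((0.8, 0.2), 1)]
```

In the paper each sample would instead carry the parametrized plan geometry of a generated building, labeled by the drift computed from a structural analysis of its diagrid.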

    Clinical Prediction from Structural Brain MRI Scans: A Large-Scale Empirical Study

    Multivariate pattern analysis (MVPA) methods have become an important tool in neuroimaging, revealing complex associations and yielding powerful prediction models. Despite methodological developments and novel application domains, there has been little effort to compile benchmark results that researchers can reference and compare against. This study takes a significant step in this direction. We employed three classes of state-of-the-art MVPA algorithms and common types of structural measurements from brain Magnetic Resonance Imaging (MRI) scans to predict an array of clinically relevant variables (diagnosis of Alzheimer’s, schizophrenia, autism, and attention deficit and hyperactivity disorder; age, cerebrospinal fluid derived amyloid-β levels, and mini-mental state exam score). We analyzed data from over 2,800 subjects, compiled from six publicly available datasets. The employed data and computational tools are freely distributed (https://www.nmr.mgh.harvard.edu/lab/mripredict), making this the largest, most comprehensive, reproducible benchmark image-based prediction experiment to date in structural neuroimaging. We also make several observations regarding the factors that influence prediction performance and point to future research directions. Unsurprisingly, our results suggest that the biological footprint (effect size) has a dramatic influence on prediction performance. Though the choice of image measurement and MVPA algorithm can impact the result, there was no universally optimal selection. Intriguingly, the choice of algorithm seemed to be less critical than the choice of measurement type. Finally, our results showed that cross-validation estimates of performance, while generally optimistic, correlate well with generalization accuracy on a new dataset.
    Funding: BrightFocus Foundation (Alzheimer’s Disease pilot grant AHAF A2012333); National Institutes of Health (NIBIB K25 grant 1K25EB013649-01); National Center for Research Resources (U24 RR021382); National Institute for Biomedical Imaging and Bioengineering (R01EB006758); National Institute of Neurological Disorders and Stroke (R01 NS052585-01, 1R21NS072652-01, 1R01NS070963, R01NS083534); NIH Blueprint for Neuroscience Research (5U01-MH093765).
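The cross-validation estimates mentioned above follow the usual k-fold protocol; a minimal index splitter (a sketch, not the study's actual pipeline) looks like this:

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k folds and yield (train, test) index
    lists; the last fold absorbs any remainder when k does not divide n."""
    fold = n // k
    idx = list(range(n))
    for i in range(k):
        test = idx[i * fold:(i + 1) * fold] if i < k - 1 else idx[i * fold:]
        test_set = set(test)
        train = [j for j in idx if j not in test_set]
        yield train, test
```

The optimism the study measures arises because the folds are drawn from a single dataset: averaging accuracy over the k test folds still reflects that dataset's acquisition site and population, unlike accuracy on a genuinely new dataset.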

    ALEC: Active learning with ensemble of classifiers for clinical diagnosis of coronary artery disease

    Invasive angiography is the reference standard for coronary artery disease (CAD) diagnosis but is expensive and associated with certain risks. Machine learning (ML) using clinical and noninvasive imaging parameters can be used for CAD diagnosis to avoid the side effects and cost of angiography. However, ML methods require labeled samples for efficient training. The scarcity of labeled data and the high cost of labeling can be mitigated by active learning, which selectively queries challenging samples for labeling. To the best of our knowledge, active learning has not yet been used for CAD diagnosis. An Active Learning with Ensemble of Classifiers (ALEC) method is proposed for CAD diagnosis, consisting of four classifiers. Three of these classifiers determine whether a patient’s three main coronary arteries are stenotic or not. The fourth classifier predicts whether the patient has CAD or not. ALEC is first trained using labeled samples. For each unlabeled sample, if the outputs of the classifiers are consistent, the sample along with its predicted label is added to the pool of labeled samples. Inconsistent samples are manually labeled by medical experts before being added to the pool. The training is performed once more using the samples labeled so far. The interleaved phases of labeling and training are repeated until all samples are labeled. Compared with 19 other active learning algorithms, ALEC combined with a support vector machine classifier attained superior performance with 97.01% accuracy. Our method is also justified mathematically. We also comprehensively analyze the CAD dataset used in this paper. As part of this analysis, pairwise feature correlations are computed. The top 15 features contributing to CAD and to stenosis of the three main coronary arteries are determined. The relationship between stenosis of the main arteries is presented using conditional probabilities. The effect of considering the number of stenotic arteries on sample discrimination is investigated. The discrimination power over dataset samples is visualized by taking each of the three main coronary arteries in turn as the sample label and the two remaining arteries as sample features.
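The consistency-based labeling loop described above can be sketched with placeholder classifiers; the two toy rules and the expert callback below are invented for illustration and stand in for the paper's four trained classifiers and its medical experts.

```python
def alec_round(unlabeled, classifiers, expert):
    """One ALEC-style labeling pass: accept samples on which the
    ensemble agrees, and send disagreements to the expert callback.
    Returns the newly labeled samples and the number of expert queries."""
    labeled, queried = [], 0
    for sample in unlabeled:
        preds = {clf(sample) for clf in classifiers}
        if len(preds) == 1:            # consistent ensemble: trust it
            labeled.append((sample, preds.pop()))
        else:                          # inconsistent: ask the expert
            labeled.append((sample, expert(sample)))
            queried += 1
    return labeled, queried

# Two toy rules stand in for the trained classifiers; they disagree
# only on samples in (0, 1), so only those cost an expert query.
rules = [lambda x: x > 0, lambda x: x >= 1]
labeled, queried = alec_round([2, 0.5, -1], rules, expert=lambda x: x > 0)
```

In the full method this round alternates with retraining on the grown labeled pool until every sample is labeled, which is what keeps the number of expert queries low.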

    Presenting Suspicious Details in User-Facing E-mail Headers Does Not Improve Phishing Detection

    Phishing requires humans to fall for impersonated sources. Sender authenticity can often be inferred from e-mail header information commonly displayed by e-mail clients, such as sender and recipient details. People may be biased by convincing e-mail content, overlook these details, and subsequently fall for phishing. This study tests whether people are better at detecting phishing e-mails when they are presented only with user-facing e-mail headers instead of full e-mails. Results from a representative sample show that most phishing e-mails were detected by fewer than 30% of the participants, regardless of which e-mail part was displayed. In fact, phishing detection was worst when only e-mail headers were provided. Thus, people still fall for phishing because they do not recognize online impersonation tactics. No personal traits, e-mail characteristics, or URL interactions reliably predicted phishing detection abilities. These findings highlight the need for novel approaches to help users evaluate e-mail authenticity.

    Confounds and overestimations in fake review detection: Experimentally controlling for product-ownership and data-origin

    The popularity of online shopping is steadily increasing. At the same time, fake product reviews are published widely and have the potential to affect consumer purchasing behavior. In response, previous work has developed automated methods utilizing natural language processing approaches to detect fake product reviews. However, studies vary considerably in how well they succeed in detecting deceptive reviews, and the reasons for such differences are unclear. A contributing factor may be the multitude of strategies used to collect data, introducing potential confounds which affect detection performance. Two possible confounds are data-origin (i.e., the dataset is composed of more than one source) and product ownership (i.e., reviews written by individuals who own or do not own the reviewed product). In the present study, we investigate the effect of both confounds for fake review detection. Using an experimental design, we manipulate data-origin, product ownership, review polarity, and veracity. Supervised learning analysis suggests that review veracity (60.26-69.87%) is somewhat detectable but reviews additionally confounded with product-ownership (66.19-74.17%), or with data-origin (84.44-86.94%) are easier to classify. Review veracity is most easily classified if confounded with product-ownership and data-origin combined (87.78-88.12%). These findings are moderated by review polarity. Overall, our findings suggest that detection accuracy may have been overestimated in previous studies, provide possible explanations as to why, and indicate how future studies might be designed to provide less biased estimates of detection accuracy
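The factorial manipulation described above can be enumerated directly; the source names below are hypothetical placeholders for whatever corpora a study draws on.

```python
from itertools import product

# Fully crossed 2x2x2x2 design: every review condition combines one
# level of each manipulated factor, so each factor's effect on
# detection accuracy can be isolated rather than confounded.
FACTORS = {
    "data_origin": ["corpus_A", "corpus_B"],     # placeholder sources
    "ownership":   ["owner", "non_owner"],
    "polarity":    ["positive", "negative"],
    "veracity":    ["truthful", "deceptive"],
}
conditions = [dict(zip(FACTORS, combo)) for combo in product(*FACTORS.values())]
```

A classifier trained on data where veracity co-varies with origin or ownership picks up those cues too, which is the overestimation the study warns about; balancing cells like this keeps the veracity signal isolated.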