5-Amino-3-(4-pyridyl)isoxazole
In the title compound, C8H7N3O, there are two independent molecules in the asymmetric unit, in which the angles between the pyridine ring and the isoxazole ring are 35.8 (6) and 10.6 (2)°. The crystal packing is stabilized by N—H⋯N hydrogen bonds, which result in the molecules forming a two-dimensional supramolecular layer
Cell population‐based framework of genetic epidemiology in the single‐cell omics era
Genetic epidemiology is a rapidly advancing field due to the recent availability of large amounts of omics data. In recent years, it has become possible to obtain omics information at the single-cell level, so genetic epidemiological models need to be updated to integrate with single-cell expression data. In this perspective paper, we propose a cell population-based framework for genetic epidemiology in the single-cell era. In this framework, genetic diversity influences phenotypic diversity through the diversity of cell population profiles, which are defined as high-dimensional probability distributions of the state spaces of biomolecules of each omics layer. We discuss how biomolecular experimental measurement data can capture the different properties of this distribution. In particular, single-cell data constitute a sample from this population distribution where only some coordinate values are observable. From a data analysis standpoint, we introduce methodology for feature extraction from cell population profiles. Finally, we discuss how this framework can be applied not only to genetic epidemiology but also to systems biology
Data-driven comparison of multiple high-dimensional single-cell expression profiles
Comparing multiple single-cell expression datasets such as cytometry and scRNA-seq data between case and control donors provides information to elucidate the mechanisms of disease. We propose a completely data-driven computational biological method for this task. This overcomes the challenges of conventional cellular subset-based comparisons and facilitates further analyses such as machine learning and gene set analysis of single-cell expression datasets
Human Transcription Quality Improvement
High-quality transcription data is crucial for training automatic speech
recognition (ASR) systems. However, existing industry-level data collection
pipelines are expensive for researchers, while the quality of crowdsourced
transcription is low. In this paper, we propose a reliable method to collect
speech transcriptions. We introduce two mechanisms to improve transcription
quality: confidence-estimation-based reprocessing at the labeling stage, and
automatic word error correction at the post-labeling stage. We collect and release
LibriCrowd, a large-scale crowdsourced dataset of audio transcriptions on 100
hours of English speech. Experiments show that the transcription WER is reduced by
over 50%. We further investigate the impact of transcription error on ASR model
performance and find a strong correlation. The transcription quality
improvement provides over 10% relative WER reduction for ASR models. We release
the dataset and code to benefit the research community. Comment: 5 pages, 3 figures, 5 tables, INTERSPEECH 202
Data-driven identification and classification of nonlinear aging patterns reveals the landscape of associations between DNA methylation and aging
Developed an analysis method for capturing the overall picture of nonlinear age-related changes from omics data. Kyoto University press release. 2023-02-13. [Background] Aging affects the incidence of diseases such as cancer and dementia, so the development of biomarkers for aging is an important research topic in medical science. While such biomarkers have mainly been identified based on the assumption of a linear relationship between phenotypic parameters, including molecular markers, and chronological age, numerous nonlinear changes between markers and aging have been identified. However, the overall landscape of the patterns in nonlinear changes that exist in aging is unknown. [Result] We propose a novel computational method, Data-driven Identification and Classification of Nonlinear Aging Patterns (DICNAP), that is based on functional data analysis to identify biomarkers for aging and potential patterns of change during aging in a data-driven manner. We applied the proposed method to large-scale, public DNA methylation data to explore the potential patterns of age-related changes in methylation intensity. The results showed that not only linear but also nonlinear changes in DNA methylation patterns exist. A monotonic demethylation pattern during aging, with its rate decreasing at around age 60, was identified as the candidate stable nonlinear pattern. We also analyzed the age-related changes in methylation variability. The results showed that the variability of methylation intensity tends to increase with age at age-associated sites. The representative variability pattern is a monotonically increasing pattern that accelerates after middle age. [Conclusion] DICNAP was able to identify the potential patterns of the changes in the landscape of DNA methylation during aging. It contributes to an improvement in our theoretical understanding of the aging process