Computer-aided diagnosis: Detection and localization of prostate cancer within the peripheral zone
Reliable Generation of EHR Time Series via Diffusion Models
Electronic Health Records (EHRs) are rich sources of patient-level data,
including laboratory tests, medications, and diagnoses, offering valuable
resources for medical data analysis. However, concerns about privacy often
restrict access to EHRs, hindering downstream analysis. Researchers have
explored various methods for generating privacy-preserving EHR data. In this
study, we introduce a new method for generating diverse and realistic synthetic
EHR time series data using Denoising Diffusion Probabilistic Models (DDPM). We
conducted experiments on six datasets, comparing our proposed method with eight
existing methods. Our results demonstrate that our approach significantly
outperforms all existing methods in terms of data utility while requiring less
training effort. Our approach also enhances downstream medical data analysis by
providing diverse and realistic synthetic EHR data.
Spitzer View of Massive Star Formation in the Tidally Stripped Magellanic Bridge
The Magellanic Bridge is the nearest low-metallicity, tidally stripped
environment, offering a unique high-resolution view of physical conditions in
merging and forming galaxies. In this paper we present analysis of candidate
massive young stellar objects (YSOs), i.e., in situ, current massive star
formation (MSF) in the Bridge using Spitzer mid-IR and complementary
optical and near-IR photometry. While we definitely find YSOs in the Bridge,
the most massive are less massive than those found in the Large
Magellanic Cloud (LMC). The intensity of MSF in the Bridge also appears
decreasing, as the most massive YSOs are less massive than those formed in the
past. To investigate environmental effects on MSF, we have compared properties
of massive YSOs in the Bridge to those in the LMC. First, YSOs in the Bridge
are apparently less embedded than in the LMC: 81% of Bridge YSOs show optical
counterparts, compared to only 56% of LMC sources with the same range of mass,
circumstellar dust mass, and line-of-sight extinction. Circumstellar envelopes
are evidently more porous or clumpy in the Bridge's low-metallicity
environment. Second, we have used whole samples of YSOs in the LMC and the
Bridge to estimate the probability of finding YSOs at a given H I column
density, N(HI). We found that the LMC has a higher probability than
the Bridge at high N(HI), but the trend reverses at
lower N(HI). Investigating whether this lower efficiency relative to HI is due
to less efficient molecular cloud formation, or less efficient cloud collapse,
or both, will require sensitive molecular gas observations.
Comment: 41 pages, 20 figures, 6 tables; accepted for publication in ApJ;
several figures are in low resolution due to the size limit here and a high
resolution version can be downloaded via
http://www.astro.virginia.edu/~cc5ye/ms_bridge20140215.pd
Recent Advances in Machine Learning for Network Automation in the O-RAN
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY), https://creativecommons.org/licenses/by/4.0/
The evolution of network technologies has witnessed a paradigm shift toward open and intelligent networks, with the Open Radio Access Network (O-RAN) architecture emerging as a promising solution. O-RAN introduces disaggregation and virtualization, enabling network operators to deploy multi-vendor, interoperable solutions. However, managing and automating the complex O-RAN ecosystem presents numerous challenges. To address this, machine learning (ML) techniques have gained considerable attention in recent years, offering promising avenues for network automation in O-RAN. This paper presents a comprehensive survey of current research efforts on network automation using ML in O-RAN. We begin by providing an overview of the O-RAN architecture and its key components, highlighting the need for automation. Subsequently, we delve into O-RAN support for ML techniques. The survey then explores challenges in network automation using ML within the O-RAN environment, followed by existing research studies discussing the application of ML algorithms and frameworks for network automation in O-RAN. The survey further discusses research opportunities by identifying important aspects where ML techniques can be beneficial.
Peer reviewed
A Spitzer Space Telescope far-infrared spectral atlas of compact sources in the Magellanic Clouds. I. The Large Magellanic Cloud
[abridged] We present 52-93 micron spectra obtained with Spitzer in the
MIPS-SED mode, of a representative sample of luminous compact far-IR sources in
the LMC. These include carbon stars, OH/IR AGB stars, post-AGB objects and PNe,
RCrB-type star HV2671, OH/IR red supergiants WOHG064 and IRAS05280-6910, B[e]
stars IRAS04530-6916, R66 and R126, Wolf-Rayet star Brey3a, Luminous Blue
Variable R71, supernova remnant N49, a large number of young stellar objects,
compact HII regions and molecular cores, and a background galaxy (z~0.175). We
use the spectra to constrain the presence and temperature of cold dust and the
excitation conditions and shocks within the neutral and ionized gas, in the
circumstellar environments and interfaces with the surrounding ISM. Evolved
stars, including LBV R71, lack cold dust except in some cases where we argue
that this is swept-up ISM. This leads to an estimate of the duration of the
prolific dust-producing phase ("superwind") of several thousand years for both
RSGs and massive AGB stars, with a similar fractional mass loss experienced
despite the different masses. We tentatively detect line emission from neutral
oxygen in the extreme RSG WOHG064, with implications for the wind driving. In
N49, the shock between the supernova ejecta and ISM is revealed by its strong
[OI] 63-micron emission and possibly water vapour; we estimate that 0.2 Msun of
ISM dust was swept up. Some of the compact HII regions display pronounced
[OIII] 88-micron emission. The efficiency of photo-electric heating in the
interfaces of ionized gas and molecular clouds is estimated at 0.1-0.3%. We
confirm earlier indications of a low nitrogen content in the LMC. Evidence for
solid state emission features is found in both young and evolved objects; some
of the YSOs are found to contain crystalline water ice.
Comment: Accepted for publication in The Astronomical Journal. This paper
accompanies the Summer 2009 SAGE-Spec release of 48 MIPS-SED spectra, but
uses improved spectrum extraction. (Fig. 2 reduced resolution because of
arXiv limit.)
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation
What does it take to create the Babel Fish, a tool that can help individuals
translate speech between any two languages? While recent breakthroughs in
text-based models have pushed machine translation coverage beyond 200
languages, unified speech-to-speech translation models have yet to achieve
similar strides. More specifically, conventional speech-to-speech translation
systems rely on cascaded systems that perform translation progressively,
putting high-performing unified systems out of reach. To address these gaps, we
introduce SeamlessM4T, a single model that supports speech-to-speech
translation, speech-to-text translation, text-to-speech translation,
text-to-text translation, and automatic speech recognition for up to 100
languages. To build this, we used 1 million hours of open speech audio data to
learn self-supervised speech representations with w2v-BERT 2.0. Subsequently,
we created a multimodal corpus of automatically aligned speech translations.
Filtered and combined with human-labeled and pseudo-labeled data, we developed
the first multilingual system capable of translating from and into English for
both speech and text. On FLEURS, SeamlessM4T sets a new standard for
translations into multiple target languages, achieving an improvement of 20%
BLEU over the previous SOTA in direct speech-to-text translation. Compared to
strong cascaded models, SeamlessM4T improves the quality of into-English
translation by 1.3 BLEU points in speech-to-text and by 2.6 ASR-BLEU points in
speech-to-speech. Tested for robustness, our system performs better against
background noises and speaker variations in speech-to-text tasks compared to
the current SOTA model. Critically, we evaluated SeamlessM4T on gender bias and
added toxicity to assess translation safety. Finally, all contributions in this
work are open-sourced and accessible at
https://github.com/facebookresearch/seamless_communicatio
Using Pre-existing Microarray Datasets to Increase Experimental Power: Application to Insulin Resistance
Although they have become a widely used experimental technique for identifying differentially expressed (DE) genes, DNA microarrays are notorious for generating noisy data. A common strategy for mitigating the effects of noise is to perform many experimental replicates. This approach is often costly and sometimes impossible given limited resources; thus, analytical methods are needed which increase accuracy at no additional cost. One inexpensive source of microarray replicates comes from prior work: to date, data from hundreds of thousands of microarray experiments are in the public domain. Although these data assay a wide range of conditions, they cannot be used directly to inform any particular experiment and are thus ignored by most DE gene methods. We present the SVD Augmented Gene expression Analysis Tool (SAGAT), a mathematically principled, data-driven approach for identifying DE genes. SAGAT increases the power of a microarray experiment by using observed coexpression relationships from publicly available microarray datasets to reduce uncertainty in individual genes' expression measurements. We tested the method on three well-replicated human microarray datasets and demonstrate that use of SAGAT increased effective sample sizes by as many as 2.72 arrays. We applied SAGAT to unpublished data from a microarray study investigating transcriptional responses to insulin resistance, resulting in a 50% increase in the number of significant genes detected. We evaluated 11 (58%) of these genes experimentally using qPCR, confirming the directions of expression change for all 11 and statistical significance for three. Use of SAGAT revealed coherent biological changes in three pathways: inflammation, differentiation, and fatty acid synthesis, furthering our molecular understanding of a type 2 diabetes risk factor. 
We envision SAGAT as a means to maximize the potential for biological discovery from subtle transcriptional responses, and we provide it as a freely available software package that is immediately applicable to any human microarray study.