259 research outputs found
Genome-Wide Association Study for Plant Height and Grain Yield in Rice under Contrasting Moisture Regimes
Drought is one of the most critical environmental stresses affecting both growth and yield potential in rice. Drought resistance is a complex quantitative trait regulated by numerous small-effect loci and hundreds of genes controlling various morphological and physiological responses to drought. In this study, 270 rice landraces and cultivars were evaluated for drought resistance by measuring changes in plant height and grain yield under contrasting water regimes, and the underlying genetic architecture was then dissected by genome-wide association study (GWAS). Population structure was controlled by including the top two eigenvectors together with a kinship matrix in the GWAS model. Eighteen, five, and six associated loci were identified for plant height, grain yield per plant, and the drought-resistance coefficient, respectively. Nine known functional genes were recovered, including five for plant height (OsGA2ox3, OsGH3-2, sd-1, OsGNA1 and OsSAP11/OsDOG), two for grain yield per plant (OsCYP51G3 and OsRRMh) and two for the drought-resistance coefficient (OsPYL2 and OsGA2ox9), indicating that the results are reliable. OsGNA1 was previously reported to regulate root development; this study shows that it additionally controls both plant height and root length. Moreover, OsRLK5 is a new drought-resistance candidate gene discovered in this study: OsRLK5 mutants showed faster water-loss rates in detached leaves, and the gene plays an important role in the positive regulation of yield-related traits under drought conditions. We furthermore discovered several new loci contributing to the three investigated traits (plant height, grain yield, and drought resistance). These associated loci and genes substantially improve our knowledge of the genetic control of these traits in rice.
In addition, many drought-resistant cultivars screened in this study can be used as parental genotypes to improve the drought resistance of rice through molecular breeding.
Genome-wide Association Study (GWAS) of mesocotyl elongation based on re-sequencing approach in rice
Enhanced Semantic Segmentation Pipeline for WeatherProof Dataset Challenge
This report describes the winning solution to the WeatherProof Dataset
Challenge (CVPR 2024 UG2+ Track 3). Details regarding the challenge are
available at https://cvpr2024ug2challenge.github.io/track3.html. We propose an
enhanced semantic segmentation pipeline for this challenge. First, we improve
the semantic segmentation models: a backbone pretrained with Depth Anything
improves the UperNet and SETRMLA models, and language guidance based on both
weather and category information is added to the InternImage model. Second, we
introduce a new dataset, WeatherProofExtra, with a wider viewing angle, and
employ data augmentation methods including adverse weather and
super-resolution. Finally, effective training strategies and an ensemble method
further improve the final performance. Our solution ranked 1st on the final
leaderboard. Code will be available at
https://github.com/KaneiGi/WeatherProofChallenge
Delay-penalized transducer for low-latency streaming ASR
In streaming automatic speech recognition (ASR), it is desirable to reduce
latency as much as possible while having minimum impact on recognition
accuracy. Although a few existing methods are able to achieve this goal, they
are difficult to implement due to their dependency on external alignments. In
this paper, we propose a simple way to penalize symbol delay in the transducer
model, so that we can balance the trade-off between symbol delay and accuracy
for streaming models without external alignments. Specifically, our method adds
a small constant times (T/2 - t), where T is the number of frames and t is the
current frame, to all the non-blank log-probabilities (after normalization)
that are fed into the two-dimensional transducer recursion. For both streaming
Conformer models and unidirectional long short-term memory (LSTM) models,
experimental results show that it can significantly reduce the symbol delay
with an acceptable performance degradation. Our method achieves a similar
delay-accuracy trade-off to the previously published FastEmit, but we believe
our method is preferable because it has a better justification: it is
equivalent to penalizing the average symbol delay. Our work is open-sourced and
publicly available (https://github.com/k2-fsa/k2). Comment: Submitted to 2023
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
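The per-frame penalty described in the abstract can be sketched in NumPy; the function name, blank index, and penalty scale `lam` below are illustrative choices, not the actual k2 API:

```python
import numpy as np

def apply_delay_penalty(logits, lam, blank=0):
    """Add the delay penalty lam * (T/2 - t) to all non-blank
    log-probabilities, as described in the abstract.

    logits: (T, U, V) un-normalized scores, with T frames, U symbol
    positions, and vocabulary size V.  `lam`, `blank`, and the function
    name are illustrative assumptions, not the paper's actual API.
    """
    T = logits.shape[0]
    # normalize to log-probabilities first (stable log-softmax)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # penalty is positive for early frames (t < T/2) and negative for
    # late ones, so alignments that emit symbols earlier score higher
    t = np.arange(T).reshape(T, 1, 1)
    nonblank = np.arange(logits.shape[-1]) != blank
    penalized = log_probs.copy()
    penalized[..., nonblank] += lam * (T / 2.0 - t)
    return penalized
```

The penalized log-probabilities would then feed the usual two-dimensional transducer recursion unchanged; only the scores of non-blank emissions are shifted.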
Libriheavy: a 50,000-hour ASR corpus with punctuation, casing and context
In this paper, we introduce Libriheavy, a large-scale ASR corpus consisting
of 50,000 hours of read English speech derived from LibriVox. To the best of
our knowledge, Libriheavy is the largest freely available corpus of speech with
supervisions. Unlike other open-source datasets that provide only normalized
transcriptions, Libriheavy contains richer information such as punctuation,
casing and text context, which brings more flexibility for system building.
Specifically, we propose a general and efficient pipeline to locate, align and
segment the audio in the previously published Librilight to its corresponding
texts. Like Librilight, Libriheavy has three training subsets, small, medium,
and large, of 500 h, 5,000 h, and 50,000 h respectively. We also extract dev
and test evaluation sets from the aligned audio and guarantee that there is no
overlap of speakers or books with the training sets. Baseline systems are built
on the popular CTC-Attention and transducer models. Additionally, we
open-source our dataset creation pipeline, which can also be applied to other
audio alignment tasks. Comment: Submitted to ICASSP 2024
Delay-penalized CTC implemented based on Finite State Transducer
Connectionist Temporal Classification (CTC) suffers from the latency problem
when applied to streaming models. We argue that in the CTC lattice, the alignments
that can access more future context are preferred during training, thereby
leading to higher symbol delay. In this work we propose the delay-penalized CTC
which is augmented with latency penalty regularization. We devise a flexible
and efficient implementation based on the differentiable Finite State
Transducer (FST). Specifically, by attaching a binary attribute to CTC
topology, we can locate the frames that first emit non-blank tokens on the
resulting CTC lattice, and add the frame offsets to the log-probabilities.
Experimental results demonstrate the effectiveness of our proposed
delay-penalized CTC, which is able to balance the delay-accuracy trade-off.
Furthermore, combining it with the delay-penalized transducer enables the CTC
model to achieve better performance and lower latency. Our work is open-sourced
and publicly available at https://github.com/k2-fsa/k2. Comment: Accepted at INTERSPEECH 2023
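On a single alignment path, the rule above, adding a frame offset where a non-blank token is first emitted, can be sketched in plain Python. Note this is a per-path illustration only; the paper applies the offsets lattice-wide through a differentiable FST, and the function name and `lam` value are illustrative:

```python
def first_emission_penalty(alignment, lam, blank=0):
    """Total delay offset for one CTC alignment (one label id per frame,
    blank = 0).  A frame t "first emits" a token when its label is
    non-blank and differs from the previous frame's label; each such
    frame contributes lam * (T/2 - t) to the path score, so paths that
    emit earlier are rewarded.  Sketch only: the paper computes this on
    the whole lattice via a binary attribute on the CTC topology FST.
    """
    T = len(alignment)
    penalty = 0.0
    prev = blank
    for t, lab in enumerate(alignment):
        if lab != blank and lab != prev:
            penalty += lam * (T / 2.0 - t)
        prev = lab
    return penalty
```

For example, on a 5-frame alignment emitting token 1 at frame 1 and token 2 at frame 4, the early bonus and late penalty cancel exactly.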
PromptASR for contextualized ASR with controllable style
Prompts are crucial to large language models as they provide context
information such as topic or logical relationships. Inspired by this, we
propose PromptASR, a framework that integrates prompts in end-to-end automatic
speech recognition (E2E ASR) systems to achieve contextualized ASR with
controllable style of transcriptions. Specifically, a dedicated text encoder
encodes the text prompts and the encodings are injected into the speech encoder
by cross-attending the features from the two modalities. When using the ground
truth text from preceding utterances as content prompt, the proposed system
achieves 21.9% and 6.8% relative word error rate reductions on a book reading
dataset and an in-house dataset compared to a baseline ASR system. The system
can also take word-level biasing lists as prompt to improve recognition
accuracy on rare words. An additional style prompt can be given to the text
encoder and guide the ASR system to output different styles of transcriptions.
The code is available in icefall. Comment: Submitted to ICASSP 2024
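The injection step, speech features cross-attending to text-prompt encodings, can be sketched with single-head attention in NumPy. The projection matrices and the residual connection here are illustrative assumptions; the actual system uses learned multi-head cross-attention inside the speech encoder:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def inject_prompt(speech_feats, prompt_encodings, Wq, Wk, Wv):
    """Cross-attend speech features (queries) to text-prompt encodings
    (keys/values) and add the result residually.

    speech_feats: (T, d) speech-encoder features; prompt_encodings:
    (L, d) text-encoder outputs.  Wq/Wk/Wv are illustrative projection
    matrices, not the paper's actual parameterization.
    """
    q = speech_feats @ Wq                             # (T, d)
    k = prompt_encodings @ Wk                         # (L, d)
    v = prompt_encodings @ Wv                         # (L, d)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))    # (T, L) weights
    return speech_feats + attn @ v                    # residual injection
```

Each speech frame thus receives a convex combination of prompt encodings, which is how context such as preceding utterances or biasing words can steer recognition.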
Zipformer: A faster and better encoder for automatic speech recognition
The Conformer has become the most popular encoder model for automatic speech
recognition (ASR). It adds convolution modules to a transformer to learn both
local and global dependencies. In this work we describe a faster, more
memory-efficient, and better-performing transformer, called Zipformer. Modeling
changes include: 1) a U-Net-like encoder structure where middle stacks operate
at lower frame rates; 2) reorganized block structure with more modules, within
which we re-use attention weights for efficiency; 3) a modified form of
LayerNorm called BiasNorm allows us to retain some length information; 4) new
activation functions SwooshR and SwooshL work better than Swish. We also
propose a new optimizer, called ScaledAdam, which scales the update by each
tensor's current scale to keep the relative change about the same, and also
explicitly learns the parameter scale. It achieves faster convergence and better
performance than Adam. Extensive experiments on LibriSpeech, Aishell-1, and
WenetSpeech datasets demonstrate the effectiveness of our proposed Zipformer
over other state-of-the-art ASR models. Our code is publicly available at
https://github.com/k2-fsa/icefall. Comment: Published as a conference paper at ICLR 2024
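The core ScaledAdam idea, multiplying the Adam step by the tensor's current RMS so that the relative parameter change stays roughly constant, can be sketched as follows. Hyperparameter values are illustrative, and the explicitly learned parameter scale mentioned in the abstract is omitted:

```python
import numpy as np

def scaled_adam_step(param, grad, m, v, step,
                     lr=0.045, b1=0.9, b2=0.98, eps=1e-8):
    """One simplified ScaledAdam-style update (sketch, not the icefall
    implementation).  Standard Adam moment estimates are computed, then
    the step is scaled by the parameter tensor's RMS, so large tensors
    take proportionally larger absolute steps and the *relative* change
    per step stays about the same across tensors.
    """
    m = b1 * m + (1 - b1) * grad          # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - b1 ** step)          # bias correction
    v_hat = v / (1 - b2 ** step)
    rms = np.sqrt(np.mean(param ** 2)) + eps   # tensor's current scale
    param = param - lr * rms * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```

With this scaling, doubling a tensor's magnitude roughly doubles its absolute update, which is what keeps the relative change per step comparable across tensors of different scales.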
