Advances on Concept Drift Detection in Regression Tasks using Social Networks Theory
Mining data streams is one of the main areas of study in machine learning due
to its applications across many knowledge domains. One of the major challenges
in mining data streams is concept drift, which requires the learner to discard
the current concept and adapt to a new one. Ensemble-based drift detection
algorithms have been applied successfully to classification tasks but usually
maintain a fixed-size ensemble of learners, running the risk of needlessly
spending processing time and memory. In this paper, we present improvements to
the Scale-free Network Regressor (SFNR), a dynamic ensemble-based method for
regression that employs social network theory. To detect concept drifts, SFNR
uses the Adaptive Windowing (ADWIN) algorithm. Results show improvements in
accuracy, especially in concept drift situations, and better performance
compared to other state-of-the-art algorithms on both real and synthetic data.
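The windowing idea behind ADWIN can be illustrated with a deliberately
simplified sketch (this is not the actual ADWIN algorithm, which compares all
window splits under an adaptive Hoeffding-style bound): keep a window of recent
values, compare the means of its two halves, and drop the stale half when they
diverge. All names and thresholds below are invented for illustration.

```python
from collections import deque

class SimpleDriftDetector:
    """Simplified sliding-window drift detector inspired by ADWIN.

    Splits the current window into two halves and signals drift when the
    halves' means differ by more than `threshold`. On drift, the older
    half is discarded so the learner can adapt to the new concept.
    """

    def __init__(self, max_window=200, threshold=0.5):
        self.window = deque(maxlen=max_window)
        self.threshold = threshold

    def update(self, value):
        self.window.append(value)
        n = len(self.window)
        if n < 20:          # wait for a minimally informative window
            return False
        half = n // 2
        old = list(self.window)[:half]
        new = list(self.window)[half:]
        if abs(sum(new) / len(new) - sum(old) / len(old)) > self.threshold:
            # Drop the stale half of the window: the old concept is gone.
            for _ in range(half):
                self.window.popleft()
            return True
        return False

detector = SimpleDriftDetector()
drift_points = []
stream = [0.0] * 100 + [1.0] * 100   # abrupt concept drift at index 100
for i, x in enumerate(stream):
    if detector.update(x):
        drift_points.append(i)
```

On this synthetic stream, the abrupt mean shift at index 100 is flagged once
enough post-drift samples have accumulated in the newer half of the window.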
Decentralized projected Riemannian gradient method for smooth optimization on compact submanifolds
We consider the problem of decentralized nonconvex optimization over a
compact submanifold, where each local agent's objective function, defined by
its local dataset, is smooth. Leveraging the powerful tool of proximal
smoothness, we establish local linear convergence of the projected gradient
descent method with unit step size for solving the consensus problem over the
compact manifold. This serves as the basis for analyzing decentralized
algorithms on manifolds. We then propose two decentralized methods: the
decentralized projected Riemannian gradient descent (DPRGD) method and the
decentralized projected Riemannian gradient tracking (DPRGT) method. We
establish convergence rates for both methods to reach a stationary point. To
the best of our knowledge, DPRGT is the first decentralized algorithm to
achieve exact convergence for decentralized optimization over a compact
manifold. The key ingredients in the proofs are Lipschitz-type inequalities
for the projection operator on the compact manifold and for smooth functions
on the manifold, which may be of independent interest. Finally, we demonstrate
the effectiveness of the proposed methods compared to state-of-the-art ones
through numerical experiments on eigenvalue problems and low-rank matrix
completion.
Comment: 32 pages
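A toy sketch of a DPRGD-style update on the unit sphere (a compact
submanifold) for a decentralized eigenvalue problem follows. The mixing
matrix, step size, problem sizes, and random data are all illustrative
choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_agents = 8, 4

# Each agent holds a local symmetric matrix; the network jointly seeks the
# leading eigenvector of their sum, i.e. max of x^T (sum_i A_i) x on the sphere.
A_local = []
for _ in range(n_agents):
    M = rng.standard_normal((d, d))
    A_local.append((M + M.T) / 2)
A_global = sum(A_local)

# Doubly stochastic mixing matrix for a 4-agent ring (illustrative choice).
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

def proj_sphere(x):
    # Metric projection onto the unit sphere.
    return x / np.linalg.norm(x)

X = np.array([proj_sphere(rng.standard_normal(d)) for _ in range(n_agents)])
alpha = 0.01
for _ in range(5000):
    # Riemannian gradient of f_i(x) = -x^T A_i x: project the Euclidean
    # gradient -2 A_i x onto the tangent space of the sphere at x.
    grads = np.array([-2 * A @ x + 2 * (x @ A @ x) * x
                      for A, x in zip(A_local, X)])
    # DPRGD step: gossip-average with neighbours, take a gradient step,
    # then project back onto the manifold.
    X = np.array([proj_sphere(z) for z in W @ X - alpha * grads])

x_avg = proj_sphere(X.mean(axis=0))
rayleigh = x_avg @ A_global @ x_avg
top = np.linalg.eigvalsh(A_global)[-1]
```

DPRGT would additionally maintain a gradient-tracking variable so that the
consensus bias introduced by the fixed step size vanishes, which is what
yields exact convergence.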
The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions
The Metaverse offers a second world beyond reality, where boundaries are
non-existent, and possibilities are endless through engagement and immersive
experiences using virtual reality (VR) technology. Many disciplines can
benefit from the advancement of the Metaverse when accurately developed,
including the fields of technology, gaming, education, art, and culture.
Nevertheless, developing the Metaverse environment to its full potential is a
complex task that needs proper guidance and direction. Existing surveys on
the Metaverse focus only on a specific aspect and discipline of the Metaverse
and lack a holistic view of the entire process. To this end, a more holistic,
multi-disciplinary, in-depth, and academic and industry-oriented review is
required to provide a thorough study of the Metaverse development pipeline. To
address these issues, we present in this survey a novel multi-layered pipeline
ecosystem composed of (1) the Metaverse computing, networking, communications
and hardware infrastructure, (2) environment digitization, and (3) user
interactions. For every layer, we discuss the components that detail the steps
of its development. Also, for each of these components, we examine the impact
of a set of enabling technologies and empowering domains (e.g., Artificial
Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on
its advancement. In addition, we explain the importance of these technologies
to support decentralization, interoperability, user experiences, interactions,
and monetization. Our study highlights the existing challenges for each
component, followed by research directions and potential solutions. To the
best of our knowledge, this survey is the most comprehensive to date, allowing
users, scholars, and entrepreneurs to gain an in-depth understanding of the
Metaverse ecosystem and identify their opportunities for contribution.
Changes in PRC1 activity during interphase modulate lineage transition in pluripotent cells
The potential of pluripotent cells to respond to developmental cues and trigger cell differentiation is enhanced during the G1 phase of the cell cycle, but the molecular mechanisms involved are poorly understood. Variations in polycomb activity during interphase progression have been hypothesized to regulate the cell-cycle-phase-dependent transcriptional activation of differentiation genes during lineage transition in pluripotent cells. Here, we show that recruitment of Polycomb Repressive Complex 1 (PRC1) and its associated molecular functions, ubiquitination of H2AK119 and three-dimensional chromatin interactions, are enhanced during S and G2 phases compared to the G1 phase. In agreement with the accumulation of PRC1 at target promoters upon G1 phase exit, cells in S and G2 phases show firmer transcriptional repression of developmental regulator genes, which is drastically perturbed upon genetic ablation of the PRC1 catalytic subunit RING1B. Importantly, depletion of RING1B during retinoic acid stimulation interferes with the preference of mouse embryonic stem cells (mESCs) to induce the transcriptional activation of differentiation genes in G1 phase. We propose that incremental enrolment of polycomb repressive activity during interphase progression reduces the tendency of cells to respond to developmental cues during S and G2 phases, facilitating activation of cell differentiation in the G1 phase of the pluripotent cell cycle.
Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR
Automatic speech recognition (ASR) has achieved remarkable success thanks to
recent advances in deep learning, but it usually degrades significantly under
real-world noisy conditions. Recent works introduce speech enhancement (SE) as
a front-end to improve speech quality, which has proved effective but may not
be optimal for downstream ASR due to the speech distortion problem. Building
on this, the latest works combine SE with currently popular self-supervised
learning (SSL) to alleviate distortion and improve noise robustness. Despite
their effectiveness, the speech distortion caused by conventional SE still
cannot be completely eliminated. In this paper, we propose a self-supervised
framework named Wav2code to implement generalized SE without distortion for
noise-robust ASR. First, in the pre-training stage, clean speech
representations from an SSL model are used to look up a discrete codebook via
nearest-neighbor feature matching; the resulting code sequence is then
exploited to reconstruct the original clean representations, storing them in
the codebook as a prior. Second, during fine-tuning, we propose a
Transformer-based code predictor to accurately predict clean codes by modeling
the global dependency of the input noisy representations, which enables
discovery and restoration of high-quality clean representations without
distortion. Furthermore, we propose an interactive feature fusion network to
combine the original noisy and restored clean representations, considering
both fidelity and quality and yielding even more informative features for
downstream ASR. Finally, experiments on both synthetic and real noisy datasets
demonstrate that Wav2code alleviates speech distortion and improves ASR
performance under various noisy conditions, resulting in stronger robustness.
Comment: 12 pages, 7 figures, submitted to IEEE/ACM TASL
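The nearest-neighbour codebook lookup at the heart of the pre-training stage
can be sketched in a few lines. The codebook size (256) and feature dimension
(64) are invented for illustration, and random arrays stand in for trained SSL
representations.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical sizes: 256 codebook entries of dimension 64; real Wav2code
# sizes may differ, and its codebook is learned rather than random.
codebook = rng.standard_normal((256, 64))
features = rng.standard_normal((50, 64))   # stand-in for SSL frame features

def quantize(feats, book):
    """Nearest-neighbour codebook lookup: map each frame to its closest
    code and return both the code indices and the quantized features."""
    # Squared Euclidean distance between every frame and every code,
    # expanded as |f|^2 - 2 f.b + |b|^2 to avoid an explicit loop.
    d2 = (np.sum(feats ** 2, axis=1, keepdims=True)
          - 2 * feats @ book.T
          + np.sum(book ** 2, axis=1))
    idx = np.argmin(d2, axis=1)
    return idx, book[idx]

codes, restored = quantize(features, codebook)
```

In the full framework the codebook entries are trained so that the quantized
sequence can reconstruct clean representations; here the lookup mechanism
itself is the point.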
Event-based tracking of human hands
This paper proposes a novel method for tracking human hands using data from
an event camera. The event camera detects changes in brightness, measuring
motion with low latency, no motion blur, low power consumption, and high
dynamic range. Captured frames are analysed using lightweight algorithms that
report 3D hand position data. The chosen pick-and-place scenario serves as an
example input for collaborative human-robot interactions and for obstacle
avoidance in human-robot safety applications. Event data are pre-processed
into intensity frames. Regions of interest (ROI) are defined through
object-edge event activity, reducing noise. ROI features are then extracted
for use in depth perception. Event-based tracking of human hands proved
feasible in real time and at low computational cost. The proposed ROI-finding
method reduces noise from intensity images, achieving up to 89% data
reduction relative to the original while preserving the features. The depth
estimation error relative to ground truth (measured with wearables), computed
using dynamic time warping with a single event camera, ranges from 15 to 30
millimetres depending on the plane in which it is measured. In summary, hands
are tracked in 3D space using data from a single event camera and lightweight
algorithms that define ROI features.
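The ROI-from-event-activity step can be sketched as follows, with a synthetic
event batch standing in for real camera output; the sensor size, activity
threshold, and event distribution are all illustrative assumptions.

```python
import numpy as np

# Synthetic event batch: (x, y) coordinates on a 64x64 sensor, with a
# cluster of activity around (40, 20) standing in for a moving hand and
# a sprinkling of uniform background noise events.
rng = np.random.default_rng(1)
H = W = 64
noise = rng.integers(0, 64, size=(200, 2))
hand = np.clip(rng.normal(loc=(40, 20), scale=2.0, size=(400, 2)),
               0, 63).astype(int)
events = np.vstack([noise, hand])

# Accumulate events into an intensity-like frame (count per pixel).
frame = np.zeros((H, W))
np.add.at(frame, (events[:, 1], events[:, 0]), 1.0)

# Keep only pixels whose activity exceeds a threshold -- isolated noise
# events rarely pile up on one pixel -- then take the bounding box of the
# surviving pixels as the region of interest.
active = frame >= 4
ys, xs = np.nonzero(active)
roi = (xs.min(), ys.min(), xs.max(), ys.max())   # (x0, y0, x1, y1)
```

Thresholding discards most noise pixels while the dense hand cluster
survives, which is the source of the data reduction the abstract reports.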
Continual Learning of Hand Gestures for Human-Robot Interaction
In this paper, we present an efficient method for incrementally learning to
classify static hand gestures. This method allows users to teach a robot to
recognize new symbols in an incremental manner. Contrary to other works,
which use special sensors or external devices such as color or data gloves,
our proposed approach uses a single RGB camera to perform static hand gesture
recognition from 2D images. Furthermore, our system is able to incrementally
learn up to 38 new symbols using only 5 samples for each old class, achieving
a final average accuracy of over 90%. In addition, the incremental training
time can be reduced to 10% of the time required when using all available
data.
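The exemplar idea of keeping only 5 samples per old class can be illustrated
with a minimal nearest-class-mean learner; this is a stand-in sketch, not the
paper's network, and the 2D "gesture" data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(7)

class IncrementalGestureClassifier:
    """Nearest-class-mean classifier with a small rehearsal memory.

    Stand-in for an incremental learner: when a new gesture class arrives,
    only 5 stored exemplars per old class are revisited, so old classes are
    represented without keeping their full training sets.
    """

    def __init__(self, exemplars_per_class=5):
        self.k = exemplars_per_class
        self.memory = {}        # class label -> (k, d) exemplar array
        self.means = {}         # class label -> (d,) class mean

    def learn_class(self, label, samples):
        # New class: mean from all its samples, but store only k exemplars.
        self.means[label] = samples.mean(axis=0)
        self.memory[label] = samples[: self.k]
        # Rehearsal: refresh old class means from their stored exemplars.
        for old, ex in self.memory.items():
            if old != label:
                self.means[old] = ex.mean(axis=0)

    def predict(self, x):
        labels = list(self.means)
        dists = [np.linalg.norm(x - self.means[c]) for c in labels]
        return labels[int(np.argmin(dists))]

# Three well-separated synthetic "gesture" classes in 2D, learned one by one.
clf = IncrementalGestureClassifier()
for label, centre in enumerate([(0, 0), (10, 0), (0, 10)]):
    samples = rng.normal(loc=centre, scale=0.5, size=(30, 2))
    clf.learn_class(label, samples)
```

Because only 5 exemplars per old class are revisited per increment, training
cost grows with the new class rather than with the full accumulated dataset.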
BotMoE: Twitter Bot Detection with Community-Aware Mixtures of Modal-Specific Experts
Twitter bot detection has become a crucial task in efforts to combat online
misinformation, mitigate election interference, and curb malicious propaganda.
However, advanced Twitter bots often attempt to mimic the characteristics of
genuine users through feature manipulation and disguise themselves to fit in
diverse user communities, posing challenges for existing Twitter bot detection
models. To this end, we propose BotMoE, a Twitter bot detection framework that
jointly utilizes multiple user information modalities (metadata, textual
content, network structure) to improve the detection of deceptive bots.
Furthermore, BotMoE incorporates a community-aware Mixture-of-Experts (MoE)
layer to improve domain generalization and adapt to different Twitter
communities. Specifically, BotMoE constructs modal-specific encoders for
metadata features, textual content, and graphical structure, which jointly
model Twitter users from three modal-specific perspectives. We then employ a
community-aware MoE layer to automatically assign users to different
communities and leverage the corresponding expert networks. Finally, user
representations from metadata, text, and graph perspectives are fused with an
expert fusion layer, combining all three modalities while measuring the
consistency of user information. Extensive experiments demonstrate that BotMoE
significantly advances the state-of-the-art on three Twitter bot detection
benchmarks. Studies also confirm that BotMoE captures advanced and evasive
bots, alleviates the reliance on training data, and generalizes better to new
and previously unseen user communities.
Comment: Accepted at SIGIR 202
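The community-aware MoE layer can be caricatured with a soft gating network
over a few experts. Dimensions, expert count, and random weights below are
illustrative assumptions; BotMoE's modal-specific encoders and expert fusion
are omitted.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical sizes: fused user features of dim 16, 4 community experts,
# expert output dim 8, batch of 5 users.
d, n_experts, hidden = 16, 4, 8
users = rng.standard_normal((5, d))

# Each expert is a small linear map; the gate scores experts per user.
experts = [rng.standard_normal((d, hidden)) for _ in range(n_experts)]
gate_W = rng.standard_normal((d, n_experts))

def moe_layer(x):
    """Soft mixture-of-experts: the gate assigns each user a distribution
    over community experts; the output is the gate-weighted sum of the
    expert outputs."""
    gates = softmax(x @ gate_W)                    # (batch, n_experts)
    outs = np.stack([x @ E for E in experts], 1)   # (batch, n_experts, hidden)
    return (gates[..., None] * outs).sum(axis=1), gates

y, gates = moe_layer(users)
```

The gate distribution is what lets users from different communities be routed
to different expert networks instead of one shared detector.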
Audio-Visual Automatic Speech Recognition Towards Education for Disabilities
Education is a fundamental right that enriches everyone's life. However, physically challenged people are often excluded from general and advanced education systems. An Audio-Visual Automatic Speech Recognition (AV-ASR) based system is useful for improving the education of physically challenged people by providing hands-free computing: they can communicate with the learning system through AV-ASR. However, tracing the lips correctly for the visual modality is challenging. This paper therefore addresses appearance-based visual features along with a co-occurrence statistical measure for visual speech recognition: Local Binary Patterns on Three Orthogonal Planes (LBP-TOP) and the Grey-Level Co-occurrence Matrix (GLCM) are proposed for extracting visual speech information. Experimental results show that the proposed system achieves 76.60% accuracy for visual speech recognition and 96.00% accuracy for audio speech recognition.
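The GLCM half of the proposed feature set is straightforward to sketch
directly; the tiny 4-level patch below is an invented toy input (a real
pipeline would quantize a mouth-region crop and aggregate several offsets).

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=4):
    """Grey-level co-occurrence matrix for one offset (dx, dy): counts how
    often grey level j occurs at displacement (dx, dy) from grey level i."""
    M = np.zeros((levels, levels), dtype=int)
    h, w = image.shape
    for y in range(h - dy):
        for x in range(w - dx):
            M[image[y, x], image[y + dy, x + dx]] += 1
    return M

# Tiny quantized patch (4 grey levels) standing in for a mouth region.
patch = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]])

M = glcm(patch)
# Contrast: squared level differences weighted by co-occurrence counts.
i, j = np.indices(M.shape)
contrast = float(((i - j) ** 2 * M).sum())
```

Statistics such as contrast, energy, and homogeneity computed from M are the
co-occurrence features that complement the LBP-TOP dynamic-texture features.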