Neural Architecture Search: Insights from 1000 Papers
In the past decade, advances in deep learning have resulted in breakthroughs
in a variety of areas, including computer vision, natural language
understanding, speech recognition, and reinforcement learning. Specialized,
high-performing neural architectures are crucial to the success of deep
learning in these areas. Neural architecture search (NAS), the process of
automating the design of neural architectures for a given task, is an
inevitable next step in automating machine learning and has already outpaced
the best human-designed architectures on many tasks. In the past few years,
research in NAS has been progressing rapidly, with over 1000 papers released
since 2020 (Deng and Lindauer, 2021). In this survey, we provide an organized
and comprehensive guide to neural architecture search. We give a taxonomy of
search spaces, algorithms, and speedup techniques, and we discuss resources
such as benchmarks, best practices, other surveys, and open-source libraries.
Qluster: An easy-to-implement generic workflow for robust clustering of health data
The exploration of health data with clustering algorithms makes it possible to better describe populations of interest by identifying the sub-profiles that compose them. This reinforces medical knowledge, whether about a disease or a targeted population in real life. Nevertheless, unlike conventional biostatistical methods, for which numerous guidelines exist, the standardization of data science approaches in clinical research remains a little-discussed subject. This results in significant variability in the execution of data science projects, whether in terms of the algorithms used or the reliability and credibility of the designed approach. Taking the path of a parsimonious and judicious choice of both algorithms and implementations at each stage, this article proposes Qluster, a practical workflow for performing clustering tasks. This workflow strikes a compromise between (1) genericity of applications (e.g. usable on small or big data, on continuous, categorical or mixed variables, on databases of high dimensionality or not), (2) ease of implementation (need for few packages, few algorithms, few parameters, ...), and (3) robustness (e.g. use of proven algorithms and robust packages, evaluation of the stability of clusters, management of noise and multicollinearity). This workflow can be easily automated and/or routinely applied to a wide range of clustering projects. It can be useful both for data scientists with little experience in the field, to make data clustering easier and more robust, and for more experienced data scientists who are looking for a straightforward and reliable solution to routinely perform preliminary data mining. A synthesis of the literature on data clustering and the scientific rationale supporting the proposed workflow are also provided. Finally, a detailed application of the workflow to a concrete use case is provided, along with a practical discussion for data scientists.
An implementation on the Dataiku platform is available upon request from the authors.
VIVE3D: Viewpoint-Independent Video Editing using 3D-Aware GANs
We introduce VIVE3D, a novel approach that extends the capabilities of
image-based 3D GANs to video editing and is able to represent the input video
in an identity-preserving and temporally consistent way. We propose two new
building blocks. First, we introduce a novel GAN inversion technique
specifically tailored to 3D GANs by jointly embedding multiple frames and
optimizing for the camera parameters. Second, besides traditional semantic face
edits (e.g. for age and expression), we are the first to demonstrate edits that
show novel views of the head enabled by the inherent properties of 3D GANs and
our optical flow-guided compositing technique to combine the head with the
background video. Our experiments demonstrate that VIVE3D generates
high-fidelity face edits at consistent quality from a range of camera
viewpoints which are composited with the original video in a temporally and
spatially consistent manner. Comment: CVPR 2023. Project webpage and video available at
http://afruehstueck.github.io/vive3
Information-Theoretic GAN Compression with Variational Energy-based Model
We propose an information-theoretic knowledge distillation approach for the
compression of generative adversarial networks, which aims to maximize the
mutual information between teacher and student networks via a variational
optimization based on an energy-based model. Because the direct computation of
the mutual information in continuous domains is intractable, our approach
alternatively optimizes the student network by maximizing the variational lower
bound of the mutual information. To achieve a tight lower bound, we introduce
an energy-based model relying on a deep neural network to represent a flexible
variational distribution that deals with high-dimensional images and considers
spatial dependencies between pixels effectively. Since the proposed method is
a generic optimization algorithm, it can be conveniently incorporated into
arbitrary generative adversarial networks and even dense prediction networks,
e.g., image enhancement models. We demonstrate that the proposed algorithm
achieves outstanding performance in model compression of generative adversarial
networks consistently when combined with several existing models. Comment: Accepted at Neurips202
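For orientation, the variational lower bound on mutual information mentioned in this abstract is commonly written in the generic form below (the Barber-Agakov bound). The symbols $T$ (teacher output), $S$ (student output), the variational distribution $q$, and the energy function $E_\theta$ are notation assumed here for illustration; the paper's exact objective may differ.

```latex
\begin{align}
I(T;S) &= H(T) - H(T \mid S) \\
       &= H(T) + \mathbb{E}_{p(t,s)}\big[\log p(t \mid s)\big] \\
       &\ge H(T) + \mathbb{E}_{p(t,s)}\big[\log q(t \mid s)\big],
\end{align}
% The gap is E_{p(s)}[ KL( p(t|s) || q(t|s) ) ] >= 0, so the bound is tight
% when q(t|s) = p(t|s).
% Assumed energy-based parameterization of the variational distribution:
\begin{equation}
q_\theta(t \mid s) = \frac{\exp\big(-E_\theta(t, s)\big)}{Z_\theta(s)},
\qquad
Z_\theta(s) = \int \exp\big(-E_\theta(t', s)\big)\, dt'.
\end{equation}
```

Because $H(T)$ depends only on the fixed teacher, maximizing the bound with respect to the student reduces to maximizing the expected log-likelihood term.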
Accurate and Interpretable Solution of the Inverse Rig for Realistic Blendshape Models with Quadratic Corrective Terms
We propose a new model-based algorithm solving the inverse rig problem in
facial animation retargeting, exhibiting higher accuracy of the fit and a
sparser, more interpretable weight vector compared to SOTA. The proposed method
targets a specific subdomain of human face animation - highly-realistic
blendshape models used in the production of movies and video games. In this
paper, we formulate an optimization problem that takes into account all the
requirements of targeted models. Our objective goes beyond a linear blendshape
model and employs the quadratic corrective terms necessary for correctly
fitting fine details of the mesh. We show that the solution to the proposed
problem yields highly accurate mesh reconstruction even when general-purpose
solvers, like SQP, are used. The results obtained using SQP are highly accurate
in the mesh space but do not exhibit favorable qualities in terms of weight
sparsity and smoothness, and for this reason, we further propose a novel
algorithm relying on an MM technique. The algorithm is specifically suited for
solving the proposed objective, yielding a high-accuracy mesh fit while
respecting the constraints and producing a sparse and smooth set of weights
easy to manipulate and interpret by artists. Our algorithm is benchmarked with
SOTA approaches, and shows an overall superiority of the results, yielding a
smooth animation reconstruction with a relative improvement up to 45 percent in
root mean squared mesh error while keeping the cardinality comparable with
benchmark methods. This paper gives a comprehensive set of evaluation metrics
that cover different aspects of the solution, including mesh accuracy, sparsity
of the weights, and smoothness of the animation curves, as well as the
appearance of the produced animation, which human experts evaluated.
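To make the idea of a blendshape model with quadratic corrective terms concrete, here is a minimal Python sketch. It assumes a generic formulation (neutral mesh plus weighted offsets plus pairwise corrective offsets scaled by the product of two weights) and a plain root mean squared error over coordinates; the function names, data layout, and error definition are illustrative assumptions, not the paper's implementation.

```python
from math import sqrt


def blendshape_mesh(neutral, deltas, correctives, weights):
    """Evaluate an assumed blendshape model with pairwise quadratic correctives.

    neutral:     flat list of vertex coordinates of the neutral face
    deltas:      list of offset vectors, one per blendshape (linear terms)
    correctives: dict mapping an index pair (i, j) to a corrective offset vector
    weights:     list of blendshape activation weights
    """
    mesh = list(neutral)
    # Linear part: add each blendshape offset scaled by its weight.
    for w, delta in zip(weights, deltas):
        for k, d in enumerate(delta):
            mesh[k] += w * d
    # Quadratic part: each corrective is scaled by the product w_i * w_j.
    for (i, j), corr in correctives.items():
        wij = weights[i] * weights[j]
        for k, c in enumerate(corr):
            mesh[k] += wij * c
    return mesh


def rmse(a, b):
    """Root mean squared error between two flat coordinate lists."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Toy example: one vertex (3 coordinates), two blendshapes, one corrective pair.
neutral = [0.0, 0.0, 0.0]
deltas = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
correctives = {(0, 1): [0.0, 0.0, 1.0]}
mesh = blendshape_mesh(neutral, deltas, correctives, [0.5, 0.5])
```

An inverse rig solver would search for the weights that minimize an error such as `rmse` between the evaluated mesh and a target mesh, typically with additional sparsity and smoothness terms.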
Central-provincial Politics and Industrial Policy-making in the Electric Power Sector in China
In addition to studies that provide meaningful insights into the complexity of technical and economic issues, a growing number of studies have focused on the political process of market transition in network industries such as the electric power sector. This dissertation studies central–provincial interactions in industrial policy-making and implementation, and attempts to evaluate the roles of Chinese provinces in the market reform process of the electric power sector. Market reforms of this sector are used as an illustrative case because the new round of market reforms has achieved significant breakthroughs in areas such as pricing reform and wholesale market trading. Other policy measures, such as the liberalization of the distribution market and cross-regional market-building, are still at a nascent stage and have made only moderate progress. It is important to investigate why some policy areas make greater progress in market reforms than others. It is also interesting to examine the impact of Chinese central-provincial politics on producing the different market reform outcomes. Guangdong and Xinjiang are the two provinces analyzed in this dissertation. The progress of market reforms in these two provinces showed similarities although the provinces are very different in terms of local conditions such as their stages of economic development and energy structures. The actual reforms can be understood as the outcomes of certain modes of interaction between the central and provincial actors in the context of their particular capabilities and preferences in different policy areas. This dissertation argues that market reform is more successful in policy areas where the central and provincial authorities are able to engage mainly in integrative negotiations than in areas where they engage mainly in distributive negotiations.
Examples of works to practice staccato technique in clarinet instrument
The stages of strengthening the clarinet's staccato technique were put into practice through work on
musical pieces. Rhythm and nuance exercises that speed up staccato passages are included. The most
important aim of the study is not only staccato practice but also an emphasis on the precision of
simultaneous finger-tongue coordination. To make staccato practice more productive, étude work is
also included within the work on pieces. Careful attention to the exercises, together with the
inspiring effect of staccato practice, has added a new dimension to musical identity. Each stage of
the eight original pieces is explained. Each stage is designed to strengthen the next performance
and technique. This study provides information on the areas in which the staccato technique is used
and on the results obtained. It plans how the notes are shaped through finger and tongue
coordination and within what kind of practice discipline this takes place. The concepts of reed,
note, diaphragm, finger, tongue, nuance, and discipline were found to form an inseparable whole in
the staccato technique. In the research, a literature review was conducted to survey studies on
staccato. The review found that few staccato piece studies are used in clarinet technique. The
survey of method books found that étude studies are more numerous. Exercises for speeding up and
strengthening the clarinet's staccato technique are therefore presented. It was observed that
inserting work on a piece between staccato étude exercises relaxes the mind and further increases
motivation. The choice of a suitable reed for staccato practice was also addressed. It was found
that, for practicing the staccato technique correctly, a suitable reed increases tongue speed.
Choosing a suitable reed depends on the reed producing sound easily. The study emphasizes that if
the reed does not provide tonguing power, a more suitable reed needs to be selected. Interpreting a
piece from beginning to end in staccato practice can be difficult. In this respect, the study
showed that following the given musical nuances eases tonguing performance. Passing on the acquired
knowledge and experience to future generations, in a way that fosters development, is encouraged.
The study explains how forthcoming pieces can be worked out and how the staccato technique can be
mastered. The aim is to resolve the staccato technique in a shorter time. It is as important to
commit the exercises to memory as it is to teach the fingers their positions. The work that emerges
as the result of the determination and patience shown will raise success to even higher levels.
CrossLoc3D: Aerial-Ground Cross-Source 3D Place Recognition
We present CrossLoc3D, a novel 3D place recognition method that solves a
large-scale point matching problem in a cross-source setting. Cross-source
point cloud data corresponds to point sets captured by depth sensors with
different accuracies or from different distances and perspectives. We address
the challenges in terms of developing 3D place recognition methods that account
for the representation gap between points captured by different sources. Our
method handles cross-source data by utilizing multi-grained features and
selecting convolution kernel sizes that correspond to the most prominent features.
Inspired by diffusion models, our method uses a novel iterative refinement
process that gradually shifts the embedding spaces from different sources to a
single canonical space for better metric learning. In addition, we present
CS-Campus3D, the first 3D aerial-ground cross-source dataset consisting of
point cloud data from both aerial and ground LiDAR scans. The point clouds in
CS-Campus3D have representation gaps and other features like different views,
point densities, and noise patterns. We show that our CrossLoc3D algorithm can
achieve an improvement of 4.74% - 15.37% in terms of the top 1 average recall
on our CS-Campus3D benchmark and achieves performance comparable to
state-of-the-art 3D place recognition methods on the Oxford RobotCar. We will
release the code and CS-Campus3D benchmark.
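The top-1 recall metric cited in this abstract can be illustrated with a small Python sketch. It assumes that each query and database point cloud has already been mapped to an embedding vector and that retrieval uses Euclidean nearest neighbours; the function names and the toy data are hypothetical, and the benchmark's actual protocol (for example, how recall is averaged) may differ.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top1_recall(query_embs, db_embs, true_matches):
    """Fraction of queries whose nearest database entry is a true match.

    query_embs:   list of query embedding vectors
    db_embs:      list of database embedding vectors
    true_matches: list of sets; true_matches[q] holds the database indices
                  that count as correct places for query q
    """
    hits = 0
    for q_idx, q in enumerate(query_embs):
        # Retrieve the single closest database embedding for this query.
        best = min(range(len(db_embs)),
                   key=lambda d: squared_distance(q, db_embs[d]))
        if best in true_matches[q_idx]:
            hits += 1
    return hits / len(query_embs)


# Toy example: the first query retrieves its true match, the second does not.
queries = [[0.0, 0.0], [10.0, 10.0]]
database = [[0.0, 1.0], [9.0, 9.0], [5.0, 5.0]]
matches = [{0}, {2}]
recall = top1_recall(queries, database, matches)
```

In a cross-source setting, the queries and the database would come from different sensors (for example ground and aerial LiDAR), which is why a shared embedding space matters for this metric.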
Deep Unrestricted Document Image Rectification
In recent years, tremendous efforts have been made on document image
rectification, but existing advanced algorithms are limited to processing
restricted document images, i.e., the input images must incorporate a complete
document. Once the captured image merely involves a local text region, its
rectification quality is degraded and unsatisfactory. Our previously proposed
DocTr, a transformer-assisted network for document image rectification, also
suffers from this limitation. In this work, we present DocTr++, a novel unified
framework for document image rectification, without any restrictions on the
input distorted images. Our major technical improvements can be summarized in
three aspects. Firstly, we upgrade the original architecture by adopting a
hierarchical encoder-decoder structure for multi-scale representation
extraction and parsing. Secondly, we reformulate the pixel-wise mapping
relationship between the unrestricted distorted document images and the
distortion-free counterparts. The obtained data is used to train our DocTr++
for unrestricted document image rectification. Thirdly, we contribute a
real-world test set and metrics applicable for evaluating the rectification
quality. To the best of our knowledge, this is the first learning-based method for the
rectification of unrestricted document images. Extensive experiments are
conducted, and the results demonstrate the effectiveness and superiority of our
method. We hope our DocTr++ will serve as a strong baseline for generic
document image rectification, prompting the further advancement and application
of learning-based algorithms. The source code and the proposed dataset are
publicly available at https://github.com/fh2019ustc/DocTr-Plus
Machine Learning Research Trends in Africa: A 30 Years Overview with Bibliometric Analysis Review
In this paper, a critical bibliometric analysis study is conducted, coupled
with an extensive literature survey on recent developments and associated
applications in machine learning research with a perspective on Africa. The
presented bibliometric analysis study consists of 2761 machine learning-related
documents, of which 98% were articles with at least 482 citations published in
903 journals during the past 30 years. Furthermore, the collated documents were
retrieved from the Science Citation Index EXPANDED, comprising research
publications from 54 African countries between 1993 and 2021. The bibliometric
study shows the visualization of the current landscape and future trends in
machine learning research and its application to facilitate future
collaborative research and knowledge exchange among authors from different
research institutions scattered across the African continent.