11 research outputs found
Methods for Interpreting and Understanding Deep Neural Networks
This paper provides an entry point to the problem of interpreting a deep
neural network model and explaining its predictions. It is based on a tutorial
given at ICASSP 2017. It introduces some recently proposed techniques of
interpretation, along with theory, tricks, and recommendations, to make the most
efficient use of these techniques on real data. It also discusses a number of
practical applications. Comment: 14 pages, 10 figures
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
We present the content deformation field CoDeF as a new type of video
representation, which consists of a canonical content field aggregating the
static contents in the entire video and a temporal deformation field recording
the transformations from the canonical image (i.e., rendered from the canonical
content field) to each individual frame along the time axis. Given a target
video, these two fields are jointly optimized to reconstruct it through a
carefully tailored rendering pipeline. We deliberately introduce some
regularizations into the optimization process, urging the canonical content
field to inherit semantics (e.g., the object shape) from the video. With such a
design, CoDeF naturally supports lifting image algorithms for video processing,
in the sense that one can apply an image algorithm to the canonical image and
effortlessly propagate the outcomes to the entire video with the aid of the
temporal deformation field. We experimentally show that CoDeF is able to lift
image-to-image translation to video-to-video translation and lift keypoint
detection to keypoint tracking without any training. More importantly, thanks to
our lifting strategy that deploys the algorithms on only one image, we achieve
superior cross-frame consistency in processed videos compared to existing
video-to-video translation approaches, and even manage to track non-rigid
objects like water and smog. Project page can be found at
https://qiuyu96.github.io/CoDeF/. Comment: Project Webpage: https://qiuyu96.github.io/CoDeF/, Code:
https://github.com/qiuyu96/CoDe
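The lifting idea above can be illustrated with a minimal sketch: an image algorithm is applied once to the canonical image, and the result is propagated to every frame by backward warping through that frame's deformation field. This is an assumed toy implementation (nearest-neighbour sampling, identity flows), not the paper's rendering pipeline; the function names `warp` and `lift_to_video` are illustrative.

```python
import numpy as np

def warp(image, flow):
    """Backward-warp `image` by a per-pixel displacement field `flow` (H, W, 2),
    using nearest-neighbour sampling for simplicity."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 1]).astype(int), 0, w - 1)
    return image[src_y, src_x]

def lift_to_video(canonical, flows, image_algorithm):
    """Apply `image_algorithm` once to the canonical image, then propagate
    the edited result to every frame via its deformation field."""
    edited = image_algorithm(canonical)
    return [warp(edited, f) for f in flows]

# Toy example: the "edit" inverts intensities; identity deformation, 3 frames.
canonical = np.arange(16, dtype=float).reshape(4, 4)
flows = [np.zeros((4, 4, 2)) for _ in range(3)]
frames = lift_to_video(canonical, flows, lambda im: 255 - im)
```

Because the edit touches only one image, any per-frame inconsistency can come only from the deformation fields, which is the source of the cross-frame consistency the abstract describes.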
LemurFaceID: a face recognition system to facilitate individual identification of lemurs
Background: Long-term research of known individuals is critical for understanding the demographic and evolutionary processes that influence natural populations. Current methods for individual identification of many animals include capture and tagging techniques and/or researcher knowledge of natural variation in individual phenotypes. These methods can be costly, time-consuming, and may be impractical for larger-scale, population-level studies. Accordingly, for many animal lineages, long-term research projects are often limited to only a few taxa. Lemurs, a mammalian lineage endemic to Madagascar, are no exception. Long-term data needed to address evolutionary questions are lacking for many species. This is, at least in part, due to difficulties collecting consistent data on known individuals over long periods of time. Here, we present a new method for individual identification of lemurs (LemurFaceID). LemurFaceID is a computer-assisted facial recognition system that can be used to identify individual lemurs based on photographs.
Results: LemurFaceID was developed using patch-wise Multiscale Local Binary Pattern features and modified facial image normalization techniques to reduce the effects of facial hair and variation in ambient lighting on identification. We trained and tested our system using images from wild red-bellied lemurs (Eulemur rubriventer) collected in Ranomafana National Park, Madagascar. Across 100 trials, with different partitions of training and test sets, we demonstrate that LemurFaceID can achieve 98.7% ± 1.81% accuracy (using 2-query image fusion) in correctly identifying individual lemurs.
Conclusions: Our results suggest that human facial recognition techniques can be modified for identification of individual lemurs based on variation in facial patterns. LemurFaceID was able to identify individual lemurs based on photographs of wild individuals with a relatively high degree of accuracy. This technology would remove many limitations of traditional methods for individual identification. Once optimized, our system can facilitate long-term research of known individuals by providing a rapid, cost-effective, and accurate method for individual identification.
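The patch-wise Local Binary Pattern features at the core of LemurFaceID can be sketched in a few lines: each pixel is encoded by thresholding its eight neighbours against its own intensity, and per-patch histograms of these codes are concatenated into a descriptor. This is a simplified single-scale sketch (the paper uses Multiscale LBP with additional normalization); the function names and the 256-bin histogram are assumptions for illustration.

```python
import numpy as np

def lbp_image(gray):
    """Basic 8-neighbour LBP codes for the interior pixels of a grayscale
    image: each neighbour contributes one bit, set when it is >= the centre."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = gray.shape
    centre = gray[1:-1, 1:-1]
    code = np.zeros_like(centre, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= (neighbour >= centre).astype(np.uint8) << bit
    return code

def patchwise_histograms(gray, patch=8):
    """Concatenate normalized per-patch LBP histograms into one descriptor,
    mirroring the patch-wise feature idea."""
    codes = lbp_image(gray)
    h, w = codes.shape
    feats = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            hist, _ = np.histogram(codes[y:y + patch, x:x + patch],
                                   bins=256, range=(0, 256))
            feats.append(hist / hist.sum())
    return np.concatenate(feats)
```

Descriptors from two photographs can then be compared by any histogram distance; the patch-wise layout keeps the comparison robust to local occlusions such as facial hair.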
Automatic detection of the mental foramen for estimating mandibular cortical width in dental panoramic radiographs
Screening tests are vital for detecting diseases, especially at early stages, when intervention can prevent further illness. For example, osteoporosis is a systemic skeletal disease characterized by low bone mass and microarchitectural deterioration of bone tissue, resulting in bone fragility and susceptibility to fracture. Dual-energy x-ray absorptiometry is commonly used to diagnose osteoporosis since it evaluates bone mineral density. It is the most standard method for diagnosing osteoporosis, but it is not immediately available and is commonly used for research due to the high capital cost. Further, dual-energy x-ray absorptiometry is not used for population-based screening due to its suboptimal ability to predict hip fractures based on measurements. Therefore, it is recommended to adopt a case-finding strategy to identify individuals at risk who benefit from the dual-energy x-ray absorptiometry examination.
Several indices have been developed to estimate bone quality in dental panoramic radiographs to identify individuals at risk of osteoporosis, in particular the mandibular cortical width index. Studies suggest that dentists can measure the mandibular cortical width to identify individuals at risk and refer them for bone mineral density testing. However, this endeavor is time-consuming and inconsistent due to the bone's unclear borders and the challenge of determining the mental foramen's position, leading to varying measurements between clinicians. Therefore, the dentistry community is investigating how to automate this process effectively and accurately.
In an attempt to address some of these problems, this thesis presents a method to assess the mandibular cortical width index automatically. Four different object detectors were analyzed to determine the mental foramen's position. EfficientDet showed the highest average precision (0.30). Therefore, it was combined with an iterative procedure to estimate mandibular cortical width. The results are promising.
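The average precision reported above scores each predicted mental-foramen box by its overlap with the ground-truth annotation. A minimal sketch of the standard intersection-over-union criterion that underlies this metric, assuming boxes given as (x1, y1, x2, y2) corner coordinates:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A predicted box half-overlapping a ground-truth box of the same size:
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # prints 0.3333333333333333
```

A detection counts as a true positive only when its IoU with the annotation exceeds a chosen threshold, and average precision summarizes precision over recall under that rule.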
Deep Learning for Text Style Transfer: A Survey
Text style transfer is an important task in natural language generation,
which aims to control certain attributes in the generated text, such as
politeness, emotion, humor, and many others. It has a long history in the field
of natural language processing, and recently has re-gained significant
attention thanks to the promising performance brought by deep neural models. In
this paper, we present a systematic survey of the research on neural text style
transfer, spanning over 100 representative articles since the first neural text
style transfer work in 2017. We discuss the task formulation, existing datasets
and subtasks, evaluation, as well as the rich methodologies in the presence of
parallel and non-parallel data. We also provide discussions on a variety of
important topics regarding the future development of this task. Our curated
paper list is at https://github.com/zhijing-jin/Text_Style_Transfer_Survey. Comment: Computational Linguistics Journal 202
Entanglement of mathematical and computer science research at civilian German universities with modern warfare
Central aspects of modern warfare such as drone attacks, guided missiles, cyber attacks or the use of reconnaissance satellites would not be possible without contemporary mathematical and computer science research. This dissertation revolves around the following questions: What impact does the connection between military application and civil research have on mathematics and computer science? What impact does the research have on society? And what ethical and societal questions arise from that?
Automatic semantic and geometric enrichment of CityGML 3D building models of varying architectural styles with HOG-based template matching
While the number of 3D geo-spatial digital models of buildings with cultural heritage interest is burgeoning, most lack semantic annotation that could be used to inform users of mobile and desktop applications about the architectural features and origins of the buildings. Additionally, while automated reconstruction of 3D building models is an active research area, the labelling of architectural features (objects) is comparatively less well researched, while distinguishing between different architectural styles is less well researched still. Meanwhile, the successful automatic identification of architectural objects, typified by a comparatively less symmetrical or less regular distribution of objects on façades, particularly on older buildings, has so far eluded researchers.
This research has addressed these issues by automating the semantic and geometric enrichment of existing 3D building models by using Histogram of Oriented Gradients (HOG)-based template matching. The methods are applied to the texture maps of 3D building models of 20th century styles, of Georgian-Regency (1715-1830) style and of the Norman (1066 to late 12th century) style, where the amalgam of styles present on buildings of the latter style necessitates detection of styles of the Gothic tradition (late 12th century to present day).
The most successful results were obtained when applying a set of heuristics including the use of real-world dimensions, while a Support Vector Machine (SVM)-based machine learning approach was found effective in obviating the need for thresholds on match scores when making detection decisions.
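The HOG-based template matching used above can be sketched as follows: gradient magnitudes are voted into unsigned-orientation histograms over small cells, the per-cell histograms are normalized and concatenated, and a candidate window is scored against a template by comparing descriptors. This is a simplified single-scale sketch under assumed parameters (8-pixel cells, 9 bins, cosine similarity as the match score), not the thesis's exact pipeline; the function names are illustrative.

```python
import numpy as np

def hog_descriptor(gray, cell=8, bins=9):
    """Minimal HOG: gradient magnitudes voted into unsigned-orientation
    histograms over non-overlapping cells, L2-normalized per cell."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    h, w = gray.shape
    feats = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            idx = (ang[y:y + cell, x:x + cell] / (180.0 / bins)).astype(int) % bins
            hist = np.bincount(idx.ravel(),
                               weights=mag[y:y + cell, x:x + cell].ravel(),
                               minlength=bins)
            norm = np.linalg.norm(hist)
            feats.append(hist / norm if norm > 0 else hist)
    return np.concatenate(feats)

def match_score(window, template_descriptor, **kw):
    """Score a candidate window against a template by cosine similarity
    between their HOG descriptors."""
    d = hog_descriptor(window, **kw)
    denom = np.linalg.norm(d) * np.linalg.norm(template_descriptor)
    return float(d @ template_descriptor / denom) if denom > 0 else 0.0
```

Because HOG describes local edge structure rather than raw intensity, the same template can match an architectural object (a window, a doorway) across the texture maps of differently lit façades; the SVM mentioned above would replace a fixed threshold on this score.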