11 research outputs found

    Methods for Interpreting and Understanding Deep Neural Networks

    This paper provides an entry point to the problem of interpreting a deep neural network model and explaining its predictions. It is based on a tutorial given at ICASSP 2017. It introduces some recently proposed techniques of interpretation, along with theory, tricks and recommendations, to make the most efficient use of these techniques on real data. It also discusses a number of practical applications. Comment: 14 pages, 10 figures

    CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

    We present the content deformation field CoDeF as a new type of video representation, which consists of a canonical content field aggregating the static contents in the entire video and a temporal deformation field recording the transformations from the canonical image (i.e., rendered from the canonical content field) to each individual frame along the time axis. Given a target video, these two fields are jointly optimized to reconstruct it through a carefully tailored rendering pipeline. We advisedly introduce some regularizations into the optimization process, urging the canonical content field to inherit semantics (e.g., the object shape) from the video. With such a design, CoDeF naturally supports lifting image algorithms for video processing, in the sense that one can apply an image algorithm to the canonical image and effortlessly propagate the outcomes to the entire video with the aid of the temporal deformation field. We experimentally show that CoDeF is able to lift image-to-image translation to video-to-video translation and lift keypoint detection to keypoint tracking without any training. More importantly, thanks to our lifting strategy that deploys the algorithms on only one image, we achieve superior cross-frame consistency in processed videos compared to existing video-to-video translation approaches, and even manage to track non-rigid objects like water and smog. Project page can be found at https://qiuyu96.github.io/CoDeF/. Comment: Project Webpage: https://qiuyu96.github.io/CoDeF/, Code: https://github.com/qiuyu96/CoDe

    LemurFaceID: a face recognition system to facilitate individual identification of lemurs

    Background: Long-term research of known individuals is critical for understanding the demographic and evolutionary processes that influence natural populations. Current methods for individual identification of many animals include capture and tagging techniques and/or researcher knowledge of natural variation in individual phenotypes. These methods can be costly, time-consuming, and may be impractical for larger-scale, population-level studies. Accordingly, for many animal lineages, long-term research projects are often limited to only a few taxa. Lemurs, a mammalian lineage endemic to Madagascar, are no exception. Long-term data needed to address evolutionary questions are lacking for many species. This is, at least in part, due to difficulties collecting consistent data on known individuals over long periods of time. Here, we present a new method for individual identification of lemurs (LemurFaceID). LemurFaceID is a computer-assisted facial recognition system that can be used to identify individual lemurs based on photographs. Results: LemurFaceID was developed using patch-wise Multiscale Local Binary Pattern features and modified facial image normalization techniques to reduce the effects of facial hair and variation in ambient lighting on identification. We trained and tested our system using images from wild red-bellied lemurs (Eulemur rubriventer) collected in Ranomafana National Park, Madagascar. Across 100 trials, with different partitions of training and test sets, we demonstrate that LemurFaceID can achieve 98.7% ± 1.81% accuracy (using 2-query image fusion) in correctly identifying individual lemurs. Conclusions: Our results suggest that human facial recognition techniques can be modified for identification of individual lemurs based on variation in facial patterns. LemurFaceID was able to identify individual lemurs based on photographs of wild individuals with a relatively high degree of accuracy. This technology would remove many limitations of traditional methods for individual identification. Once optimized, our system can facilitate long-term research of known individuals by providing a rapid, cost-effective, and accurate method for individual identification.
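For readers unfamiliar with the Local Binary Pattern features named in the LemurFaceID abstract, the following is a minimal sketch of the basic 3x3 LBP operator only. It is not the authors' patch-wise multiscale pipeline; the function names `lbp_code` and `lbp_histogram`, the neighbour ordering, and the `>=` comparison are illustrative assumptions.

```python
# Basic 8-neighbour Local Binary Pattern (LBP) on a grey-level image stored
# as a list of lists. Illustrative only: names and bit order are assumptions,
# and this is not the multiscale, patch-wise variant used by LemurFaceID.

# Neighbour offsets, visited clockwise starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of an interior pixel."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram (256 bins) of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

In a patch-wise scheme, a histogram like this would be computed per image patch and the histograms concatenated into one feature vector; the patch layout and scales are not specified in the abstract.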

    Automatic detection of the mental foramen for estimating mandibular cortical width in dental panoramic radiographs

    Screening tests are vital for detecting diseases, especially at early stages, where efforts can prevent further illness. For example, osteoporosis is a systemic skeletal disease characterized by low bone mass and microarchitectural deterioration of bone tissue, resulting in bone fragility and susceptibility to fracture. Dual-energy x-ray absorptiometry is commonly used to diagnose osteoporosis since it evaluates bone mineral density. It is the standard method for diagnosing osteoporosis, but it is not immediately available and is commonly used for research due to the high capital cost. Further, dual-energy x-ray absorptiometry is not used for population-based screening due to its suboptimal ability to predict hip fractures based on measurements. Therefore, it is recommended to adopt a case-finding strategy to identify individuals at risk who benefit from the dual-energy x-ray absorptiometry examination. Several indices have been developed to estimate bone quality in dental panoramic radiographs to identify individuals at risk of osteoporosis, in particular the mandibular cortical width index. Studies suggest that dentists can measure the mandibular cortical width to identify individuals at risk and refer them for bone mineral density testing. However, this endeavor is time-consuming and inconsistent due to the bone's unclear borders and the challenge of determining the mental foramen's position, leading to varying measurements between clinicians. Therefore, the dentistry community is investigating how to automate this process effectively and accurately. In an attempt to address some of these problems, this thesis presents a method to assess the mandibular cortical width index automatically. Four different object detectors were analyzed to determine the mental foramen's position. EfficientDet showed the highest average precision (0.30). Therefore, it was combined with an iterative procedure to estimate mandibular cortical width. The results are promising.

    Deep Learning for Text Style Transfer: A Survey

    Text style transfer is an important task in natural language generation, which aims to control certain attributes in the generated text, such as politeness, emotion, humor, and many others. It has a long history in the field of natural language processing, and recently has re-gained significant attention thanks to the promising performance brought by deep neural models. In this paper, we present a systematic survey of the research on neural text style transfer, spanning over 100 representative articles since the first neural text style transfer work in 2017. We discuss the task formulation, existing datasets and subtasks, evaluation, as well as the rich methodologies in the presence of parallel and non-parallel data. We also provide discussions on a variety of important topics regarding the future development of this task. Our curated paper list is at https://github.com/zhijing-jin/Text_Style_Transfer_Survey. Comment: Computational Linguistics Journal 202

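    Verquickung der mathematischen und informatischen Forschung an zivilen deutschen Hochschulen mit der modernen Kriegsführung (Entanglement of mathematical and computer science research at civilian German universities with modern warfare)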

    Central aspects of modern warfare such as drone attacks, guided missiles, cyber attacks or the use of reconnaissance satellites would not be possible without contemporary mathematical and computer science research. This dissertation revolves around the following questions: What impact does the connection between military application and civil research have on mathematics and computer science? What impact does the research have on society? And what ethical and societal questions arise from that?

    Big data-driven multimodal traffic management : trends and challenges


    Automatic semantic and geometric enrichment of CityGML 3D building models of varying architectural styles with HOG-based template matching

    While the number of 3D geo-spatial digital models of buildings with cultural heritage interest is burgeoning, most lack semantic annotation that could be used to inform users of mobile and desktop applications about the architectural features and origins of the buildings. Additionally, while automated reconstruction of 3D building models is an active research area, the labelling of architectural features (objects) is comparatively less well researched, while distinguishing between different architectural styles is less well researched still. Meanwhile, the successful automatic identification of architectural objects, typified by a comparatively less symmetrical or less regular distribution of objects on façades, particularly on older buildings, has so far eluded researchers. This research has addressed these issues by automating the semantic and geometric enrichment of existing 3D building models by using Histogram of Oriented Gradients (HOG)-based template matching. The methods are applied to the texture maps of 3D building models of 20th-century styles, of Georgian-Regency (1715-1830) style and of the Norman (1066 to late 12th century) style, where the amalgam of styles present on buildings of the latter style necessitates detection of styles of the Gothic tradition (late 12th century to present day). The most successful results were obtained when applying a set of heuristics including the use of real-world dimensions, while a Support Vector Machine (SVM)-based machine learning approach was found effective in obviating the need for thresholds on match scores when making detection decisions.