
    Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition

    Recurrent Neural Networks (RNNs) are powerful sequence modeling tools. However, when dealing with high-dimensional inputs, training RNNs becomes computationally expensive due to the large number of model parameters. This hinders RNNs from solving many important computer vision tasks, such as Action Recognition in Videos and Image Captioning. To overcome this problem, we propose a compact and flexible structure, namely Block-Term tensor decomposition, which greatly reduces the number of parameters in RNNs and improves their training efficiency. Compared with alternative low-rank approximations such as the tensor-train RNN (TT-RNN), our method, the Block-Term RNN (BT-RNN), is not only more concise (when using the same rank) but also able to attain a better approximation to the original RNN with far fewer parameters. On three challenging tasks, namely Action Recognition in Videos, Image Captioning, and Image Generation, BT-RNN outperforms TT-RNN and the standard RNN in terms of both prediction accuracy and convergence rate. Specifically, BT-LSTM utilizes 17,388 times fewer parameters than the standard LSTM to achieve an accuracy improvement of over 15.6% on the Action Recognition task on the UCF11 dataset. (Comment: CVPR 2018)
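As a rough illustration of why a Block-Term factorization shrinks the parameter count, the NumPy sketch below rebuilds a weight matrix from a small sum of Tucker blocks and compares its parameter count with the dense matrix. The mode sizes, rank, and number of blocks are illustrative assumptions, not the paper's settings, and the code is a shape-level sketch rather than the BT-RNN implementation.

```python
import numpy as np

# Illustrative (hypothetical) sizes: a dense layer mapping I = 16*16*16 = 4096
# inputs to J = 8*8*8 = 512 hidden units, tensorized over d = 3 modes.
I_modes, J_modes = (16, 16, 16), (8, 8, 8)
d = len(I_modes)
N, R = 4, 2                      # number of Tucker blocks and rank per mode (assumed)

def block_term_params(I_modes, J_modes, N, R):
    """Parameter count of a Block-Term (sum-of-Tucker) factorization."""
    factors = N * R * sum(Ik * Jk for Ik, Jk in zip(I_modes, J_modes))
    cores = N * R ** len(I_modes)
    return factors + cores

def reconstruct(factors, cores, I_modes, J_modes):
    """Rebuild the full weight matrix from the blocks (shape check only)."""
    I, J = int(np.prod(I_modes)), int(np.prod(J_modes))
    W = np.zeros((I, J))
    for A_n, G_n in zip(factors, cores):
        block = G_n
        for A in A_n:                               # contract one rank axis per mode
            block = np.tensordot(block, A, axes=([0], [1]))
        # block has shape (I1*J1, ..., Id*Jd); split modes and interleave to (I, J).
        block = block.reshape([s for Ik, Jk in zip(I_modes, J_modes) for s in (Ik, Jk)])
        perm = list(range(0, 2 * d, 2)) + list(range(1, 2 * d, 2))
        W += block.transpose(perm).reshape(I, J)
    return W

rng = np.random.default_rng(0)
factors = [[rng.standard_normal((Ik * Jk, R)) for Ik, Jk in zip(I_modes, J_modes)]
           for _ in range(N)]
cores = [rng.standard_normal((R,) * d) for _ in range(N)]

dense = int(np.prod(I_modes)) * int(np.prod(J_modes))
compact = block_term_params(I_modes, J_modes, N, R)
print(f"dense: {dense:,}  block-term: {compact:,}  saving: {dense / compact:.0f}x")
print("reconstructed weight shape:", reconstruct(factors, cores, I_modes, J_modes).shape)
```

With these toy sizes the factorization already replaces roughly two million dense weights with a few thousand block-term parameters, which is the effect the abstract describes at much larger scale.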

    In Vivo Evaluation of the Nitroimidazole-Based Thioflavin-T Derivatives as Cerebral Ischemia Markers

    Timely imaging and accurate interpretation of cerebral ischemia are required to identify patients who might benefit from more aggressive therapy, and nuclear medicine offers a noninvasive method for demonstrating cerebral ischemia. Three nitroimidazole-based thioflavin-T derivatives, N-[4-(benzothiazol-2-yl)phenyl]-3-(4-nitroimidazole-1-yl)propanamide (4NPBTA), N-[4-(benzothiazol-2-yl)phenyl]-3-(4-nitroimidazole-1-yl)-N-methylpropanamide (4NPBTA-1), and N-[4-(benzothiazol-2-yl)phenyl]-3-(2-nitroimidazole-1-yl)propanamide (2NPBTA), were radioiodinated and evaluated as possible cerebral ischemia markers. In normal mice, these compounds showed good permeation of the intact blood-brain barrier (BBB), high initial brain uptake, and rapid washout. In gerbil stroke models subjected to right common carotid artery ligation to produce cerebral ischemia, uptake of [131I]2NPBTA in the right cerebral hemisphere decreased more slowly than that in the left, and the right/left hemisphere uptake ratios increased with time. The right/left hemisphere uptake ratios also correlated positively with the severity of the stroke. These results showed that [131I]2NPBTA localized specifically in cerebral ischemic tissue. This represents a first step in finding new drugs and points to a possible cerebral ischemia marker.

    Scalable nonparametric multiway data analysis

    Multiway data analysis deals with multiway arrays, i.e., tensors, and its goal is twofold: predicting missing entries by modeling the interactions between array elements, and discovering hidden patterns, such as clusters or communities, in each mode. Despite the success of existing tensor factorization approaches, they are either unable to capture nonlinear interactions or too computationally expensive to handle massive data. In addition, most existing methods lack a principled way to discover latent clusters, which is important for better understanding of the data. To address these issues, we propose a scalable nonparametric tensor decomposition model. It employs a Dirichlet process mixture (DPM) prior to model the latent clusters and uses local Gaussian processes (GPs) to capture nonlinear relationships and to improve scalability. An efficient online variational Bayes expectation-maximization algorithm is proposed to learn the model. Experiments on both synthetic and real-world data show that the proposed model discovers latent clusters and achieves higher prediction accuracy than competing methods. Furthermore, the proposed model obtains significantly better predictive performance than the state-of-the-art large-scale tensor decomposition algorithm, GigaTensor, on two large datasets with billions of entries.
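The full model (DPM clusters plus local GPs learned with online variational Bayes EM) is beyond a short example, so the sketch below substitutes a deliberately simplified stand-in: a plain CP factorization fitted by stochastic gradient descent on observed entries only, just to illustrate the underlying idea of predicting missing tensor entries from per-mode latent factors. The rank, learning rate, tensor sizes, and observation density are arbitrary assumptions.

```python
import numpy as np

# Simplified stand-in: rank-3 CP factorization of a 3-way tensor, fitted by SGD
# on a sparse set of observed entries (the paper's nonparametric model is far richer).
rng = np.random.default_rng(0)
shape, rank, lr, epochs = (30, 25, 20), 3, 0.01, 150

# Synthetic ground-truth factors and a ~10% sample of observed entries.
true = [rng.standard_normal((n, rank)) for n in shape]
obs = {(i, j, k): float(np.sum(true[0][i] * true[1][j] * true[2][k]))
       for i in range(shape[0]) for j in range(shape[1]) for k in range(shape[2])
       if rng.random() < 0.1}

U = [0.1 * rng.standard_normal((n, rank)) for n in shape]   # factors to learn
for _ in range(epochs):
    for (i, j, k), y in obs.items():
        err = np.sum(U[0][i] * U[1][j] * U[2][k]) - y       # squared-error gradient step
        gi = err * U[1][j] * U[2][k]
        gj = err * U[0][i] * U[2][k]
        gk = err * U[0][i] * U[1][j]
        U[0][i] -= lr * gi
        U[1][j] -= lr * gj
        U[2][k] -= lr * gk

i, j, k = 5, 7, 3                    # an arbitrary entry, observed or not
print("predicted:", np.sum(U[0][i] * U[1][j] * U[2][k]),
      "actual:", np.sum(true[0][i] * true[1][j] * true[2][k]))
```

The paper's contribution is what this sketch leaves out: nonparametric clustering of the factors and GP-based nonlinear interactions, learned online so that the method scales to billions of entries.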

    Efficient Phytase Secretion and Phytate Degradation by Recombinant Bifidobacterium longum JCM 1217

    Genetic engineering of probiotics, such as bifidobacteria, may improve their economy as microbial cell factories. This work designed a novel shuttle plasmid, pBPES, which bears exogenous appA and is stable within Bifidobacterium longum JCM 1217. Cloning three predicted promoters into pBPES showed that all of them drive appA expression in B. longum JCM 1217. Transformation of the plasmids pBPES-tu and pBPES-groEL into B. longum JCM 1217 resulted in much greater phytase secretion, suggesting that Ptu and PgroEL are strong promoters. Further in vitro and in vivo experiments suggested that B. longum JCM 1217/pBPES-tu degrades phytate efficiently. In conclusion, this study identified two strong promoters and constructed a recombinant live probiotic strain for effective phytase secretion and phytate degradation in the gut. The strategy used in this study provides a novel technique for improving the bioaccessibility of phytate and decreasing phosphorus excretion.

    Layout-aware mixture models for patch-based image representation and analysis

    Image and video representation and modeling is an important topic in computer vision and image processing. An image model provides an abstraction of the large amount of data contained in an image and enables the systematic development of algorithms for accomplishing a particular image-related task, such as detection, recognition, and segmentation (analysis) as well as inpainting, summarization, and colorization (synthesis). Since an image usually comprises millions of pixels, developing models in such a high-dimensional space is not always feasible. One of the most popular ways of modeling images is to break them into patches: not only is the dimensionality reduced, but similarities between patches are also easier to define than similarities between whole images, since patches experience less distortion. Patch-based image models are often more flexible in modeling appearance because they exploit redundancies in images and videos. By adjusting the patch size, these models trade off the good qualities of each end of the spectrum: the discriminative power of whole images and the representational power of pixel histograms. When breaking an image into a collection of patches, one must model two kinds of information to describe the image completely. On one hand, the patch appearance must be captured with some statistical model; on the other hand, some other statistics must describe how the patches are organized within the image. We call the first the "appearance model" and the second the "layout model". This thesis traces the progress made in the past decade, starting from patch-based appearance models that ignore layout information and showing how spatial modeling improves performance and enables analysis tasks such as recognition, detection, and segmentation, as well as synthesis tasks such as colorization, illustrated through our work of the past three years. The thesis proposes both a discriminative and a generative formulation for describing patch layouts. The algorithm developed upon the discriminative framework achieves state-of-the-art results on the joint detection and subcategory recognition problem. Algorithms developed for these models are also discussed, with results and examples.
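A minimal sketch of the "appearance model" half of this picture: extract fixed-size patches and fit a Gaussian mixture over their pixel values, deliberately ignoring layout. The patch size, stride, and number of mixture components are illustrative assumptions, and the thesis's layout-aware models add precisely the spatial statistics this sketch omits.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def extract_patches(img, size=8, stride=4):
    """Collect flattened size x size patches from a 2-D grayscale array."""
    h, w = img.shape
    patches = [img[r:r + size, c:c + size].ravel()
               for r in range(0, h - size + 1, stride)
               for c in range(0, w - size + 1, stride)]
    return np.stack(patches)

# A synthetic "image" stands in for real data.
rng = np.random.default_rng(0)
img = rng.random((128, 128))

X = extract_patches(img)
appearance = GaussianMixture(n_components=16, covariance_type="diag",
                             random_state=0).fit(X)

# Per-patch cluster assignments; a layout model would add statistics on *where*
# each cluster tends to occur, which this sketch leaves out.
labels = appearance.predict(X)
print("patches:", X.shape, "cluster histogram:", np.bincount(labels, minlength=16))
```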

    Epitome and its applications

    Due to its lack of explicit spatial modeling, the existing epitome model may fail at image recognition and target detection, which directly motivates the spatialized epitome proposed in this thesis. Extending the original simple graphical model of the epitome, the spatialized epitome provides a general framework that integrates both the appearance and the spatial arrangement of patches in an image, achieving a more precise likelihood representation for images and eliminating ambiguities in image reconstruction and recognition. From the extended graphical model, a new EM learning procedure is derived under the framework of variational approximation. The learning procedure generates an optimized patch-based summary of the image appearance and automatically clusters the spatial distribution of similar patches. From the spatialized epitome, we present a principled (parameter-free) way of inferring the probability of a new input image under the learned model, thereby enabling image recognition and target detection. We show how the incorporation of spatial information enhances the epitome's discriminative ability on several challenging vision tasks, e.g., misaligned/cross-pose face recognition and vehicle detection with few training samples. We also apply this model to image colorization, which not only increases the visual appeal of grayscale images but also enriches the information contained in scientific images that lack color. Most existing colorization methods require laborious user interaction for scribbles or image segmentation. To eliminate the need for human labor, we develop an automatic image colorization method using the epitome. Built upon a generative graphical model, the epitome is a condensed image appearance and shape model, which also proves to be an effective summary of color information for the colorization task. We train the epitome from reference images and perform inference in the epitome to colorize grayscale images, obtaining better colorization results than previous methods.
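As a loose illustration of example-based colorization, and not the epitome inference described above, the sketch below copies colors from the reference patch whose luminance best matches each grayscale patch. It is a nearest-neighbor stand-in, with assumed patch sizes and toy images in place of real reference data.

```python
import numpy as np

def colorize_nn(gray, ref_rgb, size=5):
    """Simplified patch-based colorization: for each grayscale patch, find the
    reference patch with the closest luminance and copy its colors."""
    ref_gray = ref_rgb.mean(axis=2)
    ref_patches, ref_colors = [], []
    h, w = ref_gray.shape
    for r in range(0, h - size + 1, size):
        for c in range(0, w - size + 1, size):
            ref_patches.append(ref_gray[r:r + size, c:c + size].ravel())
            ref_colors.append(ref_rgb[r:r + size, c:c + size])
    ref_patches = np.stack(ref_patches)

    out = np.zeros(gray.shape + (3,))
    H, W = gray.shape
    for r in range(0, H - size + 1, size):
        for c in range(0, W - size + 1, size):
            patch = gray[r:r + size, c:c + size].ravel()
            best = np.argmin(((ref_patches - patch) ** 2).sum(axis=1))
            out[r:r + size, c:c + size] = ref_colors[best]
    return out

rng = np.random.default_rng(0)
reference = rng.random((60, 60, 3))           # hypothetical color reference image
target = rng.random((60, 60))                 # hypothetical grayscale image
print(colorize_nn(target, reference).shape)   # (60, 60, 3)
```

The epitome replaces this brute-force patch matching with inference in a learned generative model, which is what yields the smoother, more consistent colorizations reported above.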

    Modeling of visual patterns

    Natural images reveal an overwhelming number of visual patterns arising from objects and scenes in nature. The human visual system identifies and recognizes scene images or objects in images based on their visual patterns. Modeling these visual patterns is of fundamental importance for generic vision tasks, such as perceptual organization, segmentation, and recognition, and successful implementations of these tasks enable applications in medical imaging, video surveillance, media analysis, human-computer interaction, and many other fields. In the literature, visual patterns are modeled by many different families of methods. This report delves into both descriptive and generative methods, each with a different emphasis on visual pattern modeling. One descriptive model, the Gabor filter bank, is investigated for the modeling of texture patterns in chapter 2, because Gabor filter responses are believed to yield sufficient statistics that best describe a texture pattern. Furthermore, a novel strategy to make this model rotation- and scale-invariant is proposed in that chapter. A generative model for local visual patterns is investigated in chapter 3, where the emphasis is on graphical models, which describe the topology of the components of a complex probability model, clarify assumptions about the representation, and lead to algorithms that exploit that topology to increase speed and accuracy. In chapter 3 the goal is therefore to find a generative model that best fits the data, rather than to extract the most representative statistics as in chapter 2. A new way of modeling a class of local patterns, by summarizing a collection of images of the visual pattern, is proposed in that chapter. By exploring global texture patterns and local visual patterns with their respective modeling methods, this report gives an introductory overview of what a visual pattern is, how global texture patterns differ from local visual patterns, and how they can be modeled in different ways. On top of that, two new modeling methods for different types of patterns are presented in this report.
    Bachelor of Engineering
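A minimal sketch of a Gabor filter bank used as a texture descriptor, in the spirit of chapter 2: responses at several orientations and wavelengths are summarized by their means and standard deviations. The kernel size, wavelengths, and the suggestion to pool statistics across orientations for approximate rotation invariance are illustrative assumptions, not the chapter's exact design.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a Gabor filter at orientation `theta` (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (np.exp(-(xr**2 + gamma**2 * yr**2) / (2 * sigma**2))
            * np.cos(2 * np.pi * xr / wavelength))

def texture_descriptor(img, orientations=6, wavelengths=(4, 8, 16)):
    """Mean and std of each filter response; averaging the statistics across
    orientations would give an approximately rotation-invariant variant."""
    feats = []
    for lam in wavelengths:
        for k in range(orientations):
            kern = gabor_kernel(size=21, wavelength=lam,
                                theta=k * np.pi / orientations, sigma=0.5 * lam)
            resp = fftconvolve(img, kern, mode="same")
            feats += [resp.mean(), resp.std()]
    return np.array(feats)

rng = np.random.default_rng(0)
texture = rng.random((96, 96))                # stand-in for a real texture image
print(texture_descriptor(texture).shape)      # (36,): 2 stats x 6 orientations x 3 scales
```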