271 research outputs found
Discriminative latent variable models for visual recognition
Visual recognition is a central problem in computer vision, and it has numerous potential applications in many different fields, such as robotics, human-computer interaction, and entertainment. In this dissertation, we propose two discriminative latent variable models for handling challenging visual recognition problems. In particular, we use latent variables to capture and model various prior knowledge in the training data. In the first model, we address the problem of recognizing human actions from still images. We jointly consider both poses and actions in a unified framework, and treat human poses as latent variables. The learning of this model follows the framework of latent SVM. Secondly, we propose another latent variable model to address the problem of automated tag learning on YouTube videos. In particular, we address the semantic variations (sub-tags) of the videos which have the same tag. In the model, each video is assumed to be associated with a sub-tag label, and we treat this sub-tag label as latent information. This model is trained using a latent learning framework based on LogitBoost, which jointly considers both the latent sub-tag label and the tag label. Moreover, we propose a novel discriminative latent learning framework, kernel latent SVM, which combines the benefit of latent SVM and kernel methods. The framework of kernel latent SVM is general enough to be applied in many applications of visual recognition. It is also able to handle complex latent variables with interdependent structures using composite kernels.
Pose Embeddings: A Deep Architecture for Learning to Match Human Poses
We present a method for learning an embedding that places images of humans in
similar poses nearby. This embedding can be used as a direct method of
comparing images based on human pose, avoiding potential challenges of
estimating body joint positions. Pose embedding learning is formulated under a
triplet-based distance criterion. A deep architecture is used to allow learning
of a representation capable of making distinctions between different poses.
Experiments on human pose matching and retrieval from video data demonstrate
the potential of the method.
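As a rough illustration of the triplet-based distance criterion mentioned in this abstract, the sketch below uses the common hinge-style triplet loss on plain Python lists. The Euclidean distance, the margin value, and the toy two-dimensional "embeddings" are assumptions made for illustration; the abstract does not state the exact loss or architecture used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form, not taken from the paper).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`; otherwise it grows linearly.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy embeddings (hypothetical values): a similar pose lies near the anchor,
# a different pose lies far away.
anchor = [0.0, 0.0]
similar_pose = [0.0, 1.0]
different_pose = [3.0, 4.0]

loss_satisfied = triplet_loss(anchor, similar_pose, different_pose)
loss_violated = triplet_loss(anchor, different_pose, similar_pose)
```

In this toy case the first triplet already satisfies the margin, so it contributes no loss, while the swapped triplet is penalized. In a trained embedding, the vectors would come from the deep network rather than being fixed by hand.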
VIDEO THUMBNAIL SELECTION BASED ON DEEP LEARNING
Video thumbnails are often the first thing a viewer sees when browsing or searching for videos. A frame that is visually representative of the video is typically selected and used as a thumbnail representation of the video. Sometimes, such a thumbnail is not an adequate semantic representation of the video. Further, it is possible that such a thumbnail is not visually pleasing. This disclosure describes deep learning techniques to select video thumbnails that are visually attractive and reflect the content of a video. Thumbnails as described in this disclosure are attractive, improve the likelihood of user selection, and help users find relevant content easily.
A Dimension-Augmented Physics-Informed Neural Network (DaPINN) with High Level Accuracy and Efficiency
Physics-informed neural networks (PINNs) have been widely applied in
different fields due to their effectiveness in solving partial differential
equations (PDEs). However, the accuracy and efficiency of PINNs need to be
considerably improved for scientific and commercial use. To address this issue,
we systematically propose a novel dimension-augmented physics-informed neural
network (DaPINN), which simultaneously and significantly improves the accuracy
and efficiency of the PINN. In the DaPINN model, we introduce inductive bias in
the neural network to enhance network generalizability by adding a special
regularization term to the loss function. Furthermore, we manipulate the
network input dimension by inserting additional sample features and
incorporating the expanded dimensionality in the loss function. Moreover, we
verify the effectiveness of power series augmentation, Fourier series
augmentation and replica augmentation, in both forward and backward problems.
In most experiments, the error of DaPINN is 1 to 2 orders of magnitude lower
than that of PINN. The results show that the DaPINN outperforms the original
PINN in terms of both accuracy and efficiency with a reduced dependence on the
number of sample points. We also discuss the complexity of the DaPINN and its
compatibility with other methods.
Comment: 33 pages, 12 figures
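The three input augmentations named in this abstract can be pictured as feature maps applied to each sample coordinate before it enters the network. The sketch below is only one plausible reading: the function names, the choice of powers and frequencies, and the number of replicas are all assumptions, and the abstract does not specify how the expanded dimensions enter the loss function.

```python
import math


def power_series_features(x, order=3):
    """Assumed power series augmentation: x -> [x, x^2, ..., x^order]."""
    return [x ** k for k in range(1, order + 1)]


def fourier_features(x, n_freq=2):
    """Assumed Fourier series augmentation: x -> [x, sin(kx), cos(kx), ...]."""
    feats = [x]
    for k in range(1, n_freq + 1):
        feats += [math.sin(k * x), math.cos(k * x)]
    return feats


def replica_features(x, copies=3):
    """Assumed replica augmentation: x -> [x, x, ..., x]."""
    return [x] * copies


# Each map turns a scalar coordinate into a higher-dimensional network input.
power_input = power_series_features(2.0, order=3)
fourier_input = fourier_features(0.0, n_freq=1)
replica_input = replica_features(1.5, copies=3)
```

Each function raises the input dimension of a one-dimensional problem, which is the only property this sketch is meant to show.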
Influence of concentration-dependent material properties on the fracture and debonding of electrode particles with core–shell structure
Core–shell electrode particle designs offer a route to improved lithium-ion battery performance. However, they are susceptible to mechanical damage such as fracture and debonding, which can significantly reduce their lifetime. Using a coupled finite element model, we explore the impacts of diffusion-induced stresses on the failure mechanisms of an exemplar system with an NMC811 core and an NMC111 shell. In particular, we systematically compare the implications of assuming constant material properties against using Li concentration-dependent diffusion coefficient and partial molar volume. With constant material properties, our results show that smaller cores with thinner shells avoid debonding and fracture regimes. When factoring in a concentration-dependent partial molar volume, the maximum values of tensile hoop stress in the shell are found to be significantly lower than those predicted with constant properties, reducing the likelihood of fracture. Furthermore, with a concentration-dependent diffusion coefficient, significant barriers to full electrode utilisation are observed due to reduced lithium mobility at high states of lithiation. This provides a possible explanation for the reduced accessible capacity observed in experiments. Shell thickness is found to be the dominant factor in precluding structural integrity once the concentration dependency is accounted for. These findings shed new light on the performance and effective design of core–shell electrode particles.