Search CORE

977 research outputs found

Multimodal sequential fashion attribute prediction

Author: Anbarjafari Gholamreza
Arslan Hasan Sait
Fishel Mark
Sirts Kairit
Publication venue: 'MDPI AG'
Publication date: 01/10/2019
Field of study

We address multimodal product attribute prediction of fashion items based on product images and titles. The product attributes, such as type, sub-type, cut or fit, are in a chain format, with previous attribute values constraining the values of the next attributes. We propose to address this task with a sequential prediction model that can learn to capture the dependencies between the different attribute values in the chain. Our experiments on three product datasets show that the sequential model outperforms two non-sequential baselines on all experimental datasets. Compared to other models, the sequential model is also better able to generate sequences of attribute chains not seen during training. We also measure the contributions of both image and textual input and show that while text-only models always outperform image-only models, only the multimodal sequential model combining both image and text improves over the text-only model on all experimental dataset

Multidisciplinary Digital Publishing Institute

DSpace@HKU

Facial Expression Recognition from World Wild Web

Author: Abdollahi Hojjat
Chan David
Hassani Behzad
Mahoor Mohammad H.
Mollahosseini Ali
Salvador Michelle J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/01/2017
Field of study

Recognizing facial expression in a wild setting has remained a challenging task in computer vision. The World Wide Web is a good source of facial images which most of them are captured in uncontrolled conditions. In fact, the Internet is a Word Wild Web of facial images with expressions. This paper presents the results of a new study on collecting, annotating, and analyzing wild facial expressions from the web. Three search engines were queried using 1250 emotion related keywords in six different languages and the retrieved images were mapped by two annotators to six basic expressions and neutral. Deep neural networks and noise modeling were used in three different training scenarios to find how accurately facial expressions can be recognized when trained on noisy images collected from the web using query terms (e.g. happy face, laughing man, etc)? The results of our experiments show that deep neural networks can recognize wild facial expressions with an accuracy of 82.12%

arXiv.org e-Print Archive

Crossref

Artificial-intelligence-based molecular classification of diffuse gliomas using rapid, label-free optical imaging

Author: Aabedi Alexander
Adapa Arjun
Al-Holou Wajd
Berger Mitchel S.
Camelo-Piragua Sandra
Castro Maria
Chowdury Asadur
Freudiger Christian
Golfinos John G.
Hervey-Jumper Shawn L.
Heth Jason
Hollon Todd C.
Jiang Cheng
Kondepudi Akhil
Lee Honglak
Lowenstein Pedro
Nasir-Moin Mustafa
Neuschmelting Volker
Orringer Daniel A.
Reinecke David
Sagher Oren
Snuderl Matija
von Spreckelsen Niklas
Wadiura Lisa Irina
Widhalm Georg
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/03/2023
Field of study

Molecular classification has transformed the management of brain tumors by enabling more accurate prognostication and personalized treatment. However, timely molecular diagnostic testing for patients with brain tumors is limited, complicating surgical and adjuvant treatment and obstructing clinical trial enrollment. In this study, we developed DeepGlioma, a rapid (

< 90

seconds), artificial-intelligence-based diagnostic screening system to streamline the molecular diagnosis of diffuse gliomas. DeepGlioma is trained using a multimodal dataset that includes stimulated Raman histology (SRH); a rapid, label-free, non-consumptive, optical imaging method; and large-scale, public genomic data. In a prospective, multicenter, international testing cohort of patients with diffuse glioma (

n=153

) who underwent real-time SRH imaging, we demonstrate that DeepGlioma can predict the molecular alterations used by the World Health Organization to define the adult-type diffuse glioma taxonomy (IDH mutation, 1p19q co-deletion and ATRX mutation), achieving a mean molecular classification accuracy of

93.3\pm 1.6\%

. Our results represent how artificial intelligence and optical histology can be used to provide a rapid and scalable adjunct to wet lab methods for the molecular screening of patients with diffuse glioma.Comment: Paper published in Nature Medicin

arXiv.org e-Print Archive

Support matrix machine: A review

Author: Akhtar Mushir
Kumari Anuradha
Shah Rupal
Tanveer M.
Publication venue
Publication date: 30/10/2023
Field of study

Support vector machine (SVM) is one of the most studied paradigms in the realm of machine learning for classification and regression problems. It relies on vectorized input data. However, a significant portion of the real-world data exists in matrix format, which is given as input to SVM by reshaping the matrices into vectors. The process of reshaping disrupts the spatial correlations inherent in the matrix data. Also, converting matrices into vectors results in input data with a high dimensionality, which introduces significant computational complexity. To overcome these issues in classifying matrix input data, support matrix machine (SMM) is proposed. It represents one of the emerging methodologies tailored for handling matrix input data. The SMM method preserves the structural information of the matrix data by using the spectral elastic net property which is a combination of the nuclear norm and Frobenius norm. This article provides the first in-depth analysis of the development of the SMM model, which can be used as a thorough summary by both novices and experts. We discuss numerous SMM variants, such as robust, sparse, class imbalance, and multi-class classification models. We also analyze the applications of the SMM model and conclude the article by outlining potential future research avenues and possibilities that may motivate academics to advance the SMM algorithm

arXiv.org e-Print Archive

Joint Intermodal and Intramodal Label Transfers for Extremely Rare or Unseen Classes

Author: Aggarwal Charu
Huang Thomas
Liu Wei
Qi Guo-Jun
Publication venue
Publication date: 22/03/2017
Field of study

In this paper, we present a label transfer model from texts to images for image classification tasks. The problem of image classification is often much more challenging than text classification. On one hand, labeled text data is more widely available than the labeled images for classification tasks. On the other hand, text data tends to have natural semantic interpretability, and they are often more directly related to class labels. On the contrary, the image features are not directly related to concepts inherent in class labels. One of our goals in this paper is to develop a model for revealing the functional relationships between text and image features as to directly transfer intermodal and intramodal labels to annotate the images. This is implemented by learning a transfer function as a bridge to propagate the labels between two multimodal spaces. However, the intermodal label transfers could be undermined by blindly transferring the labels of noisy texts to annotate images. To mitigate this problem, we present an intramodal label transfer process, which complements the intermodal label transfer by transferring the image labels instead when relevant text is absent from the source corpus. In addition, we generalize the inter-modal label transfer to zero-shot learning scenario where there are only text examples available to label unseen classes of images without any positive image examples. We evaluate our algorithm on an image classification task and show the effectiveness with respect to the other compared algorithms.Comment: The paper has been accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence. It will apear in a future issu

arXiv.org e-Print Archive

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Learning recurrent representations for hierarchical behavior modeling

Author: Branson Kristin
Eyjolfsdottir Eyrun
Perona Pietro
Yue Yisong
Publication venue
Publication date: 15/11/2016
Field of study

We propose a framework for detecting action patterns from motion sequences and modeling the sensory-motor relationship of animals, using a generative recurrent neural network. The network has a discriminative part (classifying actions) and a generative part (predicting motion), whose recurrent cells are laterally connected, allowing higher levels of the network to represent high level phenomena. We test our framework on two types of data, fruit fly behavior and online handwriting. Our results show that 1) taking advantage of unlabeled sequences, by predicting future motion, significantly improves action detection performance when training labels are scarce, 2) the network learns to represent high level phenomena such as writer identity and fly gender, without supervision, and 3) simulated motion trajectories, generated by treating motion prediction as input to the network, look realistic and may be used to qualitatively evaluate whether the model has learnt generative control rules

arXiv.org e-Print Archive

Caltech Authors