Factors Influencing the Surprising Instability of Word Embeddings
Despite the recent popularity of word embedding methods, there is only a
small body of work exploring the limitations of these representations. In this
paper, we consider one aspect of embedding spaces, namely their stability. We
show that even relatively high-frequency words (100-200 occurrences) are often
unstable. We provide empirical evidence for how various factors contribute to
the stability of word embeddings, and we analyze the effects of stability on
downstream tasks. Comment: NAACL HLT 201
Multimodal analysis and prediction of latent user dimensions
Humans upload over 1.8 billion digital images to the internet each day, yet the relationship between the images that a person shares with others and his/her psychological characteristics remains poorly understood. In the current research, we analyze the relationship between images, captions, and the latent demographic/psychological dimensions of personality and gender. We consider a wide range of automatically extracted visual and textual features of images/captions that are shared by a large sample of individuals. Using correlational methods, we identify several visual and textual properties that show strong relationships with individual differences between participants. Additionally, we explore the task of predicting user attributes using a multimodal approach that simultaneously leverages images and their captions. Results from these experiments suggest that images alone have significant predictive power and that multimodal methods outperform both visual features and textual features in isolation when attempting to predict individual differences.