
    An analysis of the user occupational class through Twitter content

    Social media content can be used as a complementary source to the traditional methods for extracting and studying collective social attributes. This study focuses on predicting the occupational class of a public user profile. Our analysis is conducted on a new annotated corpus of Twitter users, their respective job titles, posted textual content and platform-related attributes. We frame our task as classification using latent feature representations such as word clusters and embeddings. The employed linear and, especially, non-linear methods can predict a user’s occupational class with strong accuracy for the coarsest level of a standard occupation taxonomy, which includes nine classes. Combined with a qualitative assessment, the derived results confirm the feasibility of our approach in inferring a new user attribute that can be embedded in a multitude of downstream applications.
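
    The abstract above frames occupation prediction as classification over latent features such as word clusters. A minimal sketch of that idea follows, assuming a hypothetical word-to-cluster mapping, two made-up class labels, and a nearest-centroid classifier swapped in for illustration; it is not the linear or non-linear method used in the study.

```python
from collections import Counter

# Hypothetical word-to-cluster mapping (assumption, not from the study).
WORD_CLUSTER = {
    "patient": 0, "clinic": 0, "nurse": 0,
    "code": 1, "server": 1, "deploy": 1,
}
NUM_CLUSTERS = 2


def cluster_features(text):
    """Turn a user's text into normalised word-cluster frequencies."""
    counts = Counter(
        WORD_CLUSTER[w] for w in text.lower().split() if w in WORD_CLUSTER
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * NUM_CLUSTERS
    return [counts[c] / total for c in range(NUM_CLUSTERS)]


def train_centroids(labelled):
    """Average the feature vectors of each class to get one centroid per class."""
    sums = {}
    counts = Counter()
    for text, label in labelled:
        acc = sums.setdefault(label, [0.0] * NUM_CLUSTERS)
        for i, value in enumerate(cluster_features(text)):
            acc[i] += value
        counts[label] += 1
    return {
        label: [v / counts[label] for v in acc] for label, acc in sums.items()
    }


def predict(text, centroids):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    feats = cluster_features(text)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(feats, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Toy training data with made-up labels (assumption).
training = [
    ("nurse at the clinic seeing a patient", "health"),
    ("deploy the code to the server", "technology"),
]
centroids = train_centroids(training)
prediction = predict("new patient at the clinic today", centroids)
```

    In practice the cluster mapping would be learned from a large corpus, and the nine classes of the occupation taxonomy would replace the two toy labels.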

    Company Similarity using Large Language Models

    Identifying companies with similar profiles is a core task in finance with a wide range of applications in portfolio construction, asset pricing and risk attribution. When a rigorous definition of similarity is lacking, financial analysts usually resort to 'traditional' industry classifications such as the Global Industry Classification Standard (GICS), which assign a unique category to each company at different levels of granularity. Due to their discrete nature, though, GICS classifications do not allow for ranking companies in terms of similarity. In this paper, we explore the ability of pre-trained and fine-tuned large language models (LLMs) to learn company embeddings based on the business descriptions reported in SEC filings. We show that we can reproduce GICS classifications using the embeddings as features. We also benchmark these embeddings on various machine learning and financial metrics and conclude that the companies that are similar according to the embeddings are also similar in terms of financial performance metrics, including return correlation.
    Comment: 8 pages, 2 figures, 2 tables
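
    The abstract above contrasts discrete industry categories with embeddings, which permit ranking by similarity. A minimal sketch of such a ranking follows, assuming cosine similarity as the measure and using made-up company names and three-dimensional toy vectors; real embeddings would come from a language model applied to business descriptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_similar(target, embeddings):
    """Rank every other company by similarity to the target, most similar first."""
    scores = [
        (name, cosine_similarity(embeddings[target], vector))
        for name, vector in embeddings.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy embeddings with hypothetical company names (assumption).
embeddings = {
    "BankA": [0.9, 0.1, 0.0],
    "BankB": [0.8, 0.2, 0.1],
    "RetailerC": [0.1, 0.9, 0.2],
}
ranking = rank_similar("BankA", embeddings)
```

    Unlike a shared category label, the resulting scores order all peers of a company, so the closest ones can be selected at any cut-off.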

    Text Analysis of Airline Tweets

    By acting as a succinct summary, keywords and key phrases can be a useful tool for quickly assessing large amounts of textual material. According to the International Encyclopaedia of Information and Library Science, a keyword is a word that briefly and accurately characterises the subject, or an aspect of the subject, presented in a text (Bolger et al., 1989) (Feather et al., 1996). Research suggests that people are more likely to complain when they are anxious (Bolger et al., 1989) (Meier et al., 2013), and that moods are affected by time (Ryan et al., 2010). This study aims to give airlines a tool to calibrate and judge the positivity or negativity of tweets based on the day of the week, a topic that has yet to be researched. We perform text and sentiment analysis on extracted airline travel tweets, taking into account when each tweet was posted and whether it had a positive or negative impact.
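
    The abstract above proposes judging tweet sentiment by day of the week. A minimal sketch of that aggregation follows, assuming each tweet already carries a sentiment score in [-1, 1] from some upstream model; the timestamps and scores are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def mean_sentiment_by_weekday(tweets):
    """Average sentiment score per weekday from (ISO timestamp, score) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # weekday -> [score sum, tweet count]
    for timestamp, score in tweets:
        day = WEEKDAYS[datetime.fromisoformat(timestamp).weekday()]
        totals[day][0] += score
        totals[day][1] += 1
    return {day: total / count for day, (total, count) in totals.items()}


# Toy data: timestamps and scores are hypothetical (assumption).
tweets = [
    ("2023-01-02T09:15:00", -0.6),  # a Monday
    ("2023-01-02T18:40:00", -0.2),  # the same Monday
    ("2023-01-06T12:00:00", 0.5),   # a Friday
]
averages = mean_sentiment_by_weekday(tweets)
```

    Comparing the per-day averages gives a baseline against which an individual tweet's positivity or negativity can be calibrated.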