Multi-Industry Simplex: A Probabilistic Extension of GICS
Accurate industry classification is a critical tool for many asset management
applications. While the current industry gold-standard GICS (Global Industry
Classification Standard) has proven to be reliable and robust in many settings,
it has limitations that cannot be ignored. Fundamentally, GICS is a
single-industry model, in which every firm is assigned to exactly one group,
regardless of how diversified that firm may be. This approach breaks down for
large conglomerates like Amazon, which have risk exposure spread out across
multiple sectors. We attempt to overcome these limitations by developing MIS
(Multi-Industry Simplex), a probabilistic model that can flexibly assign a firm
to as many industries as can be supported by the data. In particular, we
use topic modeling, a natural language processing approach that draws on
business descriptions to extract and identify corresponding industries. Each
identified industry comes with a relevance probability, allowing for high
interpretability and easy auditing, circumventing the black-box nature of
alternative machine learning approaches. We describe this model in detail and
provide two use cases relevant to asset management: thematic
portfolios and nearest neighbor identification. While our approach has
limitations of its own, we demonstrate the viability of probabilistic industry
classification and hope to inspire future research in this field.
Comment: 17 pages, 10 figures
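
To make the two use cases concrete, here is a minimal sketch of how industry-probability vectors on a simplex might be used for nearest neighbor identification and thematic portfolio selection. It is not the paper's implementation: the firm names, industry labels, weights, the 0.25 threshold, and the choice of Hellinger distance are all assumptions made for illustration.

```python
import math

# Hypothetical industry labels and probability vectors (each row sums to 1).
# Names and weights are illustrative only, not results from the paper.
INDUSTRIES = ["retail", "cloud", "media", "logistics"]
FIRMS = {
    "FirmA": [0.50, 0.30, 0.10, 0.10],  # diversified conglomerate
    "FirmB": [0.90, 0.00, 0.00, 0.10],  # mostly retail
    "FirmC": [0.05, 0.85, 0.10, 0.00],  # mostly cloud
    "FirmD": [0.45, 0.25, 0.20, 0.10],  # diversified, similar to FirmA
}


def hellinger(p, q):
    """Hellinger distance between two probability vectors (assumed metric)."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(0.5 * squared)


def nearest_neighbor(name, firms):
    """Return (firm, distance) for the firm closest to `name`."""
    target = firms[name]
    others = (
        (other, hellinger(target, vec))
        for other, vec in firms.items()
        if other != name
    )
    return min(others, key=lambda pair: pair[1])


def thematic_portfolio(industry, firms, threshold=0.25):
    """Select firms whose probability for `industry` meets the threshold."""
    idx = INDUSTRIES.index(industry)
    return sorted(n for n, vec in firms.items() if vec[idx] >= threshold)


if __name__ == "__main__":
    print(nearest_neighbor("FirmA", FIRMS))
    print(thematic_portfolio("cloud", FIRMS))
```

Under a single-industry scheme, FirmA would be filed under retail alone and would not appear in a cloud-themed selection; with probability vectors it is included because its cloud weight meets the threshold.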