114 research outputs found

    Realizing EDGAR: eliminating information asymmetries through artificial intelligence analysis of SEC filings

    The U.S. Securities and Exchange Commission (SEC) maintains a publicly accessible database of all required filings of all publicly traded companies. Known as EDGAR (Electronic Data Gathering, Analysis, and Retrieval), this database contains documents ranging from annual reports of major companies to personal disclosures of senior managers. However, the common user, and particularly the retail investor, is overwhelmed by the deluge of information rather than empowered by it. EDGAR as it currently functions entrenches the information asymmetry between these retail investors and the large financial institutions with which they often trade. With substantial research staffs and budgets coupled to an industry standard of “playing both sides” of a transaction, these institutional investors “in the know” lead price fluctuations while others must follow. In general, this thesis applies recent technological advancements to the development of software tools that derive valuable insights from EDGAR documents within a practical time frame. While numerous such commercial products currently exist, all come with significant price tags and many still rely on significant human involvement in deriving such insights. Recent years, however, have seen an explosion in the fields of Machine Learning (ML) and Natural Language Processing (NLP), which show promise in automating many of these functions with greater efficiency. ML aims to develop software that learns parameters from large datasets, as opposed to traditional software, which merely applies a programmer’s logic. NLP aims to read, understand, and generate language naturally, an area where recent ML advancements have proven particularly adept. Specifically, this thesis serves as an exploratory study in applying recent advancements in ML and NLP to the vast range of documents contained in the EDGAR database.
    While algorithms will likely never replace the hordes of research analysts that now saturate securities markets or the advantages that accrue to large and diverse trading desks, they do hold the potential to provide small yet significant insights at little cost. This study first examines methods for document acquisition from EDGAR, with a focus on a baseline efficiency sufficient for the real-time trading needs of market participants. Next, it applies recent advancements in ML and NLP, specifically recurrent neural networks, to the task of standardizing financial statements across different filers. Finally, the conclusion contextualizes these findings in an environment of continued technological and commercial evolution.

    An SEC 10-K XML Schema Extension to Extract Cyber Security Risks

    The text sections of the SEC-mandated annual reports abound with important corporate operational information, but they are hard to manipulate in bulk because of the varying formats used by the submitting companies. Researchers and private entities have demonstrated the difficulties inherent in extracting and accumulating certain textual portions of these reports. This paper proposes an XML schema that will follow a specific DTD for the 10-K (and 10-Q) reports. Using simple computer commands, the ease of manipulating the reports' text sections is demonstrated.

    Applying text timing in corporate spin-off disclosure statement analysis: understanding the main concerns and recommendation of appropriate term weights

    Text mining helps in extracting knowledge and useful information from unstructured data. It detects and extracts information from mountains of documents and allows the selection of data relevant to a particular topic. In this study, text mining is applied to the 10-12b filings submitted by companies during a corporate spin-off. The main purposes are (1) to investigate potential and/or major concerns found in these financial statements filed for corporate spin-offs and (2) to identify appropriate text mining methods that can be used to reveal these major concerns. 10-12b filings from thirty-four companies were collected, and only the Risk Factors category was analyzed. Term weights such as Entropy, IDF, GF-IDF, Normal and None were applied to the input data; of these, Entropy and GF-IDF were found to be the appropriate term weights, providing acceptable results. These term weights gave results that met a human expert's expectations. The document distribution from these term weights created a pattern that reflected the mood or focus of the input documents. In addition to the analysis, this study also provides a pilot study for future work in predictive text mining for the analysis of similar financial documents. For example, the descriptive terms found in this study provide a starting word list, which eliminates the trial-and-error method of framing an initial start list. --Abstract, page iii

    Intellectual Property Management in Health and Agricultural Innovation: A Handbook of Best Practices, Vol. 1

    Prepared by and for policy-makers, leaders of public sector research establishments, technology transfer professionals, licensing executives, and scientists, this online resource offers up-to-date information and strategies for utilizing the power of both intellectual property and the public domain. Emphasis is placed on advancing innovation in health and agriculture, though many of the principles outlined here are broadly applicable across technology fields. Eschewing ideological debates and general proclamations, the authors always keep their eye on the practical side of IP management. The site is based on a comprehensive Handbook and Executive Guide that provide substantive discussions and analysis of the opportunities awaiting anyone in the field who wants to put intellectual property to work. This multi-volume work contains 153 chapters on a full range of IP topics and over 50 case studies, composed by over 200 authors from North, South, East, and West. If you are a policymaker, a senior administrator, a technology transfer manager, or a scientist, we invite you to use the companion site guide available at http://www.iphandbook.org/index.html. The site guide distills the key points of each IP topic covered by the Handbook into simple language and places them in the context of evolving best practices specific to your professional role within the overall picture of IP management.

    Veröffentlichungen und Vorträge 2004 der Mitglieder der Fakultät für Informatik (Publications and Talks 2004 by Members of the Faculty of Computer Science)


    NUC BMAS Environmental Sciences
