99 research outputs found

    Realizing EDGAR: eliminating information asymmetries through artificial intelligence analysis of SEC filings

    Get PDF
    The U.S. Securities and Exchange Commission (SEC) maintains a publicly-accessible database of all required filings of all publicly traded companies. Known as EDGAR (Electronic Data Gathering, Analysis, and Retrieval), this database contains documents ranging from annual reports of major companies to personal disclosures of senior managers. However, the common user and particularly the retail investor are overwhelmed by the deluge of information, not empowered. EDGAR as it currently functions entrenches the information asymmetry between these retail investors and the large financial institutions with which they often trade. With substantial research staffs and budgets coupled to an industry standard of “playing both sides” of a transaction, these investors “in the know” lead price fluctuations while others must follow. In general, this thesis applies recent technological advancements to the development of software tools that will derive valuable insights from EDGAR documents in an efficient time period. While numerous such commercial products currently exist, all come with significant price tags and many still rely on significant human involvement in deriving such insights. Recent years, however, have seen an explosion in the fields of Machine Learning (ML) and Natural Language Processing (NLP), which show promise in automating many of these functions with greater efficiency. ML aims to develop software which learns parameters from large datasets as opposed to traditional software which merely applies a programmer’s logic. NLP aims to read, understand, and generate language naturally, an area where recent ML advancements have proven particularly adept. Specifically, this thesis serves as an exploratory study in applying recent advancements in ML and NLP to the vast range of documents contained in the EDGAR database. While algorithms will likely never replace the hordes of research analysts that now saturate securities markets nor the advantages that accrue to large and diverse trading desks, they do hold the potential to provide small yet significant insights at little cost. This study first examines methods for document acquisition from EDGAR with a focus on a baseline efficiency sufficient for the real-time trading needs of market participants. Next, it applies recent advancements in ML and NLP, specifically recurrent neural networks, to the task of standardizing financial statements across different filers. Finally, the conclusion contextualizes these findings in an environment of continued technological and commercial evolution

    Composable architecture for rack scale big data computing

    No full text
    The rapid growth of cloud computing, both in terms of the spectrum and volume of cloud workloads, necessitate re-visiting the traditional rack-mountable servers based datacenter design. Next generation datacenters need to offer enhanced support for: (i) fast changing system configuration requirements due to workload constraints, (ii) timely adoption of emerging hardware technologies, and (iii) maximal sharing of systems and subsystems in order to lower costs. Disaggregated datacenters, constructed as a collection of individual resources such as CPU, memory, disks etc., and composed into workload execution units on demand, are an interesting new trend that can address the above challenges. In this paper, we demonstrated the feasibility of composable systems through building a rack scale composable system prototype using PCIe switch. Through empirical approaches, we develop assessment of the opportunities and challenges for leveraging the composable architecture for rack scale cloud datacenters with a focus on big data and NoSQL workloads. In particular, we compare and contrast the programming models that can be used to access the composable resources, and developed the implications for the network and resource provisioning and management for rack scale architecture

    Design and Evaluation of Low-Latency Communication Middleware on High Performance Computing Systems

    Get PDF
    [Resumen]El interés en Java para computación paralela está motivado por sus interesantes características, tales como su soporte multithread, portabilidad, facilidad de aprendizaje,alta productividad y el aumento significativo en su rendimiento omputacional. No obstante, las aplicaciones paralelas en Java carecen generalmente de mecanismos de comunicación eficientes, los cuales utilizan a menudo protocolos basados en sockets incapaces de obtener el máximo provecho de las redes de baja latencia, obstaculizando la adopción de Java en computación de altas prestaciones (High Per- formance Computing, HPC). Esta Tesis Doctoral presenta el diseño, implementación y evaluación de soluciones de comunicación en Java que superan esta limitación. En consecuencia, se desarrollaron múltiples dispositivos de comunicación a bajo nivel para paso de mensajes en Java (Message-Passing in Java, MPJ) que aprovechan al máximo el hardware de red subyacente mediante operaciones de acceso directo a memoria remota que proporcionan comunicaciones de baja latencia. También se incluye una biblioteca de paso de mensajes en Java totalmente funcional, FastMPJ, en la cual se integraron los dispositivos de comunicación. La evaluación experimental ha mostrado que las primitivas de comunicación de FastMPJ son competitivas en comparación con bibliotecas nativas, aumentando significativamente la escalabilidad de aplicaciones MPJ. Por otro lado, esta Tesis analiza el potencial de la computación en la nube (cloud computing) para HPC, donde el modelo de distribución de infraestructura como servicio (Infrastructure as a Service, IaaS) emerge como una alternativa viable a los sistemas HPC tradicionales. La evaluación del rendimiento de recursos cloud específicos para HPC del proveedor líder, Amazon EC2, ha puesto de manifiesto el impacto significativo que la virtualización impone en la red, impidiendo mover las aplicaciones intensivas en comunicaciones a la nube. La clave reside en un soporte de virtualización apropiado, como el acceso directo al hardware de red, junto con las directrices para la optimización del rendimiento sugeridas en esta Tesis.[Resumo]O interese en Java para computación paralela está motivado polas súas interesantes características, tales como o seu apoio multithread, portabilidade, facilidade de aprendizaxe, alta produtividade e o aumento signi cativo no seu rendemento computacional. No entanto, as aplicacións paralelas en Java carecen xeralmente de mecanismos de comunicación e cientes, os cales adoitan usar protocolos baseados en sockets que son incapaces de obter o máximo proveito das redes de baixa latencia, obstaculizando a adopción de Java na computación de altas prestacións (High Performance Computing, HPC). Esta Tese de Doutoramento presenta o deseño, implementaci ón e avaliación de solucións de comunicación en Java que superan esta limitación. En consecuencia, desenvolvéronse múltiples dispositivos de comunicación a baixo nivel para paso de mensaxes en Java (Message-Passing in Java, MPJ) que aproveitan ao máaximo o hardware de rede subxacente mediante operacións de acceso directo a memoria remota que proporcionan comunicacións de baixa latencia. Tamén se inclúe unha biblioteca de paso de mensaxes en Java totalmente funcional, FastMPJ, na cal foron integrados os dispositivos de comunicación. A avaliación experimental amosou que as primitivas de comunicación de FastMPJ son competitivas en comparación con bibliotecas nativas, aumentando signi cativamente a escalabilidade de aplicacións MPJ. Por outra banda, esta Tese analiza o potencial da computación na nube (cloud computing) para HPC, onde o modelo de distribución de infraestrutura como servizo (Infrastructure as a Service, IaaS) xorde como unha alternativa viable aos sistemas HPC tradicionais. A ampla avaliación do rendemento de recursos cloud específi cos para HPC do proveedor líder, Amazon EC2, puxo de manifesto o impacto signi ficativo que a virtualización impón na rede, impedindo mover as aplicacións intensivas en comunicacións á nube. A clave atópase no soporte de virtualización apropiado, como o acceso directo ao hardware de rede, xunto coas directrices para a optimización do rendemento suxeridas nesta Tese.[Abstract]The use of Java for parallel computing is becoming more promising owing to its appealing features, particularly its multithreading support, portability, easy-tolearn properties, high programming productivity and the noticeable improvement in its computational performance. However, parallel Java applications generally su er from inefficient communication middleware, most of which use socket-based protocols that are unable to take full advantage of high-speed networks, hindering the adoption of Java in the High Performance Computing (HPC) area. This PhD Thesis presents the design, development and evaluation of scalable Java communication solutions that overcome these constraints. Hence, we have implemented several lowlevel message-passing devices that fully exploit the underlying network hardware while taking advantage of Remote Direct Memory Access (RDMA) operations to provide low-latency communications. Moreover, we have developed a productionquality Java message-passing middleware, FastMPJ, in which the devices have been integrated seamlessly, thus allowing the productive development of Message-Passing in Java (MPJ) applications. The performance evaluation has shown that FastMPJ communication primitives are competitive with native message-passing libraries, improving signi cantly the scalability of MPJ applications. Furthermore, this Thesis has analyzed the potential of cloud computing towards spreading the outreach of HPC, where Infrastructure as a Service (IaaS) o erings have emerged as a feasible alternative to traditional HPC systems. Several cloud resources from the leading IaaS provider, Amazon EC2, which speci cally target HPC workloads, have been thoroughly assessed. The experimental results have shown the signi cant impact that virtualized environments still have on network performance, which hampers porting communication-intensive codes to the cloud. The key is the availability of the proper virtualization support, such as the direct access to the network hardware, along with the guidelines for performance optimization suggested in this Thesis

    Parallel and Distributed Computing

    Get PDF
    The 14 chapters presented in this book cover a wide variety of representative works ranging from hardware design to application development. Particularly, the topics that are addressed are programmable and reconfigurable devices and systems, dependability of GPUs (General Purpose Units), network topologies, cache coherence protocols, resource allocation, scheduling algorithms, peertopeer networks, largescale network simulation, and parallel routines and algorithms. In this way, the articles included in this book constitute an excellent reference for engineers and researchers who have particular interests in each of these topics in parallel and distributed computing

    Design of a parallel AES for graphics hardware using the CUDA framework

    Full text link

    Infrastructure Plan for ASC Petascale Environments

    Full text link

    Artificial Intelligence Technology

    Get PDF
    This open access book aims to give our readers a basic outline of today’s research and technology developments on artificial intelligence (AI), help them to have a general understanding of this trend, and familiarize them with the current research hotspots, as well as part of the fundamental and common theories and methodologies that are widely accepted in AI research and application. This book is written in comprehensible and plain language, featuring clearly explained theories and concepts and extensive analysis and examples. Some of the traditional findings are skipped in narration on the premise of a relatively comprehensive introduction to the evolution of artificial intelligence technology. The book provides a detailed elaboration of the basic concepts of AI, machine learning, as well as other relevant topics, including deep learning, deep learning framework, Huawei MindSpore AI development framework, Huawei Atlas computing platform, Huawei AI open platform for smart terminals, and Huawei CLOUD Enterprise Intelligence application platform. As the world’s leading provider of ICT (information and communication technology) infrastructure and smart terminals, Huawei’s products range from digital data communication, cyber security, wireless technology, data storage, cloud computing, and smart computing to artificial intelligence

    High performance graphics processor based computed tomography reconstruction algorithms for nuclear and other large scale applications.

    Full text link
    corecore