13 research outputs found

    AI-tocracy

    Can frontier innovation be sustained under autocracy? We argue that innovation and autocracy can be mutually reinforcing when: (i) the new technology bolsters the autocrat's power; and (ii) the autocrat's demand for the technology stimulates further innovation in applications beyond those benefiting the autocrat directly. We test for such a mutually reinforcing relationship in the context of facial recognition AI in China. To do so, we gather comprehensive data on AI firms and government procurement contracts, as well as on social unrest across China during the last decade. We first show that autocrats benefit from AI: local unrest leads to greater government procurement of facial recognition AI, and increased AI procurement suppresses subsequent unrest. We then show that AI innovation benefits from the autocrat's suppression of unrest: the contracted AI firms innovate more for both government and commercial markets. Taken together, these results suggest the possibility of sustained AI innovation under the Chinese regime: AI innovation entrenches the regime, and the regime's investment in AI for political control stimulates further frontier innovation.

    A comparative study of text detection and recognition approaches for restricted computing scenarios

    Advisors: Ricardo da Silva Torres, Allan da Silva Pinto. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: Texts are fundamental elements for effective communication in our daily lives. The mobility of people and vehicles in urban environments and the search for a product of interest on a supermarket shelf are examples of activities in which understanding the textual elements present in the environment is essential to succeed in the task. Recently, several advances in computer vision have been reported in the literature, with the development of algorithms and methods that aim to recognize objects and texts in scenes. However, text detection and recognition are still open problems due to several factors that act as sources of variability during scene text generation and capture, which can significantly impact the detection and recognition rates of current algorithms. Examples of these factors include different shapes of textual elements (e.g., circular or curved), font styles and sizes, texture, color, brightness and contrast variation, among others. In addition, recent state-of-the-art methods based on deep learning demand high computational processing costs, which hinders their use in restricted computing scenarios. This dissertation presents a comparative study of text detection and recognition techniques, considering both methods based on deep learning and methods that use classical machine learning algorithms. It also presents an algorithm for fusing bounding boxes, based on genetic programming (GP), developed both to act as a post-processing step for a single text detector and to exploit the complementarity of the text detection algorithms investigated in the dissertation. According to the comparative study, the deep learning methods are more effective but less efficient than the classical methods, under the adopted metrics. Furthermore, the proposed GP-based fusion algorithm was able to learn complementary information across the investigated methods, which improved precision and recall rates. The experiments covered horizontal, vertical, and arbitrarily oriented text detection.
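    The GP-based bounding-box fusion described above operates as a post-processing step over candidate detections. As a rough, hypothetical illustration of that kind of post-processing, the Python sketch below greedily merges highly overlapping boxes using an intersection-over-union (IoU) threshold; it is not the dissertation's genetic-programming method, and all names in it are illustrative.

        # Illustrative sketch only: merges candidate text boxes from one or more
        # detectors by grouping boxes with high overlap (IoU) and keeping the
        # union of each group. NOT the dissertation's GP-based fusion method.

        def iou(a, b):
            """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
            ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
            ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
            inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
            area_a = (a[2] - a[0]) * (a[3] - a[1])
            area_b = (b[2] - b[0]) * (b[3] - b[1])
            union = area_a + area_b - inter
            return inter / union if union > 0 else 0.0

        def fuse_boxes(detections, iou_threshold=0.5):
            """Greedy fusion: each incoming box is merged into the first fused box
            it overlaps strongly with; otherwise it starts a new fused box."""
            fused = []
            for box in detections:
                for i, kept in enumerate(fused):
                    if iou(box, kept) >= iou_threshold:
                        fused[i] = (min(kept[0], box[0]), min(kept[1], box[1]),
                                    max(kept[2], box[2]), max(kept[3], box[3]))
                        break
                else:
                    fused.append(tuple(box))
            return fused

        # Example: overlapping candidates from two detectors over the same image.
        boxes = [(10, 10, 100, 40), (12, 8, 105, 42), (200, 50, 280, 80)]
        print(fuse_boxes(boxes))  # the first two boxes collapse into one region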

    Viability of Sequence Labeling Encodings for Dependency Parsing

    Programa Oficial de Doutoramento en Computación (5009V01).
    Abstract: This thesis presents new methods for recasting dependency parsing as a sequence labeling task, yielding a viable alternative to the traditional transition- and graph-based approaches. It is shown that sequence labeling parsers provide several advantages for dependency parsing, such as: (i) a good trade-off between accuracy and parsing speed; (ii) genericity, which enables running a parser in generic sequence labeling software; and (iii) pluggability, which allows using full parse trees as features for downstream tasks. The backbone of dependency parsing as sequence labeling is the encodings, which serve as linearization methods for mapping dependency trees into discrete labels such that each token in a sentence is associated with a label. We introduce three encoding families, comprising (i) head-selection, (ii) bracketing-based, and (iii) transition-based encodings, which are differentiated by the way they represent a dependency tree as a sequence of labels. We empirically examine the viability of the encodings and provide an analysis of their facets. Furthermore, we explore the feasibility of leveraging external complementary data in order to enhance parsing performance. Our sequence labeling parser is endowed with two kinds of representations. First, we exploit the complementary nature of the dependency and constituency parsing paradigms and enrich the parser with representations from both syntactic abstractions. Second, we use human language processing data to guide our parser with representations derived from eye movements. Overall, the results show that recasting dependency parsing as sequence labeling is a viable approach that is fast and accurate, and provides a practical alternative for integrating syntax in NLP tasks.
    This work has been carried out thanks to funding from the European Research Council (ERC), under the European Union's Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150).
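    To make the idea of mapping a dependency tree to one discrete label per token concrete, the Python sketch below implements a simple relative head-position encoding in the spirit of the head-selection family; it is an illustrative toy under that assumption, not the exact encodings defined and evaluated in the thesis.

        # Illustrative sketch: a relative head-position encoding, one label per token.
        # Not the thesis's exact encodings; shown only to make the idea concrete.

        def encode(heads, deprels):
            """heads[i] is the 1-based index of token i+1's head (0 = root); the label
            stores the head's offset relative to the token plus the relation."""
            labels = []
            for i, (head, rel) in enumerate(zip(heads, deprels), start=1):
                offset = head - i          # e.g. -1 means the head is the previous token
                labels.append(f"{offset:+d}@{rel}")
            return labels

        def decode(labels):
            """Recover head indices and relations from the per-token labels."""
            heads, deprels = [], []
            for i, label in enumerate(labels, start=1):
                offset, rel = label.split("@")
                heads.append(i + int(offset))
                deprels.append(rel)
            return heads, deprels

        # "She reads books": heads are 2 (reads), 0 (root), 2 (reads).
        labels = encode([2, 0, 2], ["nsubj", "root", "obj"])
        print(labels)            # ['+1@nsubj', '-2@root', '-1@obj']
        print(decode(labels))    # ([2, 0, 2], ['nsubj', 'root', 'obj'])

    With such a mapping, any off-the-shelf sequence labeling model can be trained to predict the labels, which is what makes the approach pluggable into generic tagging software.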

    Massive Choice, Ample Tasks (MaChAmp): A Toolkit for Multi-task Learning in NLP

    Transfer learning, particularly approaches that combine multi-task learning with pre-trained contextualized embeddings and fine-tuning, has advanced the field of Natural Language Processing tremendously in recent years. In this paper we present MaChAmp, a toolkit for easy fine-tuning of contextualized embeddings in multi-task settings. The benefits of MaChAmp are its flexible configuration options and its support for a variety of natural language processing tasks in a uniform toolkit, from text classification and sequence labeling to dependency parsing, masked language modeling, and text generation. The toolkit is available at https://machamp-nlp.github.io.
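    The kind of multi-task setup such a toolkit supports can be pictured as one shared pre-trained encoder with a lightweight prediction head per task. The PyTorch sketch below is a generic, minimal illustration of that pattern (a toy GRU stands in for the contextualized encoder, and all names are hypothetical); it is not MaChAmp's actual code, configuration format, or API.

        # Generic multi-task fine-tuning sketch (NOT MaChAmp's code): a shared
        # encoder with one head per task, updated by alternating over task batches.
        import torch
        import torch.nn as nn

        class MultiTaskModel(nn.Module):
            def __init__(self, vocab_size, hidden, task_label_sizes):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, hidden)
                self.encoder = nn.GRU(hidden, hidden, batch_first=True)
                # One linear head per task, e.g. POS tagging and NER.
                self.heads = nn.ModuleDict(
                    {task: nn.Linear(hidden, n) for task, n in task_label_sizes.items()}
                )

            def forward(self, token_ids, task):
                states, _ = self.encoder(self.embed(token_ids))
                return self.heads[task](states)  # per-token logits for this task

        model = MultiTaskModel(vocab_size=1000, hidden=64,
                               task_label_sizes={"pos": 17, "ner": 9})
        loss_fn = nn.CrossEntropyLoss()
        opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

        # One illustrative update per task with random data.
        for task, n_labels in [("pos", 17), ("ner", 9)]:
            tokens = torch.randint(0, 1000, (8, 12))    # batch of 8 sentences
            gold = torch.randint(0, n_labels, (8, 12))  # per-token gold labels
            logits = model(tokens, task)
            loss = loss_fn(logits.reshape(-1, n_labels), gold.reshape(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()
            print(task, float(loss))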

    Cold-start universal information extraction

    Who? What? When? Where? Why? are fundamental questions asked when gathering knowledge about and understanding a concept, topic, or event. The answers to these questions underpin the key information conveyed in the overwhelming majority, if not all, of language-based communication. At the core of my research in Information Extraction (IE) is the desire to endow machines with the ability to automatically extract, assess, and understand text in order to answer these fundamental questions. IE serves as one of the most important components for many downstream natural language processing (NLP) tasks, such as knowledge base completion, machine reading comprehension, and machine translation. The proliferation of the Web also intensifies the need to deal with enormous amounts of unstructured data from various sources, languages, genres, and domains. When building an IE system, the conventional pipeline is to (1) ask expert linguists to rigorously define a target set of knowledge types we wish to extract by examining a large data set, (2) collect resources and human annotations for each type, and (3) design features and train machine learning models to extract knowledge elements. In practice, this process is very expensive, as each step involves extensive human effort that is not always available; for example, to specify the knowledge types for a particular scenario, both consumers and expert linguists need to examine a lot of data from that domain and write detailed annotation guidelines for each type. Hand-crafted schemas, which define the types and complex templates of the expected knowledge elements, often provide low coverage and fail to generalize to new domains. For example, none of the traditional event extraction programs, such as ACE (Automatic Content Extraction) and TAC-KBP, includes "donation" and "evacuation" in their schemas, despite their potential relevance to natural disaster management users. Additionally, these approaches are highly dependent on linguistic resources and human-labeled data tuned to pre-defined types, so they suffer from poor scalability and portability when moving to a new language, domain, or genre. The focus of this thesis is to develop effective theories and algorithms for IE that not only yield satisfactory quality by incorporating prior linguistic and semantic knowledge, but also offer greater portability and scalability by moving away from the high cost and narrow focus of large-scale manual annotation. This thesis opens up a new research direction called Cold-Start Universal Information Extraction, where the full extraction and analysis starts from scratch and requires little or no prior manual annotation or pre-defined type schema. In addition to this new research paradigm, we also contribute effective algorithms and models towards resolving the following three challenges.
    How can machines extract knowledge without any pre-defined types or any human-annotated data? We develop an effective bottom-up and unsupervised Liberal Information Extraction framework based on the hypothesis that the meaning and underlying knowledge conveyed by linguistic expressions is usually embodied by their usages in language, which makes it possible to automatically induce a type schema from rich contextual representations of all knowledge elements by combining their symbolic and distributional semantics using unsupervised hierarchical clustering.
    How can machines benefit from available resources, e.g., large-scale ontologies or existing human annotations? My research has shown that pre-defined types can also be encoded by rich contextual or structured representations, through which knowledge elements can be mapped to their appropriate types. Therefore, we design a weakly supervised Zero-shot Learning approach and a Semi-Supervised Vector Quantized Variational Auto-Encoder approach that frame IE as a grounding problem instead of classification, where knowledge elements are grounded into types from an extensible, large-scale target ontology or induced from the corpora, with annotations available for only a few types.
    How can IE approaches be extended to low-resource languages without any extra human effort? There are more than 6,000 living languages in the world, while public gold-standard annotations are only available for a few dominant languages. To facilitate the adaptation of these IE frameworks to other languages, especially low-resource languages, a Multilingual Common Semantic Space is further proposed to serve as a bridge for transferring existing resources and annotated data from dominant languages to more than 300 low-resource languages. Moreover, a Multi-Level Adversarial Transfer framework is also designed to learn language-agnostic features across various languages.
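    The schema-free, cold-start setting described above hinges on inducing types by clustering contextual representations of candidate knowledge elements rather than classifying them into a fixed ontology. The Python sketch below is a toy illustration of that clustering step using scikit-learn's hierarchical clustering over synthetic embedding vectors; it is not the Liberal IE framework, and the mentions and vectors are made up.

        # Toy illustration of unsupervised type induction: cluster contextual
        # embeddings of candidate knowledge elements and treat each cluster as an
        # induced type. NOT the Liberal IE framework; the vectors are synthetic.
        import numpy as np
        from sklearn.cluster import AgglomerativeClustering

        mentions = ["earthquake", "flood", "donation", "fundraiser",
                    "evacuation", "airlift"]
        # Stand-ins for contextual embeddings (e.g., from a pre-trained encoder):
        # three well-separated centers plus small per-mention noise.
        rng = np.random.default_rng(0)
        centers = {"disaster": rng.normal(0, 1, 16),
                   "aid": rng.normal(5, 1, 16),
                   "movement": rng.normal(-5, 1, 16)}
        vectors = np.stack([
            centers["disaster"] + rng.normal(0, 0.1, 16),  # earthquake
            centers["disaster"] + rng.normal(0, 0.1, 16),  # flood
            centers["aid"] + rng.normal(0, 0.1, 16),       # donation
            centers["aid"] + rng.normal(0, 0.1, 16),       # fundraiser
            centers["movement"] + rng.normal(0, 0.1, 16),  # evacuation
            centers["movement"] + rng.normal(0, 0.1, 16),  # airlift
        ])

        # Hierarchical clustering induces type clusters without predefined labels.
        clusterer = AgglomerativeClustering(n_clusters=3, linkage="average")
        labels = clusterer.fit_predict(vectors)
        for mention, label in zip(mentions, labels):
            print(f"induced_type_{label}: {mention}")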