3,636 research outputs found

    Hardware Acceleration for Unstructured Big Data and Natural Language Processing.

    Full text link
    The confluence of the rapid growth in electronic data in recent years, and the renewed interest in domain-specific hardware accelerators presents exciting technical opportunities. Traditional scale-out solutions for processing the vast amounts of text data have been shown to be energy- and cost-inefficient. In contrast, custom hardware accelerators can provide higher throughputs, lower latencies, and significant energy savings. In this thesis, I present a set of hardware accelerators for unstructured big-data processing and natural language processing. The first accelerator, called HAWK, aims to speed up the processing of ad hoc queries against large in-memory logs. HAWK is motivated by the observation that traditional software-based tools for processing large text corpora use memory bandwidth inefficiently due to software overheads, and, thus, fall far short of peak scan rates possible on modern memory systems. HAWK is designed to process data at a constant rate of 32 GB/s—faster than most extant memory systems. I demonstrate that HAWK outperforms state-of-the-art software solutions for text processing, almost by an order of magnitude in many cases. HAWK occupies an area of 45 sq-mm in its pareto-optimal configuration and consumes 22 W of power, well within the area and power envelopes of modern CPU chips. The second accelerator I propose aims to speed up similarity measurement calculations for semantic search in the natural language processing space. By leveraging the latency hiding concepts of multi-threading and simple scheduling mechanisms, my design maximizes functional unit utilization. This similarity measurement accelerator provides speedups of 36x-42x over optimized software running on server-class cores, while requiring 56x-58x lower energy, and only 1.3% of the area.PhDComputer Science and EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/116712/1/prateekt_1.pd

    Exploring enterprises competition: From a perspective of massive recruitment texts mining

    Get PDF
    Extant research has made limited efforts to conduct competitive intelligence analysis based on recruitment texts. To fill the gap, this study proposes a method for deriving and analyzing competitive relationships, identifying competition paths, and calculating asymmetric competitiveness degrees, from the recruitment texts on e-recruiting websites. Specifically, this study developed a competitive evaluation index system for companies’ skill needs and resource base based on 53,171 job descriptions and 42,641 company profiles published by companies across 8 industries (including 35 industry segments) using automated text processing methods. Furthermore, in order to identify competitive paths and calculate the degree of asymmetric competitiveness, this study proposes a modified bipartite graph approach (i.e., MBGA) for competitive intelligence analysis of recruitment texts based on the competition evaluation index system. Experiments on a real-world dataset of the representative companies clearly validated the effectiveness of the method. Compared to the five state-of-the-art methods, MBGA performs better in disclosing the overall competition and is more accurate in terms of the error rating ratio (i.e., ERR) of the competition

    Verb similarity: comparing corpus and psycholinguistic data

    Get PDF
    Similarity, which plays a key role in fields like cognitive science, psycholinguistics and natural language processing, is a broad and multifaceted concept. In this work we analyse how two approaches that belong to different perspectives, the corpus view and the psycholinguistic view, articulate similarity between verb senses in Spanish. Specifically, we compare the similarity between verb senses based on their argument structure, which is captured through semantic roles, with their similarity defined by word associations. We address the question of whether verb argument structure, which reflects the expression of the events, and word associations, which are related to the speakers' organization of the mental lexicon, shape similarity between verbs in a congruent manner, a topic which has not been explored previously. While we find significant correlations between verb sense similarities obtained from these two approaches, our findings also highlight some discrepancies between them and the importance of the degree of abstraction of the corpus annotation and psycholinguistic representations.La similitud, que desempeña un papel clave en campos como la ciencia cognitiva, la psicolingüística y el procesamiento del lenguaje natural, es un concepto amplio y multifacético. En este trabajo analizamos cómo dos enfoques que pertenecen a diferentes perspectivas, la visión del corpus y la visión psicolingüística, articulan la semejanza entre los sentidos verbales en español. Específicamente, comparamos la similitud entre los sentidos verbales basados en su estructura argumental, que se capta a través de roles semánticos, con su similitud definida por las asociaciones de palabras. Abordamos la cuestión de si la estructura del argumento verbal, que refleja la expresión de los acontecimientos, y las asociaciones de palabras, que están relacionadas con la organización de los hablantes del léxico mental, forman similitud entre los verbos de una manera congruente, un tema que no ha sido explorado previamente. Mientras que encontramos correlaciones significativas entre las similitudes de los sentidos verbales obtenidas de estos dos enfoques, nuestros hallazgos también resaltan algunas discrepancias entre ellos y la importancia del grado de abstracción de la anotación del corpus y las representaciones psicolingüísticas.La similitud, que exerceix un paper clau en camps com la ciència cognitiva, la psicolingüística i el processament del llenguatge natural, és un concepte ampli i multifacètic. En aquest treball analitzem com dos enfocaments que pertanyen a diferents perspectives, la visió del corpus i la visió psicolingüística, articulen la semblança entre els sentits verbals en espanyol. Específicament, comparem la similitud entre els sentits verbals basats en la seva estructura argumental, que es capta a través de rols semàntics, amb la seva similitud definida per les associacions de paraules. Abordem la qüestió de si l'estructura de l'argument verbal, que reflecteix l'expressió dels esdeveniments, i les associacions de paraules, que estan relacionades amb l'organització dels parlants del lèxic mental, formen similitud entre els verbs d'una manera congruent, un tema que no ha estat explorat prèviament. Mentre que trobem correlacions significatives entre les similituds dels sentits verbals obtingudes d'aquests dos enfocaments, les nostres troballes també ressalten algunes discrepàncies entre ells i la importància del grau d'abstracció de l'anotació del corpus i les representacions psicolingüístiques

    Semantic Similarity of Spatial Scenes

    Get PDF
    The formalization of similarity in spatial information systems can unleash their functionality and contribute technology not only useful, but also desirable by broad groups of users. As a paradigm for information retrieval, similarity supersedes tedious querying techniques and unveils novel ways for user-system interaction by naturally supporting modalities such as speech and sketching. As a tool within the scope of a broader objective, it can facilitate such diverse tasks as data integration, landmark determination, and prediction making. This potential motivated the development of several similarity models within the geospatial and computer science communities. Despite the merit of these studies, their cognitive plausibility can be limited due to neglect of well-established psychological principles about properties and behaviors of similarity. Moreover, such approaches are typically guided by experience, intuition, and observation, thereby often relying on more narrow perspectives or restrictive assumptions that produce inflexible and incompatible measures. This thesis consolidates such fragmentary efforts and integrates them along with novel formalisms into a scalable, comprehensive, and cognitively-sensitive framework for similarity queries in spatial information systems. Three conceptually different similarity queries at the levels of attributes, objects, and scenes are distinguished. An analysis of the relationship between similarity and change provides a unifying basis for the approach and a theoretical foundation for measures satisfying important similarity properties such as asymmetry and context dependence. The classification of attributes into categories with common structural and cognitive characteristics drives the implementation of a small core of generic functions, able to perform any type of attribute value assessment. Appropriate techniques combine such atomic assessments to compute similarities at the object level and to handle more complex inquiries with multiple constraints. These techniques, along with a solid graph-theoretical methodology adapted to the particularities of the geospatial domain, provide the foundation for reasoning about scene similarity queries. Provisions are made so that all methods comply with major psychological findings about people’s perceptions of similarity. An experimental evaluation supplies the main result of this thesis, which separates psychological findings with a major impact on the results from those that can be safely incorporated into the framework through computationally simpler alternatives

    How hand movements and speech tip the balance in cognitive development:A story about children, complexity, coordination, and affordances

    Get PDF
    When someone asks us to explain something, such as how a lever or balance scale works, we spontaneously move our hands and gesture. This is also true for children. Furthermore, children use their hands to discover things and to find out how something works. Previous research has shown that children’s hand movements hereby are ahead of speech, and play a leading role in cognitive development. Explanations for this assumed that cognitive understanding takes place in one’s head, and that hand movements and speech (only) reflect this. However, cognitive understanding arises and consists of the constant interplay between (hand) movements and speech, and someone’s physical and social environment. The physical environment includes task properties, for example, and the social environment includes other people. Therefore, I focused on this constant interplay between hand movements, speech, and the environment, to better understand hand movements’ role in cognitive development. Using science and technology tasks, we found that children’s speech affects hand movements more than the other way around. During difficult tasks the coupling between hand movements and speech becomes even stronger than in easy tasks. Interim changes in task properties differently affect hand movements and speech. Collaborating children coordinate their hand movements and speech, and even their head movements together. The coupling between hand movements and speech is related to age and (school) performance. It is important that teachers attend to children’s hand movements and speech, and arrange their lessons and classrooms such that there is room for both

    Acta Cybernetica : Volume 18. Number 2.

    Get PDF

    DESIGN AND EXPLORATION OF NEW MODELS FOR SECURITY AND PRIVACY-SENSITIVE COLLABORATION SYSTEMS

    Get PDF
    Collaboration has been an area of interest in many domains including education, research, healthcare supply chain, Internet of things, and music etc. It enhances problem solving through expertise sharing, ideas sharing, learning and resource sharing, and improved decision making. To address the limitations in the existing literature, this dissertation presents a design science artifact and a conceptual model for collaborative environment. The first artifact is a blockchain based collaborative information exchange system that utilizes blockchain technology and semi-automated ontology mappings to enable secure and interoperable health information exchange among different health care institutions. The conceptual model proposed in this dissertation explores the factors that influences professionals continued use of video- conferencing applications. The conceptual model investigates the role the perceived risks and benefits play in influencing professionals’ attitude towards VC apps and consequently its active and automatic use

    Comparative Analysis of Different Trust Metrics of User-User Trust-Based Recommendation System

    Get PDF
    Information overload is the biggest challenge nowadays for any website, especially e-commerce websites. However, this challenge arises for the fast growth of information on the web (WWW) with easy access to the internet. Collaborative filtering based recommender system is the most useful application to solve the information overload problem by filtering relevant information for the users according to their interests. But, the existing system faces some significant limitations such as data sparsity, low accuracy, cold-start, and malicious attacks. To alleviate the mentioned issues, the relationship of trust incorporates in the system where it can be between the users or items, and such system is known as the trust-based recommender system (TBRS). From the user perspective, the motive of the TBRS is to utilize the reliability between the users to generate more accurate and trusted recommendations. However, the study aims to present a comparative analysis of different trust metrics in the context of the type of trust definition of TBRS. Also, the study accomplishes twenty-four trust metrics in terms of the methodology, trust properties \& measurement, validation approaches, and the experimented dataset

    Multimedia

    Get PDF
    The nowadays ubiquitous and effortless digital data capture and processing capabilities offered by the majority of devices, lead to an unprecedented penetration of multimedia content in our everyday life. To make the most of this phenomenon, the rapidly increasing volume and usage of digitised content requires constant re-evaluation and adaptation of multimedia methodologies, in order to meet the relentless change of requirements from both the user and system perspectives. Advances in Multimedia provides readers with an overview of the ever-growing field of multimedia by bringing together various research studies and surveys from different subfields that point out such important aspects. Some of the main topics that this book deals with include: multimedia management in peer-to-peer structures & wireless networks, security characteristics in multimedia, semantic gap bridging for multimedia content and novel multimedia applications
    • …
    corecore