56 research outputs found

    Content-aware compression for big textual data analysis

    A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content, and compresses only the informational content. Thus, the compressed data is made transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.

    Multi Criteria Mapping Based on SVM and Clustering Methods

    The application process can be automated in many ways, for example with the commercial software that large organizations use to scan bills and forms, but such software works only for static frames or formats. In our application, we try to automate the processing of non-static frames, because the study certificates we receive come from different countries and different universities. Every university has its own certificate format, so we develop a new application that works for all frames or formats. We observe that many applicants come from the same university and therefore share a common certificate format; with a tool of this type, such certificates can be analyzed simply and in very little time. To make this process more accurate, we implement SVM and clustering methods. With these methods we can accurately map the courses in a certificate to the ASE study path or, failing that, to the exclude list. A grade calculation is done for the courses mapped to the ASE list, with the data for labs and courses separated. At the end, we award points, which include points from ASE-related courses, work experience, specialization certificates and German language skills. Finally, these points are provided to the chair to select applicants for the ASE master's course.

    National facilities study. Volume 2A: Facility Study Office on the National Wind Tunnel Complex

    The Facility Study Office (FSO) has completed its assigned activities. The results of the FSO efforts, studies, and assessments are documented. An overview of the FSO activities as well as a general comparison of all concepts considered are provided. Detailed information is also provided for the selected concept, Concept D-Option 5. Only findings are presented. The FSO developed recommendations only as a consequence of assumptions for cost and schedule assessments.

    Camas, Winter 2007-2008

    The Nature of a Literary Journal / Dave Loos -- Thirty Miles of Lead Time / Craig Rigdon -- Coyote Medicine / Merrilyne Lundahl -- On Natural Beauty / Alison Hawthorne Deming -- Finding Numbered Days / Matt Larson -- Ways of a Desert / Jacoba Charles -- Boy Missing Near Judith Gap / Chad Dundas -- Conservatory / M. Frost -- Pegasus in Montana / Alison Hawthorne Deming -- Forest Time / Alison Hawthorne Deming -- Uneasy With Montana / Victor Charlo -- On the Edges of Baranof / Gary Hawk -- The Window / Rick Kempa -- Manicure / Jessica Babcock -- My Hometown Farmer Men / Jessica Babcock -- Devils Tower / Jennifer Johnson -- Conservancy Pines / Jennifer Johnson -- Covenant / Melissa Mylchreest -- Vocabulary Lesson / Melissa Mylchreest -- Near the Base of Juniper Mountain / David Morris -- St. Thomas Reemerges from the Waters of Lake Mead / Cleo Woelfle-Erskine -- Norman Clyde / Paul Willis -- Coyote’s Relinquish Story / Abby Chew -- Cover Photo Nez Perce National Historic Park, Big Hole Battlefield, Montana / Staci Shor

    TUTTI! - Music Composition as Dialogue

    As an engineer, when I could not comprehend a physical phenomenon, I turned to mathematics. As a mathematician, when I could not link sciences to humanity, I turned to music. As a music composer, I no longer see things, I see others. The novel method of music composition presented herein is a first comprehensive framework, system and architectonic template relying on the ideologies of Mikhail Bakhtin's dialogism as well as on research in auditory perception and cognition to create music dialogue as a means of including and engaging participants in musical communication. Beyond immediate artistic intent, I strive to compose music that fosters inclusiveness and collaboration as a relational social gesture in hope that it might incite people and society to embrace their differences and collaborate with the 'others' around them. After probing aesthetics, communication studies and sociology, I argue that dialogism reveals itself well-suited to the aims of the current research. With dialogism as a guiding philosophy, the chapters then look at the relationship between music and language, perception as authorship, intertextuality, the interplay of imagination and understanding, means of arousal in music, mimesis, motion in music and rhythmic entrainment. Employing findings from Gestalt psychology, psychoacoustics, auditory scene analysis, cognition and psychology of expectation, the remaining chapters propose a cognitively informed polyphonic music composition method capable of reproducing the different constituents of dialogic communication by creating and organizing melodic, harmonic, rhythmic and structural elements. Music theory and principles of orchestration then move to music composition as examples demonstrate how dialogue scored between voice-parts provides opportunities for performers to interact with each other and, consequently, engage listeners experiencing the collaboration. 
As dialogue can be identified in various works, I postulate that the presented Dialogical Music Composition Method can also serve as a method of music analysis. This personal method of composition also supplies tools that other musicians can opt to employ when endeavouring to build balanced dialogue in music. If visibility is key to identity, then composing music that potentially enters into dialogue with each and every voice promotes 'humanity' through inclusivity, yielding a united Tutti.