    XQuery Summer Institute: Advancing XML-Based Scholarship from Representation to Discovery

    The XQuery Summer Institute at Vanderbilt University is aimed at archivists, librarians, professors, and students who have experience marking up texts in XML but do not yet know how to work computationally with those documents. The institute aspires to recruit twelve members of the digital humanities community to a two-week institute in June 2014. The faculty of the institute will teach participants to work productively with their XML-encoded texts using XQuery, a programming language designed specifically for XML. With XQuery, scholars can learn a single language to ingest their texts into an XML database, ask questions of them, connect them with other sources of information, and publish them on the web. Participants will go beyond using XML for representation to querying XML for discovery.

    Integrating Multiple Data Views for Improved Malware Analysis

    Malicious software (malware) has become a prominent fixture in computing. Many methods have been developed over the years to combat the spread of malware, but these methods have inevitably been met with countermeasures. For instance, signature-based malware detection gave rise to polymorphic viruses. This "arms race" will undoubtedly continue for the foreseeable future as the incentives to develop novel malware continue to outweigh the costs. In this dissertation, I describe analysis frameworks for three important problems related to malware: classification, clustering, and phylogenetic reconstruction. The key component of my methods is that they all take into account multiple views of malware. Typically, analysis has been performed in either the static domain (e.g., the byte information of the executable) or the dynamic domain (e.g., system call traces). This dissertation develops frameworks that can easily incorporate well-studied views from both domains, as well as any new views that may become popular in the future. The only restriction is that a positive semidefinite similarity (kernel) matrix must be defined on each view, a restriction that is easily met in practice. While the classification problem can be solved with well-known multiple kernel learning techniques, the clustering and phylogenetic problems required the development of novel machine learning methods, which I present in this dissertation. Although these methods were developed in the context of the malware problem, they are applicable to a wide variety of domains.
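
The positive semidefinite (PSD) requirement on each view can be illustrated with a minimal sketch. It is not the dissertation's method: the feature vectors, the linear kernel, the fixed equal weights, and the function names `linear_kernel`, `combine_kernels`, and `is_psd` are all assumptions made for illustration. It shows only that a nonnegative weighted sum of per-view PSD kernel matrices is again PSD, which is the property multiple kernel learning relies on; a real system would learn the weights rather than fix them.

```python
import math


def linear_kernel(features):
    """Gram matrix K[i][j] = <x_i, x_j> for one view's feature vectors."""
    return [[sum(a * b for a, b in zip(x, y)) for y in features] for x in features]


def combine_kernels(kernels, weights):
    """Weighted sum of same-sized kernel matrices (weights must be nonnegative)."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative to preserve PSD")
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


def is_psd(matrix, tol=1e-9):
    """Check symmetry, then attempt a Cholesky factorization that tolerates zero pivots."""
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                return False
    lower = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = matrix[j][j] - sum(lower[j][k] ** 2 for k in range(j))
        if pivot < -tol:
            return False  # negative pivot: not PSD
        diag = math.sqrt(pivot) if pivot > tol else 0.0
        lower[j][j] = diag
        for i in range(j + 1, n):
            rest = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if diag == 0.0:
                if abs(rest) > tol:
                    return False  # zero pivot with nonzero column: not PSD
            else:
                lower[i][j] = rest / diag
    return True


# Hypothetical feature vectors for three samples in two views.
static_view = [[1, 0], [0, 1], [1, 1]]   # e.g., byte-level features (made up)
dynamic_view = [[2], [1], [0]]           # e.g., system-call counts (made up)

k_static = linear_kernel(static_view)
k_dynamic = linear_kernel(dynamic_view)
combined = combine_kernels([k_static, k_dynamic], [0.5, 0.5])
```

Any view can be added the same way, provided its similarity matrix passes a check like `is_psd`. The absolute tolerance used here is a simplification and would need scaling for matrices with large entries.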