
    Sharing and Reusing Semantic Queries for Searching the Web

    This Bachelor's Thesis was performed during a study stay at the Université de Nice, France. It is about developing a graphical user interface for drawing or generating intentional maps using goals and strategies.

    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.
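
    The RSS-plus-HTML pairing the report describes can be illustrated with a short sketch. This is a minimal example, assuming Python with the feedparser and BeautifulSoup libraries; the feed URL and record fields are placeholders for illustration, not details taken from the report:

        import feedparser
        import requests
        from bs4 import BeautifulSoup

        FEED_URL = "https://example.org/blog/feed"  # placeholder blog feed

        def extract_entries(feed_url):
            """Pair each RSS entry with its HTML page and pull out semantic markup."""
            feed = feedparser.parse(feed_url)
            records = []
            for entry in feed.entries:
                html = requests.get(entry.link, timeout=10).text
                soup = BeautifulSoup(html, "html.parser")
                # Microformats: an element classed h-entry marks a blog post.
                hentry = soup.find(class_="h-entry")
                # Microdata: itemprop attributes carry machine-readable properties.
                microdata = {tag["itemprop"]: tag.get_text(strip=True)
                             for tag in soup.find_all(attrs={"itemprop": True})}
                records.append({
                    "title": entry.get("title"),
                    "published": entry.get("published"),
                    "has_hentry": hentry is not None,
                    "microdata": microdata,
                })
            return records

    Pairing the feed (clean, structured, but shallow) with the HTML page (rich, but noisy) is what lets an unsupervised extractor learn where the post content sits in the markup.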

    Reasoning & Querying – State of the Art

    Various query languages for Web and Semantic Web data have emerged in recent years, both for practical use and as an area of research in the scientific community. At the same time, the broad adoption of the Internet, where keyword search is used in many applications such as search engines, has familiarized casual users with keyword queries for retrieving information. Unlike this easy-to-use style of querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, e.g., in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.
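
    The contrast between structured and keyword querying can be made concrete with a small sketch. This is a minimal illustration, assuming Python with rdflib; the tiny graph and the literal-matching heuristic are invented for the example and are not drawn from any of the surveyed languages:

        from rdflib import Graph, Literal

        TURTLE = """
        @prefix ex: <http://example.org/> .
        ex:alice ex:name "Alice" ; ex:worksAt ex:uni .
        ex:uni ex:name "University of Munich" .
        """

        g = Graph()
        g.parse(data=TURTLE, format="turtle")

        # Structured access: SPARQL requires knowing the vocabulary
        # (ex:name, ex:worksAt) as well as the query language itself.
        sparql = """
        PREFIX ex: <http://example.org/>
        SELECT ?who WHERE { ?who ex:worksAt ?org .
                            ?org ex:name "University of Munich" . }
        """
        print([str(row.who) for row in g.query(sparql)])

        # Keyword-style access: match literals against a keyword,
        # with no knowledge of the schema or the query language.
        def keyword_search(graph, keyword):
            hits = set()
            for s, p, o in graph:
                if isinstance(o, Literal) and keyword.lower() in str(o).lower():
                    hits.add(s)
            return hits

        print(keyword_search(g, "munich"))

    The keyword variant returns candidate subjects rather than precise bindings, which is exactly the trade-off the surveyed keyword languages for XML and RDF try to soften.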

    Discovery of Virulent Apps and Fake Grading in Play Store

    Existing content-based analysis tools not only incur high complexity and cost, but also fail to handle large volumes of files effectively. This document presents a near-real-time scheme, called RTS, to support efficient and cost-effective searchable data analytics in the cloud. The proposed RTS methodology is implemented as a system-level solution that can work on existing systems, such as the Hadoop file system, using the general file-system interface and exploiting the correlation properties of the information. RTS extracts key property information of a given file type as multidimensional features and represents these details as multidimensional vectors; it exploits the correlation property of information through hashing and a structured, manageable addressing scheme, and uses VFS operations to support semantic grouping, so that data can be retrieved from the closest matching pages and delivered to the requesting device. An intuitive idea is to significantly reduce the number of images processed by analyzing only the most representative ones instead of all of them, especially since a mobile phone has a limited energy budget. We evaluated a real-world use case in which missing children in a very crowded environment are identified in a timely manner by analyzing 60 million images with RTS.
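
    The vectors-plus-hashing idea can be sketched briefly. This is a minimal illustration, assuming Python with NumPy; random-hyperplane hashing (a common locality-sensitive hashing scheme) stands in here for whatever hashing RTS actually uses, which the abstract does not specify:

        import numpy as np

        rng = np.random.default_rng(0)
        DIM, BITS = 64, 16  # feature dimensionality and hash width (illustrative)
        planes = rng.normal(size=(BITS, DIM))  # random hyperplanes

        def semantic_hash(vec):
            """Map a multidimensional feature vector to a short binary signature.
            Nearby vectors tend to collide, which groups semantically similar files."""
            return tuple(((planes @ vec) > 0).astype(int))

        def bucket(files):
            """Group files whose signatures match, approximating semantic grouping."""
            groups = {}
            for name, vec in files.items():
                groups.setdefault(semantic_hash(vec), []).append(name)
            return groups

        # Two similar feature vectors share a bucket; a dissimilar one does not.
        base = rng.normal(size=DIM)
        files = {"a.jpg": base,
                 "b.jpg": base + 0.01 * rng.normal(size=DIM),
                 "c.jpg": rng.normal(size=DIM)}
        print(bucket(files))

    Hashing vectors into buckets is what makes the search "near real time": a query only inspects the handful of files in its own bucket instead of scanning the entire corpus.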

    Multivalent Metadata : Exploiting the Layers of Meaning in Digital Resources

    The rapid growth of the World Wide Web was due in part to the simplicity of the Hypertext Markup Language (HTML). It is anticipated that the next generation of web technology, coined the Semantic Web by Tim Berners-Lee (1989, p. 1), will be driven by the Extensible Markup Language (XML). The XML suite of technologies provides a framework for applying metadata, and hence semantic information, to web resources. Advantages of a semantic web include improved sharing and reuse of resources, enhanced search mechanisms, and knowledge management. The knowledge or meaning contained in digital information may vary according to the perspective of the viewer and can therefore be seen as multivalent in nature. Semantic information that is highly relevant to one user may be of no interest to another. The aim of this project was to demonstrate the layers of meaning inherent in a data sample and how they could be encapsulated in metadata, then accessed and manipulated using current technologies, thus leveraging the knowledge contained. Analysis of the data sample, a typical component of an online training product, determined meaningful ways in which the knowledge contained could be reused and adapted. From this analysis a set of test criteria was generated. Metadata was then created for the sample data and the tests were implemented using a range of XML technologies.
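
    The notion of layered, viewer-dependent metadata can be sketched with a small example. This is a minimal illustration using Python's standard-library ElementTree; the element and attribute names are invented for the example and are not taken from the project's actual schema:

        import xml.etree.ElementTree as ET

        # One resource annotated with several metadata "layers"; each audience
        # (course designer, delivery system, licensing system) reads only the
        # layer relevant to it.
        SAMPLE = """
        <resource id="lesson-42">
          <meta layer="pedagogy" difficulty="intermediate" duration="20min"/>
          <meta layer="technical" format="xhtml" size="120KB"/>
          <meta layer="rights" license="CC-BY" owner="Example Corp"/>
          <body>How to configure a router...</body>
        </resource>
        """

        root = ET.fromstring(SAMPLE)

        def layer(root, name):
            """Return the attributes of one metadata layer, ignoring the others."""
            el = root.find(f"meta[@layer='{name}']")
            return dict(el.attrib) if el is not None else {}

        print(layer(root, "pedagogy"))   # what a course designer cares about
        print(layer(root, "rights"))     # what a licensing system cares about

    Because each layer lives in the same document, the same training component can be repurposed by different consumers without duplicating the content itself.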

    XML Beyond the Tags

    XML is quickly being utilized in the field of technical communication to transfer information from database to person and from company to company. Often communicators structure information without a second thought about how or why certain tags are used to mark it up: because the company or a manual says to use those tags, the communicator does so. However, if professionals want to unlock the true potential of XML for better sharing of information across platforms, they need to understand the effects that technologies using XML, as well as political and cultural factors, have on the tags being used. This thesis reviewed literature from multiple fields utilizing XML to find how tag choices can be influenced. XML allows for the sharing of information across multiple platforms and databases. Because of this efficiency, XML is utilized by many technologies. Often communicators must tag information so that these technologies can find the marked-up information; therefore, technologies like single sourcing, data mining, and knowledge management influence the types of tags created. Additionally, cultural and political influences are analyzed to see how they play a role in determining what tags are used and created for specific documents. The thesis concludes with predictions on the future of XML and the technological, political, and cultural influences associated with XML tag sets, based on information found within the thesis.

    Software supply chain monitoring in containerised open-source digital forensics and incident response tools

    Abstract. The legal context makes software development challenging for the tool-oriented Digital Forensics and Incident Response (DFIR) field. Digital evidence must be complete, accurate, reliable, and acquired by reproducible methods in order to be used in court. However, a lack of sufficient software quality is a well-known problem in this context. The popularity of Open-source Software (OSS) based development has increased tool availability on different channels, highlighting their varying quality. The lengthened software supply chain has introduced additional factors affecting tool quality and control over the use of the exact software version. Prior research on the quality level has primarily targeted the fundamental codebase of a tool, not its underlying dependencies. There is no research on the role of the software supply chain in quality factors in the DFIR context. The research in this work focuses on a container-based package ecosystem, in a case study covering 51 maintained open-source DFIR tools published as Open Container Initiative (OCI) containers. The package ecosystem was improved, and an experimental system was implemented to monitor upstream release version information and provide it to both package maintainers and end users. The system guarantees that the described tool version matches the actual version of the tool package and that all information about tool versions is available. The primary purpose is to bring more control over the packages and to support the reproducibility and documentation requirements of investigations, while also helping with maintenance work. The tools were also monitored and maintained for six months to observe software dependency-related factors affecting tool functionality between versions. After that period, maintenance was halted for an additional six months, and the current package version of each tool was rebuilt to limit the gathered information to the changed dependencies. A significant number of different build-time and runtime failures were discovered, which either prevented or hindered tool installation or significantly affected the tool's use in the investigation process. Undocumented, changed, or too-new environment-related dependencies were the major factors leading to tool failures. These findings support known software dependency-related problems. The nature of the failures suggests that tool package maintainers must possess a broad range of skills to produce operational tool packages, and that maintenance is an effort-intensive job. If an investigator does not have similar skills and a dependency-related failure is present in the software, the software may not be usable.
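
    The declared-versus-actual version check at the heart of the monitoring system can be illustrated with a sketch. This is a minimal example, assuming Python with the requests library and a local Docker daemon; the image name, the upstream GitHub repository, and the decision to rely on the standard OCI version label are placeholder assumptions, not details from the thesis:

        import json
        import subprocess
        import requests

        IMAGE = "example/dfir-tool:latest"  # placeholder OCI image
        UPSTREAM = "https://api.github.com/repos/example/dfir-tool/releases/latest"

        def declared_version(image):
            """Read the version label baked into the container image."""
            out = subprocess.run(
                ["docker", "inspect", "--format", "{{json .Config.Labels}}", image],
                capture_output=True, text=True, check=True).stdout
            labels = json.loads(out) or {}
            # org.opencontainers.image.version is the standard OCI annotation key.
            return labels.get("org.opencontainers.image.version")

        def upstream_version(api_url):
            """Fetch the newest upstream release tag."""
            return requests.get(api_url, timeout=10).json().get("tag_name", "").lstrip("v")

        if __name__ == "__main__":
            have, want = declared_version(IMAGE), upstream_version(UPSTREAM)
            status = "up to date" if have == want else f"stale (upstream {want})"
            print(f"{IMAGE}: declared {have}, {status}")

    Running such a check on a schedule for all 51 tool images is one way to give both maintainers and investigators the version transparency that reproducible evidence handling requires.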

    TwiddleNet metadata tagging and data dissemination in mobile device networks

    Mobile devices are no longer what they were only a few years ago; instead they offer a range of content-capture capabilities, including high-resolution photos, videos, and sound recordings. Their communication modalities and processing power have also evolved significantly. Modern mobile devices are very capable platforms, many surpassing their desktop cousins of only a few years removed. TwiddleNet is a distributed architecture of personal servers that harnesses the power of these mobile devices, enabling real-time information dissemination and file sharing of multiple data types from commercial off-the-shelf platforms. This thesis focuses on two specific issues of the TwiddleNet design: metadata tagging and data dissemination. Through a combination of automatically generated and user-input metadata tag values, TwiddleNet users can locate files across participating devices. Metaphor-appropriate custom tags can be added as needed to ensure efficient, rich, and successful file searches. Intelligent data dissemination algorithms provide context-sensitive governance of the file transfer scheme. Smart dissemination reconciles device and operational states with the amount of requested data and content to send, enabling providers to meet their most pressing needs, whether that is continuing to generate content or servicing requests.
    Source: http://archive.org/details/twiddlenetmetada109453333. US Navy (USN) author. Approved for public release; distribution is unlimited.
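
    The two design concerns, tagging and dissemination, can be sketched together. This is a minimal illustration in Python; the field names and the battery-threshold policy are invented stand-ins for TwiddleNet's actual rules, which the abstract does not detail:

        from dataclasses import dataclass, field
        from datetime import datetime, timezone

        @dataclass
        class SharedFile:
            name: str
            kind: str                                        # photo, video, audio
            auto_tags: dict = field(default_factory=dict)    # generated at capture
            user_tags: set = field(default_factory=set)      # added by the owner

        def capture(name, kind):
            """Attach automatically generated metadata at capture time."""
            f = SharedFile(name, kind)
            f.auto_tags = {"captured": datetime.now(timezone.utc).isoformat(),
                           "device": "phone-01"}             # hypothetical device id
            return f

        def matches(f, query):
            """Locate files across devices by auto-generated or custom tags."""
            return query in f.user_tags or query in f.auto_tags.values() or query == f.kind

        def should_send(battery_pct, pending_mb, size_mb):
            """Context-sensitive dissemination: defer large transfers on low battery
            so the device can keep generating content instead of serving requests."""
            if battery_pct < 20:
                return size_mb < 1       # only tiny files when nearly drained
            return pending_mb + size_mb < 500

        photo = capture("beach.jpg", "photo")
        photo.user_tags.add("vacation")
        print(matches(photo, "vacation"),
              should_send(battery_pct=15, pending_mb=0, size_mb=4.0))

    The point of the split is that search quality depends on the tag vocabulary, while transfer behavior depends on device state; the two mechanisms can therefore evolve independently.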

    Semantic multimedia modelling & interpretation for annotation

    The emergence of multimedia-enabled devices, particularly the incorporation of cameras in mobile phones, and the accelerated revolution in low-cost storage devices have boosted the multimedia data production rate drastically. Witnessing such a ubiquity of digital images and videos, the research community has been raising the issue of their meaningful utilization and management. Stored in monumental multimedia corpora, digital data need to be retrieved and organized in an intelligent way, leaning on the rich semantics involved. The utilization of these image and video collections demands proficient image and video annotation and retrieval techniques. Recently, the multimedia research community has progressively shifted its emphasis to the personalization of these media. The main impediment in image and video analysis is the semantic gap, which is the discrepancy between a user's high-level interpretation of an image or video and its low-level computational interpretation. Content-based image and video annotation systems are remarkably susceptible to the semantic gap due to their reliance on low-level visual features for delineating semantically rich image and video contents. However, visual similarity is not semantic similarity, so there is a demand to break through this dilemma in an alternative way. The semantic gap can be narrowed by incorporating high-level and user-generated information in the annotation. High-level descriptions of images and videos are more proficient at capturing the semantic meaning of multimedia content, but it is not always practical to collect this information. It is commonly agreed that the problem of high-level semantic annotation of multimedia is still far from being solved. This dissertation puts forward approaches for intelligent multimedia semantic extraction for high-level annotation, intending to bridge the gap between visual features and semantics. It proposes a framework for annotation enhancement and refinement for object/concept-annotated image and video datasets. The overall theme is to first purify the datasets of noisy keywords and then expand the concepts lexically and commonsensically to fill the vocabulary and lexical gap and achieve high-level semantics for the corpus. The dissertation also explores a novel approach for high-level semantic (HLS) propagation through image corpora. HLS propagation takes advantage of the semantic intensity (SI), the concept-dominancy factor in an image, together with annotation-based semantic similarity between images: an image is a combination of various concepts, some of which are more dominant than others, and the semantic similarity of two images is based on their SI values and the semantic similarity of their concepts. Moreover, HLS propagation exploits clustering techniques to group similar images, so that a single effort by a human expert to assign a high-level semantic to a randomly selected image propagates to the other images in its cluster. The investigation has been made on the LabelMe image and LabelMe video datasets. Experiments exhibit that the proposed approaches yield a noticeable improvement towards bridging the semantic gap and reveal that the proposed system outperforms traditional systems.
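
    The semantic-intensity idea can be made concrete with a sketch. This is a minimal illustration in Python; the SI weights, the toy concept-similarity function, and the threshold-based propagation are simplified stand-ins for the dissertation's actual formulation:

        # Each image is a set of concepts with semantic-intensity (SI) weights:
        # how dominant each concept is within the image.
        images = {
            "img1": {"dog": 0.7, "grass": 0.3},
            "img2": {"dog": 0.6, "ball": 0.4},
            "img3": {"car": 0.8, "road": 0.2},
        }

        def concept_sim(a, b):
            """Toy lexical similarity: identical concepts only (a stand-in for a
            WordNet-style or commonsense similarity measure)."""
            return 1.0 if a == b else 0.0

        def image_sim(x, y):
            """SI-weighted similarity: shared concepts count more when they are
            dominant in both images."""
            return sum(wx * wy * concept_sim(cx, cy)
                       for cx, wx in x.items() for cy, wy in y.items())

        def propagate(images, seed, label, threshold=0.3):
            """Assign an expert's high-level label to all images similar to the seed,
            mimicking propagation through a cluster of similar images."""
            seed_vec = images[seed]
            return {name: label for name, vec in images.items()
                    if image_sim(seed_vec, vec) >= threshold}

        # One expert annotation on img1 spreads to img2 but not to img3.
        print(propagate(images, "img1", "pets outdoors"))

    Weighting by SI means a faint background concept contributes little to similarity, so a single expert label spreads only to images where the same concepts dominate.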