2,279 research outputs found

    CHORUS Deliverable 3.4: Vision Document

    The goal of the CHORUS Vision Document is to create a high-level vision of audio-visual search engines in order to guide future R&D work in this area and to highlight trends and challenges in this domain. The vision of CHORUS is strongly connected to the CHORUS Roadmap Document (D2.3). A concise document integrating the outcomes of the two deliverables will be prepared for the end of the project (NEM Summit).

    SDK development for bridging heterogeneous data sources through connect bridge platform

    In this dissertation, we present an SDK for the creation of connectors to integrate with CB Server, which accelerates deployment, ensures best practices, and simplifies the various activities and tasks in the development process. The SDK provides a simple public API, supported by a set of tools that facilitate the development process by exploiting the API's facilities. To analyse the correctness, feasibility, completeness, and accessibility of our solution, we present two examples and case studies. From the case studies, we derived a list of issues found in our solution and a set of proposals for improvement. To evaluate the usability of the API, we established a methodology based on several usability evaluation methods. A multiple case study serves as the main evaluation method, combining several research methods. The case study consists of three evaluation phases: a hands-on workshop, a heuristic evaluation, and a subjective analysis. The case study involved three computer science engineers (including novice and expert developers and evaluators). The applied methodology generated insights based on an inspection method, a user test, and interviews. We identify not only problems and flaws in the source code, but also runtime, structural, and documentation problems, as well as problems related to user experience. To help draw conclusions from the results, we point out the context of the study. Future work will include the development of new functionalities. Additionally, we aim to solve the problems found through the methodology applied to evaluate the usability of the API, namely problems and flaws in the source code (e.g. validations) and structural problems.

    Development of Grid e-Infrastructure in South-Eastern Europe

    Over a period of 6 years and three phases, the SEE-GRID programme has established a strong regional human network in the area of distributed scientific computing and has set up a powerful regional Grid infrastructure. It attracted a number of user communities and applications from diverse fields from countries throughout South-Eastern Europe. From the infrastructure point of view, the first project phase established a pilot Grid infrastructure with more than 20 resource centers in 11 countries. During the subsequent two phases of the project, the infrastructure has grown to its current 55 resource centers with more than 6600 CPUs and 750 TB of disk storage, distributed across 16 participating countries. Including new resource centers in the existing infrastructure, as well as supporting new user communities, has demanded the setup of regionally distributed core services, the development of new monitoring and operational tools, and close collaboration of all partner institutions in managing such a complex infrastructure. In this paper we give an overview of the development and current status of the SEE-GRID regional infrastructure and describe its transition to the NGI-based Grid model in EGI, with strong SEE regional collaboration.
    Comment: 22 pages, 12 figures, 4 tables

    A Large-scale Benchmark for Log Parsing

    Log data is pivotal in activities like anomaly detection and failure diagnosis in the automated maintenance of software systems. Because logs are unstructured, log parsing is often required to transform them into a structured format for automated analysis. A variety of log parsers exist, making it vital to benchmark these tools to comprehend their features and performance. However, existing datasets for log parsing are limited in terms of scale and representativeness, posing challenges for studies that aim to evaluate or develop log parsers. This problem becomes more pronounced when these parsers are evaluated for production use. To address these issues, we introduce a new collection of large-scale annotated log datasets, named LogPub, which more accurately mirrors log data observed in real-world software systems. LogPub comprises 14 datasets, each averaging 3.6 million log lines. Utilizing LogPub, we re-evaluate 15 log parsers in a more rigorous and practical setting. We also propose a new evaluation metric to lessen the sensitivity of current metrics to imbalanced data distribution. Furthermore, we are the first to scrutinize the detailed performance of log parsers on logs that represent rare system events and offer comprehensive information for system troubleshooting. Parsing such logs accurately is vital yet challenging. We believe that our work could shed light on the design and evaluation of log parsers in more realistic settings, thereby facilitating their implementation in production systems.
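To make the idea of log parsing concrete, the following is a minimal sketch, not one of the 15 parsers evaluated in the paper and not the LogPub tooling. It assumes that variable fields are only IPv4 addresses, hexadecimal values, or integers, and replaces them with a `<*>` placeholder to recover a template. The names `VARIABLE`, `parse_line`, and `group_by_template` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: variable fields are IPv4 addresses, hex values, or integers.
# Real log parsers rely on much richer heuristics than this single pattern.
VARIABLE = re.compile(r"\b(?:\d{1,3}(?:\.\d{1,3}){3}|0x[0-9a-fA-F]+|\d+)\b")


def parse_line(line):
    """Split a raw log line into a template and its list of parameters."""
    params = VARIABLE.findall(line)
    template = VARIABLE.sub("<*>", line)
    return template, params


def group_by_template(lines):
    """Group the parameters of all lines that share the same template."""
    groups = defaultdict(list)
    for line in lines:
        template, params = parse_line(line)
        groups[template].append(params)
    return dict(groups)


if __name__ == "__main__":
    logs = [
        "Connection from 10.0.0.5 closed after 42 ms",
        "Connection from 10.0.0.9 closed after 7 ms",
    ]
    for template, params in group_by_template(logs).items():
        print(template, params)
```

Both sample lines collapse to a single template, which is the structured form that downstream tasks such as anomaly detection typically consume.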

    Enabling Data-Driven Transportation Safety Improvements in Rural Alaska

    Safety improvements require funding. A clear need must be demonstrated to secure funding. For transportation safety, data, especially data about past crashes, is the usual method of demonstrating need. However, in rural locations, such data is often not available, or is not in a form amenable to use in funding applications. This research helps rural entities, often federally recognized tribes and small villages, acquire the data needed for funding applications. The two work products are a traffic-counting application for an iPad or similar device, and a review of the data requirements of the major transportation funding agencies. The traffic-counting app, UAF Traffic, demonstrated its ability to count traffic and turning movements for cars and trucks, as well as ATVs, snow machines, pedestrians, bicycles, and dog sleds. The review of the major agencies demonstrated that all the likely funders would accept qualitative data and Road Safety Audits. However, quantitative data, if it was available, was helpful.

    Changeset-based Retrieval of Source Code Artifacts for Bug Localization

    Modern software development is extremely collaborative and agile, with unprecedented speed and scale of activity. Popular trends like continuous delivery and continuous deployment aim at building, fixing, and releasing software with greater speed and frequency. Bug localization, which aims to automatically localize bug reports to relevant software artifacts, has the potential to improve software developer efficiency by reducing the time spent on debugging and examining code. To date, this problem has been primarily addressed by applying information retrieval techniques based on static code elements, which are intrinsically unable to reflect how software evolves over time. Furthermore, as prior approaches frequently rely on exact term matching to measure relatedness between a bug report and a software artifact, they are prone to be affected by the lexical gap that exists between natural and programming language. This thesis explores using software changes (i.e., changesets), instead of static code elements, as the primary data unit to construct an information retrieval model for bug localization. Changesets, which represent the differences between two consecutive versions of the source code, provide a natural representation of a software change and make it possible to capture both the semantics of the source code and the semantics of the code modification. To bridge the lexical gap between source code and natural language, this thesis investigates topic modeling and deep learning architectures that create semantically rich data representations, with the goal of identifying latent connections between bug reports and source code. To show the feasibility of the proposed approaches, this thesis also investigates practical aspects related to using a bug localization tool, such as retrieval delay and training data availability.
The results indicate that the proposed techniques effectively leverage historical data about bugs and their related source code components to improve retrieval accuracy, especially for bug reports that are expressed in natural language, with little to no explicit code references. Further improvement in accuracy is observed when the size of the training dataset is increased through the data augmentation and data balancing strategies proposed in this thesis, although the magnitude of the improvement varies with the model architecture. In terms of retrieval delay, the results indicate that the proposed deep learning architecture significantly outperforms prior work and scales up with respect to search space size.
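As an illustration of the exact-term-matching style of retrieval that the abstract contrasts itself with, here is a minimal sketch that ranks changesets against a bug report by TF-IDF cosine similarity. It is not the thesis's topic-modeling or deep learning approach. The function names, the changeset texts, and the smoothed IDF formula are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def rank_changesets(bug_report, changesets):
    """Rank changeset ids by TF-IDF cosine similarity to the bug report.

    `changesets` maps a hypothetical changeset id to its text (message + diff).
    Returns a list of (changeset_id, score), best match first.
    """
    docs = {cid: Counter(tokenize(text)) for cid, text in changesets.items()}
    n = len(docs)
    # Document frequency: in how many changesets each term occurs.
    df = Counter(term for tf in docs.values() for term in tf)
    # Smoothed IDF (an assumption; other weightings work as well).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def weigh(tf):
        # Terms unseen in every changeset are dropped: exact matching only.
        return {t: f * idf[t] for t, f in tf.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    query = weigh(Counter(tokenize(bug_report)))
    scored = [(cid, cosine(query, weigh(tf))) for cid, tf in docs.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    history = {
        "c1": "fix null pointer in login handler",
        "c2": "update readme documentation",
        "c3": "refactor payment rounding",
    }
    print(rank_changesets("crash with null pointer when login", history))
    # A report with no shared vocabulary scores zero everywhere (the lexical gap).
    print(rank_changesets("app dies at sign in", history))
```

The second query shows the limitation the abstract describes: a report that shares no terms with any changeset cannot be matched, which is the gap that semantic representations aim to close.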