
    The Use of Controlled Vocabularies and Structured Expressions in the Assurance of CPS

    To date, work on the development of assurance cases has largely been concerned with the broad structure and content of arguments used to contextualise the data. At a more detailed level, however, the use of natural language in an argument can lead to conflicting terminology, to difficulties in understanding the nature of the claims being made, or to logical inferences that are obscure to readers of the argument. This problem has become increasingly complex as more and more suppliers are involved in the development chain, making it more difficult to evaluate the strengths and weaknesses of assurance data or to re-use it. This paper explores the development of a controlled vocabulary and structured expressions for CPS in the automotive domain, using the Semantics of Business Vocabulary and Business Rules (SBVR) to improve communication and to provide some formal consistency checking of content. We highlight the challenges this work has exposed. Keywords: safety, assurance, controlled language, SBVR, automotive
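    To illustrate the kind of checking a controlled vocabulary enables, here is a minimal Python sketch. The vocabulary, the disallowed synonyms, and the sample claim are all invented for illustration and are not taken from the paper, which uses SBVR rather than ad hoc scripts:

# Hypothetical controlled vocabulary mapping approved terms to definitions;
# all terms, synonyms, and the sample claim are invented for illustration.
VOCABULARY = {
    "hazard": "a potential source of harm",
    "mitigation": "a measure that reduces the risk associated with a hazard",
    "brake controller": "the unit responsible for actuating the brakes",
}

# Non-approved wordings mapped to the approved term an author should use.
DISALLOWED_SYNONYMS = {
    "danger": "hazard",
    "brake ecu": "brake controller",
}

def check_claim(claim: str) -> list[str]:
    """Flag non-approved terminology in an assurance-case claim."""
    lowered = claim.lower()
    warnings = [
        f"replace '{bad}' with approved term '{good}'"
        for bad, good in DISALLOWED_SYNONYMS.items()
        if bad in lowered
    ]
    if not any(term in lowered for term in VOCABULARY):
        warnings.append("claim uses no approved vocabulary term")
    return warnings

print(check_claim("The brake ECU mitigates the danger of unintended braking."))
# ["replace 'danger' with approved term 'hazard'",
#  "replace 'brake ecu' with approved term 'brake controller'",
#  'claim uses no approved vocabulary term']

    A dictionary-based check like this catches only vocabulary drift; the structured SBVR expressions the paper describes additionally constrain how approved terms may be combined into claims.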

    A methodological approach on the creation of trustful test suites for grammar error detection

    Research in machine translation has been expanding over time, and so has the need to automatically detect and correct errors in texts. Unbabel therefore combines machine translation with human post-editors to provide high-quality translations. To assist post-editors in this task, Unbabel developed a proprietary error detection tool called Smartcheck to identify errors and suggest corrections. The state-of-the-art method for identifying translation errors depends on curated annotated texts (associated with error-type categories), which are fed to machine translation systems as their evaluation standard, i.e. the test suites used to evaluate a system’s error-detection accuracy. It is commonly assumed that evaluation sets are reliable and representative of the content the systems translate, leading to the assumption that the root problem usually lies in the grammar-checking rules. However, the issue may instead lie in the quality of the evaluation set; if so, decisions made on the basis of such evaluations may even have the opposite effect to the one intended. It is therefore of utmost importance to have suitable datasets representative of the structures each system needs, and Smartcheck is no exception. With this in mind, this dissertation developed and implemented a new methodology for creating reliable, revised test suites to be applied in the evaluation of MT systems and error detection tools. Using the resulting curated test suites to evaluate Unbabel’s proprietary systems and tools, it became possible to trust the conclusions and decisions drawn from those evaluations. This methodology accomplished robust identification of problematic error types, grammar-checking rules, and language- and/or register-specific issues, allowing production measures to be adopted. With Smartcheck’s (now reliable and accurate) correction suggestions and improved post-editing revision, the work presented here led to an improvement in the translation quality provided to customers.
The present work focused on evaluating the performance of Smartcheck, a proprietary Unbabel tool for automatic error detection based on segments previously annotated by the annotator community. A methodology was proposed for creating a test suite based on gold-standard reference data containing the relevant structures. This made it possible to improve the quality of Smartcheck’s correction suggestions and, consequently, of the translations provided. Beyond the initial objective, the new methodology also made it possible to guarantee a rigorous, appropriate, and well-founded evaluation of the rules Smartcheck uses to identify possible translation errors, as well as to evaluate Unbabel’s other machine translation systems and tools. Lingo24 recently merged with Unbabel, so the corpus includes content translated by both; the work developed here thus also contributed to the integration of Lingo24. Section 2 introduces Unbabel, covering the quality-control processes used to ensure the required quality levels and describing the tool under study, Smartcheck, in detail.
Section 3 addresses the state of the art in machine translation and quality-control processes, with particular attention to test suites and their influence, and also describes the development of automatic error detection and correction tools created to improve machine-translated texts. The methodology, described in Section 4, was divided into three main parts: a pilot evaluation of Smartcheck’s pre-existing rules; a root-cause analysis of errors; and, finally, the construction of a new test suite with more recent, corrected data. The first step was to evaluate the performance of the tool under study. A pilot analysis was carried out in which each rule used by Smartcheck was assessed according to metrics commonly applied to error detection systems: the number of true positives (cases in which the system correctly identified an error), false negatives (cases in which an error existed but the system did not identify it), and false positives (cases in which the system incorrectly flagged an error). Precision, recall, and F1-score were then calculated from these counts. The pilot evaluation concluded that not all rules could be evaluated (which made it impossible to assess each rule’s individual performance), and the results for the rules that were evaluated were unsatisfactory: the rules failed to identify existing translation errors and flagged many grammatically correct segments as problematic. The second stage of the methodology then sought possible reasons for the poor performance of Smartcheck and its rules. The hypothesis was that the rules had been evaluated with an inappropriate, outdated test suite, which would explain the very low metrics of the pilot evaluation. This hypothesis arose both because the corpus data might not be representative of current translations and because the structures that are problematic for translation systems change constantly. To test it, the corpus was analysed against several criteria: the type of translation (whether the segments had been revised by post-editors before submission); the existence of duplicated segments or segments whose source text might contain errors (i.e. noisy data); and a review of the annotations and severities assigned to each error according to Unbabel’s typologies and guidelines (counting correctly assigned, incorrectly assigned, and missing annotations and severities). The analysis showed that about 20% of the data were duplicates (in both the formal and the informal register), that 15–25% of the annotations were incorrect, and that only half of the severities had been correctly assigned.
It was therefore deemed more advantageous to build a new, representative, refined corpus than to correct all the incorrect annotations in the previous one. The third and final step of the methodology was the construction of a new test suite with 27,500 previously annotated machine translation examples. The procedures for creating this new corpus included: filtering a set of machine translations with representative data for all languages supported by Unbabel; distinguishing between context-dependent and context-independent segments (a limitation of the previous corpus); excluding duplicated examples and cases with problematic source texts; and, finally, a review by linguists and translators of the assigned annotations, following proprietary typologies. This last procedure was subdivided into: a general evaluation, to guarantee that the translations conveyed the message of the source text coherently, fluently, and appropriately, and followed language-specific rules; an evaluation focused on client-specific requirements, to enforce existing guidelines; and a review of the severities associated with each annotation. With the methodology complete, the test suite consisted of a trustworthy dataset capable of evaluating machine translation systems and tools such as Smartcheck objectively and on solid grounds. The evaluations described in Section 5 used the corpus data as their point of comparison. The first evaluation compared the results of the pilot analysis of Smartcheck’s rules with a new evaluation of the same rules using the new test suite, in order to reach more reliable and credible conclusions. It showed that, contrary to the earlier findings, all rules could now be evaluated, and that the number of cases in which Smartcheck incorrectly flagged segments as problematic was reduced. The next evaluation compared annotations using a confusion matrix of the predictions given by Smartcheck and by the test suite, identifying the most frequent error types and the types that were most (and least) difficult for the system to identify. Taking the test suite as the gold standard, a global evaluation of Smartcheck counted the total number of false positives (about 45%), false negatives (35%), and true positives (approximately 20%). The true positives were further divided into two types: segments correctly identified by Smartcheck as errors but classified incorrectly (about 11%); and errors for which both the span and the classification were correct (around 8% of the total number of annotations). The third and final analysis used the totals from the previous evaluation to calculate precision, recall, and F1-score for each supported language and register. Average precision was well balanced across registers, but the same did not hold for recall or F1-score, where the formal register reached higher values.
The corpus was also used to evaluate the spell checkers used by Unbabel; the spell checker then in use obtained the lowest score, and it was decided to replace it with the best-scoring one in order to reduce the number of errors in translations and thus improve their quality. The work carried out could be applied in several fields beyond the one initially established, i.e. beyond the systematic evaluation of Smartcheck, demonstrating the impact that a well-founded analysis can have on decision-making: without a representative, structured test suite, evaluations are not valid and their results can easily lead to inappropriate, or even harmful, conclusions for the development of the systems and tools in question.
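To make the evaluation arithmetic concrete, here is a minimal sketch in Python (not Unbabel code; the counts simply mirror the 45%/35%/20% proportions reported above, purely for illustration) of how precision, recall, and F1-score follow from the true-positive, false-positive, and false-negative totals:

from dataclasses import dataclass

@dataclass
class Counts:
    tp: int  # true positives: errors the tool correctly flagged
    fp: int  # false positives: correct segments flagged as errors
    fn: int  # false negatives: real errors the tool missed

def precision(c: Counts) -> float:
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0

def recall(c: Counts) -> float:
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0

def f1(c: Counts) -> float:
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Illustrative counts mirroring the reported proportions
# (45% FP, 35% FN, 20% TP out of 100 annotations).
c = Counts(tp=20, fp=45, fn=35)
print(f"precision={precision(c):.2f} recall={recall(c):.2f} f1={f1(c):.2f}")
# precision=0.31 recall=0.36 f1=0.33

In the dissertation the same computation is repeated per rule, per language, and per register, which is what exposes the gap between the formal and informal registers noted above.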

    Generative AI in the Construction Industry: A State-of-the-art Analysis

    The construction industry is a vital sector of the global economy, but it faces many productivity challenges in various processes, such as design, planning, procurement, inspection, and maintenance. Generative artificial intelligence (AI), which can create novel and realistic data or content, such as text, images, video, or code, based on some input or prior knowledge, offers innovative and disruptive solutions to address these challenges. However, there is a gap in the literature on the current state, opportunities, and challenges of generative AI in the construction industry. This study aims to fill this gap by providing a state-of-the-art analysis of generative AI in construction, with three objectives: (1) to review and categorize the existing and emerging generative AI opportunities and challenges in the construction industry; (2) to propose a framework for construction firms to build customized generative AI solutions using their own data, comprising steps such as data collection, dataset curation, training a custom large language model (LLM), model evaluation, and deployment; and (3) to demonstrate the framework via a case study of developing a generative model for querying contract documents. The results show that retrieval-augmented generation (RAG) improves the baseline LLM by 5.2%, 9.4%, and 4.8% in terms of quality, relevance, and reproducibility, respectively. This study provides academics and construction professionals with a comprehensive analysis and practical framework to guide the adoption of generative AI techniques to enhance productivity, quality, safety, and sustainability across the construction industry. Comment: 74 pages, 11 figures, 20 tables
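    As a rough illustration of the retrieval-augmented generation step in the case study, the sketch below (in Python) retrieves the contract passages most similar to a query and assembles them into a prompt. It is not the paper's implementation: it substitutes a toy bag-of-words retriever for learned embeddings, the clauses are invented, and the final call to an LLM is left as a placeholder comment:

import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    # Bag-of-words term counts; a real system would use learned embeddings.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    # Rank candidate passages by similarity to the query and keep the top k.
    q = tokenize(query)
    return sorted(passages, key=lambda p: cosine(q, tokenize(p)), reverse=True)[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    # Ground the model's answer in the retrieved excerpts only.
    context = "\n".join(f"- {p}" for p in retrieve(query, passages))
    return ("Answer the question using only the contract excerpts below.\n"
            f"Excerpts:\n{context}\n\nQuestion: {query}\nAnswer:")

# Invented clauses standing in for a curated set of contract documents.
clauses = [
    "The contractor shall submit progress reports every 30 days.",
    "Retainage of 5% applies to each progress payment.",
    "Substantial completion is required within 540 calendar days.",
]

print(build_prompt("When are progress reports due?", clauses))
# The assembled prompt would then be sent to the chosen LLM.

    Grounding the prompt in retrieved excerpts is what lets RAG improve answer quality and reproducibility over the baseline LLM, since the model no longer has to rely solely on its pre-trained knowledge of the contracts.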

    Latent Print Examination and Human Factors: Improving the Practice Through a Systems Approach: The Report of the Expert Working Group on Human Factors in Latent Print Analysis

    Fingerprints have provided a valuable method of personal identification in forensic science and criminal investigations for more than 100 years. Fingerprints left at crime scenes generally are latent prints—unintentional reproductions of the arrangement of ridges on the skin made by the transfer of materials (such as amino acids, proteins, polypeptides, and salts) to a surface. Palms and the soles of feet also have friction ridge skin that can leave latent prints. The examination of a latent print consists of a series of steps involving a comparison of the latent print to a known (or exemplar) print. Courts have accepted latent print evidence for the past century. However, several high-profile cases in the United States and abroad have highlighted the fact that human errors can occur, and litigation and expressions of concern over the evidentiary reliability of latent print examinations and other forensic identification procedures have increased in the last decade. “Human factors” issues can arise in any experience- and judgment-based analytical process such as latent print examination. Inadequate training, extraneous knowledge about the suspects in the case or other matters, poor judgment, health problems, limitations of vision, complex technology, and stress are but a few of the factors that can contribute to errors. A lack of standards or quality control, poor management, insufficient resources, and substandard working conditions constitute other potentially contributing factors.

    Roberto Gerhard’s Sound Compositions: A Historical-Philological Perspective. Archive, Process, Intent and reenactment

    This research advances the current state of knowledge in the field of early tape music, both empirically and methodologically. The purpose of this study is to evaluate the impact that the electronic medium exerted on the musical thinking of Roberto Gerhard, one of the most outspoken, prolific and influential composers of the Spanish diaspora, whose musical legacy, for the most part unknown, is a major landmark in the early history of electroacoustic music. Gerhard’s personal tape collection, one of the largest historical archives of its kind reported in the literature, is exceptional both for its antiquity (50+-year-old tapes) and for its abundance of production materials. Through the digitisation and analysis of the composer’s tape collection, this research argues that the empirical study of audio documents sets out a basis for a broader understanding of textual processes. More specifically, the research demonstrates that the reconstruction of works based on magnetic tape sketches is a powerful method for advancing the understanding of early tape music. This research also examines Gerhard’s sound compositions in relation to the post-war context in which they were composed. Finally, this research presents performance documentation that proposes an approach to the electroacoustic music repertoire in which creativity is not at odds with rigour and critical discernment, demonstrating that archival study can be closely aligned with the concept of re-enactment.

    Smart Tech is all Around us – Bridging Employee Vulnerability with Organizational Active Trust-Building

    Public and academic opinion remains divided regarding the benefits and pitfalls of datafication technology in organizations, particularly regarding its impact on employees. Taking a dual-process perspective on trust, we propose that datafication technology can create small, erratic surprises in the workplace that highlight employee vulnerability and increase employees’ reliance on the systematic processing of trust. We argue that these surprises precipitate a phase in the employment relationship in which employees more actively weigh trust-related cues, and the employer should therefore engage in active trust management to protect and strengthen the relationship. Our paper develops a framework of symbolic and substantive strategies to guide organizations’ active trust management efforts to (re-)create situational normality, root goodwill intentions, and enable a more balanced interdependence between the organization and its employees. We discuss the implications of our paper for reconciling competing narratives about the future of work and for developing an understanding of trust processes.

    Principles of Security and Trust

    This open access book constitutes the proceedings of the 8th International Conference on Principles of Security and Trust, POST 2019, which took place in Prague, Czech Republic, in April 2019, held as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019. The 10 papers presented in this volume were carefully reviewed and selected from 27 submissions. They deal with theoretical and foundational aspects of security and trust, including new theoretical results, practical applications of existing foundational ideas, and innovative approaches stimulated by pressing practical problems.

    Symposium on Forensic Expert Testimony, Daubert, and Rule 702


    Proceedings of the International Workshop on EuroPLOT Persuasive Technology for Learning, Education and Teaching (IWEPLET 2013)

    "This book contains the proceedings of the International Workshop on EuroPLOT Persuasive Technology for Learning, Education and Teaching (IWEPLET) 2013 which was held on 16.-17.September 2013 in Paphos (Cyprus) in conjunction with the EC-TEL conference. The workshop and hence the proceedings are divided in two parts: on Day 1 the EuroPLOT project and its results are introduced, with papers about the specific case studies and their evaluation. On Day 2, peer-reviewed papers are presented which address specific topics and issues going beyond the EuroPLOT scope. This workshop is one of the deliverables (D 2.6) of the EuroPLOT project, which has been funded from November 2010 – October 2013 by the Education, Audiovisual and Culture Executive Agency (EACEA) of the European Commission through the Lifelong Learning Programme (LLL) by grant #511633. The purpose of this project was to develop and evaluate Persuasive Learning Objects and Technologies (PLOTS), based on ideas of BJ Fogg. The purpose of this workshop is to summarize the findings obtained during this project and disseminate them to an interested audience. Furthermore, it shall foster discussions about the future of persuasive technology and design in the context of learning, education and teaching. The international community working in this area of research is relatively small. Nevertheless, we have received a number of high-quality submissions which went through a peer-review process before being selected for presentation and publication. We hope that the information found in this book is useful to the reader and that more interest in this novel approach of persuasive design for teaching/education/learning is stimulated. We are very grateful to the organisers of EC-TEL 2013 for allowing to host IWEPLET 2013 within their organisational facilities which helped us a lot in preparing this event. I am also very grateful to everyone in the EuroPLOT team for collaborating so effectively in these three years towards creating excellent outputs, and for being such a nice group with a very positive spirit also beyond work. And finally I would like to thank the EACEA for providing the financial resources for the EuroPLOT project and for being very helpful when needed. This funding made it possible to organise the IWEPLET workshop without charging a fee from the participants.

    Music Composed For Calm And Catharsis Using A Compositional Toolkit For Emotional Evocation - Inspired By And Directed Towards Healthcare Contexts And Self-Managed Wellness

    Emotional experience through music listening is universal. In the age of COVID-19 and an ever more mentally enslaved population, music that encourages calm and/or catharsis is more relevant than ever (Gallagher et al., 2020). As composers, can we form a framework for, and create music to, pointedly evoke an intentional emotion? This dissertation seeks to build on the solid foundation of past theories from music and emotion researchers, and to demonstrate how to further utilise the power that music has both in our everyday lives and in healthcare settings, providing as output a large suite of music for calm and catharsis, and a Compositional Toolbox for Emotional Evocation that composers might use to effect positive emotional change. In two pilot studies, one for children and one for adults, this dissertation tests music written using said Toolbox to observe its effect on arousal and pleasure. The studies also utilise visuals as a secondary means of sensory control, and investigate whether the multisensory application of music and visuals enhances emotional evocation over isolated experience. Participants rated, on a Likert-type scale, how they thought each sample would make someone feel, or how it made them feel. An analysis of pieces from these studies is included in this dissertation. Mixed-method, deductive, and thematic analysis was applied to the data, which were collected via surveys and interviews. It was found that music using the Toolbox was more emotionally evocative, more calming, and happier overall than music written without it. Most of the pieces achieved their emotional aims, and positive correlations emerged between the combined use of music and visuals. In one of the studies, music without visuals appeared calmer than music with visuals. This dissertation begins to promote the use of the Compositional Toolbox for Emotional Evocation as a framework for emotional composition.