The Use of Controlled Vocabularies and Structured Expressions in the Assurance of CPS
To date, work on the development of assurance cases has largely been concerned with the broad structure and content of arguments that contextualise the data. At a more detailed level, however, the use of natural language in an argument can lead to conflicting terminology, to difficulties in understanding the nature of the claims being made, or to logical inferences that are obscure to readers of the argument. The problem has become increasingly complex as more and more suppliers are involved in the development chain, making it more difficult to evaluate the strengths and weaknesses of assurance data or to re-use it. This paper explores the development of a controlled vocabulary and structured expressions for CPS in the automotive domain, using the Semantics of Business Vocabulary and Business Rules (SBVR) to improve communication and to provide some formal consistency checking of content. We highlight the challenges this work has exposed.
Keywords: safety, assurance, controlled language, SBVR, automotive
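The kind of consistency checking the paper describes can be illustrated with a small sketch. Everything below (the vocabulary, the synonym table, the claim terms) is a hypothetical example, not content from the paper; it only shows the flavour of validating argument terminology against a controlled vocabulary:

```python
# Hypothetical controlled vocabulary for an automotive assurance argument.
CONTROLLED_VOCAB = {"hazard", "brake_controller", "failure_rate"}

# Known synonyms that should be rewritten to the controlled term.
SYNONYMS = {"danger": "hazard", "braking_unit": "brake_controller"}

def check_claim(claim_terms):
    """Return (term, suggested_replacement) pairs for non-approved terms;
    the suggestion is None when no controlled equivalent is known."""
    issues = []
    for term in claim_terms:
        if term in CONTROLLED_VOCAB:
            continue
        issues.append((term, SYNONYMS.get(term)))
    return issues

issues = check_claim(["danger", "failure_rate", "latency"])
```

Here `issues` flags "danger" (with "hazard" as the controlled replacement) and "latency" (unknown term, no suggestion), while "failure_rate" passes unchanged.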
A methodological approach on the creation of trustful test suites for grammar error detection
Machine translation research has been expanding over time, and so has the need to
automatically detect and correct errors in texts. Unbabel combines machine translation
with human editors in post-edition to provide high-quality translations. To assist
post-editors in this task, Unbabel developed a proprietary error detection tool,
Smartcheck, to identify errors and suggest corrections.
The state-of-the-art method of identifying translation errors depends on curated annotated
texts (associated with error-type categories), which are fed to machine translation systems as
their evaluation standard, i.e. the test suites used to evaluate a system's error detection
accuracy. It is commonly assumed that evaluation sets are reliable and representative of the
content the systems translate, which leads to the assumption that the root problem usually lies
in the grammar-checking rules. However, the issue may instead lie in the quality of the
evaluation set itself; if so, decisions made on the basis of that evaluation may even have the
opposite effect to the one intended. It is therefore of utmost importance to have suitable
datasets with data representative of the structures each system needs, and Smartcheck is no
exception.
With this in mind, this dissertation developed and implemented a new methodology for
creating reliable, revised test suites to be applied in the evaluation of MT systems
and error detection tools. Using the resulting curated test suites to evaluate systems
and tools proprietary to Unbabel, it became possible to trust the conclusions and
decisions drawn from those evaluations. The methodology achieved robust identification
of problematic error types, grammar-checking rules, and language- and/or register-specific
issues, allowing production measures to be adopted. With Smartcheck's (now reliable and
accurate) correction suggestions and the resulting improvement in post-edition revision,
the work presented hereafter led to an improvement in the translation quality provided to
customers.

This work focused on evaluating the performance of Smartcheck, a proprietary Unbabel
tool for automatic error detection based on segments previously annotated by the
annotator community. A methodology was proposed for creating a test suite based on
reference (gold) data containing the relevant structures. This made it possible to
improve the quality of Smartcheck's error-correction suggestions and, consequently, of
the translations provided. Beyond the initial objective, the new methodology also
enabled a rigorous, appropriate, and well-founded evaluation of the rules Smartcheck
uses to identify possible translation errors, as well as of Unbabel's other tools and
machine translation systems. Lingo24 recently merged with Unbabel, so the data in the
corpus include content translated by both; the work presented here therefore also
contributed to the recent integration of Lingo24.
Section 2 presents Unbabel, describing the quality-control processes used to ensure the
required quality levels and giving a detailed description of the tool in focus,
Smartcheck. Section 3 covers the state of the art in Machine Translation and in
quality-control processes, paying particular attention to test suites and their
influence; it also describes the development of automatic error detection and
correction tools created to improve the output of machine translation.

The methodology, described in Section 4, was divided into three main parts: a pilot
evaluation of Smartcheck's pre-existing rules; a root-cause analysis of the errors;
and, finally, the construction of a new test suite with more recent, corrected data.
The first step of the methodology was to evaluate the performance of the tool under
study. A pilot analysis was carried out in which each rule used by Smartcheck was
assessed with the metrics commonly applied to error detection systems: the number of
true positives (cases in which the system correctly identified an error), false
negatives (cases in which an error existed but the system did not identify it), and
false positives (cases in which the system incorrectly flagged an error). Precision,
Recall, and F1-score were then computed from these counts. The pilot evaluation showed
that not all rules could be evaluated (making it impossible to assess each rule's
individual performance), and the results for the rules that were evaluated were
unsatisfactory: they missed errors present in the translations and flagged countless
grammatically correct segments as problematic.
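The per-rule metrics follow the standard definitions; a minimal sketch, with hypothetical counts for one rule (the dissertation does not report per-rule numbers here):

```python
def prf1(tp, fp, fn):
    """Precision, Recall and F1-score from raw error-detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical counts: the rule fired 20 times, was right 8 times,
# and missed 24 real errors.
p, r, f1 = prf1(tp=8, fp=12, fn=24)
```

The guards against zero denominators matter in practice: a rule that never fires would otherwise divide by zero.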
The second stage of the methodology set out to identify possible reasons for the poor
performance of Smartcheck and its rules. We hypothesised that the rules had been
evaluated against an unsuitable, outdated test suite, which would explain the very low
metrics of the pilot evaluation. The hypothesis arose both because the corpus data
might not be representative of the translations currently produced and because the
structures that are problematic for translation systems change constantly. To test it,
the corpus was analysed against several criteria: the type of translation in the data
(whether or not the segments had been revised by post-editors before submission); the
presence of duplicated segments or segments whose source text might contain errors
(i.e. noisy data); and a review of the annotations and severities assigned to each
error according to Unbabel's typologies and guidelines (counting correctly assigned,
incorrectly assigned, and missing annotations and severities). The analysis showed
that about 20% of the data were duplicates (in both the formal and the informal
register), that 15-25% of the annotations were incorrect, and that only half of the
severities had been correctly assigned. We therefore judged it more advantageous to
build a new, representative, refined corpus than to correct every incorrect annotation
in the corpus previously used.
The third and final step of the methodology was the construction of a new test suite of
27,500 previously annotated machine translation examples. Creating it involved:
filtering a set of machine translations so that the data were representative of every
language supported by Unbabel; distinguishing context-dependent from context-independent
segments (a limitation of the previous corpus); excluding duplicated examples and cases
with problematic source texts; and, finally, a review by linguists and translators of
the assigned annotations, following proprietary typologies. This last procedure was
subdivided into: a general evaluation, to guarantee that the translations conveyed the
message of the source text coherently, fluently, and appropriately, and followed
language-specific rules; an evaluation focused on client-specific requirements, to
ensure existing guidelines were met; and a review of the severities associated with
each annotation.
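The filtering steps above can be sketched as follows; the segment fields (`source`, `target`, `noisy_source`) are assumed names for illustration, not Unbabel's actual schema:

```python
def build_test_suite(segments):
    """Keep one copy of each (source, target) pair and drop segments
    whose source text is flagged as noisy."""
    seen, suite = set(), []
    for seg in segments:
        key = (seg["source"], seg["target"])
        if key in seen:              # exclude duplicated examples
            continue
        if seg.get("noisy_source"):  # exclude problematic source texts
            continue
        seen.add(key)
        suite.append(seg)
    return suite

raw = [
    {"source": "Obrigado", "target": "Thank you", "noisy_source": False},
    {"source": "Obrigado", "target": "Thank you", "noisy_source": False},  # duplicate
    {"source": "a#@!b", "target": "a#@!b", "noisy_source": True},          # noisy source
]
suite = build_test_suite(raw)
```

Only the first segment survives; the duplicate and the noisy-source case are discarded, mirroring the corpus-construction criteria described above.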
With the methodology complete, the test suite now constituted a trustworthy dataset,
capable of evaluating machine translation systems and tools such as Smartcheck in an
objective, well-founded way. The evaluations described in Section 5 therefore used the
corpus as the standard of comparison. The first evaluation compared the results of the
pilot analysis of Smartcheck's rules with the results of re-evaluating those rules on
the new test suite, in order to reach more reliable and credible conclusions. It showed
that, contrary to the earlier findings, every rule could now be evaluated, and that the
number of cases in which Smartcheck incorrectly flagged segments as problematic had
been reduced. The next evaluation compared annotations by means of a confusion matrix
between the predictions of Smartcheck and those of the test suite, identifying which
error types were most frequent and which were most (and least) difficult for the system
to identify. Taking the test suite as the gold standard, a global evaluation of
Smartcheck counted roughly 45% false positives, 35% false negatives, and approximately
20% true positives. The true positives were split into two types: segments correctly
identified by Smartcheck as errors but incorrectly classified (about 11%), and errors
whose span and classification were both assigned correctly (around 8% of the total
number of annotations). The third and final analysis used the totals from the previous
evaluation to compute Precision, Recall, and F1-score for each supported language and
register. Precision was fairly balanced across registers on average, but Recall and
F1-score were not, with the formal register reaching higher values. We also used the
corpus to evaluate the spell checkers used by Unbabel: the spell checker then in use
obtained the lowest score, so it was replaced by the best-scoring one in order to
reduce the number of errors in the translations and thus improve their quality.
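Using the reported shares of annotations (roughly 45% false positives, 35% false negatives, 20% true positives), the global metrics can be recomputed directly:

```python
# Shares of total annotations reported in the global evaluation.
fp, fn, tp = 0.45, 0.35, 0.20

precision = tp / (tp + fp)                           # ~0.31
recall = tp / (tp + fn)                              # ~0.36
f1 = 2 * precision * recall / (precision + recall)   # ~0.33
```

Working from shares rather than raw counts is equivalent here, since the common total cancels out of every ratio.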
All of this work could be applied in several fields beyond the one initially
established, i.e. beyond the systematic evaluation of Smartcheck, demonstrating the
impact that a well-founded analysis can have on decision-making. Without a
representative, structured test suite, the evaluations performed would not be valid,
and the results obtained could easily lead to conclusions that are inappropriate, or
even harmful, to the development of the systems and tools in question.
Generative AI in the Construction Industry: A State-of-the-art Analysis
The construction industry is a vital sector of the global economy, but it
faces many productivity challenges in various processes, such as design,
planning, procurement, inspection, and maintenance. Generative artificial
intelligence (AI), which can create novel and realistic data or content, such
as text, image, video, or code, based on some input or prior knowledge, offers
innovative and disruptive solutions to address these challenges. However, there
is a gap in the literature on the current state, opportunities, and challenges
of generative AI in the construction industry. This study aims to fill this gap
by providing a state-of-the-art analysis of generative AI in construction, with
three objectives: (1) to review and categorize the existing and emerging
generative AI opportunities and challenges in the construction industry; (2) to
propose a framework for construction firms to build customized generative AI
solutions using their own data, comprising steps such as data collection,
dataset curation, training a custom large language model (LLM), model evaluation,
and deployment; and (3) to demonstrate the framework via a case study of
developing a generative model for querying contract documents. The results show
that retrieval-augmented generation (RAG) improves the baseline LLM by 5.2%,
9.4%, and 4.8% in terms of quality, relevance, and reproducibility,
respectively. This study provides academics and construction professionals with
a comprehensive analysis and practical framework to guide the adoption of
generative AI techniques to enhance productivity, quality, safety, and
sustainability across the construction industry.
Comment: 74 pages, 11 figures, 20 tables
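The RAG step of such a framework can be sketched as follows. The word-overlap retrieval and the prompt format are simplifying assumptions standing in for embedding-based retrieval and a real LLM call, not the paper's implementation:

```python
def words(s):
    """Crude tokenizer: lowercase and strip basic punctuation."""
    return set(s.lower().replace("?", " ").replace(".", " ").split())

def retrieve(query, clauses, k=2):
    """Rank contract clauses by word overlap with the query
    (a stand-in for embedding-based similarity search)."""
    return sorted(clauses,
                  key=lambda c: len(words(query) & words(c)),
                  reverse=True)[:k]

def build_prompt(query, clauses):
    """Prepend the retrieved clauses to the question, RAG-style;
    the resulting prompt would then be sent to the LLM."""
    context = "\n".join(retrieve(query, clauses))
    return f"Context:\n{context}\n\nQuestion: {query}"

clauses = [
    "Payment is due within 30 days of invoice.",
    "The contractor shall maintain liability insurance.",
    "Retention of 5% applies until practical completion.",
]
prompt = build_prompt("When is payment due?", clauses)
```

Grounding the prompt in retrieved clauses is what lets the model answer from the contract rather than from its training data, which is the mechanism behind the quality and relevance gains the study reports.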
Latent Print Examination and Human Factors: Improving the Practice Through a Systems Approach: The Report of the Expert Working Group on Human Factors in Latent Print Analysis
Fingerprints have provided a valuable method of personal identification in forensic science and criminal investigations for more than 100 years. Fingerprints left at crime scenes generally are latent prints: unintentional reproductions of the arrangement of ridges on the skin made by the transfer of materials (such as amino acids, proteins, polypeptides, and salts) to a surface. Palms and the soles of feet also have friction ridge skin that can leave latent prints. The examination of a latent print consists of a series of steps involving a comparison of the latent print to a known (or exemplar) print. Courts have accepted latent print evidence for the past century. However, several high-profile cases in the United States and abroad have highlighted the fact that human errors can occur, and litigation and expressions of concern over the evidentiary reliability of latent print examinations and other forensic identification procedures have increased in the last decade.
"Human factors" issues can arise in any experience- and judgment-based analytical process such as latent print examination. Inadequate training, extraneous knowledge about the suspects in the case or other matters, poor judgment, health problems, limitations of vision, complex technology, and stress are but a few factors that can contribute to errors. A lack of standards or quality control, poor management, insufficient resources, and substandard working conditions constitute other potentially contributing factors.
Roberto Gerhard's Sound Compositions: A Historical-Philological Perspective. Archive, Process, Intent and Re-enactment
This research advances the current state of knowledge in the field of early tape music both empirically and methodologically. The purpose of this study is to evaluate the impact that the electronic medium exerted on the musical thinking of Roberto Gerhard, one of the most outspoken, prolific and influential composers in the Spanish diaspora, whose musical legacy, for the most part unknown, is a major landmark in the early history of electroacoustic music. Gerhard's personal tape collection, one of the largest historical archives of its kind reported in the literature, is exceptional for both its antiquity (50+-year-old tapes) and its abundance of production materials. Through the digitisation and analysis of the composer's tape collection, this research argues that the empirical study of audio documents sets out a basis for a broader understanding of textual processes. More specifically, the research demonstrates that the reconstruction of works based on magnetic tape sketches is a powerful method to advance the understanding of early tape music. This research also examines Gerhard's sound compositions in relation to the post-war context in which they were composed. Finally, this research presents performance documentation that proposes an approach to the electroacoustic music repertoire in which creativity is not at odds with rigor and critical discernment, demonstrating that archival study can be closely aligned with the concept of re-enactment.
Smart Tech is all Around us: Bridging Employee Vulnerability with Organizational Active Trust-Building
Public and academic opinion remains divided regarding the benefits and pitfalls of datafication technology in organizations, particularly regarding their impact on employees. Taking a dual-process perspective on trust, we propose that datafication technology can create small, erratic surprises in the workplace that highlight employee vulnerability and increase employees' reliance on the systematic processing of trust. We argue that these surprises precipitate a phase in the employment relationship in which employees more actively weigh trust-related cues, and the employer should therefore engage in active trust management to protect and strengthen the relationship. Our paper develops a framework of symbolic and substantive strategies to guide organizations' active trust management efforts to (re-)create situational normality, root goodwill intentions, and enable a more balanced interdependence between the organization and its employees. We discuss the implications of our paper for reconciling competing narratives about the future of work and for developing an understanding of trust processes.
Principles of Security and Trust
This open access book constitutes the proceedings of the 8th International Conference on Principles of Security and Trust, POST 2019, which took place in Prague, Czech Republic, in April 2019, held as part of the European Joint Conference on Theory and Practice of Software, ETAPS 2019. The 10 papers presented in this volume were carefully reviewed and selected from 27 submissions. They deal with theoretical and foundational aspects of security and trust, including new theoretical results, practical applications of existing foundational ideas, and innovative approaches stimulated by pressing practical problems.
Proceedings of the International Workshop on EuroPLOT Persuasive Technology for Learning, Education and Teaching (IWEPLET 2013)
This book contains the proceedings of the International Workshop on EuroPLOT Persuasive Technology for Learning, Education and Teaching (IWEPLET 2013), which was held on 16-17 September 2013 in Paphos (Cyprus) in conjunction with the EC-TEL conference. The workshop, and hence the proceedings, are divided in two parts: on Day 1 the EuroPLOT project and its results are introduced, with papers about the specific case studies and their evaluation. On Day 2, peer-reviewed papers are presented which address specific topics and issues going beyond the EuroPLOT scope. This workshop is one of the deliverables (D 2.6) of the EuroPLOT project, which was funded from November 2010 to October 2013 by the Education, Audiovisual and Culture Executive Agency (EACEA) of the European Commission through the Lifelong Learning Programme (LLP) by grant #511633. The purpose of this project was to develop and evaluate Persuasive Learning Objects and Technologies (PLOTS), based on ideas of BJ Fogg. The purpose of this workshop is to summarize the findings obtained during this project and disseminate them to an interested audience. Furthermore, it shall foster discussions about the future of persuasive technology and design in the context of learning, education and teaching. The international community working in this area of research is relatively small. Nevertheless, we have received a number of high-quality submissions which went through a peer-review process before being selected for presentation and publication. We hope that the information found in this book is useful to the reader and that more interest in this novel approach of persuasive design for teaching/education/learning is stimulated. We are very grateful to the organisers of EC-TEL 2013 for allowing us to host IWEPLET 2013 within their organisational facilities, which helped us a lot in preparing this event.
I am also very grateful to everyone in the EuroPLOT team for collaborating so effectively in these three years towards creating excellent outputs, and for being such a nice group with a very positive spirit also beyond work. And finally I would like to thank the EACEA for providing the financial resources for the EuroPLOT project and for being very helpful when needed. This funding made it possible to organise the IWEPLET workshop without charging a fee from the participants.
Music Composed For Calm And Catharsis Using A Compositional Toolkit For Emotional Evocation - Inspired By And Directed Towards Healthcare Contexts And Self-Managed Wellness
Emotional experience through music listening is universal. In the age of COVID-19 and an ever more mentally burdened population, music that encourages calm and/or catharsis is more relevant than ever (Gallagher et al., 2020).
As composers, can we form a framework for creating music that pointedly evokes an intentional emotion?
This dissertation seeks to build on the solid foundation of past theories from music and emotion researchers, and to demonstrate how to further utilise the power that music has both in our everyday lives and in healthcare settings, providing a large suite of music for use for calm and catharsis, and a Compositional Toolbox for Emotional Evocation that composers might use to effect positive emotional change.
In two pilot studies, one with children and one with adults, this dissertation tests music written using the Toolbox to observe its effect on arousal and pleasure.
The studies also utilise visuals as a secondary means of sensory control, and investigate whether the multisensory application of music and visuals enhances emotional evocation relative to either in isolation.
Participants rated, on a Likert-type scale, how they thought each sample would make someone feel, or how it made them feel. An analysis of pieces from these studies is included in this dissertation. Mixed-method, deductive, and thematic analysis was applied to the data, which were collected via surveys and interviews.
It was found that music using the Toolbox was more emotionally evocative, more calming, and happier overall than music written without it. Most of the pieces achieved their emotional aims, and positive correlations emerged between the combined use of music and visuals. In one of the studies, music without visuals appeared calmer than music with visuals. This dissertation thus begins to promote the Compositional Toolbox for Emotional Evocation as a framework for emotional composition.
- âŠ