86 research outputs found
PhoSim-NIRCam: Photon-by-photon image simulations of the James Webb Space Telescope's Near-Infrared Camera
Recent instrumentation projects have allocated resources to develop codes for
simulating astronomical images. Novel physics-based models are essential for
understanding telescope, instrument, and environmental systematics in
observations. A deep understanding of these systematics is especially important
in the context of weak gravitational lensing, galaxy morphology, and other
sensitive measurements. In this work, we present an adaptation of a
physics-based ab initio image simulator: The Photon Simulator (PhoSim). We
modify PhoSim for use with the Near-Infrared Camera (NIRCam) -- the primary
imaging instrument aboard the James Webb Space Telescope (JWST). This photon
Monte Carlo code replicates the observational catalog, telescope and camera
optics, detector physics, and readout modes/electronics. Importantly,
PhoSim-NIRCam simulates both geometric aberration and diffraction across the
field of view. Full field- and wavelength-dependent point spread functions are
presented, together with simulated images of an extragalactic field. Extensive
validation is planned during in-orbit commissioning.
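The photon-by-photon approach can be illustrated with a minimal Monte Carlo sketch. This is not PhoSim code: the Gaussian aberration width, the Gaussian stand-in for the diffraction kernel, and all detector parameters are arbitrary illustrative values. Each photon from a point source receives an aberration offset and a diffraction offset, and is then binned onto a detector grid, building up the PSF.

```python
import math
import random

def simulate_psf(n_photons, grid=32, pixel=0.1, aberr_sigma=0.15,
                 diff_scale=0.1, seed=0):
    """Accumulate photons into a 2-D histogram approximating a PSF.

    Each photon's offset is the sum of a Gaussian 'aberration' draw and a
    Gaussian stand-in for the diffraction kernel (both widths are arbitrary
    illustrative values, not NIRCam optics).
    """
    rng = random.Random(seed)
    img = [[0] * grid for _ in range(grid)]
    half = grid * pixel / 2.0
    for _ in range(n_photons):
        x = rng.gauss(0.0, aberr_sigma) + rng.gauss(0.0, diff_scale)
        y = rng.gauss(0.0, aberr_sigma) + rng.gauss(0.0, diff_scale)
        i = math.floor((x + half) / pixel)   # detector column
        j = math.floor((y + half) / pixel)   # detector row
        if 0 <= i < grid and 0 <= j < grid:
            img[j][i] += 1                   # bin the photon
    return img

img = simulate_psf(20000)
total = sum(map(sum, img))
```

A real photon Monte Carlo such as PhoSim draws each photon's wavelength from the source spectrum and propagates it through full optics and detector models; the two Gaussian draws here merely stand in for that chain.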
Evaluation of Bulk Charging in Geostationary Transfer Orbit and Earth Escape Trajectories Using the NUMIT 1-D Charging Model
The NUMIT 1-dimensional bulk charging model is used as a screening tool for evaluating time-dependent bulk (internal or deep dielectric) charging of dielectrics exposed to penetrating electron environments. The code is modified to accept time-dependent electron flux time series along satellite orbits as the electron environment inputs, instead of the static electron flux environment input originally used by the code and widely adopted in bulk charging models. Application of the screening technique is demonstrated for three cases of spacecraft exposure within the Earth's radiation belts, including a geostationary transfer orbit and an Earth-Moon transit trajectory for a range of orbit inclinations. Electric fields and charge densities are computed for dielectric materials with varying electrical properties exposed to relativistic electron environments along the orbits. Our objective is to demonstrate a preliminary application of the time-dependent environment inputs to the NUMIT code for evaluating charging risks to exposed dielectrics used on spacecraft when exposed to the Earth's radiation belts. The results demonstrate that the NUMIT electric field values in GTO orbits with multiple encounters with the Earth's radiation belts are consistent with previous studies of charging in GTO orbits, and that potential threat conditions for electrostatic discharge exist on lunar transit trajectories depending on the electrical properties of the materials exposed to the radiation environment.
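The screening idea can be sketched with a far simpler model than NUMIT itself: treat the dielectric as a single layer in which the internal field E obeys ε dE/dt = J(t) − σE, where J(t) is the time-varying deposited electron current density and σ is the material conductivity. All numbers below are illustrative placeholders, not flight-environment values.

```python
def charge_field(j_series, dt, sigma, eps):
    """Integrate eps*dE/dt = J(t) - sigma*E with forward Euler.

    j_series : deposited current density samples [A/m^2]
    dt       : time step [s]
    sigma    : dielectric conductivity [S/m]
    eps      : permittivity [F/m]
    Returns the electric-field time series [V/m].
    """
    e = 0.0
    fields = []
    for j in j_series:
        e += dt * (j - sigma * e) / eps
        fields.append(e)
    return fields

# Illustrative: a radiation-belt passage as a rectangular pulse of current.
flux = [1e-9 if 100 <= t < 400 else 0.0 for t in range(1000)]
fields = charge_field(flux, dt=10.0, sigma=1e-16, eps=2.0 * 8.854e-12)
```

The rectangular pulse stands in for a belt encounter; a real application would feed in flux time series computed along the orbit, as the abstract describes, and the field would decay slowly after the passage at the material's charge relaxation time ε/σ.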
Full proof cryptography: verifiable compilation of efficient zero-knowledge protocols
Developers building cryptography into security-sensitive applications face a daunting task. Not only must they understand the security guarantees delivered by the constructions they choose, they must also implement and combine them correctly and efficiently. Cryptographic compilers free developers from having to implement cryptography on their own by turning high-level specifications of security goals into efficient implementations. Yet, trusting such tools is risky as they rely on complex mathematical machinery and claim security properties that are subtle and difficult to verify.
In this paper, we present ZKCrypt, an optimizing cryptographic compiler that achieves an unprecedented level of assurance without sacrificing practicality for a comprehensive class of cryptographic protocols, known as Zero-Knowledge Proofs of Knowledge. The pipeline of ZKCrypt tightly integrates purpose-built verified compilers and verifying compilers producing formal proofs in the CertiCrypt framework. By combining the guarantees delivered by each stage in the pipeline, ZKCrypt provides assurance that the implementation it outputs securely realizes the high-level proof goal given as input. We report on the main characteristics of ZKCrypt, highlight new definitions and concepts at its foundations, and illustrate its applicability through a representative example of an anonymous credential system.
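Zero-Knowledge Proofs of Knowledge of the kind such compilers target are built from Σ-protocols such as Schnorr's proof of knowledge of a discrete logarithm. A toy, non-verified rendering of one Schnorr round, with deliberately tiny and insecure group parameters, shows the commit-challenge-response shape these pipelines start from:

```python
import random

# Toy group parameters (far too small for real security; illustration only).
P = 2267                      # prime modulus
Q = 103                       # prime order of the subgroup, q | p-1
G = pow(2, (P - 1) // Q, P)   # generator of the order-q subgroup

def schnorr_round(x, rng):
    """One run of Schnorr's Sigma-protocol proving knowledge of x in h = g^x."""
    h = pow(G, x, P)
    r = rng.randrange(Q)          # prover's commitment randomness
    a = pow(G, r, P)              # commitment sent to the verifier
    c = rng.randrange(Q)          # verifier's random challenge
    s = (r + c * x) % Q           # prover's response
    # Verifier accepts iff g^s == a * h^c (mod p)
    return pow(G, s, P) == (a * pow(h, c, P)) % P

rng = random.Random(1)
ok = all(schnorr_round(x=42, rng=rng) for _ in range(10))
# ok == True: an honest prover always convinces the verifier
```

A compiler in this space takes a high-level goal ("prove knowledge of x such that h = g^x") and emits optimized prover and verifier code for protocols of this shape, which is exactly where machine-checked correctness proofs pay off.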
Tackling Version Management and Reproducibility in MLOps
The growing adoption of machine learning solutions requires advancements in applying best practices to maintain artificial intelligence systems in production. Machine Learning Operations (MLOps) incorporates DevOps principles into machine learning development, promoting automation, continuous delivery, monitoring, and training capabilities. Due to multiple factors, such as the experimental nature of the machine learning process or the need for model optimizations derived from changes in business needs, data scientists are expected to create multiple experiments to develop a model or predictor that satisfactorily addresses the main challenges of a given problem.
Since the re-evaluation of models is a constant need, metadata is constantly produced due to multiple experiment runs. This metadata is known as ML artifacts or assets. The proper lineage between these artifacts enables environment recreation, facilitating model reproducibility. Linking information from experiments, models, datasets, configurations, and code changes requires proper organization, tracking, maintenance, and version control of these artifacts.
This work will investigate the best practices, current issues, and open challenges related
to artifact versioning and management and apply this knowledge to develop an ML workflow that supports ML engineering and operationalization, applying MLOps principles that facilitate model reproducibility. Scenarios covering data preparation, model generation, comparison between model versions, deployment, monitoring, debugging, and retraining demonstrated how the selected frameworks and tools could be integrated to achieve that goal.
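One core mechanism behind the artifact lineage described above, which versioning tools implement in more elaborate forms, is content addressing: hash each artifact (dataset, config, model) and record the hashes of its inputs as lineage metadata. A minimal sketch, with record fields invented for illustration rather than taken from any specific tool:

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Content-address an artifact by the SHA-256 of its bytes."""
    return hashlib.sha256(data).hexdigest()

def record_lineage(name, data, parents):
    """Build a lineage record linking an artifact to the hashes of its inputs."""
    return {
        "name": name,
        "hash": content_hash(data),
        "parents": sorted(parents),  # hashes of upstream artifacts
    }

raw = b"feature,label\n1.0,0\n2.0,1\n"
dataset = record_lineage("dataset-v1", raw, parents=[])
model_bytes = b"model-weights-placeholder"
model = record_lineage("model-v1", model_bytes, parents=[dataset["hash"]])
registry = {rec["hash"]: rec for rec in (dataset, model)}

# Reproducibility check: walking parents from the model recovers the dataset.
assert registry[model["parents"][0]]["name"] == "dataset-v1"
```

Because the hash changes whenever the bytes change, a recorded lineage pins exactly which dataset and configuration produced a model version, which is what makes environment recreation possible.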
DevOps: Concepts, Practices, Tools, Benefits and Challenges
DevOps, which originated in the context of agile software development, seems an appropriate approach for enabling the continuous delivery and deployment of working software in small releases. Organizations are taking significant interest in adopting DevOps ways of working. The interest is there; the challenge, however, is how to adopt DevOps effectively in practice. Before embarking on the DevOps journey, there is a need to clearly understand the DevOps concepts, practices, tools, benefits and underlying challenges. Thus, in order to address the research question at hand, this paper adopts a Systematic Literature Review (SLR) approach to identify, review and synthesize the relevant studies published in the public domain between 2010 and 2016. The SLR approach initially identified a set of 450 papers. Finally, 30 of the 450 papers were selected and reviewed to identify eight key DevOps concepts, twenty practices, and twelve categories of tools. The research also identified seventeen benefits of using the DevOps approach for application development and four known challenges. The results of this review will serve as a knowledge base for researchers and practitioners, which can be used to effectively understand and establish an integrated DevOps capability in the local context.
Capturing ergonomics requirements in the global automotive industry
This thesis examines the issues surrounding the collection and dissemination of
customer ergonomics requirements in the automotive industry. The aim of the
research is to develop a Toolset of methods, known as the Lifestyle Scenario
Toolset, for gathering customer requirements in overseas markets, and for
presenting the information collected to design teams, taking a user-centred design
approach. The Toolset was developed and evaluated with the co-operation of
employees from a major UK automotive company.
Four studies were conducted. The first comprised a series of interviews to establish
the needs of both the data gatherers and data users for a Toolset of methods to
collect and communicate overseas customer information. The data gatherers were
drawn from the company's Market Researchers, Ergonomists and people
responsible for the company's overseas operations. The data users were the design
team responsible for the development of the company's next generation 4X4
vehicle. Results showed that the data collection tools which formed part I of the
Toolset should be quick to use, require no ergonomics expertise to implement and
be cost effective to use. The interviews with data users identified the need for
tools which could communicate customer ergonomics requirements to them in a
way which fitted in with their current working practices. In addition the tools
needed to communicate information in language which was familiar to the design
team, and be visually based where possible.
The second study explored the development of suitable data collection tools for
inclusion in the Lifestyle Scenario Toolset. Building on the needs identified in the
first study, together with information from the current literature, a number of data
collection tools were developed for inclusion in part I of the Lifestyle Scenario
Toolset. These tools were a questionnaire, driving diary and photographs, focus
group, ergonomics audit and background information tool. The tools were
designed to collect a range of different data types, e.g. qualitative, quantitative, pictorial and customer verbatims, to provide a rich picture of users and their
activities. The tools were used in a field trial to collect data from overseas
customers about their ergonomics requirements and the tasks they carried out
using their vehicle, in the context of their lifestyle.
The third study focused on the development of a set of tools to communicate the
data collected in part I of the Toolset, to the design team who would use it in
their work. The data communication tools were developed to provide information
to design teams at a number of levels, enabling them to use the data at an
appropriate level for their needs. High level summaries of each of the tools were
developed and scenarios presented on storyboards were used to integrate
information from all of the data collection tools to provide detailed information
about customers' ergonomics requirements and lifestyle. The data communication
tools also used a variety of data types and presentation mediums, such as pictures,
graphs and customer quotes to increase the richness of the data presented.
The fourth study involved the evaluation of the suitability of the Toolset for
collecting and communicating overseas customer ergonomics requirements. The
data gatherers, and data users (design team) carried out a field trial using the
Toolset to establish its usefulness to them in their work. The results of the
evaluation showed that the data gatherers found the Toolset easy to implement
and were able to use it to pick up overseas customers' ergonomics requirements.
The communication tools were able to provide the design team with new and
useful customer ergonomics information, in a range of formats which they felt
comfortable using in their work. The implementation of a user-centred design
approach to the development of methods for collecting and communicating
overseas customer ergonomics requirements enabled the creation of a Toolset
which met the needs of the people who will use it. This increased its acceptance by
people in the company and thus the likelihood of the Lifestyle Scenario Toolset's
continued use within the company.
The HERMIT in the machine: a plugin for the interactive transformation of GHC core language programs
The importance of reasoning about and refactoring programs is a central tenet of functional programming. Yet our compilers and development toolchains only provide rudimentary support for these tasks. This paper introduces a programmatic and compiler-centric interface that facilitates refactoring and equational reasoning. To develop our ideas, we have implemented HERMIT, a toolkit enabling informal but systematic transformation of Haskell programs from inside the Glasgow Haskell Compiler’s optimization pipeline. With HERMIT, users can experiment with optimizations and equational reasoning, while the tedious heavy lifting of performing the actual transformations is done for them. HERMIT provides a transformation API that can be used to build higher-level rewrite tools. One use-case is prototyping new optimizations as clients of this API before being committed to the GHC toolchain. We describe a HERMIT application - a read-eval-print shell for performing transformations using HERMIT. We also demonstrate using this shell to prototype an optimization on a specific example, and report our initial experiences and remaining challenges.
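The idea of a transformation API on which higher-level rewrite tools are built can be sketched independently of GHC. HERMIT itself is a Haskell plugin operating on GHC Core; the Python toy below only illustrates the shape of composable rewrites over a syntax tree, here nested tuples standing in for expressions:

```python
# A toy rewrite API in the spirit of composable program transformations
# (HERMIT operates on GHC Core in Haskell; this sketch is illustrative).

def rewrite_bottom_up(rule, expr):
    """Apply `rule` at every node of a nested-tuple expression, leaves first."""
    if isinstance(expr, tuple):
        expr = tuple(rewrite_bottom_up(rule, e) for e in expr)
    result = rule(expr)
    return expr if result is None else result

def add_zero_rule(expr):
    """Equational rule: (+ e 0) -> e and (+ 0 e) -> e; None means no match."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "+":
        if expr[1] == 0:
            return expr[2]
        if expr[2] == 0:
            return expr[1]
    return None

prog = ("*", ("+", "x", 0), ("+", 0, "y"))
simplified = rewrite_bottom_up(add_zero_rule, prog)
# simplified == ("*", "x", "y")
```

A shell like HERMIT's read-eval-print loop is then a thin layer that lets the user pick which rule to apply at which node interactively, while the API does the traversal and rewriting.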
The Search for Metabolic Variants in Response to Climate Change in the American Pika
Climate change and rising temperatures pose a serious threat to the long-term survival of the American pika (Ochotona princeps), emphasizing interest in the adaptive capability of the species. This project queried single nucleotide polymorphisms in a population of American pika in Yosemite National Park using Whole Genome Sequencing data, with a specific interest in metabolic variants. The sample data included temporally separated cohorts, comparing modern population data to historical data taken before rapid anthropogenic climate change. Statistically significant variants were identified under Approximate Bayesian Computation using a population decline model. Although population statistics indicated little change between the temporal cohorts, five intergenic SNPs were identified located about 20,000 base pairs upstream of DECR1, a gene that plays a key role in the metabolism of polyunsaturated fatty acids. Further work is needed to investigate any link between these SNPs and DECR1.
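Approximate Bayesian Computation of the kind used here can be sketched as rejection sampling: draw a parameter from the prior, simulate data under the model, and keep the draw if a summary statistic lands close to the observed value. Everything below is illustrative (a binomial "allele frequency" toy, not the pika population decline model):

```python
import random

def abc_rejection(observed_mean, n_samples, prior, simulate, tol, rng):
    """Rejection-sampling ABC: keep prior draws whose simulated summary
    statistic falls within `tol` of the observed one."""
    accepted = []
    for _ in range(n_samples):
        theta = prior(rng)
        if abs(simulate(theta, rng) - observed_mean) < tol:
            accepted.append(theta)
    return accepted

rng = random.Random(7)
prior = lambda r: r.uniform(0.0, 1.0)                # flat prior on a frequency
simulate = lambda p, r: sum(r.random() < p for _ in range(200)) / 200.0
posterior = abc_rejection(0.3, 2000, prior, simulate, tol=0.02, rng=rng)
# Accepted draws concentrate near the "observed" frequency of 0.3.
```

In a real analysis the simulator would be a demographic model (here, population decline) and the summary statistics would be genetic, but the accept/reject skeleton is the same.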
Reconstruction and classification of unknown DNA sequences
The continuous advances in DNA sequencing technologies and techniques
in metagenomics require reliable reconstruction and accurate classification
methodologies for the diversity increase of the natural repository while contributing
to the organisms' description and organization. However, after
sequencing and de-novo assembly, one of the most complex challenges
comes from the DNA sequences that do not match or resemble any biological
sequence from the literature. Three main reasons contribute to this
exception: the organism sequence is highly divergent from the
known organisms from the literature, an irregularity has been created in the
reconstruction process, or a new organism has been sequenced. The inability
to efficiently classify these unknown sequences increases the sample
constitution's uncertainty and becomes a wasted opportunity to discover
new species since they are often discarded.
In this context, the main objective of this thesis is the development and
validation of a tool that provides an efficient computational solution to
solve these three challenges based on an ensemble of experts, namely
compression-based predictors, the distribution of sequence content, and
normalized sequence lengths. The method uses both DNA and amino acid
sequences and provides efficient classification beyond standard referential
comparisons. Unusually, it classifies DNA sequences without resorting directly
to the reference genomes but rather to features that the species biological
sequences share. Specifically, it only makes use of features extracted
individually from each genome without using sequence comparisons.
RFSC was then created as a machine learning classification pipeline that
relies on an ensemble of experts to provide efficient classification in metagenomic
contexts. This pipeline was tested in synthetic and real data, both
achieving precise and accurate results that, at the time of the development
of this thesis, have not been reported in the state-of-the-art. Specifically, it
has achieved an accuracy of approximately 97% in domain/type classification. In
addition, the pipeline is fully automatic and enables reference-free reconstruction
of genomes from FASTQ reads, with the additional guarantee of secure storage of
sensitive information.
Mestrado em Engenharia de Computadores e Telemática
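The compression-based experts in such an ensemble can be illustrated with the normalized compression distance (NCD): sequences from the same class compress better together, so a 1-nearest-neighbour rule over NCD gives a reference-free classifier. A minimal sketch using zlib (real tools use specialized DNA compressors, and the reference sequences here are synthetic):

```python
import zlib

def c(data: bytes) -> int:
    """Compressed size, a practical stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two sequences."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def classify(query: bytes, references: dict) -> str:
    """1-NN classification: the label of the reference with the smallest NCD."""
    return min(references, key=lambda label: ncd(query, references[label]))

refs = {
    "AT-rich": b"ATATTAATATATTTAATATATATTAAATATTA" * 8,
    "GC-rich": b"GCGGCCGCGCGGGCCGCGCGCCGGGCGCGGCC" * 8,
}
# The query is a lightly mutated copy of the AT-rich unit.
query = b"ATATTAATATATTTAATATATATTAAATATAA" * 8
label = classify(query, refs)
# label == "AT-rich"
```

Because the distance is computed from compressed sizes alone, no alignment against reference genomes is needed, which is the property that makes compression-based predictors useful for sequences with no close match in the literature.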
Resource Management in Grids: Overview and a discussion of a possible approach for an Agent-Based Middleware
Resource management and job scheduling are important research issues in computational grids. When software agents are used as resource managers and brokers in the Grid, a number of additional issues and possible approaches materialize. The aim of this chapter is twofold. First, we discuss traditional job scheduling in grids, and the case when agents are utilized as grid middleware. Second, we use this as a context for a discussion of how job scheduling can be done in the agent-based system under development.
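The job-scheduling problem discussed here can be made concrete with a baseline heuristic that any broker, agent-based or not, can be compared against: list scheduling, which assigns each arriving job to the machine that becomes free earliest. A hypothetical minimal sketch:

```python
import heapq

def greedy_schedule(jobs, n_machines):
    """Assign each job (in the given order) to the machine that becomes
    free earliest -- the classic list-scheduling baseline.

    jobs : list of job durations
    Returns (makespan, assignment list of machine indices per job).
    """
    heap = [(0.0, m) for m in range(n_machines)]  # (available-at, machine)
    heapq.heapify(heap)
    assignment = []
    for duration in jobs:
        free_at, machine = heapq.heappop(heap)
        assignment.append(machine)
        heapq.heappush(heap, (free_at + duration, machine))
    return max(t for t, _ in heap), assignment

makespan, assign = greedy_schedule([4, 2, 7, 1, 3], n_machines=2)
# m0 gets 4; m1 gets 2; m1 (free at 2) gets 7; m0 (free at 4) gets 1;
# m0 (free at 5) gets 3 -> makespan 9, assign [0, 1, 1, 0, 0].
```

An agent-based middleware replaces this centralized loop with negotiation between job and resource agents, but the quantity being optimized (makespan, or a richer utility) is the same.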