70 research outputs found
Dynamic Parameter Allocation in Parameter Servers
To keep up with increasing dataset sizes and model complexity, distributed
training has become a necessity for large machine learning tasks. Parameter
servers ease the implementation of distributed parameter management---a key
concern in distributed training---but can induce severe communication
overhead. To reduce communication overhead, distributed machine learning
algorithms use techniques to increase parameter access locality (PAL),
achieving up to linear speed-ups. We found, however, that existing parameter
servers provide only limited support for PAL techniques and therefore prevent
efficient training. In this paper, we explore whether and to what extent PAL
techniques can be supported, and whether such support is beneficial. We propose
to integrate dynamic parameter allocation into parameter servers, describe an
efficient implementation of such a parameter server called Lapse, and
experimentally compare its performance to existing parameter servers across a
number of machine learning tasks. We found that Lapse provides near-linear
scaling and can be orders of magnitude faster than existing parameter servers.
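The core idea can be illustrated with a toy, single-process sketch. The class and method names below (`ToyDynamicPS`, `pull`, `push`) are hypothetical, not Lapse's actual API: ownership of a parameter relocates to the worker that accesses it, so repeated accesses become local and avoid further (simulated) network hops.

```python
# Toy sketch of dynamic parameter allocation (illustrative, not Lapse's
# real interface): the parameter moves to the worker that touches it.
class ToyDynamicPS:
    def __init__(self):
        self.owner = {}           # key -> worker id currently holding it
        self.values = {}
        self.remote_accesses = 0  # simulated network hops

    def pull(self, worker, key, default=0.0):
        if self.owner.get(key) != worker:
            self.remote_accesses += 1  # remote hop, then relocate
            self.owner[key] = worker
        return self.values.get(key, default)

    def push(self, worker, key, delta):
        if self.owner.get(key) != worker:
            self.remote_accesses += 1
            self.owner[key] = worker
        self.values[key] = self.values.get(key, 0.0) + delta

ps = ToyDynamicPS()
ps.push(0, "w[3]", 1.0)        # first access: remote, parameter relocates
for _ in range(5):
    ps.push(0, "w[3]", 1.0)    # now local: no further remote hops
print(ps.values["w[3]"], ps.remote_accesses)  # 6.0 1
```

With a PAL-friendly access pattern, only the first access to each parameter pays a network cost; a static allocation would pay it on every access from a non-owning worker.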
The DESQ framework for declarative and scalable frequent sequence mining
DESQ is a general-purpose framework for declarative and scalable frequent sequence
mining. Applications express their specific sequence mining tasks using a simple yet powerful
pattern expression language, and DESQ's computation engine automatically executes the mining task
in an efficient and scalable way. In this paper, we give a brief overview of DESQ and its components.
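As a toy illustration of constraint-based sequence mining (the function and its simple gap constraint are illustrative simplifications, not DESQ's actual pattern expression language), the sketch below mines item pairs that occur within a bounded gap and meet a minimum support:

```python
from collections import Counter

# Toy frequent-pair mining under a gap constraint (not DESQ syntax):
# a pair (a, b) matches if b follows a with at most `max_gap` items
# in between; support counts each pair once per input sequence.
def mine_pairs(db, max_gap=1, min_support=2):
    support = Counter()
    for seq in db:
        matches = set()
        for i, a in enumerate(seq):
            # positions i+1 .. i+1+max_gap are within the allowed gap
            for j in range(i + 1, min(i + 2 + max_gap, len(seq))):
                matches.add((a, seq[j]))
        support.update(matches)        # once per sequence
    return {p: s for p, s in support.items() if s >= min_support}

db = [["a", "x", "b"], ["a", "b", "c"], ["b", "a", "c"]]
frequent = mine_pairs(db, max_gap=1, min_support=2)
```

A declarative engine like DESQ generalizes this idea: the constraint is stated as a pattern expression, and the engine derives an efficient matching and counting strategy.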
Scalable frequent sequence mining with flexible subsequence constraints
We study scalable algorithms for frequent sequence mining under flexible subsequence constraints. Such constraints enable applications to specify concisely which patterns are of interest and which are not. We focus on the bulk synchronous parallel model with one round of communication; this model is suitable for platforms such as MapReduce or Spark. We derive a general framework for frequent sequence mining under this model and propose the D-SEQ and D-CAND algorithms within this framework. The algorithms differ in what data are communicated and how computation is split up among workers. To the best of our knowledge, D-SEQ and D-CAND are the first scalable algorithms for frequent sequence mining with flexible constraints. We conducted an experimental study on multiple real-world datasets, which suggests that our algorithms scale nearly linearly, outperform common baselines, and incur acceptable overhead, given their generality, over existing, less general mining algorithms.
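The one-round bulk-synchronous pattern can be sketched for the simplest case, counting single items (the routing rule and names below are illustrative, not the actual D-SEQ or D-CAND design): each sequence is communicated once to the workers that own its items, after which all counting is local.

```python
from collections import Counter

# Hedged sketch of one-round BSP mining via item partitioning
# (illustrative, not the real D-SEQ/D-CAND): workers own hash
# partitions of the items; sequences are routed once, then each
# worker counts supports locally with no further communication.
def bsp_item_frequencies(db, num_workers=2, min_support=2):
    # Communication round: route each sequence to every worker that
    # owns at least one of its distinct items.
    inboxes = [[] for _ in range(num_workers)]
    for seq in db:
        for w in {hash(item) % num_workers for item in seq}:
            inboxes[w].append(seq)
    # Local phase: each worker counts support for the items it owns.
    result = {}
    for w, inbox in enumerate(inboxes):
        support = Counter()
        for seq in inbox:
            support.update({it for it in seq
                            if hash(it) % num_workers == w})
        result.update({it: s for it, s in support.items()
                       if s >= min_support})
    return result

db = [["a", "b"], ["a", "c"], ["a", "b"]]
freq = bsp_item_frequencies(db, num_workers=2, min_support=2)
```

The trade-off the abstract alludes to shows up even here: D-SEQ-style approaches ship (parts of) the sequences, while D-CAND-style approaches ship candidate patterns instead, splitting the work between workers differently.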
Good Intentions: Adaptive Parameter Management via Intent Signaling
Parameter management is essential for distributed training of large machine
learning (ML) tasks. Some ML tasks are hard to distribute because common
approaches to parameter management can be highly inefficient. Advanced
parameter management approaches -- such as selective replication or dynamic
parameter allocation -- can improve efficiency, but to do so, they typically
need to be integrated manually into each task's implementation and they require
expensive upfront experimentation to tune correctly. In this work, we explore
whether these two problems can be avoided. We first propose a novel intent
signaling mechanism that integrates naturally into existing ML stacks and
provides the parameter manager with crucial information about parameter
accesses. We then describe AdaPM, a fully adaptive, zero-tuning parameter
manager based on this mechanism. In contrast to prior systems, this approach
separates providing information (simple, done by the task) from exploiting it
effectively (hard, done automatically by AdaPM). In our experimental
evaluation, AdaPM matched or outperformed state-of-the-art parameter managers
out of the box, suggesting that automatic parameter management is possible.
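The separation of concerns can be illustrated with a minimal sketch (the `signal_intent`/`pull` API below is hypothetical, not AdaPM's real interface): the task only declares which parameters an upcoming step will access, and the manager decides how to exploit that information, here by prefetching into a local cache.

```python
# Illustrative intent-signaling sketch (hypothetical API, not AdaPM):
# the task states *what* it will access; the manager chooses the
# mechanism (prefetching here; replication/relocation in a real system).
class ToyIntentPM:
    def __init__(self, store):
        self.store = store          # authoritative parameter values
        self.cache = {}
        self.cache_misses = 0       # would be remote accesses

    def signal_intent(self, keys):
        for k in keys:              # act on the intent: prefetch
            self.cache[k] = self.store[k]

    def pull(self, key):
        if key not in self.cache:
            self.cache_misses += 1
            self.cache[key] = self.store[key]
        return self.cache[key]

store = {"emb[7]": 0.5, "emb[9]": -0.1}
pm = ToyIntentPM(store)
pm.signal_intent(["emb[7]", "emb[9]"])   # declare upcoming accesses
vals = [pm.pull("emb[7]"), pm.pull("emb[9]")]
print(pm.cache_misses)  # 0: the manager prefetched before the accesses
```

The point of the design is visible even in this toy: signaling intent is trivial for the task, while deciding when and how to act on it is left entirely to the manager.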
NuPS: A Parameter Server for Machine Learning with Non-Uniform Parameter Access
Parameter servers (PSs) facilitate the implementation of distributed training
for large machine learning tasks. In this paper, we argue that existing PSs are
inefficient for tasks that exhibit non-uniform parameter access; their
performance may even fall behind that of single node baselines. We identify two
major sources of such non-uniform access: skew and sampling. Existing PSs are
ill-suited for managing skew because they uniformly apply the same parameter
management technique to all parameters. They are inefficient for sampling
because the PS is oblivious to the associated randomized accesses and cannot
exploit locality. To overcome these performance limitations, we introduce NuPS,
a novel PS architecture that (i) integrates multiple management techniques and
employs a suitable technique for each parameter and (ii) supports sampling
directly via suitable sampling primitives and sampling schemes that allow for a
controlled quality--efficiency trade-off. In our experimental study, NuPS
outperformed existing PSs by up to one order of magnitude and provided up to
linear scalability across multiple machine learning tasks.
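The controlled quality--efficiency trade-off for sampling can be sketched as follows (the scheme and its `local_frac` knob are hypothetical illustrations, not NuPS's actual primitives): with some probability we sample from keys already on this node (cheap but slightly biased), otherwise uniformly from all keys (exact but remote).

```python
import random

# Hypothetical sampling primitive with a quality--efficiency knob
# (not NuPS's real API): local draws avoid network access at the
# cost of bias toward locally available keys.
def sample_key(all_keys, local_keys, local_frac, rng):
    if local_keys and rng.random() < local_frac:
        return rng.choice(local_keys), "local"
    return rng.choice(all_keys), "remote"

rng = random.Random(0)
all_keys = list(range(100))
local_keys = list(range(10))
draws = [sample_key(all_keys, local_keys, local_frac=0.8, rng=rng)
         for _ in range(1000)]
local_share = sum(1 for _, where in draws if where == "local") / len(draws)
# local_share is close to 0.8: most draws avoid a remote access
```

Turning `local_frac` up trades sampling quality for efficiency; turning it down recovers exact uniform sampling at full communication cost.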
Discussing environmental education in everyday school life: developing school projects in initial and continuing teacher education
This study examined how Environmental Education (EE) is being addressed in primary education at a state school in the municipality of Tangará da Serra/MT, Brazil, and how the school's teachers understand and incorporate EE into everyday school life. To this end, we interviewed the teachers who take part in an interdisciplinary EE project at the school. We found that the school's project has not been achieving its stated goals, owing to the teachers' unfamiliarity with the project, deficient teacher training, a failure to understand EE as a teaching-learning process, a lack of teaching resources, and inadequate planning of activities. Based on these findings, we discuss the impossibility of addressing the topic outside interdisciplinary work and, above all, the importance of a deeper study of EE, linking theory and practice, both in teacher education and in school projects, in order to move beyond the traditional association of "EE with ecology, waste, and school gardens".
German Society for Clinical Chemistry and Laboratory Medicine – areas of expertise: Division reports from the German Congress of Laboratory Medicine 2022 in Mannheim, 13–14 October 2022
The programme of the German Congress for Laboratory Medicine 2022 was essentially designed by the divisions of the German Society for Clinical Chemistry and Laboratory Medicine (DGKL). Almost all chairpersons of the divisions organised a 90-minute symposium on current topics, i.e. they conceptualised the symposia and invited the speakers. For this article, all chairpersons summarised the lectures given within their symposia. The DGKL's work is structured into five areas of expertise: Molecular Diagnostics, Learning & Teaching, Quality & Management, Laboratory & Diagnostics, and Biobanks & Informatics. The areas of expertise are in turn subdivided into divisions. Information on the history of the establishment of this new structure within the DGKL can be found in the editorial of this issue.