Improving Reproducible Deep Learning Workflows with DeepDIVA
The field of deep learning is experiencing a trend towards producing
reproducible research. Nevertheless, it is still often a frustrating experience
to reproduce scientific results. This is especially true in the machine
learning community, where it is considered acceptable to have black boxes in
your experiments. We present DeepDIVA, a framework designed to facilitate easy
experimentation and the reproduction of experiments. This framework allows researchers to
share their experiments with others, while providing functionality that allows
for easy experimentation, such as: boilerplate code, experiment management,
hyper-parameter optimization, verification of data integrity and visualization
of data and results. Additionally, the code of DeepDIVA is well-documented and
supported by several tutorials that allow a new user to quickly familiarize
themselves with the framework.
DIVA-DAF: A Deep Learning Framework for Historical Document Image Analysis
Deep learning methods have shown strong performance in solving tasks for
historical document image analysis. However, despite current libraries and
frameworks, programming an experiment or a set of experiments and executing
them can be time-consuming. This is why we propose an open-source deep learning
framework, DIVA-DAF, which is based on PyTorch Lightning and specifically
designed for historical document analysis. Pre-implemented tasks such as
segmentation and classification can be easily used or customized. It is also
easy to create one's own tasks with the benefit of powerful modules for loading
data, even large data sets, and different forms of ground truth. The
applications conducted have demonstrated time savings for the programming of a
document analysis task, as well as for different scenarios such as pre-training
or changing the architecture. Thanks to its data module, the framework also
significantly reduces model training time.
On the Challenges of Implementing Machine Learning Systems in Industry
SUMMARY: In this thesis, we focus on the challenges of implementing machine learning systems in an industrial context. Our work has two parts: first, we explore fundamental considerations about the engineering process of machine learning systems; second, we explore the practical side of engineering such systems in an industrial setting. For the first part, we explore one of the challenges recently highlighted by the scientific community: reproducibility. We explain the challenges associated with it and, in light of this new understanding, we explore one of the related effects, ubiquitous in software engineering: the presence of software defects. Using a rigorous methodology, we seek to determine whether the presence of software defects, within a fixed-size sample, in a machine learning framework affects the outcome of a learning process.
ABSTRACT: Software engineering projects face a number of challenges, ranging from managing their life-cycle to ensuring proper testing methodologies, dealing with defects, building, and deploying, among others. As machine learning becomes more prominent, introducing machine learning in new environments requires skills and considerations from software engineering, machine learning and computer engineering, while also inheriting the challenges of these disciplines. As the democratization of machine learning has been advanced by open-source projects led by both academia and industry, industry practitioners and researchers share one thing in common: the tools they use. In machine learning, tools are represented by libraries and frameworks used as software for the various steps necessary in a machine learning project.
In this work, we investigate the challenges in implementing machine learning systems in industry.