Forked Recurrent Neural Network for Hand Gesture Classification Using Inertial Measurement Data
For many applications of hand gesture recognition, a delay-free, affordable, and mobile system relying on body signals is mandatory. Therefore, we propose an approach for hand gesture classification given signals of inertial measurement units (IMUs) that works with extremely short windows to avoid delays. With a simple recurrent neural network, the suitability of each sensor modality of an IMU (accelerometer, gyroscope, magnetometer) is evaluated by providing data of only that modality. For the multi-modal data, a second network with mid-level fusion is proposed. Its forked architecture allows us to process data of each modality individually before carrying out a joint analysis for classification. Experiments on three databases reveal that even when relying on a single modality our proposed system outperforms state-of-the-art systems significantly. With the forked network, classification accuracy can be further improved by over 10% absolute compared to the best reported system while causing a fraction of the delay
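The forked, mid-level-fusion idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' network: the layer sizes, the plain tanh recurrence, the random untrained weights, the linear softmax classifier used for the joint stage, and all function names are assumptions made for illustration only.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix (untrained; illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_branch(window, w_in, w_rec):
    """Run a plain tanh recurrent layer over one modality's window.

    Returns the last hidden state, which summarises that modality alone.
    """
    hidden = [0.0] * len(w_rec)
    for sample in window:
        from_input = matvec(w_in, sample)
        from_state = matvec(w_rec, hidden)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
    return hidden


def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forked_forward(windows, branches, w_out):
    """Process each modality in its own branch, then fuse and classify."""
    fused = []
    for name in sorted(windows):  # fixed order keeps the fused vector consistent
        w_in, w_rec = branches[name]
        fused.extend(rnn_branch(windows[name], w_in, w_rec))
    return softmax(matvec(w_out, fused))


# Hypothetical sizes: 3 axes per sensor, 4 hidden units, 3 gesture classes,
# and a short window of 5 samples.
rng = random.Random(0)
HIDDEN, CLASSES, STEPS = 4, 3, 5
modalities = ["accelerometer", "gyroscope", "magnetometer"]
branches = {
    m: (make_matrix(HIDDEN, 3, rng), make_matrix(HIDDEN, HIDDEN, rng))
    for m in modalities
}
w_out = make_matrix(CLASSES, HIDDEN * len(modalities), rng)
windows = {
    m: [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(STEPS)]
    for m in modalities
}
probs = forked_forward(windows, branches, w_out)
print(probs)
```

Because the weights are random, the printed class probabilities carry no meaning; the sketch only shows the data flow, with each modality kept separate until the fused vector is formed.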
Designing and evaluating the usability of a machine learning API for rapid prototyping music technology
To better support creative software developers and music technologists' needs, and to empower them as machine learning users and innovators, the usability of and developer experience with machine learning tools must be considered and better understood. We review background research on the design and evaluation of application programming interfaces (APIs), with a focus on the domain of machine learning for music technology software development. We present the design rationale for the RAPID-MIX API, an easy-to-use API for rapid prototyping with interactive machine learning, and a usability evaluation study with software developers of music technology. A cognitive dimensions questionnaire was designed and delivered to a group of 12 participants who used the RAPID-MIX API in their software projects, including people who developed systems for personal use and professionals developing software products for music and creative technology companies. The results from the questionnaire indicate that participants found the RAPID-MIX API to be a machine learning API that is easy to learn and use, fun, and good for rapid prototyping with interactive machine learning. Based on these findings, we present an analysis and characterization of the RAPID-MIX API based on the cognitive dimensions (CDs) framework, and discuss its design trade-offs and usability issues. We use these insights and our design experience to provide design recommendations for ML APIs for rapid prototyping of music technology. We conclude with a summary of the main insights, a discussion of the merits and challenges of applying the CDs framework to the evaluation of machine learning APIs, and directions for future work that our research deems valuable
Advanced Mobile Robotics: Volume 3
Mobile robotics is a challenging field with great potential. It covers disciplines including electrical engineering, mechanical engineering, computer science, cognitive science, and social science. It is essential to the design of automated robots, in combination with artificial intelligence, vision, and sensor technologies. Mobile robots are widely used for surveillance, guidance, transportation and entertainment tasks, as well as medical applications. This Special Issue intends to concentrate on recent developments concerning mobile robots and the research surrounding them to enhance studies on the fundamental problems observed in the robots. Various multidisciplinary approaches and integrative contributions including navigation, learning and adaptation, networked system, biologically inspired robots and cognitive methods are welcome contributions to this Special Issue, both from a research and an application perspective
Attention-based machine perception for intelligent cyber-physical systems
Cyber-physical systems (CPS) fundamentally change how information systems interact with the physical world. They integrate sensing, computing, and communication capabilities on heterogeneous platforms and infrastructures. Efficient and effective perception of the environment lays the foundation for proper operation of other CPS components (e.g., planning and control). Recent advances in artificial intelligence (AI) have changed, to an unprecedented degree, how cyber systems extract knowledge from the collected sensing data and understand their physical surroundings. This novel data-to-knowledge transformation capability pushes a wide spectrum of recognition tasks (e.g., visual object detection, speech recognition, and sensor-based human activity recognition) to a higher level, and opens a new era of intelligent cyber-physical systems. However, state-of-the-art neural perception models are typically computation-intensive and sensitive to data noise, which poses significant challenges when they are deployed on resource-limited embedded platforms.
This dissertation works on optimizing both the efficiency and efficacy of deep-neural-network (DNN)-based machine perception in intelligent cyber-physical systems. We extensively exploit and apply the design philosophy of attention, which originated in the field of cognitive psychology, from multiple perspectives of machine perception. It generally means allocating different degrees of concentration to different perceived stimuli. Specifically, we address the following five research questions: First, can we run computation-intensive neural perception models in real time by only looking at (i.e., scheduling) the important parts of the perceived scenes, with cueing from an external sensor? Second, can we eliminate the dependency on external cueing and make the scheduling framework a self-cueing system? Third, how should workloads be distributed among cameras in a distributed (visual) perception system, where multiple cameras can observe the same parts of the environment? Fourth, how can the achieved perception quality be optimized when sensing data from heterogeneous locations and sensor types are collected and utilized? Fifth, how should sensor failures be handled in a distributed sensing system, when the deployed neural perception models are sensitive to missing data?
We formulate the above problems, and introduce corresponding attention-based solutions for each, to construct the fundamental building blocks for envisioning an attention-based machine perception system in intelligent CPS with both efficiency and efficacy guarantees
Multi-sensor data fusion in mobile devices for the identification of Activities of Daily Living
Following the recent advances in technology and the growing use of mobile devices such as
smartphones, several solutions may be developed to improve the quality of life of users in the
context of Ambient Assisted Living (AAL). Mobile devices have different available sensors, e.g.,
accelerometer, gyroscope, magnetometer, microphone and Global Positioning System (GPS)
receiver, which allow the acquisition of physical and physiological parameters for the
recognition of different Activities of Daily Living (ADL) and the environments in which they are
performed. The definition of ADL includes a well-known set of tasks, which include basic self-care
tasks, based on the types of skills that people usually learn in early childhood, including
feeding, bathing, dressing, grooming, walking, running, jumping, climbing stairs, sleeping,
watching TV, working, listening to music, cooking, eating and others. In the context of AAL,
some individuals (henceforth called user or users) need particular assistance, either because
the user has some sort of impairment, or because the user is old, or simply because users
need/want to monitor their lifestyle. The research and development of systems that provide a
particular assistance to people is increasing in many areas of application. In particular, in the
future, the recognition of ADL will be an important element for the development of a personal
digital life coach, providing assistance to different types of users. To support the recognition
of ADL, the surrounding environments should be also recognized to increase the reliability of
these systems.
The main focus of this Thesis is the research on methods for the fusion and classification of the
data acquired by the sensors available in off-the-shelf mobile devices in order to recognize ADL
in almost real-time, taking into account the large diversity of the capabilities and
characteristics of the mobile devices available in the market. In order to achieve this objective,
this Thesis started with the review of the existing methods and technologies to define the
architecture and modules of the method for the identification of ADL. With this review and
based on the knowledge acquired about the sensors available in off-the-shelf mobile devices,
a set of tasks that may be reliably identified was defined as a basis for the remaining research
and development to be carried out in this Thesis. This review also identified the main stages
for the development of a new method for the identification of the ADL using the sensors
available in off-the-shelf mobile devices; these stages are data acquisition, data processing,
data cleaning, data imputation, feature extraction, data fusion and artificial intelligence. One
of the challenges is related to the different types of data acquired from the different sensors,
but other challenges were found, including the presence of environmental noise, the positioning
of the mobile device during the daily activities, the limited capabilities of the mobile devices
and others. Based on the acquired data, the processing was performed, implementing data
cleaning and feature extraction methods, in order to define a new framework for the
recognition of ADL. The data imputation methods were not applied, because at this stage of
the research their implementation has no influence on the results of the identification
of the ADL and environments, as the features are extracted from a set of data acquired during
a defined time interval and there are no missing values during this stage. The joint selection of
the set of usable sensors and the identifiable set of tasks will then allow the development of a
framework that, considering multi-sensor data fusion technologies and context awareness, in
coordination with other information available from the user context, such as his/her agenda
and the time of the day, will make it possible to establish a profile of the tasks that the user
performs in a regular activity day. The classification method and the algorithm for the fusion of
the features for the recognition of ADL and its environments need to be deployed in a machine with
some computational power, while the mobile device that will use the created framework can
perform the identification of the ADL using much less computational power. Based on the
results reported in the literature, the method chosen for the recognition of the ADL comprises
three variants of Artificial Neural Networks (ANN), including simple Multilayer Perceptron
(MLP) networks, Feedforward Neural Networks (FNN) with Backpropagation, and Deep Neural
Networks (DNN).
Data acquisition can be performed with standard methods. After the acquisition, the data must
be processed at the data processing stage, which includes data cleaning and feature extraction
methods. The data cleaning method used for motion and magnetic sensors is the low pass filter,
in order to reduce the noise acquired; but for the acoustic data, the Fast Fourier Transform
(FFT) was applied to extract the different frequencies. When the data is clean, several features
are then extracted based on the types of sensors used, including the mean, standard deviation,
variance, maximum value, minimum value and median of raw data acquired from the motion
and magnetic sensors; the mean, standard deviation, variance and median of the maximum
peaks calculated with the raw data acquired from the motion and magnetic sensors; the five
greatest distances between the maximum peaks calculated with the raw data acquired from
the motion and magnetic sensors; the mean, standard deviation, variance, median and 26 Mel-
Frequency Cepstral Coefficients (MFCC) of the frequencies obtained with FFT based on the raw
data acquired from the microphone data; and the distance travelled calculated with the data
acquired from the GPS receiver. After the extraction of the features, these will be grouped in
different datasets for the application of the ANN methods and to discover the method and
dataset that reports better results. The classification stage was incrementally developed,
starting with the identification of the most common ADL (i.e., walking, running, going upstairs,
going downstairs and standing activities) with motion and magnetic sensors. Next, the
environments were identified with acoustic data, i.e., bedroom, bar, classroom, gym, kitchen,
living room, hall, street and library. After the environments are recognized, and based on the
different sets of sensors commonly available in the mobile devices, the data acquired from the
motion and magnetic sensors were combined with the recognized environment in order to
differentiate some activities without motion, i.e., sleeping and watching TV. The number of
recognized activities in this stage was increased with the use of the distance travelled,
extracted from the GPS receiver data, which also allows the driving activity to be recognized.
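The cleaning and feature-extraction steps described for the motion and magnetic sensors can be sketched roughly as follows. This is an illustrative sketch, not the thesis implementation: the abstract does not specify the filter, so exponential smoothing is used as a stand-in low-pass filter; treating a "maximum peak" as a strict local maximum and a "distance between peaks" as the time gap between consecutive peaks are interpretations; and the function names and the `alpha` and `sample_period_s` parameters are hypothetical.

```python
import statistics


def low_pass(samples, alpha=0.5):
    """Exponential smoothing, used here as a stand-in low-pass filter."""
    out = [samples[0]]
    for x in samples[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out


def local_maxima(samples):
    """Indices of samples strictly greater than both neighbours."""
    return [
        i
        for i in range(1, len(samples) - 1)
        if samples[i - 1] < samples[i] > samples[i + 1]
    ]


def window_features(samples, sample_period_s=1.0):
    """Statistical and peak-based features for one window of one sensor axis."""
    peaks = local_maxima(samples)
    peak_values = [samples[i] for i in peaks]
    # Time gaps between consecutive peaks, largest first.
    gaps = sorted(
        ((b - a) * sample_period_s for a, b in zip(peaks, peaks[1:])),
        reverse=True,
    )
    return {
        "mean": statistics.mean(samples),
        "std": statistics.pstdev(samples),
        "variance": statistics.pvariance(samples),
        "max": max(samples),
        "min": min(samples),
        "median": statistics.median(samples),
        "peak_mean": statistics.mean(peak_values) if peak_values else 0.0,
        "top5_peak_gaps": gaps[:5],
    }


# Toy window with peaks at indices 1, 3 and 6.
raw = [0, 2, 0, 4, 0, 0, 6, 0]
features = window_features(raw)
smoothed = low_pass(raw)
print(features)
```

In a fuller pipeline the features would be computed on the filtered window for each axis of each sensor and concatenated into one vector per time interval before being passed to the classifier.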
After the implementation of the three classification methods with different numbers of
iterations, datasets and remaining configurations in a machine with high processing
capabilities, the reported results proved that the best method for the recognition of the most
common ADL and activities without motion is the DNN method, but the best method for the
recognition of environments is the FNN method with Backpropagation. Depending on the
number of sensors used, this implementation reports a mean accuracy between 85.89% and
89.51% for the recognition of the most common ADL, equal to 86.50% for the recognition of
environments, and equal to 100% for the recognition of activities without motion, reporting
an overall accuracy between 85.89% and 92.00%.
The last stage of this research work was the implementation of the structured framework for
the mobile devices, verifying that the FNN method requires a high processing power for the
recognition of environments and the results reported with the mobile application are lower
than the results reported with the machine with high processing capabilities used. Thus, the
DNN method was also implemented for the recognition of the environments with the mobile
devices. Finally, the results reported with the mobile devices show an accuracy between 86.39%
and 89.15% for the recognition of the most common ADL, equal to 45.68% for the recognition
of environments, and equal to 100% for the recognition of activities without motion, reporting
an overall accuracy between 58.02% and 89.15%.
Compared with the literature, the results returned by the implemented framework show only
a residual improvement. However, the results reported in this research work cover the
identification of more ADL than the ones described in other studies. The improvement in the
recognition of ADL based on the mean of the accuracies is equal to 2.93%, but the maximum
number of ADL and environments previously recognized was 13, while the number of ADL and
environments recognized with the framework resulting from this research is 16. In conclusion,
the framework developed has a mean improvement of 2.93% in the accuracy of the recognition
for a larger number of ADL and environments than previously reported.
In the future, the achievements reported by this PhD research may be considered as a starting
point for the development of a personal digital life coach, but the number of ADL and
environments recognized by the framework should be increased and the experiments should be
performed with different types of devices (i.e., smartphones and smartwatches), and the data
imputation and other machine learning methods should be explored in order to attempt to
increase the reliability of the framework for the recognition of ADL and its environments.
Following recent technological advances and the growing use of mobile devices such as
smartphones, several solutions can be developed to improve users' quality of life in the
context of Ambient Assisted Living (AAL). Mobile devices integrate several sensors, such as an
accelerometer, gyroscope, magnetometer, microphone and Global Positioning System (GPS)
receiver, which allow the acquisition of various physical and physiological parameters for the
recognition of different Activities of Daily Living (ADL) and their environments. The definition
of ADL includes a well-known set of basic self-care tasks, based on the types of skills that
people usually learn in childhood. These tasks include feeding, bathing, dressing, grooming,
walking, running, jumping, climbing stairs, sleeping, watching television, working, listening
to music, cooking and eating, among others. In the context of AAL, some individuals (commonly
called users) need particular assistance, either because the user has some kind of impairment,
because the user is elderly, or simply because the user needs or wants to monitor and train
their lifestyle. Research and development of systems that provide some kind of particular
assistance is growing in many application areas. In particular, in the future, the recognition
of ADL will be an important part of the development of a personal digital assistant that
provides low-cost personal assistance to different types of people. To help in the recognition
of ADL, the environments in which they take place should also be recognized to increase the
reliability of these systems.
The main focus of this Thesis is the development of methods for the fusion and classification
of the data acquired from the sensors available in mobile devices, for the recognition of ADL
in almost real time, taking into account the great diversity of the characteristics of the
mobile devices available on the market. To achieve this objective, this Thesis began with a
review of the existing methods and technologies in order to define the architecture and
modules of the new method for the identification of ADL. With this literature review, and based
on the knowledge acquired about the sensors available in the mobile devices on the market, a
set of tasks that can be identified was defined for the research and development of this
Thesis. This review also identifies the main concepts for the development of the new method
for the identification of ADL using the sensors, namely data acquisition, data processing,
data correction, data imputation, feature extraction, data fusion and the extraction of results
using artificial intelligence methods. One of the challenges is related to the different types
of data acquired by the different sensors, but other challenges were found, the most relevant
being environmental noise, the positioning of the device while daily activities are performed,
and the limited capabilities of mobile devices. The different characteristics of people can
also influence the creation of the methods, so people with different lifestyles and physical
characteristics were chosen for the acquisition and identification of the data acquired from
the sensors. Based on the acquired data, the data were processed, implementing data correction
and feature extraction methods, to begin the creation of the new method for the recognition of
ADL. The data imputation methods were excluded from the implementation because they would not
influence the results of the identification of ADL and environments, given that the features
are extracted from a set of data acquired during a defined time interval.
The selection of the usable sensors, as well as of the identifiable ADL, will allow the
development of a method that, considering the use of technologies for the fusion of data
acquired with multiple sensors, in coordination with other information about the user's
context, such as the user's agenda, makes it possible to establish a profile of the tasks that
the user performs daily. Based on the results obtained in the literature, the method chosen for
the recognition of ADL is the different variants of Artificial Neural Networks (ANN), including
Multilayer Perceptron (MLP), Feedforward Neural Networks (FNN) with Backpropagation and Deep
Neural Networks (DNN). Finally, after the creation of the methods for each phase of the method
for the recognition of ADL and environments, the sequential implementation of the different
methods was carried out on a mobile device for additional tests.
After the definition of the structure of the method for the recognition of ADL and environments
using mobile devices, it was verified that data acquisition can be performed with common
methods. After acquisition, the data must be processed in the data processing module, which
includes the data correction and feature extraction methods. The data correction method used
for the motion and magnetic sensors is the low-pass filter, in order to reduce noise, but for
the acoustic data the Fast Fourier Transform (FFT) was applied to extract the different
frequencies. After the data were corrected, the features were extracted based on the types of
sensors used: the mean, standard deviation, variance, maximum value, minimum value and median
of the data acquired by the magnetic and motion sensors; the mean, standard deviation, variance
and median of the maximum peaks calculated from the data acquired by the magnetic and motion
sensors; the five greatest distances between the maximum peaks calculated with the data
acquired from the motion and magnetic sensors; the mean, standard deviation, variance and 26
Mel-Frequency Cepstral Coefficients (MFCC) of the frequencies obtained with the FFT from the
microphone data; and the distance calculated with the data acquired by the GPS receiver.
After the features are extracted, they are grouped into different datasets for the application
of the ANN methods, in order to discover the method and feature set that report the best
results. The data classification module was developed incrementally, starting with the
identification of the common ADL with magnetic and motion sensors, i.e., walking, running,
going upstairs, going downstairs and standing. Next, the environments are identified with data
from acoustic sensors, i.e., bedroom, bar, classroom, gym, kitchen, living room, hall, street
and library. Based on the recognized environments and the remaining sensors available in mobile
devices, the data acquired from the magnetic and motion sensors were combined with the
recognized environment to differentiate some activities without motion (i.e., sleeping and
watching television). The number of activities recognized in this phase increases with the
fusion of the distance travelled, extracted from the GPS receiver data, which also allows the
driving activity to be recognized.
After the implementation of the three classification methods with different numbers of
iterations, datasets and configurations on a machine with high processing capacity, the
reported results showed that the best method for the recognition of the common ADL and of
activities without motion is the DNN method, but the best method for the recognition of
environments is the FNN method with Backpropagation. Depending on the number of sensors used,
this implementation reports a mean accuracy between 85.89% and 89.51% for the recognition of
the common ADL, equal to 86.50% for the recognition of environments, and equal to 100% for the
recognition of activities without motion, reporting an overall accuracy between 85.89% and
92.00%.
The last stage of this Thesis was the implementation of the method on mobile devices, verifying
that the FNN method requires high processing power for the recognition of environments and that
the results reported with these devices are lower than those reported with the
high-processing-capacity machine used in the development of the method. Thus, the DNN method
was also implemented for the recognition of environments with the mobile devices. Finally, the
results reported with the mobile devices show an accuracy between 86.39% and 89.15% for the
recognition of the common ADL, equal to 45.68% for the recognition of environments, and equal
to 100% for the recognition of activities without motion, reporting an overall accuracy
between 58.02% and 89.15%.
Compared with the results reported in the literature, the results of the developed method show
a residual improvement, but the results of this Thesis identify more ADL than the other studies
available in the literature. The improvement in the recognition of ADL based on the mean of the
accuracies is equal to 2.93%, but the maximum number of ADL and environments recognized by the
studies available in the literature is 13, while the number of ADL and environments recognized
with the implemented method is 16. Thus, the developed method has an improvement of 2.93% in
recognition accuracy over a larger number of ADL and environments.
As future work, the results reported in this Thesis can be considered a starting point for the
development of a personal digital assistant, but the number of ADL and environments recognized
by the method should be increased, the experiments should be repeated with different types of
mobile devices (i.e., smartphones and smartwatches), and imputation methods and other data
classification methods should be explored in order to try to increase the reliability of the
method for the recognition of ADL and environments
Predicting Head Pose From Speech
Speech animation, the process of animating a human-like model to give the impression it is talking, most commonly relies on the work of skilled animators, or performance capture. These approaches are time consuming, expensive, and lack the ability to scale. This thesis develops algorithms for content driven speech animation; models that learn visual actions from data without semantic labelling, to predict realistic speech animation from recorded audio.
We achieve these goals by first forming a multi-modal corpus that represents the style of speech we want to model; speech that is natural, expressive and prosodic. This allows us to train deep recurrent neural networks to predict compelling animation.
We first develop methods to predict the rigid head pose of a speaker. Predicting the head pose of a speaker from speech is not wholly deterministic, so our methods provide a large variety of plausible head pose trajectories from a single utterance. We then apply our methods to learn how to predict the head pose of the listener while in conversation, using only the voice of the speaker. Finally, we show how to predict the lip sync, facial expression, and rigid head pose of the speaker, simultaneously, solely from speech
Electromyography Based Human-Robot Interfaces for the Control of Artificial Hands and Wearable Devices
The design of robotic systems is currently turning to human-inspired solutions as a way to replicate human ability and flexibility in performing motor tasks. Especially for control and teleoperation purposes, the human-in-the-loop approach is a key element within the framework known as the Human-Robot Interface. This thesis reports the research activity carried out for the design of Human-Robot Interfaces based on the detection of human motion intentions from surface electromyography. The main goal was to investigate intuitive and natural control solutions for the teleoperation of both robotic hands during grasping tasks and wearable devices during elbow assistive applications.
The design solutions are based on the human motor control principles and surface electromyography interpretation, which are reviewed with emphasis on the concept of synergies. The electromyography based control strategies for the robotic hand grasping and the wearable device assistance are also reviewed.
The contribution of this research to the control of artificial hands relies on the integration of different levels of the synergistic organization of motor control, and on the combination of proportional control and machine learning approaches, guided by user-centred intuitiveness in the Human-Robot Interface design specifications.
On the wearable device side, the thesis addresses the control of a novel upper limb assistive device based on the Twisted String Actuation concept. The contribution concerns the assistance of the elbow during load lifting tasks, exploring a simplification in the use of surface electromyography within the design of the Human-Robot Interface. The aim is to work around the complex subject-dependent algorithm calibrations required by joint torque estimation methods
Computer Vision Approaches to Liquid-Phase Transmission Electron Microscopy
Electron microscopy (EM) is a technique that exploits the interaction between electrons and matter to produce high resolution images down to the atomic level. In order to avoid undesired scattering in the electron path, EM samples are conventionally imaged in the solid state under vacuum conditions. Recently, this limit has been overcome by the realization of liquid-phase electron microscopy (LP EM), a technique that enables the analysis of samples in their native liquid state. LP EM paired with a high frame rate direct detection camera allows tracking the motion of particles in liquids, as well as their temporal dynamic processes. In this research work, LP EM is adopted to image the dynamics of particles undergoing Brownian motion, exploiting their natural rotation to access all the particle views, in order to reconstruct their 3D structure via tomographic techniques. Specific computer vision-based tools were designed around the limitations of LP EM in order to process the results of the imaging. Different deblurring and denoising approaches were adopted to improve the quality of the images, and the processed LP EM images were then used to reconstruct the 3D model of the imaged samples. This task was performed by developing two different methods: Brownian tomography (BT) and Brownian particle analysis (BPA). The former tracks a single particle in time, capturing its dynamic evolution. The latter is an extension in time of the single particle analysis (SPA) technique. SPA is conventionally paired with cryo-EM to reconstruct 3D density maps starting from thousands of EM images by capturing hundreds of particles of the same species frozen on a grid. In contrast, BPA can process image sequences that may not contain thousands of particles, but instead monitors individual particle views across consecutive frames, rather than across a single frame
Interactive Machine Learning for User-Innovation Toolkits – An Action Design Research approach
Machine learning offers great potential to developers and end users in the creative industries.
However, to better support creative software developers' needs and empower them as machine
learning users and innovators, the usability of and developer experience with machine learning
tools must be considered and better understood. This thesis asks the following research questions:
How can we apply a user-centred approach to the design of developer tools for rapid prototyping
with Interactive Machine Learning? In what ways can we design better developer tools to accelerate
and broaden innovation with machine learning?
This thesis presents a three-year longitudinal action research study that I undertook within a
multi-institutional consortium leading the EU H2020-funded Innovation Action RAPID-MIX. The
scope of the research presented here was the application of a user-centred approach to the design
and evaluation of developer tools for rapid prototyping and product development with machine
learning. This thesis presents my work in collaboration with other members of RAPID-MIX,
including design and deployment of a user-centred methodology for the project, interventions for
gathering requirements with RAPID-MIX consortium stakeholders and end users, and prototyping,
development and evaluation of a software development toolkit for interactive machine learning.
This thesis contributes new understanding about the consequences and implications of a
user-centred approach to the design and evaluation of developer tools for rapid prototyping of
interactive machine learning systems. This includes 1) new understanding about the goals, needs,
expectations, and challenges facing creative machine-learning non-expert developers and 2) an
evaluation of the usability and design trade-offs of a toolkit for rapid prototyping with interactive
machine learning. This thesis also contributes 3) a methods framework of User-Centred
Design Actions for harmonising User-Centred Design with Action Research and supporting the
collaboration between action researchers and practitioners working in rapid innovation actions,
and 4) recommendations for applying Action Research and User-Centred Design in similar contexts
and scale
Laser Scanner Technology
Laser scanning technology plays an important role in science and engineering. The aim of scanning is usually to create a digital version of an object's surface. Multiple scans are sometimes performed with multiple cameras to capture all sides of the scene under study. Optical tests are commonly used to demonstrate the capabilities of laser scanning technology in modern industry and in research laboratories. This book describes recent contributions of laser scanning technology in different areas around the world. The main topics of laser scanning described in this volume include full body scanning, traffic management, 3D survey process, bridge monitoring, tracking of scanning, human sensing, three-dimensional modelling, glacier monitoring and digitizing heritage monuments