Human-Interpretable Explanations for Black-Box Machine Learning Models: An Application to Fraud Detection
Machine Learning (ML) has been increasingly used to aid humans making high-stakes
decisions in a wide range of areas, from public policy to criminal justice, education,
healthcare, or financial services. However, it is very hard for humans to grasp the rationale
behind every ML model’s prediction, hindering trust in the system. The field
of Explainable Artificial Intelligence (XAI) emerged to tackle this problem, aiming to
research and develop methods to make those “black-boxes” more interpretable, but there
is still no major breakthrough. Additionally, the most popular explanation methods —
LIME and SHAP — produce very low-level feature attribution explanations, being of
limited usefulness to personas without any ML knowledge.
This work was developed at Feedzai, a fintech company that uses ML to prevent financial
crime. One of the main Feedzai products is a case management application used
by fraud analysts to review suspicious financial transactions flagged by the ML models.
Fraud analysts are domain experts trained to look for suspicious evidence in transactions
but they do not have ML knowledge, and consequently, current XAI methods do not
suit their information needs. To address this, we present JOEL, a neural network-based
framework to jointly learn a decision-making task and associated domain knowledge
explanations. JOEL is tailored to human-in-the-loop domain experts that lack deep technical
ML knowledge, providing high-level insights about the model’s predictions that
very much resemble the experts’ own reasoning. Moreover, by collecting the domain
feedback from a pool of certified experts (human teaching), we promote seamless and
better quality explanations. Lastly, we resort to semantic mappings between legacy expert
systems and domain taxonomies to automatically annotate a bootstrap training set, overcoming
the absence of concept-based human annotations. We validate JOEL empirically
on a real-world fraud detection dataset, at Feedzai. We show that JOEL can generalize
the explanations from the bootstrap dataset. Furthermore, obtained results indicate that
human teaching is able to further improve the explanations prediction quality.
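As a rough illustration of what "jointly learning a decision-making task and associated domain knowledge explanations" could look like as a training objective, the sketch below sums a binary cross-entropy term for the fraud decision and an averaged multi-label cross-entropy term for the concept explanations. The abstract does not specify JOEL's actual loss, so the function names, the `concept_weight` parameter, and the equal weighting are all assumptions made for illustration.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy between a predicted probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def joint_loss(fraud_prob, fraud_label, concept_probs, concept_labels,
               concept_weight=1.0):
    """Hypothetical joint objective: decision loss plus weighted concept loss.

    concept_probs / concept_labels hold one entry per domain concept
    (multi-label), e.g. "suspicious device" or "unusual amount".
    """
    decision_term = binary_cross_entropy(fraud_prob, fraud_label)
    concept_term = sum(
        binary_cross_entropy(p, y)
        for p, y in zip(concept_probs, concept_labels)
    ) / len(concept_probs)
    return decision_term + concept_weight * concept_term


# An uninformative model (all probabilities 0.5) versus a confident, correct one.
uninformed = joint_loss(0.5, 1, [0.5, 0.5], [1, 0])
confident = joint_loss(0.9, 1, [0.9, 0.1], [1, 0])
```

Under this formulation, expert feedback on the explanations would only change `concept_labels`, which is one way "human teaching" could refine the explanation outputs without relabelling the decision task.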
An ontology co-design method for the co-creation of a continuous care ontology
Ontology engineering methodologies tend to emphasize the role of the knowledge engineer or require a very active role of domain experts. In this paper, a participatory ontology engineering method is described that holds the middle ground between these two 'extremes'. After thorough ethnographic research, an interdisciplinary group of domain experts closely interacted with ontology engineers and social scientists in a series of workshops. Once a preliminary ontology was developed, a dynamic care request system was built using the ontology. Additional workshops were organized involving a broader group of domain experts to ensure the applicability of the ontology across continuous care settings. The proposed method actively engaged domain experts in constructing the ontology without overburdening them. Its applicability is illustrated by presenting the co-created continuous care ontology. The lessons learned during the design and execution of the approach are also presented.
A survey of social studies curricula for kindergarten through grade three in selected communities in the United States.
Thesis (Ed.M.)--Boston University
Extracting personal information from conversations
Personal knowledge is a versatile resource that is valuable for a wide range of downstream applications. Background facts about users can allow chatbot assistants to produce more topical and empathic replies. In the context of recommendation and retrieval models, personal facts can be used to customize the ranking results for individual users. A Personal Knowledge Base, populated with personal facts, such as demographic information, interests and interpersonal relationships, is a unique endpoint for storing and querying personal knowledge. Such knowledge bases are easily interpretable and can provide users with full control over their own personal knowledge, including revising stored facts and managing access by downstream services for personalization purposes. To relieve users of the extensive manual effort of building such a personal knowledge base, we can leverage automated extraction methods applied to the textual content of the users, such as dialogue transcripts or social media posts. Mainstream extraction methods specialize in well-structured data, such as biographical texts or encyclopedic articles, which are rare for most people. In turn, conversational data is abundant but challenging to process and requires specialized methods for extraction of personal facts. In this dissertation we address the acquisition of personal knowledge from conversational data. We propose several novel deep learning models for inferring speakers' personal attributes:
• Demographic attributes (age, gender, profession and family status) are inferred by HAMs - hierarchical neural classifiers with attention mechanism. Trained HAMs can be transferred between different types of conversational data and provide interpretable predictions.
• Long-tailed personal attributes (hobby and profession) are predicted with CHARM - a zero-shot learning model, overcoming the lack of labeled training samples for rare attribute values. By linking conversational utterances to external sources, CHARM is able to predict attribute values which it never saw during training.
• Interpersonal relationships are inferred with PRIDE - a hierarchical transformer-based model. To accurately predict fine-grained relationships, PRIDE leverages personal traits of the speakers and the style of conversational utterances.
Experiments with various conversational texts, including Reddit discussions and movie scripts, demonstrate the viability of our methods and their superior performance compared to state-of-the-art baselines.
Multilingual Coarse Political Stance Classification of Media. The Editorial Line of a ChatGPT and Bard Newspaper
Neutrality is difficult to achieve and, in politics, subjective. Traditional
media typically adopt an editorial line that can be used by their potential
readers as an indicator of the media bias. Several platforms currently rate
news outlets according to their political bias. The editorial line and the
ratings help readers gather a balanced view of the news. But with the advent of
instruction-following language models, tasks such as writing a newspaper
article can be delegated to computers. Without imposing a biased persona, where
would an AI-based news outlet lie within the bias ratings? In this work, we use
the ratings of authentic news outlets to create a multilingual corpus of news
with coarse stance annotations (Left and Right) along with automatically
extracted topic annotations. We show that classifiers trained on this data are
able to identify the editorial line of most unseen newspapers in English,
German, Spanish and Catalan. We then apply the classifiers to 101
newspaper-like articles written by ChatGPT and Bard in the 4 languages at
different time periods. We observe that, similarly to traditional newspapers,
ChatGPT's editorial line evolves with time and, being a data-driven system, the
stance of the generated articles differs among languages. Comment: To be published at EMNLP 2023 (Findings)
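The abstract does not say how article-level predictions are combined into an outlet-level editorial line. One simple possibility, shown here purely as an assumption, is a majority vote over the coarse Left/Right labels predicted for each article; the function name `editorial_line` and the tie handling are hypothetical.

```python
from collections import Counter


def editorial_line(article_labels):
    """Aggregate per-article stance labels ("Left"/"Right") by majority vote.

    Returns (label, share), where share is the fraction of articles that
    agree with the winning label. Ties and empty inputs are reported
    explicitly rather than resolved arbitrarily.
    """
    counts = Counter(article_labels)
    left, right = counts.get("Left", 0), counts.get("Right", 0)
    total = left + right
    if total == 0:
        return None, 0.0
    if left == right:
        return "Tie", 0.5
    label = "Left" if left > right else "Right"
    return label, max(left, right) / total


# Hypothetical predictions for three articles from one outlet.
outlet_label, outlet_share = editorial_line(["Left", "Left", "Right"])
```

Reporting the share alongside the label keeps near-ties visible, which matters when only a small set of generated articles is available per language and time period.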
Assessment of ambient assisted living systems for patients with mild cognitive impairment
According to the World Health Organization, about 50 million people worldwide suffer from dementia, and ten million new cases are added every year. Mild Cognitive Impairment (MCI) affects more than 15% of the population aged 65. Technological solutions, such as smart home technology with ubiquitous computing devices and 24/7 telemedical observation and support, can alleviate the growing problem and lower pressure on the healthcare system. This approach is also preferable for homecare patients in distant and rural areas.
MCI patients are mostly home-based. Ambient Assisted Living (AAL) systems provide tools for automatic registration of vital signs and other medically and socially important information. An AAL system for MCI patients is a logical answer to the problem. At the same time, many of the proposed AAL systems are proprietary, technically complicated and have a high price tag for implementation and service. Also, some proposed technical solutions do not entirely reflect the opinion of healthcare stakeholders.
The current study was proposed as a way to bridge the possible differences in these positions. An anonymous online questionnaire for healthcare professionals was created to prove or disprove a number of interconnected hypotheses about the necessity and feasibility of AAL systems for MCI patients. The main focus was on the hypotheses: "There is necessity of AAL systems for the healthcare" and "AAL systems are capable of providing assistance for patients with Mild Cognitive Impairment". The questionnaire was presented to more than three hundred potential respondents. Around a hundred and twenty agreed to fill it in, and sixty completed the whole questionnaire.
Results were analyzed to produce guidelines for future technical applications of AAL systems for MCI patients and for future research.
Descriptive statistics show support for the implementation of general AAL and variants for MCI patients. Comparative analysis of ordinal data for specific groups of respondents is done with the help of non-parametric tests: the Mann–Whitney–Wilcoxon test and the Kruskal–Wallis test. Table question results are analyzed with chi-square tests for frequency tables. Group analysis demonstrated relatively uniform positive responses in support of AAL for MCI patients.
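To make the Mann–Whitney–Wilcoxon comparison of ordinal responses concrete, the sketch below computes the U statistic for two groups using average ranks for ties. It uses only the standard library and covers the statistic alone, not the p-value. The two response lists are invented Likert-style data, not results from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two independent samples of ordinal data."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


# Hypothetical 5-point Likert answers from two respondent groups.
physicians = [4, 5, 3, 4]
nurses = [5, 5, 4, 2]
u_physicians, u_nurses = mann_whitney_u(physicians, nurses)
```

The two statistics always sum to the product of the group sizes. In practice a statistics library would also supply the p-value and tie correction.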
Learning Representations of Social Media Users
User representations are routinely used in recommendation systems by platform
developers, targeted advertisements by marketers, and by public policy
researchers to gauge public opinion across demographic groups. Computer
scientists consider the problem of inferring user representations more
abstractly; how does one extract a stable user representation - effective for
many downstream tasks - from a medium as noisy and complicated as social media?
The quality of a user representation is ultimately task-dependent (e.g. does
it improve classifier performance, make more accurate recommendations in a
recommendation system) but there are proxies that are less sensitive to the
specific task. Is the representation predictive of latent properties such as a
person's demographic features, socioeconomic class, or mental health state? Is
it predictive of the user's future behavior?
In this thesis, we begin by showing how user representations can be learned
from multiple types of user behavior on social media. We apply several
extensions of generalized canonical correlation analysis to learn these
representations and evaluate them at three tasks: predicting future hashtag
mentions, friending behavior, and demographic features. We then show how user
features can be employed as distant supervision to improve topic model fit.
Finally, we show how user features can be integrated into and improve existing
classifiers in the multitask learning framework. We treat user representations
- ground truth gender and mental health features - as auxiliary tasks to
improve mental health state prediction. We also use distributed user
representations learned in the first chapter to improve tweet-level stance
classifiers, showing that distant user information can inform classification
tasks at the granularity of a single message. Comment: PhD thesis
European Neuroendocrine Tumour Society (ENETS) 2023 guidance paper for nonfunctioning pancreatic neuroendocrine tumours.
This ENETS guidance paper for well-differentiated nonfunctioning pancreatic neuroendocrine tumours (NF-Pan-NET) has been developed by a multidisciplinary working group, and provides up-to-date and practical advice on the management of these tumours. Using the extensive experience of centres treating patients with NF-Pan-NEN, the authors of this guidance paper discuss 10 troublesome questions in everyday clinical practice. Our many years of experience in this field are still being verified in the light of the results of new clinical trials, which set new ways of proceeding in NEN. The treatment of NF-Pan-NEN still requires a decision of a multidisciplinary team of specialists in the field of neuroendocrine neoplasms.