
35 research outputs found

SERENGETI: Massively Multilingual Language Models for Africa

Author: Abdul-Mageed Muhammad
Adebara Ife
Elmadany AbdelRahim
Inciarte Alcides Alcoba
Publication venue
Publication date: 26/05/2023
Field of study

Multilingual pretrained language models (mPLMs) acquire valuable, generalizable linguistic information during pretraining and have advanced the state of the art on task-specific finetuning. To date, only ~31 out of ~2,000 African languages are covered in existing language models. We ameliorate this limitation by developing SERENGETI, a massively multilingual language model that covers 517 African languages and language varieties. We evaluate our novel models on eight natural language understanding tasks across 20 datasets, comparing to 4 mPLMs that cover 4-23 African languages. SERENGETI outperforms other models on 11 datasets across the eight tasks, achieving 82.27 average F1. We also perform analyses of errors from our models, which allows us to investigate the influence of language genealogy and linguistic similarity when the models are applied under zero-shot settings. We will publicly release our models for research (https://github.com/UBC-NLP/serengeti).
Comment: To appear in Findings of ACL 202

arXiv.org e-Print Archive

Prosperity in the Twenty-First Century

Author
Publication venue: 'UCL Press'
Publication date: 22/06/2023
Field of study

Prosperity in the Twenty-First Century sets out a new vision for prosperity in the twenty-first century and how it can be achieved for all. The volume challenges orthodox understandings of economic models, but goes beyond contemporary debates to show how social innovation drives economic value. Drawing on substantive research in the UK, Lebanon and Kenya, it develops new concepts, frameworks, models and metrics for prosperity across a wide range of contexts, emphasising commonalities and differences. Its distinctive approach goes beyond defining and measuring prosperity – addressing the debate about the failures of GDP – to formulating and describing what is needed to make prosperity a realisable proposition for specific people living in specific locales. Departing from general propositions about post-growth to delineate pathways to prosperity, the volume emphasises that visions of the good life are diverse and require empirical work co-designed with local communities and stakeholders to drive change. It is essential reading for policymakers who are stuck, local government officers who need new tools, activists who wonder what is next, academics in need of refreshment, and students and people of all ages who want a way forward

UCL Discovery

A Clustering-Based Framework for Individual Travel Behaviour Change Detection

Author: Bucher Dominik
Hong Ye
Martin Henry
Raubal Martin
Xin Yanan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 11th International Conference on Geographic Information Science (GIScience 2021) - Part II
Publication date: 01/01/2021
Field of study

Dagstuhl Research Online Publication Server

Understanding and improving processes for Integrated Urban Water Management strategies in Australia

Author: Guthrie L
Publication venue: RMIT University
Publication date
Field of study

Integrated Urban Water Management (IUWM) is a new and emerging paradigm in urban water management that calls for the integration of the previously separate services of potable water, sewerage and drainage. This thesis investigates a particular aspect of IUWM: IUWM infrastructure strategies. An IUWM strategy is the collaborative process through which the future infrastructure for all three water services in a given area or region is planned to increase collaboration between services, reduce siloing, and achieve the best possible community outcomes. The primary methods of the research have included the examination of nine IUWM strategy case studies, which are attached in appendices A 1-9, consultation with 43 experts from 25 organisations, and an industry survey. Overall, this thesis seeks to deepen the understanding and knowledge surrounding IUWM and provide guidance on how its implementation could be improved, with a particular focus on IUWM strategies. The thesis has deepened the understanding of IUWM and IUWM strategies through the 21 findings in the Industry Findings Report and through the categorisation system. It also provides guidance for increasing the implementability of IUWM strategies by showing that HCD can be significantly beneficial when included in IUWM planning, and by listing the important IUWM issues for different types of stakeholders.

RMIT Research Repository

Architectural Data Flow Analysis for Detecting Violations of Confidentiality Requirements

Author: Seifermann Stephan
Publication venue: KIT Scientific Publishing
Publication date: 19/12/2022
Field of study

Software vendors must consider confidentiality especially while creating software architectures because decisions made here are hard to change later. Our approach represents and analyzes data flows in software architectures. Systems specify data flows and confidentiality requirements specify limitations of data flows. Software architects use detected violations of these limitations to improve the system. We demonstrate how to integrate our approach into existing development processes

Directory of Open Access Books (DOAB)

Architectural Data Flow Analysis for Detecting Violations of Confidentiality Requirements

Author: Seifermann Stephan
Publication venue
Publication date
Field of study

OAPEN Library

Proceedings of the 8th International Conference on Energy Efficiency in Domestic Appliances and Lighting

Author: BERTOLDI PAOLO
DE LUCA Andrea
Publication venue: Publications Office of the European Union
Publication date: 06/11/2015
Field of study

At the EEDAL'15 conference, 128 papers dealing with energy consumption and energy efficiency improvements for the residential sector were presented. Papers focused on policies and programmes, technologies and consumer behaviour. Special focus was on standards and labels, demand response and smart meters. All the papers have been peer reviewed by experts in the sector.
JRC.F.7-Renewables and Energy Efficienc

JRC Publications Repository

Three Facets of Online Political Networks: Communities, Antagonisms, and Polarization

Author
Publication venue
Publication date: 01/01/2019
Field of study

Millions of users leave digital traces of their political engagements on social media platforms every day. Users form networks of interactions, produce textual content, and like and share each other's content. This creates an invaluable opportunity to better understand the political engagements of internet users. In this proposal, I present three algorithmic solutions to three facets of online political networks, namely the detection of communities, the detection of antagonisms, and the impact of certain types of accounts on political polarization. First, I develop a multi-view community detection algorithm to find politically pure communities. I find that word usage, compared with other content types (i.e. hashtags, URLs), best complements user interactions in accurately detecting communities. Second, I focus on detecting negative linkages between politically motivated social media users. Major social media platforms do not provide their users with built-in negative interaction options. However, many political network analysis tasks rely not only on positive but also on negative linkages. Here, I present the SocLSFact framework to detect negative linkages among social media users. It utilizes three pieces of information: sentiment cues of textual interactions, positive interactions, and socially balanced triads. I evaluate the contribution of each of the three aspects to negative link detection performance on multiple tasks. Third, I propose an experimental setup that quantifies the polarization impact of automated accounts on Twitter retweet networks. I focus on a dataset of the tragic Parkland shooting event and its aftermath. I show that when automated accounts are removed from the retweet network, network polarization decreases significantly, whereas when the same number of accounts is removed at random, the difference is not significant.
I also find that the prominent predictors of engagement with automatically generated content are not very different from what previous studies report for engaging content on social media in general. Last but not least, I identify accounts that self-disclose their automated nature in their profile by using expressions such as bot, chat-bot, or robot. I find that human engagement with self-disclosing accounts is much smaller than with non-disclosing automated accounts. This observational finding can motivate further efforts in automated account detection research to prevent their unintended impact.
Dissertation/Thesis; Doctoral Dissertation Computer Science 201

ASU Digital Repository

12th International Conference on Geographic Information Science: GIScience 2023, September 12–15, 2023, Leeds, UK

Author
Publication venue: Schloss Dagstuhl – Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 07/09/2023
Field of study

No abstract available

Enlighten

Architectural Data Flow Analysis for Detecting Violations of Confidentiality Requirements

Author: Seifermann Stephan
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2022
Field of study

This thesis presents an approach for systematically considering confidentiality requirements in software architectures by modeling and analyzing data flows. The strengthening of data protection regulations, such as the European General Data Protection Regulation (GDPR), and public reactions to data scandals, such as the Cambridge Analytica scandal, have shown that preserving confidentiality is essential for organizations. To preserve confidentiality, it must be considered throughout the entire software development process. Early development phases need particular attention here, because a considerable share of later problems can be traced back to errors made in these early phases. In addition, the effort required to remove errors from the software architecture increases disproportionately in later development phases. To detect violations of confidentiality requirements, data-oriented documentation of software systems is often used in early development phases. This is because investigating such a violation often requires following data flows. Data flow diagrams (DFDs) are popular for investigating security in general and confidentiality in particular. However, plain DFDs are not sufficient for formalizing and automating analyses that build on them. Instead, DFDs or other architecture description languages (ADLs) must be extended so that they can represent the information needed to investigate confidentiality. Such extensions often support confidentiality requirements for exactly one confidentiality mechanism, such as access control. Extensions focused on a single purpose do not support a combination of mechanisms, which limits their expressiveness.
If a software architect wants to change the confidentiality mechanism in use, they must also change the ADL, which entails considerable effort for remodeling the software architecture. Moreover, many analysis approaches offer no integration into existing ADLs and development processes, which makes systematic use of such an approach considerably harder. Existing data-oriented approaches either rely heavily on manual activities and a high level of expertise, or do not support representing access control, information flow control, and encryption simultaneously in the same architecture specification artifact. Because these confidentiality mechanisms are the most widespread, software architects are likely to be interested in using all of them. The manual activities mentioned include, among others, identifying violations through inspections and tracing data through the system. Both activities require considerable experience in the area of confidentiality. In this thesis, we address the problems above with four contributions. First, we present an extension of the DFD syntax that makes it possible to express the information needed to investigate access control, information flow control, and encryption through properties and behavior descriptions within the same architecture specification artifact. Second, we present a semantics for this extended DFD syntax that formalizes the behavior of DFDs via label propagation and thereby enables automated tracing of data. Third, we present analysis definitions that identify violations of confidentiality requirements based on the DFD syntax and semantics.
The supported confidentiality requirements cover the most important variants of access control, information flow control, and encryption. Fourth, we present a guideline for integrating the framework for data-oriented analyses into existing ADLs and their associated development processes. The framework consists of the previous three contributions. We validate the expressiveness, result quality, and modeling effort of our contributions using case studies on seventeen case study systems. The case study systems mostly stem from related work and cover five types of access control requirements, four types of information flow requirements, two types of encryption, and requirements of a combination of both confidentiality mechanisms. We validated the expressiveness of the DFD syntax and of the ADLs created using the integration guideline and were able to represent all but one case study system. We were also able to represent the confidentiality requirements of sixteen case study systems using our analysis definitions. The DFD-based and ADL-based analyses delivered the expected results, which indicates high result quality. We validated the modeling effort in the extended ADLs both for adding and for changing a confidentiality mechanism in an existing software architecture. In both validations, we were able to show that the ADL integrations save modeling effort because considerable parts of existing software architectures can be reused. Software architects benefit from our contributions through increased flexibility in selecting confidentiality mechanisms and in switching between them. Early identification of confidentiality violations also reduces the effort needed to fix the underlying problems.

KITopen

Directory of Open Access Books (DOAB)