Protecting Privacy in Indian Schools: Regulating AI-based Technologies' Design, Development and Deployment
Education is one of the priority areas for the Indian government, where Artificial Intelligence (AI) technologies are touted to bring digital transformation. Several Indian states have started deploying facial recognition-enabled CCTV cameras, emotion recognition technologies, fingerprint scanners, and radio-frequency identification (RFID) tags in their schools, not only to provide personalised recommendations, ensure student security, and predict student drop-out rates, but also to compile 360-degree information on each student. Further, integrating Aadhaar (a digital identity card that works on biometric data) across AI technologies and learning management systems (LMS) renders schools a 'panopticon'.
Certain technologies or systems, such as Aadhaar, CCTV cameras, GPS systems, RFID tags, and learning management systems, are used primarily for continuous data collection, storage, and retention. Though they cannot be termed AI technologies per se, they are fundamental to designing and developing AI systems such as facial, fingerprint, and emotion recognition technologies. The large amount of student data collected rapidly through the former technologies is used to build algorithms for the latter AI systems. Once these algorithms are trained using machine learning (ML) techniques, they learn correlations across multiple datasets, predicting each student's identity, decisions, grades, learning growth, tendency to drop out, and other behavioural characteristics. Such autonomous and repetitive collection, processing, storage, and retention of student data without effective data protection legislation endangers student privacy.
The algorithmic predictions of AI technologies are an avatar of the data fed into the system. An AI technology is only as good as the person collecting the data, processing it into a relevant and valuable output, and regularly evaluating the inputs fed into the AI model. An AI model can produce inaccurate predictions if that person overlooks any relevant data. However, the state's, school administrations', and parents' belief in AI technologies as a panacea for student security and educational development overlooks the context in which 'data practices' are conducted. A right to privacy in an AI age is inextricably connected to the data practices where data gets 'cooked'. Thus, data protection legislation operating without understanding and regulating such data practices will remain ineffective in safeguarding privacy.
The thesis undertakes interdisciplinary research that enables a better understanding of the interplay between the data practices of AI technologies and the social practices of an Indian school, which the present Indian data protection legislation overlooks, endangering students' privacy from the design and development stages of an AI model through to deployment. The thesis recommends that the Indian legislature frame legislation better equipped for the AI/ML age, and advises the Indian judiciary on evaluating the legality and reasonableness of designing, developing, and deploying such technologies in schools.
Rules, frequency, and predictability in morphological generalization: behavioral and computational evidence from the German plural system
Morphological generalization, or the task of mapping an unknown word (such as a novel noun Raun) to an inflected form (such as the plural Rauns), has historically proven a contested topic within computational linguistics and cognitive science, e.g. within the past tense debate (Rumelhart and McClelland, 1986; Pinker and Prince, 1988; Seidenberg and Plaut, 2014). Marcus et al. (1995) identified German plural inflection as a key challenge domain to evaluate two competing accounts of morphological generalization: a rule generation view focused on linguistic features of input words, and a type frequency view focused on the distribution of output inflected forms, thought to reflect more domain-general cognitive processes. More recent behavioral and computational research developments support a new view based on predictability, which integrates both input and output distributions. My research uses these methodological innovations to revisit a core dispute of the past tense debate: how do German speakers generalize plural inflection, and can computational learners generalize similarly?
This dissertation evaluates the rule generation, type frequency, and predictability accounts of morphological generalization in a series of behavioral and computational experiments using the stimuli developed by Marcus et al. I assess predictions for three aspects of German plural generalization: the distribution of infrequent plural classes, the influence of grammatical gender, and within-item variability. Overall, I find that speaker behavior is best characterized as frequency-matching to a phonologically-conditioned lexical distribution. This result does not support the rule generation view, and qualifies the predictability view: speakers use some, but not all, available information to reduce uncertainty in morphological generalization. Neural and symbolic model predictions are typically overconfident relative to speakers; simple Bayesian models show somewhat higher speaker-like variability and accuracy. All computational models are outperformed by a static phonologically-conditioned lexical baseline, suggesting that these models have not learned the selective feature preferences that inform speaker generalization.
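The contrast between frequency-matching speakers and overconfident models can be sketched in a few lines. The plural classes and conditional probabilities below are purely illustrative stand-ins, not the dissertation's estimates; the point is the difference between sampling from a phonologically-conditioned distribution and always returning its mode.

```python
import random
from collections import Counter

# Hypothetical phonologically-conditioned plural-class distribution:
# probabilities of German plural suffixes conditioned on the noun's ending.
# These numbers are illustrative, not empirical estimates.
LEXICAL_DIST = {
    "vowel-final":     {"-s": 0.6, "-n": 0.3, "-e": 0.1},
    "consonant-final": {"-e": 0.5, "-n": 0.25, "-er": 0.15, "-s": 0.1},
}

def frequency_match(condition, rng):
    """Sample a plural class in proportion to its conditional frequency."""
    dist = LEXICAL_DIST[condition]
    classes, probs = zip(*dist.items())
    return rng.choices(classes, weights=probs, k=1)[0]

def maximize(condition):
    """An 'overconfident' learner always outputs the single most likely class."""
    dist = LEXICAL_DIST[condition]
    return max(dist, key=dist.get)

rng = random.Random(0)
samples = Counter(frequency_match("vowel-final", rng) for _ in range(10_000))
print(samples.most_common())   # roughly a 6000/3000/1000 split
print(maximize("vowel-final")) # always "-s"
```

A frequency-matching speaker reproduces the whole conditional distribution across trials, while a maximizing model collapses it to a single answer, which is one way the abstract's "overconfident relative to speakers" finding can surface.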
Heterogeneous Federated Learning: State-of-the-art and Research Challenges
Federated learning (FL) has drawn increasing attention owing to its potential
use in large-scale industrial applications. Existing federated learning works
mainly focus on model homogeneous settings. However, practical federated
learning typically faces the heterogeneity of data distributions, model
architectures, network environments, and hardware devices among participant
clients. Heterogeneous Federated Learning (HFL) is much more challenging, and
corresponding solutions are diverse and complex. Therefore, a systematic survey
on this topic about the research challenges and state-of-the-art is essential.
In this survey, we first summarize the various research challenges in HFL
from five aspects: statistical heterogeneity, model heterogeneity,
communication heterogeneity, device heterogeneity, and additional challenges.
In addition, recent advances in HFL are reviewed and a new taxonomy of existing
HFL methods is proposed with an in-depth analysis of their pros and cons. We
classify existing methods from three different levels according to the HFL
procedure: data-level, model-level, and server-level. Finally, several critical
and promising future research directions in HFL are discussed, which may
facilitate further developments in this field. A periodically updated
collection on HFL is available at https://github.com/marswhu/HFL_Survey.
Comment: 42 pages, 11 figures, and 4 tables
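The server-level handling of statistical heterogeneity can be illustrated with a minimal FedAvg-style aggregation sketch: clients hold different (non-IID) amounts of data, so the server weights each client's parameters by its local sample count. This is a generic illustration, not a method from the survey; the client values and sizes are made up.

```python
# Minimal sketch of server-level aggregation under statistical heterogeneity:
# the server weights each client's parameter vector by its local dataset size.

def fedavg_aggregate(client_params, client_sizes):
    """Weighted average of client parameter vectors by local sample count."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    global_params = [0.0] * dim
    for params, n in zip(client_params, client_sizes):
        weight = n / total
        for i, p in enumerate(params):
            global_params[i] += weight * p
    return global_params

# Three clients with heterogeneous data volumes: the data-rich client
# dominates the aggregate.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [100, 300, 600]
print(fedavg_aggregate(clients, sizes))  # ~[4.0, 5.0]
```

Size-weighted averaging is the simplest server-level response to data heterogeneity; the survey's taxonomy covers far richer data-, model-, and server-level techniques on top of this baseline.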
Under construction: infrastructure and modern fiction
In this dissertation, I argue that infrastructural development, with its technological promises but widening geographic disparities and social and environmental consequences, informs both the narrative content and aesthetic forms of modernist and contemporary Anglophone fiction. Despite its prevalent material forms (roads, rails, pipes, and wires), infrastructure poses particular formal and narrative problems, often receding into the background as mere setting. To address how literary fiction theorizes the experience of infrastructure requires reading 'infrastructurally': that is, paying attention to the seemingly mundane interactions between characters and their built environments. The writers central to this project (James Joyce, William Faulkner, Karen Tei Yamashita, and Mohsin Hamid) take up the representational challenges posed by infrastructure by bringing transit networks, sanitation systems, and electrical grids, and the histories of their development and use, into the foreground. These writers call attention to the political dimensions of built environments, revealing the ways infrastructures produce, reinforce, and perpetuate racial and socioeconomic fault lines. They also attempt to formalize the material relations of power inscribed by and within infrastructure; the novel itself becomes an imaginary counterpart to the technologies of infrastructure, a form that shapes and constrains what types of social action and affiliation are possible.
Frivolous Floodgate Fears
When rejecting plaintiff-friendly liability standards, courts often cite a fear of opening the floodgates of litigation. Namely, courts point to either a desire to protect the docket of federal courts or a burden on the executive branch. But there is little empirical evidence exploring whether the adoption of a stricter standard can, in fact, decrease the filing of legal claims in this circumstance. This Article empirically analyzes and theoretically models the effect of adopting arguably stricter liability standards on litigation by investigating the context of one of the Supreme Court's most recent reliances on this argument when adopting a stricter liability standard for causation in employment discrimination claims. In 2013, the Supreme Court held that a plaintiff proving retaliation under Title VII of the Civil Rights Act must prove that their participation in a protected activity was a but-for cause of the adverse employment action they experienced. Rejecting the arguably more plaintiff-friendly motivating-factor standard, the Court stated, "[L]essening the causation standard could also contribute to the filing of frivolous claims, which would siphon resources from efforts by employer[s], administrative agencies, and courts to combat workplace harassment." Univ. of Tex. Sw. Med. Ctr. v. Nassar, 570 U.S. 338, 358 (2013). And over the past ten years, the Court has overturned the application of motivating-factor causation as applied to at least four different federal antidiscrimination statutes. Contrary to the Supreme Court's concern that motivating-factor causation encourages frivolous charges, many employment law scholars worry that the heightened but-for standard will deter legitimate claims. This Article empirically explores these concerns, in part using data received from the Equal Employment Opportunity Commission (EEOC) through a Freedom of Information Act (FOIA) request.
Specifically, it empirically tests whether the adoption of the but-for causation standard for claims filed under the Age Discrimination in Employment Act and by federal courts of appeals under the Americans with Disabilities Act has impacted the filing of discrimination claims and the outcome of those claims in federal court. Consistent with theory detailed in this Article, the empirical analysis provides evidence that the stricter standard may have increased the docket of the federal courts by decreasing settlement within the EEOC and during litigation. The empirical results weigh in on concerns surrounding the adoption of the but-for causation standard and provide evidence that the floodgates argument, when relied on to deter frivolous filings by changing liability standards, in fact, may do just the opposite by decreasing the likelihood of settlement in the short term, without impacting the filing of claims or other case outcomes.
The Public Performance Of Sanctions In Insolvency Cases: The Dark, Humiliating, And Ridiculous Side Of The Law Of Debt In The Italian Experience. A Historical Overview Of Shaming Practices
This study provides a diachronic comparative overview of how the law of debt has been applied by certain institutions in Italy. Specifically, it offers historical and comparative insights into the public performance of sanctions for insolvency through shaming and customary practices in Roman Imperial Law, in the Middle Ages, and in later periods.
The first part of the essay focuses on the Roman bonorum cessio culo nudo super lapidem and on the medieval customary institution called pietra della vergogna (stone of shame), which originates from the Roman model.
The second part of the essay analyzes the social function of the zecca and the pittima Veneziana during the Republic of Venice, and of the practice of lu soldate a castighe (no translation is possible).
The author uses a functionalist approach to apply some arguments and concepts from the current context to this historical analysis of ancient institutions that we would now consider ridiculous.
The article shows that the customary norms that play a crucial regulatory role in online interactions today also operated in the public square of the past. One of these tools is shaming. As in contemporary online settings, shaming practices in the historical public square were used to enforce the rules of civility in a given community. Such practices can be seen as virtuous when they are intended as a tool to pursue positive change in forces entrenched in the culture, and thus to address social wrongs considered outside the reach of the law, or to address human rights abuses.
Novel neural architectures & algorithms for efficient inference
In the last decade, the machine learning universe embraced deep neural networks (DNNs) wholeheartedly with the advent of neural architectures such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), transformers, etc. These models have empowered many applications, such as ChatGPT, Imagen, etc., and have achieved state-of-the-art (SOTA) performance on many vision, speech, and language modeling tasks. However, SOTA performance comes with various issues, such as large model size, compute-intensive training, increased inference latency, higher working memory, etc. This thesis aims at improving the resource efficiency of neural architectures, i.e., significantly reducing the computational, storage, and energy consumption of a DNN without any significant loss in performance.
Towards this goal, we explore novel neural architectures as well as training algorithms that allow low-capacity models to achieve near SOTA performance. We divide this thesis into two dimensions: \textit{Efficient Low Complexity Models}, and \textit{Input Hardness Adaptive Models}.
Along the first dimension, i.e., \textit{Efficient Low Complexity Models}, we improve DNN performance by addressing instabilities in the existing architectures and training methods. We propose novel neural architectures inspired by ordinary differential equations (ODEs) to reinforce input signals and attend to salient feature regions. In addition, we show that carefully designed training schemes improve the performance of existing neural networks. We divide this exploration into two parts:
\textsc{(a) Efficient Low Complexity RNNs.} We improve RNN resource efficiency by addressing poor gradients, noise amplification, and BPTT training issues. First, we improve RNNs by solving ODEs that eliminate vanishing and exploding gradients during training. To do so, we present Incremental Recurrent Neural Networks (iRNNs), which keep track of increments in the equilibrium surface. Next, we propose Time Adaptive RNNs that mitigate the noise propagation issue in RNNs by modulating the time constants in the ODE-based transition function. We empirically demonstrate the superiority of ODE-based neural architectures over existing RNNs. Finally, we propose the Forward Propagation Through Time (FPTT) algorithm for training RNNs. We show that FPTT yields significant gains compared to the more conventional Backpropagation Through Time (BPTT) scheme.
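The general flavour of such ODE-based recurrence can be sketched as follows. This is not the exact iRNN or Time Adaptive RNN formulation (which differ in how they track equilibria and time constants); it is only an illustrative Euler discretization of dh/dt = -h + tanh(Wh + Ux), showing how incremental updates keep the hidden state close to the identity map and bounded. All weights and inputs below are toy values.

```python
import math

def tanh_vec(v):
    return [math.tanh(x) for x in v]

def ode_rnn_step(h, x, W, U, eta=0.1, k=5):
    """k Euler steps of dh/dt = -h + tanh(W*h + U*x) (scalar weights for brevity)."""
    for _ in range(k):
        pre = [W * hi + U * xi for hi, xi in zip(h, x)]
        target = tanh_vec(pre)
        # Incremental update: move h a small step eta toward the equilibrium.
        h = [hi + eta * (ti - hi) for hi, ti in zip(h, target)]
    return h

h = [0.0, 0.0]
for x in ([1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]):  # a toy input sequence
    h = ode_rnn_step(h, x, W=0.5, U=1.0)
print(h)  # hidden state stays bounded, each coordinate in (-1, 1)
```

Because each step is a convex combination of the old state and a tanh-bounded target, activations cannot blow up, which is the intuition behind why such dynamics tame exploding gradients.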
\textsc{(b) Efficient Low Complexity CNNs.} Next, we improve CNN architectures by reducing their resource usage. CNNs require greater depth to generate high-level features, resulting in computationally expensive models. We design a novel residual block, the Global layer, that constrains the input and output features by approximately solving partial differential equations (PDEs). It yields better receptive fields than traditional convolutional blocks and thus results in shallower networks. Further, we reduce the model footprint by enforcing a novel inductive bias that formulates the output of a residual block as a spatial interpolation between high-compute anchor pixels and low-compute cheaper pixels. This results in spatially interpolated convolutional blocks (SI-CNNs) that offer better compute-performance trade-offs. Finally, we propose an algorithm that enforces various distributional constraints during training in order to achieve better generalization. We refer to this scheme as distributionally constrained learning (DCL).
In the second dimension, i.e., \textit{Input Hardness Adaptive Models}, we introduce the notion of the hardness of an input relative to an architecture. In the first dimension, a neural network allocates the same resources, such as compute, storage, and working memory, to every input, inherently assuming that all examples are equally hard for the model. In this dimension, we challenge this assumption, reasoning that some inputs are relatively easy for a network to predict compared to others. Input hardness enables us to create selective classifiers wherein a low-capacity network handles simple inputs while abstaining from prediction on complex inputs. Next, we create hybrid models that route the hard inputs from the low-capacity abstaining network to a high-capacity expert model. We design various architectures that adhere to this hybrid inference style. Further, input hardness enables us to selectively distill the knowledge of a high-capacity model into a low-capacity model by discarding hard inputs during the distillation procedure.
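The hybrid inference style described above can be sketched in a few lines: a cheap model answers when confident and abstains otherwise, routing hard inputs to an expert. The two models and the confidence threshold below are hypothetical stand-ins, not the thesis's architectures.

```python
# Toy selective/hybrid inference: a low-capacity model abstains on "hard"
# inputs (here, simply inputs >= 10), which are routed to an expert model.

def cheap_model(x):
    """Stand-in low-capacity network: returns (label, confidence)."""
    return ("even" if x % 2 == 0 else "odd", 0.95 if x < 10 else 0.55)

def expert_model(x):
    """Stand-in high-capacity expert, consulted only on hard inputs."""
    return "even" if x % 2 == 0 else "odd"

def hybrid_predict(x, threshold=0.8):
    """Selective classification: accept the cheap answer only if confident."""
    label, conf = cheap_model(x)
    if conf >= threshold:
        return label, "cheap"
    return expert_model(x), "expert"

print(hybrid_predict(4))   # ('even', 'cheap')  -- easy input, cheap path
print(hybrid_predict(42))  # ('even', 'expert') -- hard input, routed
```

The compute saving comes from the expert being invoked only on the abstained fraction of inputs; tuning the threshold trades accuracy against cost.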
Finally, we conclude this thesis by sketching out various interesting future research directions that emerge as extensions of the different ideas explored in this work.
Securing IoT Applications through Decentralised and Distributed IoT-Blockchain Architectures
The integration of blockchain into IoT can provide reliable control of the IoT network's
ability to distribute computation over a large number of devices. It also allows the AI
system to use trusted data for analysis and forecasts while utilising the available IoT
hardware to coordinate the execution of tasks in parallel, using a fully distributed
approach.
This thesis's first contribution is a practical implementation of a real-world
IoT-blockchain application: a flood detection use case, demonstrated using
Ethereum proof of authority (PoA). This includes performance measurements of
the transaction confirmation time, the system end-to-end latency, and the
average power consumption. The study showed that blockchain can be integrated
into IoT applications, and that Ethereum PoA can be used within IoT for
permissioned implementations. This can be achieved while the average energy
consumption of running the flood detection system, including the Ethereum
Geth client, remains small (around 0.3 J).
The second contribution is a novel IoT-centric consensus protocol called
honesty-based distributed proof of authority (HDPoA) via scalable work. HDPoA
was analysed and then deployed and tested. Performance measurements and
evaluation, along with security analyses of HDPoA, were conducted using a
total of 30 different IoT devices comprising Raspberry Pi, ESP32, and ESP8266
devices. These measurements included energy consumption, the devices' hash
power, and the transaction confirmation time. The measured values of hash per
joule (h/J) for mining were 13.8 Kh/J, 54 Kh/J, and 22.4 Kh/J for the
Raspberry Pi, ESP32, and ESP8266 devices, respectively; this was achieved
with limited impact on each device's power. In HDPoA, the transaction
confirmation time was reduced to only one block, compared to up to six blocks
in Bitcoin.
The third contribution is a novel, secure, distributed, and decentralised
architecture for supporting the implementation of distributed artificial
intelligence (DAI) using hardware platforms provided by the IoT. A trained
DAI system was implemented over the IoT, where each IoT device hosts one or
more neurons within the DAI layers. This is accomplished through the use of
blockchain technology, which allows trusted interaction and information
exchange between distributed neurons. Three different datasets were tested,
and the system achieved accuracy similar to that of a standalone system; both
achieved accuracies of 92%-98%. The system accomplished this while keeping
overall latency as low as two minutes. This demonstrated the secure
architecture's ability to facilitate the implementation of DAI within the IoT
while preserving the system's accuracy.
The fourth contribution is a novel and secure architecture that integrates
the advantages offered by edge computing, artificial intelligence (AI), IoT
end-devices, and blockchain. This new architecture can monitor the
environment, collect data, analyse and process it using an AI-expert engine,
provide predictions and actionable outcomes, and finally share the results on
a public blockchain platform. The pandemic caused by the wide and rapid
spread of the novel coronavirus COVID-19 was used as a use-case
implementation to test and evaluate the proposed system. While providing the
AI engine with trusted data, the system achieved an accuracy of 95%. This was
achieved while the AI engine required only a 7% increase in power
consumption. This demonstrates the system's ability to protect data and
support the AI system, improving overall IoT security with limited impact on
the IoT devices.
The fifth and final contribution is enhancing the security of HDPoA through
the integration of a hardware secure module (HSM) and a hardware wallet (HW).
A performance evaluation of the energy consumption of nodes equipped with the
HSM and HW, together with a security analysis, was conducted. In addition to
enhancing the nodes' security, the HSM can be used to sign more than 120
bytes/joule and encrypt up to 100 bytes/joule, while the HW can be used to
sign up to 90 bytes/joule and encrypt up to 80 bytes/joule. The results and
analyses demonstrated that the HSM and HW enhance the security of HDPoA and
can also be utilised within IoT-blockchain applications, providing
much-needed security in terms of confidentiality, trust in devices, and
attack deterrence.
The above contributions showed that blockchain can be integrated into IoT
systems. They also showed that blockchain can successfully support the
integration of other technologies such as AI, IoT end devices, and edge
computing into one system, thus allowing organisations and users to benefit
greatly from a resilient, distributed, decentralised, self-managed, robust,
and secure system.
Decentralized Hyper-Gradient Computation over Time-Varying Directed Networks
This paper addresses the communication issues that arise when estimating
hyper-gradients in decentralized federated learning (FL). Hyper-gradients in
decentralized FL quantify how the performance of the globally shared optimal
model is influenced by perturbations in clients' hyper-parameters. In prior
work, clients trace
this influence through the communication of Hessian matrices over a static
undirected network, resulting in (i) excessive communication costs and (ii)
inability to make use of more efficient and robust networks, namely,
time-varying directed networks. To solve these issues, we introduce an
alternative optimality condition for FL using an averaging operation on model
parameters and gradients. We then employ Push-Sum as the averaging operation,
which is a consensus optimization technique for time-varying directed networks.
As a result, the hyper-gradient estimator derived from our optimality condition
enjoys two desirable properties: (i) it only requires Push-Sum communication of
vectors and (ii) it can operate over time-varying directed networks. We confirm
the convergence of our estimator to the true hyper-gradient both theoretically
and empirically, and we further demonstrate that it enables two novel
applications: decentralized influence estimation and personalization over
time-varying networks.
Comment: Under review
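The Push-Sum averaging operation the paper builds on can be sketched as follows: each node splits a (value, weight) pair among itself and its current out-neighbours, and the ratio value/weight at every node converges to the network-wide average even as the directed edge set changes each round. The three-node graph sequence below is illustrative, not from the paper.

```python
# Minimal Push-Sum sketch over a time-varying directed network.

def push_sum_step(values, weights, out_edges):
    """One round: every node splits its pair equally among itself + out-neighbours."""
    n = len(values)
    new_v, new_w = [0.0] * n, [0.0] * n
    for i in range(n):
        targets = [i] + out_edges[i]          # node always "sends" to itself
        share_v = values[i] / len(targets)
        share_w = weights[i] / len(targets)
        for j in targets:
            new_v[j] += share_v
            new_w[j] += share_w
    return new_v, new_w

values = [1.0, 5.0, 9.0]                      # true average: 5.0
weights = [1.0, 1.0, 1.0]
# Time-varying directed graphs: the edge set changes every round.
graphs = [
    {0: [1], 1: [2], 2: [0]},
    {0: [2], 1: [0], 2: [1]},
] * 25
for edges in graphs:
    values, weights = push_sum_step(values, weights, edges)
estimates = [v / w for v, w in zip(values, weights)]
print(estimates)  # all close to 5.0
```

Only vectors (here, scalars) are communicated per round, which mirrors the paper's point that Push-Sum avoids exchanging full Hessian matrices while tolerating directed, time-varying topologies.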
Synchronization of data in heterogeneous decentralized systems
Data synchronization is the problem of reconciling the differences between large data stores that differ in a small number of records. It is a common thread among disparate distributed systems ranging from fleets of Internet of Things (IoT) devices to clusters of distributed databases in the cloud. Most recently, data synchronization has arisen in globally distributed public blockchains that build the basis for the envisioned decentralized Internet of the future. Moreover, the parallel development of edge computing has significantly increased the heterogeneity of networks and computing devices. The merger of highly heterogeneous system resources and the decentralized nature of future Internet applications calls for a new approach to data synchronization. In this dissertation, we look at the problem of data synchronization through the prism of set reconciliation and introduce novel tools and protocols that improve the performance of data synchronization in heterogeneous decentralized systems.
First, we compare the analytical properties of the state-of-the-art set reconciliation protocols, and investigate the impact of theoretical assumptions and implementation decisions on synchronization performance. Second, we introduce GenSync, the first unified set reconciliation middleware. Using GenSync's distinctive benchmarking layer, we find that the best protocol choice is highly sensitive to the system conditions, and that a bad protocol choice causes a severe hit in performance. We showcase the evaluative power of GenSync in one of the world's largest wireless network emulators, demonstrating how to choose the best GenSync protocol under high and low user mobility in an emulated cellular network. Finally, we introduce SREP (Set Reconciliation-Enhanced Propagation), a novel blockchain transaction pool synchronization protocol with quantifiable guarantees. Through simulations, we show that SREP incurs significantly smaller bandwidth overhead than a similar approach from the literature, especially in networks of realistic size (tens of thousands of participants).
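The problem the protocols above solve can be made concrete with the naive baseline: each side ships fingerprints of all its records and computes what it is missing. Its communication cost grows with the full store size, whereas the set reconciliation protocols this work compares aim for cost proportional only to the (small) symmetric difference. The record names below are illustrative.

```python
# Naive set reconciliation baseline: exchange fingerprints of ALL records.
import hashlib

def fingerprint(record: str) -> str:
    """Short content fingerprint of a record."""
    return hashlib.sha256(record.encode()).hexdigest()[:16]

def reconcile(store_a, store_b):
    """Return the records each side must fetch so both stores become equal."""
    fp_a = {fingerprint(r): r for r in store_a}
    fp_b = {fingerprint(r): r for r in store_b}
    a_needs = sorted(fp_b[h] for h in fp_b.keys() - fp_a.keys())
    b_needs = sorted(fp_a[h] for h in fp_a.keys() - fp_b.keys())
    return a_needs, b_needs

a = {"tx1", "tx2", "tx3"}
b = {"tx2", "tx3", "tx4"}
print(reconcile(a, b))  # (['tx4'], ['tx1'])
```

When stores hold millions of records but differ in only a handful, shipping every fingerprint is wasteful; that gap is exactly what difference-proportional protocols close.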