Machine Learning Models that Remember Too Much
Machine learning (ML) is becoming a commodity. Numerous ML frameworks and
services are available to data holders who are not ML experts but want to train
predictive models on their data. It is important that ML models trained on
sensitive inputs (e.g., personal images or documents) not leak too much
information about the training data.
We consider a malicious ML provider who supplies model-training code to the
data holder, does not observe the training, but then obtains white- or
black-box access to the resulting model. In this setting, we design and
implement practical algorithms, some of them very similar to standard ML
techniques such as regularization and data augmentation, that "memorize"
information about the training dataset in the model, yet leave the model as
accurate and predictive as a conventionally trained one. We then explain how
the adversary can extract the memorized information from the model.
We evaluate our techniques on standard ML tasks for image classification
(CIFAR10), face recognition (LFW and FaceScrub), and text analysis (20
Newsgroups and IMDB). In all cases, we show how our algorithms create models
that have high predictive power yet allow accurate extraction of subsets of
their training data.
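As a toy illustration of one such algorithm (a simplified sketch, not the paper's exact construction), a malicious trainer can hide secret bits in the signs of weights the task does not need, by folding an extra penalty term into an otherwise ordinary training loop; all names, dimensions, and constants below are invented for this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 16

# Only the first 4 features matter for the task; the remaining 12 weights
# are "spare capacity" the malicious training code can abuse.
X = rng.normal(size=(n, d))
w_task = np.zeros(d)
w_task[:4] = rng.normal(size=4)
y = (X @ w_task > 0).astype(float)

# Secret the trainer wants the model to carry: one bit per spare weight,
# encoded in that weight's sign.
secret_bits = rng.integers(0, 2, size=12)
signs = np.zeros(d)
signs[4:] = 2 * secret_bits - 1          # map {0,1} -> {-1,+1}

margin, lam, lr = 0.2, 1.0, 0.1
w = np.zeros(d)
for _ in range(3000):
    p = 1 / (1 + np.exp(-np.clip(X @ w, -30, 30)))
    grad = X.T @ (p - y) / n             # ordinary logistic-loss gradient
    # Sign-encoding penalty: while a spare weight's sign disagrees with its
    # secret bit (or sits inside the margin), push it the right way.
    push = (signs != 0) & (signs * w < margin)
    grad = grad - lam * np.where(push, signs, 0.0)
    w -= lr * grad

# A white-box adversary later recovers the secret by reading the signs.
recovered = (np.sign(w[4:]) > 0).astype(int)
train_acc = (((X @ w) > 0) == (y == 1)).mean()
```

Because the penalty only touches weights the task does not rely on, the model's predictive accuracy stays high while every secret bit is recoverable from the trained parameters.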
Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques
specifically developed for analyzing and understanding the inner workings and
representations acquired by neural models of language. Approaches included:
systematic manipulation of input to neural networks and investigating the
impact on their performance, testing whether interpretable knowledge can be
decoded from intermediate representations acquired by neural networks,
proposing modifications to neural network architectures to make their knowledge
state or generated output more explainable, and examining the performance of
networks on simplified or formal languages. Here we review a number of
representative studies in each category.
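The "decoding interpretable knowledge from intermediate representations" approach is commonly realized as a probing (diagnostic) classifier: freeze a network, take its hidden activations, and train a small classifier to predict a property of the input. A minimal sketch, with a hypothetical random network standing in for a real trained model:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d_in, d_hid = 1000, 10, 32

# Stand-in for a trained network's intermediate layer: a fixed random
# projection plus a nonlinearity (hypothetical weights, not a real model).
inputs = rng.normal(size=(n, d_in))
W_hidden = rng.normal(size=(d_in, d_hid)) / np.sqrt(d_in)
reps = np.tanh(inputs @ W_hidden)        # "intermediate representations"

# A property of the input we try to decode from the representations
# (in NLP this would be e.g. a linguistic feature of the sentence).
prop = (inputs[:, 0] > 0).astype(float)

# Linear probe: logistic regression from representations to the property.
w = np.zeros(d_hid)
for _ in range(1000):
    p = 1 / (1 + np.exp(-np.clip(reps @ w, -30, 30)))
    w -= 0.5 * reps.T @ (p - prop) / n

probe_acc = (((reps @ w) > 0) == (prop == 1)).mean()
```

High probe accuracy is read as evidence that the property is (linearly) encoded in that layer's representations; real probing studies also compare against control tasks to rule out the probe itself doing the work.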
A Comprehensive Overview and Comparative Analysis on Deep Learning Models: CNN, RNN, LSTM, GRU
Deep learning (DL) has emerged as a powerful subset of machine learning (ML)
and artificial intelligence (AI), outperforming traditional ML methods,
especially in handling unstructured and large datasets. Its impact spans across
various domains, including speech recognition, healthcare, autonomous vehicles,
cybersecurity, predictive analytics, and more. However, the complexity and
dynamic nature of real-world problems present challenges in designing effective
deep learning models. Consequently, several deep learning models have been
developed to address different problems and applications. In this article, we
conduct a comprehensive survey of various deep learning models, including
Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs),
Generative Models, Deep Reinforcement Learning (DRL), and Deep Transfer
Learning. We examine the structure, applications, benefits, and limitations of
each model. Furthermore, we perform an analysis using three publicly available
datasets: IMDB, ARAS, and Fruit-360. We compare the performance of six renowned
deep learning models: CNN, Simple RNN, Long Short-Term Memory (LSTM),
Bidirectional LSTM, Gated Recurrent Unit (GRU), and Bidirectional GRU.
Comment: 16 pages, 29 figures
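One structural difference among the compared recurrent cells can be made concrete with per-layer parameter counts: an LSTM holds four gate-sized weight blocks, a GRU three, and a simple RNN one, while bidirectional variants double the count. The sketch below uses a single bias vector per gate; exact counts in real frameworks differ slightly (e.g. cuDNN-style cells carry two bias vectors per gate):

```python
# Per-layer parameter counts for input size d and hidden size h,
# counting one input-to-hidden block, one hidden-to-hidden block,
# and one bias vector per gate.
def simple_rnn_params(d, h):
    return h * (d + h) + h            # single candidate transform

def gru_params(d, h):
    return 3 * (h * (d + h) + h)      # update, reset, candidate

def lstm_params(d, h):
    return 4 * (h * (d + h) + h)      # input, forget, output, candidate

def bidirectional(count):
    return 2 * count                  # independent forward + backward cells

d, h = 100, 128                       # example sizes, chosen arbitrarily
counts = {
    "SimpleRNN": simple_rnn_params(d, h),
    "GRU": gru_params(d, h),
    "LSTM": lstm_params(d, h),
    "BiGRU": bidirectional(gru_params(d, h)),
    "BiLSTM": bidirectional(lstm_params(d, h)),
}
```

So for the same hidden size, a GRU layer carries three quarters of an LSTM layer's parameters, which is one reason GRUs are often cheaper to train at similar accuracy.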
Predicting User Interaction on Social Media using Machine Learning
Analysis of Facebook posts provides helpful information for users on social media. Prior work on user engagement explores methods for predicting engagement from Facebook posts, including both text analysis and image analysis, yet these studies have not incorporated text and image data together. This research explores the usefulness of combining image and text data to predict user engagement. The study employs five types of machine learning models: text-based Neural Networks (NN), image-based Convolutional Neural Networks (CNN), Word2Vec, decision trees, and a combination of text-based NN and image-based CNN; each model uses the data differently. The research collects 350k Facebook posts, and the models are trained and tested on advertisement posts to predict user engagement, which includes share count, comment count, and comment sentiment. The study finds that combining image and text data produces the best models, and further demonstrates that the combined models outperform random baselines.
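The benefit of combining modalities can be sketched with a toy late-fusion setup: synthetic stand-ins for text-NN and image-CNN features, and an engagement label that depends on both, so a model over the concatenated features beats either single-modality model. The data, feature sizes, and models here are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Hypothetical stand-ins for learned features: in the paper, text features
# would come from a text NN and image features from a CNN.
text_feat = rng.normal(size=(n, 4))
img_feat = rng.normal(size=(n, 4))

# Simulated engagement label that depends on *both* modalities.
y = ((text_feat[:, 0] + img_feat[:, 0]) > 0).astype(float)

def train_logreg(X, y, steps=500, lr=0.5):
    """Plain logistic regression via gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-np.clip(X @ w, -30, 30)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def accuracy(X, w, y):
    return (((X @ w) > 0) == (y == 1)).mean()

w_text = train_logreg(text_feat, y)            # text-only model
w_img = train_logreg(img_feat, y)              # image-only model
fused = np.hstack([text_feat, img_feat])       # late fusion: concatenate
w_fused = train_logreg(fused, y)               # combined model

acc_text = accuracy(text_feat, w_text, y)
acc_img = accuracy(img_feat, w_img, y)
acc_fused = accuracy(fused, w_fused, y)
```

Each single-modality model only sees half the signal, so its accuracy is capped well below the fused model's; this is the intuition behind combining the text-based NN with the image-based CNN.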