TorchRL: A data-driven decision-making library for PyTorch
Striking a balance between integration and modularity is crucial for a
machine learning library to be versatile and user-friendly, especially in
handling decision and control tasks that involve large development teams and
complex, real-world data, and environments. To address this issue, we propose
TorchRL, a general-purpose control library for PyTorch that provides
well-integrated, yet standalone components. With a versatile and robust
primitive design, TorchRL facilitates streamlined algorithm development across
the many branches of Reinforcement Learning (RL) and control. We introduce a
new PyTorch primitive, TensorDict, as a flexible data carrier that empowers the
integration of the library's components while preserving their modularity.
Hence replay buffers, datasets, distributed data collectors, environments,
transforms and objectives can be effortlessly used in isolation or combined. We
provide a detailed description of the building blocks, supporting code examples
and an extensive overview of the library across domains and tasks. Finally, we
show comparative benchmarks to demonstrate its computational efficiency.
TorchRL fosters long-term support and is publicly available on GitHub for
greater reproducibility and collaboration within the research community. The
code is open-sourced at https://github.com/pytorch/rl.
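To make the TensorDict idea concrete, the sketch below shows a dictionary-like carrier that enforces a shared leading batch dimension across all entries, so that collectors, buffers, and losses can exchange a single object. This is a simplified pure-Python illustration of the concept, not the real `tensordict.TensorDict` API (which operates on `torch.Tensor` objects and supports devices, nesting, and vectorized ops).

```python
# Minimal sketch of the TensorDict concept: one container, one shared
# batch dimension, indexable as a whole. Hypothetical simplification.

class MiniTensorDict:
    def __init__(self, data, batch_size):
        self.batch_size = batch_size
        self.data = {}
        for key, value in data.items():
            self[key] = value

    def __setitem__(self, key, value):
        # Every entry must agree on the leading (batch) dimension.
        if len(value) != self.batch_size:
            raise ValueError(f"{key!r} has leading dim {len(value)}, "
                             f"expected {self.batch_size}")
        self.data[key] = value

    def __getitem__(self, key):
        if isinstance(key, int):
            # Integer indexing indexes every entry at once.
            return {k: v[key] for k, v in self.data.items()}
        return self.data[key]

# A data collector could emit a batch of transitions as one object:
batch = MiniTensorDict(
    {"observation": [[0.1, 0.2], [0.3, 0.4]], "reward": [1.0, 0.0]},
    batch_size=2,
)
print(batch[0])  # {'observation': [0.1, 0.2], 'reward': 1.0}
```

Because every component reads and writes the same carrier, a replay buffer or objective never needs to know which keys an environment produces; that is the integration-with-modularity trade-off the abstract describes.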
Artificial Intelligence Enabled Project Management: A Systematic Literature Review
In the Industry 5.0 era, companies are leveraging the potential of cutting-edge technologies such as artificial intelligence for more efficient, green, human-centric production. Similarly, project management would benefit from artificial intelligence in achieving project goals by improving project performance and, consequently, reaching greater sustainable success. In this context, this paper examines the role of artificial intelligence in emerging project management through a systematic literature review; the applications of AI techniques across the project management performance domains are presented. The results show that the number of influential publications on artificial intelligence-enabled project management has increased significantly over the last decade. The findings indicate that artificial intelligence, predominantly machine learning, can be considerably useful in the management of construction and IT projects; it is notably promising for enhancing the planning, measurement, and uncertainty performance domains by providing forecasting and decision-making capabilities.
Learning Transformer Programs
Recent research in mechanistic interpretability has attempted to
reverse-engineer Transformer models by carefully inspecting network weights and
activations. However, these approaches require considerable manual effort and
still fall short of providing complete, faithful descriptions of the underlying
algorithms. In this work, we introduce a procedure for training Transformers
that are mechanistically interpretable by design. We build on RASP [Weiss et
al., 2021], a programming language that can be compiled into Transformer
weights. Instead of compiling human-written programs into Transformers, we
design a modified Transformer that can be trained using gradient-based
optimization and then be automatically converted into a discrete,
human-readable program. We refer to these models as Transformer Programs. To
validate our approach, we learn Transformer Programs for a variety of problems,
including an in-context learning task, a suite of algorithmic problems (e.g.
sorting, recognizing Dyck-languages), and NLP tasks including named entity
recognition and text classification. The Transformer Programs can automatically
find reasonable solutions, performing on par with standard Transformers of
comparable size; and, more importantly, they are easy to interpret. To
demonstrate these advantages, we convert Transformers into Python programs and
use off-the-shelf code analysis tools to debug model errors and identify the
``circuits'' used to solve different sub-problems. We hope that Transformer
Programs open a new path toward the goal of intrinsically interpretable machine
learning. Our code and example Transformer Programs are available at
https://github.com/princeton-nlp/TransformerProgram
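The flavor of the discrete, human-readable programs involved can be illustrated with the RASP-style primitives of Weiss et al. [2021]: `select` builds a hard (0/1) attention pattern from a predicate, and `aggregate` averages values under it. The sketch below is a simplified illustration of these primitives, not the exact interface of the paper's released code.

```python
# RASP-style primitives: hard attention (`select`) plus averaging
# (`aggregate`) -- the building blocks a Transformer Program compiles to.

def select(keys, queries, predicate):
    # attention[q][k] is True where predicate(key, query) holds
    return [[predicate(k, q) for k in keys] for q in queries]

def aggregate(attention, values):
    # Each query position averages the values it selected.
    out = []
    for row in attention:
        picked = [v for sel, v in zip(row, values) if sel]
        out.append(sum(picked) / len(picked) if picked else 0.0)
    return out

# Example discrete "program": copy each position's predecessor, one
# interpretable step such a program might use in a sorting circuit.
indices = list(range(4))
prev = select(indices, indices, lambda k, q: k == q - 1)
vals = [1.0, 2.0, 3.0, 4.0]
shifted = aggregate(prev, vals)
print(shifted)  # [0.0, 1.0, 2.0, 3.0]
```

Because both primitives are discrete, the resulting composition can be printed and stepped through like ordinary code, which is what makes the learned models debuggable with off-the-shelf analysis tools.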
SPARLING: Learning Latent Representations with Extremely Sparse Activations
Real-world processes often contain intermediate state that can be modeled as
an extremely sparse tensor. We introduce Sparling, a technique that allows you
to learn models with intermediate layers that match this state from only
end-to-end labeled examples (i.e., no supervision on the intermediate state).
Sparling uses a new kind of informational bottleneck that enforces levels of
activation sparsity unachievable using other techniques. We find that extreme
sparsity is necessary to achieve good intermediate state modeling. On our
synthetic DigitCircle domain as well as the LaTeX-OCR and Audio-MNIST-Sequence
domains, we are able to precisely localize the intermediate states up to
feature permutation with > 90% accuracy, even though we only train end-to-end.
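The core mechanism, an activation bottleneck that zeroes all but a tiny fraction of an intermediate layer, can be sketched as follows. Here the threshold is taken from the activations themselves so that only a target fraction survives; the paper's adaptive thresholding schedule and training details are omitted, so treat this as a hypothetical simplification.

```python
# Sketch of an extreme-sparsity bottleneck: keep only the top fraction
# of activations (by magnitude of value), zero the rest. Ties at the
# threshold may let a few extra values through; ignored for clarity.

def sparsify(activations, target_density=0.02):
    n_keep = max(1, int(len(activations) * target_density))
    # Threshold = value of the n_keep-th largest activation.
    threshold = sorted(activations, reverse=True)[n_keep - 1]
    return [a if a >= threshold else 0.0 for a in activations]

acts = [0.01 * i for i in range(100)]  # 100 activations: 0.00 .. 0.99
sparse = sparsify(acts, target_density=0.02)
print(sum(1 for a in sparse if a != 0.0))  # 2 nonzero entries survive
```

Driving the density this low forces the layer to encode only the few "events" present in the input, which is what lets the intermediate state line up with the true sparse state up to a permutation of features.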
Improving Optimization of Convolutional Neural Networks through Parameter Fine-tuning
In recent years, convolutional neural networks have achieved state-of-the-art performance on a number of computer vision problems such as image classification. Prior research has shown that a transfer learning technique known as parameter fine-tuning, wherein a network is pre-trained on a different dataset, can boost the performance of these networks. However, the topic of identifying the best source dataset and learning strategy for a given target domain is largely unexplored. Thus, this research presents and evaluates various transfer learning methods for fine-grained image classification, as well as their effect on ensemble networks. The results clearly demonstrate the effectiveness of parameter fine-tuning over random initialization. We find that training should not be reduced after transferring weights, that larger, more similar networks tend to be the best source tasks, and that parameter fine-tuning can often outperform randomly initialized ensembles. The experimental framework and findings will help to train models with improved accuracy.
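The fine-tuning initialization described above amounts to copying the source network's parameters and re-initializing only the task-specific output layer before continuing training at full length. The sketch below shows that step with weights as plain Python dicts; the layer names and shapes are hypothetical placeholders, not any particular framework's API.

```python
# Sketch of parameter fine-tuning initialization: transfer all
# pre-trained layers, re-initialize only the classification head.
import random

def fine_tune_init(source_weights, n_target_classes, head="classifier"):
    target = dict(source_weights)            # transfer every layer...
    target[head] = [random.gauss(0.0, 0.01)  # ...but re-init the head
                    for _ in range(n_target_classes)]
    return target

pretrained = {"conv1": [0.5, -0.2], "conv2": [0.1, 0.3], "classifier": [9.9]}
model = fine_tune_init(pretrained, n_target_classes=4)
# Feature layers are copied verbatim; only the classifier is fresh.
```

Per the abstract's finding, the transferred model should then be trained with an undiminished schedule rather than a shortened one.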
Enhancing Word Representation Learning with Linguistic Knowledge
Representation learning, the process whereby representations are modelled from data, has recently become a central part of Natural Language Processing (NLP). Among the most widely used learned representations are word embeddings trained on large corpora of unannotated text, where the learned embeddings are treated as general representations that can be used across multiple NLP tasks. Despite their empirical successes, word embeddings learned entirely from data can only capture patterns of language usage from the particular linguistic domain of the training data. Linguistic knowledge, which does not vary among linguistic domains, can potentially be used to address this limitation. The vast sources of linguistic knowledge that are readily available nowadays can help train more general word embeddings (i.e. less affected by distance between linguistic domains) by providing them with such information as semantic relations, syntactic structure, word morphology, etc.
In this research, I investigate the different ways in which word embedding models capture and encode words’ semantic and contextual information. To this end, I propose two approaches to integrate linguistic knowledge into the statistical learning of word embeddings. The first approach is based on augmenting the training data for a well-known Skip-gram word embedding model, where synonym information is extracted from a lexical knowledge base and incorporated into the training data in the form of additional training examples. This data augmentation approach seeks to enforce synonym relations in the learned embeddings. The second approach exploits structural information in text by transforming every sentence in the data into its corresponding dependency parse tree and training an autoencoder to recover the original sentence. While learning a mapping from a dependency parse tree to its originating sentence, this novel Structure-to-Sequence (Struct2Seq) model produces word embeddings that contain information about a word’s structural context. Given that the combination of knowledge and statistical methods can often be unpredictable, a central focus of this thesis is on understanding the effects of incorporating linguistic knowledge into word representation learning. Through the use of intrinsic (geometric characteristics) and extrinsic (performance on downstream tasks) evaluation metrics, I aim to measure the specific influence that the injected knowledge can have on different aspects of the informational composition of word embeddings.
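The first approach, synonym-based data augmentation for Skip-gram, can be sketched as generating extra (target, context) pairs in which the target word is replaced by its synonyms from a lexical resource. The tiny lexicon below is a hypothetical stand-in for a real knowledge base such as WordNet, and the pair extraction is the standard Skip-gram windowing, not the thesis's exact pipeline.

```python
# Sketch of synonym augmentation for Skip-gram training data: every
# (target, context) pair spawns extra pairs with synonym targets, so the
# synonym ends up trained against the same contexts as the original word.

SYNONYMS = {"big": ["large", "huge"]}  # placeholder lexicon

def skipgram_pairs(sentence, window=1):
    pairs = []
    for i, target in enumerate(sentence):
        for j in range(max(0, i - window),
                       min(len(sentence), i + window + 1)):
            if j != i:
                pairs.append((target, sentence[j]))
    return pairs

def augment(pairs, lexicon):
    extra = [(syn, ctx) for tgt, ctx in pairs
             for syn in lexicon.get(tgt, [])]
    return pairs + extra

base = skipgram_pairs(["a", "big", "dog"])
augmented = augment(base, SYNONYMS)
# ("big", "a") and ("big", "dog") each spawn pairs for "large" and "huge",
# pushing the synonyms toward the same region of the embedding space.
```

Because "large" and "huge" now share training contexts with "big", the Skip-gram objective pulls their embeddings together, which is exactly the synonym relation the augmentation is meant to enforce.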
A guide to machine learning for biologists
The expanding scale and inherent complexity of biological data have encouraged a growing use of machine learning in biology to build informative and predictive models of the underlying biological processes. All machine learning techniques fit models to data; however, the specific methods are quite varied and can at first glance seem bewildering. In this Review, we aim to provide readers with a gentle introduction to a few key machine learning techniques, including the most recently developed and widely used techniques involving deep neural networks. We describe how different techniques may be suited to specific types of biological data, and also discuss some best practices and points to consider when one is embarking on experiments involving machine learning. Some emerging directions in machine learning methodology are also discussed.