121 research outputs found
MsPrompt: Multi-step Prompt Learning for Debiasing Few-shot Event Detection
Event detection (ED) is aimed to identify the key trigger words in
unstructured text and predict the event types accordingly. Traditional ED
models are too data-hungry to accommodate real applications with scarce labeled
data. Besides, typical ED models are facing the context-bypassing and disabled
generalization issues caused by the trigger bias stemming from ED datasets.
Therefore, we focus on the true few-shot paradigm to satisfy the low-resource
scenarios. In particular, we propose a multi-step prompt learning model
(MsPrompt) for debiasing few-shot event detection, that consists of the
following three components: an under-sampling module targeting to construct a
novel training set that accommodates the true few-shot setting, a multi-step
prompt module equipped with a knowledge-enhanced ontology to leverage the event
semantics and latent prior knowledge in the PLMs sufficiently for tackling the
context-bypassing problem, and a prototypical module compensating for the
weakness of classifying events with sparse data and boost the generalization
performance. Experiments on two public datasets ACE-2005 and FewEvent show that
MsPrompt can outperform the state-of-the-art models, especially in the strict
low-resource scenarios reporting 11.43% improvement in terms of weighted
F1-score against the best-performing baseline and achieving an outstanding
debiasing performance
Fully Convolutional Networks for Semantic Segmentation
Convolutional networks are powerful visual models that yield hierarchies of
features. We show that convolutional networks by themselves, trained
end-to-end, pixels-to-pixels, exceed the state-of-the-art in semantic
segmentation. Our key insight is to build "fully convolutional" networks that
take input of arbitrary size and produce correspondingly-sized output with
efficient inference and learning. We define and detail the space of fully
convolutional networks, explain their application to spatially dense prediction
tasks, and draw connections to prior models. We adapt contemporary
classification networks (AlexNet, the VGG net, and GoogLeNet) into fully
convolutional networks and transfer their learned representations by
fine-tuning to the segmentation task. We then define a novel architecture that
combines semantic information from a deep, coarse layer with appearance
information from a shallow, fine layer to produce accurate and detailed
segmentations. Our fully convolutional network achieves state-of-the-art
segmentation of PASCAL VOC (20% relative improvement to 62.2% mean IU on 2012),
NYUDv2, and SIFT Flow, while inference takes one third of a second for a
typical image.Comment: to appear in CVPR (2015
Deterministic Parallel Global Parameter Estimation for a Model of the Budding Yeast Cell Cycle
Two parallel deterministic direct search algorithms are used to find improved
parameters for a system of differential equations designed to simulate the cell cycle of budding
yeast. Comparing the model simulation results to experimental data is difficult because most of
the experimental data is qualitative rather than quantitative. An algorithm to convert simulation
results to mutant phenotypes is presented. Vectors of parameters defining the differential equation
model are rated by a discontinuous objective function. Parallel results on a 2200 processor
supercomputer are presented for a global optimization algorithm, DIRECT, a local optimization
algorithm, MADS, and a hybrid of the two
Network traffic classification : from theory to practice
Since its inception until today, the Internet has been in constant transformation. The analysis and monitoring of data networks try to shed some light on this huge black box of interconnected computers. In particular, the classification of the network traffic has become crucial for understanding the Internet. During the last years, the research community has proposed many solutions to accurately identify and classify the network traffic. However, the continuous evolution of Internet applications and their techniques to avoid detection make their identification a very challenging task, which is far from being completely solved.
This thesis addresses the network traffic classification problem from a more practical point of view, filling the gap between the real-world requirements from the network industry, and the research carried out.
The first block of this thesis aims to facilitate the deployment of existing techniques in production networks. To achieve this goal, we study the viability of using NetFlow as input in our classification technique, a monitoring protocol already implemented in most routers. Since the application of packet sampling has become almost mandatory in large networks, we also study its impact on the classification and propose a method to improve the accuracy in this scenario. Our results show that it is possible to achieve high accuracy with both sampled and unsampled NetFlow data, despite the limited information provided by NetFlow.
Once the classification solution is deployed it is important to maintain its accuracy over time. Current network traffic classification techniques have to be regularly updated to adapt them to traffic changes. The second block of this thesis focuses on this issue with the goal of automatically maintaining the classification solution without human intervention. Using the knowledge of the first block, we propose a classification solution that combines several techniques only using Sampled NetFlow as input for the classification. Then, we show that classification models suffer from temporal and spatial obsolescence and, therefore, we design an autonomic retraining system that is able to automatically update the models and keep the classifier accurate along time.
Going one step further, we introduce next the use of stream-based Machine Learning techniques for network traffic classification. In particular, we propose a classification solution based on Hoeffding Adaptive Trees. Apart from the features of stream-based techniques (i.e., process an instance at a time and inspect it only once, with a predefined amount of memory and a bounded amount of time), our technique is able to automatically adapt to the changes in the traffic by using only NetFlow data as input for the classification.
The third block of this thesis aims to be a first step towards the impartial validation of state-of-the-art classification techniques. The wide range of techniques, datasets, and ground-truth generators make the comparison of different traffic classifiers a very difficult task. To achieve this goal we evaluate the reliability of different Deep Packet Inspection-based techniques (DPI) commonly used in the literature for ground-truth generation. The results we obtain show that some well-known DPI techniques present several limitations that make them not recommendable as a ground-truth generator in their current state. In addition, we publish some of the datasets used in our evaluations to address the lack of publicly available datasets and make the comparison and validation of existing techniques easier
Transfer Learning in Natural Language Processing through Interactive Feedback
Machine learning models cannot easily adapt to new domains and applications. This drawback becomes detrimental for natural language processing (NLP) because language is perpetually changing. Across disciplines and languages, there are noticeable differences in content, grammar, and vocabulary. To overcome these shifts, recent NLP breakthroughs focus on transfer learning. Through clever optimization and engineering, a model can successfully adapt to a new domain or task. However, these modifications are still computationally inefficient or resource-intensive. Compared to machines, humans are more capable at generalizing knowledge across different situations, especially in low-resource ones. Therefore, the research on transfer learning should carefully consider how the user interacts with the model. The goal of this dissertation is to investigate “human-in-the-loop” approaches for transfer learning in NLP.
First, we design annotation frameworks for inductive transfer learning, which is the transfer of models across tasks. We create an interactive topic modeling system for users to find topics useful for classifying documents in multiple languages. The user-constructed topic model bridges improves classification accuracy and bridges cross-lingual gaps in knowledge. Next, we look at popular language models, like BERT, that can be applied to various tasks. While these models are useful, they still require a large amount of labeled data to learn a new task. To reduce labeling, we develop an active learning strategy which samples documents that surprise the language model. Users only need to annotate a small subset of these unexpected documents to adapt the language model for text classification.
Then, we transition to user interaction in transductive transfer learning, which is the transfer of models across domains. We focus our efforts on low-resource languages to develop an interactive system for word embeddings. In this approach, the feedback from bilingual speakers refines the cross-lingual embedding space for classification tasks. Subsequently, we look at domain shift for tasks beyond text classification. Coreference resolution is fundamental for NLP applications, like question-answering and dialogue, but the models are typically trained and evaluated on one dataset. We use active learning to find spans of text in the new domain for users to label. Furthermore, we provide important insights on annotating spans for domain adaptation.
Finally, we summarize the contributions of each chapter. We focus on aspects like the scope of applications and model complexity. We conclude with a discussion of future directions. Researchers may extend the ideas in our thesis to topics like user-centric active learning and proactive learning
Projecting future expansion of invasive species: comparing and improving methodologies for species distribution modeling.
Modeling the distributions of species, especially of invasive species in non-native ranges, involves multiple challenges. Here, we developed some novel approaches to species distribution modeling aimed at reducing the influences of such challenges and improving the realism of projections. We estimated species-environment relationships for Parthenium hysterophorus L. (Asteraceae) with four modeling methods run with multiple scenarios of (i) sources of occurrences and geographically isolated background ranges for absences, (ii) approaches to drawing background (absence) points, and (iii) alternate sets of predictor variables. We further tested various quantitative metrics of model evaluation against biological insight. Model projections were very sensitive to the choice of training dataset. Model accuracy was much improved using a global dataset for model training, rather than restricting data input to the species' native range. AUC score was a poor metric for model evaluation and, if used alone, was not a useful criterion for assessing model performance. Projections away from the sampled space (i.e., into areas of potential future invasion) were very different depending on the modeling methods used, raising questions about the reliability of ensemble projections. Generalized linear models gave very unrealistic projections far away from the training region. Models that efficiently fit the dominant pattern, but exclude highly local patterns in the dataset and capture interactions as they appear in data (e.g., boosted regression trees), improved generalization of the models. Biological knowledge of the species and its distribution was important in refining choices about the best set of projections. A post hoc test conducted on a new Parthenium dataset from Nepal validated excellent predictive performance of our 'best' model. We showed that vast stretches of currently uninvaded geographic areas on multiple continents harbor highly suitable habitats for parthenium. However, discrepancies between model predictions and parthenium invasion in Australia indicate successful management for this globally significant weed
- …