
    Data Augmentation Techniques and Transfer Learning Approaches Applied to Facial Expressions Recognition Systems

    Facial expression is the first thing we pay attention to when we want to understand a person’s state of mind, so the ability to recognize facial expressions automatically is a very interesting research field. In this paper, because of the small size of the available training datasets, we propose a novel data augmentation technique that improves performance on the recognition task. We apply geometrical transformations and build GAN models from scratch that generate new synthetic images for each emotion type. On the augmented datasets we then fine-tune pretrained convolutional neural networks with different architectures. To measure the generalization ability of the models, we apply an extra-database protocol: we train models on the augmented versions of the training dataset and test them on two different databases. The combination of these techniques allows us to reach average accuracy values of the order of 85% for the InceptionResNetV2 model.
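
    As an illustration of the fine-tuning stage described above, the sketch below freezes a pretrained InceptionResNetV2 backbone and trains an emotion classifier on top, with geometric transformations applied on the fly. The image size, the seven-class emotion setup, and the optimizer settings are illustrative assumptions, not the authors' exact configuration.

```python
# A minimal sketch of geometric augmentation + fine-tuning with Keras.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionResNetV2

NUM_CLASSES = 7  # assumption: one class per basic emotion

# Geometric transformations applied on the fly to enlarge the training set.
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomTranslation(0.05, 0.05),
])

# Pretrained backbone; frozen first, with only the new head trained.
base = InceptionResNetV2(include_top=False, weights="imagenet",
                         input_shape=(299, 299, 3), pooling="avg")
base.trainable = False

model = models.Sequential([augment, base,
                           layers.Dense(NUM_CLASSES, activation="softmax")])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=...)
# Extra-database testing then evaluates on datasets never seen in training.
```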

    Fast Vocabulary Transfer for Language Model Compression

    Real-world business applications require a trade-off between language model performance and size. We propose a new method for model compression that relies on vocabulary transfer. We evaluate the method on various vertical domains and downstream tasks. Our results indicate that vocabulary transfer can be used effectively in combination with other compression techniques, yielding a significant reduction in model size and inference time while only marginally compromising performance.
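
    A minimal sketch of one common vocabulary-transfer strategy: each token of the new (smaller, in-domain) vocabulary is initialized with the average embedding of its sub-tokens under the original tokenizer. Whether this matches the paper's exact procedure is an assumption, and the model and tokenizer paths are placeholders.

```python
# Vocabulary transfer sketch with HuggingFace transformers (assumed setup).
import torch
from transformers import AutoModel, AutoTokenizer

old_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
new_tok = AutoTokenizer.from_pretrained("path/to/in-domain-tokenizer")  # placeholder
model = AutoModel.from_pretrained("bert-base-uncased")

old_emb = model.get_input_embeddings().weight.data
new_emb = torch.zeros(len(new_tok), old_emb.size(1))

for token, idx in new_tok.get_vocab().items():
    # Re-tokenize each new token with the old tokenizer and average
    # the embeddings of the resulting sub-tokens.
    ids = old_tok.convert_tokens_to_ids(old_tok.tokenize(token))
    if ids:
        new_emb[idx] = old_emb[ids].mean(dim=0)

model.resize_token_embeddings(len(new_tok))
model.get_input_embeddings().weight.data.copy_(new_emb)
# The model would then be fine-tuned on the in-domain corpus with new_tok.
```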

    An energy-based comparative analysis of common approaches to text classification in the Legal domain

    Full text link
    Most Machine Learning research evaluates the best solutions in terms of performance. However, in the race for the best-performing model, many important aspects are often overlooked when, on the contrary, they should be carefully considered. In fact, the gaps in performance between different approaches are sometimes negligible, whereas factors such as production costs, energy consumption, and carbon footprint must be taken into consideration. Large Language Models (LLMs) are extensively adopted to address NLP problems in academia and industry. In this work, we present a detailed quantitative comparison of LLMs and traditional approaches (e.g. SVM) on the LexGLUE benchmark, which takes into account both performance (standard indices) and alternative metrics such as timing, power consumption, and cost, in a word: the carbon footprint. In our analysis, we considered the prototyping phase (model selection by training-validation-test iterations) and the in-production phase separately, since they follow different implementation procedures and also require different resources. The results indicate that, very often, the simplest algorithms achieve performance very close to that of large LLMs but with very low power consumption and lower resource demands. These results suggest that companies should include such additional evaluations when choosing Machine Learning (ML) solutions.
    Comment: Accepted at The 4th International Conference on NLP & Text Mining (NLTM 2024), January 27-28 2024, Copenhagen, Denmark. 12 pages, 1 figure, 7 tables.
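
    The sketch below shows one way such measurements can be instrumented, assuming the codecarbon library for energy/emissions tracking (the paper does not specify its tooling here) and a linear SVM over TF-IDF features as the traditional baseline.

```python
# Tracking estimated emissions while training a simple text-classification
# baseline. LexGLUE tasks are available via HuggingFace datasets; data
# loading is omitted here.
from codecarbon import EmissionsTracker
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

def train_baseline(train_texts, train_labels):
    vec = TfidfVectorizer(max_features=50_000)
    X = vec.fit_transform(train_texts)
    clf = LinearSVC().fit(X, train_labels)
    return vec, clf

tracker = EmissionsTracker(project_name="lexglue-svm-baseline")
tracker.start()
# vec, clf = train_baseline(train_texts, train_labels)
emissions_kg = tracker.stop()  # estimated CO2-equivalent emissions in kg
print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
```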

    A novel integrated industrial approach with cobots in the age of industry 4.0 through conversational interaction and computer vision

    From robots that replace workers to robots that serve as helpful colleagues, the field of robotic automation is experiencing a new trend that represents a huge challenge for component manufacturers. This contribution starts from an innovative vision of ever closer collaboration between the cobot, able to perform a specific physical job with precision, the AI world, able to analyze information and support the decision-making process, and the human, able to maintain a strategic vision of the future.

    A semi-supervised document clustering algorithm based on EM

    Document clustering is a very hard task in Automatic Text Processing since it requires extracting regular patterns from a document collection without a priori knowledge of the category structure. This task can be difficult for humans as well, because many different but valid partitions may exist for the same collection. Moreover, the lack of information about categories makes it difficult to apply effective feature selection techniques to reduce the noise in the representation of texts. Despite these intrinsic difficulties, text clustering is an important task for Web search applications, in which huge collections or quite long query result lists must be organized automatically. Semi-supervised clustering lies in between automatic categorization and auto-organization: the supervisor is not required to specify a set of classes, but only to provide a set of texts grouped by the criteria to be used to organize the collection. In this paper we present a novel algorithm for clustering text documents which exploits the EM algorithm together with a feature selection technique based on Information Gain. The experimental results show that only very few documents are needed to initialize the clusters and that the algorithm is able to properly extract the regularities hidden in a huge unlabeled collection.
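
    The sketch below illustrates the general idea under stated assumptions: Information Gain (approximated here by mutual information) computed on a handful of labeled seed documents drives feature selection, and the seed centroids initialize an EM-fitted mixture model. It is an illustration built from scikit-learn components, not the paper's exact formulation.

```python
# Seeding EM-based clustering with a few labeled documents (sketch).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.mixture import GaussianMixture

def seeded_em_clustering(all_texts, seed_texts, seed_labels, k, n_features=300):
    # seed_labels are assumed to be integers 0..k-1.
    vec = TfidfVectorizer()
    X_all = vec.fit_transform(all_texts).toarray()
    X_seed = vec.transform(seed_texts).toarray()

    # Information-Gain-style feature selection computed on the seeds only.
    selector = SelectKBest(mutual_info_classif, k=n_features)
    selector.fit(X_seed, seed_labels)
    X_all, X_seed = selector.transform(X_all), selector.transform(X_seed)

    # Initialize each mixture component with the centroid of its seed group,
    # then let EM refine the clusters on the whole unlabeled collection.
    means = np.stack([X_seed[np.array(seed_labels) == c].mean(axis=0)
                      for c in range(k)])
    gmm = GaussianMixture(n_components=k, means_init=means,
                          covariance_type="diag")
    return gmm.fit_predict(X_all)
```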

    SAGE: Semantic-Aware Global Explanations for Named Entity Recognition

    In this paper, we present a post-hoc method to produce highly interpretable global rules that explain NLP classifiers. Rules are extracted with a data mining approach on a semantically enriched input representation, instead of using words/wordpieces alone. Semantic information yields more abstract and general rules that are both more explanatory and less complex, while also better reflecting the model's behaviour.
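
    As a rough illustration of the post-hoc idea, the sketch below fits an interpretable surrogate (a shallow decision tree, standing in for the paper's data-mining rule extractor) on a semantically enriched representation, using the black-box model's predictions as targets, and reads off global rules. All function and feature names are assumptions.

```python
# Surrogate-based global rule extraction (sketch, not the SAGE algorithm).
from sklearn.tree import DecisionTreeClassifier, export_text

def extract_global_rules(enriched_features, blackbox_predictions, feature_names):
    """enriched_features: matrix of semantic features (e.g. hypernym classes,
    POS tags, gazetteer flags) rather than raw wordpieces; the targets are
    the labels predicted by the black-box NER model."""
    surrogate = DecisionTreeClassifier(max_depth=4)  # shallow => readable rules
    surrogate.fit(enriched_features, blackbox_predictions)
    return export_text(surrogate, feature_names=feature_names)
```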

    Learning similarities for text documents using neural networks

    Learning algorithms for neural networks follow either a supervised or an unsupervised scheme. In the first case a teacher provides a target for each pattern in the learning set, while in the second no target is provided and the learning process aims at fitting the network to the data distribution. In this paper we propose a learning algorithm which, in a sense, lies in between these two schemes. The supervisor does not provide a specific target for each example but instead specifies a set of relationships among pairs of input patterns. The neural network is trained to map the examples into points of the output space that meet the topological constraints related to the relationships provided by the supervisor. This algorithm is applied to realize an adaptive dimensionality reduction for the vector space representation of text documents. We present a set of experimental results showing that the algorithm is capable of exploiting the semantic relationships implicitly contained in the user’s feedback.
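
    Cast in modern terms, the scheme resembles a siamese network trained with a contrastive objective: the supervisor marks pairs of documents as related or unrelated, and the network maps documents into a low-dimensional space where related pairs end up close. The sketch below is a minimal PyTorch rendering under that reading; the architecture and margin are illustrative choices, not the paper's.

```python
# Pairwise-constraint embedding sketch (siamese network, contrastive loss).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, out_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(vocab_size, 128), nn.ReLU(),
                                 nn.Linear(128, out_dim))

    def forward(self, x):
        return self.net(x)

def contrastive_loss(z1, z2, related, margin=1.0):
    """related: 1.0 if the supervisor marked the pair as similar, else 0.0."""
    d = torch.norm(z1 - z2, dim=1)
    return (related * d.pow(2) +
            (1 - related) * torch.clamp(margin - d, min=0).pow(2)).mean()

# enc = Encoder(vocab_size=10_000)
# loss = contrastive_loss(enc(x1), enc(x2), related_flags)
```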

    Enhancing Modern Supervised Word Sense Disambiguation Models by Semantic Lexical Resources

    Supervised models for Word Sense Disambiguation (WSD) currently yield state-of-the-art results on the most popular benchmarks. Despite the recent introduction of Word Embeddings and Recurrent Neural Networks to design powerful context-related features, the interest in improving WSD models using Semantic Lexical Resources (SLRs) is mostly restricted to knowledge-based approaches. In this paper, we enhance “modern” supervised WSD models by exploiting two popular SLRs: WordNet and WordNet Domains. We propose an effective way to introduce semantic features into the classifiers, and we consider using the SLR structure to augment the training data. We study the effect of different types of semantic features, investigating their interaction with local contexts encoded by means of mixtures of Word Embeddings or Recurrent Neural Networks, and we extend the proposed model into a novel multi-layer architecture for WSD. A detailed experimental comparison in the recent Unified Evaluation Framework (Raganato et al., 2017) shows that the proposed approach leads to supervised models that compare favourably with the state of the art.
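
    A minimal sketch of one way semantic-lexical features can be injected into a supervised WSD classifier: WordNet supersenses (lexicographer file names) of the context words serve as extra features, standing in for the richer WordNet / WordNet Domains features used in the paper. The feature scheme is an assumption for illustration.

```python
# Bag-of-supersense context features from WordNet via NLTK.
# Requires: nltk.download('wordnet')
from collections import Counter
from nltk.corpus import wordnet as wn

def supersense_features(context_words):
    """Bag of WordNet supersenses (e.g. 'noun.artifact') over the context."""
    counts = Counter()
    for w in context_words:
        for syn in wn.synsets(w):
            counts[syn.lexname()] += 1
    return counts

# These counts can be vectorized (e.g. with sklearn's DictVectorizer) and
# concatenated with embedding-based context features before classification.
print(supersense_features(["bank", "river", "water"]))
```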