107 research outputs found

    Brazilian Court Documents Clustered by Similarity Together Using Natural Language Processing Approaches with Transformers

    Full text link
    Recent advances in Artificial intelligence (AI) have leveraged promising results in solving complex problems in the area of Natural Language Processing (NLP), being an important tool to help in the expeditious resolution of judicial proceedings in the legal area. In this context, this work targets the problem of detecting the degree of similarity between judicial documents that can be achieved in the inference group, by applying six NLP techniques based on transformers, namely BERT, GPT-2 and RoBERTa pre-trained in the Brazilian Portuguese language and the same specialized using 210,000 legal proceedings. Documents were pre-processed and had their content transformed into a vector representation using these NLP techniques. Unsupervised learning was used to cluster the lawsuits, calculating the quality of the model based on the cosine of the distance between the elements of the group to its centroid. We noticed that models based on transformers present better performance when compared to previous research, highlighting the RoBERTa model specialized in the Brazilian Portuguese language, making it possible to advance in the current state of the art in the area of NLP applied to the legal sector

    Horizontal Clustering Side-Channel Attacks on Embedded ECC Implementations (Extended Version)

    Get PDF
    Side-channel attacks are a threat to cryptographic algorithms running on embedded devices. Public-key cryptosystems, including elliptic curve cryptography (ECC), are particularly vulnerable because their private keys are usually long-term. Well known countermeasures like regularity, projective coordinates and scalar randomization, among others, are used to harden implementations against common side-channel attacks like DPA. Horizontal clustering attacks can theoretically overcome these countermeasures by attacking individual side-channel traces. In practice horizontal attacks have been applied to overcome protected ECC implementations on FPGAs. However, it has not been known yet whether such attacks can be applied to protected implementations working on embedded devices, especially in a non-profiled setting. In this paper we mount non-profiled horizontal clustering attacks on two protected implementations of the Montgomery Ladder on Curve25519 available in the µNaCl library targeting electromagnetic (EM) emanations. The first implementation performs the conditional swap (cswap) operation through arithmetic of field elements (cswap-arith), while the second does so by swapping the pointers (cswap-pointer). They run on a 32-bit ARM Cortex-M4F core. Our best attack has success rates of 97.64% and 99.60% for cswap-arith and cswap-pointer, respectively. This means that at most 6 and 2 bits are incorrectly recovered, and therefore, a subsequent brute-force can fix them in reasonable time. Furthermore, our horizontal clustering framework used for the aforementioned attacks can be applied against other protected implementations

    Evolutionary-enhanced quantum supervised learning model

    Full text link
    Quantum supervised learning, utilizing variational circuits, stands out as a promising technology for NISQ devices due to its efficiency in hardware resource utilization during the creation of quantum feature maps and the implementation of hardware-efficient ansatz with trainable parameters. Despite these advantages, the training of quantum models encounters challenges, notably the barren plateau phenomenon, leading to stagnation in learning during optimization iterations. This study proposes an innovative approach: an evolutionary-enhanced ansatz-free supervised learning model. In contrast to parametrized circuits, our model employs circuits with variable topology that evolves through an elitist method, mitigating the barren plateau issue. Additionally, we introduce a novel concept, the superposition of multi-hot encodings, facilitating the treatment of multi-classification problems. Our framework successfully avoids barren plateaus, resulting in enhanced model accuracy. Comparative analysis with variational quantum classifiers from the technology's state-of-the-art reveal a substantial improvement in training efficiency and precision. Furthermore, we conduct tests on a challenging dataset class, traditionally problematic for conventional kernel machines, demonstrating a potential alternative path for achieving quantum advantage in supervised learning for NISQ era

    Rheological Characterization of Commercial Sweetened Condensed Milk

    Get PDF
    The present study have analyzed samples of sweetened condensed milk of five brands sold in the Brazilian market regarding their rheological behavior and viscoelasticity. The products presented pseudoplastic fluid behavior, illustrated by the experimental data of shear stress versus strain rate, with adjustments made to fit the Power Law and Casson models. The effect of temperature on apparent viscosity of the products followed the Arrhenius model, with activation energy values ranging from 33.8 to 40.9 kJ mol-1. The products showed loss modulus (G’’) greater than storage modulus (G ‘), indicating a semi-liquid material behavior

    Clustering by Similarity of Brazilian Legal Documents Using Natural Language Processing Approaches

    Get PDF
    The Brazilian legal system postulates the expeditious resolution of judicial proceedings. However, legal courts are working under budgetary constraints and with reduced staff. As a way to face these restrictions, artificial intelligence (AI) has been tackling many complex problems in natural language processing (NLP). This work aims to detect the degree of similarity between judicial documents that can be achieved in the inference group using unsupervised learning, by applying three NLP techniques, namely term frequency-inverse document frequency (TF-IDF), Word2Vec CBoW, and Word2Vec Skip-gram, the last two being specialized with a Brazilian language corpus. We developed a template for grouping lawsuits, which is calculated based on the cosine distance between the elements of the group to its centroid. The Ordinary Appeal was chosen as a reference file since it triggers legal proceedings to follow to the higher court and because of the existence of a relevant contingent of lawsuits awaiting judgment. After the data-processing steps, documents had their content transformed into a vector representation, using the three NLP techniques. We notice that specialized word-embedding models—like Word2Vec—present better performance, making it possible to advance in the current state of the art in the area of NLP applied to the legal sector

    Automatic BI-RADS Classification of Breast Magnetic Resonance Medical Records Using Transformer-Based Models for Brazilian Portuguese

    Get PDF
    This chapter aims to present a classification model for categorizing textual clinical records of breast magnetic resonance imaging, based on lexical, syntactic and semantic analysis of clinical reports according to the Breast Imaging-Reporting and Data System (BI-RADS) classification, using Deep Learning and Natural Language Processing (NLP). The model was developed from transfer learning based on the pre-trained BERTimbau model, BERT model (Bidirectional Encoder Representations from Transformers) trained in Brazilian Portuguese. The dataset is composed of medical reports in Brazilian Portuguese classified into six categories: Inconclusive; Normal or Negative; Certainly Benign Findings; Probably Benign Findings; Suspicious Findings; High Risk of Cancer; Previously Known Malignant Injury. The following models were implemented and compared: Random Forest, SVM, Naïve Bayes, BERTimbau with and without finetuning. The BERTimbau model presented better results, with better performance after finetuning

    Study of the Wind Speed Forecasting Applying Computational Intelligence

    Get PDF
    The conventional sources of energy such as oil, natural gas, coal, or nuclear are finite and generate environmental pollution. Alternatively, renewable energy source like wind is clean and abundantly available in nature. Wind power has a huge potential of becoming a major source of renewable energy for this modern world. It is a clean, emission-free power generation technology. Wind energy has been experiencing very rapid growth in Brazil and in Uruguay; therefore, it’s a promising industry in these countries. Thus, this rapid expansion can bring several regional benefits and contribute to sustainable development, especially in places with low economic development. Therefore, the scope of this chapter is to estimate short-term wind speed forecasting applying computational intelligence, by recurrent neural networks (RNN), using anemometers data collected by an anemometric tower at a height of 100.0 m in Brazil (tropical region) and 101.8 m in Uruguay (subtropical region), both Latin American countries. The results of this study are compared with wind speed prediction results from the literature. In one of the cases investigated, this study proved to be more appropriate when analyzing evaluation metrics (error and regression) of the prediction results obtained by the proposed model

    Multivariate Real Time Series Data Using Six Unsupervised Machine Learning Algorithms

    Get PDF
    The development of artificial intelligence (AI) algorithms for classification purpose of undesirable events has gained notoriety in the industrial world. Nevertheless, for AI algorithm training is necessary to have labeled data to identify the normal and anomalous operating conditions of the system. However, labeled data is scarce or nonexistent, as it requires a herculean effort to the specialists of labeling them. Thus, this chapter provides a comparison performance of six unsupervised Machine Learning (ML) algorithms to pattern recognition in multivariate time series data. The algorithms can identify patterns to assist in semiautomatic way the data annotating process for, subsequentially, leverage the training of AI supervised models. To verify the performance of the unsupervised ML algorithms to detect interest/anomaly pattern in real time series data, six algorithms were applied in following two identical cases (i) meteorological data from a hurricane season and (ii) monitoring data from dynamic machinery for predictive maintenance purposes. The performance evaluation was investigated with seven threshold indicators: accuracy, precision, recall, specificity, F1-Score, AUC-ROC and AUC-PRC. The results suggest that algorithms with multivariate approach can be successfully applied in the detection of anomalies in multivariate time series data
    corecore