103 research outputs found

Tracking relevant alignment characteristics for machine translation

Author: Lambert Patrik
Ma Yanjun
Ozdowska Sylwia
Way Andy
Publication date: 01/01/2009

In most statistical machine translation (SMT) systems, bilingual segments are extracted via word alignment. In this paper we compare alignments tuned directly according to alignment F-score and BLEU score in order to investigate the alignment characteristics that are helpful in translation. We report results for two different SMT systems (a phrase-based and an n-gram-based system) on Chinese-to-English IWSLT data and Spanish-to-English European Parliament data. We give alignment hints to improve BLEU score, depending on the SMT system used and the type of corpus.

CiteSeerX

Gaussian class-conditional simplex loss for accurate, adversarially robust deep classifier training

Author: Andrea Migliorati
Arslan Ali
Enrico Magli
Tiziano Bianchi
Publication venue: Springer
Publication date: 01/01/2023

In this work, we present the Gaussian Class-Conditional Simplex (GCCS) loss: a novel approach for training deep robust multiclass classifiers that improves over the state-of-the-art in terms of classification accuracy and adversarial robustness, with little extra cost for network training. The proposed method learns a mapping of the input classes onto Gaussian target distributions in a latent space such that a hyperplane can be used as the optimal decision surface. Instead of maximizing the likelihood of target labels for individual samples, our loss function pushes the network to produce feature distributions yielding high inter-class separation and low intra-class separation. The mean values of the learned distributions are centered on the vertices of a simplex such that each class is at the same distance from every other class. We show that the regularization of the latent space based on our approach yields excellent classification accuracy. Moreover, GCCS provides improved robustness against adversarial perturbations, outperforming models trained with conventional adversarial training (AT). In particular, our model learns a decision space that minimizes the presence of short paths toward neighboring decision regions. We provide a comprehensive empirical evaluation that shows how GCCS outperforms state-of-the-art approaches over challenging datasets for targeted and untargeted gradient-based, as well as gradient-free, adversarial attacks, both in terms of classification accuracy and adversarial robustness.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)
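The GCCS abstract above places the class means on the vertices of a simplex so that every class is equally far from every other class. The geometric idea can be illustrated with a minimal sketch. It assumes one standard construction (centering the standard basis vectors and rescaling them); the abstract does not say how the authors build their simplex, and the names `simplex_vertices` and `radius` are hypothetical.

```python
import math


def simplex_vertices(num_classes, radius=1.0):
    """Return num_classes points in R^num_classes that are pairwise equidistant.

    Assumed construction (not taken from the paper): start from the standard
    basis vectors, subtract their centroid, and rescale to the given radius.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    centroid = 1.0 / num_classes
    # Every centered basis vector has the same norm, sqrt((K - 1) / K).
    norm = math.sqrt((num_classes - 1) / num_classes)
    vertices = []
    for i in range(num_classes):
        point = [
            ((1.0 if j == i else 0.0) - centroid) * radius / norm
            for j in range(num_classes)
        ]
        vertices.append(point)
    return vertices


if __name__ == "__main__":
    means = simplex_vertices(4)
    for i in range(4):
        for j in range(i + 1, 4):
            # All printed distances are equal: sqrt(2K / (K - 1)) for radius 1.
            print(i, j, round(math.dist(means[i], means[j]), 6))
```

With unit radius, the common distance between any two vertices is sqrt(2K / (K - 1)), so the class means stay well separated however many classes there are.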

Learning morphology with Morfette

Author: Chrupała Grzegorz
Dinu Georgiana
van Genabith Josef
Publication date: 01/01/2008

Morfette is a modular, data-driven, probabilistic system which learns to perform joint morphological tagging and lemmatization from morphologically annotated corpora. The system is composed of two learning modules which are trained to predict morphological tags and lemmas using the Maximum Entropy classifier. The third module dynamically combines the predictions of the Maximum Entropy models and outputs a probability distribution over tag-lemma pair sequences. The lemmatization module exploits the idea of recasting lemmatization as a classification task by using class labels which encode mappings from wordforms to lemmas. Experimental evaluation results and error analysis on three morphologically rich languages show that the system achieves high accuracy with no language-specific feature engineering or additional resources.

CiteSeerX
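The Morfette abstract above recasts lemmatization as classification, with class labels that encode a mapping from a wordform to its lemma. A minimal sketch of that idea follows, assuming a simple suffix-rewrite label (suffix to strip, suffix to append). The abstract does not specify the label encoding Morfette uses, so this encoding and the names `lemma_label` and `apply_label` are illustrative assumptions only.

```python
def lemma_label(wordform, lemma):
    """Encode a wordform-to-lemma mapping as a (strip, append) suffix pair.

    Assumed encoding for illustration; not necessarily Morfette's own.
    """
    prefix_len = 0
    for a, b in zip(wordform, lemma):
        if a != b:
            break
        prefix_len += 1
    return (wordform[prefix_len:], lemma[prefix_len:])


def apply_label(wordform, label):
    """Apply a (strip, append) label to a possibly unseen wordform."""
    strip, append = label
    if not wordform.endswith(strip):
        return wordform  # the label does not fit this wordform; leave it as-is
    return wordform[: len(wordform) - len(strip)] + append


if __name__ == "__main__":
    label = lemma_label("cities", "city")
    print(label)                          # ('ies', 'y')
    print(apply_label("parties", label))  # party
```

Because the label describes a transformation and not a specific lemma, a classifier that predicts it can lemmatize wordforms it never saw in training, which is what makes the classification framing attractive.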

Composite Adversarial Attacks

Author: Chen Yuefeng
He Yuan
Mao Xiaofeng
Su Hang
Wang Shuhui
Xue Hui
Publication date: 09/12/2020

Adversarial attack is a technique for deceiving Machine Learning (ML) models, which provides a way to evaluate adversarial robustness. In practice, attack algorithms are artificially selected and tuned by human experts to break an ML system. However, manual selection of attackers tends to be sub-optimal, leading to a mistaken assessment of model security. In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching for the best combination of attack algorithms and their hyper-parameters from a candidate pool of 32 base attackers. We design a search space where an attack policy is represented as an attacking sequence, i.e., the output of the previous attacker is used as the initialization input for its successors. The multi-objective NSGA-II genetic algorithm is adopted for finding the strongest attack policy with minimum complexity. The experimental results show that CAA beats 10 top attackers on 11 diverse defenses with less elapsed time ($6\times$ faster than AutoAttack), and achieves the new state-of-the-art on $l_{\infty}$, $l_{2}$ and unrestricted adversarial attacks. Comment: To appear in AAAI 2021, code will be released late

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Sampling high-dimensional design spaces for analysis and optimization

Author: Ito Keiichi
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2016

Adaptive Neural Network Usage in Computer Go

Author: Kessler Alexi Robert
Shusdock Ian Renee
Publication venue: Digital WPI
Publication date: 25/04/2017

For decades, computer scientists have worked to develop an artificial intelligence for the game of Go intelligent enough to beat skilled human players. In 2016, Google accomplished just that with their program, AlphaGo. AlphaGo was a huge leap forward in artificial intelligence, but required quite a lot of computational power to run. The goal of our project was to take some of the techniques that make AlphaGo so powerful, and integrate them with a less resource-intensive artificial intelligence. Specifically, we expanded on the work of last year’s MQP of integrating a neural network into an existing Go AI, Pachi. We rigorously tested the resultant program’s performance. We also used SPSA training to determine an adaptive value function so as to make the best use of the neural network.
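
The abstract above mentions SPSA (simultaneous perturbation stochastic approximation) training. As a rough illustration of the general technique only, the sketch below tunes two parameters of a toy quadratic loss. The loss, the gain constants `a` and `c`, and the function name `spsa_minimize` are assumptions made for this example; nothing here reflects how the project tuned its Go program.

```python
import random


def spsa_minimize(loss, theta, iterations=200, a=0.1, c=0.1, seed=0):
    """Minimize loss(theta) with a basic SPSA loop.

    Each step perturbs all parameters at once along a random +/-1 vector
    and needs only two loss evaluations, whatever the dimension.
    """
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, iterations + 1):
        a_k = a / k ** 0.602  # decaying step size (commonly used exponent)
        c_k = c / k ** 0.101  # decaying perturbation size
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Simultaneous-perturbation gradient estimate, one entry per parameter.
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta


def toy_loss(params):
    """Hypothetical stand-in objective with its minimum at (3, -2)."""
    return (params[0] - 3.0) ** 2 + (params[1] + 2.0) ** 2


if __name__ == "__main__":
    start = [0.0, 0.0]
    tuned = spsa_minimize(toy_loss, start)
    print(toy_loss(start), toy_loss(tuned))
```

SPSA suits settings like game-playing parameters because the objective can be evaluated (for example, by playing games) but not differentiated, and the cost per step does not grow with the number of parameters.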