29 research outputs found

    Can Word Segmentation be Considered Harmful for Statistical Machine Translation Tasks between Japanese and Chinese?

    Low-Resource Unsupervised NMT: Diagnosing the Problem and Providing a Linguistically Motivated Solution

    Unsupervised Machine Translation has been advancing our ability to translate without parallel data, but state-of-the-art methods assume an abundance of monolingual data. This paper investigates the scenario where monolingual data is limited as well, finding that current unsupervised methods suffer in performance under this stricter setting. We find that the performance loss originates from the poor quality of the pretrained monolingual embeddings, and we propose using linguistic information in the embedding training scheme. To support this, we look at two linguistic features that may help improve alignment quality: dependency information and sub-word information. Using dependency-based embeddings results in a complementary word representation which offers a boost in performance of around 1.5 BLEU points compared to standard WORD2VEC when monolingual data is limited to 1 million sentences per language. We also find that the inclusion of sub-word information is crucial to improving the quality of the embedding

    IberSPEECH 2020: XI Jornadas en Tecnología del Habla and VII Iberian SLTech

    IberSPEECH2020 is a two-day event, bringing together the best researchers and practitioners in speech and language technologies in Iberian languages to promote interaction and discussion. The organizing committee has planned a wide variety of scientific and social activities, including technical paper presentations, keynote lectures, project presentations, laboratory activities, recent PhD theses, discussion panels, a round table, and awards for the best thesis and papers. The program of IberSPEECH2020 includes a total of 32 contributions, distributed among 5 oral sessions, a PhD session, and a projects session. To ensure the quality of all the contributions, each submitted paper was reviewed by three members of the scientific review committee. All the papers in the conference will be accessible through the International Speech Communication Association (ISCA) Online Archive. Paper selection was based on the scores and comments provided by the scientific review committee, which includes 73 researchers from different institutions (mainly from Spain and Portugal, but also from France, Germany, Brazil, Iran, Greece, Hungary, Czech Republic, Ukraine, Slovenia). Furthermore, an extension of selected papers will be published as a special issue of the Journal of Applied Sciences, “IberSPEECH 2020: Speech and Language Technologies for Iberian Languages”, published by MDPI with fully open access. In addition to regular paper sessions, the IberSPEECH2020 scientific program features the following activities: the ALBAYZIN evaluation challenge session. Red Española de Tecnologías del Habla. Universidad de Valladoli

    Towards Reliable and Inclusive Natural Language Generation

    Natural language generation (NLG) is an important subfield of natural language processing (NLP) that produces natural language output. Despite notable advancements made by large-scale pre-trained language models in NLG, there remain several unresolved challenges. This thesis aims to enhance NLG from two significant aspects: reliability and inclusiveness. For reliability, on the one hand, we introduce novel training objectives that improve the alignment of language generation models with desired model behaviors. To improve the answerability of model-generated questions, we use a question answering model to provide additional rewards to a question generation model, encouraging the production of more answerable questions. In addition, we propose to train language models with a mixture of forward and reverse cross-entropies, demonstrating that the resulting models yield better generated text without complex decoding strategies. On the other hand, we propose novel evaluation methods to assess the performance of NLG models accurately and comprehensively. By combining human and automatic evaluations, we strike a balance between reliability and reproducibility. We delve into the unexplored issue of unfaithfulness in extractive summaries and conclude that extractive summarization does not guarantee faithfulness. For inclusiveness, we extend the coverage of NLG techniques to low-resource or endangered languages. We develop the first machine translation system for supporting translation between Cherokee, an endangered Native American language, and English, and we propose a roadmap for utilizing NLP to support language revitalization efforts. Additionally, we investigate the underrepresentation of low-resource languages during multilingual tokenization, a crucial data preprocessing step in training multilingual NLG models, and we present best practices for training multilingual tokenizers. 
Overall, this thesis works towards enhancing the trustworthiness of NLG models in practice and facilitating support for a more diverse range of languages worldwide. Doctor of Philosoph

    Japanese income tourism: An exploratory study of Portuguese luxury hotel management strategy (before and after Covid-19)

    This study examines the service quality of luxury hotels operating in Portugal, identifying the factors behind Japanese customers’ satisfaction or dissatisfaction with hotel attributes. The study was based on qualitative and quantitative research methods, taking a four-step approach: first, the relevant literature was reviewed for a better understanding of Japanese culture, Japanese tourists, and the keys to success of Japanese and hotel management; second, content analysis was conducted of Japanese online reviews to compare satisfaction and dissatisfaction with hotel attributes. Chinese tourists were introduced in the study (the Asian tourists of most concern to the Portuguese hotel industry) to analyse the differences among Asian customers. The online reviews of 1,354 hotel guests (538 Japanese and 816 Chinese) were collected from the online travel agency “booking.com” for Lisbon, Portugal; third, five top hospitality managers and experts were interviewed; and fourth, 187 questionnaires were answered by directors and client managers, not only in the hospitality sector but also in other business sectors, to collect opinions on "outside the box" strategy before and after the COVID-19 pandemic. The conclusions of the study reveal some common categories for hotels, both positive and negative, that fall into five dimensions: Facilities, Market, People, Processes, Financial (included in the "Balanced Scorecard" dimensions). There were also some relevant differences between the Japanese and Chinese regarding their perception of hotel facilities. What, for example, might be classified positively by the Japanese as "historical" may be classified negatively by the Chinese as "old fashioned". Hospitality strategy and Japanese tourists’ satisfaction or dissatisfaction with hotels’ attributes were compared, and it became clear that a strategic improvement plan should be implemented to meet and/or exceed Japanese tourists’ expectations. 
Recovering the European markets and developing people should be the priority after COVID-19. While few studies have been conducted in Portugal on this subject, the Japanese market is proving to be very profitable in other European markets.

    The National Museum of China: Building Memory, Shaping History, Presenting Identity

    In March 2011, the National Museum of China, a union of the former Museum of Chinese History and the Museum of Chinese Revolution, opened to the public at Tiananmen Square, the heart of the Chinese Nation. The transition to a modern museum complex was fast, ambitious and, to a certain extent, drastic: Only 20% of the original building was kept; 80% is new structure. Thus the museum expanded on a gigantic scale from 65,000 m² to almost 192,000 m², currently constituting the largest museum in the world. It presents itself with a new look and new displays. Although the architects of von Gerkan, Marg and Partners (gmp) were officially commissioned in 2004, the approved design and the entire building project experienced fundamental changes right up until the opening of the museum in 2011. This thesis undertakes a comprehensive analysis of the redesigned National Museum of China, its current significance, its role as a cultural institution and as a representative of the Chinese nation. Chapter One (Framing the Subject: Origins and Concepts) introduces general museum concepts and the historic development of the National Museum of China. Chapter Two (Building Memory: The Architectural Form) examines the current architecture and its influence in the creation of memory and memorial culture. Chapter Three (Shaping History: The Presentation of the Collection) investigates the new presentation of the collection as well as continuities and changes in the display of the official master narrative. An appendix includes comprehensive results of various visitor surveys and statements of the museum staff from 2003, 2007 and 2011. This study presents the latest transformation of one of the most important cultural institutions of the People's Republic of China

    Integration(s) and resistance : governments, capital, social organisations and movements, and the arrival of 'foreign immigrants' in Barcelona and Lisbon

    In a context characterised by the shift from fordism to post-fordism in the Iberian peninsula, this thesis addresses the following question: how are capital, governments and social movements organised in the processes of integration and resistance that affect 'foreign immigration' in Barcelona and Lisbon? Thus, in the first chapter, an analysis of the concept of "integration" is undertaken in order to understand the complexities and elusiveness that hide behind it, giving special attention to the literature on immigrants' integration. A distinction between systemic integration and social integration is adopted, and thus in the second chapter recent theorisation on capital and the state (i.e. systemic institutions) is approached, while in the third chapter social movements and organisations are taken into account. In chapter four epistemological and methodological elements are noted. The last three chapters are devoted to analysing original fieldwork data (mainly qualitative interviews): chapter 6 analyses immigration governmental policies at European, 'national-state', 'national-regional', and local levels; chapter 7 studies social and capital organisations in Barcelona in relation to 'foreign immigration'; and in chapter 8 social and capital organisations are studied in relation to 'foreign immigration' in Lisbon. Finally, some conclusions are revealed whilst other questions are posed