
    Text processing using neural networks

    Natural language processing is a key technology in artificial intelligence. It involves two basic tasks: natural language understanding and natural language generation. The core of both is obtaining text semantics. Text semantic analysis builds models that let computers approximate the human ability to understand the deep semantics of natural language and identify the true meaning carried by information. Recovering the true semantics of text improves downstream natural language processing tasks such as machine translation, question answering systems, and chatbots. Natural language text is composed of words, sentences, and paragraphs, in that order. Word-level semantic analysis is concerned with word sense, and its quality directly affects the quality of semantic analysis at every subsequent level. Sentences are the simplest sequences of semantic units, and sentence-level analysis focuses on the meaning expressed by the sentence as a whole, while paragraph-level analysis aims to understand the semantics of entire paragraphs. Although semantic analysis models based on deep neural networks have made significant progress, many shortcomings remain. This thesis proposes deep neural network-based models for sentence semantic understanding, word sense understanding, and text sequence generation, addressing the difficulties of text semantic analysis from the perspective of three research tasks.

    The research contents and contributions are summarized as follows. First, mainstream recurrent neural networks cannot directly model the latent structural information of sentences. To better determine the sense of ambiguous words, this thesis proposes a word sense disambiguation model that combines a two-layer bi-directional long short-term memory (BiLSTM) network with an attention mechanism. Second, static word embedding models cannot handle polysemy; contextual word embedding models can, but their performance is limited in application scenarios with strict real-time requirements. Accordingly, this thesis proposes using a word sense induction task to construct dedicated sense embeddings for polysemous words. Third, current mainstream attention-based encoder-decoder models do not explicitly perform a preliminary screening of the information in the source text before summary generation, so the decoder input contains a large amount of information irrelevant to the summary, and generation suffers from exposure bias and out-of-vocabulary words. To address these problems, this thesis proposes an abstractive text summarization model based on a hierarchical attention mechanism and multi-objective reinforcement learning.

    In summary, this thesis conducts in-depth research on semantic analysis and proposes solutions for the word sense disambiguation, word sense embedding, and abstractive text summarization tasks. Their feasibility and validity were verified through extensive experiments on the corresponding publicly available standard datasets, and the results also support other related research in natural language processing.
    460 - Katedra informatiky
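The attention step in the disambiguation model described above can be sketched in a few lines. This is a minimal numpy illustration, not the thesis's model: it assumes the per-token hidden states have already been produced by a BiLSTM, and the scoring vector is random here where a real model would learn it.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H, w):
    """Attention over a sequence of per-token hidden states.
    H: (T, d) matrix, e.g. the outputs of a BiLSTM over a sentence.
    w: (d,) scoring vector (random here; learned in a real model).
    Returns the attention weights and the pooled context vector."""
    scores = np.tanh(H) @ w      # (T,) unnormalized relevance of each token
    alpha = softmax(scores)      # (T,) attention distribution over tokens
    context = alpha @ H          # (d,) weighted sum used for disambiguation
    return alpha, context

rng = np.random.default_rng(0)
T, d = 6, 8                      # toy sequence length and hidden size
H = rng.standard_normal((T, d))  # stand-in for BiLSTM outputs
w = rng.standard_normal(d)
alpha, context = attention_pool(H, w)
```

The context vector weights each token by its relevance to the ambiguous word, which is what lets the classifier focus on disambiguating cues rather than the whole sentence uniformly.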
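A word sense induction pipeline of the kind described can be sketched as clustering the contextual vectors of one ambiguous word and treating each cluster centroid as a sense embedding. The clustering method here (plain k-means with deterministic farthest-point initialization) and the toy data are illustrative assumptions, not the thesis's algorithm.

```python
import numpy as np

def induce_senses(X, k, iters=20):
    """Cluster the contextual vectors X (n, d) of one ambiguous word
    into k groups; each centroid then serves as one sense embedding."""
    # farthest-point initialization: deterministic and well spread out
    centroids = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(1) for c in centroids], axis=0)
        centroids.append(X[d2.argmax()])
    centroids = np.array(centroids)
    for _ in range(iters):
        # assign each context vector to its nearest centroid
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        # move each centroid to the mean of its assigned contexts
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(0)
    return centroids, labels

# toy contexts of a polysemous word: two well-separated groups of vectors
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (30, 5)),
               rng.normal(3.0, 0.1, (30, 5))])
sense_embeddings, labels = induce_senses(X, k=2)
```

Once computed offline, such sense embeddings can be looked up like static embeddings, which is what makes the approach attractive when contextual models are too slow for real-time use.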

    Optimization of non-negative matrix factorization in Bioinformatics

    In recent years, the scientific community's interest in Non-negative Matrix Factorization (NMF) has grown. This method transforms a high-dimensional dataset into a small collection of elements that carry their own semantics in the context of the analysis. In Bioinformatics, NMF often serves as the basis of data clustering methods that use a statistical model to determine the most favorable number of classes. This model requires a large number of NMF runs with different input parameters, which represents an enormous computational workload. Most NMF implementations have become obsolete in the face of the constant growth of the data the scientific community seeks to analyze, either because computation times stretch until they become unviable, or because the size of the data exceeds the system's resources. This doctoral thesis therefore focuses on optimizing and parallelizing NMF, not only at the theoretical level, but with the goal of providing the scientific community with a new tool for analyzing data of biological origin. NMF exposes a high degree of data-level parallelism of variable granularity, while the clustering methods mentioned above exhibit task-level parallelism, since the NMF instances they run are independent. From a global point of view, this suggests a layered optimization model in which different high-performance computing technologies are employed.
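To make concrete what each of the many runs in such a clustering pipeline computes, here is a minimal numpy sketch of NMF with the classic Lee-Seung multiplicative updates; the function name and toy data are illustrative assumptions, not the thesis's optimized implementation.

```python
import numpy as np

def nmf(V, k, iters=500, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n, m) as W (n, k) @ H (k, m),
    minimizing Frobenius reconstruction error via multiplicative updates.
    The updates keep W and H non-negative by construction."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H with W fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W with H fixed
    return W, H

# toy check: a matrix with exact rank-2 non-negative structure
rng = np.random.default_rng(1)
V = rng.random((8, 2)) @ rng.random((2, 6))
W, H = nmf(V, k=2)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Each iteration is dominated by dense matrix products, which is the data-level parallelism the abstract refers to; the independent runs over different k and random seeds provide the task-level parallelism exploited by the clustering model.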