861 research outputs found

    Exploring AI-Generated Text in Student Writing: How Does AI Help?

    Students of English as a foreign language (EFL) may improve their writing quality by using text generated by artificial intelligence (AI) natural language generation (NLG) tools. However, it remains unclear to what extent AI-generated text in these students' writing might lead to higher-quality writing. We explored 23 Hong Kong secondary school students' attempts to write stories comprising their own words and AI-generated text. Human experts scored the stories for dimensions of content, language and organization. We analyzed the basic organization and structure and syntactic complexity of the stories' AI-generated text and performed multiple linear regression and cluster analyses. The results show that the number of human words and the number of AI-generated words contribute significantly to scores. In addition, students can be grouped into competent and less competent writers who use more or less AI-generated text than their peers. Comparisons of clusters reveal some benefit of AI-generated text in improving the quality of both high-scoring and low-scoring students' writing. The findings can inform pedagogical strategies to use AI-generated text for EFL students' writing and to address digital divides. This study contributes designs of NLG tools and writing activities to implement AI-generated text in schools. Comment: 43 pages, 10 figures, 3 tables

    Data-driven hierarchical structures in multi-task learning (Estruturas hierárquicas orientadas por dados em aprendizado multi-tarefa)

    Advisor: Fernando José Von Zuben. Dissertation (master's), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Abstract: In multi-task learning, a set of learning tasks is considered simultaneously during the learning process so that performance can be improved by exploiting similarities among the tasks. In a significant number of approaches, such similarities are encoded as additional information within the regularization framework. Although several proposals take some sort of structure into account, such as the existence of task clusters or a graph-based relationship, others have shown that using a properly defined hierarchical structure may lead to competitive results. Focusing on a hierarchical relationship, the extension pursued in this research is based on the idea of learning the structure directly from data, enabling the multi-task methodology to be extended to a wider range of applications. Thus, the hypothesis raised is that obtaining a representative hierarchy-based task relationship from data and using this additional information as a penalty term in the regularization framework would be beneficial, relaxing the need for a domain specialist and improving predictive performance. Therefore, the novelty of the data-driven hierarchical approaches proposed in this dissertation for multi-task learning is that information exchange among associated real tasks is promoted by auxiliary hypothetical tasks at the upper nodes, given that the real tasks are not directly connected in the hierarchy. Since the main idea involves obtaining a hierarchical structure, several studies were performed focusing on combining the areas of hierarchical clustering and multi-task learning. Three promising strategies for automatically obtaining hierarchical structures were adapted to the context of multi-task learning. Two of them are Bayesian approaches, one of which is characterized by non-binary branching. The possibility of cutting edges in the structure is also investigated, as it is a powerful tool for detecting outlier tasks. Moreover, a general concept called Hierarchical Multi-Task Learning Framework is proposed, grouping individual modules that can be easily extended in future research. Extensive experiments are presented and discussed, showing the potential of employing a hierarchical structure obtained directly from task data within the regularization framework. Both synthetic datasets with known underlying relations among tasks and real-world benchmark datasets from the literature are adopted in the experiments, providing evidence that the proposed framework consistently outperforms well-established multi-task learning strategies. Master's degree; Computer Engineering; Master in Electrical Engineering; CAPE
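
    The hierarchy-based penalty described in this abstract can be illustrated with a minimal sketch. It assumes a fixed two-level tree in which every real task is a leaf attached to a single auxiliary parent node, a squared-distance penalty, and a hypothetical strength parameter `lam`. None of these choices comes from the dissertation, which learns the tree from data; the function name `fit_two_level_hierarchy` is likewise illustrative.

```python
import numpy as np

def fit_two_level_hierarchy(Xs, ys, lam=1.0, n_iter=50):
    """Alternating updates for leaf tasks tied to one auxiliary parent node.

    Objective (an assumption made for illustration only):
        sum_t ||X_t w_t - y_t||^2 + lam * sum_t ||w_t - w_parent||^2
    """
    d = Xs[0].shape[1]
    w_parent = np.zeros(d)
    W = np.zeros((len(Xs), d))
    for _ in range(n_iter):
        # Leaf update: ridge regression shrunk toward the parent instead of zero.
        for t, (X, y) in enumerate(zip(Xs, ys)):
            A = X.T @ X + lam * np.eye(d)
            b = X.T @ y + lam * w_parent
            W[t] = np.linalg.solve(A, b)
        # Parent update: the penalty is minimized by the mean of the children.
        w_parent = W.mean(axis=0)
    return W, w_parent
```

    In this sketch the real tasks never interact directly: information flows only through `w_parent`, which mirrors the role the abstract assigns to hypothetical tasks at the upper nodes. With `lam=0` each task reduces to independent least squares.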

    Probabilistic forecasting and interpretability in power load applications

    Power load forecasting is a fundamental tool in the modern electric power generation and distribution industry. The ability to accurately predict future behaviours of the grid, both in the short and long term, is vital in order to adequately meet demand and scaling requirements. Over the past few decades Machine Learning (ML) has taken center stage in this context, with an emphasis on short-term forecasting using both traditional ML as well as Deep-Learning (DL) models. In this dissertation, we approach forecasting not only from the angle of improving predictive accuracy, but also with the goal of gaining interpretability of the behavior of the electric load through models that can offer deeper insight and extract useful information. Specifically for this reason, we focus on the use of probabilistic models, which can shed light on valuable information about the underlying structure of the data through the interpretation of their parameters. Furthermore, the use of probabilistic models intrinsically provides us with a way of measuring the confidence in our predictions through the predictive variance. Throughout the dissertation we shall focus on two specific ideas within the greater field of power load forecasting, which will comprise our main contributions. The first contribution addresses the notion of power load profiling, in which ML is used to identify profiles that represent distinct behaviours in the power load data. These profiles have two fundamental uses: first, they can be valuable interpretability tools, as they offer simple yet powerful descriptions of the underlying patterns hidden in the time series data; second, they can improve forecasting accuracy by allowing us to train specialized predictive models tailored to each individual profile. 
However, in most of the literature profiling and prediction are performed sequentially, with an initial clustering algorithm identifying profiles in the input data and a subsequent prediction stage where independent regressors are trained on each profile. In this dissertation we propose a novel probabilistic approach that couples the profiling and predictive stages by jointly fitting a clustering model and multiple linear regressors. In training, the clustering of the input data and the fitting of the regressors to the output data influence each other through a joint likelihood function, resulting in a set of clusters that is much better suited to the prediction task and is therefore much more relevant and informative. The model is tested on two real-world power load databases, provided by the regional transmission organizations ISO New England and PJM Interconnection LLC, in a 24-hour-ahead prediction scenario. We achieve better performance than other state-of-the-art approaches while arriving at more consistent and informative profiles of the power load data. Our second contribution applies the idea of multi-task prediction to the context of 24-hour-ahead forecasting. In a multi-task prediction problem there are multiple outputs that are assumed to be correlated in some way. Identifying and exploiting these relationships can result in much better performance as well as a better understanding of a multi-task problem. Even though the load forecasting literature is scarce on this subject, it seems reasonable to assume that there are important correlations between the outputs in a 24-hour prediction scenario. To tackle this, we develop a multi-task Gaussian process model that addresses the relationships between the outputs by assuming the existence of, and subsequently estimating, both an inter-task covariance matrix and a multi-task noise covariance matrix that capture these important interactions. 
Our model improves on other multi-task Gaussian process approaches in that it greatly reduces the number of parameters to be inferred while maintaining the interpretability provided by the estimation and visualization of the multi-task covariance matrices. We first test our model on a wide selection of general synthetic and real-world multi-task problems with excellent results. We then apply it to a 24-hour-ahead power load forecasting scenario using the ISO New England database, outperforming other standard multi-task Gaussian processes and providing very useful visual information through the estimation of the covariance matrices.
Doctoral Program in Multimedia and Communications, Universidad Carlos III de Madrid and Universidad Rey Juan Carlos. Committee: Pablo Martínez Olmos (President), Pablo Muñoz Moreno (Secretary), José Palacio (Member)
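
    The covariance structure this abstract describes, an inter-task covariance plus a multi-task noise covariance, can be sketched with Kronecker products. The following is a standard coregionalization-style construction and not the thesis's reduced-parameter model; the RBF kernel, the `length_scale` value, and the names `K_task` and `Sigma_noise` are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, length_scale=1.0):
    """Squared-exponential kernel matrix over the rows of X."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Clip tiny negative distances caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)

def multitask_covariance(X, K_task, Sigma_noise, length_scale=1.0):
    """Covariance of the stacked outputs, ordered task by task.

    cov = kron(K_task, K_input) + kron(Sigma_noise, I_n)
    Block (i, j) couples all n samples of task i with those of task j.
    """
    n = X.shape[0]
    K_input = rbf_kernel(X, length_scale)
    return np.kron(K_task, K_input) + np.kron(Sigma_noise, np.eye(n))
```

    For a 24-hour-ahead setting, `K_task` and `Sigma_noise` would each be 24 x 24 under this sketch, so plotting them shows which hours of the day are modeled as correlated.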

    Interactive Exploration of Multitask Dependency Networks

    Scientists increasingly depend on machine learning algorithms to discover patterns in complex data. Two examples addressed in this dissertation are identifying how information sharing among regions of the brain develops due to learning, and learning dependency networks of blood proteins associated with cancer. Dependency networks, or graphical models, are learned from the observed data in order to make comparisons between the sub-populations of the dataset. Rarely is there sufficient data to infer robust individual networks for each sub-population. The multiple networks must therefore be considered simultaneously, which explodes the hypothesis space of the learning problem. Exploring this complex solution space requires input from the domain scientist to refine the objective function. This dissertation introduces a framework to incorporate domain knowledge in transfer learning to facilitate the exploration of solutions. The framework is a generalization of existing algorithms for multiple network structure identification. Human input narrows the range of solutions to those that answer questions of interest to domain scientists. Patterns, such as differences between networks, are learned with higher confidence using transfer learning than through the standard method of bootstrapping. Transfer learning may be the ideal method for making comparisons among dependency networks, whether looking for similarities or differences. Domain knowledge input and visualization of solutions are combined in an interactive tool that enables domain scientists to explore the space of solutions efficiently.

    Machine Learning towards General Medical Image Segmentation

    The quality of patient care associated with diagnostic radiology is proportionate to a physician's workload. Segmentation is a fundamental limiting precursor to diagnostic and therapeutic procedures. Advances in machine learning aim to increase diagnostic efficiency by replacing single applications with generalized algorithms. We approached segmentation as a multitask shape regression problem, simultaneously predicting coordinates on an object's contour while jointly capturing global shape information. Shape regression models inherent point correlations to recover ambiguous boundaries not supported by clear edges and region homogeneity. Its capabilities were investigated using multi-output support vector regression (MSVR) on head and neck (HaN) CT images. Subsequently, we incorporated multiplane and multimodality spinal images and presented the first deep learning multiapplication framework for shape regression, the holistic multitask regression network (HMR-Net). The performance of MSVR and HMR-Net was comparable or superior to that of state-of-the-art algorithms. Multiapplication frameworks bridge technical knowledge gaps and increase workflow efficiency.

    Sparse multitask regression for identifying common mechanism of response to therapeutic targets

    Motivation: Molecular association of phenotypic responses is an important step in hypothesis generation and for initiating design of new experiments. Current practices for associating gene expression data with multidimensional phenotypic data are typically (i) performed one-to-one, i.e. each gene is examined independently with a phenotypic index and (ii) tested with one stress condition at a time, i.e. different perturbations are analyzed separately. As a result, the complex coordination among the genes responsible for a phenotypic profile is potentially lost. More importantly, univariate analysis can potentially hide new insights into common mechanism of response

    Transfer Learning using Computational Intelligence: A Survey

    Transfer learning aims to provide a framework to utilize previously acquired knowledge to solve new but similar problems much more quickly and effectively. In contrast to classical machine learning methods, transfer learning methods exploit the knowledge accumulated from data in auxiliary domains to facilitate predictive modeling consisting of different data patterns in the current domain. To improve the performance of existing transfer learning methods and handle the knowledge transfer process in real-world systems, ..