861 research outputs found
Exploring AI-Generated Text in Student Writing: How Does AI Help?
English as a foreign language (EFL) students' use of text generated from
artificial intelligence (AI) natural language generation (NLG) tools may improve
their writing quality. However, it remains unclear to what extent AI-generated
text in these students' writing might lead to higher-quality writing. We
explored 23 Hong Kong secondary school students' attempts to write stories
comprising their own words and AI-generated text. Human experts scored the
stories for dimensions of content, language and organization. We analyzed the
basic organization and structure and syntactic complexity of the stories'
AI-generated text and performed multiple linear regression and cluster
analyses. The results show the number of human words and the number of
AI-generated words contribute significantly to scores. Besides, students can be
grouped into competent and less competent writers who use more AI-generated
text or less AI-generated text compared to their peers. Comparisons of clusters
reveal some benefit of AI-generated text in improving the quality of both
high-scoring students' and low-scoring students' writing. The findings can
inform pedagogical strategies to use AI-generated text for EFL students'
writing and to address digital divides. This study contributes designs of NLG
tools and writing activities to implement AI-generated text in schools.
Comment: 43 pages, 10 figures, 3 tables
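The multiple linear regression mentioned in this abstract (scores regressed on the number of human words and the number of AI-generated words) can be sketched as follows. This is only an illustration: the function name `fit_linear_regression` and the toy numbers are assumptions and are not taken from the study, whose data and exact model specification are not given here.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept.

    Returns (intercept, coefficients, r_squared).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta[0], beta[1:], 1.0 - ss_res / ss_tot

# Toy, noise-free placeholder data (NOT the study's data):
# columns are [human_words, ai_words]; score = 1 + 2*human + 3*ai.
toy_X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]])
toy_y = 1.0 + 2.0 * toy_X[:, 0] + 3.0 * toy_X[:, 1]
intercept, coefs, r2 = fit_linear_regression(toy_X, toy_y)
```

With real data, each row would hold one student's word counts, and the fitted coefficients would indicate how strongly each count is associated with the expert score.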
Estruturas hierárquicas orientadas por dados em aprendizado multi-tarefa (Data-driven hierarchical structures in multi-task learning)
Advisor: Fernando José Von Zuben. Dissertation (master's), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
Abstract: In multi-task learning, a set of learning tasks is simultaneously considered during the learning process so that it can leverage performance by exploring similarities among the tasks. In a significant number of approaches, such similarities are encoded as additional information within the regularization framework. Although some sort of structure is taken into account by several proposals, such as the existence of task clusters or a graph-based relationship, others have shown that using a properly defined hierarchical structure may lead to competitive results. Focusing on a hierarchical relationship, the extension pursued in this research is based on the idea of learning it directly from data, enabling a methodology like this to be extended to a wider range of applications.
Thus, the hypothesis raised is that obtaining a representative hierarchy-based task relationship from data and using this additional information as a penalty term in the regularization framework would be beneficial, relaxing the need for a domain specialist and improving overall predictive generalization performance. Therefore, the novelty of the data-driven hierarchical approaches proposed in this dissertation for multi-task learning is that information exchange among associated real tasks is promoted by auxiliary hypothetical tasks at the upper nodes, given that the real tasks are not directly connected in the hierarchy. Since the main idea involves obtaining a hierarchical structure, several studies were performed focusing on combining the areas of hierarchical clustering and multi-task learning. Three promising strategies for automatically obtaining hierarchical structures were adapted to the context of multi-task learning. Two of them are Bayesian approaches, and one of those two is characterized by non-binary branching. The possibility of cutting edges in the structure is also investigated, as it is a powerful tool for detecting outlier tasks. Moreover, a general concept called Hierarchical Multi-Task Learning Framework is proposed, grouping individual modules that can be easily extended in future research. Extensive experiments are presented and discussed, showing the potential of employing a hierarchical structure obtained directly from task data within the regularization framework. Both synthetic datasets with known underlying relations among tasks and real-world benchmark datasets from the literature are adopted in the experiments, providing evidence that the proposed framework consistently outperforms well-established multi-task learning strategies.
Master's degree (Mestrado). Computer Engineering. Master in Electrical Engineering. CAPE
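The idea of a hierarchy-based penalty term, in which real tasks at the leaves exchange information only through hypothetical tasks at the upper nodes, can be illustrated with a small sketch. The squared-difference form of the penalty, the function name `hierarchical_penalty`, and the `parent` array encoding are assumptions made for illustration; the abstract does not state the dissertation's exact regularizer.

```python
import numpy as np

def hierarchical_penalty(W, parent, lam=1.0):
    """Sum of squared distances between each node's weights and its parent's.

    W      : (n_nodes, d) array; rows are weight vectors of real (leaf)
             and hypothetical (internal) tasks.
    parent : sequence where parent[i] is the parent index of node i,
             or -1 if node i is the root.
    lam    : regularization strength (assumed hyperparameter).
    """
    W = np.asarray(W, dtype=float)
    total = 0.0
    for child, par in enumerate(parent):
        if par >= 0:  # the root has no parent and contributes nothing
            diff = W[child] - W[par]
            total += float(diff @ diff)
    return lam * total

# Two real tasks (nodes 0 and 1) linked only through a hypothetical
# parent task (node 2); the leaves are never compared directly.
W_example = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
parent_example = [2, 2, -1]
penalty_value = hierarchical_penalty(W_example, parent_example, lam=1.0)
```

In a regularized learner, a term like this would be added to the sum of per-task losses, so that pulling each leaf toward its parent indirectly couples sibling tasks.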
Probabilistic forecasting and interpretability in power load applications
Power load forecasting is a fundamental tool in the modern electric power generation
and distribution industry. The ability to accurately predict future behaviours of the grid,
both in the short and long term, is vital in order to adequately meet demand and scaling
requirements. Over the past few decades Machine Learning (ML) has taken center stage
in this context, with an emphasis on short-term forecasting using both traditional ML
as well as Deep-Learning (DL) models. In this dissertation, we approach forecasting not
only from the angle of improving predictive accuracy, but also with the goal of gaining
interpretability of the behavior of the electric load through models that can offer deeper
insight and extract useful information. Specifically for this reason, we focus on the use of
probabilistic models, which can shed light on valuable information about the underlying
structure of the data through the interpretation of their parameters. Furthermore, the use
of probabilistic models intrinsically provides us with a way of measuring the confidence in
our predictions through the predictive variance. Throughout the dissertation we shall focus
on two specific ideas within the greater field of power load forecasting, which will comprise
our main contributions.
The first contribution addresses the notion of power load profiling, in which ML is used
to identify profiles that represent distinct behaviours in the power load data. These profiles
have two fundamental uses: first, they can be valuable interpretability tools, as they offer
simple yet powerful descriptions of the underlying patterns hidden in the time series data;
second, they can improve forecasting accuracy by allowing us to train specialized predictive
models tailored to each individual profile. However, in most of the literature profiling
and prediction are typically performed sequentially, with an initial clustering algorithm
identifying profiles in the input data and a subsequent prediction stage where independent
regressors are trained on each profile. In this dissertation we propose a novel probabilistic
approach that couples both the profiling and predictive stages by jointly fitting a clustering
model and multiple linear regressors. In training, both the clustering of the input data
and the fitting of the regressors to the output data influence each other through a joint
likelihood function, resulting in a set of clusters that is much better suited to the prediction
task and is therefore much more relevant and informative. The model is tested on two real
world power load databases, provided by the regional transmission organizations ISO New
England and PJM Interconnection LLC, in a 24-hour ahead prediction scenario. We achieve
better performance than other state-of-the-art approaches while arriving at more consistent and informative profiles of the power load data.
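To make the coupling between profiling and prediction concrete, the sketch below computes cluster responsibilities from a joint likelihood that combines how well an input fits a cluster with how well that cluster's linear regressor explains the output. It is a minimal illustration only: the isotropic Gaussian clusters, the fixed variances `var_x` and `var_y`, and the function name `joint_responsibilities` are assumptions, not the model proposed in the dissertation.

```python
import numpy as np

def joint_responsibilities(X, y, means, weights, biases, priors,
                           var_x=1.0, var_y=1.0):
    """Posterior cluster probabilities under a joint input/output likelihood.

    X: (n, d) inputs, y: (n,) outputs, means: (k, d) cluster centres,
    weights: (k, d) and biases: (k,) per-cluster linear regressors,
    priors: (k,) mixing proportions.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    means = np.asarray(means, dtype=float)
    weights = np.asarray(weights, dtype=float)
    biases = np.asarray(biases, dtype=float)
    priors = np.asarray(priors, dtype=float)
    d = X.shape[1]

    # log p(x | cluster k): isotropic Gaussian around each centre.
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    log_px = -0.5 * d * np.log(2 * np.pi * var_x) - sq_dist / (2 * var_x)

    # log p(y | x, cluster k): Gaussian around each regressor's prediction.
    pred = X @ weights.T + biases
    log_py = -0.5 * np.log(2 * np.pi * var_y) - (y[:, None] - pred) ** 2 / (2 * var_y)

    # Joint log-likelihood, normalized per sample with a stable softmax.
    log_joint = np.log(priors)[None, :] + log_px + log_py
    log_joint -= log_joint.max(axis=1, keepdims=True)
    resp = np.exp(log_joint)
    return resp / resp.sum(axis=1, keepdims=True)

# Two clusters with identical centres: only the regression fit can
# separate them, which is the point of the joint likelihood.
resp_example = joint_responsibilities(
    X=[[1.0]], y=[2.0],
    means=[[0.0], [0.0]], weights=[[2.0], [-2.0]],
    biases=[0.0, 0.0], priors=[0.5, 0.5],
)
```

In a sequential pipeline the two clusters in this example would be indistinguishable, whereas the joint responsibilities assign the sample almost entirely to the cluster whose regressor predicts its output.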
Our second contribution applies the idea of multi-task prediction to the context of
24-hour ahead forecasting. In a multi-task prediction problem there are multiple outputs that
are assumed to be correlated in some way. Identifying and exploiting these relationships can
result in much better performance as well as a better understanding of a multi-task problem.
Even though the load forecasting literature is scarce on this subject, it seems obvious to
assume that there exist important correlations between the outputs in a 24-hour prediction
scenario. To tackle this, we develop a multi-task Gaussian process model that addresses
the relationships between the outputs by assuming the existence of, and subsequently
estimating, both an inter-task covariance matrix and a multitask noise covariance matrix
that capture these important interactions. Our model improves on other multi-task Gaussian
process approaches in that it greatly reduces the number of parameters to be inferred
while maintaining the interpretability provided by the estimation and visualization of the
multi-task covariance matrices. We first test our model on a wide selection of general
synthetic and real world multi-task problems with excellent results. We then apply it to
a 24-hour ahead power load forecasting scenario using the ISO New England database,
outperforming other standard multi-task Gaussian processes and providing very useful
visual information through the estimation of the covariance matrices.
Doctoral Program in Multimedia and Communications, Universidad Carlos III de Madrid and Universidad Rey Juan Carlos. President: Pablo Martínez Olmos. Secretary: Pablo Muñoz Moreno. Committee member: José Palacio
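The role of an inter-task covariance matrix and a multi-task noise covariance matrix in a multi-task Gaussian process can be sketched with the standard Kronecker-structured formulation below. This is the textbook construction, not the reduced-parameter model of the dissertation; the RBF kernel and the names `rbf_kernel`, `multitask_gp_predict`, `task_cov`, and `noise_cov` are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * sq / length_scale ** 2)

def multitask_gp_predict(X_train, Y_train, X_test, task_cov, noise_cov,
                         length_scale=1.0):
    """Predictive mean of a Kronecker-structured multi-task GP.

    X_train: (n, d), Y_train: (n, t), X_test: (m, d),
    task_cov: (t, t) inter-task covariance,
    noise_cov: (t, t) multi-task noise covariance.
    Returns an (m, t) array of predicted means.
    """
    n, t = Y_train.shape
    Kxx = rbf_kernel(X_train, X_train, length_scale)
    Ksx = rbf_kernel(X_test, X_train, length_scale)
    # Task-major ordering: all n samples of task 0, then task 1, and so on.
    K = np.kron(task_cov, Kxx) + np.kron(noise_cov, np.eye(n))
    y_vec = Y_train.T.reshape(-1)
    alpha = np.linalg.solve(K, y_vec)
    mean_vec = np.kron(task_cov, Ksx) @ alpha
    return mean_vec.reshape(t, -1).T

# Toy data (not from any load database): 3 inputs, 2 correlated outputs.
X_toy = np.array([[0.0], [1.0], [2.0]])
Y_toy = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 3.0]])
task_cov_toy = np.array([[1.0, 0.5], [0.5, 1.0]])
noise_cov_toy = 1e-8 * np.eye(2)
pred_toy = multitask_gp_predict(X_toy, Y_toy, X_toy, task_cov_toy, noise_cov_toy)
```

In a 24-hour ahead setting, `t` would be 24, and the off-diagonal entries of `task_cov` and `noise_cov` would describe how strongly the hourly outputs are related, which is what makes these matrices useful to visualize.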
Interactive Exploration of Multitask Dependency Networks
Scientists increasingly depend on machine learning algorithms to discover patterns in complex data. Two examples addressed in this dissertation are identifying how information sharing among regions of the brain develops due to learning, and learning dependency networks of blood proteins associated with cancer. Dependency networks, or graphical models, are learned from the observed data in order to make comparisons between the sub-populations of the dataset. Rarely is there sufficient data to infer robust individual networks for each sub-population. The multiple networks must therefore be considered simultaneously, which greatly enlarges the hypothesis space of the learning problem. Exploring this complex solution space requires input from the domain scientist to refine the objective function. This dissertation introduces a framework to incorporate domain knowledge in transfer learning to facilitate the exploration of solutions. The framework is a generalization of existing algorithms for multiple network structure identification. Solutions produced with human input narrow down the variance of solutions to those that answer questions of interest to domain scientists. Patterns, such as differences between networks, are learned with higher confidence using transfer learning than through the standard method of bootstrapping. Transfer learning may be the ideal method for making comparisons among dependency networks, whether looking for similarities or differences. Domain knowledge input and visualization of solutions are combined in an interactive tool that enables domain scientists to explore the space of solutions efficiently.
Machine Learning towards General Medical Image Segmentation
The quality of patient care associated with diagnostic radiology is proportionate to a physician's workload. Segmentation is a fundamental limiting precursor to diagnostic and therapeutic procedures. Advances in machine learning aim to increase diagnostic efficiency by replacing single applications with generalized algorithms. We approached segmentation as a multitask shape regression problem, simultaneously predicting coordinates on an object's contour while jointly capturing global shape information. Shape regression models inherent point correlations to recover ambiguous boundaries not supported by clear edges and region homogeneity. Its capabilities were investigated using multi-output support vector regression (MSVR) on head and neck (HaN) CT images. Subsequently, we incorporated multiplane and multimodality spinal images and presented the first deep learning multiapplication framework for shape regression, the holistic multitask regression network (HMR-Net). The performance of MSVR and HMR-Net was comparable or superior to state-of-the-art algorithms. Multiapplication frameworks bridge technical knowledge gaps and increase workflow efficiency.
Sparse multitask regression for identifying common mechanism of response to therapeutic targets
Motivation: Molecular association of phenotypic responses is an important step in hypothesis generation and for initiating design of new experiments. Current practices for associating gene expression data with multidimensional phenotypic data are typically (i) performed one-to-one, i.e. each gene is examined independently with a phenotypic index and (ii) tested with one stress condition at a time, i.e. different perturbations are analyzed separately. As a result, the complex coordination among the genes responsible for a phenotypic profile is potentially lost. More importantly, univariate analysis can potentially hide new insights into common mechanism of response.
Transfer Learning using Computational Intelligence: A Survey
Transfer learning aims to provide a framework to utilize previously acquired knowledge to solve new but similar problems much more quickly and effectively. In contrast to classical machine learning methods, transfer learning methods exploit the knowledge accumulated from data in auxiliary domains to facilitate predictive modeling consisting of different data patterns in the current domain. To improve the performance of existing transfer learning methods and handle the knowledge transfer process in real-world systems, ..