455 research outputs found

    Data Challenges and Data Analytics Solutions for Power Systems

    The abstract is in the attachment.

    Development of a tool based on deep learning able to classify biomedical literature

    Master's dissertation in Bioinformatics. In recent decades, the scientific community has produced a huge number of publications on a wide variety of biomedical topics, making the search for relevant information difficult for every researcher. Several approaches have been followed to develop tools that facilitate this process. For instance, in 2017 PubMed implemented a Machine Learning model to sort documents by relevance. Nevertheless, even its authors consider that their system would benefit from a Deep Learning model, which for now requires further study. In this context, a package called BioTMPy was developed in this work to perform document classification of biomedical literature using the Python programming language. The package is divided into modules that provide functions to read documents in different formats, perform preprocessing and data analysis, and train, optimize and evaluate Machine and Deep Learning models. It also provides intuitive pipelines that can be easily adapted to the user's needs, illustrating how to implement complex Deep Learning models. The package was applied to a dataset from a 2019 BioCreative challenge on protein-protein interactions altered by mutations, an important topic for advances in precision medicine. On this dataset, BioWordVec pre-trained embeddings performed slightly better than the GloVe, "pubmed pmc" and "pubmed ncbi" embeddings. In addition, when the developed models were evaluated on the test set, a model with BioBERT and a bidirectional LSTM on top surpassed the challenge's best submission, by 7.25% for average precision, 3.22% for precision, 2.99% for recall and 3.15% for the F1-score. A web server was also developed to provide access to the best Deep Learning model trained in this work. The overall pipeline developed here can be applied to other case studies on different topics, provided there is a set of documents annotated as relevant and non-relevant with which to train the models.
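The abstract above compares models on average precision, precision, recall and F1-score for a relevant/non-relevant document classification task. As a rough illustration of how these four metrics are computed, here is a minimal pure-Python sketch. The function names and the toy labels are assumptions made for this example; they are not taken from BioTMPy or from the BioCreative evaluation scripts.

```python
def precision_recall_f1(y_true, y_pred):
    """Binary metrics, where label 1 means 'relevant' and 0 'non-relevant'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def average_precision(y_true, scores):
    """Mean of precision@rank taken at the rank of each relevant document."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


if __name__ == "__main__":
    # Hypothetical gold labels, hard predictions and model scores.
    gold = [1, 1, 0, 0, 1]
    pred = [1, 0, 1, 0, 1]
    print(precision_recall_f1(gold, pred))
    print(average_precision([1, 0, 1], [0.9, 0.8, 0.7]))
```

Precision, recall and F1 depend on a decision threshold, while average precision is computed from the ranking of scores alone, which is one reason a challenge may report all four.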

    A Comprehensive Method For Coordinating Distributed Energy Resources In A Power Distribution System

    Utilities, faced with increasingly limited resources, strive to maintain high levels of reliability in energy delivery by adopting improved methodologies in planning, operation, construction and maintenance. Meanwhile, driven by steady research and development and by increasing sales volume, the cost of deploying PV systems has declined steadily since their introduction to the market. The increased penetration of distributed energy resources in power distribution infrastructure offers benefits such as loss reduction, resilience against cascading failures and access to more diversified resources. However, serious challenges and risks must be addressed to ensure continuity and reliability of service. Integrating the necessary communication and control infrastructure into the distribution system, so that distributed resources form a practically coordinated system, creates controllable load/generation centers that provide substantial flexibility for the operation of the distribution system. On the other hand, such a complex distributed system is prone to instability and blackouts, owing to the lack of a major infinite supply and to other unpredicted variations in load and generation, which must be addressed. Devising a comprehensive method for coordinating Distributed Energy Resources toward a collective goal is the key to a fully functional and reliable power distribution system that incorporates distributed energy resources. A road map for developing such a comprehensive coordination system is explained, and supporting scenarios and their associated simulation results are then elaborated. The proposed road map describes the steps necessary to build a comprehensive solution for coordination between multiple agents in a microgrid or distribution feeder.

    Transformer-Based Multi-Task Learning for Crisis Actionability Extraction

    Social media has become a valuable information source for crisis informatics. While various methods have been proposed to extract relevant information during a crisis, their adoption by field practitioners remains low. In recent fieldwork, actionable information was identified as the primary information need for crisis responders and a key component in bridging the significant gap in existing crisis management tools. In this paper, we propose a Crisis Actionability Extraction System that combines filtering, classification, phrase extraction, severity estimation, localization, and aggregation of actionable information. We examine the effectiveness of a transformer-based LSTM-CRF architecture in Twitter-related sequence tagging tasks and simultaneously extract actionable information such as situational details and crisis impact via Multi-Task Learning. We demonstrate the system's practical value in a case study of a real-world crisis and show its effectiveness in helping crisis responders make well-informed decisions, mitigate risks, and navigate the complexities of the crisis.

    EHI: End-to-end Learning of Hierarchical Index for Efficient Dense Retrieval

    Dense embedding-based retrieval is now the industry standard for semantic search and ranking problems, like obtaining relevant web documents for a given query. Such techniques use a two-stage process: (a) contrastive learning to train a dual encoder to embed both the query and documents and (b) approximate nearest neighbor search (ANNS) for finding similar documents for a given query. These two stages are disjoint; the learned embeddings might be ill-suited for the ANNS method and vice versa, leading to suboptimal performance. In this work, we propose End-to-end Hierarchical Indexing -- EHI -- that jointly learns both the embeddings and the ANNS structure to optimize retrieval performance. EHI uses a standard dual encoder model for embedding queries and documents while learning an inverted file index (IVF) style tree structure for efficient ANNS. To ensure stable and efficient learning of the discrete tree-based ANNS structure, EHI introduces the notion of dense path embedding, which captures the position of a query/document in the tree. We demonstrate the effectiveness of EHI on several benchmarks, including the de facto industry standard MS MARCO (Dev set and TREC DL19) datasets. For example, with the same compute budget, EHI outperforms the state of the art (SOTA) by 0.6% (MRR@10) on the MS MARCO dev set and by 4.2% (nDCG@10) on the TREC DL19 benchmark.
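To make the two-stage setup described in this abstract concrete, the following pure-Python sketch shows a toy version of the disjoint baseline: fixed embeddings, an IVF-style inverted list built from given centroids, a probe-then-rerank search, and the MRR@10 metric. It is not EHI itself, which learns the embeddings and the index jointly. The function names, the hand-picked centroids and the two-dimensional vectors are all assumptions made for this example.

```python
from collections import defaultdict


def dot(a, b):
    """Inner-product similarity between two embedding vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_ivf(doc_vecs, centroids):
    """Assign each document id to the inverted list of its best centroid."""
    lists = defaultdict(list)
    for doc_id, vec in doc_vecs.items():
        best = max(range(len(centroids)), key=lambda c: dot(vec, centroids[c]))
        lists[best].append(doc_id)
    return lists


def search(query, doc_vecs, centroids, lists, nprobe=1, k=10):
    """Probe the nprobe closest clusters, then rank only their documents."""
    order = sorted(range(len(centroids)),
                   key=lambda c: dot(query, centroids[c]), reverse=True)
    candidates = [d for c in order[:nprobe] for d in lists.get(c, [])]
    candidates.sort(key=lambda d: dot(query, doc_vecs[d]), reverse=True)
    return candidates[:k]


def mrr_at_k(rankings, relevant, k=10):
    """Mean reciprocal rank of the first relevant document within the top k."""
    total = 0.0
    for qid, ranked in rankings.items():
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in relevant[qid]:
                total += 1.0 / rank
                break
    return total / len(rankings)


if __name__ == "__main__":
    # Hypothetical 2-d embeddings; in practice a dual encoder produces them.
    centroids = [[1.0, 0.0], [0.0, 1.0]]
    docs = {"d1": [0.9, 0.1], "d2": [0.8, 0.3], "d3": [0.1, 0.9]}
    index = build_ivf(docs, centroids)
    ranked = search([1.0, 0.2], docs, centroids, index, nprobe=1)
    print(ranked, mrr_at_k({"q1": ranked}, {"q1": {"d2"}}))
```

With `nprobe=1` a relevant document assigned to a different cluster is never scored. This is the kind of mismatch between embeddings and index that the abstract attributes to training the two stages separately.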

    Predictive maintenance of electrical grid assets: internship at EDP Distribuição - Energia S.A

    Internship Report presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. This report describes the activities carried out during an internship at EDP Distribuição, focusing on a Predictive Maintenance analytics project directed at high voltage electrical grid assets, including Overhead Lines, Power Transformers and Circuit Breakers. The project's main goal is to support EDP's asset management processes by improving maintenance and investment planning. The project's main deliverables are the Probability of Failure metric, which forecasts asset failures 15 days ahead of time and is estimated through supervised machine learning models; the Health Index metric, which indicates an asset's current state and condition and is implemented through the Ofgem methodology; and two asset management dashboards. The project was implemented by an external service provider, a consulting company, and during the internship it was possible to join the team and participate in the development activities.

    Evaluation of integration of pumped storage units in an isolated network

    Master's thesis. Electrical and Computer Engineering (specialization in Energy Systems). 2006. Faculdade de Engenharia. Universidade do Porto

    Demand Curve Modeling for the Utility of the Future

    Electricity systems are undergoing significant changes. Demands are shifting in magnitude and temporal distribution due to developing policies and technologies such as electric vehicles, heat pumps, embedded generation and energy storage, while an increasingly renewable supply is intermittent and less flexible. As such, there is currently great uncertainty in the industry and future business pathways may vary significantly from the current paradigm. This research focused on developing a set of models which can be used by utility companies to leverage their smart meter data and gain insights into possible future impacts and opportunities. The thesis presents a series of novel models, developed and implemented with data provided from a utility in Southern Ontario. First, a regression model was developed to leverage the full value of utility smart meter data by disaggregating residential and commercial sector demands into base, heating and cooling end uses. The use of a variable temperature changepoint only marginally improved prediction accuracy, but significantly shifted disaggregation results, particularly at hourly resolution. This model was also applied for weather normalization, assessment of technology change and projection under different climate scenarios. A second model used this and additional data from literature to project long term utility level average and peak seasonal load curves. A dynamic interface with parameterized controls allowed real-time visualization of technology and policy impacts on the demand curve. A set of eight literature-based scenarios were also projected to demonstrate the extreme range of impacts predicted by different literature. These led to the conclusion that unmanaged technology penetration can lead to significant challenges such as increased peaks, large ramp rates and lower utilization. An analysis was then performed at finer geographic resolution, investigating impacts on representative distribution system transformers. 
First, the current variation in local technology penetration was examined, showing a significantly skewed distribution with many transformers having up to ten times the average rates. Clustering was then used to identify a set of eight diverse, representative transformer load profiles. Future scenarios were modeled, demonstrating that the impacts of technology and optimal mitigation techniques vary significantly between regions of the distribution system. Finally, the dynamic utility load curve model was also updated to project demands for the representative transformer groups identified. This allows users to simultaneously assess local impacts and mitigation strategies, as well as aggregate effects on the overall system demands. Together these works combine to provide a valuable toolset and significant insight into potential system impacts
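The regression step in this abstract, which splits metered demand into base and temperature-driven end uses around a variable temperature changepoint, can be illustrated with a small pure-Python sketch. It is a simplification under stated assumptions and not the thesis's model: it fits only a heating term (a cooling term would be symmetric), it picks the changepoint by grid search over candidate temperatures, and the function names and synthetic data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b, sse)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx else 0.0
    a = my - b * mx
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, sse


def fit_heating_changepoint(temps, demand, candidates):
    """Grid-search the changepoint; return (changepoint, base, slope, sse).

    Assumed model: demand = base + slope * max(0, changepoint - temperature).
    """
    best = None
    for cp in candidates:
        heating_degrees = [max(0.0, cp - t) for t in temps]
        base, slope, sse = fit_line(heating_degrees, demand)
        if best is None or sse < best[3]:
            best = (cp, base, slope, sse)
    return best


def disaggregate(temps, cp, base, slope):
    """Split each observation into (base load, heating load)."""
    return [(base, slope * max(0.0, cp - t)) for t in temps]


if __name__ == "__main__":
    # Synthetic data generated with changepoint 15, base 2.0 and slope 0.5.
    temps = [-5, 0, 5, 10, 15, 20, 25]
    demand = [2.0 + 0.5 * max(0, 15 - t) for t in temps]
    cp, base, slope, sse = fit_heating_changepoint(temps, demand, [10, 15, 20])
    print(cp, base, slope)
    print(disaggregate(temps, cp, base, slope))
```

Because the heating share of each observation is `slope * max(0, cp - t)`, moving the changepoint can leave the total fit nearly unchanged while shifting how much demand is attributed to heating, which is in line with the abstract's remark that the changepoint choice affected disaggregation more than prediction accuracy.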