PhagePro: prophage finding tool
Master's dissertation in Bioinformatics

Bacteriophages are viruses that infect bacteria and use them to reproduce. Their
reproductive cycle can be lytic or lysogenic. The lytic cycle leads to the death of the bacterium:
the bacteriophage hijacks the host's machinery to produce the phage parts necessary
to assemble new complete bacteriophages, until lysis of the cell wall occurs. On the other
hand, in the lysogenic reproductive cycle the bacteriophage integrates its genetic material into
the bacterial genome, becoming a prophage. Sometimes, due to external stimuli, these
prophages can be induced to perform a lytic cycle. Moreover, the lysogenic cycle can
lead to significant modifications in the bacterium, for example antibiotic resistance.
Because such modifications make monitoring prophages important, PhagePro was created. This tool finds and characterises prophages
inserted in bacterial genomes. Using 42 features, three datasets were created and
five machine learning algorithms were tested.
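The model-selection step described above (training several classifiers and keeping the one with the best F1 score) can be sketched as follows. The "models" and labels here are toy stand-ins, not PhagePro's actual five algorithms or 42 genomic features:

```python
# Sketch of selecting the best of several classifiers by F1 score.
# The candidate predictions are trivial stand-ins; PhagePro's real
# pipeline trains five ML algorithms on 42 prophage-related features.

def f1_score(y_true, y_pred):
    """F1 score for binary labels (1 = prophage region)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = [1, 1, 0, 0, 1, 0, 1, 1]
candidate_predictions = {
    "model_a": [1, 1, 0, 0, 1, 0, 1, 0],
    "model_b": [1, 0, 0, 1, 1, 0, 1, 1],
    "model_c": [1, 1, 0, 0, 1, 1, 1, 1],
}

scores = {name: f1_score(y_true, pred)
          for name, pred in candidate_predictions.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```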
All models were evaluated in two phases: during testing and on real bacterial cases.
During testing, all three datasets reached an F1 score of 98 % with their best model. In
the second phase, the models were used to predict prophages in real bacterial cases,
and the results were compared with those of two existing tools, Prophage Hunter and PHASTER.
The best model found 110 of the 154 zones reported by those tools, and the best model on dataset
3 had 94 zones in common.
As a final test, Agrobacterium fabrum strC68 was extensively analysed. The results
show that PhagePro was capable of detecting more regions containing phage-associated
proteins than the other two tools.
In the light of the results obtained, PhagePro has shown great potential for the discovery
and characterisation of bacterial alterations caused by prophages.

This study was supported by the Portuguese Fundação para a Ciência e Tecnologia (FCT) within the scope of the strategic funding of unit UIDB/04469/2020. The work was also partially funded by Project PTDC/SAU-PUB/29182/2017 [POCI-01-0145-FEDER-029182].
Probabilistic movement primitives for coordination of multiple human–robot collaborative tasks
This paper proposes an interaction learning method for collaborative and assistive robots based on movement primitives. The method allows for both action recognition and human–robot movement coordination. It uses imitation learning to construct a mixture model of human–robot interaction primitives. This probabilistic model allows the assistive trajectory of the robot to be inferred from human observations. The method is scalable in relation to the number of tasks and can learn nonlinear correlations between the trajectories that describe the human–robot interaction. We evaluated the method experimentally with a lightweight robot arm in a variety of assistive scenarios, including the coordinated handover of a bottle to a human, and the collaborative assembly of a toolbox. Potential applications of the method are personal caregiver robots, control of intelligent prosthetic devices, and robot coworkers in factories.
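The core operation behind inferring the robot's trajectory from human observations is conditioning a learned joint Gaussian over human and robot trajectory parameters. A minimal one-dimensional sketch of that conditioning (the real method conditions full weight vectors of basis-function trajectory models; all statistics below are made-up illustrations, not learned values):

```python
# Sketch of Gaussian conditioning: given a joint distribution over a
# human parameter and a robot parameter learned from demonstrations,
# infer the robot parameter after observing the human. Interaction
# primitives do this over full weight vectors; here each side is a
# single scalar for clarity.

def condition_gaussian(mu_h, mu_r, var_h, var_r, cov_hr, h_obs):
    """Posterior mean and variance of the robot parameter given h_obs."""
    gain = cov_hr / var_h                  # 1-D analogue of the Kalman gain
    mu_post = mu_r + gain * (h_obs - mu_h)
    var_post = var_r - gain * cov_hr       # uncertainty shrinks after observing
    return mu_post, var_post

# Hypothetical statistics from handover demonstrations
mu_post, var_post = condition_gaussian(
    mu_h=0.5, mu_r=0.8, var_h=0.04, var_r=0.09, cov_hr=0.05, h_obs=0.6
)
print(mu_post, var_post)
```

Observing the human slightly past the demonstrated mean shifts the robot's predicted parameter in the correlated direction while reducing its variance, which is exactly how the assistive trajectory adapts to the observed human motion.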
Deep Learning for Phishing Detection: Taxonomy, Current Challenges and Future Directions
This work was supported in part by the Ministry of Higher Education under the Fundamental Research Grant Scheme under Grant FRGS/1/2018/ICT04/UTM/01/1; and in part by the Faculty of Informatics and Management, University of Hradec Kralove, through SPEV project under Grant 2102/2022.

Phishing has become an increasing concern and captured the attention of end-users as well
as security experts. Existing phishing detection techniques still suffer from deficiency in performance
accuracy and inability to detect unknown attacks despite decades of development and improvement.
Motivated to solve these problems, many researchers in the cybersecurity domain have shifted their attention
to phishing detection that capitalizes on machine learning techniques. Deep learning has emerged as a branch
of machine learning that has become a promising solution for phishing detection in recent years. As a result,
this study proposes a taxonomy of deep learning algorithms for phishing detection by examining 81 selected
papers using a systematic literature review approach. The paper first introduces the concepts of phishing and
deep learning in the context of cybersecurity. Then, taxonomies of phishing detection and deep learning
algorithms are provided to classify the existing literature into various categories. Next, taking the proposed
taxonomy as a baseline, this study comprehensively reviews the state-of-the-art deep learning techniques
and analyzes their advantages as well as disadvantages. Subsequently, the paper discusses various issues
that deep learning faces in phishing detection and proposes future research directions to overcome these
challenges. Finally, an empirical analysis is conducted to evaluate the performance of various deep learning
techniques in a practical context, and to highlight the related issues that motivate researchers in their future
works. The results obtained from the empirical experiment showed that the common issues among most of
the state-of-the-art deep learning algorithms are manual parameter-tuning, long training time, and deficient
detection accuracy.
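Phishing detectors of the kind surveyed above typically consume lexical features of the URL (or its raw characters) as model input. A hedged sketch of such feature extraction; the feature set and the hand-set scoring rule are illustrative only, not a trained model from any of the reviewed papers:

```python
# Sketch of lexical feature extraction for URL-based phishing
# detection. A real system would feed these features, or the raw
# character sequence, to a neural network; the toy_score rule below
# is a hand-set illustration, not a learned classifier.

def url_features(url):
    host = url.split("//")[-1].split("/")[0]
    return {
        "length": len(url),
        "digits": sum(c.isdigit() for c in url),
        "has_at": "@" in url,
        "hyphens": host.count("-"),
        "subdomains": max(host.count(".") - 1, 0),
    }

def toy_score(url):
    """Illustrative hand-set suspicion score (higher = more suspicious)."""
    f = url_features(url)
    return ((f["length"] > 60) + f["has_at"]
            + (f["hyphens"] > 2) + (f["subdomains"] > 2))

print(toy_score("https://example.com/login"))
print(toy_score("http://secure-login-verify-account.example.bad-host.com/@user"))
```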
Implementation of a Private Cloud
The exponential growth of hardware requirements, coupled with the development costs of online services, has created the need for dynamic and resilient systems with networks able to handle high-density traffic.
One of the emerging paradigms to achieve this, called Cloud Computing, proposes
the creation of an elastic and modular computing architecture that allows dynamic
allocation of hardware and network resources in order to meet the needs of applications.
The creation of a Private Cloud based on the OpenStack platform implements this
idea. This solution decentralizes the institution's resources, making it possible to aggregate resources that are physically spread across several areas of the globe, and allows an optimization of computing and network resources.
With this in mind, in this thesis a private cloud system was implemented that is capable of elastically leasing and releasing computing resources, allows the creation of public and private networks that connect computation instances and the launching of virtual
machines that instantiate servers and services, and also isolates projects within the same system.
System expansion should start with the addition of extra nodes and the modernization of the existing ones. This expansion will also lead to the emergence of network problems, which can be overcome with the integration of Software Defined Network controllers.
Artificial intelligence in innovation research: A systematic review, conceptual framework, and future research directions
Artificial Intelligence (AI) is increasingly adopted by organizations to innovate, and this is ever more reflected in scholarly work. To illustrate, assess and map research at the intersection of AI and innovation, we performed a Systematic Literature Review (SLR) of published work indexed in the Clarivate Web of Science (WOS) and Elsevier Scopus databases (the final sample includes 1448 articles). A bibliometric analysis was deployed to map the focal field in terms of dominant topics and their evolution over time. By deploying keyword co-occurrence and bibliographic coupling techniques, we generate insights on the literature at the intersection of AI and innovation research. We leverage the SLR findings to provide an updated synopsis of extant scientific work on the focal research area and to develop an interpretive framework which sheds light on the drivers and outcomes of AI adoption for innovation. We identify economic, technological, and social factors of AI adoption in firms willing to innovate. We also uncover firms' economic, competitive and organizational, and innovation factors as key outcomes of AI deployment. We conclude this paper by developing an agenda for future research.
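The keyword co-occurrence step used in bibliometric mapping can be sketched simply: count how often each pair of keywords appears together across article keyword lists. The keywords below are illustrative, not drawn from the reviewed sample:

```python
# Sketch of keyword co-occurrence counting, the basis of the
# bibliometric co-word maps described above. Each inner list stands
# for one article's author keywords (illustrative examples).

from itertools import combinations
from collections import Counter

def cooccurrence(keyword_lists):
    counts = Counter()
    for keywords in keyword_lists:
        # sort so each unordered pair is counted under one canonical key
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts

articles = [
    ["artificial intelligence", "innovation", "machine learning"],
    ["artificial intelligence", "innovation"],
    ["machine learning", "innovation"],
]
counts = cooccurrence(articles)
print(counts[("artificial intelligence", "innovation")])  # 2
```

In a full analysis these pair counts become edge weights of the co-occurrence network whose clusters reveal the dominant topics.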
Network anomalies detection via event analysis and correlation by a smart system
The multidisciplinary nature of contemporary societies compels us to look at Information Technology (IT) systems as one of the most significant gifts in living memory. However, their growth implies a mandatory security force for users, a force in the form of effective and robust tools to combat the cybercrime to which users, individual or collective, are exposed almost daily. Monitoring and detection of this kind of problem must be ensured in real-time, allowing companies to intervene fruitfully, quickly and in unison.
The proposed framework is based on an organic symbiosis between credible, affordable, and effective open-source tools for data analysis, relying on Security Information and Event Management (SIEM), Big Data and Machine Learning (ML) techniques commonly applied in the development of real-time monitoring systems. Dissecting this framework, it is composed, first, of a system based on the SIEM methodology that monitors data in real-time while simultaneously saving the information to assist forensic investigation teams. Secondly, the Big Data concept is applied to manipulate and organise the flow of data. Lastly, ML techniques help create mechanisms to detect possible attacks or anomalies on the network. This framework is intended to provide a real-time analysis application at ISCTE – Instituto Universitário de Lisboa (Iscte), offering a more complete, efficient, and secure monitoring of the data from the different devices comprising the network.
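A minimal statistical baseline for the anomaly-detection layer of such a pipeline is to flag time windows whose event count deviates strongly from the mean. This is only a sketch of the idea; real deployments use far richer features and learned models, and the event counts below are made up:

```python
# Sketch of a z-score detector of the kind an ML layer in a SIEM
# pipeline might start from: flag time windows whose event count
# deviates strongly from the mean (illustrative baseline only).

from statistics import mean, stdev

def anomalous_windows(counts, threshold=2.5):
    """Return indices of windows whose z-score exceeds the threshold."""
    mu, sigma = mean(counts), stdev(counts)
    return [i for i, c in enumerate(counts)
            if sigma > 0 and abs(c - mu) / sigma > threshold]

# Events per minute; the burst at index 6 is the injected anomaly.
events = [102, 98, 105, 99, 101, 103, 450, 100, 97, 104]
print(anomalous_windows(events))  # [6]
```

Windows flagged this way would then be escalated for correlation with other event sources before raising an alert.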
A realistic evaluation of indoor positioning systems based on Wi-Fi fingerprinting: The 2015 EvAAL–ETRI competition
Pre-print version

This paper presents results from comparing different Wi-Fi fingerprinting algorithms on the same private dataset. The algorithms were realized by independent teams in the frame of the off-site track of the EvAAL-ETRI Indoor Localization Competition, which was part of the Sixth International Conference on Indoor Positioning and Indoor Navigation (IPIN 2015). Competitors designed and validated their algorithms against the publicly available UJIIndoorLoc database, which contains huge reference and validation data sets. All competing systems were evaluated using the mean error in positioning, with penalties, on a private test dataset. The authors believe that this is the first work in which Wi-Fi fingerprinting algorithm results delivered by several independent and competing teams are fairly compared under the same evaluation conditions. The analysis also comprises a combined approach: results indicate that the competing systems were complementary, since an ensemble that combines three competing methods reported the overall best results.

We would like to thank Francesco Potortì, Paolo Barsocchi, Michele Girolami and Kyle O’Keefe for their valuable help in organizing and spreading the EvAAL-ETRI
competition and the off-site track. We would also like to thank the TPC
members Machaj Juraj, Christos Laoudias, Antoni Pérez-Navarro and Robert
Piché for their valuable comments, suggestions and reviews.
Parts of this work were funded in the frame of the Spanish Ministry of Economy
and Competitiveness through the “Metodologías avanzadas para el diseño,
desarrollo, evaluación e integración de algoritmos de localización en interiores”
project (Proyectos I+D Excelencia, código TIN2015-70202-P) and the “Red de
Posicionamiento y Navegación en Interiores” network (Redes de Excelencia,
código TEC2015-71426-REDT). Parts of this work were funded in the frame of the German Federal Ministry of Education and Research programme "FHprofUnt2013" under contract 03FH035PB3 (Project SPIRIT).
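The nearest-neighbour idea underlying most Wi-Fi fingerprinting systems can be sketched briefly: match an observed RSSI vector against a database of reference fingerprints and return the best-matching position. The RSSI values and coordinates below are illustrative, not UJIIndoorLoc data:

```python
# Sketch of k-nearest-neighbour fingerprinting, the baseline behind
# the competing systems: the reference database maps RSSI vectors
# (one reading per access point) to known positions.

import math

# (RSSI fingerprint, (x, y) position) reference pairs -- illustrative
database = [
    ((-40, -70, -80), (0.0, 0.0)),
    ((-70, -45, -75), (5.0, 0.0)),
    ((-80, -75, -42), (0.0, 5.0)),
]

def locate(observed, k=1):
    """Average the positions of the k fingerprints closest in RSSI space."""
    ranked = sorted(database,
                    key=lambda entry: math.dist(entry[0], observed))
    nearest = ranked[:k]
    x = sum(pos[0] for _, pos in nearest) / k
    y = sum(pos[1] for _, pos in nearest) / k
    return (x, y)

print(locate((-42, -72, -79)))  # closest to the first fingerprint
```

The competition's metric (mean positioning error with penalties) is then simply the average distance between such estimates and the ground-truth positions of the private test set.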
THE ANALYTICS QUOTIENT: RETOOLING CIVIL AFFAIRS FOR THE FUTURE OPERATING ENVIRONMENT
Historically, military intelligence analysts and U.S. forces, frozen in their preferred strategy of attrition warfare, have undervalued civil information in conflicts against irregular threats. As operating environments grow more complex, uncertain, and population-centric, the roles of Civil Affairs Forces and civil information will become increasingly relevant. Unfortunately, the current analytical methods prescribed in Civil Affairs doctrine are inadequate for evaluating complex environments. They fail to provide supported commanders with the information required to make informed decisions. The purpose of this research is to determine how Civil Affairs Forces must retool their analytical capabilities to meet the demands of future operating environments. The answer lies in developing an organic Civil Affairs analytic capability suitable for employing data-driven approaches to gain actionable insights into uncertain operational environments, and subsequently, integrating those insights into sophisticated operational targeting frameworks and strategies designed to disrupt irregular threats. This research uses case studies of organizations, across a range of industries, that leveraged innovative data-driven approaches into disruptive competitive advantages. These organizations highlight the broad utility of the prescribed approaches and potential pathways for Civil Affairs Forces to pursue in creating an analytic capability that supports effective civil knowledge integration.

http://archive.org/details/theanalyticsquot1094564891
Major, United States Army
Approved for public release; distribution is unlimited.
GPT Models in Construction Industry: Opportunities, Limitations, and a Use Case Validation
Large Language Models (LLMs) trained on large data sets came into prominence
in 2018 after Google introduced BERT. Subsequently, different LLMs such as GPT
models from OpenAI have been released. These models perform well on diverse
tasks and have been gaining widespread applications in fields such as business
and education. However, little is known about the opportunities and challenges
of using LLMs in the construction industry. Thus, this study aims to assess GPT
models in the construction industry. A critical review, expert discussion and
case study validation are employed to achieve the study objectives. The
findings revealed opportunities for GPT models throughout the project
lifecycle. The challenges of leveraging GPT models are highlighted and a use
case prototype is developed for materials selection and optimization. The
findings of the study would be of benefit to researchers, practitioners and
stakeholders, as it presents research vistas for LLMs in the construction
industry.

Comment: 58 pages, 20 figures
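A use-case prototype like the materials-selection one mentioned above would need to assemble a structured prompt for the GPT model. A hedged sketch of that step only; the fields, criteria, and element names are hypothetical, and no model API is called here:

```python
# Sketch of prompt assembly for a hypothetical materials-selection
# prototype. Everything below (element names, criteria, wording) is
# illustrative; sending the prompt to a GPT model is out of scope.

def build_materials_prompt(element, criteria):
    """Build a structured prompt asking for ranked material options."""
    lines = [
        f"Recommend construction materials for: {element}.",
        "Rank the top three options against these criteria:",
    ]
    lines += [f"- {name}: {weight}" for name, weight in criteria.items()]
    lines.append("Answer as a numbered list with a one-line rationale each.")
    return "\n".join(lines)

prompt = build_materials_prompt(
    "load-bearing exterior wall",
    {"cost": "high priority", "embodied carbon": "medium", "durability": "high"},
)
print(prompt)
```

Keeping the criteria explicit and machine-generated like this makes the prototype's outputs easier to compare across runs than free-form questions.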
Adaptive Management of Multimodel Data and Heterogeneous Workloads
Data management systems are facing a growing demand for a tighter integration of heterogeneous data from different applications and sources for both operational and analytical purposes in real-time. However, the vast diversification of the data management landscape has led to a situation where there is a trade-off between high operational performance and a tight integration of data. The difference between the growth of data volume and the growth of computational power demands a new approach for managing multimodel data and handling heterogeneous workloads.
With PolyDBMS we present a novel class of database management systems, bridging the gap between multimodel database and polystore systems. This new kind of database system combines the operational capabilities of traditional database systems with the flexibility of polystore systems. This includes support for data modifications, transactions, and schema changes at runtime. With native support for multiple data models and query languages, a PolyDBMS presents a holistic solution for the management of heterogeneous data. This not only enables a tight integration of data across different applications, it also allows a more efficient usage of resources. By leveraging and combining highly optimized database systems as storage and execution engines, this novel class of database system takes advantage of decades of database systems research and development.
In this thesis, we present the conceptual foundations and models for building a PolyDBMS. This includes a holistic model for maintaining and querying multiple data models in one logical schema that enables cross-model queries. With the PolyAlgebra, we present a solution for representing queries based on one or multiple data models while preserving their semantics. Furthermore, we introduce a concept for the adaptive planning and decomposition of queries across heterogeneous database systems with different capabilities and features.
The conceptual contributions presented in this thesis materialize in Polypheny-DB, the first implementation of a PolyDBMS. Supporting the relational, document, and labeled property graph data models, Polypheny-DB is a suitable solution for structured, semi-structured, and unstructured data. This is complemented by an extensive type system that includes support for binary large objects. With support for multiple query languages, industry standard query interfaces, and a rich set of domain-specific data stores and data sources, Polypheny-DB offers a flexibility unmatched by existing data management solutions.
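The routing idea behind such a system, one logical schema with each entity placed on the underlying engine that stores it, can be sketched in a few lines. The dict-backed "engines" below are stand-ins for real relational, document, and graph systems, not Polypheny-DB's actual architecture:

```python
# Minimal sketch of polystore-style query routing: a logical scan is
# dispatched to whichever underlying engine holds the entity. Plain
# dicts stand in for the optimized storage/execution engines.

class PolyRouter:
    def __init__(self):
        self.engines = {}    # engine name -> {entity -> rows}
        self.placement = {}  # entity -> engine name

    def register(self, engine, entity, rows):
        self.engines.setdefault(engine, {})[entity] = rows
        self.placement[entity] = engine

    def scan(self, entity):
        """Route a logical scan to the engine that stores the entity."""
        engine = self.placement[entity]
        return self.engines[engine][entity]

router = PolyRouter()
router.register("relational", "orders", [{"id": 1, "total": 9.5}])
router.register("document", "reviews", [{"id": 1, "text": "great"}])

# A cross-model query joins rows fetched from two different engines.
joined = [(o, r) for o in router.scan("orders")
          for r in router.scan("reviews") if o["id"] == r["id"]]
print(len(joined))  # 1
```

A real PolyDBMS additionally plans and decomposes queries per engine capability; this sketch only shows the placement-based dispatch that makes the cross-engine join possible.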