971 research outputs found

    30th Anniversary of Applied Intelligence: A combination of bibliometrics and thematic analysis using SciMAT

    Applied Intelligence is one of the most important international scientific journals in the field of artificial intelligence. Since 1991, Applied Intelligence has supported research advances in new and innovative intelligent systems, methodologies, and their applications in solving complex real-life problems. Over that time, Applied Intelligence has published more than 2,400 papers and received around 31,800 citations. Moreover, Applied Intelligence is recognized by the industrial, academic, and scientific communities as a source of the latest innovative and advanced solutions in intelligent manufacturing, privacy-preserving systems, risk analysis, knowledge-based management, modern techniques to improve healthcare systems, methods to assist government, and solving industrial problems that are too complex to be solved through conventional approaches. Because Applied Intelligence celebrates its 30th anniversary in 2021, it is appropriate to analyze its bibliometric performance, conceptual structure, and thematic evolution. To that end, this paper conducts a bibliometric performance and conceptual structure analysis of Applied Intelligence from 1991 to 2020 using SciMAT. First, the performance of the journal is analyzed using data retrieved from Scopus, focusing on the productivity of authors, citations, countries, organizations, funding agencies, and the most relevant publications. Then, the conceptual structure of the journal is analyzed with the bibliometric software tool SciMAT, identifying the main thematic areas that have been the object of research and their composition, relationships, and evolution during the period analyzed.

    Artificial intelligence in the cyber domain: Offense and defense

    Artificial intelligence techniques have grown rapidly in recent years, and their applications in practice can be seen in many fields, ranging from facial recognition to image analysis. In the cybersecurity domain, AI-based techniques can provide better cyber defense tools and help adversaries improve methods of attack. However, malicious actors are aware of the new prospects too and will probably attempt to use them for nefarious purposes. This survey paper aims at providing an overview of how artificial intelligence can be used in the context of cybersecurity in both offense and defense. Web of Science 123 art. no. 41

    Performance Evaluation of Network Anomaly Detection Systems

    There is a large and growing concern about security in information and communication technology (ICT) among the scientific community, because any attack or anomaly in the network can greatly affect many domains such as national security, private data storage, social welfare, and economic issues. Anomaly detection is therefore a broad research area, and many different techniques and approaches for this purpose have emerged through the years. Attacks, problems, and internal failures that are not detected early may badly harm an entire network system. This thesis presents an autonomous profile-based anomaly detection system based on the statistical method Principal Component Analysis (PCADS-AD). This approach creates a network profile called Digital Signature of Network Segment using Flow Analysis (DSNSF) that denotes the predicted normal behavior of network traffic activity through historical data analysis. That digital signature is used as a threshold for volume anomaly detection to detect disparities in the normal traffic trend. The proposed system uses seven traffic flow attributes: Bits, Packets, and Number of Flows to detect problems, and Source and Destination IP addresses and Ports to provide the network administrator with the information needed to solve them. Through evaluation techniques, the addition of a different anomaly detection approach, and comparisons to other methods performed in this thesis using real network traffic data, the results showed good traffic prediction by the DSNSF and encouraging false-alarm rates and detection accuracy for the detection scheme.
The observed results seek to contribute to advancing the state of the art in anomaly detection methods and strategies that aim to overcome challenges arising from the constant growth in complexity, speed, and size of today's large-scale networks, while also providing high-value results for better real-time detection. In addition, the low complexity and agility of the proposed system make it applicable to real-time detection.
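
The abstract above describes using a traffic profile as a threshold for volume anomaly detection. A minimal sketch of that idea follows. It assumes a simple per-slot average of historical traffic as a stand-in for the DSNSF (the thesis builds its profile with PCA, which is not reproduced here), and the function names, the sample data, and the 50% relative tolerance are all hypothetical.

```python
from statistics import mean


def build_profile(history):
    """Average each time slot across historical days.

    Assumption: this plain average stands in for the DSNSF profile;
    the thesis derives its profile with PCA instead.
    """
    return [mean(slot) for slot in zip(*history)]


def detect_anomalies(profile, observed, tolerance=0.5):
    """Return indices of slots whose volume deviates from the profile
    by more than `tolerance` (relative deviation, hypothetical value)."""
    flagged = []
    for i, (expected, actual) in enumerate(zip(profile, observed)):
        if expected > 0 and abs(actual - expected) / expected > tolerance:
            flagged.append(i)
    return flagged


# Hypothetical bit counts per time slot for three historical days.
history = [
    [100, 120, 300, 280],
    [110, 130, 310, 290],
    [90, 110, 290, 270],
]
profile = build_profile(history)

# The third slot carries roughly three times the expected volume.
observed = [105, 125, 900, 275]
anomalies = detect_anomalies(profile, observed)
print(anomalies)
```

The same comparison could be run separately for bits, packets, and number of flows, with the address and port attributes used only to describe a flagged interval.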

    Swarm intelligence for clustering dynamic data sets for web usage mining and personalization.

    Swarm Intelligence (SI) techniques were inspired by bee swarms, ant colonies, and most recently, bird flocks. Flock-based Swarm Intelligence (FSI) has several unique features, namely decentralized control, collaborative learning, high exploration ability, and inspiration from dynamic social behavior. FSI is thus a natural choice for modeling dynamic social data and solving problems in such domains. One particular case of dynamic social data is online/web usage data, which is rich in information about user activities, interests, and choices. This natural analogy between SI and social behavior is the main motivation for the topic of investigation in this dissertation, with a focus on flock-based systems, which have not been well investigated for this purpose. More specifically, we investigate the use of flock-based SI to solve two related and challenging problems by developing algorithms that form critical building blocks of intelligent personalized websites, namely, (i) providing a better understanding of the online users and their activities or interests, for example using clustering techniques that can discover the groups that are hidden within the data; and (ii) reducing information overload by providing guidance to the users on websites and services, typically by using web personalization techniques, such as recommender systems. Recommender systems aim to recommend items that a user is likely to like. To support a better understanding of online user activities, we developed clustering algorithms that address two challenges of mining online usage data: the need for scalability to large data and the need to adapt clustering to dynamic data sets. To address the scalability challenge, we developed new clustering algorithms using a hybridization of traditional Flock-based clustering with faster K-Means based partitional clustering algorithms.
We tested our algorithms on synthetic data, real UCI Machine Learning Repository benchmark data, and a data set consisting of real Web user sessions. Having linear complexity with respect to the number of data records, the resulting algorithms are considerably faster than traditional Flock-based clustering (which has quadratic complexity). Moreover, our experiments demonstrate that scalability was gained without sacrificing quality. To address the challenge of adapting to dynamic data, we developed a dynamic clustering algorithm that can handle the following dynamic properties of online usage data: (1) new data records can be added at any time (example: a new user is added on the site); (2) existing data records can be removed at any time (example: an existing user of the site who no longer subscribes to a service, or who is terminated for violating policies); (3) new parts of existing records can arrive at any time, or old parts of an existing data record can change (example: a user's record changes as a result of additional activity such as purchasing new products, returning a product, rating new products, or modifying the existing rating of a product). We tested our dynamic clustering algorithm on synthetic dynamic data, and on a data set consisting of real online user ratings for movies. Our algorithm was shown to handle the dynamic nature of data without sacrificing quality compared to a traditional Flock-based clustering algorithm that is re-run from scratch with each change in the data. To support reducing online information overload, we developed a Flock-based recommender system to predict the interests of users, in particular focusing on collaborative filtering or social recommender systems. Our Flock-based recommender algorithm (FlockRecom) iteratively adjusts the position and speed of dynamic flocks of agents, such that each agent represents a user, on a visualization panel.
Then it generates the top-n recommendations for a user based on the ratings of the users that are represented by its neighboring agents. Our recommendation system was tested on a real data set consisting of online user ratings for a set of jokes, and compared to traditional user-based Collaborative Filtering (CF). Our results demonstrated that our recommender system starts performing at the same level of quality as traditional CF, and then, with more iterations for exploration, surpasses CF's recommendation quality in terms of precision and recall. Another unique advantage of our recommendation system compared to traditional CF is its ability to generate more variety or diversity in the set of recommended items. Our contributions advance the state of the art in Flock-based SI for clustering and making predictions in dynamic Web usage data, and therefore have an impact on improving the quality of online services.
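
The recommendation step described above (top-n items drawn from the ratings of users whose agents are neighbors on the panel) can be sketched as follows. This is an illustration under stated assumptions, not the FlockRecom algorithm itself: agent positions are taken as given rather than produced by flock dynamics, neighbors are the `k` nearest agents by Euclidean distance, and items are ranked by the neighbors' mean rating. All names and data are hypothetical.

```python
import math


def top_n_from_neighbors(user, positions, ratings, k=2, n=2):
    """Recommend up to `n` items the user has not rated, ranked by the
    mean rating given by the user's `k` nearest agents.

    Assumption: `positions` holds the final 2-D agent positions; the
    flocking iterations that would produce them are not modeled here.
    """
    others = [u for u in positions if u != user]
    others.sort(key=lambda u: math.dist(positions[user], positions[u]))
    neighbors = others[:k]

    totals, counts = {}, {}
    for neighbor in neighbors:
        for item, score in ratings[neighbor].items():
            if item in ratings[user]:
                continue  # skip items the user already rated
            totals[item] = totals.get(item, 0.0) + score
            counts[item] = counts.get(item, 0) + 1

    averages = {item: totals[item] / counts[item] for item in totals}
    # Highest mean rating first; ties broken alphabetically.
    ranked = sorted(averages, key=lambda item: (-averages[item], item))
    return ranked[:n]


# Hypothetical agent positions and joke ratings.
positions = {
    "ann": (0.0, 0.0),
    "bob": (0.1, 0.0),
    "cat": (0.0, 0.2),
    "dan": (5.0, 5.0),
}
ratings = {
    "ann": {"joke1": 4.0},
    "bob": {"joke1": 5.0, "joke2": 3.0, "joke3": 4.5},
    "cat": {"joke2": 5.0, "joke4": 2.0},
    "dan": {"joke5": 5.0},
}
recommended = top_n_from_neighbors("ann", positions, ratings)
print(recommended)
```

Because "dan" sits far from "ann" on the panel, his ratings do not influence her recommendations in this sketch.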

    A Comprehensive Bibliometric Analysis on Social Network Anonymization: Current Approaches and Future Directions

    In recent decades, social network anonymization has become a crucial research field due to its pivotal role in preserving users' privacy. However, the high diversity of approaches introduced in relevant studies poses a challenge to gaining a profound understanding of the field. In response, the current study presents an exhaustive and well-structured bibliometric analysis of the social network anonymization field. To begin our research, related studies from the period 2007-2022 were collected from the Scopus database and then pre-processed. Following this, VOSviewer was used to visualize the network of authors' keywords. Subsequently, extensive statistical and network analyses were performed to identify the most prominent keywords and trending topics. Additionally, the application of co-word analysis through SciMAT and the Alluvial diagram allowed us to explore the themes of social network anonymization and scrutinize their evolution over time. These analyses culminated in an innovative taxonomy of the existing approaches and anticipation of potential trends in this domain. To the best of our knowledge, this is the first bibliometric analysis in the social network anonymization field, which offers a deeper understanding of the current state and an insightful roadmap for future research in this domain. Comment: 73 pages, 28 figures
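
The co-word analysis mentioned above rests on counting how often pairs of keywords appear in the same document. A minimal sketch follows, normalizing the counts with the equivalence index e_ij = c_ij^2 / (c_i * c_j), a measure commonly used in co-word analysis. Whether the study used exactly this normalization is an assumption, and the keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coword_equivalence(documents):
    """Map each keyword pair to its equivalence index.

    c_i is the number of documents containing keyword i and c_ij the
    number containing both i and j; the index is c_ij**2 / (c_i * c_j).
    """
    occurrences = Counter()
    cooccurrences = Counter()
    for keywords in documents:
        unique = sorted(set(keywords))  # count each keyword once per document
        occurrences.update(unique)
        cooccurrences.update(combinations(unique, 2))
    return {
        pair: count ** 2 / (occurrences[pair[0]] * occurrences[pair[1]])
        for pair, count in cooccurrences.items()
    }


# Hypothetical author keywords for three documents.
documents = [
    ["anonymization", "privacy", "social network"],
    ["anonymization", "privacy"],
    ["privacy", "k-anonymity"],
]
index = coword_equivalence(documents)
for pair, value in sorted(index.items()):
    print(pair, round(value, 3))
```

Pairs are stored with their keywords in alphabetical order, so each pair appears once regardless of the order in which the keywords were listed.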

    Performance Evaluation of Smart Decision Support Systems on Healthcare

    Medical activity requires responsibility not only for clinical knowledge and skill but also for the management of an enormous amount of information related to patient care. It is through proper treatment of information that experts can consistently build a healthy wellness policy. The primary objective for the development of decision support systems (DSSs) is to provide information to specialists when and where it is needed. These systems provide information, models, and data manipulation tools to help experts make better decisions in a variety of situations. Most of the challenges that smart DSSs face come from the great difficulty of dealing with large volumes of information, which is continuously generated by the most diverse types of devices and equipment and requires high computational resources. This situation makes this type of system susceptible to failing to retrieve information quickly enough for decision making. As a result, information quality and the provision of an infrastructure capable of promoting integration and articulation among different health information systems (HIS) become promising research topics in the field of electronic health (e-health) and, for this reason, are addressed in this research. The work described in this thesis is motivated by the need to propose novel approaches to deal with problems inherent to the acquisition, cleaning, integration, and aggregation of data obtained from different sources in e-health environments, as well as their analysis. To ensure the success of data integration and analysis in e-health environments, it is essential that machine-learning (ML) algorithms ensure system reliability. However, in this type of environment, it is not possible to guarantee a fully reliable scenario. This makes smart DSSs susceptible to predictive failures, which severely compromise overall system performance.
On the other hand, systems can have their performance compromised by the overload of information they must support. To solve some of these problems, this thesis presents several proposals and studies on the impact of ML algorithms in the monitoring and management of hypertensive disorders related to high-risk pregnancy. The primary goal of the proposals presented in this thesis is to improve the overall performance of health information systems. In particular, ML-based methods are exploited to improve prediction accuracy and optimize the use of monitoring device resources. It was demonstrated that this type of strategy and methodology contributes to a significant increase in the performance of smart DSSs, not only in precision but also in reducing the computational cost of the classification process. The observed results seek to contribute to advancing the state of the art in AI-based methods and strategies that aim to overcome challenges arising from the integration and performance of smart DSSs. With AI-based algorithms, it is possible to quickly and automatically analyze a larger volume of complex data and focus on more accurate results, providing high-value predictions for better decision making in real time and without human intervention.

    Identification of Military-related Science and Technology

    A proof-of-principle demonstration for extracting military-related technologies from a country's total technology publications has been performed and applied to the Indian science and technology literature. The method is general and can be applied to the extraction of any meta-category (e.g., intelligence-relevant technologies, infrastructure-relevant technologies, etc.) which is not easily obtained from document clustering or factor analysis. The methodology for identifying relevant literature on military science appears to provide credible results. The volume of literature retrieved will vary depending on how strongly relevant the desired literature is. For the same definitions of 'military relevant', the volume of India's literature in the Ei Compendex database was an order of magnitude less than that of the USA or China. Defence Science Journal, 2010, 60(3), pp.259-270, DOI:http://dx.doi.org/10.14429/dsj.60.35

    Applied Metaheuristic Computing

    For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, and facility layout planning, among others. This is partly because the classic exact methods are constrained by prior assumptions, and partly because heuristics are problem-dependent and lack generalization. AMC, on the contrary, guides the course of low-level heuristics to search beyond the local optimality, which impairs the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC.

    Machine learning methods for omics data integration

    High-throughput technologies produce genome-scale transcriptomic and metabolomic (omics) datasets that allow for system-level studies of complex biological processes. The limitation lies in the small number of samples versus the larger number of features represented in these datasets. Machine learning methods can help integrate these large-scale omics datasets and identify key features from each dataset. A novel class-dependent feature selection method integrates the F statistic, maximum relevance binary particle swarm optimization (MRBPSO), and a class-dependent multi-category classification (CDMC) system. A set of highly differentially expressed genes is pre-selected using the F statistic as a filter for each dataset. MRBPSO and CDMC function as a wrapper to select desirable feature subsets for each class and classify the samples using those chosen class-dependent feature subsets. The results indicate that the class-dependent approaches can effectively identify unique biomarkers for each cancer type and improve classification accuracy compared to class-independent feature selection methods. The integration of transcriptomics and metabolomics data is based on a classification framework. Compared to principal component analysis and non-negative matrix factorization based integration approaches, our proposed method achieves 20-30% higher prediction accuracies on Arabidopsis tissue development data. Metabolite-predictive genes and gene-predictive metabolites are selected from transcriptomic and metabolomic data respectively. The constructed gene-metabolite correlation network can infer the functions of unknown genes and metabolites. Tissue-specific genes and metabolites are identified by the class-dependent feature selection method. Evidence from subcellular locations, gene ontology, and biochemical pathways supports the involvement of these entities in different developmental stages and tissues in Arabidopsis.
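
The filter step described above (pre-selecting differentially expressed genes with the F statistic) can be sketched with a one-way ANOVA F statistic computed per feature. This is a minimal illustration that assumes the standard ANOVA definition; the MRBPSO and CDMC wrapper stages are not reproduced, and the gene names, expression values, and cutoff `k` are hypothetical.

```python
from statistics import mean


def f_statistic(values, labels):
    """One-way ANOVA F statistic of one feature across class labels:
    (between-class SS / (g - 1)) / (within-class SS / (n - g))."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand = mean(values)
    n, g = len(values), len(groups)
    between = sum(len(vs) * (mean(vs) - grand) ** 2 for vs in groups.values())
    within = sum((v - mean(vs)) ** 2 for vs in groups.values() for v in vs)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (g - 1)) / (within / (n - g))


def preselect(features, labels, k):
    """Keep the `k` features with the highest F statistic (filter step only)."""
    scores = {name: f_statistic(vals, labels) for name, vals in features.items()}
    return sorted(scores, key=lambda name: -scores[name])[:k]


# Hypothetical expression values for four samples in two classes.
labels = ["A", "A", "B", "B"]
features = {
    "gene1": [1.0, 1.2, 5.0, 5.2],  # strongly class-dependent
    "gene2": [3.0, 3.1, 3.0, 3.1],  # identical across classes
    "gene3": [2.0, 4.0, 3.0, 5.0],  # weak class signal
}
selected = preselect(features, labels, k=2)
print(selected)
```

In a full pipeline, the features kept by this filter would then be passed to a wrapper that searches for class-specific subsets.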