572 research outputs found

    A semantic rule based digital fraud detection

    Get PDF
    Digital fraud has deeply affected both ordinary consumers and the finance industry, and our growing dependence on internet banking has made it a substantial problem. Financial institutions across the globe are trying to improve their fraud detection and deterrence capabilities. Fraud detection is a reactive process that usually incurs a cost to save the system from an ongoing malicious activity, whereas fraud deterrence is the capability of a system to withstand fraudulent attempts in the first place. Deterrence is a challenging task, and researchers worldwide are proposing new solutions to improve it. In this work, we focus on the problem of fraud deterrence and propose an ontology-based financial fraud detection and deterrence model. Our solution uses an Intimation Rule Based (IRB) alert generation algorithm, whose alerts are classified by severity level, backed by a rich domain knowledge base and rule-based reasoning
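The rule-based alert generation described above can be sketched as a small rule engine; the rule names, predicates, and severity levels below are illustrative assumptions, not the paper's actual IRB rules.

```python
# Hypothetical sketch of a rule-based fraud alert generator; rules and
# severity levels are assumptions for illustration only.
SEVERITY = {"low": 1, "medium": 2, "high": 3}

RULES = [
    # (rule name, predicate over a transaction dict, severity)
    ("foreign_ip_login",  lambda t: t.get("ip_country") != t.get("home_country"), "medium"),
    ("large_transfer",    lambda t: t.get("amount", 0) > 10_000,                  "high"),
    ("odd_hour_activity", lambda t: t.get("hour", 12) < 5,                        "low"),
]

def generate_alerts(transaction):
    """Return (rule, severity) alerts, most severe first."""
    alerts = [(name, sev) for name, pred, sev in RULES if pred(transaction)]
    return sorted(alerts, key=lambda a: SEVERITY[a[1]], reverse=True)

tx = {"ip_country": "XX", "home_country": "YY", "amount": 25_000, "hour": 3}
print(generate_alerts(tx))   # large_transfer ranked first
```

A real deterrence system would draw these rules from the ontology and domain knowledge base rather than hard-coding them.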

    Recent Advances in Steganography

    Get PDF
    Steganography is the art and science of communicating in a way that hides the very existence of the communication. Steganographic technologies are an important part of the future of security and privacy on open systems such as the Internet. This book focuses on a relatively new field of study, introducing readers to the core concepts of steganography and steganalysis. It includes a brief history of steganography and surveys steganalysis methods in terms of their modelling techniques. Some new steganography techniques for hiding secret data in images are presented. Furthermore, steganography in speech is reviewed, and a new approach for hiding data in speech signals is introduced
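As a concrete illustration of the kind of image steganography such a book surveys, here is a minimal least-significant-bit (LSB) sketch; the flat pixel list stands in for real image data.

```python
# Minimal LSB steganography sketch: embed message bits into the lowest bit of
# each pixel intensity, then read them back. Toy data, not a specific scheme
# from the book.
def hide(pixels, message: bytes):
    """Embed message bits into the LSBs of pixels (one bit per pixel)."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    assert len(bits) <= len(pixels), "cover image too small"
    stego = pixels[:]
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # overwrite only the lowest bit
    return stego

def reveal(pixels, n_bytes: int):
    """Reassemble n_bytes from the LSBs of the first n_bytes * 8 pixels."""
    out = bytearray()
    for b in range(n_bytes):
        byte = 0
        for i in range(8):
            byte = (byte << 1) | (pixels[b * 8 + i] & 1)
        out.append(byte)
    return bytes(out)

cover = list(range(64))       # 64 toy "pixels"
stego = hide(cover, b"hi")    # 2 bytes = 16 bits
print(reveal(stego, 2))       # b'hi'
```

The change to each pixel is at most 1 intensity level, which is what makes the embedding visually imperceptible and the hidden channel hard to notice.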

    Approximation and Relaxation Approaches for Parallel and Distributed Machine Learning

    Get PDF
    Large scale machine learning requires tradeoffs. Commonly this tradeoff has led practitioners to choose simpler, less powerful models, e.g. linear models, in order to process more training examples in a limited time. In this work, we introduce parallelism to the training of non-linear models by leveraging a different tradeoff: approximation. We demonstrate various techniques by which non-linear models can be made amenable to larger data sets and significantly more training parallelism by strategically introducing approximation into certain optimization steps. For gradient boosted regression tree ensembles, we replace precise selection of tree splits with coarse-grained, approximate split selection, yielding both faster sequential training and a significant increase in parallelism, particularly in the distributed setting. For metric learning with nearest neighbor classification, rather than explicitly training a neighborhood structure, we leverage the implicit neighborhood structure induced by task-specific random forest classifiers, yielding a highly parallel method for metric learning. For support vector machines, we follow existing work to learn a reduced basis set with extremely high parallelism, particularly on GPUs, via existing linear algebra libraries. We believe these optimization tradeoffs are widely applicable wherever machine learning is put into practice in large scale settings. By carefully introducing approximation, we also introduce significantly higher parallelism and consequently can process more training examples for more iterations than competing exact methods. While seemingly learning the model with less precision, this tradeoff often yields noticeably higher accuracy under a restricted training time budget
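The coarse-grained split selection mentioned for gradient boosted trees can be sketched as histogram-style thresholding: only a fixed set of bin edges is evaluated instead of every exact split point. The details below are illustrative, not the thesis's exact algorithm.

```python
# Approximate split selection sketch for a regression tree node: evaluate only
# n_bins - 1 candidate thresholds (histogram bin edges) instead of every exact
# split, trading precision for speed and parallelism.
def approximate_best_split(xs, ys, n_bins=8):
    lo, hi = min(xs), max(xs)
    edges = [lo + (hi - lo) * i / n_bins for i in range(1, n_bins)]

    def sse(vals):
        """Sum of squared errors around the mean of a candidate child node."""
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    return min(edges, key=lambda t: sse([y for x, y in zip(xs, ys) if x <= t])
                                  + sse([y for x, y in zip(xs, ys) if x > t]))

xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]   # clear jump between x = 0.3 and x = 0.7
print(approximate_best_split(xs, ys))
```

Because every worker only needs per-bin statistics, the histograms can be built in parallel across data shards and merged cheaply, which is the source of the distributed speedup the abstract describes.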

    Big data analytics for preventive medicine

    Get PDF
    © 2019, Springer-Verlag London Ltd., part of Springer Nature. Medical data is among the most rewarding and yet most complicated data to analyze. How can healthcare providers use modern data analytics tools and technologies to analyze and create value from complex data? Data analytics promises to efficiently discover valuable patterns in large amounts of unstructured, heterogeneous, non-standard and incomplete healthcare data. It not only forecasts but also supports decision making, and it is increasingly seen as a breakthrough whose goal is to improve the quality of patient care while reducing healthcare costs. The aim of this study is to provide a comprehensive and structured overview of the extensive research on data analytics methods for disease prevention. The review first introduces disease prevention and its challenges, followed by traditional prevention methodologies. We then summarize state-of-the-art data analytics algorithms used for disease classification, clustering (finding unusually high incidence of a particular disease), anomaly detection (detecting disease) and association, together with their respective advantages, drawbacks and guidelines for selecting a specific model, followed by a discussion of recent developments and successful applications of disease prevention methods. The article concludes with open research challenges and recommendations
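One of the tasks such a review covers, flagging unusually high incidence of a disease, can be sketched as simple statistical outlier detection; the region names, rates, and 1.5-sigma threshold below are illustrative assumptions.

```python
# Toy incidence-anomaly sketch: flag regions whose disease rate is far above
# the mean. Data and threshold are illustrative, not from the review.
from statistics import mean, stdev

def unusual_regions(rates, k=1.5):
    """Return regions more than k standard deviations above the mean rate."""
    mu, sigma = mean(rates.values()), stdev(rates.values())
    return [r for r, v in rates.items() if v > mu + k * sigma]

rates = {"region_a": 12.0, "region_b": 11.5, "region_c": 12.4,
         "region_d": 30.2, "region_e": 11.8}   # cases per 10,000 people
print(unusual_regions(rates))                  # ['region_d']
```

Real preventive-medicine pipelines apply far richer models to the heterogeneous data described above, but the output shape is the same: a short list of candidates for early intervention.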

    Detection of Road Conditions Using Image Processing and Machine Learning Techniques for Situation Awareness

    Get PDF
    In this modern era, land transport is increasing dramatically, and self-driving cars equipped with Advanced Driving Assistance Systems (ADAS) are now in public demand. For such cars, detecting road conditions is mandatory. On the other hand, the number of roads cannot grow in step with the number of vehicles, so software is the practical alternative, and a road conditions detection system helps address the issue. To solve this problem, image processing and machine learning have been applied in a project named Detection of Road Conditions Using Image Processing and Machine Learning Techniques for Situation Awareness. Many aspects of road conditions could be considered, but the main focus is on detecting potholes, maintenance signs and lanes. Image processing and machine learning are combined for real-time detection: machine learning is applied to maintenance-sign detection, while image processing is applied to lane and pothole detection. The system marks the lane with colored lines, marks a pothole with a red rectangular box, and displays information about a maintenance sign when one is detected. By observing these cues, the driver can assess the road condition. Situation awareness, in this context, is the ability to perceive information from the surroundings and to make decisions based on the perceived information and on prediction
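The pothole-marking idea can be sketched as thresholding a grayscale frame for dark patches and reporting their bounding box; the synthetic 5x8 frame and threshold value are illustrative assumptions, and a real pipeline would operate on camera frames (e.g. via OpenCV).

```python
# Toy image-processing sketch: potholes often appear as unusually dark patches
# on the road surface, so threshold the frame and return the dark region's
# bounding box (the box a real system would draw in red).
frame = [
    [200, 201, 199, 198, 200, 202, 199, 200],
    [201, 200,  40,  35, 200, 199, 200, 201],
    [199, 198,  38,  42, 201, 200, 198, 199],
    [200, 202, 200, 199, 200, 201, 200, 200],
    [198, 200, 201, 200, 199, 200, 202, 198],
]

def dark_bounding_box(frame, threshold=100):
    """Return (top, left, bottom, right) of pixels darker than threshold, or None."""
    hits = [(r, c) for r, row in enumerate(frame)
                   for c, v in enumerate(row) if v < threshold]
    if not hits:
        return None
    rows, cols = [r for r, _ in hits], [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))

print(dark_bounding_box(frame))   # (1, 2, 2, 3) -> candidate pothole region
```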

    Machine learning applications for the topology prediction of transmembrane beta-barrel proteins

    Get PDF
    The research topic for this PhD thesis is the topology prediction of beta-barrel transmembrane proteins. Transmembrane proteins adopt various conformations that relate to the functions they provide. The two most predominant classes are alpha-helix bundles and beta-barrel transmembrane proteins. Alpha-helix proteins are present in far greater numbers than beta-barrel transmembrane proteins in structure databases, so there is a need for computational tools that can predict and detect the structure of beta-barrel transmembrane proteins. Transmembrane proteins are used for active transport across the membrane and for signal transduction; given the importance of these roles, it is essential to understand their structures. They are also a significant focus for new drug discovery. Transmembrane beta-barrel (TMB) proteins play critical roles in translocation machinery, pore formation, membrane anchoring and ion exchange. In bioinformatics, many years of research have been spent on the topology prediction of transmembrane alpha-helices, while efforts on TMB protein topology prediction have been comparatively overshadowed, and prediction accuracy could be improved with further research. Various methodologies have been developed to predict TMB protein topology, including turn identification, hydrophobicity profiles, rule-based prediction, HMMs (Hidden Markov Models), ANNs (Artificial Neural Networks), radial basis function networks, and combinations of these methods. The use of a cascading classifier has never been fully explored. This research presents and evaluates approaches such as ANNs (Artificial Neural Networks), KNN (K-Nearest Neighbors), SVMs (Support Vector Machines), and a novel cascading-classifier approach to TMB topology prediction. Computer simulations have been implemented in MATLAB, and the results have been evaluated. 
Data were collected from various datasets and pre-processed for each machine learning technique. A deep neural network was built with an input layer, hidden layers, and an output layer. The cascading classifier was optimised mainly by optimising each constituent machine learning algorithm and starting from the parameters that gave each algorithm its best results. The results show that the proposed methodology predicts transmembrane beta-barrel protein topologies with high accuracy for randomly selected proteins. Using the cascading-classifier approach, the best overall accuracy is 76.3%, with a precision of 0.831 and a recall (probability of detection) of 0.799 for TMB topology prediction; this accuracy is achieved with a two-layer cascading classifier. By constructing and using various machine learning frameworks, systems were developed to analyse TMB topologies with significant robustness. We present several experimental findings that may be useful for future research, using the cascading classifier as a novel approach to the topology prediction of TMB proteins
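The cascading-classifier idea can be sketched as a chain of models in which each stage answers only when it is confident; the two toy stages below stand in for the trained ANN/KNN/SVM stages, and the confidence threshold is an assumption.

```python
# Sketch of a cascading classifier: early stages return (label, confidence)
# and only answer when confident; the last stage always answers. Toy stages
# stand in for trained models.
def cascade(stages, x, confidence=0.8):
    """stages: list of callables x -> (label, score in [0, 1])."""
    for predict in stages[:-1]:
        label, score = predict(x)
        if score >= confidence:
            return label          # confident early exit
    return stages[-1](x)[0]       # final stage always answers

stage1 = lambda x: ("membrane", 0.95) if x > 0.7 else ("unknown", 0.4)
stage2 = lambda x: ("membrane", 1.0) if x > 0.5 else ("non-membrane", 1.0)

print(cascade([stage1, stage2], 0.9))   # stage 1 answers confidently
print(cascade([stage1, stage2], 0.3))   # falls through to stage 2
```

Cascades of this shape let a cheap first stage handle the easy residues while harder cases fall through to later, more expensive models, which matches the two-layer design evaluated in the thesis.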

    Text Classification of Installation Support Contract Topic Models for Category Management

    Get PDF
    Air Force Installation Contracting Agency manages nearly 18 percent of total Air Force spend, equating to approximately 57 billion dollars. To improve strategic sourcing, the organization is beginning to categorize installation-support spend and assign accountable portfolio managers to the respective spend categories. A critical task in this new strategic environment is the appropriate categorization of Air Force contracts into newly created, manageable spend categories. It has been recognized that the current composite categories could be further divided into sub-categories by applying text analytics to the contract descriptions. Furthermore, once new categories are established, future contracts must be classified into them in order to be strategically managed. This research proposes a methodological framework for using Latent Dirichlet Allocation to sculpt categories from the natural distribution of contract topics, and assesses the suitability of supervised classification algorithms such as Support Vector Machines, Random Forests, and Weighted K-Nearest Neighbors for classifying future unseen contracts. The results suggest a significant improvement of the modeled spend categories over the existing categories, facilitating more accurate classification of unseen contracts into their respective sub-categories
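The Weighted K-Nearest Neighbors step for assigning an unseen contract description to a spend category can be sketched over bag-of-words vectors; the toy descriptions and category names below are assumptions, not the actual contract data.

```python
# Weighted KNN sketch for text classification: neighbors vote with weights
# equal to their cosine similarity to the query description.
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter):
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def weighted_knn(train, text, k=3):
    """train: list of (description, category). Vote weighted by similarity."""
    q = Counter(text.lower().split())
    sims = sorted(((cosine(q, Counter(d.lower().split())), c) for d, c in train),
                  reverse=True)[:k]
    votes = Counter()
    for s, c in sims:
        votes[c] += s
    return votes.most_common(1)[0][0]

train = [("grounds maintenance mowing landscaping", "facilities"),
         ("lawn mowing service base grounds", "facilities"),
         ("network router switch installation", "it_services")]
print(weighted_knn(train, "base lawn mowing contract"))   # 'facilities'
```

In the proposed framework the training labels would come from the LDA-derived sub-categories rather than being hand-assigned as here.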

    Detecting feature influences to quality attributes in large and partially measured spaces using smart sampling and dynamic learning

    Get PDF
    Emergent application domains (e.g., Edge Computing/Cloud/B5G systems) are too complex to build manually. They are characterised by high variability and are modelled by large Variability Models (VMs), leading to large configuration spaces. Due to the high number of variants present in such systems, it is challenging to find the best-ranked product with respect to particular Quality Attributes (QAs) in a short time. Moreover, measuring QAs is sometimes not trivial, requiring a lot of time and resources, as is the case for the energy footprint of software systems, the focus of this paper. Hence, we need a mechanism to analyse how features and their interactions influence energy footprint without measuring all configurations. While practical, sampling and predictive techniques base their accuracy on uniform spaces or some initial domain knowledge, which are not always achievable. Indeed, analysing the energy footprint of products in large configuration spaces raises specific requirements that we explore in this work. This paper presents SAVRUS (Smart Analyser of Variability Requirements in Unknown Spaces), an approach for sampling and dynamic statistical learning over large and partially QA-measured spaces that does not rely on initial domain knowledge. SAVRUS reports the degree to which features and pairwise interactions influence a particular QA, such as energy efficiency. We validate and evaluate SAVRUS with a selection of comparable systems, which define large search spaces containing scattered measurements. Funding for open access charge: Universidad de Málaga / CBUA. This work is supported by the European Union's H2020 research and innovation programme under grant agreement DAEMON H2020-101017109, by the projects IRIS PID2021-122812OB-I00 (co-financed by FEDER funds), Rhea P18-FR-1081 (MCI/AEI/FEDER, UE), and LEIA UMA18-FEDERIA-157, and the PRE2019-087496 grant from the Ministerio de Ciencia e Innovación, Spain
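One simple way to estimate a feature's influence on a QA from a partially measured configuration space, in the spirit of what SAVRUS reports, is to compare the mean measured energy of configurations with and without the feature; the configurations and energy values below are illustrative, and this is not SAVRUS's actual algorithm.

```python
# Illustrative feature-influence estimate from scattered measurements:
# mean QA value with a feature enabled minus mean QA value without it.
from statistics import mean

# (configuration as a set of enabled features, measured energy in joules)
measurements = [
    ({"cache", "compress"},        10.0),
    ({"cache"},                     6.0),
    ({"compress"},                  9.0),
    (set(),                         5.0),
    ({"cache", "compress", "tls"}, 12.0),
]

def influence(feature):
    """Positive values mean the feature tends to raise the energy footprint."""
    with_f  = [e for cfg, e in measurements if feature in cfg]
    without = [e for cfg, e in measurements if feature not in cfg]
    return mean(with_f) - mean(without)

print(round(influence("compress"), 2))   # 4.83
```

Extending the same contrast to feature pairs gives a rough pairwise-interaction score; the challenge SAVRUS tackles is doing this reliably when the space is huge and only sparsely measured.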

    Detecting Social Spamming on Facebook Platform

    Get PDF
    OSNs (Online Social Networks) dominate human interaction nowadays, easing communication and the spread of news on one hand, and providing a globally fertile soil for all kinds of social spamming on the other. The Facebook platform, with its 2 billion active users, is currently at the top of the spammers' targets. Its users face different kinds of social threats every day, including malicious links, profanity, hate speech, revenge porn and others. Although many researchers have presented techniques to defeat spam on social media, especially on the Twitter platform, very few have targeted Facebook's. To fight continuously evolving spam techniques, we have to constantly develop and enhance spam detection methods. This research digs deeper into the Facebook platform, through 10 implemented honeypots, to state the challenges that slow the spam detection process and the ways to overcome them. Using all the given inputs, including previous techniques tested on other social media platforms along with observations drawn from the honeypots, the final product is a classifier that distinguishes spammer profiles from legitimate ones through data mining and machine learning techniques. To achieve this, the research first overviews the main challenges and limitations that obstruct the spam detection process and presents the related research with its results. It then outlines the implementation steps, from constructing the honeypots, through data collection and preparation, to building the classifier itself. Finally, it presents the observations drawn from the honeypots together with the results from the classifier, and validates them against results from previous research on other social platforms. 
    The main contribution of this thesis is the final classifier, which is able to distinguish legitimate Facebook profiles from spammer ones. The originality of the research lies in its aim to detect all kinds of social spammers: not only malware-spreading spammers, but spammers in the general sense, e.g. those spreading profanity, bulk messages and unapproved content
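The final classifier can be sketched as a weighted scoring of profile features; the features, weights, and threshold below are illustrative assumptions, whereas the thesis learns the decision boundary from honeypot-collected data instead of hand-coding it.

```python
# Hypothetical linear-scoring sketch of a spammer-profile classifier. Feature
# names, weights, bias, and threshold are illustrative assumptions only.
WEIGHTS = {"posts_per_day": 0.15, "link_ratio": 2.0, "friend_requests_sent": 0.01}
BIAS, THRESHOLD = -2.0, 0.0

def is_spammer(profile):
    """Score a profile's features and flag it when the score passes the threshold."""
    score = BIAS + sum(WEIGHTS[f] * profile.get(f, 0.0) for f in WEIGHTS)
    return score > THRESHOLD

bot   = {"posts_per_day": 40, "link_ratio": 0.9, "friend_requests_sent": 300}
human = {"posts_per_day": 2,  "link_ratio": 0.1, "friend_requests_sent": 5}
print(is_spammer(bot), is_spammer(human))   # True False
```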