23 research outputs found

    Intrusion Detection in Databases


    Natural Language Processing as a Weapon

    Natural Language Processing (NLP) is a science aimed at computationally interpreting written language. This field is maturing at an extraordinary pace. It is creating significant value and advancing a number of key research fronts. However, it also enables highly sophisticated phishing attacks. Given a large enough text sample, an NLP algorithm can identify and replicate defining characteristics of an individual’s communication patterns. This facilitates programmatic impersonation of trusted individuals. A natural language processor could interpret incoming text messages or email and improvise responses which approximate the language of a known contact. The recipient could be tricked into sharing sensitive information. Just how vulnerable are we? This paper reviews the state of the art of natural language processing and social engineering. It also describes a test which empirically assesses our ability to discern legitimate communications from algorithmically produced forgeries.

    Using Computer Behavior Profiles to Differentiate between Users in a Digital Investigation

    Most digital crime investigations involve finding evidence on a computer and then linking it to a suspect using login information, such as a username and a password. However, login information is often shared or compromised. In such a situation, there needs to be a way to identify the user without relying exclusively on login credentials. This paper introduces the concept that users may show behavioral traits which might provide more information about the user on the computer. This hypothesis was tested by conducting an experiment in which subjects were required to perform common tasks on a computer over multiple sessions. The choices they made to complete each task were recorded. These were converted to a 'behavior profile' corresponding to each login session. Cluster analysis of all the profiles assigned identifiers to each profile such that 98% of profiles were attributed correctly. Also, similarity scores were generated for each session pair to test whether the similarity analysis attributed profiles to the same user or to two different users. Using similarity scores, the user sessions were correctly attributed 93.2% of the time. Sessions were incorrectly attributed to the same user 3.1% of the time and incorrectly attributed to different users 3.7% of the time. At a confidence level of 95%, the average rate of correct attribution for the population was calculated to be between 92.98% and 93.42%. This shows that users show uniqueness and consistency in the choices they make as they complete everyday tasks on a system, and this can be useful to differentiate between them. Keywords: computer behavior users, interaction, investigation, forensics, graphical interface, windows, digital

    A Knowledge-based Clinical Toxicology Consultant for Diagnosing Multiple Exposures

    Objective: This paper presents continued research toward the development of a knowledge-based system for the diagnosis of human toxic exposures. In particular, this research focuses on the challenging task of diagnosing exposures to multiple toxins. Although only 10% of toxic exposures in the United States involve multiple toxins, multiple exposures account for more than half of all toxin-related fatalities. Using simple medical mathematics, we seek to produce a practical decision support system capable of supplying useful information to aid in the diagnosis of complex cases involving multiple unknown substances. Methods: The system is automatically trained using data mining techniques to extract prior probabilities and likelihood ratios from a database managed by the Florida Poison Information Center (FPIC). When supplied with observed clinical effects, the system produces a ranked list of the most plausible toxic exposures. During testing, the system diagnosed toxins at three levels: identifying the substance, identifying the toxin’s major and minor categories, and identifying the toxin’s major category alone. To enable comparison between these three levels, accuracy was calculated as the percentage of exposures correctly identified in the top 10% of trained diagnoses. Results: System evaluation utilized a dataset of 8,901 multiple exposure cases and 37,617 single exposure cases. Initial system testing using only multiple exposure cases yielded poor results, with diagnosis accuracies ranging from 18.5% to 50.1%. Further investigation revealed that the system’s inability to diagnose multiple disorders resulted from insufficient data and that the clinical effects observed in multiple exposures are dominated by a single substance. Including single exposures when training, the system achieved accuracies as high as 83.5% when diagnosing the primary contributors in multiple exposure cases by substance, 86.9% when diagnosing by major and minor categories, and 79.9% when diagnosing by major category alone. Conclusions: Although the system failed to completely diagnose exposures to multiple toxins, the ability to identify the primary contributor in such cases may prove valuable in aiding medical personnel as they seek to diagnose and treat patients. As time passes and more cases are added to the FPIC database, we believe system accuracy will continue to improve, producing a viable decision support system for clinical toxicology.
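
    The abstract above describes ranking candidate exposures from prior probabilities and likelihood ratios. A minimal sketch of one way such a ranking could work is given here, assuming the likelihood ratios for observed clinical effects are combined independently (a naive Bayes style assumption that the abstract does not confirm). The function name, toxin names, clinical effects, and all numbers are hypothetical and are not taken from the FPIC data.

```python
from math import log

def rank_exposures(priors, likelihood_ratios, observed_effects):
    """Rank candidate toxins by posterior log-odds.

    priors: toxin -> prior probability of that exposure (hypothetical values).
    likelihood_ratios: toxin -> {clinical effect -> likelihood ratio}.
    An effect with no recorded ratio is treated as uninformative (LR = 1).
    """
    scores = {}
    for toxin, prior in priors.items():
        # Start from the prior odds, then multiply in one likelihood
        # ratio per observed effect (done additively in log space).
        log_odds = log(prior / (1.0 - prior))
        for effect in observed_effects:
            log_odds += log(likelihood_ratios.get(toxin, {}).get(effect, 1.0))
        scores[toxin] = log_odds
    return sorted(scores, key=scores.get, reverse=True)

# Made-up illustration data.
priors = {"toxin_a": 0.02, "toxin_b": 0.10, "toxin_c": 0.05}
likelihood_ratios = {
    "toxin_a": {"drowsiness": 8.0, "vomiting": 1.5},
    "toxin_b": {"drowsiness": 0.5, "vomiting": 2.0},
    "toxin_c": {"vomiting": 4.0},
}
ranking = rank_exposures(priors, likelihood_ratios, ["drowsiness", "vomiting"])
print(ranking)
```

    In this toy data, toxin_b has the highest prior but is ranked last, because the observed drowsiness counts against it. The sketch only shows how evidence can reorder a prior ranking; it makes no claim about how the published system weights its evidence.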

    Masquerade Attack Detection Using a Search-Behavior Modeling Approach

    Masquerade attacks are unfortunately a familiar security problem that is a consequence of identity theft. Detecting masqueraders is very hard. Prior work has focused on user command modeling to identify abnormal behavior indicative of impersonation. This paper extends prior work by presenting one-class Hellinger distance-based and one-class SVM modeling techniques that use a set of novel features to reveal user intent. The specific objective is to model user search profiles and detect deviations indicating a masquerade attack. We hypothesize that each individual user knows their own file system well enough to search in a limited, targeted and unique fashion in order to find information germane to their current task. Masqueraders, on the other hand, will likely not know the file system and layout of another user's desktop, and would likely search more extensively and broadly, in a manner that differs from that of the victim user being impersonated. We extend prior research that uses UNIX command sequences issued by users as the audit source by relying upon an abstraction of commands. We devise taxonomies of UNIX commands and Windows applications that are used to abstract sequences of user commands and actions. We also gathered our own normal and masquerader data sets captured in a Windows environment for evaluation. The datasets are publicly available for other researchers who wish to study masquerade attacks rather than author identification, as in much of the prior reported work. The experimental results show that modeling search behavior reliably detects all masqueraders with a very low false positive rate of 0.1%, far better than prior published results. The limited set of features used for search behavior modeling also results in huge performance gains over the same modeling techniques that use larger sets of features.
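
    The one-class Hellinger distance idea mentioned above can be illustrated with a short sketch. It assumes a user's behavior is summarized as a frequency distribution over activity categories and that a new window of activity is flagged when its distance from the trained profile exceeds a threshold. The category names, counts, the threshold value, and the helper names are all hypothetical; the paper's actual features and distance normalization may differ.

```python
from math import sqrt

def normalize(counts):
    """Turn raw category counts into a probability distribution."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def hellinger(p, q):
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    keys = set(p) | set(q)
    s = sum((sqrt(p.get(k, 0.0)) - sqrt(q.get(k, 0.0))) ** 2 for k in keys)
    return sqrt(s / 2.0)

def is_masquerade(profile, window, threshold):
    """One-class decision: flag windows that stray too far from the profile."""
    return hellinger(profile, window) > threshold

# Made-up activity counts: the legitimate user searches rarely,
# while the second window is dominated by search activity.
profile = normalize({"search": 2, "edit": 6, "browse": 2})
similar_window = normalize({"search": 3, "edit": 5, "browse": 2})
broad_window = normalize({"search": 9, "edit": 1})

THRESHOLD = 0.3  # hypothetical; would be tuned on the user's own data
print(is_masquerade(profile, similar_window, THRESHOLD))
print(is_masquerade(profile, broad_window, THRESHOLD))
```

    Only the legitimate user's data is needed to build the profile, which is what makes the approach one-class: no examples of masquerader behavior are required at training time.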

    A Knowledge-based Clinical Toxicology Consultant for Diagnosing Single Exposures

    Objective: Every year, toxic exposures kill twelve hundred Americans. To aid in the timely diagnosis and treatment of such exposures, this research investigates the feasibility of a knowledge-based system capable of generating differential diagnoses for human exposures involving unknown toxins. Methods: Data mining techniques automatically extract prior probabilities and likelihood ratios from a database managed by the Florida Poison Information Center. Using observed clinical effects, the trained system produces a ranked list of plausible toxic exposures. The resulting system was evaluated using 30,152 single exposure cases. In addition, the effects of two filters for refining diagnosis, based on a minimum number of exposure cases and a minimum number of clinical effects, were explored. Results: The system achieved accuracies (calculated as the percentage of exposures correctly identified in the top 10% of trained diagnoses) as high as 79.8% when diagnosing by substance and 78.9% when diagnosing by the major and minor categories of toxins. Conclusions: The results of this research are modest, yet promising. No similar systems are currently in use in the United States, and it is hoped that these studies will yield an effective medical decision support system for clinical toxicology.

    Computer Intrusion Detection Through Statistical Analysis and Prediction Modeling

    Information security is very important in today’s society. Computer intrusion is one type of security infraction that poses a threat to all of us. Almost every person in modern parts of the world depends upon automated information. Information systems deliver paychecks on time, manage taxes, transfer funds, deliver important information that enables decisions, and maintain situational awareness in many different ways. Interrupting, corrupting, or destroying this information is a real threat. Computer attackers, often intruders masquerading as authentic users, are the nucleus of this threat. Preventive computer security measures often do not provide enough protection; digital firms need methods to detect attackers who have breached firewalls or other barriers. This thesis explores techniques to detect computer intruders based upon UNIX command usage of authentic users compared against command usage of attackers. The hypothesis is that the computing behavior of authentic users differs from the computing behavior of attackers. In order to explore this hypothesis, seven different variables that measure computing commands are created and utilized to perform predictive modeling to determine the presence or absence of an attacker. This is a classification problem that involves two known groups: intruders and non-intruders. Techniques explored include a proven algorithm published by Matthias Schonlau in [17] and several predictive model variations utilizing the aforementioned seven variables; predictive models include linear discriminant analysis, clustering, kernel partial least squares learning machines