3 research outputs found
High Performance Attack Estimation in Large-Scale Network Flows
Network-based attacks are the major threat to security on the Internet. The volume of traffic and the high variability of the attacks place threat detection squarely in the domain of big data. Conventional approaches are mostly signature-based. While these are computationally inexpensive, they are inflexible and insensitive to small variations in the attack vector. We therefore explored the use of machine learning techniques on real flow data. We found that benign traffic could be identified with high accuracy
Illicit Activity Detection in Large-Scale Dark and Opaque Web Social Networks
Many online chat applications live in a grey area between the legitimate web and the dark net. The Telegram network in particular can aid criminal activities. Telegram hosts “chats”, which consist of varied conversations and advertisements. These chats take place among automated “bots” and human users. Distinguishing legitimate from illegitimate activity can aid law enforcement in finding criminals. Social network analysis of Telegram chats presents a difficult problem. Users can change their username or create new accounts, and users involved in criminal activity often do so to obscure their identity. This makes it challenging to establish the unique identity behind a given username. We therefore explored classifying users by the language they use in their chat messages. The volume and velocity of Telegram chat data place it well within the domain of big data. Machine learning and natural language processing (NLP) tools are necessary to classify this chat data. We developed NLP tools for classifying users and the chat group to which their messages belong. We found that legitimate and illegitimate chat groups could be classified with high accuracy. We were also able to classify bots, humans, and advertisements within conversations
Localization of a novel melanoma susceptibility locus to 1p22
Over the past 20 years, the incidence of cutaneous malignant melanoma (CMM) has increased dramatically worldwide. A positive family history of the disease is among the most established risk factors for CMM; it is estimated that 10% of CMM cases result from an inherited predisposition. Although mutations in two genes, CDKN2A and CDK4, have been shown to confer an increased risk of CMM, they account for only 20%-25% of families with multiple cases of CMM. Therefore, to localize additional loci involved in melanoma susceptibility, we performed a genomewide scan for linkage in 49 Australian pedigrees containing at least three CMM cases, in which CDKN2A and CDK4 involvement had been excluded. The highest two-point parametric LOD score (1.82; recombination fraction theta = 0.2) was obtained at D1S2726, which maps to the short arm of chromosome 1 (1p22). A parametric LOD score of 4.65 (theta = 0) and a nonparametric LOD score of 4.19 were found at D1S2779 in nine families selected for early age at onset. Additional typing yielded seven adjacent markers with LOD scores >3 in this subset, with the highest parametric LOD score, 4.95 (theta = 0) (nonparametric LOD score 5.37), at D1S2776. Analysis of 33 additional multiplex families with CMM from several continents provided further evidence for linkage to the 1p22 region, again strongest in families with the earliest mean age at diagnosis. A nonparametric ordered sequential analysis was used, based on the average age at diagnosis in each family. The highest LOD score, 6.43, was obtained at D1S2779 and occurred when the 15 families with the earliest ages at onset were included. These data provide significant evidence of a novel susceptibility gene for CMM located within chromosome band 1p22
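For readers unfamiliar with LOD scores, the following is a minimal sketch of the textbook two-point LOD for the simplest case of phase-known, fully informative meioses: the base-10 log of the likelihood at recombination fraction theta divided by the likelihood under no linkage (theta = 0.5). It is not the pedigree-based analysis used in the study, which also accounts for penetrance, unknown phase, and missing genotypes. The function name and its parameters are hypothetical and chosen only for illustration.

```python
import math


def two_point_lod(theta: float, n_meioses: int, n_recombinants: int) -> float:
    """Two-point LOD score for phase-known meioses (illustrative helper).

    LOD(theta) = log10( L(theta) / L(0.5) ), where
    L(theta) = theta^k * (1 - theta)^(n - k) for k recombinants in n meioses.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("n_recombinants must lie in [0, n_meioses]")

    n_non_recombinants = n_meioses - n_recombinants
    # Likelihood of the observed meioses at the given recombination fraction.
    likelihood = theta ** n_recombinants * (1.0 - theta) ** n_non_recombinants
    # Likelihood under the null hypothesis of no linkage (theta = 0.5).
    null_likelihood = 0.5 ** n_meioses

    if likelihood == 0.0:
        # Any observed recombinant is impossible at theta = 0.
        return float("-inf")
    return math.log10(likelihood / null_likelihood)
```

Under these simplifying assumptions, ten meioses with no recombinants give `two_point_lod(0.0, 10, 0)` equal to 10 × log10(2), about 3.01, which is just above the conventional threshold of 3 for declaring linkage.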