
    Dynamic Analysis of Executables to Detect and Characterize Malware

    Malware detection is needed to ensure the integrity of systems that process sensitive information and control many aspects of everyday life. We examine the use of machine learning algorithms to detect malware using the system calls generated by executables, which mitigates attempts at obfuscation because the behavior is monitored rather than the bytes of an executable. We examine several machine learning techniques for detecting malware, including random forests, deep learning techniques, and liquid state machines. The experiments examine the effects of concept drift on each algorithm to understand how well the algorithms generalize to novel malware samples by testing them on data that was collected after the training data. The results suggest that each of the examined machine learning algorithms is a viable solution for detecting malware, achieving between 90% and 95% class-averaged accuracy (CAA). In real-world scenarios, the performance evaluation on an operational network may not match the performance achieved in training. Namely, the CAA may be about the same, but the values for precision and recall over the malware can change significantly. We structure experiments to highlight these caveats and offer insights into expected performance in operational environments. In addition, we use the induced models to gain a better understanding of what differentiates the malware samples from the goodware, which can further be used as a forensics tool to understand what the malware (or goodware) was doing and to provide directions for investigation and remediation. Comment: 9 pages, 6 tables, 4 figures.
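    The caveat about CAA versus precision can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the function name `metrics`, the per-class rates (0.95 and 0.90), and the sample counts are all assumptions chosen for illustration. It holds the per-class rates fixed and changes only the malware/goodware mix, so recall stays constant here; under real concept drift recall could move as well.

```python
def metrics(n_malware, n_goodware, tpr, tnr):
    """Return (CAA, precision, recall) for the malware class.

    tpr: fraction of malware flagged as malware (assumed value).
    tnr: fraction of goodware passed as goodware (assumed value).
    """
    tp = tpr * n_malware          # malware correctly flagged
    fn = n_malware - tp           # malware missed
    tn = tnr * n_goodware         # goodware correctly passed
    fp = n_goodware - tn          # goodware wrongly flagged
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    caa = 0.5 * (recall + specificity)   # class-averaged accuracy
    precision = tp / (tp + fp)
    return caa, precision, recall


# Hypothetical balanced evaluation set: 1000 malware, 1000 goodware.
balanced = metrics(1000, 1000, tpr=0.95, tnr=0.90)

# Hypothetical operational mix: 1% malware among 10,000 executables.
operational = metrics(100, 9900, tpr=0.95, tnr=0.90)

print("balanced    CAA=%.3f precision=%.3f recall=%.3f" % balanced)
print("operational CAA=%.3f precision=%.3f recall=%.3f" % operational)
```

    With these assumed numbers the CAA is identical in both settings, while precision over the malware class drops sharply once goodware dominates, because the same false positive rate is applied to far more benign samples.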

    Tracking Cyber Adversaries with Adaptive Indicators of Compromise

    A forensics investigation after a breach often uncovers network and host indicators of compromise (IOCs) that can be deployed to sensors to allow early detection of the adversary in the future. Over time, the adversary will change tactics, techniques, and procedures (TTPs), which will also change the data generated. If the IOCs are not kept up-to-date with the adversary's new TTPs, the adversary will no longer be detected once all of the IOCs become invalid. Tracking the Known (TTK) is the problem of keeping IOCs, in this case regular expressions (regexes), up-to-date with a dynamic adversary. Our framework solves the TTK problem in an automated, cyclic fashion to bracket a previously discovered adversary. This tracking is accomplished through a data-driven approach of self-adapting a given model based on its own detection capabilities. In our initial experiments, we found that the true positive rate (TPR) of the adaptive solution degrades much less significantly over time than the naive solution, suggesting that self-updating the model allows the continued detection of positives (i.e., adversaries). The cost for this performance is in the false positive rate (FPR), which increases over time for the adaptive solution, but remains constant for the naive solution. However, the difference in overall detection performance, as measured by the area under the curve (AUC), between the two methods is negligible. This result suggests that self-updating the model over time should be done in practice to continue to detect known, evolving adversaries. Comment: This was presented at the 4th Annual Conf. on Computational Science & Computational Intelligence (CSCI'17) held Dec 14-16, 2017 in Las Vegas, Nevada, US
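    The contrast between a fixed regex set and a self-updating one can be sketched with a toy loop. This is not the paper's framework: the record format, the `tokens` extractor, the update rule (add every field value seen in a detected record as a new literal regex), and all sample data are hypothetical, chosen only to show the trade-off the abstract describes.

```python
import re


def detect(patterns, record):
    """Return True if any indicator regex matches the record."""
    return any(p.search(record) for p in patterns)


def tokens(record):
    """Hypothetical feature extractor: values of key=value fields."""
    return [f.split("=", 1)[1] for f in record.split() if "=" in f]


def run(batches, seed_patterns, adaptive):
    """Scan time-ordered batches; optionally grow the regex set from hits."""
    patterns = [re.compile(p) for p in seed_patterns]
    hits_per_batch = []
    for batch in batches:
        hits = [rec for rec in batch if detect(patterns, rec)]
        hits_per_batch.append(hits)
        if adaptive:
            known = {p.pattern for p in patterns}
            for rec in hits:
                for tok in tokens(rec):
                    pat = re.escape(tok)   # treat the token as a literal
                    if pat not in known:
                        known.add(pat)
                        patterns.append(re.compile(pat))
    return hits_per_batch


# Made-up traffic: the adversary changes one field per batch.
batches = [
    ["dst=evil1.example ua=BadAgent/1.0", "dst=news.example ua=Browser/5.0"],
    ["dst=evil2.example ua=BadAgent/1.0", "dst=news.example ua=Browser/5.0"],
    # Second record is benign but shares the stale user agent.
    ["dst=evil2.example ua=WorseAgent/2.0", "dst=mail.example ua=BadAgent/1.0"],
]
seed = [r"evil1\.example"]

naive_counts = [len(h) for h in run(batches, seed, adaptive=False)]
adaptive_counts = [len(h) for h in run(batches, seed, adaptive=True)]
print("naive:", naive_counts)
print("adaptive:", adaptive_counts)
```

    In this toy data the fixed set loses the adversary after the first batch, while the adaptive set keeps following it but also flags a benign record in the last batch, mirroring the TPR gain and FPR cost reported above.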

    Neurogenesis Deep Learning

    Neural machine learning methods, such as deep neural networks (DNN), have achieved remarkable success in a number of complex data processing tasks. These methods have arguably had their strongest impact on tasks such as image and audio processing, data processing domains in which humans have long held clear advantages over conventional algorithms. In contrast to biological neural systems, which are capable of learning continuously, deep artificial networks have a limited ability to incorporate new information into an already trained network. As a result, methods for continuous learning are potentially highly impactful in enabling the application of deep networks to dynamic data sets. Here, inspired by the process of adult neurogenesis in the hippocampus, we explore the potential for adding new neurons to deep layers of artificial neural networks in order to facilitate their acquisition of novel information while preserving previously trained data representations. Our results on the MNIST handwritten digit dataset and the NIST SD 19 dataset, which includes lower and upper case letters and digits, demonstrate that neurogenesis is well suited for addressing the stability-plasticity dilemma that has long challenged adaptive machine learning algorithms. Comment: 8 pages, 8 figures, Accepted to 2017 International Joint Conference on Neural Networks (IJCNN 2017)
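    One simple way to picture "adding new neurons while preserving trained representations" is to widen a hidden layer without disturbing the function the network already computes. The sketch below is an assumption-laden illustration, not the paper's algorithm: the two-layer ReLU network, the helper `add_neurons`, and the choice to give new units zero outgoing weights are all hypothetical.

```python
import numpy as np


def forward(x, W1, b1, W2, b2):
    """Two-layer network: ReLU hidden layer followed by a linear output."""
    h = np.maximum(0.0, x @ W1 + b1)
    return h @ W2 + b2


def add_neurons(W1, b1, W2, k, rng, scale=0.01):
    """Append k hidden units.

    New incoming weights are small random values; new outgoing weights are
    zero (an assumed initialization), so the output is unchanged until the
    new units are trained.
    """
    n_in, n_out = W1.shape[0], W2.shape[1]
    W1_new = np.hstack([W1, scale * rng.standard_normal((n_in, k))])
    b1_new = np.concatenate([b1, np.zeros(k)])
    W2_new = np.vstack([W2, np.zeros((k, n_out))])
    return W1_new, b1_new, W2_new


rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # 4 inputs, 3 hidden units
b1 = np.zeros(3)
W2 = rng.standard_normal((3, 2))   # 2 outputs
b2 = np.zeros(2)
x = rng.standard_normal((5, 4))    # batch of 5 made-up inputs

before = forward(x, W1, b1, W2, b2)
W1g, b1g, W2g = add_neurons(W1, b1, W2, k=2, rng=rng)
after = forward(x, W1g, b1g, W2g, b2)
print("hidden units:", W1.shape[1], "->", W1g.shape[1])
print("outputs preserved:", np.allclose(before, after))
```

    Training would then update the new units on novel data, typically while limiting changes to the old weights; how that is done is where actual methods differ.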

    The Wide Brown Dwarf Binary Oph 1622-2405 and Discovery of A Wide, Low Mass Binary in Ophiuchus (Oph 1623-2402): A New Class of Young Evaporating Wide Binaries?

    We imaged five objects near the star-forming clouds of Ophiuchus with the Keck Laser Guide Star AO system. We resolved objects #11 (Oph 16222-2405) and #16 (Oph 16233-2402) of Allers et al. (2006) into binary systems. Object #11 is resolved into a 243 AU binary, the widest known for a very low mass (VLM) binary. The binary nature of #11 was discovered first by Allers (2005) and independently here, in observations during which we obtained the first spatially resolved R~2000 near-infrared (J & K) spectra, mid-IR photometry, and orbital motion estimates. For 11A and 11B we estimate gravities (log(g)>3.75), ages (5+/-2 Myr), luminosities (log(L/Lsun)=-2.77+/-0.10 and -2.96+/-0.10), and temperatures (Teff=2375+/-175 and 2175+/-175 K). We find self-consistent DUSTY evolutionary model (Chabrier et al. 2000) masses of 17 (+4/-5) MJup and 14 (+6/-5) MJup for 11A and 11B, respectively. Our masses are higher than those previously reported (13-15 MJup and 7-8 MJup) by Jayawardhana & Ivanov (2006b). Hence, we find that the system is unlikely to be a "planetary mass binary" (in agreement with Luhman et al. 2007), but it has the second lowest mass and lowest binding energy of any known binary. Oph #11 and Oph #16 belong to a newly recognized population of wide (>100 AU), young (<10 Myr), roughly equal mass, VLM stellar and brown dwarf binaries. We deduce that ~6+/-3% of young (<10 Myr) VLM objects are in such wide systems. However, only 0.3+/-0.1% of old field VLM objects are found in such wide systems. Thus, young, wide, VLM binary populations may be evaporating, due to stellar encounters in their natal clusters, leading to a field population depleted in wide VLM systems. Comment: Accepted version V2. Now 13 pages longer (45 total) due to a new discussion of the stability of the wide brown dwarf binary population, new summary Figure 17 now included, Astrophysical Journal 2007 in press

    Associations Between Cardiorespiratory Fitness and C-Reactive Protein in Men

    Objective - This study examined the association between cardiorespiratory fitness and C-reactive protein (CRP), with adjustment for weight and within weight categories. Methods and Results - We calculated median and adjusted geometric mean CRP levels, percentages of individuals with an elevated CRP (≥2.00 mg/L), and odds ratios of elevated CRP across 5 levels of cardiorespiratory fitness for 722 men. CRP values were adjusted for age, body mass index, vitamin use, statin medication use, aspirin use, the presence of inflammatory disease, cardiovascular disease, and diabetes, and smoking habit. We found an inverse association of CRP across fitness levels (P for trend <0.001), with the highest adjusted CRP value in the lowest fitness quintile (1.64 [1.27 to 2.11] mg/L) and the lowest adjusted CRP value in the highest fitness quintile (0.70 [0.60 to 0.80] mg/L). Similar results were found for the prevalence of elevated CRP across fitness quintiles. We used logistic regression to model the adjusted odds for elevated CRP and found that compared with the referent first quintile, the second (odds ratio [OR] 0.43, 95% CI 0.22 to 0.85), third (OR 0.33, 95% CI 0.17 to 0.65), fourth (OR 0.23, 95% CI 0.12 to 0.47), and fifth (OR 0.17, 95% CI 0.08 to 0.37) quintiles of fitness had significantly lower odds of elevated CRP. Similar results were found when examining the CRP-fitness relation within categories of body fatness (normal weight, overweight, and obese) and waist girth (<102 or ≥102 cm). Conclusions - Cardiorespiratory fitness levels were inversely associated with CRP values and the prevalence of elevated CRP values in this sample of men from the Aerobics Center Longitudinal Study.
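    For readers unfamiliar with how an odds ratio and its 95% confidence interval are read, the sketch below computes an unadjusted odds ratio from a 2x2 table using the standard log-odds (Woolf) interval. It does not reproduce the study's analysis, which used logistic regression with covariate adjustment; the function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-based) confidence interval.

    a: exposed group, outcome present     b: exposed group, outcome absent
    c: reference group, outcome present   d: reference group, outcome absent
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT from the study: 10 of 100 men in a fitter group
# and 30 of 100 men in the reference group have elevated CRP.
or_, lower, upper = odds_ratio_ci(a=10, b=90, c=30, d=70)
print("OR = %.2f, 95%% CI %.2f to %.2f" % (or_, lower, upper))
```

    An interval that lies entirely below 1, as in the quintile comparisons quoted above, indicates significantly lower odds than the reference group at the 5% level.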

    Diversity and distribution of the scarab beetle tribe Phanaeini in the northern states of the Brazilian Northeast (Coleoptera: Scarabaeidae: Scarabaeinae)

    The fauna of Phanaeini of the northeast of Brazil was investigated through fieldwork in the States of Ceará, Maranhão and Piauí, and through study of preserved material from other states. Seven species of Phanaeini are newly recorded from these three states. Of these, two species are also new records for the northeast region: Phanaeus melibaeus Blanchard and an unidentified Dendropaemon Perty species. A total of 13 new state records are given for eight of the 15 species of Phanaeini recorded from the northeast to date, including three new state genus records. A key is provided for identification of all species. Detailed distributional information is presented together with habitat and bait preferences and other ecological data for each species. The diversity and distribution of the tribe in the northeast is discussed in the context of regional biotopes and wider geographic ranges. The fauna is shown to be more diverse than previously believed, containing both endemic and widespread elements occurring in species assemblages that differ according to habitat type and elevation, leading to substantial complementarity of diversity amongst the main biogeographic provinces and biotopes of the region.

    The CMS Integration Grid Testbed

    The CMS Integration Grid Testbed (IGT) comprises USCMS Tier-1 and Tier-2 hardware at the following sites: the California Institute of Technology, Fermi National Accelerator Laboratory, the University of California at San Diego, and the University of Florida at Gainesville. The IGT runs jobs using the Globus Toolkit with a DAGMan and Condor-G front end. The virtual organization (VO) is managed using VO management scripts from the European Data Grid (EDG). Gridwide monitoring is accomplished using local tools such as Ganglia interfaced into the Globus Metadata Directory Service (MDS) and the agent-based Mona Lisa. Domain-specific software is packaged and installed using the Distribution After Release (DAR) tool of CMS, while middleware under the auspices of the Virtual Data Toolkit (VDT) is distributed using Pacman. During a continuous two-month span in Fall of 2002, over 1 million official CMS GEANT-based Monte Carlo events were generated and returned to CERN for analysis while being demonstrated at SC2002. In this paper, we describe the process that led to one of the world's first continuously available, functioning grids. Comment: CHEP 2003 MOCT01