Dynamic Analysis of Executables to Detect and Characterize Malware
Malware detection is needed to ensure the integrity of systems that process sensitive
information and control many aspects of everyday life. We examine the use of
machine learning algorithms to detect malware using the system calls generated
by executables, which alleviates attempts at obfuscation because behavior is monitored
rather than the bytes of an executable. We examine several machine learning
techniques for detecting malware including random forests, deep learning
techniques, and liquid state machines. The experiments examine the effects of
concept drift on each algorithm to understand how well the algorithms
generalize to novel malware samples by testing them on data that was collected
after the training data. The results suggest that each of the examined machine
learning algorithms is a viable solution to detect malware, achieving between
90% and 95% class-averaged accuracy (CAA). In real-world scenarios, the
performance evaluation on an operational network may not match the performance
achieved in training. Namely, the CAA may be about the same, but the values for
precision and recall over the malware can change significantly. We structure
experiments to highlight these caveats and offer insights into expected
performance in operational environments. In addition, we use the induced models
to gain a better understanding about what differentiates the malware samples
from the goodware, which can further be used as a forensics tool to understand
what the malware (or goodware) was doing to provide directions for
investigation and remediation.
Comment: 9 pages, 6 tables, 4 figures
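The gap between CAA and precision noted in the abstract above can be illustrated with a small sketch. It assumes hypothetical sample counts and a fixed per-class recall of 0.95 (neither is taken from the paper) and varies only the malware-to-goodware ratio:

```python
def confusion_counts(n_malware, n_goodware, malware_recall, goodware_recall):
    """Build (tp, fn, tn, fp) from class sizes and per-class recall.

    All inputs are hypothetical illustration values, not results from the paper.
    """
    tp = n_malware * malware_recall
    fn = n_malware - tp
    tn = n_goodware * goodware_recall
    fp = n_goodware - tn
    return tp, fn, tn, fp


def class_averaged_accuracy(tp, fn, tn, fp):
    """Mean of the per-class recalls (malware recall and goodware recall)."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def malware_precision(tp, fp):
    """Fraction of samples flagged as malware that really are malware."""
    return tp / (tp + fp)


# Balanced evaluation set versus a skewed, operational-style mix (both assumed).
balanced = confusion_counts(1000, 1000, 0.95, 0.95)
operational = confusion_counts(100, 10000, 0.95, 0.95)

for name, (tp, fn, tn, fp) in [("balanced", balanced), ("operational", operational)]:
    caa = class_averaged_accuracy(tp, fn, tn, fp)
    prec = malware_precision(tp, fp)
    print(f"{name}: CAA={caa:.3f} precision={prec:.3f}")
```

With per-class recall held fixed, CAA is the same in both settings while precision on malware drops sharply once malware is rare, which is one way operational numbers can diverge from those seen in training.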
Tracking Cyber Adversaries with Adaptive Indicators of Compromise
A forensics investigation after a breach often uncovers network and host
indicators of compromise (IOCs) that can be deployed to sensors to allow early
detection of the adversary in the future. Over time, the adversary will change
tactics, techniques, and procedures (TTPs), which will also change the data
generated. If the IOCs are not kept up-to-date with the adversary's new TTPs,
the adversary will no longer be detected once all of the IOCs become invalid.
Tracking the Known (TTK) is the problem of keeping IOCs, in this case regular
expressions (regexes), up-to-date with a dynamic adversary. Our framework
solves the TTK problem in an automated, cyclic fashion to bracket a previously
discovered adversary. This tracking is accomplished through a data-driven
approach of self-adapting a given model based on its own detection
capabilities.
In our initial experiments, we found that the true positive rate (TPR) of the
adaptive solution degrades much less significantly over time than the naive
solution, suggesting that self-updating the model allows the continued
detection of positives (i.e., adversaries). The cost for this performance is in
the false positive rate (FPR), which increases over time for the adaptive
solution, but remains constant for the naive solution. However, the difference
in overall detection performance, as measured by the area under the curve
(AUC), between the two methods is negligible. This result suggests that
self-updating the model over time should be done in practice to continue to
detect known, evolving adversaries.
Comment: Presented at the 4th Annual Conf. on Computational Science &
Computational Intelligence (CSCI'17), held Dec 14-16, 2017 in Las Vegas,
Nevada, USA
Neurogenesis Deep Learning
Neural machine learning methods, such as deep neural networks (DNN), have
achieved remarkable success in a number of complex data processing tasks. These
methods have arguably had their strongest impact on tasks such as image and
audio processing - data processing domains in which humans have long held clear
advantages over conventional algorithms. In contrast to biological neural
systems, which are capable of learning continuously, deep artificial networks
have a limited ability for incorporating new information in an already trained
network. As a result, methods for continuous learning are potentially highly
impactful in enabling the application of deep networks to dynamic data sets.
Here, inspired by the process of adult neurogenesis in the hippocampus, we
explore the potential for adding new neurons to deep layers of artificial
neural networks in order to facilitate their acquisition of novel information
while preserving previously trained data representations. Our results on the
MNIST handwritten digit dataset and the NIST SD 19 dataset, which includes
lower and upper case letters and digits, demonstrate that neurogenesis is well
suited for addressing the stability-plasticity dilemma that has long challenged
adaptive machine learning algorithms.
Comment: 8 pages, 8 figures, Accepted to 2017 International Joint Conference
on Neural Networks (IJCNN 2017)
The Wide Brown Dwarf Binary Oph 1622-2405 and Discovery of A Wide, Low Mass Binary in Ophiuchus (Oph 1623-2402): A New Class of Young Evaporating Wide Binaries?
We imaged five objects near the star forming clouds of Ophiuchus with the
Keck Laser Guide Star AO system. We resolved Allers et al. (2006)'s #11 (Oph
16222-2405) and #16 (Oph 16233-2402) into binary systems. The #11 object is
resolved into a 243 AU binary, the widest known for a very low mass (VLM)
binary. The binary nature of #11 was discovered first by Allers (2005) and
independently here; in the process we obtained the first spatially resolved R~2000
near-infrared (J & K) spectra, mid-IR photometry, and orbital motion estimates.
We estimate for 11A and 11B gravities (log(g)>3.75), ages (5+/-2 Myr),
luminosities (log(L/Lsun)=-2.77+/-0.10 and -2.96+/-0.10), and temperatures
(Teff=2375+/-175 and 2175+/-175 K). We find self-consistent DUSTY evolutionary
model (Chabrier et al. 2000) masses of 17 (+4/-5) MJup and 14 (+6/-5) MJup for 11A and
11B, respectively. Our masses are higher than those previously reported (13-15
MJup and 7-8 MJup) by Jayawardhana & Ivanov (2006b). Hence, we find the system
is unlikely to be a "planetary mass binary" (in agreement with Luhman et al. 2007),
but it has the second lowest mass and lowest binding energy of any known
binary. Oph #11 and Oph #16 belong to a newly recognized population of wide
(>100 AU), young (<10 Myr), roughly equal mass, VLM stellar and brown dwarf
binaries. We deduce that ~6+/-3% of young (<10 Myr) VLM objects are in such
wide systems. However, only 0.3+/-0.1% of old field VLM objects are found in
such wide systems. Thus, young, wide, VLM binary populations may be
evaporating, due to stellar encounters in their natal clusters, leading to a
field population depleted in wide VLM systems.
Comment: Accepted version V2. Now 13 pages longer (45 total) due to a new
discussion of the stability of the wide brown dwarf binary population, new
summary Figure 17 now included, Astrophysical Journal 2007 in press
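As a rough, order-of-magnitude illustration of the binding energy claim in the abstract above, the sketch below evaluates E = G m1 m2 / a for the quoted masses, treating the 243 AU separation as the semi-major axis. That substitution, the normalization of the formula, and the rounded physical constants are assumptions made here, not values or methods taken from the paper.

```python
# Rounded physical constants in SI units (assumed for illustration).
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_JUP = 1.898e27   # Jupiter mass, kg
AU = 1.496e11      # astronomical unit, m


def binding_energy_joules(m1_mjup, m2_mjup, separation_au):
    """Approximate binding energy G*m1*m2/a, with the separation used as a."""
    m1 = m1_mjup * M_JUP
    m2 = m2_mjup * M_JUP
    a = separation_au * AU
    return G * m1 * m2 / a


# Central mass estimates (17 and 14 MJup) and 243 AU separation from the abstract.
e_bind = binding_energy_joules(17, 14, 243)
print(f"{e_bind:.2e} J = {e_bind * 1e7:.2e} erg")  # 1 J = 1e7 erg
```

Because the energy scales as the product of the masses over the separation, a very low mass pair at a separation of hundreds of AU ends up weakly bound, which is the regime the abstract describes.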
Associations Between Cardiorespiratory Fitness and C-Reactive Protein in Men
Objective - This study examined the association between cardiorespiratory fitness and C-reactive protein (CRP), with adjustment for weight and within weight categories.
Methods and Results - We calculated median and adjusted geometric mean CRP levels, percentages of individuals with an elevated CRP (≥2.00 mg/L), and odds ratios of elevated CRP across 5 levels of cardiorespiratory fitness for 722 men. CRP values were adjusted for age, body mass index, vitamin use, statin medication use, aspirin use, the presence of inflammatory disease, cardiovascular disease, and diabetes, and smoking habit. We found an inverse association of CRP across fitness levels (P for trend <0.001), with the highest adjusted CRP value in the lowest fitness quintile (1.64 [1.27 to 2.11] mg/L) and the lowest adjusted CRP value in the highest fitness quintile (0.70 [0.60 to 0.80] mg/L). Similar results were found for the prevalence of elevated CRP across fitness quintiles. We used logistic regression to model the adjusted odds for elevated CRP and found that compared with the referent first quintile, the second (odds ratio [OR] 0.43, 95% CI 0.22 to 0.85), third (OR 0.33, 95% CI 0.17 to 0.65), fourth (OR 0.23, 95% CI 0.12 to 0.47), and fifth (OR 0.17, 95% CI 0.08 to 0.37) quintiles of fitness had significantly lower odds of elevated CRP. Similar results were found when examining the CRP-fitness relation within categories of body fatness (normal weight, overweight, and obese) and waist girth (<102 or ≥102 cm).
Conclusions - Cardiorespiratory fitness levels were inversely associated with CRP values and the prevalence of elevated CRP values in this sample of men from the Aerobics Center Longitudinal Study
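To make the link between the reported odds ratios and the underlying logistic regression more concrete, the sketch below converts an odds ratio and its 95% CI back to a log-odds coefficient and an approximate standard error. It assumes the intervals are Wald intervals that are symmetric on the log scale (z = 1.96), which the abstract does not state; the function name is hypothetical.

```python
import math


def log_odds_and_se(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the logistic coefficient and its approximate standard error.

    Assumes a Wald-type interval: exp(beta - z*se) to exp(beta + z*se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Second fitness quintile versus the referent first quintile, from the abstract.
beta, se = log_odds_and_se(0.43, 0.22, 0.85)
print(f"beta={beta:.3f} se={se:.3f}")
```

A negative coefficient corresponds to an odds ratio below 1, that is, lower odds of elevated CRP than in the referent quintile.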
Diversity and distribution of the scarab beetle tribe Phanaeini in the northern states of the Brazilian Northeast (Coleoptera: Scarabaeidae: Scarabaeinae)
The fauna of Phanaeini of the northeast of Brazil was investigated through fieldwork in the States of Ceará, Maranhão and Piauí, and through study of preserved material from other states. Seven species of Phanaeini are newly recorded from these three states. Of these, two species are also new records for the northeast region: Phanaeus melibaeus Blanchard and an unidentified Dendropaemon Perty species. A total of 13 new state records are given for eight of the 15 species of Phanaeini recorded from the northeast to date, including three new state genus records. A key is provided for identification of all species. Detailed distributional information is presented together with habitat and bait preferences and other ecological data for each species. The diversity and distribution of the tribe in the northeast is discussed in the context of regional biotopes and wider geographic ranges. The fauna is shown to be more diverse than previously believed, containing both endemic and widespread elements occurring in species assemblages that differ according to habitat type and elevation, leading to substantial complementarity of diversity amongst the main biogeographic provinces and biotopes of the region.
The CMS Integration Grid Testbed
The CMS Integration Grid Testbed (IGT) comprises USCMS Tier-1 and Tier-2
hardware at the following sites: the California Institute of Technology, Fermi
National Accelerator Laboratory, the University of California at San Diego, and
the University of Florida at Gainesville. The IGT runs jobs using the Globus
Toolkit with a DAGMan and Condor-G front end. The virtual organization (VO) is
managed using VO management scripts from the European Data Grid (EDG). Gridwide
monitoring is accomplished using local tools such as Ganglia interfaced into
the Globus Metadata Directory Service (MDS) and the agent based Mona Lisa.
Domain specific software is packaged and installed using the Distribution
After Release (DAR) tool of CMS, while middleware under the auspices of the
Virtual Data Toolkit (VDT) is distributed using Pacman. During a continuous
two month span in Fall of 2002, over 1 million official CMS GEANT based Monte
Carlo events were generated and returned to CERN for analysis while being
demonstrated at SC2002. In this paper, we describe the process that led to one
of the world's first continuously available, functioning grids.
Comment: CHEP 2003 MOCT01