5 research outputs found
Herding Vulnerable Cats: A Statistical Approach to Disentangle Joint Responsibility for Web Security in Shared Hosting
Hosting providers play a key role in fighting web compromise, but their
ability to prevent abuse is constrained by the security practices of their own
customers. {\em Shared} hosting, offers a unique perspective since customers
operate under restricted privileges and providers retain more control over
configurations. We present the first empirical analysis of the distribution of
web security features and software patching practices in shared hosting
providers, the influence of providers on these security practices, and their
impact on web compromise rates. We construct provider-level features on the
global market for shared hosting -- containing 1,259 providers -- by gathering
indicators from 442,684 domains. Exploratory factor analysis of 15 indicators
identifies four main latent factors that capture security efforts: content
security, webmaster security, web infrastructure security and web application
security. We confirm, via a fixed-effect regression model, that providers exert
significant influence over the latter two factors, which are both related to
the software stack in their hosting environment. Finally, by means of GLM
regression analysis of these factors on phishing and malware abuse, we show
that the four security and software patching factors explain between 10\% and
19\% of the variance in abuse at providers, after controlling for size. For
web-application security for instance, we found that when a provider moves from
the bottom 10\% to the best-performing 10\%, it would experience 4 times fewer
phishing incidents. We show that providers have influence over patch
levels--even higher in the stack, where CMSes can run as client-side
software--and that this influence is tied to a substantial reduction in abuse
levels
Use of Entropy for Feature Selection with Intrusion Detection System Parameters
The metric of entropy provides a measure about the randomness of data and a measure of information gained by comparing different attributes. Intrusion detection systems can collect very large amounts of data, which are not necessarily manageable by manual means. Collected intrusion detection data often contains redundant, duplicate, and irrelevant entries, which makes analysis computationally intensive likely leading to unreliable results. Reducing the data to what is relevant and pertinent to the analysis requires the use of data mining techniques and statistics. Identifying patterns in the data is part of analysis for intrusion detections in which the patterns are categorized as normal or anomalous. Anomalous data needs to be further characterized to determine if representative attacks to the network are in progress. Often time subtleties in the data may be too muted to identify certain types of attacks. Many statistics including entropy are used in a number of analysis techniques for identifying attacks, but these analyzes can be improved upon. This research expands the use of Approximate entropy and Sample entropy for feature selection and attack analysis to identify specific types of subtle attacks to network systems. Through enhanced analysis techniques using entropy, the granularity of feature selection and attack identification is improved