Chat mining for gender prediction
The aim of this paper is to investigate the feasibility of predicting the gender of a text document's author using linguistic evidence. For this purpose, term- and style-based classification techniques are evaluated over a large collection of chat messages. Prediction accuracies of up to 84.2% are achieved, illustrating the applicability of these techniques to gender prediction. Moreover, the reverse problem is exploited, and the effect of gender on the writing style is discussed. © Springer-Verlag Berlin Heidelberg 2006
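As a rough illustration of what a term-based classifier looks like, the sketch below trains a multinomial naive Bayes model with add-one smoothing on whitespace-separated terms. The abstract does not say which classifiers the paper evaluates, so naive Bayes is a stand-in chosen here for brevity; the function names `train` and `predict`, the placeholder labels `x` and `y`, and the toy messages are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Collect per-label term counts from (text, label) pairs."""
    label_counts = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for term in text.lower().split():
            term_counts[label][term] += 1
            vocab.add(term)
    return label_counts, term_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the text."""
    label_counts, term_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(term_counts[label].values()) + len(vocab)
        for term in text.lower().split():
            if term in vocab:  # terms never seen in training are ignored
                score += math.log((term_counts[label][term] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data with placeholder labels, for illustration only.
toy_samples = [
    ("alpha beta beta", "x"),
    ("beta alpha", "x"),
    ("gamma delta", "y"),
    ("delta delta gamma", "y"),
]
toy_model = train(toy_samples)
```

A style-based variant would keep the same training and prediction steps but replace the term counts with stylistic features such as message length or punctuation frequency.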
Indexing Information for Data Forensics
We introduce novel techniques for organizing the indexing structures of how data is stored so that alterations from an original version can be detected and the changed values specifically identified. We give forensic constructions for several fundamental data structures, including arrays, linked lists, binary search trees, skip lists, and hash tables. Some of our constructions are based on a new reduced-randomness construction for nonadaptive combinatorial group testing.
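To make the group-testing idea concrete, here is a minimal sketch for an array in which at most one value has been altered. It is not the paper's reduced-randomness construction: it uses the textbook bit-pattern design, where test j covers every index whose j-th bit is 1, plus one catch-all test over the whole array. All tests are fixed in advance (nonadaptive), and the set of failing tests spells out the altered index in binary. The names `build_tests`, `fingerprint`, and `locate_change` are hypothetical, and SHA-256 is an assumed choice of digest.

```python
import hashlib


def group_digest(values, members):
    """Hash the (index, value) pairs of one test group, in index order."""
    h = hashlib.sha256()
    for i in members:
        h.update(repr((i, values[i])).encode())
    return h.hexdigest()


def build_tests(n):
    """Fixed test design: one group per index bit, plus a catch-all group."""
    bits = max(1, (n - 1).bit_length())
    tests = [[i for i in range(n) if (i >> j) & 1] for j in range(bits)]
    tests.append(list(range(n)))  # detects a change even at index 0
    return tests


def fingerprint(values):
    """Digests to store alongside the original array."""
    return [group_digest(values, t) for t in build_tests(len(values))]


def locate_change(stored, values):
    """Return None if nothing changed, else the single altered index.

    Assumes the array length is unchanged and at most one value differs.
    """
    current = fingerprint(values)
    if current[-1] == stored[-1]:
        return None
    index = 0
    for j in range(len(stored) - 1):
        if current[j] != stored[j]:
            index |= 1 << j  # a failing bit test means bit j of the index is 1
    return index


original = [10, 20, 30, 40, 50, 60, 70, 80]
stored = fingerprint(original)
tampered = list(original)
tampered[5] = 61
```

With this design an array of n values needs only about log2(n) + 1 stored digests. Identifying several simultaneous changes requires a stronger test design, which is the setting the abstract's group-testing construction addresses.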
Enhancing the Accuracy of Network-based Intrusion Detection with Host-based Context
In the recent past, both network- and host-based approaches to intrusion detection have received much attention in the network security community. No approach, taken exclusively, provides a satisfactory solution: network-based systems are prone to evasion, while host-based solutions suffer from scalability and maintenance problems. In this paper we present an integrated approach, leveraging the best of both worlds: we preserve the advantages of network-based detection, but alleviate its weaknesses by improving the accuracy of the traffic analysis with specific host-based context. Our framework preserves a separation of policy from mechanism, is highly configurable and more flexible than sensor/manager-based architectures, and imposes a low overhead on the involved end hosts. We include a case study of our approach for a notoriously hard problem for purely network-based systems: the correct processing of HTTP requests.