Ramkumar Chinchani
University at Buffalo-SUNY
USA
Aarthie Muthukrishnan
University at Buffalo-SUNY
USA
Madhusudhanan Chandrasekaran
University at Buffalo-SUNY
USA
Shambhu Upadhyaya
University at Buffalo-SUNY
USA
One of the biggest obstacles faced by anomaly detection techniques based on user commands is the paucity of data. Gathering command data is a slow process, often spanning months or years. In this paper, we propose an approach to data generation based on customizable templates, where each template represents a particular user profile. These templates can be either user-defined or created from known data sets. We have developed an automated tool called RACOON, which rapidly generates large amounts of user command data from a given template. We demonstrate that our technique can produce realistic data by showing that the generated data passes several statistical similarity tests against real data. Our approach offers significant advantages over passive data collection: it is non-intrusive and enables rapid generation of environment-specific data. Finally, we report benchmark results for some well-known algorithms on an original data set and on a generated data set.
Keywords: Data Generation, User-Level Intrusion Detection, User Command Data