When analyzing (malicious) software, hybrid static-dynamic program analysis techniques help analysts dive into large datasets of potentially evasive malware. A key requirement of these methods is a manually created catalog of patterns or specifications of interesting behaviors.
Due to the rising volume of complex malicious software, automated techniques are needed to build such specifications, present them to the analyst, and create a catalog of matching rules and relevant implementations (e.g., variants).
We present JackDaw, an automatic behavior extractor and semantic tagger that jointly exploits static control-flow analysis and dynamic data-flow analysis on malware samples to find potentially interesting behaviors. It then maps these behaviors, which act as building blocks, to their implementation(s), capturing and modeling the distinct characteristics of each variant’s implementation. Finally, it associates semantic information with the behaviors to create compact, descriptive summaries that help analysts in the first phases of reverse engineering.
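
To make the three steps above concrete, the following is a minimal toy sketch and not JackDaw's actual algorithm. It assumes a hypothetical dynamic trace of API calls annotated with the handles they consume and produce, a hypothetical static map from call sites to the functions that contain them, and a hypothetical keyword table for tagging. All names (`TRACE`, `CALL_SITE_FUNCTION`, `TAGS`, `extract_behaviors`) are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical dynamic trace: (call-site address, API name, input handles, output handle).
TRACE = [
    (0x401000, "CreateFileA", (), "h1"),
    (0x401020, "WriteFile", ("h1",), None),
    (0x402000, "socket", (), "s1"),
    (0x402030, "connect", ("s1",), None),
    (0x402050, "send", ("s1",), None),
]

# Hypothetical static result: call-site address -> enclosing function.
CALL_SITE_FUNCTION = {
    0x401000: "sub_401000",
    0x401020: "sub_401000",
    0x402000: "sub_402000",
    0x402030: "sub_402000",
    0x402050: "sub_402000",
}

# Hypothetical semantic tags, keyed by API name.
TAGS = {
    "CreateFileA": "file",
    "WriteFile": "file",
    "socket": "network",
    "connect": "network",
    "send": "network",
}


def find(parent, x):
    """Return the union-find root of x, compressing the path on the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def extract_behaviors(trace, site_to_func, tags):
    """Group calls that share a handle, then attach functions and tags."""
    parent = list(range(len(trace)))
    first_use = {}  # handle -> index of the first call that touched it
    for i, (_, _, inputs, output) in enumerate(trace):
        handles = list(inputs) + ([output] if output is not None else [])
        for handle in handles:
            if handle in first_use:
                ra, rb = find(parent, first_use[handle]), find(parent, i)
                parent[max(ra, rb)] = min(ra, rb)  # merge the two groups
            else:
                first_use[handle] = i

    groups = defaultdict(list)
    for i in range(len(trace)):
        groups[find(parent, i)].append(i)

    behaviors = []
    for root in sorted(groups):
        members = groups[root]
        apis = [trace[i][1] for i in members]
        behaviors.append({
            "apis": apis,
            "functions": sorted({site_to_func[trace[i][0]] for i in members}),
            "tags": sorted({tags[api] for api in apis if api in tags}),
        })
    return behaviors


if __name__ == "__main__":
    for behavior in extract_behaviors(TRACE, CALL_SITE_FUNCTION, TAGS):
        print(behavior)
```

In this sketch, data-flow links come only from shared handles and the static side is reduced to a lookup table; a real system would recover both from the binary and its execution.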