A non-mobile book review today, about a book that has taught me a lot about things that are quite relevant in mobile, too. After reading Zero Day, about which I blogged here, and after I had to come to the rescue to remove a malicious program from a friend's PC, I decided it was time to learn more about the subject. After doing some research I got an ebook copy of 'Practical Malware Analysis – The Hands-on Guide To Dissecting Malicious Software' by Michael Sikorski and Andrew Honig.
After reading the first paragraphs I was instantly hooked, and the material is well laid out, from the simple to the complex. The authors first discuss static malware analysis, which means having a look at a binary file from the outside with various tools without actually running it. All tools, such as the VirusTotal website, strings, MD5 checksum tools, PEiD, Dependency Walker, Resource Hacker and the IDA Pro disassembler, are available for free on the web.
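To give an idea of what two of the static checks mentioned above boil down to, here is a minimal Python sketch of an MD5 fingerprint and a strings-like scan for printable ASCII runs. It is not taken from the book; the function names, the minimum length of 4 and the sample bytes are my own assumptions, and the real strings tool also handles things like Unicode strings that this sketch ignores.

```python
import hashlib
import re

MIN_LEN = 4  # assumption: shortest run of printable characters worth reporting


def md5_of(data: bytes) -> str:
    """Return the MD5 fingerprint of a blob as a hex string."""
    return hashlib.md5(data).hexdigest()


def ascii_strings(data: bytes, min_len: int = MIN_LEN) -> list[str]:
    """Return all runs of at least min_len printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Made-up bytes standing in for a binary file; nothing is executed, only read.
sample = b"\x00\x01kernel32.dll\x00ab\x00CreateFileA\xff"
print("MD5:", md5_of(sample))
for text in ascii_strings(sample):
    print(text)
```

For a real file, the bytes would come from open(path, "rb").read() instead of the made-up sample. The fingerprint can then be looked up on a site like VirusTotal, and the extracted strings often already hint at which DLLs and API functions a program uses.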
Apart from the main goal of learning how to detect malicious programs and what to do about them, I was especially interested in the chapter on disassembly. It's been a long time since I last worked directly with assembly code, and while I still knew the basics, I looked forward to using the book as a refresher course as well. Just looking at assembly programming from a theoretical point of view, without having something practical in mind to do with it, would not have been much fun. Malicious program analysis was the perfect use case for a refresher on assembly language, processor internals and operating system details.
The second part of the book then deals with dynamic analysis, i.e. actually running the malicious program to see what it does. My recent ventures into the virtual machine space paid off handsomely here, as it's absolutely necessary to run malicious programs in a virtual machine with proper separation of host and guest operating system and separate Internet connectivity, to ensure other hosts on the network can't be touched by the malware in case it decides to go viral. A virtual machine also comes in handy because snapshots of intermediate results can be saved, and a clean environment can be restored after the analysis by simply reverting to an earlier snapshot. Again, all tools for dynamic analysis discussed in the book are freely available on the web.
The book also discusses what C and C++ code looks like in assembly code. For me that was a highly interesting topic even outside of malware analysis, as I always felt that this was kind of the missing link between my knowledge of higher level programming languages and the assembly world. The inheritance part of C++ in particular always had me puzzled as to what it might look like in assembly code. All chapters, including this one, have a learning section with sample code provided, and it was often quite humbling to do the exercises after reading the chapter. It all seemed so clear when reading about it, but the real understanding came when actually doing the exercises and working with the code.
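As a rough illustration of the inheritance part, here is a small Python model of how compilers typically implement C++ virtual functions: each object carries a pointer to a table of function addresses (the vtable), a derived class gets a copy of the base table with some slots overwritten, and a virtual call is an indirect call through a slot. This is my own simplified sketch and not code from the book; all names and slot numbers are made up.

```python
# Hypothetical slot numbers, standing in for fixed offsets into the vtable.
SLOT_SPEAK = 0
SLOT_NAME = 1


def virtual_call(this, slot):
    """Model of an indirect call: load vptr, index the table, call the entry."""
    return this["vptr"][slot](this)


def base_speak(this):
    # Calls name() virtually, so a derived override is picked up at run time.
    return f"{virtual_call(this, SLOT_NAME)} makes a sound"


def base_name(this):
    return "Base"


def derived_name(this):
    return "Derived"


# The base class table, and a derived table that overrides only one slot.
BASE_VTABLE = [base_speak, base_name]
DERIVED_VTABLE = list(BASE_VTABLE)
DERIVED_VTABLE[SLOT_NAME] = derived_name


def new_object(vtable):
    """Model of a constructor storing the vtable pointer in the object."""
    return {"vptr": vtable}


print(virtual_call(new_object(BASE_VTABLE), SLOT_SPEAK))
print(virtual_call(new_object(DERIVED_VTABLE), SLOT_SPEAK))
```

In a disassembler this pattern shows up as a load of the table pointer from the object followed by a call through a register or memory operand, which is why the call target cannot simply be read off the instruction.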
At some point I also started working on real malicious code. The stream to my email inbox supplies fresh samples that get past the network based malware scanner almost daily. With the tools and methods learned, one can quickly see what the malware does, which files it creates, how it ensures that it is started automatically, how it calls home to the command and control server and how it downloads further malicious code. Once the virtual machine was infected, it was also a good test bed to see how my arsenal of virus removal tools dealt with the issue and whether all malicious files were found. Sometimes they were, sometimes they weren't, and only another try a week later with updated virus signatures removed the infection.
The hard part with real malicious programs is disassembling the code or running it in a debugger. All samples I got via email contained a multi-stage packer, which helps the malware hide from antivirus software and also makes analysis of the code a lot harder. Some of the malware contained anti-debugging code which detects that it is being looked at and then does something entirely different. Lots of the packed code I looked at also used only indirect function calls to the Windows API, making it difficult to impossible to analyze statically with a disassembler. All of these things are discussed in the book, but in practice they take a newcomer a lot of time to overcome.
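One common heuristic for spotting packed code before even opening a disassembler is to measure the entropy of the file: compressed or encrypted data looks almost random, while ordinary code and text do not. The Python sketch below computes the Shannon entropy in bits per byte. It is my own minimal example, not code from the book, and any threshold for "this looks packed" is a rule of thumb rather than a fixed value.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Made-up inputs: highly repetitive data versus every byte value exactly once.
print(shannon_entropy(b"A" * 64))
print(shannon_entropy(bytes(range(256))))
```

Values close to 8 bits per byte for a whole file or section suggest compressed or encrypted content, which is a hint, not proof, that a packer is involved.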
Further topics discussed in the book, again including examples to dissect, are user space rootkits, kernel debugging, kernel rootkits, shellcode and 64-bit malware code. The book also goes into the details of how stack overflows are used to infect machines in the first place and discusses countermeasures such as address space randomization and stack execution prevention. These make it harder to exploit a vulnerability, but the book also discusses how black hats have found their way around these countermeasures.
The one thing I was really surprised about, because I had never heard of or seen it before, is how malicious programs run inside other running processes to hide themselves. This is called process injection and it removes the Trojan horse completely from view. One real malware sample I examined copied itself into explorer.exe, and another one spawned a svchost.exe instance and lived in there. There are various methods for doing this, again all described in the book and backed up with sample code that can be analyzed and run for better understanding.
It's been a long review and I still haven't touched on all the points that I found interesting in the book. With some background in programming, Windows and how computers work in general, the book is easy to read, and the example code sections always start with something easy and increase in difficulty towards the end. In other words, a fully recommended read from a malicious code analysis point of view. And if you want to learn more about how operating systems and computers work, looking at malicious code is just the kind of practical application that makes going through the general theory worthwhile.
Before I close, some thoughts on technical books in ebook format vs. print: If I had intended to read it only at home, I would have ordered the print version. However, since I was traveling at the time and wanted to start with this topic right away, I went for the Kindle version. While this was definitely beneficial for where and when I wanted to use it, in terms of instant availability and not needing to carry a full book, I have to say that there's still a lot of room for improvement when reading a technical book on an ebook reader. Quickly jumping from one place in the book to another, going to the table of contents and back, taking notes and generally having a visual idea of where some information might be found are all very hard to come by in the electronic version. I don't know if there will be a perfect middle ground in the future, but the ideal book for me doesn't weigh anything and is instantly available, i.e. downloadable. I own the binary file for a lifetime and nobody can take it away again. And it should be possible to jump through the book like in a print version, combined with text search to find specific content. That would be it. We are still far away from this.
Most of this is unfamiliar to me and I don’t understand it, but I fully agree with your last para.