New release: 0.8.2

Sun 08 May 2022 malcat team news

Today we are happy to announce the release of version 0.8.2! It has been almost 2 months since the last big release (0.8.0). As the software grows and gains maturity, releases will happen less often but pack more content. So what are the improvements of this new version?

  • New analysis module: fast stack/memory strings detection for x86 and x64
  • ISO+UDF file system support (.iso and .img files)
  • Py2Exe python scripts support
  • LZMA streams detection and unpacking support
  • Several user interface improvements
  • The usual: bug fixing and updating of the Yara and anomaly rule sets

Fast memory strings detection

It is nothing new: malware and shellcode like to obfuscate their strings by constructing them dynamically on the stack or in memory. While Malcat's string detection algorithms are relatively good, such dynamic strings were completely missed.

That was until today. In Malcat 0.8.2, we have added a new analysis pass for all x86 and x64 executables which recovers dynamically constructed strings. These strings are then listed among all other found strings as "DYN" strings:

Figure 1: Dynamic strings in string view

So how does it work? Usually, stack string detection is performed using one of these two techniques:

  • Pattern matching (i.e. regexp) of standard stack-string construction patterns: very fast, but not very precise
  • Binary emulation (like the excellent FLOSS from FireEye): very powerful but quite slow

Since Malcat is and should stay a fast analysis tool, we tried to reach a compromise between those two techniques, combining the best of both worlds:

  • We identify interesting basic blocks using a fast heuristic
  • We emulate those basic blocks using a very simplified x86 emulator supporting only a handful of instructions
  • We scan memory at the end of every emulation pass to recover interesting strings

Since this technique operates on a basic-block basis, it won't allow you to recover encrypted strings or stack strings defined over multiple basic blocks (unlike FLOSS). But on the other hand, it is very fast and you can recover 99% of the stack strings in a few milliseconds. Noice.
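The three steps above can be illustrated with a toy example. This is only a minimal sketch of the general idea, not Malcat's actual implementation: the pre-decoded instruction tuples, the function names (looks_interesting, emulate_block, scan_strings) and the "at least two memory stores" heuristic are all assumptions made for illustration.

```python
PRINTABLE = set(range(0x20, 0x7F))  # printable ASCII bytes

def looks_interesting(block, min_stores=2):
    """Cheap pre-filter (assumed heuristic): keep blocks with several memory stores."""
    stores = sum(1 for ins in block if ins[0] in ("mov_mem_imm", "mov_mem_reg"))
    return stores >= min_stores

def emulate_block(block, stack_base=0x100000):
    """Emulate the handful of supported instructions, return {address: byte}."""
    regs = {"ebp": stack_base, "esp": stack_base - 0x1000}
    mem = {}
    for op, *args in block:
        if op == "mov_reg_imm":                      # mov reg, imm
            reg, imm = args
            regs[reg] = imm & 0xFFFFFFFF
        elif op in ("mov_mem_imm", "mov_mem_reg"):   # mov [base+disp], src
            base, disp, size, src = args
            value = regs.get(src) if op == "mov_mem_reg" else src
            if value is None or base not in regs:
                continue                             # unknown operand: skip the store
            value &= (1 << (8 * size)) - 1
            for i, byte in enumerate(value.to_bytes(size, "little")):
                mem[regs[base] + disp + i] = byte
        # every other instruction is silently ignored
    return mem

def scan_strings(mem, min_len=4):
    """Collect runs of contiguous printable bytes from the emulated memory."""
    found, run, prev = [], bytearray(), None
    for addr in sorted(mem):
        contiguous = prev is not None and addr == prev + 1
        if not contiguous or mem[addr] not in PRINTABLE:
            if len(run) >= min_len:
                found.append(run.decode("ascii"))
            run = bytearray()
        if mem[addr] in PRINTABLE:
            run.append(mem[addr])
        prev = addr
    if len(run) >= min_len:
        found.append(run.decode("ascii"))
    return found

# One basic block building two strings on the stack
block = [
    ("mov_reg_imm", "eax", 0x6C6C642E),            # mov eax, '.dll'
    ("mov_mem_reg", "ebp", -0x10, 4, "eax"),       # mov [ebp-0x10], eax
    ("mov_mem_imm", "ebp", -0x08, 4, 0x2E646D63),  # mov dword [ebp-8], 'cmd.'
    ("mov_mem_imm", "ebp", -0x04, 4, 0x00657865),  # mov dword [ebp-4], 'exe\0'
]
if looks_interesting(block):
    print(scan_strings(emulate_block(block)))      # ['.dll', 'cmd.exe']
```

Because memory is only tracked for the duration of one block, a string assembled across several blocks would be seen as separate fragments, which matches the limitation described above.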

Figure 2: Annotated disassembly around stack string construction

In the disassembly view, code responsible for string creation also gets annotated accordingly (see above). We hope you will enjoy this feature.

ISO and UDF file systems support

CD and DVD images are currently gaining popularity among threat actors. While opening .iso and .img files is pretty straightforward using your OS's toolset, having to mount a file system every single time can soon become a chore. That is why we have added support for the ISO 9660 and UDF file systems to Malcat. ISO 9660 is the older of the two and the file system most commonly used for CDs. UDF (Universal Disk Format) is a more recent format maintained by OSTA (Optical Storage Technology Association), created to overcome the shortcomings of the ISO standard. It is the most common file system found on DVDs.

Figure 3: Parsing an ISO image in malcat

Since these file formats are pretty broad and complex, we did not implement parsing for the full set of structures used, but focused on the few structures which are necessary to parse and open files:

  • the ISO file format parser is limited to basic ISO 9660 structures and ignores (for now) Joliet and Rock Ridge extensions
  • the UDF file format parser only supports volume, partition and file-related structures

Nonetheless, you may be able to extract interesting metadata from these file formats, like access times or tool copyright notices, which may be relevant for threat hunting. Moreover, all identified files can be opened in-app inside Malcat, which is pretty convenient.
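To give an idea of the kind of metadata involved, here is a minimal sketch that reads a few fields from the ISO 9660 primary volume descriptor. It is not Malcat's parser; the function name parse_iso_metadata and the choice of fields are ours. The offsets follow the ISO 9660 / ECMA-119 layout (2048-byte sectors, volume descriptors starting at sector 16).

```python
import struct

SECTOR = 2048
FIRST_DESCRIPTOR = 16 * SECTOR  # volume descriptors start at sector 16

def parse_iso_metadata(image: bytes):
    """Return a dict of primary volume descriptor fields, or None if absent."""
    offset = FIRST_DESCRIPTOR
    while offset + SECTOR <= len(image):
        desc = image[offset:offset + SECTOR]
        if desc[1:6] != b"CD001":       # standard identifier of every descriptor
            break
        if desc[0] == 255:              # volume descriptor set terminator
            break
        if desc[0] == 1:                # primary volume descriptor
            def text(lo, hi):
                return desc[lo:hi].decode("ascii", "replace").strip(" \x00")
            return {
                "volume_id": text(40, 72),
                "publisher_id": text(318, 446),
                "application_id": text(574, 702),   # often names the authoring tool
                "block_size": struct.unpack_from("<H", desc, 128)[0],
                "created": text(813, 829),          # 16 digits: YYYYMMDDHHMMSScc
            }
        offset += SECTOR
    return None
```

Calling parse_iso_metadata(open("sample.iso", "rb").read()) on a real image (the file name is a placeholder) would typically reveal which tool built the image and when, which is the sort of information that can help with threat hunting.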

Py2Exe scripts disassembly

Python malware, while not predominant, is slowly gaining popularity among threat actors due to its relatively low detection rate. The main drawback of using python for malware is that most Windows systems do not ship with a python interpreter, so scripts cannot be run directly. To solve this issue, malware authors usually package their scripts in standalone PE executables using either PyInstaller or py2exe. Malcat has had support for PyInstaller archives for some time now, since it is the most popular packager. Starting with version 0.8.2, we have added support for py2exe programs as well.

Figure 4: A standard py2exe PE file

Py2exe programs are PE stubs which embed 3 files:

  • the python DLL, in a resource named PYTHON<VERSION>.DLL
  • the python distribution, packed inside a ZIP archive usually stored in the PE's overlay
  • the application python modules, stored inside a resource named PYTHONSCRIPT

The main file of interest is of course the PYTHONSCRIPT resource. Py2exe uses a custom format to store the python script: a short py2exe header (magic: 0x12345678) followed by a marshalled list of python modules. Since Malcat already has support for marshalled python files (.pyc files are marshalled code objects after all), adding support for py2exe scripts was relatively easy. The only difficulty was inferring the python version. The py2exe header does not contain any python magic which could tell us which version of python is used. Sure, there is a resource named PYTHON<VERSION>.DLL inside the PE, but some malware authors like to rename this resource to avoid detection.
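As an illustration, a PYTHONSCRIPT blob could be split into header and marshalled payload roughly as follows. This is a hedged sketch, not Malcat's code: the post only states the magic value, so the rest of the header layout used here (four little-endian 32-bit integers followed by a NUL-terminated archive name) is an assumption based on how open-source py2exe unpackers describe it, and the function names are hypothetical. Since the byte order of the magic is not specified either, both orders are accepted.

```python
import marshal
import struct

MAGIC = bytes.fromhex("12345678")  # magic from the post; byte order is an assumption

def split_pythonscript(blob: bytes):
    """Return (header_fields, archive_name, marshalled_payload).

    ASSUMED layout: int32 tag, int32 optimize, int32 unbuffered,
    int32 data_bytes, NUL-terminated archive name, then the marshalled list.
    """
    if blob[:4] not in (MAGIC, MAGIC[::-1]):
        raise ValueError("py2exe magic not found")
    _tag, optimize, unbuffered, data_bytes = struct.unpack_from("<4i", blob, 0)
    end = blob.index(b"\x00", 16)                  # end of the archive name
    archive_name = blob[16:end].decode("ascii", "replace")
    return (optimize, unbuffered, data_bytes), archive_name, blob[end + 1:]

def load_modules(payload: bytes):
    """Unmarshal the list of code objects.

    Only works when the running interpreter matches the python version that
    produced the payload, which is exactly why the version has to be inferred.
    """
    return marshal.loads(payload)
```

The marshal format carries no version marker of its own, so feeding the payload to the wrong interpreter version (or disassembling it with the wrong opcode table) will either fail or produce nonsense.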

Figure 5: Disassembling the main py2exe module

That's why we have to rely on a heuristic to infer the python version, by examining the content of the marshalled code objects. The format of marshalled code objects changed a bit across python versions, and detecting which fields are present can give a clue as to which python version is used. But keep in mind when disassembling py2exe scripts: if you encounter weird or unsupported opcodes, you may have to manually adjust the CPU architecture in the status bar, from python 3.6 to python 3.7 for instance. Besides this small issue, everything works like a charm!

LZMA streams detection

When analyzing malware samples or firmware images, it is often helpful to identify not only embedded files but also compressed streams. Malcat already has Zlib streams detection, and we have now added LZMA streams detection to the list. Any LZMA stream having either an Alone or XZ header will be automatically detected by the analysis and can be unpacked in-app by double-clicking it.

Figure 6: An LZMA stream inside a firmware image
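For readers who want to experiment with the idea outside of Malcat, here is a minimal sketch of header-based LZMA stream detection using python's standard lzma module. It is not Malcat's algorithm: the function names are ours, and the Alone-header filter (properties byte 0x5D and a dictionary size whose low 16 bits are zero) is an assumed reading of "standard flags". Each candidate is confirmed by actually decompressing it.

```python
import lzma

XZ_MAGIC = b"\xfd7zXZ\x00"          # 6-byte XZ stream header magic
MEMLIMIT = 64 * 1024 * 1024         # guard against absurd dictionary sizes

def _try_unpack(buf, offset, fmt):
    """Try to decompress a stream at offset; return (data, consumed) or None."""
    dec = lzma.LZMADecompressor(format=fmt, memlimit=MEMLIMIT)
    try:
        data = dec.decompress(buf[offset:])
    except lzma.LZMAError:
        return None
    if not dec.eof:                 # truncated or not a real stream
        return None
    consumed = len(buf) - offset - len(dec.unused_data)
    return data, consumed

def find_lzma_streams(buf: bytes):
    """Return a list of (offset, kind, unpacked_bytes) for each stream found."""
    found, i = [], 0
    while i < len(buf):
        fmt = name = None
        if buf.startswith(XZ_MAGIC, i):
            fmt, name = lzma.FORMAT_XZ, "xz"
        elif buf[i] == 0x5D and buf[i + 1:i + 3] == b"\x00\x00":
            fmt, name = lzma.FORMAT_ALONE, "alone"   # assumed "standard" header
        result = _try_unpack(buf, i, fmt) if fmt is not None else None
        if result:
            data, consumed = result
            found.append((i, name, data))
            i += consumed           # skip past the stream we just unpacked
        else:
            i += 1
    return found
```

The sketch copies the tail of the buffer for every candidate, which is fine for a demonstration but would need a smarter approach on large firmware images.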

UI improvements

Version 0.8.2 ships with many improvements to its user interface. Nothing groundbreaking, just several small changes to improve the workflow.

Icons and toolbar

First, we have fixed a bug where icons would have sub-optimal resolution for users enabling HiDPI fractional scaling. You will also notice that the icon set is not monochrome anymore: we have switched to colored icons, which look much better.

Figure 7: Improved views toolbar

We have also improved the views toolbar with new icons: each view now has its own icon. Previously, views were grouped into categories (the hexadecimal and structure views would both fall under the data view, for instance), and one icon would sometimes be used for two different views, which was somewhat confusing. Now things are much simpler, with one icon per view. The shortcuts did not change though: you can still cycle through the hexadecimal and structure views by pressing the F2 key, for instance.

Improved calculator

Malcat already embeds an in-app calculator (Ctrl+Space), which is basically an integrated python interpreter used to quickly compute things. We improved the calculator with new bindings giving access to the analysis. You now have access to the following variables and functions:

  • m: the malcat analysis object (see chapter Scripting in help)
  • v: physical offset of start of current view
  • s: physical offset of start of user selection (or None if no sel)
  • S: physical offset of end (exclusive) of user selection (or None if no sel)
  • read(offset, size): read bytes at physical offset
  • uint8/16/32/64(offset, msb=False): read N-bits unsigned integer at physical offset
  • int8/16/32/64(offset, msb=False): read N-bits signed integer at physical offset
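To make the semantics of the read and integer helpers concrete, here is a stand-alone sketch that mimics them over a plain bytes buffer. It is not Malcat's implementation and does not use the Malcat API: make_bindings is a hypothetical helper, and the behaviour (little-endian by default, msb=True for big-endian, an error when reading past the end) is inferred from the signatures listed above.

```python
def make_bindings(data: bytes):
    """Build stand-ins for read() and uintN()/intN() over a bytes buffer."""
    def read(offset, size):
        return data[offset:offset + size]

    def make_reader(bits, signed):
        def reader(offset, msb=False):
            raw = read(offset, bits // 8)
            if len(raw) != bits // 8:
                raise ValueError("read past end of data")
            return int.from_bytes(raw, "big" if msb else "little", signed=signed)
        return reader

    bindings = {"read": read}
    for bits in (8, 16, 32, 64):
        bindings[f"uint{bits}"] = make_reader(bits, signed=False)
        bindings[f"int{bits}"] = make_reader(bits, signed=True)
    return bindings

# Example: the first bytes of a made-up buffer
ns = make_bindings(bytes.fromhex("4d5a9000ffffffff"))
print(ns["read"](0, 2))               # b'MZ'
print(hex(ns["uint16"](0)))           # 0x5a4d
print(hex(ns["uint16"](0, msb=True))) # 0x4d5a
print(ns["int32"](4))                 # -1
```

In the real calculator these names are available directly, so an expression such as uint32(s) would read a 32-bit value at the start of the current selection.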

We also added a Calculate address... entry to the address context menu, which opens the calculator dialog pre-populated with the address value. We plan to make further improvements to the calculator in the future, so stay tuned!

Figure 8: Calculator bindings

Download urls in-app

When analyzing downloaders or shellcode, the usual workflow often consists of:

  1. unpack / decrypt some data
  2. list urls
  3. download linked files using wget
  4. open the downloaded files inside malcat

Having to switch between malcat and the console every time is a bit annoying. That is why we have added a Download and Analyze entry to the context menu of url-like strings, and to the selection context menu as well. If the selected string/bytes range is a valid url, it will be downloaded using python requests. SSL validation is skipped and a basic user agent is used to avoid running into blacklists. After the download, the downloaded file will be opened in malcat as a sub-file. Note that currently it only works for http and https urls.

Figure 9: Download file from within malcat
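The behaviour described above (url validation, no SSL validation, basic user agent) can be approximated in a few lines. This is a rough sketch and not Malcat's code: the post says Malcat uses python requests, whereas this version sticks to the standard library to stay dependency-free. The function names, the user agent string and the 30-second timeout are arbitrary choices of ours.

```python
import ssl
import urllib.parse
import urllib.request

def as_http_url(selection):
    """Return the selection as a url string if it is http(s), else None."""
    if isinstance(selection, bytes):
        try:
            selection = selection.decode("ascii")
        except UnicodeDecodeError:
            return None
    candidate = selection.strip("\x00 \t\r\n")
    try:
        parts = urllib.parse.urlsplit(candidate)
    except ValueError:
        return None
    if parts.scheme.lower() in ("http", "https") and parts.netloc:
        return candidate
    return None

def download(url, timeout=30):
    """Fetch a url without certificate validation, using a basic user agent."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False          # malware hosts rarely have valid certs
    ctx.verify_mode = ssl.CERT_NONE
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout, context=ctx) as resp:
        return resp.read()
```

As with any live download of suspected malware, this should only be run from an isolated analysis machine.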

That's it for the main UI improvements. Other smaller changes have been made; you can find an exhaustive list below.

Full changelog

Here is the complete changelog of this release:

● Fast dynamic string detection for x86/x64:
    - Automatically detects stack strings / memory strings defined within a basic block
    - Algorithm uses a simplified emulator engine
    - Strings are listed along other strings ("DYN" type)
    - Disassembly view shows in comments where the strings are defined
● Added support for Py2Exe programs:
    - Added Yara signature
    - Python scripts are recognized and disassembled
● Added support for CD and DVD file systems (.iso/.img):
    - Support for essential ISO 9660 structures
    - Support for essential UDF structures
    - Open files in-app 
● Added support for LZMA packed streams:
    - Automatic detection of LZMA-Alone (with standard flags) and LZMA-XZ streams
    - In-app unpacking
● User interface:
    - Redesigned the views toolbar: one icon per view now to make things less confusing
    - Colored icons
    - Expanded the in-app calculator (Ctrl+Space) by giving access to additional variables and functions
    - Added "Calculate address..." context-menu action for addresses
    - Added "Download & Analyze" context-menu action for URL-like strings and selection
    - Added context menu for fields in structure's quick view
    - Improved readability of midnight theme
    - Script output window now uses wordwrap
● Misc:
    - Improved LNK file parser (extension block parsing)
    - Added template and anomalies for ACE format
    - Added additional options to the "LZMA compress" transform
    - Prefix tlsh version "T1" in tlsh hashes (aka new tlsh format)
    - Added hard timeout of 20 seconds for online intelligence checkers 
    - Improved analysis parallelism by removing a few unnecessary locks
    - Updated/added Yara rules
    - Updated/added anomalies
● Bug fixing:
    - [GTK3] Fixed: crash when choosing "Copy expanded structure" in context menu
    - [GTK3] Fixed: right click would need to stay pressed in order to see context menu for some of the views
    - [GTK3] Fixed: disabled text ellipsis in data and file tabs (top-left), it does not work reliably
    - Fixed: NSIS file name recovery algorithm for non-solid archives
    - Fixed: NSIS disassembly for 'IntOp' opcode
    - Fixed: rare bug in ZIP parser with odd Unix extra fields 
    - Fixed: VBA decompilation would sometimes stop too soon when facing VBA-purging
    - Fixed: assertion error when previewing Yara rules with duplicate strings