I recently saw an older talk by Sergey Bratus and Greg Conti by the name of Voyage of the Reverser: A Visual Study of Binary Species.

Are there any opensource/free tools that one could use to see visual representations of the binaries that are fed in (similar to what is shown in the talk)?


Samples from the talk:

Windows PE visualization

Documents Format Files


The tool we used for the talk, binviz, is available here: binviz_0.zip.

Some papers are here:

And, there is also an earlier Black Hat talk, in addition to the one I did with Sergey:

I haven't used it in a while, but binviz was written in Visual C# (VS2005 or maybe VS2008). The .zip contains the project source, so it should load into Visual Studio and run. There is also a compiled .exe in... /binviz_0.44bw/binviz_0.01/bin/Debug/. You should be able to just double-click it and run it on a Windows machine. I developed it under XP but have since used it under Windows 7, where it worked more or less the same (mouseover event behavior is a little different, but still usable).

Note that binviz is a research prototype and has a bug: it doesn't like small files. I would try something around 10M in size and then work down from there. I think it starts getting cranky at around 500K.

Thanks for your reaction! In your 2008 paper you mention platform independent versions as "future work" (p. 8). Were you able to pursue this any further, or is the download still the canonical "current" version? – usr2564301 Aug 2 '14 at 13:12
I went down a variant path of trying to include automated mapping of arbitrary binaries (see the binary mapping paper) and wasn't able to explore the platform-independent version further. So yes, this is the "current" version. – Greg Conti Aug 2 '14 at 14:43
    
@GregConti Will this software recognize writing patterns in text like articles or books? – Denis Apr 16 '15 at 4:33

cantor.dust

This project is an interactive binary visualization tool, a radical evolution of the traditional hex editor. By translating binary information to a visual abstraction, reverse engineers and forensic analysts can sift through mountains of arbitrary data in seconds. Even previously unseen instruction sets and data formats can be easily located and understood through their visual fingerprint.

cantor.dust example 1

Sadly, I think development has stopped; at least, I haven't heard any news about this project recently.

Binwalk

devttys0 implemented a similar visualization in binwalk.

Vix/Biteye

Graphical (SDL-based) hexadecimal dump tool designed for GNU/Linux. It lets you see the patterns formed by its bits.

Github

GUI

binglide

Binglide is a visual reverse engineering tool. It is designed to offer a quick overview of the different data types that are present in a file. This tool does not know about any particular file format; everything is done using the same analysis working on the data. This means it works even if headers are missing or corrupted or if the file format is unknown.


senseye

Senseye is a tool for monitoring, analyzing and visualizing everything from static files and crash dumps to live data flows and application dynamic memory. Each data window provides you with different views (e.g. 3D point cloud) and statistical tools (e.g. histogram) and some controls for hinting how the sensor should sample, pack and transfer data.

Senseye

binvis.io

a browser-based tool for visualising binary data.

With binvis.io, you can:

Visually explore binary data.
Cluster bytes to pick out fine structural features with space-filling curves.
Use the simple scan layout to navigate and select data intuitively.
Flip between a number of useful byte color mappings, including an entropy visualiser that lets you pick out compressed or encrypted sections.
Export data segments for analysis.

http://corte.si/posts/binvis/announce/index.html
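
The entropy view mentioned in that feature list can be approximated with per-block Shannon entropy. What follows is a sketch of the general technique, not binvis.io's actual implementation; the names `shannon_entropy` and `block_entropies` and the 256-byte block size are hypothetical choices for illustration.

```python
import math
from collections import Counter


def shannon_entropy(block):
    """Shannon entropy of a bytes object, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def block_entropies(data, block_size=256):
    """Entropy of each consecutive block; block_size is an assumed default."""
    return [
        shannon_entropy(data[start:start + block_size])
        for start in range(0, len(data), block_size)
    ]
```

Blocks scoring close to 8 bits per byte are candidates for compressed or encrypted data, while long runs of identical bytes score 0. Mapping each score to a color gives a rough entropy strip for the file.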

===

BinVis

A Qt based tool for visualizing arbitrary data. BinVis has abilities to:

Visualize large amounts of binary data (several TiB).
Visualize in time (normal plot) and space (byte plot) domains.
Visualize parts of files and narrow in on relevant parts of the file.
"Step" through the file to see what type of data each part of the file contains.
"Reverse draw" to highlight strong patterns in the visualized data.
Use filters, transparency, color schemes and various shaders to highlight the relevant data.
Heavy use of threading for performance reasons.
Fairly static memory usage, using the same amount of memory for visualizing a 1GiB file and 100TiB file.

http://trippler.no/wpcms/?page_id=20
http://trippler.no/wpcms/wp-content/uploads/2014/06/Screenshot_2015-08-09_19-52-30.png

PortEx

Java library to analyse Portable Executable files with a special focus on malware analysis and PE malformation robustness

Veles

Binary visualization framework. Veles visualization


The data visualizing tool I saw used in the talk seems to be almost identical if not identical to the BinVis tool available on Google Code. A screenshot of some of the features:


Note: the above is an old version, as I could not install the latest on my PC; see the Google Code site for more.

Yeah, it's his work, and I think all that's available unless cantor.dust shows up as opensource. – broadway Aug 2 '14 at 9:42
    
It seems that the current version is available at binvis.googlecode.com/svn/trunk/binviz_0.01/bin/Release/… by just renaming .exe.deploy to .exe – PlasmaHH Aug 2 '14 at 20:21

These are graphic dumps from the source data. I use this extensively, with a hex reader I wrote myself -- it's a great way to quickly locate "data" (see the difference between .text and .data) and larger structures (which often contain repeating or similar data on the same offsets).

The top images show raw data dumped as grayscale information: each byte is treated as a pixel. Presumably, the author chose grayscale for convenience. I prefer a full range of contrasting colors, so long swathes of "same" and "similar" data can be discerned more easily.
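
A minimal sketch of the byte-per-pixel idea, assuming Python and the plain binary PGM (P5) image format so that no imaging library is needed. The function name `byteplot_pgm` and the default width of 256 are my own choices for illustration and are not taken from binviz or any other tool mentioned here.

```python
def byteplot_pgm(data, width=256):
    """Return a binary PGM (P5) image with one grayscale pixel per input byte.

    `width` (pixels per row) is an assumed default; a final partial row
    is padded with zero bytes, which show up as black.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = max(1, -(-len(data) // width))  # ceiling division, at least 1 row
    padded = bytes(data).ljust(width * height, b"\x00")
    header = "P5\n{} {}\n255\n".format(width, height).encode("ascii")
    # The pixel data is the file content itself: byte value == gray level.
    return header + padded
```

Writing the result to disk, for example with `open("dump.pgm", "wb").write(byteplot_pgm(data))`, gives an image that common viewers can open. Varying `width` is what makes repeating structures line up into visible columns.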

The blocks of solid color near the bottom (.rsrc) are icons and other graphics. They appear horizontally stretched out because they are displayed as one-pixel-per-byte, and are actually 16-, 24-, or 32-bpp images.

(I don't believe the entire top is .text! Executable code appears as random pixels, but the top third of that image is less dense. It's probably the MZ and PE executable headers; these contain lots of zeros. The denser part in the bottom third is part of .text and most likely the Virtual Call table, which happens to contain the byte 0xFF a lot.)

In the bottom images, each byte is displayed in a monochrome binary format: each byte is shown as 8 consecutive pixels. By convention, these are displayed with the most significant bit first. There is no need to check for 'endianness', as that only comes into play when dealing with single values larger than a single byte.
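
As a rough sketch of that monochrome layout in Python: each byte is expanded into 8 cells, most significant bit first, and rendered as text. The names `bits_msb_first` and `bitplot_rows`, the `#`/`.` characters, and the row width are hypothetical choices for illustration only.

```python
def bits_msb_first(data):
    """Expand each byte into 8 bits (0 or 1), most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bitplot_rows(data, bytes_per_row=4, on="#", off="."):
    """Render bytes as rows of text, 8 characters per byte.

    bytes_per_row, on and off are assumed defaults, not a convention
    from any particular tool.
    """
    rows = []
    for start in range(0, len(data), bytes_per_row):
        chunk = data[start:start + bytes_per_row]
        rows.append("".join(on if bit else off for bit in bits_msb_first(chunk)))
    return rows
```

Printing `"\n".join(bitplot_rows(data, bytes_per_row=2))` on 1-bit font data is enough to make glyph shapes recognizable, provided the row width matches the glyph stride.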

Writing something like this yourself is not hard; probably most important is to make the graphics routines as fast as possible, in particular the monochrome bitmaps. This is also addressed by Conti et al., Visual Reverse Engineering of Binary and Data Files (http://www.rumint.org/gregconti/publications/2008_VizSEC_FileVisualization_v53_final.pdf):

When testing performance we found that the display could be updated in 0.03 seconds, leaving open the possibility of creating byteview visualizations at greater resolutions while still providing a responsive interface. [...] We were able to achieve this level of performance by avoiding C#’s GetPixel and SetPixel methods and directly accessing image memory [...]


Screenshots of my own tool, very similar to what is shown above:

The .text section of CALC.EXE. The repeating structure on the bottom is the Virtual Call table.

calc.exe in binary view

A file containing RLE-encoded images. The RLE compression can be recognized by the horizontal stripes of "coherent" data (better visible when changing the width interactively).

rle-compressed images

GZ compressed data, in monochrome. This is as close to random static as possible.

gz compressed data in monochrome

A tiny embedded bitmap font found in another executable. The random bits between characters are the width in pixels - see the synchronously scrolling hex dump.

a tiny embedded bitmap font

Some data is better visualized in other ways. This is an old-style VGA palette (6x6x6 RGB).

a palette visualized


I wrote https://github.com/REMath/implementations/blob/master/code_examples/plot_hex.py which implements the n-gram method presented by Conti, as well as a clustering component to isolate visual properties.
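
For readers who only want the core idea, here is a rough sketch of a byte-pair (2-gram) count, which I am assuming is the simplest case of the n-gram method. It is not the code from plot_hex.py, it leaves out the clustering step entirely, and `digraph_counts` and `digraph_grid` are hypothetical names.

```python
from collections import Counter


def digraph_counts(data):
    """Count every pair of adjacent bytes (first, second) in the input."""
    return Counter(zip(data, data[1:]))


def digraph_grid(data):
    """Return a 256x256 grid where grid[first][second] is the pair count.

    Plotting this grid as an image gives one point per distinct byte pair;
    the row/column orientation here is an arbitrary choice.
    """
    grid = [[0] * 256 for _ in range(256)]
    for (first, second), count in digraph_counts(data).items():
        grid[first][second] = count
    return grid
```

Different data types tend to occupy different regions of such a grid. Plain ASCII text, for instance, can only produce pairs where both byte values are below 128, so it stays within one quadrant.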


BinView is a prototype of a tool for binary data visualization.

GitHub repository

Screen 1

Screen 2

Screen 3

Screen 4

