A Simple C# Wrapper for Ghostscript
This post has become somewhat popular (relative to my other posts anyway), so I decided to take the code and release it as an open source library. More information here
PDF thumbnails with Ghostscript
I’ve been looking for a while now for a simple solution for generating thumbnail images from PDF files. I wanted something that would let me programmatically load in a PDF file, choose a page, and generate a thumbnail from that page. As far as I can tell, there are only a few open source options, and of those I haven’t been able to find one that I could get working with C#.
After seeing it recommended a few times, I decided to take a look at Ghostscript. Ghostscript is an open source interpreter for PostScript and PDF files. Among other things, Ghostscript allows you to generate images from PDF pages, which is exactly what I needed.
Ghostscript is a tool that can be used from the command line, which is how most of the examples I’ve found online have used it. Unfortunately, this is what a call to Ghostscript looks like:
gs -q -dQUIET -dPARANOIDSAFER -dBATCH -dNOPAUSE \
-dNOPROMPT -dMaxBitmap=500000000 -dFirstPage=1 \
-dAlignToPixels=0 -dGridFitTT=0 -sDEVICE=jpeg \
-dTextAlphaBits=4 -dGraphicsAlphaBits=4 -r100x100 \
-sOutputFile=output.jpg input.pdf
Not pretty. Luckily, I needed to automate the task of creating the thumbnails, so I wouldn’t need to manually generate the parameters to be passed to the command line tool. However, I still felt like there might be a better way to hook into Ghostscript’s functionality. So I decided to take advantage of the API provided by Ghostscript by writing a simple C# wrapper for it to use in my current ASP.NET project.
A simple Ghostscript wrapper
The first thing I needed was the Windows version of the Ghostscript DLL, which can be obtained here. Once I included the DLL in my project, I needed to expose the unmanaged API functions to my C# wrapper function.
// DllImport lives in System.Runtime.InteropServices, so the file
// needs "using System;" and "using System.Runtime.InteropServices;"
[DllImport("gsdll32.dll", EntryPoint = "gsapi_new_instance")]
private static extern int CreateAPIInstance(out IntPtr pinstance,
    IntPtr caller_handle);

[DllImport("gsdll32.dll", EntryPoint = "gsapi_init_with_args")]
private static extern int InitAPI(IntPtr instance, int argc, IntPtr argv);

[DllImport("gsdll32.dll", EntryPoint = "gsapi_exit")]
private static extern int ExitAPI(IntPtr instance);

[DllImport("gsdll32.dll", EntryPoint = "gsapi_delete_instance")]
private static extern void DeleteAPIInstance(IntPtr instance);
Above, I complained about the long list of parameters that need to be passed to the Ghostscript command line tool. Those same parameters need to be passed to the API, so the next thing I did was create a function that wrapped up the functionality for building the list of parameters. For simplicity, I left in a lot of default parameters, but the function could be expanded later on to allow more specific parameters.
private string[] GetArgs(string inputPath, string outputPath,
    int firstPage, int lastPage, int width, int height)
{
    return new[]
    {
        // Keep gs from writing information to standard output.
        // Note: like C's main(), the API ignores argv[0], so this
        // first entry effectively fills the program-name slot.
        "-q",
        "-dQUIET",
        "-dPARANOIDSAFER", // Run this command in safe mode
        "-dBATCH", // Keep gs from going into interactive mode
        "-dNOPAUSE", // Do not prompt and pause for each page
        "-dNOPROMPT", // Disable prompts for user interaction
        "-dMaxBitmap=500000000", // Set high for better performance
        // Set the starting and ending pages
        String.Format("-dFirstPage={0}", firstPage),
        String.Format("-dLastPage={0}", lastPage),
        // Configure the output anti-aliasing, resolution, etc.
        "-dAlignToPixels=0",
        "-dGridFitTT=0",
        "-sDEVICE=jpeg",
        "-dTextAlphaBits=4",
        "-dGraphicsAlphaBits=4",
        // -r takes a resolution in dots per inch (XxY), so width and
        // height are DPI values here, not pixel dimensions
        String.Format("-r{0}x{1}", width, height),
        // Set the input and output files
        String.Format("-sOutputFile={0}", outputPath),
        inputPath
    };
}
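A note on the -r switch: it sets the rendering resolution in dots per inch, not the size of the output image. PDF page sizes are expressed in points (1/72 of an inch), so the pixel size of the output is roughly the page size in points divided by 72, multiplied by the DPI. The following is a minimal sketch of that arithmetic. ThumbnailResolution, PixelsAt, and DpiFor are hypothetical names that are not part of the wrapper, and the sketch assumes the page size in points is already known (612 x 792 for US Letter). Ghostscript's own rounding may differ by a pixel.

```csharp
using System;

public static class ThumbnailResolution
{
    // PostScript and PDF page dimensions are measured in points, 72 per inch.
    private const double PointsPerInch = 72.0;

    // Approximate pixel length of a page side of `points` rendered at `dpi`.
    public static int PixelsAt(double points, int dpi)
    {
        return (int)Math.Round(points / PointsPerInch * dpi);
    }

    // Approximate DPI needed so a page side of `points` renders to `targetPixels`.
    public static int DpiFor(double points, int targetPixels)
    {
        return (int)Math.Round(targetPixels * PointsPerInch / points);
    }
}
```

For a US Letter page, ThumbnailResolution.PixelsAt(612, 100) gives 850 and ThumbnailResolution.PixelsAt(792, 100) gives 1100, so -r100x100 produces an image of about 850 x 1100 pixels. Going the other way, ThumbnailResolution.DpiFor(612, 170) gives 20, the resolution to pass for a thumbnail about 170 pixels wide.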
Once I had a way of creating a list of parameters, I could start using the Ghostscript API functions. I created a function called CallAPI that would accept an array of parameters and use them to call the Ghostscript API.
The function I created for building a list of arguments returned an array of strings, but to use the API I needed to convert each of those parameters into an ANSI null-terminated byte array (I added the code I used to do this to the bottom of this post). Then I needed to pin each of those byte arrays in memory and get a pointer to each one of them.
var argStrHandles = new GCHandle[args.Length];
var argPtrs = new IntPtr[args.Length];
// Convert each argument to an ANSI null-terminated byte array,
// pin it so the garbage collector cannot move it, and store
// the address of the pinned buffer
for (int i = 0; i < args.Length; i++)
{
    argStrHandles[i] = GCHandle.Alloc(StringToAnsi(args[i]), GCHandleType.Pinned);
    argPtrs[i] = argStrHandles[i].AddrOfPinnedObject();
}
// Pin the array of argument pointers as well
var argPtrsHandle = GCHandle.Alloc(argPtrs, GCHandleType.Pinned);
Then, to use the newly converted parameters, I needed to create an instance of the Ghostscript API and pass them into the initialization function.
// Get a pointer to an instance of the Ghostscript API
// and run the API with the current arguments.
// Both calls return a status code (negative values signal
// an error), which this snippet does not check.
IntPtr gsInstancePtr;
CreateAPIInstance(out gsInstancePtr, IntPtr.Zero);
InitAPI(gsInstancePtr, args.Length, argPtrsHandle.AddrOfPinnedObject());
The call to InitAPI runs Ghostscript and generates any requested files at the output path.
Now the only remaining thing I needed to do was clean up the memory that was allocated for the API. To handle this, I wrote a cleanup function that takes in the items that need to be cleaned up. The API provides some cleanup functions, so I called those in the cleanup function as well.
private void Cleanup(GCHandle[] argStrHandles, GCHandle argPtrsHandle,
    IntPtr gsInstancePtr)
{
    for (int i = 0; i < argStrHandles.Length; i++)
        argStrHandles[i].Free();
    argPtrsHandle.Free();
    ExitAPI(gsInstancePtr);
    DeleteAPIInstance(gsInstancePtr);
}
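The post shows CallAPI only in pieces, so here is one way the snippets above might fit together. This is a sketch under stated assumptions, not the code from the released library: the GhostscriptSketch class, the PinArgs helper, and the QuitCode constant are hypothetical names. It assumes that -101 is Ghostscript's "quit" status, which is a normal way for a -dBATCH run to end (check the headers for your Ghostscript version), and it uses Encoding.ASCII instead of the StringToAnsi helper to keep the block self-contained. The try/finally makes sure the instance and the pinned handles are released even when a call fails.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class GhostscriptSketch
{
    // Assumption: Ghostscript's "quit" status code, treated as a normal exit.
    private const int QuitCode = -101;

    [DllImport("gsdll32.dll", EntryPoint = "gsapi_new_instance")]
    private static extern int CreateAPIInstance(out IntPtr pinstance, IntPtr caller_handle);

    [DllImport("gsdll32.dll", EntryPoint = "gsapi_init_with_args")]
    private static extern int InitAPI(IntPtr instance, int argc, IntPtr argv);

    [DllImport("gsdll32.dll", EntryPoint = "gsapi_exit")]
    private static extern int ExitAPI(IntPtr instance);

    [DllImport("gsdll32.dll", EntryPoint = "gsapi_delete_instance")]
    private static extern void DeleteAPIInstance(IntPtr instance);

    // Pins each argument as a null-terminated ASCII byte array.
    // Returns the handles; argPtrs receives the address of each pinned buffer.
    public static GCHandle[] PinArgs(string[] args, out IntPtr[] argPtrs)
    {
        var handles = new GCHandle[args.Length];
        argPtrs = new IntPtr[args.Length];
        for (int i = 0; i < args.Length; i++)
        {
            byte[] bytes = Encoding.ASCII.GetBytes(args[i] + "\0");
            handles[i] = GCHandle.Alloc(bytes, GCHandleType.Pinned);
            argPtrs[i] = handles[i].AddrOfPinnedObject();
        }
        return handles;
    }

    public static void CallAPI(string[] args)
    {
        IntPtr[] argPtrs;
        GCHandle[] argStrHandles = PinArgs(args, out argPtrs);
        GCHandle argPtrsHandle = GCHandle.Alloc(argPtrs, GCHandleType.Pinned);
        IntPtr gsInstancePtr = IntPtr.Zero;
        bool initCalled = false;
        try
        {
            int code = CreateAPIInstance(out gsInstancePtr, IntPtr.Zero);
            if (code < 0)
                throw new InvalidOperationException("gsapi_new_instance failed: " + code);

            initCalled = true;
            code = InitAPI(gsInstancePtr, args.Length, argPtrsHandle.AddrOfPinnedObject());
            if (code < 0 && code != QuitCode)
                throw new InvalidOperationException("gsapi_init_with_args failed: " + code);
        }
        finally
        {
            // Shut Ghostscript down before releasing the memory it was given.
            if (gsInstancePtr != IntPtr.Zero)
            {
                if (initCalled)
                    ExitAPI(gsInstancePtr);
                DeleteAPIInstance(gsInstancePtr);
            }
            argPtrsHandle.Free();
            for (int i = 0; i < argStrHandles.Length; i++)
                argStrHandles[i].Free();
        }
    }
}
```

Because the API treats the first argument as the program name and ignores it, a call would typically start with a placeholder, for example GhostscriptSketch.CallAPI(new[] { "gs", "-q", "-dBATCH", "-dNOPAUSE", "-sDEVICE=jpeg", "-r100x100", "-sOutputFile=output.jpg", "input.pdf" }). Since gsdll32.dll is a 32-bit library, the calling process has to run as 32-bit as well.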
One last thing I added to the wrapper was a simple function for generating thumbnails from a source PDF file. Technically, I could have just used the CallAPI function to do that, but I wanted to hide the details of working with the API from code outside of the wrapper class.
public void GeneratePageThumbs(string inputPath, string outputPath,
    int firstPage, int lastPage, int width, int height)
{
    CallAPI(GetArgs(inputPath, outputPath, firstPage, lastPage, width, height));
}
The GeneratePageThumbs function doesn't do anything other than call the CallAPI function. However, in the future, I'd like to provide other functions that use the Ghostscript API as well. If anyone has any ideas for improving the code, drop me a line.
Update: Here is the code I used to convert the arguments to null-terminated byte arrays. There might be a better way to do this in .NET; this is just the quick solution I'm using.
public static byte[] StringToAnsi(string original)
{
    var strBytes = new byte[original.Length + 1];
    for (int i = 0; i < original.Length; i++)
        strBytes[i] = (byte)original[i];
    strBytes[original.Length] = 0;
    return strBytes;
}
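One caveat with the cast above: (byte)original[i] keeps only the low 8 bits of each character, so anything outside the Latin-1 range (for example, in a file path) is silently mangled. As one possible alternative, the conversion can be handed to System.Text.Encoding. This is a minimal sketch, and AnsiStrings and ToNullTerminated are hypothetical names rather than part of the wrapper. Which encoding Ghostscript expects for non-ASCII paths depends on the version and platform, so the encoding is left as a parameter.

```csharp
using System;
using System.Text;

public static class AnsiStrings
{
    // Encodes `original` with the given encoding and appends a single
    // zero byte as the terminator.
    public static byte[] ToNullTerminated(string original, Encoding encoding)
    {
        byte[] encoded = encoding.GetBytes(original);
        var result = new byte[encoded.Length + 1]; // the last byte stays 0
        Buffer.BlockCopy(encoded, 0, result, 0, encoded.Length);
        return result;
    }
}
```

For example, AnsiStrings.ToNullTerminated("-dBATCH", Encoding.ASCII) returns 8 bytes: the 7 characters followed by a terminating zero. With Encoding.ASCII, characters outside the ASCII range are replaced with '?' rather than truncated, which at least makes the problem visible in the resulting argument.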
Update: This code has been open sourced