I’ve got all this PHP. Now what? — Parsing PHP in Go

There’s a lot of PHP out there. The problems plaguing that language are well documented, but many companies and projects are frightfully entrenched. There have recently been projects aimed at improving performance and reliability: things like Hack, php-ng, and the recent effort toward a language specification. Unfortunately, none of these projects will directly result in improving large, existing bodies of PHP. So when I learned Go, and I watched Rob Pike’s talk on writing the text/template lexer, I thought it would be fun to try out that language and idea on reading PHP. Over the past several months, that whim has evolved into a nearly feature-complete PHP parser.


Until now, the goals and milestones for this project have been nebulous at best. It began as a crazy experiment, after all. It currently parses most of the code I throw at it and supports most, but not all, PHP 5.4 features. The most important goal at this point is therefore to bring the parser to solid support for the full set of PHP 5.6 features. As a stretch goal, it would be great if the parser could be set to check code against a specific PHP version. The project currently has 85% test coverage and 67% coverage with full unit tests, but I’d like to increase those numbers. I’d also like to improve stability when parsing incorrect code.

I think this is an opportune moment to consider where to take the project beyond just parsing. Here is a list of ideas I have:

  • A phpfmt tool (a la gofmt)
  • Static analysis tools (e.g. type inference, dead code detection)
  • Transpiler

The transpiler is perhaps my favorite idea, particularly since the parser is written in Go. With the go/ast package, transpiling into Go may be the closest in reach of all these goals, despite sounding so lofty.
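To make that concrete, here is a minimal sketch of how the output side of such a transpiler might look. It is not code from the parser project: the helper name echoToGo is hypothetical, and it assumes the PHP statement `echo "hello";` has already been parsed and reduced to its string argument. The sketch only shows the standard library doing the work of building a Go syntax tree with go/ast and rendering it with go/printer.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/printer"
	"go/token"
	"strconv"
)

// echoToGo is a hypothetical helper (not part of the parser project).
// It builds the Go AST for fmt.Print(<msg>), a rough equivalent of the
// PHP statement `echo <msg>;`, and returns it rendered as Go source.
func echoToGo(msg string) (string, error) {
	call := &ast.CallExpr{
		// fmt.Print: a selector expression on the package identifier.
		Fun: &ast.SelectorExpr{
			X:   ast.NewIdent("fmt"),
			Sel: ast.NewIdent("Print"),
		},
		// The argument is a Go string literal; strconv.Quote escapes it.
		Args: []ast.Expr{
			&ast.BasicLit{Kind: token.STRING, Value: strconv.Quote(msg)},
		},
	}

	// go/printer turns the hand-built node back into source text.
	var buf bytes.Buffer
	if err := printer.Fprint(&buf, token.NewFileSet(), call); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	src, err := echoToGo("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(src)
}
```

A real transpiler would walk the PHP syntax tree and emit whole files (package clause, imports, function declarations) instead of a single call expression, but the building blocks would be the same node types.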

All that said, this project began as an experiment, and for the time being, it continues as one. I’m happy to open up the project to a wider audience. Please feel free to comment or contribute. If you would like to just play with the parser, I have this little tool for testing. The code is available on GitHub.

4 thoughts on “I’ve got all this PHP. Now what? — Parsing PHP in Go”

  1. Cameron Eagans

    I could see a lot of interesting uses for this, but one that’s especially appealing is as a way to provide intelligent PHP autocompletion in various text editors.

  2. Srinivas

    Dearest Stephen,

    It is your deepest friend on the telephone, Srinivas. I am most pleased to be hearing about this of which that you have been applying your parsing skills do the development of!

    I too vote for TRANSPILER!

    I want not for my source codes to be in PHP. I dearly do want my Word Press to be a Go Press instead! When I am deploying it then I know I am deploying scalable Go code that will use the Actor Pattern to get the highest concurrency out of my simple web server.

    So I must reiterate my statement of intent: I too vote for TRANSPILER!

    Please Stephen be a letter us of know when your PHP to Go TRANSPILER is ready. We await the day when your TRANSPILER moves us from the Age of the PHP to the Age of the Go.

    Your Deepest Friend,

  3. carlos cabral

    If you do move forward with the transpiler idea, does that mean you could transform a PHP codebase into one single Go binary?

    1. stephen Post author

      I’ve considered two directions for this. The first would simply run the PHP code after transpiling it to Go. The second would translate it naively, in a way that wouldn’t produce complete Go (it perhaps wouldn’t even compile), but would give a human a good starting place for finishing the task and producing actually good code.

