Deep neural networks (DNNs) are attracting much attention for use in many applications. However, they require substantial computational resources, and setting up an execution environment with hardware acceleration such as GPGPU involves many cumbersome steps. Providing DNN applications to end users is therefore difficult. WebDNN solves this problem by using the web browser as an installation-free DNN execution framework. The framework optimizes a trained DNN model to compress the model data and accelerate execution, and runs it with novel JavaScript APIs such as WebAssembly and WebGPU to achieve zero-overhead execution. Empirical evaluations showed more than 200x acceleration.
To achieve faster execution, optimizing the computation graph of a DNN model is very important. Execution of a DNN consists of two phases: the training phase and the inference phase. The training phase updates the parameters with back-propagation. The inference phase makes predictions (forward propagation only) for the actual task. If a framework focuses only on the inference phase, it can optimize the computation graph more aggressively.
WebDNN focuses only on inference-phase execution on end-user devices and supports aggressive optimization. This optimization pipeline can be applied to models trained with various DNN frameworks; editing the training code is not required.
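One classic example of an optimization that is only valid at inference time is folding a batch-normalization layer into the preceding convolution, since the parameters are frozen once training is finished. A minimal sketch of this idea (illustrative only, not WebDNN's actual code; the data layout here is a toy per-channel representation):

```javascript
// Fold a frozen BatchNormalization layer into the preceding convolution.
// conv.weight is an array of per-output-channel weight arrays; bn holds
// per-channel gamma, beta, mean, variance, and a scalar eps.
function foldBatchNorm(conv, bn) {
  // scale = gamma / sqrt(variance + eps), one value per output channel
  const scale = bn.gamma.map((g, i) => g / Math.sqrt(bn.variance[i] + bn.eps));
  return {
    // multiply each output channel's weights by its scale
    weight: conv.weight.map((row, i) => row.map((w) => w * scale[i])),
    // shift and rescale the bias so the BN layer can be removed entirely
    bias: conv.bias.map((b, i) => (b - bn.mean[i]) * scale[i] + bn.beta[i]),
  };
}

// Toy example: one output channel with two weights
const conv = { weight: [[1.0, 2.0]], bias: [0.5] };
const bn = { gamma: [2.0], beta: [1.0], mean: [0.5], variance: [3.75], eps: 0.25 };
const folded = foldBatchNorm(conv, bn);
// scale = 2 / sqrt(4) = 1, so weight stays [[1, 2]] and bias becomes 1
```

After folding, the graph has one fewer layer to execute and fewer parameters to transfer, which is exactly the kind of transformation that is unsafe during training but free at inference time.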
JavaScript is executed by an interpreter, so it incurs computing overhead and cannot fully harness the capacity of the CPU. The same problem applies to the GPU. Modern web browsers support WebGL, a JavaScript API for using the GPU. However, this API is designed for graphics processing and is not suitable for general-purpose computation; using it that way incurs additional overhead.
WebDNN uses next-generation JavaScript APIs: WebGPU for GPU execution and WebAssembly for CPU execution. These APIs help bring out the full performance of the GPU and CPU.
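To make the WebAssembly path concrete, here is a self-contained sketch of calling WebAssembly from JavaScript. The module below is a hand-encoded wasm binary exporting `add(a, b)`; in practice WebDNN's kernels are produced by a compiler rather than hand-encoded, and this snippet is only meant to show the zero-overhead call boundary between JavaScript and compiled code:

```javascript
// Minimal WebAssembly module: "\0asm" magic, version 1, then a type,
// function, export, and code section defining add(i32, i32) -> i32.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function 0 uses type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export it as "add"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0/1, i32.add, end
]);

WebAssembly.instantiate(bytes).then(({ instance }) => {
  console.log(instance.exports.add(2, 3)); // → 5
});
```

The same `WebAssembly.instantiate` API works in all browsers listed below as well as in Node.js, which is what makes a single compiled CPU backend portable.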
WebDNN supports three execution backend implementations: WebGPU, WebAssembly, and a fallback pure-JavaScript implementation. With these three backends, WebDNN works on all major browsers.
| Internet Explorer | Edge | Safari | Chrome | FireFox |
|---|---|---|---|---|
| 10 - 11: WebAssembly/asm.js | - 15: WebAssembly/asm.js | Technology Preview: WebGPU | - 58: WebAssembly/asm.js | - 53: WebAssembly/asm.js |
| - 9: Fallback | | - 10.1: WebAssembly/asm.js | | |
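The backend chosen on a given browser follows a simple priority order: the fastest API that the browser actually exposes wins. A hedged sketch of this selection logic (illustrative, not WebDNN's actual code; the `features` object stands in for runtime feature detection):

```javascript
// Pick the fastest available backend, falling back when the browser
// lacks the corresponding API.
function selectBackend(features) {
  if (features.webgpu) return "webgpu";           // GPU compute API (Safari TP)
  if (features.webassembly) return "webassembly"; // near-native CPU execution
  return "fallback";                              // pure JavaScript, always works
}

// e.g. a browser such as Chrome or FireFox: WebAssembly but no WebGPU
selectBackend({ webgpu: false, webassembly: true }); // "webassembly"
// e.g. an old browser with neither API
selectBackend({ webgpu: false, webassembly: false }); // "fallback"
```

Because the fallback backend requires nothing beyond plain JavaScript, the chain always terminates in a working implementation, which is why the framework can claim support for all major browsers.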
In Safari Technology Preview, the WebGPU API is disabled by default. To enable it, check "Develop" > "Experimental Features" > "WebGPU" in the menu bar.
We measured execution time for VGG16 [2] and ResNet50 [3]. The figure below shows the results compared with Keras.js. Computation time per image is shown on the vertical axis on a logarithmic scale. All tests were run on a MacBook Pro (Early 2015) with an Intel Core i5 2.7 GHz CPU, 16 GB of memory, and an Intel Iris Graphics 6100 GPU. The web browser was Safari Technology Preview 30.
This example runs a Neural Style Transfer model [4]. The model takes two input images: a content image and a style image. It then generates an image that renders the content of the content image in the style of the style image.
We use the Chainer [5] implementation provided in [6] and the pre-trained model provided in [7]. The pre-trained model is transpiled by the GraphTranspiler into a graph descriptor, which is then executed by the DescriptorRunner. All computation is done in the web browser, not on a server.