Tuesday, September 22, 2009

Thoughts on JPC: An x86 PC Emulator in Pure Java

In this chapter, Rhys Newman and Christopher Dennis went through the design of JPC, an x86 PC emulator written in pure Java, and gave tips on constructing an efficient large-scale Java program. At the end, the authors gave a recipe for constructing a beautiful architecture.

The authors first presented their case for an emulator instead of a VM. They argued that VMs always need to rely on some degree of hardware support. In contrast, an emulator, especially a Java-based emulator, can run on any architecture. As the x86 architecture becomes even more dominant (except perhaps in the space of small embedded environments such as cell phones), this may not be much of an advantage. I think the user's choice between one and the other will come down to performance and security. No doubt the Java VM, being one of the earliest large-scale VM deployments, has proven to be reliable and secure. The authors' task now is to make it as efficient as possible.

I agree with the authors that emulating different pieces of hardware requires different considerations, as there is a gradient of complexity and execution time across the hardware components. It is the right tradeoff to focus on code clarity and modular design for the simple and less used components, and on ultimate performance for the CPU and memory. After all, according to Amdahl's law, when optimizing one component to improve overall system performance, the speedup is bounded by the fraction of time or resources that component accounts for.
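To make that bound concrete, here is a minimal sketch of the arithmetic in Java. The 5% figure is my own illustrative assumption, not a number from the chapter.

    // Amdahl's law: overall speedup when a fraction p of the execution time
    // is sped up by a factor s.
    static double amdahlSpeedup(double p, double s) {
        return 1.0 / ((1.0 - p) + p / s);
    }

    // If peripheral emulation accounts for only 5% of run time (p = 0.05),
    // then even a huge speedup of that code (s = 100) gives
    // amdahlSpeedup(0.05, 100) ~= 1.05 -- barely a 5% overall gain,
    // which is why the CPU and memory paths deserve the optimization effort.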

To increase the performance of JPC, it is necessary for the authors to have a good understanding of the JVM. From my own experience working on various hardware platforms, we usually take the same approach. Typically, during development one knows the initial platform family that the code will be deployed on, and so one writes code that takes advantage of the characteristics of that hardware to capture performance gains. However, as hardware progresses and new generations replace the old ones, they may have different characteristics, such as the cache line size or the number of registers. The special tricks used to enhance performance no longer make sense and sometimes even become harmful. Unfortunately there is simply no time to go back and rewrite the code, and sometimes it is hard to do anyway when the code needs to run on both generations of hardware. This usually goes unnoticed externally because the new hardware often gives better performance and thus masks the software's inefficiency; still, it means the code is no longer taking advantage of the hardware's features. Reading this chapter, I was thinking that having a VM or an emulator in the middle may be the solution: as new hardware comes out, only the VM or the emulator needs to be optimized for it, and the rest of the code running on top can take advantage of the new hardware indirectly.
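As a concrete, purely hypothetical example of the kind of trick I mean (not anything from JPC), consider padding a hot counter against false sharing based on an assumed 64-byte cache line:

    // Hypothetical sketch: a counter padded so that it occupies roughly its own
    // cache line, to avoid false sharing between threads. The seven extra longs
    // assume a 64-byte cache line; on a CPU with 128-byte lines the padding is
    // no longer enough, and the "optimization" quietly stops doing its job.
    // (Real JVMs may also reorder fields, which is part of why such tricks are fragile.)
    public final class PaddedCounter {
        volatile long value;
        long p1, p2, p3, p4, p5, p6, p7; // 7 * 8 bytes of padding after the value

        void increment() {
            value++;
        }
    }

Pushing decisions like this down into a VM or emulator means they can be revisited in one place when the hardware changes, instead of staying scattered and frozen throughout the application code.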

At the end, the authors gave a game plan for implementing a beautiful architecture. It is interesting that they talk about having an end-to-end prototype at an early stage and, at each stage, focusing the testing on the whole prototype. This runs counter to the usual division of labor I have seen, where the whole system is broken into modules and each module owner is responsible for perfecting his or her module until a final integration at the end. It seems we will need to change our habits: integrate early and integrate often.
