I am currently employed at Huawei as a Senior Engineer.
I was a PhD student in the College of Engineering and Computer Science at the Australian National University. My supervisor was Steve Blackburn. My research interest is the design and implementation of managed languages, i.e. the kind of languages that usually run on virtual machines and usually use garbage collection. My thesis introduced the abstraction layer of micro virtual machines, which aim to allow the development of higher-quality, higher-performance managed languages.
I have some open-source projects, for my research or for fun.
The Mu micro virtual machine: Mu is a so-called “micro virtual machine”, a minimal managed runtime for language implementations. Like a microkernel, a micro VM only includes the essential parts of traditional language VMs (such as the JVM and .NET), and off-loads non-essential parts to a separate layer (a micro VM “client”). This design is intended to provide a reliable foundation for the implementation of high-level languages. I am the chief architect of this project, and am responsible for its specification and its reference implementation.
JSON HTML Query Language: A simple tool for extracting text contents from HTML documents.
"Micro Virtual Machines: A Solid Foundation for Managed Language Implementation," Ph.D. thesis, College of Engineering and Computer Science, The Australian National University, 2017. Abstract:
Today new programming languages proliferate, but many of them suffer from poor performance and inscrutable semantics. We assert that the root of many of the performance and semantic problems of today's languages is that language implementation is extremely difficult. This thesis addresses the fundamental challenges of efficiently developing high-level managed languages.
Modern high-level languages provide abstractions over execution, memory management and concurrency. It requires enormous intellectual capability and engineering effort to properly manage these concerns. Lacking such resources, developers usually choose naive implementation approaches in the early stages of language design, a strategy which too often has long-term consequences, hindering the future development of the language. Existing language development platforms have failed to provide the right level of abstraction, and forced implementers to reinvent low-level mechanisms in order to obtain performance.
My thesis is that the introduction of micro virtual machines will allow the development of higher-quality, high-performance managed languages.
The first contribution of this thesis is the design of Mu, with the specification of Mu as the main outcome. Mu is the first micro virtual machine, a robust, performant, and light-weight abstraction over just three concerns: execution, concurrency and garbage collection. Such a foundation attacks three of the most fundamental and challenging issues that face existing language designs and implementations, leaving the language implementers free to focus on the higher levels of their language design.
The second contribution is an in-depth analysis of on-stack replacement and its efficient implementation. This low-level mechanism underpins run-time feedback-directed optimisation, which is key to the efficient implementation of dynamic languages.
The third contribution is demonstrating the viability of Mu through RPython, a real-world non-trivial language implementation. We also did some preliminary research of GHC as a Mu client.
With micro virtual machines providing a low-level substrate, language developers now have the option to build their next language on a micro virtual machine. We believe that the quality of programming languages will be improved as a result.
"Hop, Skip, & Jump: Practical On-Stack Replacement for a Cross-Platform Language-Neutral VM," in 14th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE 2018), March 24–25, 2018, Williamsburg, VA, USA.
"Stop and Go: Understanding Yieldpoint Behavior," in Proceedings of the Fourteenth ACM SIGPLAN International Symposium on Memory Management (ISMM 2015), Portland, OR, June 14, 2015. Abstract:
Yieldpoints are critical to the implementation of high performance garbage collected languages, yet the design space is not well understood. Yieldpoints allow a running program to be interrupted at well-defined points in its execution, facilitating exact garbage collection, biased locking, on-stack replacement, profiling, and other important virtual machine behaviors. In this paper we identify and evaluate yieldpoint design choices, including previously undocumented designs and optimizations. One of the designs we identify opens new opportunities for very low overhead profiling. We measure the frequency with which yieldpoints are executed and establish a methodology for evaluating the common case execution time overhead. We also measure the median and worst case time-to-yield. We find that Java benchmarks execute about 100 M yieldpoints per second, of which about 1/20000 are taken. The average execution time overhead for untaken yieldpoints on the VM we use ranges from 2.5% to close to zero on modern hardware, depending on the design, and we find that the designs trade off total overhead with worst case time-to-yield. This analysis gives new insight into a critical but overlooked aspect of garbage collector implementation, and identifies a new optimization and new opportunities for very low overhead profiling.
"Draining the Swamp: Micro Virtual Machines as Solid Foundation for Language Development," in 1st Summit oN Advances in Programming Languages (SNAPL 2015), 2015. Abstract:
Many of today’s programming languages are broken. Poor performance, lack of features and hard-to-reason-about semantics can cost dearly in software maintenance and inefficient execution. The problem is only getting worse with programming languages proliferating and hardware becoming more complicated.
An important reason for this brokenness is that much of language design is implementation-driven. The difficulties in implementation and insufficient understanding of concepts bake bad designs into the language itself. Concurrency, architectural details and garbage collection are three fundamental concerns that contribute much to the complexities of implementing managed languages.
We propose the micro virtual machine, a thin abstraction designed specifically to relieve implementers of managed languages of the most fundamental implementation challenges that currently impede good design. The micro virtual machine targets abstractions over memory (garbage collection), architecture (compiler backend), and concurrency. We motivate the micro virtual machine and give an account of the design and initial experience of a concrete instance, which we call Mu, built over a two year period. Our goal is to remove an important barrier to performant and semantically sound managed language design and implementation.
Email: wks1986 AT gmail DOT com