GraalVM — How We Turned Our Research into a Product
Host Intro (Gene Kim)
Thank you, Shannon.
As many of you may know, learning the Clojure programming language in 2016 reintroduced the joy of coding back into my life. It changed so much of my thinking about software architecture, and those themes were the basis of The Unicorn Project.
Clojure is a functional programming language that pioneered using immutable data structures, and it either runs on the JVM or it gets transpiled into JavaScript. Before learning Clojure, I had not spent much time using either Java or the JVM. At most, I spent maybe tens of hours using them, and that was in the early 2000s, nearly 20 years ago. I did not have much appreciation for it at the time, and I mostly associated it with Ops people complaining about out-of-memory errors, usually happening in the middle of the night.
But after learning Clojure, I gained so much appreciation for the JVM. It is one of the most battle-tested and performant runtimes, having benefited from billions of dollars of R&D invested over the last 30 years, running on tens of billions of instances worldwide, and running some of the most compute- and data-intensive business processes on the planet.
Believe it or not, it has become one of the most vibrant and innovative VM runtime ecosystems around. In my mind, the person who deserves so much of that credit is Dr. Thomas Wuerthinger, who created the GraalVM project when he joined Oracle Labs 12 years ago, after having spent time working on the famous V8 JavaScript JIT compiler that runs inside the Chrome browser.
I have been so blown away by his work over the years. Among other things, his team created a runtime that runs not just Java but also Python, Ruby, R, Smalltalk, and JavaScript, often faster than ever. They also created a faster JIT compiler that is written in Java as opposed to C++, and they can now create native executable binaries from JVM applications, with startup times and memory usage that are competitive with Go and C++.
It was such an honor to learn more about the incredible journey that he and his team have been on for over a decade. I just love their mission: make programmers more productive and their programs run faster. This is an amazing story of not just technical innovation, but doing it within a giant, 30-year-old ecosystem with tons of constraints, but succeeding despite it all, or maybe even because of it. Here is Thomas.
Dr. Thomas Wuerthinger
Thank you, Gene. Yes, I will talk about GraalVM and how we turned our research into a product.
My background is that I did a PhD at the university in Linz, Austria, which is a small university, but I was very lucky that this university had a collaboration with Sun Microsystems to do research on Java and the HotSpot virtual machine. I got immediately into this type of research and loved it. I was really trying to do new compiler optimizations, and this is how I really came to love the space of making programs run faster.
I got an internship in California, which was super exciting for me. Later, I did another internship at the Google V8 team to do a JavaScript compiler this time, not Java but JavaScript. In 2011 I had the choice: do I go to Google or go to Oracle? In the end, I decided to go for Oracle because there I could join Oracle Labs, and I felt like they would allow me to pursue more of a research agenda and try to do larger things.
This in the end really turned out to be a great decision for me, because I am really grateful that they hired me back then fresh from university and gave me the ability to try out things. I moved through all of the different stages of the career, from first researcher, later manager, director, and finally VP of an organization. I was always trying to make the initial research that we did back in 2011 successful, both in terms of usage and then also successful for Oracle.
We did this type of research in the context of Java, and it is important to note that Java itself comes out of a long line of research. It was developed at Sun Microsystems, where they started programming-language research. Initially the language was called Oak, and only later, before publishing it, it was renamed to Java, and it was then published in 1996. There was also a long period of programming-language research before they came out with the first version of Java.
Java is really dominant in the industry. There is this advertisement that is often running, that three billion devices are running Java. That is actually now too low a number. The current estimate is that more than 60 billion Java virtual machines are running out there, and even that estimate might be low, because Java is really dominant, specifically on the server side for running server-side workloads. We are very happy to be a part of the Java community.
For us, things did not start as a product or anything like that. We had a very long time where we first did some basic research. We created a vision paper that we published in 2013, titled "One VM to Rule Them All," and then we did a results paper a couple of years later. But these were still only academic results. We had some prototypes running and found that it was working, but there was not any production application yet.
In 2018, finally, we did an open-source launch. That launch was super important for us because this was the first time GraalVM got a lot of attention in the industry. It was really great to see that some early adopters would immediately start using the technology that we published back then, and it gave us such valuable feedback: how we could further develop the project, which features to focus on, which features resonate in the community, and which features do not resonate.
From this open-source launch, we got a lot of feedback on the project, and then finally in 2019 we did the first commercial launch of the Oracle product associated with the project. One line that I always remember, when we describe a research project in general, came from a manager who reminded me and the team that it is not a sprint, it is a marathon. This is very important because often when you do these types of projects you feel, oh, I just have to quickly do this, and then I will have my final breakthrough. In reality, the important part is to really keep going at a pace. You have to think of it as a marathon because you need to continue developing at a steady pace, and continue increasing your reach as you go.
I am very proud that with the GraalVM project we could incrementally, over time, increase our user base, increase the way it is used, and find different scenarios in which people would use GraalVM. This is for us really a very long-term effort, and you can see here from the number of years since the first ideas that we are now more than 10 years into the project.
One of the core ideas in the initial version of the project was that Java is such a great language, but the JIT compilers and the VM for Java are all written in C/C++, and can we do better? We thought, of course, we want to write the JIT compiler also in Java, because there are several real benefits from writing in Java.
One is a security benefit. This is related to the fact that it is a memory-safe language. There is a recent NSA advisory that people should make the strategic shift from programming languages with no memory protection to programming languages that are managed, like Java for example. Another aspect is robustness, meaning if our compiler has an error because there is a null pointer exception or an array out-of-bounds exception, then we can just gracefully ignore that and bail out of the compiler and not take down the whole JVM.
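The robustness point can be pictured with a minimal sketch. This is not GraalVM or JVMCI code: BailoutSketch, buggyCompile, and compileOrBailOut are hypothetical names, and the "compiler" is a stand-in with a deliberate bounds bug. It only shows how a failure inside a compiler written in Java surfaces as an ordinary exception that can be caught, so the method keeps running on the slower path.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical sketch: a failed compilation is abandoned instead of crashing the VM. */
final class BailoutSketch {

    /** Stand-in for a compiler with a bug: it indexes a fixed-size table without a bounds check. */
    static IntUnaryOperator buggyCompile(String methodName) {
        int[] table = new int[4];
        table[methodName.length()] = 1;   // throws ArrayIndexOutOfBoundsException for long names
        return x -> x * 2;                // the "optimized" version of the method
    }

    /** Try to compile; on a runtime exception, bail out and keep the interpreted version. */
    static IntUnaryOperator compileOrBailOut(String methodName, IntUnaryOperator interpreted) {
        try {
            return buggyCompile(methodName);
        } catch (RuntimeException e) {
            // The compiler failed, but the program can still run correctly, only slower.
            return interpreted;
        }
    }
}
```

Calling compileOrBailOut with a short name returns the optimized version, while a long name triggers the bug and returns the interpreted version unchanged.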
Java of course has better developer productivity, thanks to the rich ecosystem of Java IDEs around there. Java is also an interesting choice for a Java JIT compiler because of the dogfooding aspect. The reason is that the developers who are writing the JIT compiler, in this case, are writing Java every day. This means that they are thinking like Java programmers. They are optimizing their own programs, which gives additional motivation, and they just really understand a lot of constructs that are familiar to a Java developer but might not be as familiar to a C++ developer.
We really think this is a great choice, to write a Java JIT compiler in Java. In order to achieve that, we had to put an interface between the C/C++ HotSpot VM and our compiler. This interface is called JVMCI. When you download the GraalVM binaries, you get a Java HotSpot JVM, and in that HotSpot JVM the JIT compiler that is selected for running your application is then the Graal compiler. That Graal compiler is written in Java.
There was a recent announcement that we are very proud of and excited about: the GraalVM JIT technologies, which include the Graal JIT compiler, will be contributed to the OpenJDK project.
We developed the JIT compiler for many, many years. When you develop a technology for a very long time, you find its advantages but also some disadvantages. Some of the disadvantages that we learned over the years of JIT compilers are that, while they may be great for maximum throughput, they are not so great for the best memory footprint, and they are specifically not great for startup. Startup, when you run in a JIT mode, is very slow because the application needs to first warm up. The application needs to first run in an interpreter and then only gradually get JIT compiled.
This was a big problem for us also in the early days of the project, because our JIT compiler itself is written in Java. That JIT compiler needed to warm up, and it took a very, very long time until the GraalVM JIT on top of HotSpot would finally be able to then run the application in a fast way, because the GraalVM JIT itself needed to compile and warm up before it could even compile the application's code.
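The warm-up behavior described above can be sketched in a few lines. This is a toy model, not how HotSpot or GraalVM decide what to compile: WarmupSketch and COMPILE_THRESHOLD are hypothetical, and the threshold of 3 is an arbitrary assumption. It shows only the basic shape: the first calls run on a slow path, and the fast path takes over once the method has been called often enough.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical sketch of JIT warm-up: interpret first, switch to compiled code later. */
final class WarmupSketch {
    static final int COMPILE_THRESHOLD = 3;   // assumed value, chosen only for illustration

    private final IntUnaryOperator interpreted;
    private final IntUnaryOperator compiled;
    int interpretedCalls = 0;                 // how many calls ran on the slow path

    WarmupSketch(IntUnaryOperator interpreted, IntUnaryOperator compiled) {
        this.interpreted = interpreted;
        this.compiled = compiled;
    }

    int call(int arg) {
        if (interpretedCalls < COMPILE_THRESHOLD) {
            interpretedCalls++;               // still warming up
            return interpreted.applyAsInt(arg);
        }
        return compiled.applyAsInt(arg);      // warmed up: use the fast version
    }
}
```

If the compiler is itself a Java program in this model, its own methods go through the same slow phase before they can speed up anything else, which is the problem the next part of the talk addresses.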
This is why we came to the conclusion that the only way we can make this work is to write an ahead-of-time compiler for Java that is able to ahead-of-time compile our compiler and then link it with the HotSpot JVM. This is where Native Image was born. We created a mechanism where, instead of running a Java or JVM-based application in JIT mode, you can use Native Image to create a native binary from the application and then just invoke that native binary directly.
The architecture of this is that Native Image looks at your application's classes and at the JDK libraries, and analyzes all the code that is reachable from the main method of your program, so everything that can potentially be executed when your program runs. It does a so-called points-to analysis to determine this subset, and after this analysis it creates a final binary that contains only the code that it found to be reachable.
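The reachability idea can be illustrated with a simplified sketch. It assumes a call graph that is already known and given as a map from method name to callees. A real points-to analysis also has to work out which types can flow to each call site in order to resolve virtual calls, and this sketch does not attempt that. ReachabilitySketch and reachableFrom are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: collect every method reachable from an entry point. */
final class ReachabilitySketch {

    static Set<String> reachableFrom(String entry, Map<String, List<String>> calls) {
        Set<String> reachable = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        worklist.push(entry);
        while (!worklist.isEmpty()) {
            String method = worklist.pop();
            // add() returns false for methods already visited, which also handles cycles
            if (reachable.add(method)) {
                for (String callee : calls.getOrDefault(method, List.of())) {
                    worklist.push(callee);
                }
            }
        }
        return reachable;
    }
}
```

With a graph where main calls parse and run, run calls log, and a method named unused also calls log, the result contains main, parse, run, and log. The method unused is never visited, so in this model it would be left out of the binary.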
It can also contain an initial image heap. In this way, this technology is a little similar to the Smalltalk technology, where you would also be able to have a Smalltalk image snapshot. We sometimes call Native Image also snapshotting technology because it not only snapshots the current code of the application, it also takes a snapshot of the heap. This allows you to pre-initialize data structures. We use these mechanisms a lot also in our JIT compiler when we snapshot that JIT compiler into a Native Image in order to really start up fast when the JIT compiler should be ready to compile the Java user application.
Apart from creating technology for Java, we always had some ambition in the project also to create technology for other languages. This came a little bit out of my background because I was doing this internship at Google V8 and doing a JavaScript JIT compiler there. I always thought it was a little bit of duplicated or wasted effort that every language has its own JIT compiler, every language has its own interpreter, its own garbage collector. In reality, a very large part of the technology stack can actually be reused between languages.
Every language is of course different, and Java and JavaScript are very different languages. But in the lower part of the compiler pipeline, there are a lot of similarities, and those similarities can be used to share the technology, to share the compiler, to share the garbage collector, and so on. Back then, we really wanted to do this, but there was not actually any explicit budget given for it. So we had to find a way to create multiple front ends for the compiler with the lowest effort possible. In this constrained environment, we were thinking very hard: how can we do this without taking a lot of engineering time? Because it was more like our Friday 20% project than an actual agenda for the research project.
That is how we came up with the Truffle system, where we have a system where you only have to write an interpreter for a language, which is much easier than writing a compiler, and then our GraalVM compiler is automatically transforming that interpreter into a compiler using partial evaluation. The way it works is that you write an AST interpreter for your language. The interpreter is a specializing interpreter, meaning it has to be an interpreter that is specializing on types or taking profile input from the application to specialize, for example, an operation to be an integer operation, a string operation, or a double operation.
After it specializes, and it sees that it should now just-in-time compile this AST interpreter, this is when we use partial evaluation to automatically derive the machine code from the AST interpreter. This system is what allows the GraalVM virtual machine to run many different languages while sharing the largest part of its compilation infrastructure between languages.
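A specializing AST interpreter node can be sketched as follows. This is plain Java, not the Truffle API: SketchNode, Slot, AddNode, and the State enum are hypothetical names, and integer overflow is ignored. The node records which operand types it has seen and then takes a guarded fast path for them. The partial evaluation step that turns such an interpreter into machine code is not shown.

```java
/** Hypothetical base class for AST nodes of a small guest language. */
abstract class SketchNode {
    abstract Object execute();
}

/** Reads a mutable slot, standing in for a variable or constant in the guest program. */
final class Slot extends SketchNode {
    Object value;
    Slot(Object value) { this.value = value; }
    @Override Object execute() { return value; }
}

/** An addition node that specializes on the operand types it observes at run time. */
final class AddNode extends SketchNode {
    enum State { UNINITIALIZED, INT, DOUBLE, GENERIC }

    private final SketchNode left;
    private final SketchNode right;
    State state = State.UNINITIALIZED;

    AddNode(SketchNode left, SketchNode right) {
        this.left = left;
        this.right = right;
    }

    @Override
    Object execute() {
        Object l = left.execute();
        Object r = right.execute();
        // Fast paths, each guarded by a type check that matches the current specialization.
        if (state == State.INT && l instanceof Integer && r instanceof Integer) {
            return (Integer) l + (Integer) r;
        }
        if (state == State.DOUBLE && l instanceof Double && r instanceof Double) {
            return (Double) l + (Double) r;
        }
        if (state != State.GENERIC) {
            respecialize(l, r);   // first run, or a guard failed
        }
        return generic(l, r);
    }

    /** Pick a specialization on the first run; fall back to GENERIC if a guard ever fails. */
    private void respecialize(Object l, Object r) {
        boolean firstRun = state == State.UNINITIALIZED;
        if (firstRun && l instanceof Integer && r instanceof Integer) {
            state = State.INT;
        } else if (firstRun && l instanceof Double && r instanceof Double) {
            state = State.DOUBLE;
        } else {
            state = State.GENERIC;
        }
    }

    /** Slow path that handles every operand combination. */
    private static Object generic(Object l, Object r) {
        if (l instanceof Integer && r instanceof Integer) {
            return (Integer) l + (Integer) r;
        }
        if (l instanceof Number && r instanceof Number) {
            return ((Number) l).doubleValue() + ((Number) r).doubleValue();
        }
        return String.valueOf(l) + String.valueOf(r);   // string concatenation
    }
}
```

Once a node has settled into a state such as INT, a compiler that treats the state as a constant can drop the other branches and keep only the integer addition and its guard. That is the kind of information the specialization is meant to provide.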
This makes GraalVM not just a compiler for Java or JVM-based languages like Scala, Clojure, or Kotlin. It also enables GraalVM to run languages that are typically not associated with the JVM, like JavaScript, Ruby, R, Python, C, and C++. The important part is that all of these languages use this mechanism for the automatic transformation of an interpreter into a compiler, and they share the underlying infrastructure of the JIT compiler, of the garbage collector, and so on. This also makes those languages interoperable.
Among those languages, for example, we have a great collaboration for Ruby with the Shopify people, who are contributing to our Ruby engine and who are currently trying to deploy this Ruby engine as part of the Shopify platform. We also have a lot of work going into Python, where we are developing GraalPy, and we are working with the Python community to improve the native interface of Python to enable better-performing Python implementations.
When you hear about GraalVM currently, you hear primarily about its Java-related aspects, but there are some more forward-looking projects going on as well, where we are bringing these additional languages into the GraalVM universe. The idea behind GraalVM in the initial research paper title still stands, which is that GraalVM should be the one VM to rule them all, meaning it is one virtual machine to rule all languages, or at least all languages that are in the managed category or very similar to a managed language. We even have an interpreter for C and C++, which makes it possible to compile and run C/C++ applications in a managed way.
We are running these applications not just on top of the Java platform, on top of OpenJDK HotSpot. We also have a way to run these applications as part of the Node.js platform. We have an embedding in the Oracle Database to run JavaScript with GraalVM. We are working on an embedding in MySQL, and then you can do stand-alone execution as well for all of these languages. This is where we have a certain multiplication effect, because you can now choose the languages you want to enable in the platform of your choice.
When we were doing this polyglot research, from an academic point of view we always tried to think about it as, yeah, we want to combine languages, we want to have one language talk to the other. In the end, the main use case that was then given as feedback from the community and ecosystem was that we do not necessarily need to have the languages talk to each other, but we really like this polyglot embedding aspect of embedding one language into the platform and embedding one language into another language.
We have two very important internal users at Oracle. One is Oracle Database. The other one is Oracle NetSuite, which hosts the JavaScript engine of GraalVM. We also have external users of such embeddings. There is a recent article from Adyen published on their blog about how they use this type of embedding and these polyglot facilities of GraalVM to run C/C++ applications safely in the context of Java. Please go to their blog article if you want to hear more about their use case.
One thing which is still the most finished or most used aspect of GraalVM, the most established aspect of GraalVM, is GraalVM Native Image. For GraalVM Native Image, I really think that this is the best way to run Java applications in the cloud, and there are four reasons for that.
One is the startup is instant. For anything where startup matters, GraalVM Native Image is a no-brainer. It is about 50 times or more faster in startup compared to when you would run your Java application in a JIT mode. One area that is growing more and more in that cloud space where startup actually matters is serverless. If you have a serverless application, you are using Lambda or functions, then GraalVM can really lower your startup time.
It also has a better memory footprint, and that memory footprint is two to five times smaller. It is actually running the application with lower memory, and in the cloud memory typically is what you pay for. If you can run your application in an instance with two times lower memory, it is often two times cheaper. This is not the case for all cloud applications, but for many of them this can make sense.
Why is the memory smaller? The primary reason is that the memory for the classes and the metadata that would typically be necessary for running a Java application is not necessary when ahead-of-time compiling the program. All of the metaspace is not needed because we have already created the machine code. We do not need to keep this type of memory around, and this memory can be quite large. It includes the names of methods, the code profiling feedback, and so on.
From a security perspective, it is also interesting to ahead-of-time compile applications because in this scenario you can check the dependency tree of your application at build time, since no additional code can be loaded afterwards. In general, this also means a reduced attack surface, because only the code paths that are proven to be reachable by your application are included in the image, and no new code can be loaded later at runtime.
Finally, the packaging can be more compact because you package the application and also package the runtime together with the application. You can run this type of binary then in a from-scratch Docker container and have a very small distribution size of your application. There is a link below for detailed demos that prove these points.
If you do not believe us about these benefits, you can also read what the competition says about us. Specifically, there is a big user story published on the AWS blog about Disney+ using GraalVM Native Image in the serverless space to reduce the startup time for serverless functions. The numbers that they published there show that the cold startup of these serverless functions is suddenly 10 times faster when using GraalVM Native Image. Having such a fast cold start really helps with scaling up and down quickly in the cloud and not having to keep warmed-up instances around. It just in general helps with being flexible.
It also shows that GraalVM Native Image still performs very well when you give it a much smaller memory instance. This again translates to real cost savings, because the memory is typically what you pay for in the cloud. It can also lead to a lower carbon footprint because, in the end, you do not pay the price in terms of compute power usage whenever you start the application. There is a lot of energy wasted when you start up in JIT mode all the time and you use these additional seconds of CPU time and power to start up the application.
With this, at the end of my presentation, I would like to thank you for the attention. The help I am looking for is, first of all, anybody who has not heard about GraalVM and who is interested in the tech, please check it out and let us know about the feedback you have about the technology. Specifically, we are always interested in getting this reality check as to, well, this kind of feature here is something I would actually want to use in practice; this other feature is something that for me does not make a difference.
The GraalVM team is often quite research-minded, for better or worse, meaning we are developing more with the research mentality. This type of reality check from people who deploy the technology in practice is absolutely invaluable for my team. It is super important. If you want to experiment with some of the more forward-looking features of GraalVM, we really encourage that, and we are happy to speak with you. We are happy to guide you along these experiments. This has been, over the last years for the project, already very important and will continue to be important moving forward as we continue to innovate in the space.
The other aspect is when you have a very heavy Java workload. This is an area where our GraalVM JIT compiler, the Enterprise GraalVM JIT compiler, can make a difference for throughput, typically about 20% faster for these throughput-heavy workloads. If you have a Java workload where you say, plus 20% throughput makes a difference for me, then we are also quite interested to hear from you. We can evaluate together whether this works out in practice, because you can of course download GraalVM for free. There is a Community Edition that is under a free open-source license. There is also an Enterprise Edition that requires a license for production use, but even that Enterprise Edition is free for performance evaluation. You can test whether it makes a difference for you, and you can do that free of charge. Just go to our website and download the technology.
Thank you very much for your attention, and we are happy to hear from you. Thank you.