With my growing interest in the Go programming language, I decided to compare the performance of Go goroutines against Java virtual threads.
At my core I am a Java programmer, and I want to keep writing programs in Java. The problem I faced was running hundreds or thousands of threads, along with the scalability and resource usage issues that came with them.
To be more exact, I have a program that checks tens of thousands of websites every few minutes to ensure that they are up and running. Whenever I had many threads fetching data over the web, there were many timeouts. This was frustrating, so I kept thinking about better solutions.
I started looking at Go's goroutines. They were promising and seemed a lot more scalable. I started learning Go and converted my program from Java. I was glad I did: most of my timeout issues were gone. Instead of launching 100-1000 URL fetches at a time, I could fetch everything at once and let the Go runtime do its thing. My code got much simpler.
Java Fixed Threads vs Java Virtual Threads Performance
With Java virtual threads finally in the picture, I decided to compare them against standard Java system (platform) threads. I also decided to compare Java virtual thread performance against Go goroutine performance.
Performance Testing of Java System Threads
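// Needs java.time.{Duration, Instant} and java.util.concurrent.{ExecutorService, Executors, TimeUnit}.
// NUM_THREADS is a constant defined in the enclosing class.
private static void testSystemThreads() throws InterruptedException {
    ExecutorService executor = Executors.newCachedThreadPool();
    Instant start = Instant.now();
    // Submit NUM_THREADS jobs that each block for one second.
    for (int i = 0; i < NUM_THREADS; i++) {
        executor.execute(() -> {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        });
    }
    // Stop accepting new jobs, then wait for the submitted ones to finish.
    // Note: awaitTermination returns once the timeout expires even if jobs are
    // still running, so the timeout has to exceed the expected run time.
    executor.shutdown();
    executor.awaitTermination(10, TimeUnit.SECONDS);
    Instant end = Instant.now();
    Duration duration = Duration.between(start, end);
    System.out.println("Duration: " + duration.toString());
}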
I used the above code to measure thread creation and run times for all of the Java tests.
I ran it first with the CachedThreadPool executor and then with the FixedThreadPool executor.
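Switching executors between runs is easier when the executor is passed in as a parameter. The following is a minimal sketch of that idea, not the exact harness used for the numbers in this post: the class `ExecutorTimer`, the method `timeTasks`, and the task count in `main` are hypothetical names and values chosen for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical harness: class and method names are illustrative only.
class ExecutorTimer {

    // Submits numTasks sleeping tasks to the executor and returns the wall-clock
    // time from the first submission until every task has finished.
    static Duration timeTasks(ExecutorService executor, int numTasks, long sleepMillis)
            throws InterruptedException {
        Instant start = Instant.now();
        for (int i = 0; i < numTasks; i++) {
            executor.execute(() -> {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt flag set
                }
            });
        }
        executor.shutdown();
        // Wait without a practical limit so that slow runs are not cut short.
        executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        return Duration.between(start, Instant.now());
    }

    public static void main(String[] args) throws InterruptedException {
        int numTasks = 1_000; // assumed value; the tests in this post go up to 1,000,000
        System.out.println("Cached:      " + timeTasks(Executors.newCachedThreadPool(), numTasks, 1000));
        System.out.println("Fixed (1k):  " + timeTasks(Executors.newFixedThreadPool(1_000), numTasks, 1000));
        System.out.println("Fixed (10k): " + timeTasks(Executors.newFixedThreadPool(10_000), numTasks, 1000));
    }
}
```

Each call gets a fresh executor because an executor cannot accept new tasks after `shutdown()`.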
Java System Thread Result Summary
Tasks submitted | CachedThreadPool | 1k FixedThreadPool | 10k FixedThreadPool |
---|---|---|---|
1,000 | 1.09 / 2.54 | 1.08 / 2.57 | 1.09 / 2.66 |
10,000 | 2.96 / 4.46 | 10.16 / 11.622 | 2.84 / 4.71 |
100,000 | 7.88 / 17.32 | 40.74 / 102 | 11.79 / 13.446 |
500,000 | 27.92 / 75 | 503.66 / 505 | 52.56 / 54.34 |
1,000,000 | 28.62 / 147 | 602.06 / 1,008 | 102.53 / 104 |
Baseline threads: 4050 (this number varies, because the thread counts of other running processes change it)
Java CachedThreadPool
When using the CachedThreadPool, the ramp-up and ramp-down of thread creation and execution put the CPU at 100% usage once the submitted task count was higher than 50,000.
The OS created almost 20k system threads to execute the 100k jobs, and over 40k system threads to run the 500k submitted jobs.
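Thread counts like these can also be read from inside the JVM. The sketch below uses the standard `java.lang.management.ThreadMXBean` to report the peak number of live JVM threads during a run. It is an assumption that this resembles how you would want to measure it: the class `ThreadCountProbe` and method `peakThreadsFor` are hypothetical, and the numbers cover only this JVM's threads, not the system-wide baseline mentioned above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical probe: class and method names are illustrative only.
class ThreadCountProbe {

    // Runs numTasks sleeping tasks on a cached thread pool and returns the
    // peak number of live threads the JVM reported during the run.
    static int peakThreadsFor(int numTasks, long sleepMillis) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        threads.resetPeakThreadCount(); // restart the peak from the current live count
        ExecutorService executor = Executors.newCachedThreadPool();
        for (int i = 0; i < numTasks; i++) {
            executor.execute(() -> {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        executor.shutdown();
        executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        return threads.getPeakThreadCount();
    }

    public static void main(String[] args) throws InterruptedException {
        int before = ManagementFactory.getThreadMXBean().getThreadCount();
        int peak = peakThreadsFor(1_000, 1000); // assumed task count for a quick run
        System.out.println("Live threads before: " + before + ", peak during run: " + peak);
    }
}
```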
Java FixedThreadPool
Using the FixedThreadPool executor, creating 1,000 system threads increased memory usage by approximately 200 MB.
CPU usage was nominal, never really going above 2-3%.
The time taken was much higher than with the CachedThreadPool once the number of submitted jobs exceeded the number of available threads.
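The slowdown follows from simple arithmetic: when every job sleeps for one second and only a fixed number of jobs can run at once, the jobs run in rounds. A small sketch of that lower bound, using a hypothetical helper `lowerBoundSeconds` that is not part of the test code:

```java
// Hypothetical helper for estimating fixed-pool run times.
class PoolTimeEstimate {

    // Lower bound on wall-clock seconds when every task blocks for taskSeconds
    // and at most poolSize tasks can run at once: ceil(tasks / poolSize) rounds.
    static long lowerBoundSeconds(long tasks, long poolSize, long taskSeconds) {
        long rounds = (tasks + poolSize - 1) / poolSize; // integer ceiling division
        return rounds * taskSeconds;
    }

    public static void main(String[] args) {
        long[] taskCounts = {1_000, 10_000, 100_000, 500_000, 1_000_000};
        for (long tasks : taskCounts) {
            System.out.printf("%,d tasks: 1k pool >= %d s, 10k pool >= %d s%n",
                    tasks,
                    lowerBoundSeconds(tasks, 1_000, 1),
                    lowerBoundSeconds(tasks, 10_000, 1));
        }
    }
}
```

For example, 10,000 jobs on 1,000 threads need at least 10 rounds, which lines up with the 10.16 in the table, and 500,000 jobs on 1,000 threads need at least 500 seconds, against a measured 503.66.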
CachedThreadPool is recommended only in cases where the jobs are of very short duration. If you are going to do parallel processing of image, video, or analytics data, you may be better off using a FixedThreadPool.
I did run the test using 100,000 threads, but that slowed down the system, as it took many minutes for the OS to create 100,000 system threads. As such, I won’t recommend creating an excessive number of threads.
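As an illustration of the fixed-pool recommendation for CPU-heavy work, here is a minimal sketch that splits a computation across a pool with one thread per available processor. The sizing rule, the class `CpuBoundPool`, and the sum-of-squares workload are all assumptions made for this example; they stand in for real image, video, or analytics processing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example of CPU-bound work on a small fixed pool.
class CpuBoundPool {

    // Sums the squares of 0..n-1 by splitting the range into one chunk per worker.
    static long sumOfSquares(int n) throws Exception {
        int workers = Runtime.getRuntime().availableProcessors(); // assumed sizing rule
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (n + workers - 1) / workers; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(n, w * chunk);
                final int to = Math.min(n, from + chunk);
                parts.add(pool.submit(() -> {
                    long sum = 0;
                    for (long i = from; i < to; i++) {
                        sum += i * i;
                    }
                    return sum;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that chunk is done
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(1_000_000));
    }
}
```

With a pool this small, the thread count stays fixed no matter how large the input is, which avoids the thread-creation cost described above.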
Java Virtual Thread Performance Result Summary
Next I ran the same code using the Java virtual thread executor (newVirtualThreadPerTaskExecutor).
Tasks submitted | VirtualThreadPerTaskExecutor (time in seconds) |
---|---|
1,000 | 1.04 |
10,000 | 1.16 |
100,000 | 2.35 |
500,000 | 10.47 |
1,000,000 | 10.73 |
Wow, using virtual threads is much better than using OS system threads. Well, at least in this case. As the number of submitted jobs went up, the runtime went up only slightly in absolute terms.
Memory usage was very constrained as well, going up by 1.3 GB when 1,000,000 jobs were submitted. About 75 system threads were created in this case.
With 100,000 submitted jobs, about 65 new threads were created.
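For reference, the virtual thread variant of the test can be sketched as follows. This assumes Java 21 or later, where `Executors.newVirtualThreadPerTaskExecutor()` is available and `ExecutorService` can be used in try-with-resources. The class `VirtualThreadTimer`, the method `timeVirtualTasks`, and the task count in `main` are hypothetical and may differ from the code used for the table above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical harness for the virtual thread run; requires Java 21 or later.
class VirtualThreadTimer {

    // Starts one virtual thread per task and returns the total wall-clock time.
    static Duration timeVirtualTasks(int numTasks, long sleepMillis) {
        Instant start = Instant.now();
        // Leaving the try block calls close(), which waits for all submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < numTasks; i++) {
                executor.execute(() -> {
                    try {
                        Thread.sleep(sleepMillis);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
        return Duration.between(start, Instant.now());
    }

    public static void main(String[] args) {
        int numTasks = 100_000; // assumed value; the tests in this post go up to 1,000,000
        System.out.println("Duration: " + timeVirtualTasks(numTasks, 1000));
    }
}
```

Because every task gets its own virtual thread, there is no pool size to tune here; the JVM multiplexes the virtual threads onto a small number of system threads.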
Go Goroutine Performance Testing
TL;DR: Go's virtual threads, aka goroutines, are much faster than what Java can do at this time.
I used the following code for testing Go performance:
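package main

import (
	"fmt"
	"sync"
	"time"
)

// wg tracks the goroutines that are still running.
var wg sync.WaitGroup

func main() {
	start := time.Now()
	// Launch exactly 1,000,000 goroutines.
	for i := 0; i < 1000000; i++ {
		wg.Add(1)
		go testthreads()
	}
	wg.Wait()
	// Print the elapsed time in milliseconds.
	fmt.Println(time.Since(start).Milliseconds())
}

// testthreads simulates a blocking job by sleeping for one second.
func testthreads() {
	defer wg.Done()
	time.Sleep(1 * time.Second)
}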
Tasks submitted | Goroutines (time in seconds) |
---|---|
1,000 | 1.007 |
10,000 | 1.050 |
100,000 | 1.538 |
500,000 | 3.32 |
1,000,000 | 4.911 |
Although I have not shown the number of actual system threads created for each run, across multiple runs of the 1 million submitted jobs I found that the Go runtime created about 10 system threads, whereas the JVM created close to 75 operating system threads.
Go vs Java 1,000,000 Submitted Jobs
Here I discuss a few of the differences I noted between Go and Java.
System Baseline for Java
Go 1,000,000 Goroutines started:
Memory usage: 4 GB
Java 1,000,000 Jobs with 10,000 FixedThreadPool
Memory usage:
Java 1,000,000 Jobs with VirtualThreadPool
System Baseline for Go
Go 1,000,000 Virtual Threads with Goroutines
Brief Summary
- CPU: Virtual threads in both Go and Java took up more CPU than fixed threads in Java.
- Memory usage: Go's virtual threads used at least double the memory that Java used.
- System threads: Java created significantly more system threads than Go to do the same work: 75 (Java) vs 10 (Go).
- Source code: I don’t really want to judge which code looked better; I found the code in both languages easy to understand. Go may have been simpler to write, but again, I have no preference.