Go Goroutines vs Java Virtual Threads Performance Comparison

With my growing interest in the Go programming language, I decided to compare the performance of Go goroutines against Java virtual threads.

At my core I am a Java programmer, and I am interested in continuing to write programs in Java. The problem I faced was running hundreds or thousands of threads, along with the scalability and resource usage issues that came with them.

To be more exact, I have a program that checks tens of thousands of websites every few minutes to ensure that they are up and running. Whenever I had many threads fetching data over the web at once, there were many timeouts. This was frustrating, so I kept thinking about better solutions.

I started looking at Go goroutines. They were promising and seemed a lot more scalable. I started learning Go and converted my program code from Java. I was glad I did: most of my timeout issues were gone. Instead of launching 100-1000 URL fetches at a given time, I was able to fetch everything at the same time and let the Go runtime do its thing. My code got much simpler.

Java Fixed Threads vs Java Virtual Threads Performance

With Java virtual threads finally in the picture, I decided to compare them against standard Java system (OS) threads. I also decided to compare Java virtual thread performance against Go goroutine performance.

Performance Testing of Java System Threads

Java
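import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadTest { // class name is illustrative
    // Number of tasks to submit; the value here is an example, it was varied per run.
    private static final int NUM_THREADS = 1_000;

    public static void main(String[] args) throws InterruptedException {
        testSystemThreads();
    }

    private static void testSystemThreads() throws InterruptedException {
        ExecutorService executor = Executors.newCachedThreadPool();

        Instant start = Instant.now();
        for (int i = 0; i < NUM_THREADS; i++) {
            // Each task only sleeps for one second to simulate blocking work.
            executor.execute(() -> {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // restore the interrupt flag
                }
            });
        }
        executor.shutdown();
        // This timeout caps the measured duration; raise it for runs expected to take longer.
        executor.awaitTermination(10, TimeUnit.SECONDS);

        Instant end = Instant.now();
        Duration duration = Duration.between(start, end);
        System.out.println("Duration: " + duration);
    }
}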

I used the above code to measure thread creation and run times in the Java tests.

I ran it first with the CachedThreadPool executor and then with the FixedThreadPool executor.

Java System Thread Result Summary

Number of Threads | CachedThreadPool | 1k FixedThreadPool | 10k FixedThreadPool
1,000             | 1.09 / 2.54      | 1.08 / 2.57        | 1.09 / 2.66
10,000            | 2.96 / 4.46      | 10.16 / 11.622     | 2.84 / 4.71
100,000           | 7.88 / 17.32     | 40.74 / 102        | 11.79 / 13.446
500,000           | 27.92 / 75       | 503.66 / 505       | 52.56 / 54.34
1,000,000         | 28.62 / 147      | 602.06 / 1,008     | 102.53 / 104

Note: All times shown are in seconds.

Baseline threads: 4050 (this number varies, because the thread counts of other running processes change it)

Java performance testing - baseline threads

Java CachedThreadPool

When using the CachedThreadPool, the ramp-up and ramp-down of thread creation and execution put the CPU at 100% usage once the submitted task count was higher than 50,000.

Almost 20k system threads were created by the OS to execute the 100k jobs and over 40k system threads were created to run with the 500k submitted jobs.

Java CachedThreadPool test run with 1 million submitted jobs.

Java FixedThreadPool

Using the FixedThreadPool executor, creating 1,000 system threads increased memory usage by approximately 200MB.

CPU usage was nominal, never really going above 2-3%.

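The time taken was much higher than with the CachedThreadPool once the number of submitted jobs exceeded the number of threads in the pool. Each job sleeps for one second, so a pool of N threads can finish at most N jobs per second while the remaining jobs wait in the queue.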
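The relationship between pool size and run time in the FixedThreadPool columns can be checked with a little arithmetic. The following is a minimal sketch in Go; the function name estimateSeconds is made up for this illustration. It assumes every job does nothing but sleep for one second, as in the test code, and it ignores thread creation and scheduling overhead.

```go
package main

import "fmt"

// estimateSeconds returns a rough lower bound on the wall-clock time for a
// fixed pool: the jobs run in batches of poolSize, and each batch takes
// sleepSeconds. Thread start-up and scheduling costs are not modeled.
func estimateSeconds(jobs, poolSize, sleepSeconds int) int {
	batches := (jobs + poolSize - 1) / poolSize // ceiling division
	return batches * sleepSeconds
}

func main() {
	jobCounts := []int{1000, 10000, 100000, 500000, 1000000}
	for _, pool := range []int{1000, 10000} {
		for _, jobs := range jobCounts {
			fmt.Printf("pool=%d jobs=%d -> about %d s\n",
				pool, jobs, estimateSeconds(jobs, pool, 1))
		}
	}
}
```

For 1,000,000 jobs this gives 1,000 seconds for the 1k pool and 100 seconds for the 10k pool, which is close to the measured 1,008 and 104 seconds.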

I did run the test using 100,000 threads, but that slowed down the system, as it took many minutes for the OS to create 100,000 system threads. As such, I would not recommend creating an excessive number of threads.

Java Virtual Thread Performance Result Summary

Next, I ran the same code using the Java virtual thread executor (newVirtualThreadPerTaskExecutor).

Tasks submitted | VirtualThreadPerTaskExecutor (time in seconds)
1,000           | 1.04
10,000          | 1.16
100,000         | 2.35
500,000         | 10.47
1,000,000       | 10.73

Wow, using virtual threads is much better than using OS system threads. Well, at least in this case. As the number of submitted jobs went up, the runtime only went up slightly in absolute terms.

Memory usage was very constrained as well, going up by 1.3 GB when 1,000,000 jobs were submitted. The number of system threads created was about 75 in this case.

With 100,000 submitted jobs, there were about 65 or so new threads created.

Go Goroutine Performance Testing

TLDR: Go virtual threads, aka goroutines, are much faster than what Java can do at this time.

I used the following code for testing Go performance:

Go
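package main

import (
	"fmt"
	"sync"
	"time"
)

// wg tracks the goroutines that are still running.
var wg sync.WaitGroup

func main() {
	start := time.Now()
	// Start exactly 1,000,000 goroutines.
	for i := 0; i < 1000000; i++ {
		wg.Add(1)
		go testthreads()
	}

	// Block until every goroutine has called wg.Done.
	wg.Wait()

	// Elapsed wall-clock time in milliseconds.
	duration := time.Since(start).Milliseconds()
	fmt.Println(duration)
}

// testthreads simulates blocking work by sleeping for one second.
func testthreads() {
	defer wg.Done()
	time.Sleep(1 * time.Second)
}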

Tasks submitted | Goroutines (time in seconds)
1,000           | 1.007
10,000          | 1.050
100,000         | 1.538
500,000         | 3.32
1,000,000       | 4.911

If you look at the numbers above, you will find that the runtime is much faster than what Java can do at this time.

Although I have not shown the number of actual system threads created for each run, in multiple runs of the 1 million submitted jobs I found that the Go runtime created about 10 system threads, whereas the JVM created close to 75 operating system threads.

Go vs Java 1,000,000 Submitted Jobs

Next, I am going to briefly discuss the differences I noted between Go and Java.

System Baseline for Java

System baseline chart for Go vs Java thread comparison.

Go 1,000,000 Goroutines started:

Memory usage: 4 GB

Java 1,000,000 Jobs with 10,000 FixedThreadPool

Java 10k Threads 100k Jobs Running

Memory usage:

Java 1,000,000 Jobs with VirtualThreadPool

Java 1mil Virtual Threads

System Baseline for Go

Go system baseline for Java vs Go thread comparison.

Go 1,000,000 Virtual Threads with Goroutines

Go 1mil Goroutines Test

Brief Summary

  • CPU: Virtual threads in both Go and Java took up more CPU than fixed threads in Java.
  • Memory Usage: Go goroutines used at least twice as much memory as Java did.
  • System threads: Java created significantly more system threads than Go to do the same work: about 75 (Java) vs 10 (Go).
  • Source Code: I don’t really want to judge which code looked better, as I found the code for both languages easy to understand. Go may have been simpler to write, though. But again, no preferences.
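On the memory point, the figures above come from watching the whole system, which includes far more than goroutine stacks. As a complementary check, the following rough sketch asks the Go runtime itself how much stack memory a parked goroutine costs. The helper stackBytesPerGoroutine is made up for this illustration, and it only looks at the StackInuse field of runtime.MemStats, so it will not match the process-level numbers.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// stackBytesPerGoroutine parks n goroutines and returns the growth in
// stack memory in use (runtime.MemStats.StackInuse) divided by n.
func stackBytesPerGoroutine(n int) uint64 {
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	release := make(chan struct{})
	var started, done sync.WaitGroup
	started.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			started.Done()
			<-release // stay parked until the measurement is finished
		}()
	}
	started.Wait() // every goroutine now exists and has a stack
	runtime.ReadMemStats(&after)
	close(release)
	done.Wait()

	if after.StackInuse <= before.StackInuse {
		return 0 // avoid unsigned underflow if nothing grew
	}
	return (after.StackInuse - before.StackInuse) / uint64(n)
}

func main() {
	fmt.Printf("about %d bytes of stack per parked goroutine\n",
		stackBytesPerGoroutine(100000))
}
```

Multiplying the result by the number of goroutines gives a rough idea of how much of the total memory growth is stack space and how much comes from elsewhere.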