INTRODUCTION TO JVM ARCHITECTURE

Ravindran Kugan
10 min read · May 6, 2021


What is Java?

Many programmers and Software Engineering students know a bit about OOP, which is Object Oriented Programming. The OOP language that most people use is Java. Even though students and other programmers know how to create a program using Java, many do not know the basics of Java and the architecture it follows. So in this article I will talk a little about Java architecture, including some information about the JDK (Java Development Kit) and the JRE (Java Runtime Environment), and will mainly focus on the internal structure of the JVM (Java Virtual Machine).

Before we talk about the JVM we need to look at a core concept of Java that is taught at university. Java has a concept known as WORA, Write Once Run Anywhere. If this statement is true, then a Java program that is written and compiled on a Windows system should be able to run on a Mac or a Linux machine using the same compiled bytecode. It is not possible to run that bytecode directly on those systems, because their architectures differ from each other and none of them understands bytecode natively. What makes WORA work is the JVM: each platform has its own JVM implementation, and that JVM translates the same bytecode into instructions for the machine it runs on. So WORA holds wherever a compatible JVM is available. (Android is a special case, since it converts Java bytecode to its own format instead of running class files on a JVM.)

The architecture of Java can be seen in the diagram below.

JAVA Architecture

As seen from this diagram, in order to get a JVM, the JDK or the JRE must be installed on your system. The JDK should be installed if you wish to develop Java programs; the JRE is installed along with the JDK. If you are not a programmer and only wish to run Java applications, then installing the JRE alone is enough.

What is a VM?

VM has two parts, Virtual and Machine. Virtual means not physically existing, and a machine is something that helps make our work easier. So virtual machines are machines that help us do tasks but do not physically exist. VMs can be separated into two types.

1. SVM: System Virtual Machine. An SVM uses one or more hardware components to create environments for one or multiple users. The environments are completely independent of each other. Ex: hypervisors such as Xen

2. AVM: Application Virtual Machine. Though it uses parts of the hardware, it is not fully dependent on it. Here software creates a platform for other software to run on. It converts inputs into different outputs. Ex: JVM, CLR, PVM

How is the JVM deployed?

After we write a Java program, the first thing we do is compile the code. Compiling the code gives us the class file, which contains the bytecode. Let’s call the class file HelloWorld.class. To run the class file we use the following command.

java HelloWorld

When the command “java” is typed, the JRE runs native code, written in C/C++ and specific to the OS, to create the Java Virtual Machine (JVM) that executes the Java program. At any given moment there can be more than one JVM instance. The number of JVM instances depends on the number of Java programs running: if there are 3 Java programs running, there are 3 instances of the JVM. The JVM exits when either of two conditions is met.

1. No non-daemon threads (user threads, including the main thread) are left running.

2. The Java application calls the exit method (System.exit()).

Besides the above-mentioned conditions, the JVM can also exit if it crashes.
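
The first condition can be sketched in a few lines. This is only an illustration: the class name DaemonDemo and the helper newWorker are hypothetical names made up for this example, while Thread.setDaemon() and Thread.isDaemon() are standard API.

```java
// Minimal sketch: the JVM waits for non-daemon threads but not for daemon threads.
class DaemonDemo {

    // Hypothetical helper: builds a thread that sleeps briefly and then returns.
    static Thread newWorker(String name, boolean daemon) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(daemon); // must be called before start()
        return t;
    }

    public static void main(String[] args) {
        Thread user = newWorker("user-thread", false);
        Thread background = newWorker("background-thread", true);
        user.start();
        background.start();
        System.out.println(user.getName() + " daemon? " + user.isDaemon());
        System.out.println(background.getName() + " daemon? " + background.isDaemon());
        // main returns here, but the JVM stays alive until user-thread finishes.
        // It would not wait for background-thread on its own.
    }
}
```

Calling System.exit() inside main would instead end the JVM immediately, even with the user thread still sleeping.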

In the next sections we dive into the JVM architecture and see its inner workings.

JVM ARCHITECTURE

JVM Architecture (image:geeksforgeeks)

The above diagram shows the conceptual architecture of the JVM. The JVM performs 3 steps to execute bytecode. The first is LOADING, the second is STORING and the last is EXECUTION. Loading is done by the class loader, storing is done in the memory area and execution is done by the execution engine. Now we will look deeper into each of the steps.

CLASS LOADER

The class loader’s job is to load classes into the JVM. To do so it performs 3 steps: loading, linking and initialization.

Loading

The first step in executing the bytecode is to load the class. The class loader does the task of loading the class into the JVM. There are 3 main class loaders in the JVM.

1. Bootstrap Class Loader

2. Extension Class Loader

3. Application Class Loader

For more information about class loaders you can visit this link https://www.baeldung.com/java-classloaders

There is also one other kind of class loader, the user-defined class loader. (Since Java 9 the Extension class loader is known as the Platform class loader.)

No matter which class loader is used, it must follow the 3 principles below.

1 Visibility Principle

The visibility principle states that classes loaded by a parent class loader are visible to its child class loaders, but not the other way around.

2 Uniqueness Principle

This principle states that a class should be loaded only once, which means that a class loaded by the parent class loader should not be loaded again by the child class loader.

3 Delegation Principle

The delegation principle states that a hierarchy must be followed when loading a class. When a class is to be loaded, the request is sent to the Application class loader. The Application class loader passes the request to the Extension class loader, and the Extension class loader passes it to the Bootstrap class loader. If the Bootstrap class loader can find the class on its path, it loads the class. If not, the request goes back to the Extension class loader, and if the class is not found there either, it goes back to the Application class loader, which loads the class if it is located on the system class path. If the Application class loader also fails to load the class, a ClassNotFoundException is thrown.
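
The hierarchy can be inspected from a running program. The following is a small sketch, not a definitive tool: LoaderChainDemo, loaderChain and isMissing are hypothetical names, and com.example.DoesNotExist is a made-up class name that is assumed not to be on the class path. The loader class names that get printed differ between Java versions.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {

    // Walks from the application class loader up through its parents.
    static List<String> loaderChain() {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // The bootstrap loader has no Java object; getParent() reports it as null.
        chain.add("bootstrap (represented as null)");
        return chain;
    }

    // Returns true when no loader in the chain can find the class.
    static boolean isMissing(String className) {
        try {
            Class.forName(className);
            return false;
        } catch (ClassNotFoundException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        loaderChain().forEach(System.out::println);
        // Core classes come from the bootstrap loader, which Java reports as null.
        System.out.println("String loader: " + String.class.getClassLoader());
        System.out.println("missing? " + isMissing("com.example.DoesNotExist"));
    }
}
```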

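(A class loaded by these built-in class loaders is not unloaded while the JVM runs. A class loaded by a user-defined class loader can be unloaded only when that class loader itself becomes eligible for garbage collection.)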

When a class is loaded, the JVM records the following information about it

1 Fully qualified class name

2 Instance and static fields in the class

3 Immediate parent class information

4 Whether it is a class, interface or an enum.

When a class is loaded the JVM also creates a Class object (an object of type java.lang.Class) and stores it in the heap. The Class object is created only once for a given class and is not created again even if the class is used again in the program.
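
A quick way to see this is to fetch the Class object for the same class in different ways and compare the results. This is a minimal sketch; ClassObjectDemo and sameClassObject are hypothetical names chosen for the example.

```java
class ClassObjectDemo {

    // All three routes should yield the very same java.lang.Class instance.
    static boolean sameClassObject() throws ClassNotFoundException {
        Class<?> fromLiteral = String.class;
        Class<?> fromInstance = "hello".getClass();
        Class<?> fromName = Class.forName("java.lang.String");
        return fromLiteral == fromInstance && fromInstance == fromName;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        System.out.println("same Class object? " + sameClassObject());
        // The Class object exposes the information recorded at load time.
        System.out.println("name: " + String.class.getName());
        System.out.println("superclass: " + String.class.getSuperclass().getName());
        System.out.println("is interface? " + String.class.isInterface());
    }
}
```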

Linking

In linking there are 3 steps and those are

1 Verification

2 Preparation

3 Resolution

These 3 steps should be followed in the exact order.

1 Verification

In JVM the byte code verifier will check the following to see whether a valid class file is loaded.

· Valid Compiler

· Correct Format

· Correct Structure

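To validate the compiler, the verifier does not look at which compiler was used. It checks that the class file looks like the output of a valid compiler, meaning the bytecode obeys the rules of the JVM specification. For the format and structure it checks that the compiled bytecode is well formed and has not been altered in a way that breaks those rules. If verification fails, a java.lang.VerifyError is thrown, which is an Error rather than a runtime exception. A file that is not a class file at all, for example one that does not start with the magic number 0xCAFEBABE, is rejected with a ClassFormatError.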
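
One of the simplest format facts can be checked by hand: every class file begins with the 4-byte magic number 0xCAFEBABE, followed by the minor and major version numbers. The sketch below reads that header from the class file of java.lang.Object. ClassFileHeaderDemo and readHeader are hypothetical names, and the example assumes the runtime exposes core class files as resources.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

class ClassFileHeaderDemo {

    // Reads the first 8 bytes of java/lang/Object.class: magic, minor, major.
    static int[] readHeader() throws IOException {
        try (InputStream in = ClassLoader.getSystemResourceAsStream("java/lang/Object.class")) {
            if (in == null) {
                // Assumption not met: this runtime does not expose the class file.
                throw new IOException("class file resource not available on this runtime");
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();
            int minor = data.readUnsignedShort();
            int major = data.readUnsignedShort();
            return new int[] { magic, minor, major };
        }
    }

    public static void main(String[] args) throws IOException {
        int[] header = readHeader();
        System.out.printf("magic=0x%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

The major version printed depends on the Java release that produced the class file, so it is not shown here.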

2 Preparation

In the preparation phase the JVM allocates memory for the static fields of the class and assigns them default values. (Instance fields get their default values later, when an object is created.) Different data types have different default values. For example, an int has a default value of 0 and a boolean has a default value of false.
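
The default values are easy to observe with static fields that have no initializer. This is a small sketch with hypothetical names (DefaultValuesDemo, describe).

```java
class DefaultValuesDemo {
    // No initializers: these fields keep the default values for their types.
    static int count;
    static boolean flag;
    static double ratio;
    static char letter;
    static String text;

    static String describe() {
        // The char is cast to int because its default, '\u0000', is not printable.
        return count + " " + flag + " " + ratio + " " + (int) letter + " " + text;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "0 false 0.0 0 null"
    }
}
```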

3 Resolution

In this step the JVM replaces symbolic references with direct references. While writing a Java program we refer to classes, fields and methods by real-life names such as Employee or Student. The machine does not understand such names. So in the resolution step the JVM looks up what each name refers to and replaces the symbolic reference with a direct reference to its location in memory.

Initialization

Initialization is the final step of the class loading process. In this step the static variables are assigned their actual values and the static blocks are executed, in the order they appear in the class. (Instance variables are assigned when an object is created.) Initialization must be completed before the first active use of the class.

The following six are the active uses of a class.

1. Use of new keyword.

2. Invoking a static method.

3. Assigning a value to a static field, or reading a static field that is not a compile-time constant.

4. Initial Class (Class with the main method)

5. Reflection API (for example the Class.forName() method).

6. Initializing a child class.
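
Two of these cases, invoking a static method and initializing a child class, can be watched with static blocks that write to a log. This is a sketch only; InitOrderDemo, Parent, Child, touch and run are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static { LOG.add("Parent initialized"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child initialized"); }
        static void touch() { LOG.add("Child.touch called"); }
    }

    // Appends to the log and returns a copy of everything recorded so far.
    static List<String> run() {
        LOG.add("calling Child.touch()");
        Child.touch(); // active use: triggers Parent first, then Child, only once
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Nothing is logged for Parent or Child until the first call to Child.touch(), and calling run() again does not execute the static blocks a second time.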

And in Java there are 4 ways to create an object of a class.

1. New Keyword: Goes through the normal initialization process.

2. Reflection API: the newInstance() method, which also goes through the normal initialization process.

3. Clone method: copies the field values from the source object without calling a constructor.

4. java.io.ObjectInputStream: rebuilds the object from the non-transient fields stored in the stream that is passed in.
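
The four routes can be put side by side in one small program. This is a sketch under assumptions: ObjectCreationDemo, Employee and the via... helpers are hypothetical names invented for the example, and the object is serialized to an in-memory byte array instead of a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ObjectCreationDemo {

    // Hypothetical example class used only for this illustration.
    static class Employee implements Cloneable, Serializable {
        private static final long serialVersionUID = 1L;
        String name = "unnamed";
        transient String session = "live"; // transient: skipped by serialization

        public Employee() { }

        @Override
        public Employee clone() throws CloneNotSupportedException {
            return (Employee) super.clone();
        }
    }

    static Employee viaNew() {
        return new Employee();
    }

    static Employee viaReflection() throws ReflectiveOperationException {
        return Employee.class.getDeclaredConstructor().newInstance();
    }

    static Employee viaClone(Employee source) throws CloneNotSupportedException {
        return source.clone();
    }

    static Employee viaDeserialization(Employee source) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(source);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Employee) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Employee original = viaNew();
        original.name = "Ada";
        System.out.println("reflection:      " + viaReflection().name);
        System.out.println("clone:           " + viaClone(original).name);
        Employee restored = viaDeserialization(original);
        System.out.println("deserialization: " + restored.name + ", session=" + restored.session);
    }
}
```

The clone and the deserialized copy carry the name of the source object, while the transient field comes back as null after deserialization because no constructor or field initializer of Employee runs on that path.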

MEMORY AREA

In this section we will see how the information about the classes is stored in the JVM memory area. As seen from the image below, the memory area has 5 components.

Memory Area

1. Method Area

2. Heap Area

3. Stack

4. PC Register

5. Native Method Area

1 METHOD AREA

The method area stores all the class-level information: field and method details, static variables, the code of the methods and constructors, and the runtime constant pool. The JVM creates only one method area, and it is shared by all threads.

(The class loader reference is also stored in the method area. Java versions up to 7 used the PermGen space for this class data, while Java 8 and later use Metaspace.)

2 HEAP AREA

In the heap area all objects are stored. Variables of reference types hold references that point to objects in the heap. The data of every object, including strings and arrays, lives in the heap area. Like the method area, the heap is created once per JVM and is shared by all threads.
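
The difference between an object and a reference to it shows up when two variables refer to the same object. A minimal sketch, with hypothetical names HeapReferenceDemo and share:

```java
class HeapReferenceDemo {

    static int[] share() {
        int[] first = new int[] { 1, 2, 3 }; // the array object lives on the heap
        int[] second = first;                // copies the reference, not the array
        second[0] = 99;                      // the change is visible through both
        return first;
    }

    public static void main(String[] args) {
        System.out.println(share()[0]); // prints 99
    }
}
```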

3 STACK

In the stack area local variable data is stored in frames, one frame per method call.

As per the above diagram, each frame contains data about one method call, including its local variables. Whenever a method finishes, its frame is popped off the stack. As seen, this works in Last In First Out (LIFO) order. A new stack is created for each thread.
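
The LIFO behaviour can be made visible with a recursive method that logs when it is entered and when it returns. This is a sketch for illustration; StackFrameDemo, descend and trace are hypothetical names, and the log lines describe frames conceptually instead of reading the real JVM stack.

```java
import java.util.ArrayList;
import java.util.List;

class StackFrameDemo {

    // Each call gets its own frame holding its own copy of the local variable n.
    static void descend(int n, List<String> log) {
        log.add("push frame n=" + n);
        if (n > 1) {
            descend(n - 1, log);
        }
        log.add("pop frame n=" + n);
    }

    static List<String> trace(int depth) {
        List<String> log = new ArrayList<>();
        descend(depth, log);
        return log;
    }

    public static void main(String[] args) {
        trace(3).forEach(System.out::println);
    }
}
```

With a depth of 3 the frames are pushed in the order 3, 2, 1 and popped in the order 1, 2, 3.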

4 PC REGISTER

The PC Register keeps track of the order of execution. It holds the address of the JVM instruction that the thread is currently executing. A new PC Register is created per thread.

Stack Operations

As seen from the above image, a separate stack is created for each thread and the current point of execution is stored in that thread’s PC Register. If a thread is executing a native method, the value of its PC Register is undefined at that point.

5 NATIVE METHOD AREA

The native method area, which the JVM specification calls the native method stack, holds the data for native method calls. Native methods are methods that are not written in Java. They are usually written in C/C++, compiled for the OS that is used.

EXECUTION ENGINE

Now, after loading and storing the classes, we come to the final part of the JVM, which is the execution engine. The execution engine mainly has 3 components.

1. Interpreter

2. JIT Compiler

3. Garbage Collector

1 INTERPRETER

The interpreter reads the bytecode one instruction at a time, converts it into machine operations and executes them sequentially. The interpreter can be slow because it works instruction by instruction and also interprets the same method again every time that method is called. To overcome this problem the JIT compiler is used alongside the interpreter.

2 JIT COMPILER

JIT compiler means Just In Time compiler. The JIT compiler compiles bytecode into native machine code while the JVM is running. The benefit of JIT is that when a method is used many times it does not need to be interpreted instruction by instruction; instead the JVM directly calls the compiled code produced by the JIT compiler. Even though the JIT compiler speeds up the execution of a Java program, compiling takes time and resources, and many methods become candidates for compilation soon after startup, so it can affect startup time and early performance.
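
A running program can ask the JVM which JIT compiler it uses and how much time it has spent compiling. The sketch below relies on the standard java.lang.management API; JitInfoDemo, hotLoop and describeJit are hypothetical names, and the loop counts are arbitrary values picked to make a method hot. Whether and when hotLoop actually gets compiled is up to the JVM.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class JitInfoDemo {

    // A small method that is called many times so that it becomes "hot".
    static long hotLoop(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += i % 7;
        }
        return sum;
    }

    static String describeJit() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            return "no JIT compiler reported"; // the JVM has no compilation system
        }
        String info = "JIT compiler: " + jit.getName();
        if (jit.isCompilationTimeMonitoringSupported()) {
            info += ", total compilation time so far: " + jit.getTotalCompilationTime() + " ms";
        }
        return info;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 2_000; i++) {
            hotLoop(10_000);
        }
        System.out.println(describeJit());
    }
}
```

On HotSpot JVMs the same program can be started with java -Xint to run in interpreter-only mode, which gives a rough feel for the difference the JIT compiler makes.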

3 GARBAGE COLLECTOR

The garbage collector runs in background daemon threads. The GC checks the heap area for objects that are no longer in use and removes them to free up space. An object is eligible for garbage collection when it can no longer be reached through any live reference in the program. For example, if the only variable that referred to an object is set to null, that object becomes eligible. The Object class also declares a method called finalize(), which has an empty implementation that can be overridden by the programmer for cleanup purposes. Ex: closing database connections. Note that finalize() has been deprecated since Java 9, so try-with-resources or an explicit close() is the preferred way to release resources.
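
Eligibility can be observed with a WeakReference, which does not keep its target alive. This is a hedged sketch: GcEligibilityDemo and its two methods are hypothetical names, and System.gc() is only a request, so the second method may report either outcome.

```java
import java.lang.ref.WeakReference;

class GcEligibilityDemo {

    // While a strong reference exists, the object must survive any collection.
    static boolean reachableWhileReferenced() {
        Object payload = new Object();
        WeakReference<Object> weak = new WeakReference<>(payload);
        System.gc(); // only a request to the JVM
        return weak.get() == payload;
    }

    // After the last strong reference is dropped, the object may be collected.
    static String afterDroppingReference() {
        Object payload = new Object();
        WeakReference<Object> weak = new WeakReference<>(payload);
        payload = null; // no strong reference remains: eligible for collection
        System.gc();
        return weak.get() == null ? "collected" : "not collected yet";
    }

    public static void main(String[] args) {
        System.out.println("still referenced: " + reachableWhileReferenced());
        System.out.println("after dropping:   " + afterDroppingReference());
    }
}
```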

Other than the above-mentioned components JVM also has JNI and Native Method Libraries.

Java Native Interface and Native Method Libraries

The JVM uses the Java Native Interface to communicate with the native method libraries, which usually contain methods written in C/C++, depending on the environment.
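
Native methods are marked with the native keyword, and reflection can tell whether a method carries that modifier. The sketch below is an illustration with hypothetical names (NativeMethodDemo, isNative, plainJava). It assumes an OpenJDK-based runtime, where System.currentTimeMillis() is declared native; other implementations may differ.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodDemo {

    // Looks up a declared method and reports whether it has the native modifier.
    static boolean isNative(Class<?> type, String name, Class<?>... params)
            throws NoSuchMethodException {
        Method m = type.getDeclaredMethod(name, params);
        return Modifier.isNative(m.getModifiers());
    }

    // An ordinary method implemented in Java, for comparison.
    static int plainJava() {
        return 42;
    }

    public static void main(String[] args) throws NoSuchMethodException {
        // Assumption: declared native on OpenJDK-based runtimes.
        System.out.println("System.currentTimeMillis native? "
                + isNative(System.class, "currentTimeMillis"));
        System.out.println("plainJava native? "
                + isNative(NativeMethodDemo.class, "plainJava"));
    }
}
```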

So the key points to remember are these. A JVM instance does not sit ready inside the JRE; it is created by the JRE at runtime when a Java program is started. The JVM is defined by a specification, and the JRE provides an implementation of it. The JVM performs 3 steps to execute a Java program, LOADING, STORING and EXECUTION, and these steps are done by the CLASS LOADER, MEMORY AREA and EXECUTION ENGINE respectively.

This article is mainly influenced by a YouTube playlist that is made by Krishantha Dinesh.

The following reference was also used to write this blog.

https://www.geeksforgeeks.org/execution-engine-in-java/#:~:text=The%20execution%20engine%20is%20the,the%20virtual%20machine's%20execution%20engine.
