Object-Oriented Programming with Java Lecture Notes
29 April 2008 • The Java Virtual Machine
Outline
Why Compile?
Why are there compilers?
Language Execution
Mostly, to be useful a program has to execute in some environment.
This environment provides
the execution engine,
the language abstractions,
services
The higher-level the language, the greater the demands on the environment.
The Execution Environment
A program's execution environment comprises (among other things)
The execution hardware (CPU, storage, interconnects).
The support software (run-time libraries, operating system).
To execute a program, the environment must understand it.
Execution Hardware
Most execution hardware is simple.
It manipulates a few basic data types (bytes, words, floating-point numbers).
The manipulations are few and basic (arithmetic, comparisons).
This simplicity is reflected in the hardware's instruction-set architecture (ISA), the set of instructions the hardware can execute.
Executing Java
At some points Java wants to do what the execution environment can do.
Adding two integers, for example.
But at other points (most other points) there's a great mismatch between what Java wants to do and what the execution environment can do.
new Card("as"), for example.
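The contrast can be made concrete with a small sketch. The Card class here is a hypothetical stand-in (the course's real Card class isn't shown in these notes), and the comments describe only informally the kind of work each statement asks of the environment.

```java
// Hypothetical stand-in for the Card class mentioned above; the real class
// from the course is not shown in these notes.
class Card {
    private final String code;

    Card(String code) {
        this.code = code;
    }

    String code() {
        return code;
    }
}

class SemanticGapDemo {
    // Close to the hardware: integer addition typically maps onto a single
    // machine instruction.
    static int add(int a, int b) {
        return a + b;
    }

    // Far from the hardware: creating an object involves locating the class,
    // allocating storage, and running a constructor. A typical ISA offers
    // none of this as a single instruction.
    static Card makeCard() {
        return new Card("as");
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
        System.out.println(makeCard().code());
    }
}
```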
The Semantic Gap
The difference between what a Java program wants to do and what an execution environment can do is the semantic gap.
The execution environment can't understand what the Java program wants.
The Java program is expressed in a language different from what the execution environment understands.
The semantic gap blocks Java program execution.
Minding the Gap
The semantic gap must be filled before Java programs can execute.
There are two approaches to filling the gap:
1. Lower the Java program to something the execution environment can understand.
2. Raise the execution environment so it understands Java.
3. And, practically, a combination of 1 and 2.
Lowering Java
A Java program can be lowered to execution-environment level by translation.
The Java language is expressed in the execution environment's ISA.
This is called native-code translation.
Other Java features are expressed in OS facilities.
This is why there are compilers: they do translation.
Native-Code Example
The GNU Compiler for Java (GCJ) is based on native-code translation.
The gcj compiler translates Java programs to native code.
GCJ also contains a native-code implementation of the class library.
GCJ leverages existing compiler technology, in particular existing GNU compiler technology.
Raising Execution
The other bridge over the semantic gap is to have execution environments that understand Java.
Java is complicated enough that this approach is impractical.
The Java execution model far exceeds (and is different from) the current state of the practice.
Other languages (Lisp, Smalltalk) have had direct-execution engines.
Java Machines
Java bytecode executes in an environment with certain features.
The operand stack, for example; the object heap for another.
The set of features needed to execute Java bytecode, including the execution engine itself, is known as the Java Machine.
Java Bytecode
Java bytecode is a good representation for Java programs, but it suffers from an obvious problem.
No current CPU architectures are able to execute bytecode instructions.
In fact, the entire Java machine is completely contrary to modern CPU architectures.
Bytecode Interpreters
The Java bytecode architecture is custom fit to represent and execute Java programs.
But it's far away from what available CPU architectures recognize.
Bytecode Interpretation
The Java system bridges the semantic gap by implementing a CPU architecture that executes bytecode.
The interpreter is called the Java Virtual Machine (JVM).
There are other ways to raise the CPU, and other ways to bridge the semantic gap.
Sun chose to use interpretation.
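To make interpretation concrete, here is a minimal sketch of a stack-machine interpreter. The four opcodes are invented for illustration and are not real JVM bytecodes; a real JVM is far more elaborate. What the sketch shares with one is the shape: a fetch-decode-execute loop, written in software, working against an operand stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A toy stack-machine interpreter. The opcodes are invented for this sketch;
// they are NOT real JVM bytecodes.
class ToyInterpreter {
    static final int PUSH = 0;  // push the operand that follows
    static final int ADD = 1;   // pop two values, push their sum
    static final int MUL = 2;   // pop two values, push their product
    static final int HALT = 3;  // stop and return the top of the stack

    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();  // the operand stack
        int pc = 0;                                  // program counter
        while (true) {
            int op = code[pc++];                     // fetch
            switch (op) {                            // decode and execute
                case PUSH:
                    stack.push(code[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program));  // prints 20
    }
}
```

The host CPU never sees PUSH or ADD; it only executes the native instructions of the loop itself, which is what lets the same program array run wherever the interpreter runs.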
Recursion!
The JVM runs programs, but the JVM itself needs to be run.
The host system is responsible for running the JVM, among other things.
JVM Responsibilities
The JVM has three main responsibilities:
1. Managing class files, including loading and verification.
2. Executing programs (interpreting class files).
3. Interacting with the host system while performing 1 and 2.
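Class loading is usually invisible, but a program can also request it explicitly. The sketch below uses the standard reflection calls Class.forName and getDeclaredConstructor().newInstance(); the class and method names LoadingDemo and loadAndCreate are made up for this example.

```java
class LoadingDemo {
    // Ask the JVM to locate, load, and link a class by name at run time,
    // then create an instance through its no-argument constructor.
    static Object loadAndCreate(String className) throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Object list = loadAndCreate("java.util.ArrayList");
        System.out.println(list.getClass().getName());
    }
}
```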
Host-System Portability
If the host system bleeds through the JRE, programs become less portable.
A Java program using Solaris-specific facilities, for example.
And bleed-through is necessary, because the JRE can't do it all.
Otherwise it would become another operating system.
Networking Example
A large class of anticipated Java programs needs IP-based network communication.
An IP stack in the JVM is a possibility.
Uniform network access, but the JVM is now more complicated.
High-performance TCP implementations are non-trivial.
At some point the JVM still has to bang on the host system.
Host-System Abstractions
To ensure portability, the Java system has to define abstractions for the host-system facilities a program might use.
For example: the file system or network connectivity.
Success depends strongly on the depth and breadth of the abstractions provided.
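As one illustration of such an abstraction, the sketch below does file I/O through the standard java.nio.file API. The class and method names are made up for this example. The code never mentions a host-specific path separator, temporary directory, or system call.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileAbstractionDemo {
    // Write text to a temporary file and read it back. Where the file lives
    // and how it is accessed are left to the JVM and the host system.
    static String roundTrip(String text) throws IOException {
        Path file = Files.createTempFile("notes", ".txt");
        try {
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(file);  // clean up on every path
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello, host system"));
    }
}
```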
JVM Portability
The JVM is a big, complex program that needs to be fast and efficient (high performance).
Writing portable, high-performance programs is hard.
Portability generalizes code; performance specializes code.
The tension between portability and performance is fierce in Java.
The Java Runtime Environment
The Java Runtime Environment (JRE) has three main components:
The class library.
The JVM.
The host system.
What is Portability?
Java programs are portable because the JVM is (approximately).
C is portable because it's got compilers everywhere.
Java's portable because it's got JVMs everywhere.
But not all C compilers (and JVMs) are created equal.
JVM Alternatives
As long as a system supports JVM semantics, it can run Java programs.
Two immediate alternatives are
Compile down to native code, with the JVM implemented by (native) run-time libraries.
A hardware implementation of the JVM.
The JVM Spec
The JVM spec balances wide-spread deployment against portability.
“Wide-spread” includes hardware of differing capabilities.
In general, portability loses.
A cell-phone JVM may run on a workstation, but not vice versa.
Similarly for Java programs.
Garbage Collection
Garbage collection algorithms vary between one JVM and another.
It's impossible to know exactly when memory will be reclaimed.
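A weak reference makes the uncertainty observable. This is a sketch only: the class and method names are made up, and the second line main prints can differ between JVMs and even between runs, which is the point.

```java
import java.lang.ref.WeakReference;

class GcDemo {
    // While the program still holds a strong reference, the weak reference
    // must still lead to the same object.
    static boolean aliveWhileStronglyHeld() {
        Object payload = new Object();
        WeakReference<Object> ref = new WeakReference<>(payload);
        return ref.get() == payload;
    }

    public static void main(String[] args) {
        System.out.println(aliveWhileStronglyHeld());

        WeakReference<Object> ref = new WeakReference<>(new Object());
        System.gc();  // a request to the JVM, not a command

        // Whether the object has been reclaimed by now depends on the JVM.
        System.out.println(ref.get() == null ? "reclaimed" : "not yet reclaimed");
    }
}
```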
Thread Scheduling
Thread scheduling algorithms differ among JVMs.
Thread execution sequences can't be accurately predicted.
This has little bearing on Java programs.
Thread scheduling is usually underspecified anyway.
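A small sketch shows both halves of this. The order in which two threads' entries land in a shared log is up to the scheduler. A program that synchronizes properly and waits for its threads still gets a predictable overall result, here the total number of entries. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SchedulingDemo {
    // Two threads each record perThread entries in a shared list. The order
    // of the entries depends on the scheduler; the total count does not.
    static List<String> interleave(int perThread) throws InterruptedException {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        Runnable work = () -> {
            String name = Thread.currentThread().getName();
            for (int i = 0; i < perThread; i++) {
                log.add(name + ":" + i);
            }
        };
        Thread a = new Thread(work, "A");
        Thread b = new Thread(work, "B");
        a.start();
        b.start();
        a.join();  // wait for both threads before reading the log
        b.join();
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(interleave(5));
    }
}
```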
Host-System Resources
The JVM is ecumenical in its host-system requirements.
This reflects the tension between decoupling from the host for portability and embracing it for performance.
Portability
Software is portable when it's easier to move it to a new system than to rewrite it for the new system.
There are two principal Java portability concerns:
A Java program portable among JVMs.
A JVM portable among host systems.
Java Program Portability
In theory: write once, run anywhere.
In practice, it's not that easy.
Host system differences the JVM couldn't hide.
Including buggy host features.
Not all JVMs are created equal.
Thread scheduling.
JVM Portability
To some extent, the JVM is portable among similar systems.
The JVM is also, in many aspects, loosely specified to permit efficient and wide-spread implementations.
The Host System
The third major part of the run-time system is the host system.
The execution engine (threads and CPUs).
File systems, peripherals, and networks.
Other execution resources (libraries and other code).
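A program can ask the JVM what host it landed on. The sketch below uses only standard calls (Runtime.availableProcessors and the os.name, file.separator, and java.io.tmpdir system properties); the class name and the processors helper are made up for this example. The output differs from host to host, so none is shown.

```java
class HostInfoDemo {
    // Number of processors the host makes available to this JVM.
    static int processors() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("OS:         " + System.getProperty("os.name"));
        System.out.println("Processors: " + processors());
        System.out.println("Separator:  " + System.getProperty("file.separator"));
        System.out.println("Temp dir:   " + System.getProperty("java.io.tmpdir"));
    }
}
```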
History
50s: Fortran I/O specs.
60s: Basic.
70s: Pascal P-Code.
80s: PostScript.
90s: JVM.
Summary
Interpreters are a software realization of direct-execution hardware.
Virtual machines are an old and useful technology.
This page last modified on 7 April 2008.