The Classpath

Of all the barriers to entry for beginning Java developers, the classpath is probably the most baffling. The problem is compounded by the fact that the documentation isn't the greatest. I've seen countless beginners still struggle with the concept even after reading the official documentation. The biggest problem, I think, is that the documentation separates what should be one topic into three. So here I'm going to tackle three subjects at once in hopes of making a coherent whole: the classpath, packages, and imports.

Imports

Just about any program beyond "Hello, World" is going to need to import something. Imports are handled, conveniently enough, using the import keyword. Items are imported using their fully qualified class names, or FQCNs. For example, to import the IOException thrown by virtually any method involving anything remotely related to IO, you would write: import java.io.IOException; This makes the class IOException available without having to fully qualify it every time it appears in your source code. You could of course just use the full name all the time. That's completely valid and completely annoying.

Now, sometimes you'll have no choice but to fully qualify a class name. For example (just to stick with classes in the JRE), there's a java.util.List and a java.awt.List. They do completely different things but are conveniently named the same. If you find yourself needing to use both classes in the same source file, you can import one but you'll have to fully qualify the other. My advice is to import the more frequently used class and fully qualify the other. The more libraries you use, the more likely you are to run into this problem.
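Here's a minimal sketch of that advice. The class name ListClash and its helper method are made up for illustration. It imports java.util.List, spells out java.awt.List in full, and only touches the class objects so that it runs without a display:

```java
import java.util.ArrayList;
import java.util.List; // the List we use most often gets the import

// Hypothetical demo class, not from any library.
class ListClash {
    // Returns the fully qualified names of the two classes that share the simple name "List".
    static List<String> bothListNames() {
        List<String> names = new ArrayList<>();
        names.add(List.class.getName());          // resolved through the import: java.util.List
        names.add(java.awt.List.class.getName()); // cannot also be imported, so it is fully qualified
        return names;
    }

    public static void main(String[] args) {
        System.out.println(bothListNames()); // prints [java.util.List, java.awt.List]
    }
}
```

Trying to import both java.util.List and java.awt.List by name in the same file is a compile error, which is why one of them has to be written out in full.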

Going back to that import for a moment, let's dissect that line: 

import java.io.IOException;

Apart from the import keyword, there are two important components to note here. As I hinted earlier, IOException is the class name. java.io is the package name. This is the part of the FQCN that makes it "fully qualified." You will need at least this much to import anything. Like many other languages, Java supports wildcard, or star, imports, such as import java.io.*; Personally, I consider wildcard imports bad practice because they can pull in too much and pollute the local namespace. They can also lead to confusing errors in the middle of your code because you didn't import what you thought you did. I find it best to explicitly import each class that you need so that you're always sure what you're getting. That's a stylistic preference and many will argue the opposite. Just know that those people are all wrong. Regardless, imports are a compile-time construct and have no bearing on the structure or runtime performance of the compiled bytecode.

Packages

So what are packages? Simply put, packages are namespaces in which classes are declared. They serve as logical or functional groupings to help you organize your code. They can also serve to isolate your code from other sections based on the access control modifiers you put on your classes and class members. There's not a whole lot to say about them but there are a few things to note.

Packages are flat namespaces. Take these two packages: java.awt and java.awt.event. As far as the language is concerned, these two have almost nothing to do with each other. The names are "nested" simply to provide a logical grouping showing that the classes in java.awt.event are logically related to those in java.awt. However, importing java.awt.* will not import anything from java.awt.event nor from any other subpackage of java.awt. This is one of many reasons I discourage the use of star imports as mentioned above. I've lost count of the beginners who make this mistake.
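To make that concrete, here's a small sketch (FlatPackages is a name I made up for the demo). The star import covers Point, which lives directly in java.awt, but ActionEvent needs its own import because it lives in java.awt.event:

```java
import java.awt.*;                 // star import: only classes directly in java.awt
import java.awt.event.ActionEvent; // java.awt.event has to be imported separately

// Hypothetical demo class, not from any library.
class FlatPackages {
    // Returns the fully qualified names of one class from each package.
    static String[] names() {
        return new String[] {
            Point.class.getName(),      // found through java.awt.*
            ActionEvent.class.getName() // would not compile without the second import
        };
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

Delete the second import and the compiler complains that it cannot find ActionEvent, even though the star import is still sitting right there.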

So how do you make a package?  Two things:  a directory and a keyword.  You can declare that your class lives in a particular package by putting something like the following at the top of your source file: 

package org.something.somethingelse;

The recommended package structure is the reverse order of your domain name. This works well for companies who actually have a domain name but if you're a student or a 14-year-old in your bedroom, that likelihood diminishes rather quickly. Another option I've seen is to include whatever hosting service you're using for your source repository (provided that it's open source). For example, if you put your code up on GitHub, I've seen people use com.github.projectname as a package structure. This works fine until you move your project somewhere else and then that package name gets slightly moldy. Barring that, you're free to use whatever package structure you want. You just want to make sure that it's globally unique so that you don't end up sharing a namespace with someone else.

Another thing to note before we move on from packages is that the package structure reflects where on the disk your code lives. We'll see why in the next section, but for now just know this: if your source file is Foo.java with a package statement of org.something.somethingelse, it should live in a path structure on disk that looks like this:

<some root>/org/something/somethingelse/Foo.java

Those directories and the package statement need to match up. When you compile your source, the .class file will be created in a directory structure matching your package structure (javac's -d option lets you choose the root under which that structure is created).
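If it helps to see the mapping spelled out, here's a rough sketch that computes where a source file is expected to live. SourceLayout and expectedSourceFile are names I invented for illustration; nothing in the JDK is called that:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not from any library.
class SourceLayout {
    // Builds <root>/<package parts as directories>/<ClassName>.java
    static Path expectedSourceFile(Path root, String packageName, String className) {
        Path dir = root;
        for (String part : packageName.split("\\.")) {
            dir = dir.resolve(part); // each dot in the package name becomes a directory level
        }
        return dir.resolve(className + ".java");
    }

    public static void main(String[] args) {
        // Assumes a source root named "src", as in the example layout later in this post.
        System.out.println(expectedSourceFile(Paths.get("src"), "org.something.somethingelse", "Foo"));
    }
}
```

On a UNIX-like system this prints src/org/something/somethingelse/Foo.java. Swap .java for .class and the same mapping describes where the compiled class ends up under its output root.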

You've created your class with packages and everything.  You've compiled it.  And now you'd like to use that class.  Before you can do that, though, you have to tell the JVM where to find it.  How do you do that?

The Classpath

Finally. What you came here to learn about in the first place. There's nothing magical about the classpath. Think of it as the JVM analog of your operating system's PATH. It's a value that tells the JVM where to look for your classes. There are a handful of ways to set it. The most basic is to use the CLASSPATH environment variable. Using a global CLASSPATH (one visible to every user and application on your system) is generally a bad idea, but if you set the CLASSPATH variable in a script, say, before launching your application, it's not a big deal.

The "better" way is to use the -cp option to both javac and java. Before continuing on, let me highlight something here. You need that classpath for both compiling and running your code. The values might differ between compiling and running but you'll still need to tell the JVM where your code is at runtime. I can't tell you how many times I've seen beginners compile with a classpath and try to run without it. Java classes aren't statically bound. They still need to be told where things are at runtime.
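One way to check what the JVM actually received at runtime is to read the standard java.class.path system property. The sketch below splits it on File.pathSeparator, which is : on UNIX-like systems and ; on Windows. The class name ShowClasspath and its split helper are made up for the demo:

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class, not from any library.
class ShowClasspath {
    // Splits a classpath string into its individual entries using the given separator.
    static List<String> split(String classpath, String separator) {
        return Arrays.asList(classpath.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        // java.class.path holds whatever was passed via -cp or the CLASSPATH variable.
        String classpath = System.getProperty("java.class.path");
        for (String entry : split(classpath, File.pathSeparator)) {
            System.out.println(entry);
        }
    }
}
```

If the entry you expected isn't in the output, the JVM never heard about it, no matter what you passed to javac.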

So now that you know how to use the classpath, how do we define one? Unlike the PATH environment variable, two things can go on the classpath: directories and files. The standard distribution format in Java is the jar file. A jar file is "just" a zip file with some extra metadata. Using a plain zip file also works. The classpath works with either. A jar file just offers you the ability to do some more advanced things like declaring a bundled classpath and certain authentication/validation options for the classes inside. We're not going to cover them here as they'll only serve to cloud the discussion. The only requirement, whether you use a jar or a zip file, is the layout of the contents, which we'll cover momentarily.

Directories can also be included on the classpath. The structure of the contents under a directory has the same requirements as a jar or zip. Each element on the classpath is called a root. Under each root, the class loader will append the fully qualified class name (with the dots turned into directory separators) to try to find the class you've asked to use. For this reason, you don't add the directory where Foo.class lives to your classpath. You add the directory above where your package name starts. I know that sounds mildly confusing so let's look at an example.

 

/Users
  /jlee
    /dev
      /myproject
        /build
          /org
            /something
              /somethingelse
                /Foo.class
                /Bob.class
              /Bar.class
        /src
          /org
            /something
              /somethingelse
                /Foo.java
                /Bob.java
              /Bar.java

In this example, we have three source files: Foo.java, Bob.java, and Bar.java. Foo and Bob both live in the package org.something.somethingelse while Bar lives in org.something. As you can see, when compiled the resulting .class files live in build under the root of the project. Under build, the directory structure matches that of our package structure. So then, if Bar is our main class and we wanted to run the project, it would look like this:

java -cp /usr/lib/java/some.jar:build org.something.Bar

Again, there are two pieces here. The last argument, org.something.Bar, is the FQCN of the class you want to run. The value following -cp is the classpath. Note that build is a relative path, so this assumes you're running the command from /Users/jlee/dev/myproject. In this UNIX-friendly example, I'm using : to separate the items. If you're on Windows, you'll need to use ; instead. When the class loader goes to find org.something.Bar, it looks at each element of the classpath it's been given. The first item is some.jar, which represents some external dependency. The class loader will look in that jar for a resource matching the name org/something/Bar.class. It won't find it there, of course. That's your class after all.

So it proceeds to the next element on the classpath: build. Here, again, it will look for org/something/Bar.class. And, of course, here it can find it. Now that it's found your class, it will load it and try to run the main() method you've surely written on that class. Since Bar uses the classes Foo and Bob, the class loader once again has to scan the classpath looking for those two classes. The class loader is a "first one wins" component. If your package structure is not unique, it's conceivable that it finds an org.something.somethingelse.Foo that is not your class somewhere inside some.jar. In practice this is pretty rare, largely because everyone follows some basic rules when naming their packages.
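The search just described can be modeled in a few lines. To be clear, this is a simplified sketch of the idea and not how the JVM is actually implemented: it only handles directory roots, ignores jars and parent class loaders, and FirstMatchLookup, resourceName, and findRoot are all names I made up:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical model of the lookup, not the real class loader.
class FirstMatchLookup {
    // Turns a fully qualified class name into the resource name searched for under each root.
    static String resourceName(String fqcn) {
        return fqcn.replace('.', '/') + ".class";
    }

    // Walks the roots in order and returns the first one containing the class file.
    static Optional<Path> findRoot(List<Path> roots, String fqcn) {
        String resource = resourceName(fqcn);
        for (Path root : roots) {
            if (Files.isRegularFile(root.resolve(resource))) {
                return Optional.of(root); // first one wins; later roots are never consulted
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        // Two throwaway roots standing in for the entries on a classpath.
        Path first = Files.createTempDirectory("first-root");
        Path second = Files.createTempDirectory("second-root");
        // Place an empty stand-in for Bar.class under the second root only.
        Path bar = second.resolve(resourceName("org.something.Bar"));
        Files.createDirectories(bar.getParent());
        Files.createFile(bar);

        Optional<Path> hit = findRoot(Arrays.asList(first, second), "org.something.Bar");
        System.out.println("found under second root: " + hit.map(second::equals).orElse(false));
    }
}
```

Drop a second Bar.class under the first root and the lookup never reaches the second one, which is exactly the shadowing problem that unique package names protect you from.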

And that's essentially it.  That's the classpath in a nutshell.  It's not a terribly complicated mechanism but can be confusing the first time you see it.  There are some nuances and some more advanced variations to classpaths that aren't covered here but by the time you get there, you should be more than capable of handling those details.

 

Editor's note: This post is meant to exist as a tutorial and as such may be tweaked from time to time to clarify a point or tighten up the language. Most blog posts are largely left alone once posted but I intend to update this one as needed. If you find an error, please leave a comment and I'll try to address it.


Autogenerating META-INF/services

If you've ever tried to use SPI, you're familiar with the hassle of maintaining the meta files necessary for the system to work.  Especially early on when a system is under heavy flux, keeping everything in sync can be a frustrating, error-fraught experience.  Many just plow through it.  Others turn to automated solutions and here is where the trouble begins.
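For anyone who hasn't run into it, here's a minimal sketch of what those meta files feed. java.util.ServiceLoader is the real JDK class; the Greeter interface and SpiDemo class are invented for the example. With no provider registered, the loop simply finds nothing:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical service interface, invented for this sketch.
interface Greeter {
    String greet(String name);
}

// Hypothetical demo class, not from any library.
class SpiDemo {
    // Collects every Greeter implementation registered on the classpath.
    // Registration means a file named META-INF/services/Greeter (the fully
    // qualified name of the interface) whose lines are the fully qualified
    // names of the implementing classes.
    static List<Greeter> loadGreeters() {
        List<Greeter> found = new ArrayList<>();
        for (Greeter greeter : ServiceLoader.load(Greeter.class)) {
            found.add(greeter);
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println("providers found: " + loadGreeters().size());
    }
}
```

Every time an implementation is added, renamed, or moved to another package, that text file has to be edited by hand to match, and that is the chore everyone keeps trying to automate.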

If you're smart you look for a library and if you're lucky you find one that fits your needs. For one reason or another, some choose to write their own library. I know I've written one. Kohsuke of Hudson fame has written one. If I were to use one today, I'd probably end up using this one. I don't see it on the home page, but I seem to remember Reinier Zwitserloot suggesting that Lombok had one in the works. There are probably a dozen more out there if one cared to dig deep enough (I don't). But it shouldn't be this way.

So we have a situation where using a feature built into the Java runtime is so awkward that we have all these competing, divergent solutions. It's time for the JDK team to provide a native apt plugin to handle this. Any feature that drives so many to build workarounds for it clearly is missing something. In my opinion, this should be done by the JDK team and included in Java 8 at the latest. If it takes filing a JSR to get it into Java 8, great. I'd even lead it if necessary. Build an amalgam of the solutions out there. Bundle an existing implementation. Build one from scratch. Anything. But please, give us something native so we can stop inventing the wheel over and over again.

To take it even a step further, I'd push for building a type database by default when building a jar. Fantom has a very nice way to tell the system, "Give me everything of type X." This kind of "database" would be trivial to build during either the compilation or jar bundling phase. And being able to ask the JVM, "Give me references to all the classes implementing the interface FooBar," would be an amazingly useful facility to have. It would eliminate the need for jar/classpath scanning at startup for everything from simple plugin systems to full-blown EE stacks.

So what do you think? Is it too late for something like this in Java 8? Or Java 7 update mumblemumble? It'd be an amazing addition.