The Classpath

Of all the barriers to entry for beginning Java developers, the classpath is probably the most baffling.  The problem is compounded by the fact that the documentation isn't the greatest.  Even after reading the official documentation, I've seen countless beginners still struggle with the concept.  The biggest problem, I think, is that the documentation separates what should be one topic into three.  So here I'm going to tackle three subjects at once in hopes of making a coherent whole:  the classpath, packages, and imports.

Imports

Just about any program beyond "Hello, World" is going to need to import something.  Imports are handled, conveniently enough, using the import keyword.  Items are imported using their fully qualified class names, or FQCNs.  For example, to import the IOException thrown by virtually any method involving anything remotely related to IO, you would say:  import java.io.IOException.  This makes the class IOException available without having to fully qualify it every time it appears in your source code.  You could of course just use the full name all the time.  That's completely valid and completely annoying.

Now, sometimes you'll have no choice but to fully qualify a class name.  For example (just to stick with classes in the JRE), there's a java.util.List and a java.awt.List.  They do completely different things but are conveniently named the same.  If you find yourself needing both classes in the same source file, you can import one but you'll have to fully qualify the other.  My advice is to import the more frequently used class and fully qualify the other.  The more libraries you use, the more likely you are to run into this problem.
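Here's a minimal sketch of that situation.  The class name ImportDemo and its methods are made up for illustration.  java.util.List gets the import, and java.awt.List is only ever referred to by its full name (and only as a class literal, so no GUI component is created).

```java
import java.util.ArrayList;
import java.util.List; // the List used most often gets the import

class ImportDemo {
    // java.util.List is imported, so the short name refers to it.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        names.add("Foo");
        names.add("Bar");
        return names;
    }

    // java.awt.List shares the simple name, so it has to be fully qualified.
    static String otherListName() {
        return java.awt.List.class.getName();
    }

    public static void main(String[] args) {
        System.out.println(names());         // prints [Foo, Bar]
        System.out.println(otherListName()); // prints java.awt.List
    }
}
```

Trying to import both would be a compile error, since the compiler couldn't tell which List you meant.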

Going back to that import for a moment, let's dissect that line: 

import java.io.IOException;

Apart from the import keyword, there are two important components to note here.  As I hinted earlier, IOException is the class name.  java.io is the package name.  This is the part of the FQCN that makes it "fully qualified."  You will need at least this much to import anything.  Like many other languages, Java supports wildcard, or star, imports, such as import java.io.* to make every class in java.io available by its short name.  Personally, I consider wildcard imports bad practice because they can pull in too much and pollute the local namespace.  They can also lead to confusing errors in the middle of your code because you didn't import what you thought you did.  I find it best to explicitly import each class that you need so that you're always sure what you're getting.  That's a stylistic preference and many will argue the opposite.  Just know that those people are all wrong.  Regardless, imports are a compile time construct and have no bearing on the structure or runtime performance of the compiled bytecode.

Packages

So what are packages?  Simply put, packages are namespaces in which classes are declared.  They serve as logical or functional groupings to help you organize your code.  They can also serve to isolate your code from other sections based on the access control modifiers you put on your classes and class members.  There's not a whole lot to say about them but there are a few things to note.

Packages are flat namespaces.  Take these two packages:  java.awt and java.awt.event.  As far as the language is concerned, these two have almost nothing to do with each other.  The names are "nested" simply to provide a logical grouping showing that the classes in java.awt.event are related to those in java.awt.  However, importing "java.awt.*" will not import anything from java.awt.event nor any other subpackages of java.awt.  This is one of many reasons I discourage the use of star imports as mentioned above.  I've lost count of the beginners who make this mistake.
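To see the flat namespace in action without dragging AWT into it, here's a small sketch using java.util and java.util.concurrent instead, which behave the same way.  FlatPackages and packageOf are names I made up for the example.

```java
import java.util.*; // covers java.util only, not java.util.concurrent

class FlatPackages {
    // Returns the name of the package a class was declared in.
    static String packageOf(Class<?> type) {
        return type.getPackage().getName();
    }

    public static void main(String[] args) {
        // ArrayList, List, and Map are visible through the star import.
        List<String> list = new ArrayList<>();
        // ConcurrentHashMap lives in a different package, so the star import
        // does not reach it; it needs its own import or its full name.
        Map<String, Integer> map = new java.util.concurrent.ConcurrentHashMap<>();

        System.out.println(packageOf(list.getClass())); // prints java.util
        System.out.println(packageOf(map.getClass()));  // prints java.util.concurrent
    }
}
```

If you drop the java.util.concurrent prefix and rely on the star import alone, the compiler will complain that it can't find ConcurrentHashMap.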

So how do you make a package?  Two things:  a directory and a keyword.  You can declare that your class lives in a particular package by putting something like the following at the top of your source file: 

package org.something.somethingelse;

The recommended package structure is the reverse order of your domain name.  This works well for companies who actually have a domain name but if you're a student or a 14-year-old in your bedroom, that likelihood diminishes rather quickly.  Another option I've seen is to use whatever hosting service holds your source repository (provided that it's open source).  For example, if you put your code up on GitHub, I've seen people use com.github.projectname as a package structure.  This works fine until you move your project somewhere else and then that package name gets slightly moldy.  Barring that, you're free to use whatever package structure you want.  You just want to make sure that it's globally unique so that you don't end up sharing a namespace with someone else.

Another thing to note before we move on from packages is that the package structure reflects where on the disk your code lives.  We'll see why in the next section, but for now just know this:  if your source file is Foo.java with a package statement of org.something.somethingelse, it should live in a path structure on disk that looks like this:

<some root>/org/something/somethingelse/Foo.java

Those directories and the package statement need to match up.  When you compile your source, the .class file will be created in a directory structure matching your package structure.
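The mapping from name to path is mechanical enough to write down.  This is just a sketch of the rule for top-level classes, with a hypothetical helper class called PackagePaths; it isn't anything the JDK provides.

```java
class PackagePaths {
    // Relative path of the source file for a fully qualified class name.
    static String sourcePath(String fqcn) {
        return fqcn.replace('.', '/') + ".java";
    }

    // Relative path of the compiled class file for the same name.
    static String classFilePath(String fqcn) {
        return fqcn.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        String fqcn = "org.something.somethingelse.Foo";
        System.out.println(sourcePath(fqcn));    // prints org/something/somethingelse/Foo.java
        System.out.println(classFilePath(fqcn)); // prints org/something/somethingelse/Foo.class
    }
}
```

Both paths are relative to some root directory, which is exactly the piece the classpath supplies.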

You've created your class with packages and everything.  You've compiled it.  And now you'd like to use that class.  Before you can do that, though, you have to tell the JVM where to find it.  How do you do that?

The Classpath

Finally.  What you came here to learn about in the first place.  There's nothing magical about the classpath.  Think of it as the JVM analog of your operating system's PATH.  It's a value that tells the JVM where to look for your classes.  There are a handful of ways to set it.  The most basic version is to use the CLASSPATH environment variable.  Using a global CLASSPATH (one visible to every user and application on your system) is generally a bad idea but if you set the CLASSPATH variable in a script, say, before launching your application it's not a big deal.

The "better" way is to use the -cp option to both javac and java.  Before continuing on, let me highlight something here.  You need that classpath for both compiling and running your code.  The values might differ between compiling and running but you'll still need to tell the JVM where your code is at runtime.  I can't tell you how many times I've seen beginners compile with a classpath and try to run without it.  Java classes aren't statically bound.  They still need to be told where things are at runtime.

So now that you know how to use the classpath, how do we define one?  Unlike the PATH environment variable, two things can go on the classpath:  directories and archive files.  The standard distribution format in Java is the jar file.  A jar file is "just" a zip file with some extra metadata.  Using a plain zip file also works.  The classpath works with either.  A jar file just offers you the ability to do some more advanced things like declaring a bundled classpath and certain authentication/validation options for the classes inside.  We're not going to cover them here as they'll only serve to cloud the discussion.  The only requirement, whether you use a jar or a zip file, is the layout of the contents, which we'll cover momentarily.
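If you're ever unsure what classpath the JVM actually ended up with, you can ask it.  The sketch below reads the standard java.class.path system property and reports whether each entry is a directory or a file.  The class ClasspathEntries and its methods are my own names, not part of any library.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class ClasspathEntries {
    // Splits a classpath string on the platform separator (":" or ";").
    static List<String> split(String classpath) {
        List<String> entries = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    // Says what kind of thing a single classpath entry points at.
    static String describe(String entry) {
        File file = new File(entry);
        if (file.isDirectory()) {
            return "directory";
        }
        if (file.isFile()) {
            return "file (jar or zip, hopefully)";
        }
        return "missing";
    }

    public static void main(String[] args) {
        // The classpath this JVM was started with.
        String classpath = System.getProperty("java.class.path");
        for (String entry : split(classpath)) {
            System.out.println(describe(entry) + ": " + entry);
        }
    }
}
```

Run it with different -cp values and you'll see exactly what the JVM was handed, which is a handy first step when a class mysteriously can't be found.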

Directories can also be included on the classpath.  The structure of the contents under a directory has the same requirements as a jar or zip.  Each element on the classpath is called a root.  Under each root, the class loader will append the fully qualified class name (with the dots turned into directory separators) to try to find the class you've asked to use.  For this reason, you don't add the directory where Foo.class lives to your classpath.  You add the directory above where your package name starts.  I know that sounds mildly confusing so let's look at an example.

 

/Users
  /jlee
    /dev
      /myproject
        /build
          /org
            /something
              /somethingelse
                /Foo.class
                /Bob.class
              /Bar.class
        /src
          /org
            /something
              /somethingelse
                /Foo.java
                /Bob.java
              /Bar.java

In this example, we have three source files:  Foo.java, Bob.java, and Bar.java.  Foo and Bob both live in the package org.something.somethingelse while Bar lives in org.something.  As you can see, when compiled, the resulting .class files live in build under the root of the project.  Under build, the directory structure matches that of our package structure.  So then, if Bar is our main class and we wanted to run the project, it would look like this:

java -cp /usr/lib/java/some.jar:build org.something.Bar

Again, there are two pieces here.  The value after -cp is the classpath, and the last argument is the FQCN of the class you want to run.  The build entry is a relative path, so this assumes you're running the command from /Users/jlee/dev/myproject.  In this UNIX-friendly example, I'm using : to separate the items.  If you're on Windows, you'll need to use ; instead.  When the class loader goes to find org.something.Bar, it looks at each element of the classpath it's been given.  The first item is some.jar, which represents some external dependency.  The class loader will look in that jar for a resource matching the name org/something/Bar.class.  It won't find it there, of course.  That's your class after all.

So it proceeds to the next element on the classpath:  build.  Here, again, it will look for org/something/Bar.class.  And, of course, here it can find it.  Now that it's found your class, it will load it and try to run the main() method you've surely written on that class.  Since Bar uses the classes Foo and Bob, the class loader once again has to scan the classpath looking for those two classes.  The class loader is a "first one wins" component.  If your package structure is not unique, it's conceivable that it finds an org.something.somethingelse.Foo that is not your class somewhere inside some.jar.  In practice this is pretty rare, largely due to the fact that everyone follows some basic rules when naming their packages.
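To make the "first one wins" behavior concrete, here's a toy version of the lookup.  It is emphatically not how the JVM's class loader is implemented, and it only handles directory roots (no jars).  FirstOneWins, find, rootWith, and demo are all names invented for this sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class FirstOneWins {
    // Walks the roots in order and returns the first matching .class file.
    static Optional<Path> find(List<Path> roots, String fqcn) {
        String relative = fqcn.replace('.', '/') + ".class";
        for (Path root : roots) {
            Path candidate = root.resolve(relative);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate); // later roots are never consulted
            }
        }
        return Optional.empty();
    }

    // Creates a throwaway root directory containing one empty file.
    static Path rootWith(String prefix, String relative) throws IOException {
        Path root = Files.createTempDirectory(prefix);
        Path file = root.resolve(relative);
        Files.createDirectories(file.getParent());
        Files.createFile(file);
        return root;
    }

    // Puts the same class name under two roots and checks which one is found.
    static boolean demo() {
        try {
            String relative = "org/something/somethingelse/Foo.class";
            Path dependency = rootWith("dependency", relative);
            Path build = rootWith("build", relative);
            Optional<Path> found =
                find(Arrays.asList(dependency, build), "org.something.somethingelse.Foo");
            return found.isPresent() && found.get().startsWith(dependency);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints true: the earlier root shadows the later one
    }
}
```

Swap the order of the two roots and the other copy wins, which is why the order of your classpath entries can matter.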

And that's essentially it.  That's the classpath in a nutshell.  It's not a terribly complicated mechanism but can be confusing the first time you see it.  There are some nuances and some more advanced variations to classpaths that aren't covered here but by the time you get there, you should be more than capable of handling those details.

 

Editor's note: This  post is meant to exist as a tutorial and as such may be tweaked from time to time to clarify a point or tighten up the language.  Most blog posts are largely left alone once posted but I intend to update this one as needed.  If you find an error, please leave a comment and I'll try to address it.

Static members in an injected world

Dependency injection lets you write objects in such a way that you really don't have to care about how to construct your dependencies.  The mechanism can vary from framework to framework but it's usually as simple as putting @Inject or @Autowired on your field or perhaps your constructor parameters and letting the framework find those implementations for you.  I don't want to go into all the how-tos of that here but you typically just have to write the construction code for your dependencies once in, say, a Guice module and then just inject away.  So now that we can easily get dependencies injected, the question arises, "What can/should we inject?"  My answer is, "everything."

The beauty of injection is that you can swap out those dependency implementations based on your runtime environment.  As long as there's a common API to code to, the code accepting those injections need never know the difference.  When converting a large codebase from a traditional "new everything" approach to Guice, I started noticing we were using static methods on utility classes quite a lot.  Given how cheap injection is, I started converting those usages to injections and making those methods non-static.  For me, this made things much "cleaner."  It was also, apparently, quite the controversial decision.

There is a school of thought that goes something like this:  "If there's no state on an object, don't inject it.  Make it static and just call those methods statically."  This is "wrong" for a number of reasons.  The first, to my mind, is that it conflates dependency injection with state injection.  The idea, it seems, is that if there's no state on the object, you should save yourself the expense of creating an object just to call its methods.  For me, though, the cost of instantiation is pretty low and these dependencies can be marked as singletons and only created once.  Once created, these objects can then be injected a million times with no extra overhead.

But bigger than that is the second reason this school of thought is wrong:  I can change the actual object that gets injected whenever I like.  For example, say you have a set of utility methods that are environment aware:  developer's laptop, Jenkins, staging, production, Customer A, Customer B, etc.  They send emails or initialize the database by deleting everything.  Whatever.  By simply changing the injection configuration, I can change the implementation to suit the runtime environment without ever changing anything but that configuration.  With static methods, you're stuck with whichever implementation was bound at compile time.  Maybe that's fine for now but it locks your application into those choices in ways that aren't necessarily easy to change.
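Here's a stripped-down sketch of what I mean, using plain constructor injection so it runs without any framework.  Every name in it (Mailer, SmtpMailer, RecordingMailer, SignupService) is hypothetical.  With Guice or Spring, the choice made in main would live in the injection configuration instead.

```java
import java.util.ArrayList;
import java.util.List;

// The common API the rest of the code is written against.
interface Mailer {
    void send(String to, String body);
}

// Stand-in for a production implementation (here it only prints).
class SmtpMailer implements Mailer {
    public void send(String to, String body) {
        System.out.println("sending to " + to + ": " + body);
    }
}

// For a developer laptop or a test run: records recipients instead of sending.
class RecordingMailer implements Mailer {
    final List<String> sent = new ArrayList<>();

    public void send(String to, String body) {
        sent.add(to);
    }
}

// The consumer never names a concrete Mailer; it uses whatever it is handed.
class SignupService {
    private final Mailer mailer;

    SignupService(Mailer mailer) {
        this.mailer = mailer;
    }

    void register(String email) {
        mailer.send(email, "Welcome!");
    }
}

class InjectionDemo {
    public static void main(String[] args) {
        // Swapping environments means changing only this one line of wiring.
        RecordingMailer mailer = new RecordingMailer();
        new SignupService(mailer).register("a@example.com");
        System.out.println(mailer.sent); // prints [a@example.com]
    }
}
```

Had SignupService called a static MailUtils.send(...) instead, there would be no seam to swap the implementation short of editing and recompiling that call.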

The decision to inject or not is largely stylistic, but I believe there are enough technical reasons to favor injection over statics, and certainly there are more benefits to injection than to statics.  Static members (methods in particular) are enough of a code smell in normal circumstances that when we find cheap, easy ways to reduce them, I think we should take advantage of the opportunity.

 

The waiting is the hardest part

I've been using a utility for some time now that has been absolutely awesome to work with but I don't think it's gotten enough attention.  For those who haven't yet heard of it, let me introduce you to one of my favorite utilities:  Awaitility.


This little utility does one thing really, really well.  Using its simple API, you can defer execution of a method to a thread and wait for it to finish or time out.  Let's look at some examples to see how this works.  The most obvious example is probably waiting for network IO to finish.  We don't want to wait forever for something to download.  Using Awaitility, it'd look something like this:

Awaitility.await().atMost(new Duration(10, TimeUnit.MINUTES))
    .until(new Callable<Boolean>() {
        @Override
        public Boolean call() throws Exception {
            return file.exists();
        }
    });

In this example, we'll wait at most 10 minutes for the file to download.  Awaitility will call your Callable every so often until call() returns true.  The default polling time is 100ms but this is configurable by calling pollInterval() in your method chain.  You can also choose to wait a certain duration before polling begins via pollDelay().
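If the polling idea is new to you, this is roughly what it boils down to, written in plain Java with no Awaitility on the classpath.  It's a sketch of the concept only, not Awaitility's implementation:  PollingSketch and waitUntil are invented names, and where Awaitility signals a timeout with an exception, this version simply returns false.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class PollingSketch {
    // Checks the condition every pollMillis until it holds or the timeout passes.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true; // condition met
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // The condition becomes true on its third evaluation.
        boolean done = waitUntil(() -> calls.incrementAndGet() >= 3, 1000, 10);
        System.out.println(done + " after " + calls.get() + " checks"); // prints true after 3 checks
    }
}
```

The poll interval is the trade-off knob:  shorter means you notice sooner, longer means you spend less time checking.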

If, however, you just need to run some method that doesn't really return anything, or you don't care about the return value, you can simply make that blocking call in your Callable like so:

Awaitility.await().atMost(new Duration(10, TimeUnit.MINUTES))
    .until(new Callable<Boolean>() {
        @Override
        public Boolean call() throws Exception {
            blockingCall();
            return true;
        }
    });

For as much as it lets you do, Awaitility has a very simple API.  If Scala or Groovy is your preferred language, there are bindings for those as well, though I've no experience with those versions.