As software engineers we write code every day, and it is inconceivable that this code would ever exist in a vacuum, isolated from all other software ever written. Never has the 'standing on the shoulders of giants' metaphor been more apt than it is today in software engineering, with GitHub, Stack Overflow, Maven Central, and all other directories of code, support, and software libraries available at our fingertips.
Software is built from Application Programming Interfaces (APIs) - we make use of the JDK, and numerous dependencies brought in from tools such as Maven or Gradle, on a daily basis. If you were to ask a room full of software engineers if they were API developers, their response is usually that no, they are not. This is incorrect! Anyone who has ever crafted a public class or public method should consider themselves an API developer. The term 'crafting' is used deliberately here. Too often (and for good reason!) software engineering gets wrapped up in the formality of engineering, but API design different: it is an art form that requires creativity and a gut feeling to be developed over many years, rather than an exact science.
API has to be thought of as a contract - when we make API available to other developers, we are promising them certain functionality. It is often hard for engineers to leave their code alone, as they imagine new and better approaches (even when not working). Thinking that we should revise our API, to attempt to make it better, is to only be done after much consideration, as by doing this we risk introducing breaking changes and bugs to the developers down stream from us.
At the same time, we can't just throw our API over the fence to our users as soon as it exists. We have to carefully consider whether we have cleanly separated our API from its implementation - we don't want users to know the inner workings of our API to make use of it! We should never expect our users to know some special trick that works because they have access to our implementation code. Going even further, too many developers I have worked with make the mistake of documenting their APIs by describing exactly how the implementation works, which is never what the end user of an API should care about. In fact, this can make us beholden to maintaining the same implementation forever, as users begin to expect certain characteristics that go well beyond the intention of the API as it was intended.
In the build up to the 1.0.0 release of our API, we should feel a sense of freedom to experiment with our APIs. In some projects, the point at which we reach 1.0.0 is the point in which our API becomes locked down (at least until the 2.0.0 release). However, in other projects, there may be more flexibility here, with allowance to continue with experimentation and refinement of the API beyond the 1.0.0 release. In fact, with a wise deprecation process, it is not too hard to evolve an API in a backwards compatible way over a long period of time, without restricting the ability to introduce better approaches. The goal of this website is to instill in the reader a sense of what considerations must be made to enable the ability of an API to evolve over time.
There are many criteria through which an API can be characterized, six of which are introduced below, and which form the threads that we will cover in more depth throughout this website.
How often as an engineer have you downloaded a library as a Maven dependency, and then wondered which class gets you started with using the API? An API should not be considered successful if a developer cannot intuitively understand how to use it.
Developers should give consideration to the entry points into their API. Complete documentation is useful and helps developers to understand the bigger picture, but ideally, we want to ensure minimal friction for the developer, and therefore we should provide developers with the minimal steps to get started at the very top of our documentation. From here, good APIs try to expose their functionality through this entry point, to hold the developers hand as they make use of the primary use cases enabled by the API. More advanced functionality can then be discovered through external documentation as necessary. This concept is often referred to as progressive disclosure.
Because we expect others to use our APIs, we should put in the effort to document it. This is so obvious it almost hurts to write, but we are API designers do ourselves a disservice almost universally on this subject. API documentation is all too often an afterthought - something that is written when time permits. Unfortunately, with the frenetic pace of modern-day development, this often means that a substandard set of API documentation is what our users encounter when we release our new library.
Fortunately, some tools do exist that will attempt to keep us honest, but the burden is still on us to ratchet the importance of API documentation up the priority order. API documentation should be generated from the very first day of our API design efforts, and regular reviews of our actual, generated documentation should be performed. Not only will this help to uncover areas of our documentation that are lacking, but it will also potentially help to uncover other API design mishaps.
It is my strong opinion that alongside high-quality API documentation there should exist equally high-quality code samples. These code samples should be able to be copy-pasted by our users into their projects and with the most minimal of editing have the code execute. Where configuration is required (e.g. the user must supply an API key), our code sample must have the foresight to recognise developers are not mind readers, and to therefore document clearly and loudly within the code sample itself the steps that the user must go through to make use of this code sample.
Combining these two points is my final requirement of a well-documented API: we must use tooling, wherever practicable, to embed live code snippets into our generated API documentation. This means that our API documentation is a living document - when APIs are refactored during development, our code snippets will be automatically refactored by our IDEs, and these code snippets will therefore always be correct.
A good API should not surprise its users, and one way we can fail at this is by not being consistent. When we speak of consistency, we mean ensuring that we repeat the same concepts in our API, rather than introduce different concepts in an ad-hoc fashion. Examples include:
- All of our methods should have the form
xyz(), but not both forms.
- If there are two methods (one a convenience overload of the other, e.g. one taking
Object...(that is, a var-args of
Object) and the other taking
Collection<? extends Object>), that overload should be available everywhere.
Teams that can establish a team-wide vocabulary or set of guidelines have the upper-hand here. These guidelines help to focus and provide clarity where ambiguity may otherwise reign supreme. The end result is that we apply a veneer of consistency across our entire SDK, and our users can be delighted by the reduced cognitive burden imposed upon them when they must intuit how the next API they encounter might work based on their experiences so far.
4. Fit for purpose
In developing an API we must ensure that we target it at the right level for the intended user. This can be thought of in two ways:
- Do only one thing, and do it right.
- Understand your user and their goals.
A good example is the Collections APIs included with the JDK. The user should be able to store and retrieve items from a collection very easily, without first defining a reallocation threshold, configuring a factory, specifying a growth policy, a hash collision policy, a comparison policy, a load factor, a caching policy, and so on. Developers should never be expected to know the inner working of a collections framework to benefit from its functionality.
Creating new API can happen almost too quickly - a few taps on the keyboard - but we should remember that for every new API, we are potentially committing to a lifetime of support.
The actual cost of our API decisions depends a lot on our project and its community - some projects are happy to make breaking changes constantly, whereas others (such as the JDK itself) are eager to break as little as possible. Most projects fall in the middle ground, adopting a semantic versioning approach that carefully deprecates API before removing it in major releases only.
Some projects even go so far as to use include various markers to indicate experimental, beta, or preview functionality that makes it available in releases for feedback, before a final API is locked down. A common approach is to introduce this new, experimental functionality with the
@Deprecated annotation, and to then remove the annotation when the API is considered ready.
As mentioned earlier, users do unexpected things with our APIs all the time. They make assumptions about our APIs and do things we never intended for them to do. Because of this, we must be very careful about everything our APIs do. For example:
toString()and logging output will be parsed (meaning users expect it to never change).
- Any mention of an implementation detail in an API will be seized upon as a forever certainty.
- Any API that is
protectedscope will be overridden in sub-classes.
Sometimes, in order to help our users, and because of the limitations of the visibility keywords in Java, it is better to give them less so that they can do less harm.
For every API decision we make, we are backing ourselves into at least one corner. As we make API decisions, we must take the time to view them in the wider context of the future releases of the SDK. We must always be certain to consider the ramifications of an API decision - how it will play out in the context of future ideas. Making this requirement more challenging, we must not only consider the known set of future requirements, we must also consider the unknown set of future requirements. Of course, this sounds implausible at first, but there are many places in API design where we must ask ourselves questions such as:
- Is the parameter list for this method fixed in size or is it likely to grow as new features are added?
- Is the file size always going to be able to be represented as an integer or should be it a long?
Answering questions like these eagerly, whilst our API is still in development, gives us the ability to more carefully consider how our API may be forced to evolve over time, and therefore it enables us to consider how we might support this evolution. In many cases, this leads us to change our API upfront, so that it is ready for the known, and the unknown, when the time comes.