api-design:-which-version-of-versioning-is-right-for-you?
this article describes some of Google’s investments to make it work. It is usually much better for API providers to treat internal users and partners as if they were external consumers whose development process is independent.

Choosing the appropriate technique

You can probably see already that format version and entity versioning are fundamentally different techniques that solve different problems with different consequences, even though they both sail under the flag of versioning.

So when should you choose to do format versioning versus entity versioning? Usually the business requirements make the choice obvious.

In the case of the bank, it isn’t feasible to introduce a new entity version of an account in order to enable an API improvement. Accounts are stable and long-lived, and moving from old accounts to new ones is disruptive. A bank is unwilling to inconvenience its banking customers just to make life better for API developers. If the goal is just to improve the API, the bank should pick format versioning, which will limit the sort of changes that they make to superficial improvements.

The bank should consider introducing a new entity version if there’s significant new value that it wants to expose to its banking customers, or if it’s forced to do so for security or regulatory reasons. In the case of blockchain accounts, there may be publicity value as well as practical value. Entity version upgrades are less common than format versioning changes for established services, but they do happen; you may have received messages from your bank telling you about a significant technology upgrade to your accounts and alerting you to actions you need to take, or changes you will see.

Entity versioning puts an additional burden on API clients, because the older clients cannot work with the newer entities, even though they continue to work unchanged with the older ones. This puts pressure on client developers to produce a new client application or upgrade an existing one to work with the new API.

Entity versioning can work well for technology products, where the the users of the API and the core customers are often one and the same and rapid obsolescence is considered normal.

How do you implement the different versions of versioning?

On the web, you often see conflicting advice on whether or not a version number should appear in the URLs of a web API. The primary alternative is to put the version ID in an HTTP header. The better choice depends on whether you’re doing format versioning or entity versioning.

For format versioning, put the version identifier in an HTTP header, not in the URL. Continuing the banking example, it’s conceptually simpler for each account to have a single URL, regardless of which format the API client wants to see it in. If you put a format version identifier in the URL, you are effectively making each format of each entity a separate web resource, with some behind-the-scenes magic that causes changes in one to be reflected in the other.

Not only is this a more complex conceptual model for users, it also creates problems with links. Suppose that in addition to having an API for accounts, the bank also has an API for customer records, and that each account contains a link to the record for the customer that owns it. If the developer asks for the version 2 format of the account, what version should be used in the link to the customer record? Should the server assume that the developer will also want to use the version 2 format of the customer record and provide that link? What if customer records don’t even have a version 2 format?

Some APIs that put version identifiers in URLs (OpenStack, for example, and at least one bank we know) solve the link problem by having a “canonical” URL for each entity that’s used in links, and a set of version-specific URLs for the same entity that are used to access the entity’s format versions. Clients that want to follow a link have to convert a canonical URL in a link into a version-specific URL by following a documented formula. This is more complex for both the provider and the client; it’s simpler to use a header.

The usual objection to putting format version identifiers in a header is that it’s no longer possible to simply type a URL into a browser to test the result of a GET on a specific version. While this is true, it’s not very hard to add headers in the browser using plugins like Postman, and you’ll probably have to set headers anyway for the Authorization and Accept headers. If you’ew using the cURL shell command to test your API, adding headers is even simpler. You’ll also need more than just the browser to create, update or delete requests to your API, so optimizing for GET only helps for one scenario. Your judgement may be different, but I have never found it very onerous to set a header.

There’s no standard request header that’s ideal for the client to say what format version it wants. The standard “Accept” header specifies which media types the client can accept (e.g., json, yaml, xml, html, plain text), and the standard “Accept-Language” header denotes which natural languages the client can accept (e.g., French, English, Spanish). Some API designers (e.g., the authors of the Restify framework) use a non-standard header called “Accept-Version”. If you’re doing format versioning, I recommend this header. The standard “Accept” headers allow the client to give a list of values they accept, and even provide a weighting for each. This level of complexity isn’t necessary for “Accept-Version”; a single value is enough. If you’re meticulous, you should set a corresponding “Content-Version” header in the response. Further, it can be useful for clients if the server also puts the format version in the body of the response; in fact, if the representation of one resource is embedded in another, the body is the only place to put it. [This argument applies to a number of the standard headers too: e.g., Etag, Location, and Content-Location.]

By contrast, if you’re doing entity versioning, the version identifier will appear somewhere in the URL of each entity—usually either in the domain name or the path. Users of the API do not have to be aware of this; for them, it’s just the entity’s URL. The version identifier will appear in the URL because the URL has to contain information for two different purposes: for routing requests to the correct part of the implementation for processing, and for identifying the entity within that implementation. Because requests for entities that belong to two different entity versions are almost always processed by a different part of the implementation or use different storage, the version identifier (or a proxy for it) must be somewhere in the URL for your routing infrastructure or implementation to use.

Coincidentally, banking provides a simple illustration of the principle that identifiers contain information for both routing and identification. If you have a checking account at a U.S. bank (the details are different in other countries, but the idea is similar), you’ll find two numbers at the bottom of each check. The first is called the routing number. It identifies the institution that issued and can process this check. The second number identifies the check itself. Conceptually, entity URLs are like the numbers at the bottom of a check, though their formats may be different.

Do I have to define my versioning strategy up front?

You’ll sometimes hear the advice that you must define a versioning strategy before you release your first version, or evolving your API will be impossible. This is not true.

You can always add a new versioning header later if you find the need to do format versioning and you can always add new URLs for new entities for a different entity version. Any requests that lack the format versioning header should be interpreted as meaning the first format version. Since instances of a new entity version get new URLs, you can easily introduce a version ID in those URLs without affecting the URLs of the entities of the first version. The new URLs may use a new hostname rather than adding path segments to URLs on the original hostname; whether or not you like that option will depend on your overall approach for managing hostnames.

Procrastination can be good

Laziness is not the only reason why you might not add versioning to the initial version of your API. If it turns out that versioning is never needed for your API, or for significant portions of your API, then the API will look better and be easier to use if it doesn’t include versioning in its initial release.

If you introduce an “Accept-Version” header in V1 of your API in anticipation of future “format versions” that never materialize, then you force your clients to set a header unnecessarily on every request.

Likewise, if you start all your URLs with the path prefix ‘/v1’ in anticipation of future “entity version” introductions that never happen, then you make your URLs longer and uglier than they need to be.

More importantly, in both cases you introduce a complex topic to clients that you didn’t need to introduce.

Some more versioning tips

If you use versioning, make it clear what sort of versioning you use. If there’a a field in your HTTP requests and responses that says “version: V1,” what does that mean? Does V1 apply to the persistent entity itself (entity versioning), or does it reflect the format in which the client asked to see the entity (format versioning)? Having a clear understanding of which versioning scheme or schemes you use helps your users understand how to use your API as it evolves.

If you’re using format versioning and entity versioning together, signal them with different mechanisms. Format versions should go in headers—Accept-Version and Content-Version—in the request and response. Format versions can also be included in the bodies of responses and requests, for those requests that have them. Entity versions (which are really part of the entity type) belong in the request and response bodies; they’re part of the representation of the entity.

Do not try to put versioning identifiers of either kind or entity type identifiers into the standard Accept or Content-Type headers; those headers should only include standard media types like text/html or application/json. Avoid using values like application/v2 json or application/customer json; the media-type is not the place to try to encode version or type information. Unfortunately, even some of the web standards do this the wrong way, for example application/json-patch json.

Don’t put words like “beta” or “alpha” in version IDs for either format versioning or entity versioning. When you move from alpha to beta, or beta to general availability, you’re making a statement about your level of support for the API, or its likely stability. You don’t want to be in a position where the API version changes just because your level of support changes; you only want to change the version if there’s a technical or functional reason for changing it. To illustrate this point, imagine I am a customer who develops a number of client applications that are using the V1beta4 version of an interface—a late-beta version. The API provider declares the product to be GA, and introduces the V1 version of the API, which is actually exactly the same as the V1beta4 API, since there were no breaking API changes between V1beta4 and GA. The V1Beta4 version of the API is still available, so my client applications don’t break, but the language of the support agreement is clear—only users of the V1 version get full product support. The change to my client applications to upgrade to V1 is small—I only have to change the version number I’m using, which may even be as simple as recompiling with the latest release of the vendor-provided client libraries—but any change to my applications, no matter how small, needs to go through a full release process with QA testing, which costs me thousands of dollars. This is very annoying.

Hopefully this post helps bring a little more clarity to the topic of API versioning, and helps you with your design and implementation choices. Let us know what you think.

For more on API design, read the eBook, “Web API Design: The Missing Link” or check out more API design posts on the Apigee blog.

“>

One of these formats encodes the list of characters as a JSON object keyed by the characters’ name, and the other encodes it as a JSON array. Neither are right or wrong. The first format is convenient for clients that always access the characters by name, but it requires clients to learn that the name of the character is to be found in the place that a property name is usually found in JSON, rather than as a property value. The second format does not favor one access pattern over another and is more self-describing; if in doubt, I recommend you use this one. This particular representation choice may not seem very important, but as an API designer you’re faced with a large number of options, and you may sometimes wish you had chosen differently.

Sadly, there’s no practical way to write API clients that are insensitive to name changes and changes in data representation like these. A version format allows you to make changes like this without breaking existing API clients.

Browsers are able to survive HTML webpage changes without versioning, but the techniques that make this work for browsers—e.g., the ability to download and execute client code that is specific to the current format of a particular resource, enormous investment in the technology of the browser itself, industry-level standardization of HTML, and the human user’s ability to adapt to changes in the final outcome—are not available or practical for most API clients. An exception is when the API client runs in a web browser and is loaded on demand each time an API resource is accessed. Even then, you have to be willing to manage a tight coordination between the team producing the browser code and the team producing the API—this doesn’t happen often, even for browser UI development within a single company.

A very common situation that usually requires an entity version change, rather than just a format version change, is when you split or merge entity hierarchies. In the bank example, imagine that Accounts belong to Customers, and each Account entity has a reference to the Customer it belongs to. Because some customers have many Accounts, the bank wants Accounts to be grouped into Portfolios. Now Accounts need to reference the Portfolio they belong to, not the Customer, and it’s the Portfolio that references the Customer. Changes like this are hard to accommodate with format versions, because older clients will try to set a property linking an Account to a Customer and newer clients will try to set a property linking an Account to a Portfolio. You can sometimes find ways to make both sets of clients work in cases like this, but more often you are forced to introduce new entity versions, each of which is updated using only one API format.

The sort of structural changes that force a new entity version usually introduce new concepts and new capabilities that are visible to the user, whereas the changes handled by format version changes are more superficial.

In general, the more clients an API has, and the greater the independence of the clients from the API provider, the more careful the API provider has to be about API compatibility and versioning.

Providers of APIs sometimes make different choices if the consumers of the API are internal to the same company, or limited to a small number of partners. In that case they may be tempted to try to avoid versioning by coordinating with consumers of the API to introduce a breaking change. In our experience this approach has limited success; it typically causes disruption and a large coordination effort on both sides. Google uses this approach internally, but at considerable cost—this article describes some of Google’s investments to make it work. It is usually much better for API providers to treat internal users and partners as if they were external consumers whose development process is independent.

Choosing the appropriate technique

You can probably see already that format version and entity versioning are fundamentally different techniques that solve different problems with different consequences, even though they both sail under the flag of versioning.

So when should you choose to do format versioning versus entity versioning? Usually the business requirements make the choice obvious.

In the case of the bank, it isn’t feasible to introduce a new entity version of an account in order to enable an API improvement. Accounts are stable and long-lived, and moving from old accounts to new ones is disruptive. A bank is unwilling to inconvenience its banking customers just to make life better for API developers. If the goal is just to improve the API, the bank should pick format versioning, which will limit the sort of changes that they make to superficial improvements.

The bank should consider introducing a new entity version if there’s significant new value that it wants to expose to its banking customers, or if it’s forced to do so for security or regulatory reasons. In the case of blockchain accounts, there may be publicity value as well as practical value. Entity version upgrades are less common than format versioning changes for established services, but they do happen; you may have received messages from your bank telling you about a significant technology upgrade to your accounts and alerting you to actions you need to take, or changes you will see.

Entity versioning puts an additional burden on API clients, because the older clients cannot work with the newer entities, even though they continue to work unchanged with the older ones. This puts pressure on client developers to produce a new client application or upgrade an existing one to work with the new API.

Entity versioning can work well for technology products, where the the users of the API and the core customers are often one and the same and rapid obsolescence is considered normal.

How do you implement the different versions of versioning?

On the web, you often see conflicting advice on whether or not a version number should appear in the URLs of a web API. The primary alternative is to put the version ID in an HTTP header. The better choice depends on whether you’re doing format versioning or entity versioning.

For format versioning, put the version identifier in an HTTP header, not in the URL. Continuing the banking example, it’s conceptually simpler for each account to have a single URL, regardless of which format the API client wants to see it in. If you put a format version identifier in the URL, you are effectively making each format of each entity a separate web resource, with some behind-the-scenes magic that causes changes in one to be reflected in the other.

Not only is this a more complex conceptual model for users, it also creates problems with links. Suppose that in addition to having an API for accounts, the bank also has an API for customer records, and that each account contains a link to the record for the customer that owns it. If the developer asks for the version 2 format of the account, what version should be used in the link to the customer record? Should the server assume that the developer will also want to use the version 2 format of the customer record and provide that link? What if customer records don’t even have a version 2 format?

Some APIs that put version identifiers in URLs (OpenStack, for example, and at least one bank we know) solve the link problem by having a “canonical” URL for each entity that’s used in links, and a set of version-specific URLs for the same entity that are used to access the entity’s format versions. Clients that want to follow a link have to convert a canonical URL in a link into a version-specific URL by following a documented formula. This is more complex for both the provider and the client; it’s simpler to use a header.

The usual objection to putting format version identifiers in a header is that it’s no longer possible to simply type a URL into a browser to test the result of a GET on a specific version. While this is true, it’s not very hard to add headers in the browser using plugins like Postman, and you’ll probably have to set headers anyway for the Authorization and Accept headers. If you’ew using the cURL shell command to test your API, adding headers is even simpler. You’ll also need more than just the browser to create, update or delete requests to your API, so optimizing for GET only helps for one scenario. Your judgement may be different, but I have never found it very onerous to set a header.

There’s no standard request header that’s ideal for the client to say what format version it wants. The standard “Accept” header specifies which media types the client can accept (e.g., json, yaml, xml, html, plain text), and the standard “Accept-Language” header denotes which natural languages the client can accept (e.g., French, English, Spanish). Some API designers (e.g., the authors of the Restify framework) use a non-standard header called “Accept-Version”. If you’re doing format versioning, I recommend this header. The standard “Accept” headers allow the client to give a list of values they accept, and even provide a weighting for each. This level of complexity isn’t necessary for “Accept-Version”; a single value is enough. If you’re meticulous, you should set a corresponding “Content-Version” header in the response. Further, it can be useful for clients if the server also puts the format version in the body of the response; in fact, if the representation of one resource is embedded in another, the body is the only place to put it. [This argument applies to a number of the standard headers too: e.g., Etag, Location, and Content-Location.]

By contrast, if you’re doing entity versioning, the version identifier will appear somewhere in the URL of each entity—usually either in the domain name or the path. Users of the API do not have to be aware of this; for them, it’s just the entity’s URL. The version identifier will appear in the URL because the URL has to contain information for two different purposes: for routing requests to the correct part of the implementation for processing, and for identifying the entity within that implementation. Because requests for entities that belong to two different entity versions are almost always processed by a different part of the implementation or use different storage, the version identifier (or a proxy for it) must be somewhere in the URL for your routing infrastructure or implementation to use.

Coincidentally, banking provides a simple illustration of the principle that identifiers contain information for both routing and identification. If you have a checking account at a U.S. bank (the details are different in other countries, but the idea is similar), you’ll find two numbers at the bottom of each check. The first is called the routing number. It identifies the institution that issued and can process this check. The second number identifies the check itself. Conceptually, entity URLs are like the numbers at the bottom of a check, though their formats may be different.

Do I have to define my versioning strategy up front?

You’ll sometimes hear the advice that you must define a versioning strategy before you release your first version, or evolving your API will be impossible. This is not true.

You can always add a new versioning header later if you find the need to do format versioning and you can always add new URLs for new entities for a different entity version. Any requests that lack the format versioning header should be interpreted as meaning the first format version. Since instances of a new entity version get new URLs, you can easily introduce a version ID in those URLs without affecting the URLs of the entities of the first version. The new URLs may use a new hostname rather than adding path segments to URLs on the original hostname; whether or not you like that option will depend on your overall approach for managing hostnames.

Procrastination can be good

Laziness is not the only reason why you might not add versioning to the initial version of your API. If it turns out that versioning is never needed for your API, or for significant portions of your API, then the API will look better and be easier to use if it doesn’t include versioning in its initial release.

If you introduce an “Accept-Version” header in V1 of your API in anticipation of future “format versions” that never materialize, then you force your clients to set a header unnecessarily on every request.

Likewise, if you start all your URLs with the path prefix ‘/v1’ in anticipation of future “entity version” introductions that never happen, then you make your URLs longer and uglier than they need to be.

More importantly, in both cases you introduce a complex topic to clients that you didn’t need to introduce.

Some more versioning tips

If you use versioning, make it clear what sort of versioning you use. If there’a a field in your HTTP requests and responses that says “version: V1,” what does that mean? Does V1 apply to the persistent entity itself (entity versioning), or does it reflect the format in which the client asked to see the entity (format versioning)? Having a clear understanding of which versioning scheme or schemes you use helps your users understand how to use your API as it evolves.

If you’re using format versioning and entity versioning together, signal them with different mechanisms. Format versions should go in headers—Accept-Version and Content-Version—in the request and response. Format versions can also be included in the bodies of responses and requests, for those requests that have them. Entity versions (which are really part of the entity type) belong in the request and response bodies; they’re part of the representation of the entity.

Do not try to put versioning identifiers of either kind or entity type identifiers into the standard Accept or Content-Type headers; those headers should only include standard media types like text/html or application/json. Avoid using values like application/v2 json or application/customer json; the media-type is not the place to try to encode version or type information. Unfortunately, even some of the web standards do this the wrong way, for example application/json-patch json.

Don’t put words like “beta” or “alpha” in version IDs for either format versioning or entity versioning. When you move from alpha to beta, or beta to general availability, you’re making a statement about your level of support for the API, or its likely stability. You don’t want to be in a position where the API version changes just because your level of support changes; you only want to change the version if there’s a technical or functional reason for changing it. To illustrate this point, imagine I am a customer who develops a number of client applications that are using the V1beta4 version of an interface—a late-beta version. The API provider declares the product to be GA, and introduces the V1 version of the API, which is actually exactly the same as the V1beta4 API, since there were no breaking API changes between V1beta4 and GA. The V1Beta4 version of the API is still available, so my client applications don’t break, but the language of the support agreement is clear—only users of the V1 version get full product support. The change to my client applications to upgrade to V1 is small—I only have to change the version number I’m using, which may even be as simple as recompiling with the latest release of the vendor-provided client libraries—but any change to my applications, no matter how small, needs to go through a full release process with QA testing, which costs me thousands of dollars. This is very annoying.

Hopefully this post helps bring a little more clarity to the topic of API versioning, and helps you with your design and implementation choices. Let us know what you think.

For more on API design, read the eBook, “Web API Design: The Missing Link” or check out more API design posts on the Apigee blog.

Leave a Reply

Your email address will not be published. Required fields are marked *