Building Robust RESTful APIs
By building, reading, documenting, hating and loving RESTful APIs, I have learned a lot along the way, from both the rights and the wrongs. These lessons now serve me as a checklist that I consult before, during and after building a RESTful API.
I have just written down the main points of that checklist and would like to share them with you. Before starting, though, I want to say a few words about what a “Robust RESTful API” is. Three words which together sound so damn glorious! 🙌
I’ll assume you are reading this because you know what a RESTful API is! If not, I recommend you read this and this and then come back; I’ll be waiting for you 🙃 As for the other word, “robust”, the Cambridge Dictionary says: “(of an object or system) strong and unlikely to break or fail”, and that is, indeed, what we are talking about.
The robust RESTful API Checklist
Let’s go through the list, and please don’t hesitate to leave me a comment if you think I’m missing something important! The list is continuously growing.
First and most important, be RESTful! 🙏 Just do your best to keep things RESTful and your system will gain (for free) a bunch of desirable non-functional properties, such as performance, scalability, simplicity, modifiability, visibility, portability, and reliability.
A system is RESTful if it complies with the “six constraints”:
- Uniform interface
- Client–server architecture
- Statelessness
- Cacheability
- Layered system
- Code on demand (optional)
Check out Roy Fielding’s dissertation about it for detailed information.
Restrict unnecessary HTTP methods
Often, our systems don’t need to expose every possible operation on every resource. In fact, exposing them all can open a very serious security hole, and nobody wants to fight with that.
There are a bunch of amazing tools that create predefined RESTful endpoints for a given data model (Django REST Framework, Grape, Strapi, etc.). They are very straightforward to use, but it is up to you to carefully verify the list of operations available for each resource in your system.
For those cases where an operation should not be allowed, HTTP 405 - Method Not Allowed exists.
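As a rough illustration of the idea, here is a minimal, framework-free sketch. The routes, the `ALLOWED_METHODS` table and the `check_method` helper are hypothetical names made up for this example; your framework will have its own way of declaring this.

```python
from http import HTTPStatus

# Hypothetical allow-list: each resource exposes only the methods it needs.
ALLOWED_METHODS = {
    "/articles": {"GET", "POST"},
    "/articles/{id}": {"GET"},
}


def check_method(route: str, method: str) -> tuple[int, dict[str, str]]:
    """Return (status, headers) for a request to a known route."""
    allowed = ALLOWED_METHODS[route]
    if method.upper() not in allowed:
        # A 405 response should tell the client which methods are supported.
        headers = {"Allow": ", ".join(sorted(allowed))}
        return HTTPStatus.METHOD_NOT_ALLOWED, headers
    return HTTPStatus.OK, {}
```

With this table, a `DELETE` on `/articles/{id}` is answered with a 405 and an `Allow: GET` header instead of silently reaching the data layer.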
Always use HTTPS
HTTPS, as you probably know, is an extension of HTTP for secure communication over computer networks. To provide a security layer, HTTPS encrypts the communication using TLS (the successor to SSL).
The main motivation for HTTPS is to guarantee the confidentiality and integrity of the data exchanged between the two parties. If your API is not using HTTPS, it is a piece of cake to intercept/alter/tamper/read the data in transit, and again, you will have serious security issues.
TIP: Let’s Encrypt is an open Certificate Authority that could help you secure your API, for FREE!
Never trust input data. Even if you are the developer in charge of both the API and the frontend, you should apply strong data validation on the API side.
Check length, range, format and types. If something is wrong, use one of the predefined HTTP error codes:
HTTP 400 - Bad Request: Malformed/unexpected input data.
HTTP 406 - Not Acceptable: The server cannot produce a response in any format listed in the request’s Accept headers.
HTTP 413 - Request Entity Too Large: Requests exceeding the limit size.
HTTP 415 - Unsupported Media Type: Unexpected or missing content type headers.
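To show how these checks can map to status codes, here is a small sketch using only the standard library. The payload shape (`name`, `age`), the size limit and the `validate_signup` function are assumptions invented for the example, not part of any real API.

```python
import json

MAX_BODY_BYTES = 1024  # hypothetical size limit for this endpoint


def validate_signup(content_type, body: bytes) -> tuple[int, str]:
    """Validate a hypothetical JSON payload like {"name": str, "age": int}."""
    # Media type: ignore parameters such as "; charset=utf-8".
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type != "application/json":
        return 415, "unsupported media type"
    # Size: reject oversized bodies before trying to parse them.
    if len(body) > MAX_BODY_BYTES:
        return 413, "request body too large"
    # Format: the body must be well-formed JSON.
    try:
        data = json.loads(body)
    except ValueError:
        return 400, "malformed JSON"
    if not isinstance(data, dict):
        return 400, "expected a JSON object"
    # Types, length and range of each field.
    name, age = data.get("name"), data.get("age")
    if not isinstance(name, str) or not 1 <= len(name) <= 50:
        return 400, "name must be a string of 1 to 50 characters"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(age, bool) or not isinstance(age, int) or not 0 <= age <= 150:
        return 400, "age must be an integer between 0 and 150"
    return 200, "ok"
```

In a real project you would probably lean on your framework's serializers or a schema library, but the order of the checks (media type, size, format, then fields) is the part worth keeping.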
Configure a restrictive CORS Policy
I can’t explain it better than OWASP.
Certain “cross-domain” requests, notably Ajax requests, are forbidden by default by the same-origin security policy. However, most of the time we need to perform this kind of request. It’s a security MUST to configure the allowed domains properly and restrictively.
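A restrictive policy boils down to an explicit allow-list of origins. The sketch below assumes a hypothetical frontend at `https://app.example.com` and a made-up `cors_headers` helper; most frameworks offer a CORS middleware that does this for you.

```python
# Hypothetical allow-list: only the origins that really need access.
ALLOWED_ORIGINS = {"https://app.example.com"}


def cors_headers(origin):
    """Return the CORS response headers for a request's Origin header."""
    if origin in ALLOWED_ORIGINS:
        return {
            # Echo the specific origin instead of using the "*" wildcard.
            "Access-Control-Allow-Origin": origin,
            # Tell caches that the response depends on the Origin header.
            "Vary": "Origin",
        }
    # Unknown origin: send no CORS headers, so the browser blocks the read.
    return {}
```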
Authentication is the mechanism of associating an incoming request with a set of identifying credentials, such as the user the request came from, or the token that it was signed with. Authentication is not always needed, but if it is, there are some recommendations.
The access control decision should be taken locally by REST endpoints. This helps to minimize latency and reduce coupling between services, which is also more RESTful. In the same way, user authentication should be centralized in an Identity Provider (IdP), which issues access tokens.
One of the most popular tools to accomplish this is JWT. If you haven’t heard about it, check out this spec.
JWT defines a compact and self-contained way for securely transmitting information between parties as a JSON object. This information can be verified and trusted because it is digitally signed. I strongly recommend its use.
Requests that cannot be authenticated should be answered with an:
HTTP 401 – Unauthorized.
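To give a feel for what "digitally signed" means here, this is a teaching sketch of HS256 signing and verification using only the standard library. The `sign` and `verify` helpers are my own names for the example; in production I would reach for a maintained library (PyJWT, for instance) rather than hand-rolling this.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> bytes:
    # JWT uses URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=")


def _b64url_decode(data: bytes) -> bytes:
    return base64.urlsafe_b64decode(data + b"=" * (-len(data) % 4))


def sign(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 token: header.payload.signature."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signing_input = header + b"." + payload
    sig = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return (signing_input + b"." + sig).decode()


def verify(token: str, secret: bytes, now=None):
    """Return the claims if the token is valid, else None (answer 401)."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, sig_b64 = token.encode().split(b".")
        expected = _b64url(
            hmac.new(secret, header_b64 + b"." + payload_b64, hashlib.sha256).digest()
        )
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, sig_b64):
            return None
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None
    # Only accept the algorithm we signed with.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    if not isinstance(claims, dict):
        return None
    # Reject expired tokens when an "exp" claim is present.
    exp = claims.get("exp")
    if isinstance(exp, (int, float)) and now >= exp:
        return None
    return claims
```

Whenever `verify` returns `None`, the endpoint answers with a 401 and does no further work.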
Authentication by itself is not usually sufficient to gain access to information or code. For that, the entity requesting access must have authorization. Permissions are used to grant or deny access to different classes of users to different parts of the API.
TIP: Always model roles and groups. Your client won’t know they need them until the whole system is in production.
Requests which don’t meet the authorization restrictions should be answered with an:
HTTP 403 – Forbidden.
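The difference between the two codes is easier to see side by side. This is a minimal sketch; the role names, the permission strings and the shape of the `user` dictionary are all assumptions made for the example.

```python
# Hypothetical mapping from roles to the permissions they grant.
ROLE_PERMISSIONS = {
    "admin": {"articles:read", "articles:write"},
    "reader": {"articles:read"},
}


def authorize(user, permission: str) -> int:
    """Return the HTTP status for a user (or None) asking for a permission."""
    if user is None:
        # We don't know who is asking: authentication problem.
        return 401
    granted = set()
    for role in user.get("roles", []):
        granted |= ROLE_PERMISSIONS.get(role, set())
    if permission not in granted:
        # We know who is asking, and they are not allowed.
        return 403
    return 200
```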
Public REST services without access control run the risk of being farmed, leading to excessive bills for bandwidth or compute cycles.
A robust API MUST have a restrictive throttle for unauthenticated requests, and a less restrictive throttle for authenticated requests.
HTTP 429 – Too Many Requests is the appropriate response when a problematic situation is detected.
There is an old dispute about this topic between UX and Security. The security approach (the one we’ll take) says that an API should respond with generic error messages and avoid revealing details of the failure unnecessarily.
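One simple way to implement such a throttle is a sliding window per client. The sketch below keeps everything in memory and uses names I made up (`SlidingWindowThrottle`, `allow`); a real deployment would usually store the counters in something shared, such as a cache server, and the limits shown are arbitrary.

```python
import time
from collections import defaultdict, deque


class SlidingWindowThrottle:
    """Allow at most `limit` requests per `window_seconds` for each key."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._hits = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # the caller should answer with HTTP 429
        hits.append(now)
        return True


# Stricter for anonymous clients (keyed by IP), looser for known users.
anonymous = SlidingWindowThrottle(limit=20, window_seconds=60)
authenticated = SlidingWindowThrottle(limit=200, window_seconds=60)
```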
Do not pass technical details (e.g. call stacks or other internal hints) to the client/user. These errors must be handled according to a well thought out scheme that will provide a meaningful error message to the end-user, diagnostic information to the site maintainers, and no useful information to an attacker.
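One way to satisfy all three audiences is to log the full details on the server and hand the client only a generic message plus a reference they can quote to support. This is a sketch under that assumption; `safe_call` and the `incident_id` field are hypothetical names.

```python
import logging
import uuid

logger = logging.getLogger("api.errors")


def safe_call(handler):
    """Run a request handler and hide internal failures from the client."""
    try:
        return 200, handler()
    except Exception:
        incident_id = uuid.uuid4().hex
        # Full traceback goes to the logs, for the site maintainers only.
        logger.exception("unhandled error, incident=%s", incident_id)
        # The client gets a generic message and an id to quote to support.
        body = {"error": "Internal server error", "incident_id": incident_id}
        return 500, body
```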
Good logging is critical to debugging and troubleshooting problems. Not only is it helpful in local development, but in production it’s indispensable.
Logs allow you to:
- Visualize the behavior of the system.
- Identify problems before they occur.
- Diagnose issues post-mortem.
- Obtain metrics.
A robust API (RESTful or not) MUST have logging, and it should be implemented carefully so that issues and warnings are reported through the appropriate communication channels.
The response to a request can be really BIG. Pagination helps to keep things small and handy for both client and server.
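As one possible starting point, Python's standard `logging` module can emit structured lines that are easy to ship to whatever channel you use. The `JsonFormatter` class and the field names below are my own choices for the example, not a standard.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
api_logger = logging.getLogger("api")
api_logger.addHandler(handler)
api_logger.setLevel(logging.INFO)

# Levels let you route warnings and errors to different channels later on.
api_logger.warning("disk usage at %d%%", 91)
```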
The earlier in the project’s life you decide to add this feature, the less expensive it will be.
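A minimal page-number scheme could look like the sketch below. The `paginate` helper and the response keys (`count`, `next`, `previous`, `results`) are assumptions for the example; the important detail is the cap on `page_size`, so a client cannot ask for everything at once.

```python
def paginate(items, page: int = 1, page_size: int = 20, max_page_size: int = 100):
    """Return one page of `items` plus the metadata a client needs to move on."""
    page = max(page, 1)
    # Clamp the page size so a client cannot request an unbounded response.
    page_size = min(max(page_size, 1), max_page_size)
    start = (page - 1) * page_size
    end = start + page_size
    total = len(items)
    return {
        "count": total,
        "page": page,
        "page_size": page_size,
        "next": page + 1 if end < total else None,
        "previous": page - 1 if page > 1 else None,
        "results": items[start:end],
    }
```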
API versioning allows you to alter behavior between different clients.
There are multiple approaches, and implementing one at the beginning of the project can save you a lot of trouble in the future, when the project is alive and used by many clients and users.
The approaches are:
- URL versioning
- Accept Header versioning
- Query Parameter versioning
- Domain versioning
Each one has its pros and cons, but you should consider having one, even when you are not sure you need it.
Avoid exposing management endpoints via the Internet.
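To illustrate two of the approaches listed above, here is a sketch that resolves the version from the URL first and falls back to the Accept header. The `/v2/...` path scheme, the `version=2` media type parameter and the `resolve_version` helper are conventions I assumed for the example, not a standard.

```python
import re

DEFAULT_VERSION = "1"          # hypothetical default for unversioned requests
SUPPORTED_VERSIONS = {"1", "2"}


def resolve_version(path: str, accept: str = ""):
    """Return the requested API version, or None if it is not supported."""
    # URL versioning: /v2/articles
    match = re.match(r"^/v(\d+)/", path)
    if match is None:
        # Accept header versioning: application/json; version=2
        match = re.search(r"\bversion=(\d+)", accept)
    version = match.group(1) if match else DEFAULT_VERSION
    return version if version in SUPPORTED_VERSIONS else None
```

Whatever scheme you pick, deciding what an unversioned request means (here, version 1) is worth doing on day one.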
If management endpoints must be accessible via the Internet, make sure that users must use a strong authentication mechanism, e.g. multi-factor. Expose management endpoints via different HTTP ports or hosts. Restrict access to these endpoints by firewall rules or use of access control lists.
These endpoints are prime targets for attack, and sadly, it is quite common for attackers to succeed just by trying the default credentials.
I once read this quote from Jacob Kaplan-Moss:
“Code without tests is broken by design.”
I think that sums up what I have to say about testing. Period.
As you can see, this checklist is not comprehensive at all (there is nothing about internationalization, caching, filtering, content negotiation, metadata…), but it grew out of my own experience. I strongly recommend that you keep your own verification steps, in order to guarantee secure, well-tested and robust APIs, and products in general.
Again, please don’t hesitate to leave me a comment if you think I’m missing something important! The list is continuously growing.
Hope you have learned something new! If not, thanks for reading 👋