Soft Deletes In HTTP APIs

October 4th, 2016, by Dan Yoder

That’s okay, delete if it’ll make you feel better. We’ve got you covered! And no time travel necessary. Photo credit: Ervins Strauhmanis.

A fun question came up the other day on our internal Slack channel:

For soft-deletes (where we keep a record in the database, but set a flag that it’s been deleted), we’re trying to decide between doing that with DELETE <resource> or PUT <resource> with a body {status: 'deleted'}. That way, we can reserve DELETE for hard deletes. Any recommendations?

The argument in favor of the PUT approach is well articulated by Phil Sturgeon:

Delete should probably do what it says, actually delete the item.

On the face of it, what could be more reasonable? But we’re going to explain why this heuristic is misguided. Or as I put it in our Slack channel:

He’s wrong.

Intention Over Implementation

While it’s true that DELETE should mean delete, it’s the semantics that are important, not the implementation. Whether we delete the record or simply set a flag saying, as far as the interface is concerned, this item is deleted, that’s of no concern to our clients. And, in fact, a major design goal for HTTP is to keep the client and server loosely coupled, meaning that we don’t want implementation details to leak from one to the other.

So Much Anger

A few years ago, we implemented an MVP of a social media application where DELETE did, in fact, delete the database record. If you deleted your account, it was truly gone. We later discovered that people wanted to be able to reinstate their accounts. I guess people would delete their accounts in anger and then later reconsider. We never learned whether they were angry at us or whether it was just the person they were chatting with, but this was a thing.

Don’t Break The Clients

So, of course, we implemented a soft delete. We did this purely on the server side, meaning that an HTTP DELETE no longer deleted the record. This way, we didn’t need to update our client apps. And this is the ideal: you change the server and you don’t break the clients. This is what HTTP strives to help you do, and it’s a promising sign when your design accidentally works this way. Or even if it works this way on purpose.

HTTP Says It’s Okay

In fact, the specification expressly allows for this (emphasis added):

If the target resource has one or more current representations, they might or might not be destroyed by the origin server…depending entirely on the nature of the resource and its implementation by the origin server…

Programming Is Hard, Lesson 179

But there’s a wrinkle, of course. There’s always a wrinkle. Because this is programming and it’s hard.

For some resources, soft-deleted objects are still GET-able. Through an admin interface. But GET should now return a 404, right?

Yup. Keeping the semantics consistent is important. We don’t want the interface to degenerate into nonsense. Keeping the implementation details of DELETE from leaking to the client is great, but we shouldn’t be able to turn around and GET a resource after a DELETE. That should absolutely return a 404. We’ll come back to this in a moment.
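To make that concrete, here is a minimal sketch, assuming an in-memory dictionary stands in for the database. The ProfileStore class, its method names, and the deleted flag are all hypothetical, invented for illustration. The only point is that a hard delete and a soft delete look identical from the client's side of the interface.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    login: str
    deleted: bool = False  # hypothetical soft-delete flag


@dataclass
class ProfileStore:
    """In-memory stand-in for a database table (an assumption of this sketch)."""

    soft: bool = True
    rows: dict = field(default_factory=dict)

    def create(self, login: str) -> int:
        self.rows[login] = Profile(login)
        return 201

    def delete(self, login: str) -> int:
        row = self.rows.get(login)
        if row is None or row.deleted:
            return 404  # nothing visible to delete
        if self.soft:
            row.deleted = True  # keep the record, flip the flag
        else:
            del self.rows[login]  # really remove the record
        return 204

    def get(self, login: str) -> int:
        row = self.rows.get(login)
        # A flagged row is treated exactly like a missing one.
        if row is None or row.deleted:
            return 404
        return 200

    def admin_has_record(self, login: str) -> bool:
        # Server-side view only; never reachable through the public GET.
        return login in self.rows


def client_session(store: ProfileStore) -> list:
    """Status codes a client sees for: create, read, delete, read again."""
    return [
        store.create("joe"),
        store.get("joe"),
        store.delete("joe"),
        store.get("joe"),
    ]
```

Running client_session against ProfileStore(soft=True) and ProfileStore(soft=False) yields the same list of status codes. Only admin_has_record can tell the two apart, and that never leaves the server.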

Creative Destruction

One solution is to create a new resource — the archived resource. But how does the client discover the URL for the new resource? And, probably more importantly, is resource creation as a result of a DELETE a desirable side-effect? After all, the semantics for DELETE, quite reasonably, don’t talk much about creating new resources.

However, so long as we don’t repeatedly create new resources—and mind you, we’re talking semantically, not about creating records in a database—it’s not any different than claiming that deleting a record in a database or expiring content in a cache is a side-effect. HTTP does not prohibit them. It only cares that calling DELETE repeatedly does the same thing as calling it once (that’s what idempotence refers to). Thus, resources may spring into existence for a variety of reasons, including because we deleted something, as odd a circumstance as that might seem. But, hey, life is strange.
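Here is a small sketch of that idea, assuming two plain dictionaries (live and archive, both hypothetical) play the role of the server's state. The archived resource appears exactly once, and retrying the DELETE changes nothing further. Note that idempotence is about the state on the server, so the second call is free to answer with a different status code.

```python
def delete_profile(live: dict, archive: dict, login: str) -> int:
    """Handle DELETE for a profile, archiving it as a side effect."""
    profile = live.pop(login, None)
    if profile is None:
        # Already gone: nothing changes, so repeating the call is harmless.
        return 404
    # Created at most once per deletion, however often DELETE is retried.
    archive[login] = profile
    return 204


live = {"wileecoyote": {"name": "Wile E. Coyote"}}
archive = {}

first = delete_profile(live, archive, "wileecoyote")
state_after_one = (dict(live), dict(archive))

second = delete_profile(live, archive, "wileecoyote")
state_after_two = (dict(live), dict(archive))

# Same server state after one call or two.
assert state_after_one == state_after_two
```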

Witness Protection

We can turn our attention to how clients are to find out about this new life our soft-deleted resource leads. It’s not witness protection, we don’t want them to disappear. In fact, let’s just call it an archived resource. It sounds kinder. (Cue mobster: You know, sometimes people just…they get archived. We wouldn’t want that to happen to you, now would we?) We need to be able to recover these archived resources once those responsible for their archival realize, in the bright light of day, they’ve made a terrible mistake.

The Nice Thing About Query Parameters

The way to identify a resource is, of course, via its URL. So our server must communicate the URL of the archived resource to the client somehow. (Because otherwise, the client needs to make assumptions about how to construct it, and we don’t want that, do we? No…we don’t? Good, that’s what I thought.) One simple way is to use query parameters. If we know the original URL, we can simply add a query parameter to it (say, archived=true, for example) to get the archived version. This is the nice thing about query parameters. We can use the URL for one resource to derive the URL of another, without violating the opacity of the original.

Update: Some of you are asking, but isn’t this still basically constructing the URL? Well, yes. Yes, it is. Our justification is that we don’t need to know much, just that we should add this one query parameter. We don’t even know the server name! We could be using that cool new archival-as-a-service startup, aaas.io, to handle our soft deletes for us! We’ve thus minimized the number of things the client needs to know about the server implementation, which is what we’re saying when we talk about loose coupling, which is not to be confused with goose coupling, which is a completely different thing.
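On the client side, deriving the archived URL might look something like the sketch below. The archived_url helper is a hypothetical name, and archived=true is just the example parameter from above. The rest of the URL is carried over without the client having to interpret it, apart from the standard library re-encoding the existing query string.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def archived_url(url: str) -> str:
    """Return the URL with archived=true added to its query string."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier "archived" value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "archived"
    ]
    query.append(("archived", "true"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(archived_url("https://acme.com/profiles?login=wileecoyote"))
# prints https://acme.com/profiles?login=wileecoyote&archived=true
```

Because any existing archived parameter is dropped first, applying the helper to an already-archived URL gives back the same URL.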

No Body, No Crime…

Another possibility would be to simply include this information in the response body of the DELETE method. This is perfectly fine and expressly allowed for in the spec:

If a DELETE method is successfully applied, the origin server SHOULD send a 202 (Accepted) status code if the action will likely succeed but has not yet been enacted, a 204 (No Content) status code if the action has been enacted and no further information is to be supplied, or a 200 (OK) status code if the action has been enacted and the response message includes a representation describing the status.

So we can return a 200 (OK) along with a bit of JSON with the archived URL. Maybe something like this:

{
	"message": "Boom! This profile was deleted but can be restored from archival if you change your mind or whatever.",
	"archive": {
		"url": "https://acme.com/profiles?login=wileecoyote&archived=true"
	}
}
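On the server, choosing between those status codes could be sketched roughly as follows. The delete_response function and its archive_url argument are hypothetical, and the JSON shape simply mirrors the example above. Treat it as an illustration rather than a prescribed format.

```python
import json
from typing import Optional, Tuple


def delete_response(archive_url: Optional[str]) -> Tuple[int, Optional[str]]:
    """Pick the status code and body for a successful DELETE."""
    if archive_url is None:
        # Nothing more to say: 204 (No Content) with an empty body.
        return 204, None
    body = {
        "message": "This profile was deleted but can be restored from the archive.",
        "archive": {"url": archive_url},
    }
    # 200 (OK) because the response carries a representation of the outcome.
    return 200, json.dumps(body)
```

A hard delete with nothing left to recover would pass None and get the 204. A soft delete passes the archived URL and gets the 200 with the body.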

A Thing You Should Probably Not Do

We could also just return a 204 (No Content) with a Location header, just like we’d do with 201 (Created) or a redirect. This is not expressly allowed by the spec, but HTTP is expressly extensible, so, if we’re careful, we can extend it. We could decide to extend it by saying that a Location header returned by a DELETE request indicates the URL for the archived version of the resource, if one exists. However, this is a big commitment (because we’d be making a decision to extend the protocol), so I think including a body with the archive URL is fine, at least until you get over your fear of commitment.

Somewhere, in a parallel universe, this is me. Photo credit: slgckgc (probably a Mets fan).

A Degree in Fun

That’s great, but, again, wrinkles. Why didn’t we pick something simple to do for a living, like driving an ice-cream truck or operating the T-shirt gun at concerts? Sigh. Anyway, in our example, where someone has deleted their profile, maybe they deleted it via their mobile app, but then attempted to access it again the next day on their laptop. Jerks. So now the client won’t have the URL handy because the response went to the mobile client. We’ll get the 404, but no indication about the archived resource. Again, however, there’s nothing wrong with a response body that provides the archived URL, along the same lines as what we did with DELETE.
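A rough sketch of such a 404, assuming the same two hypothetical dictionaries (live and archive) as server state and the acme.com URL from the earlier example. Whether every requester should see the archive hint, or only the profile's owner, is a decision this sketch leaves open.

```python
import json
from urllib.parse import urlencode


def get_profile(live: dict, archive: dict, login: str) -> tuple:
    """Return (status, JSON body) for a GET on a profile."""
    if login in live:
        return 200, json.dumps(live[login])
    body = {"message": "Profile not found."}
    if login in archive:
        # Still a 404, but the body points at the archived resource.
        query = urlencode({"login": login, "archived": "true"})
        body["archive"] = {"url": f"https://acme.com/profiles?{query}"}
    return 404, json.dumps(body)
```

The status code stays a 404 either way. The body is what lets the laptop client catch up with what the mobile client already knew.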

Raising The Archived

To restore the archived profile, we could just POST to the archived URL. This is ideal since we probably want to make sure only the original owner, or perhaps a support person, can reactivate it (unlike a create operation, where the profile never existed in the first place).
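As a sketch of what that POST handler might check, assuming the requester has already been authenticated and is identified by a login string. The restore_profile function, the support_staff set, and the choice of 403 for everyone else are all assumptions made for illustration.

```python
def restore_profile(
    live: dict,
    archive: dict,
    login: str,
    requester: str,
    support_staff: frozenset = frozenset(),
) -> int:
    """Handle POST to the archived URL: move the profile back to live."""
    if login not in archive:
        return 404  # nothing archived under this login
    if requester != login and requester not in support_staff:
        return 403  # only the owner or a support person may restore
    live[login] = archive.pop(login)
    return 200


live = {}
archive = {"joe": {"name": "Joe"}}

by_amy = restore_profile(live, archive, "joe", requester="amy")
by_joe = restore_profile(live, archive, "joe", requester="joe")
```

Here Amy's attempt is refused and leaves the archive untouched, while Joe's own request moves his profile back into live.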

Delete Should Mean Delete, Dammit

Let’s take a step back. This seems a bit complicated. How again is this better than just ensuring that DELETE means delete and updating the archived flag directly with PUT or PATCH?

Since I already went on a bit about loose coupling and not breaking the clients and so on, let’s take a different tack. Suppose Amy tries to view Joe’s profile after Joe deletes it. What do we want to return to the client? A 404 right?

That’s because, from the client’s perspective, the profile is no longer there.

That’s your litmus test. If that GET needs to return a 404, use a DELETE to delete, archive, suspend, deactivate, or whatever it.

Every Application Is Special

If that was not the case, and the semantics of archival were different and visible to the client, we might be okay with PUT or PATCH here. In this case, the client only cares about archival when the profile owner changes their mind about deleting it. Otherwise, it’s as though the profile never existed. Given those semantics, DELETE and 404 (Not Found) make perfect sense.

And, Also, Time Travel

From the server’s perspective meanwhile, the archive flag is an implementation detail. Later, we might decide that we’re tired of constantly filtering for that darned flag. We might move archived records into a different table. Then we realize, we’re storing a bunch of data in our database on the off chance Joe comes back one day. We might literally archive them to S3 or another higher latency (and less costly) medium. Who knows what the future holds? Only time travelers from the future! Which we aren’t, but, if you are, soft deletes are probably not your reason for coming back. At least, I hope they aren’t! Because what’s going on in that scenario? Either way, it won’t matter to our clients — they’ll never know the difference. (Not about the time travel, but about our implementation of soft-deletes. But also, probably not about the time traveling, either.)