1. What happened
2. Breaking third-party code
3. Economics approach
4. Conclusion – operability

What happened

I don’t even know what happened to GitHub today, but I do know that webhooks, and with them every CI integration that depends on them, were severely backed up.

If there were no third parties involved, one might be tempted to clear the whole thing and start fresh; intelligently designed consumers should resubmit, right?

Another option is to drain it back to front. The highest-priority builds in your codebase are probably the newest, not the oldest.
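To make the two drain orders concrete, here is a minimal sketch of the same backlog drained front to back and back to front. The `drain` helper and the event names are made up for illustration; I have no idea how GitHub's queue is actually implemented.

```python
from collections import deque


def drain(backlog, newest_first=False):
    """Yield queued events oldest-first (FIFO) by default, or newest-first (LIFO)."""
    queue = deque(backlog)
    while queue:
        # pop() takes from the newest end, popleft() from the oldest end.
        yield queue.pop() if newest_first else queue.popleft()


# Hypothetical backlog, listed in arrival order.
backlog = ["push-1", "push-2", "push-3"]

fifo_order = list(drain(backlog))
lifo_order = list(drain(backlog, newest_first=True))

print(fifo_order)
print(lifo_order)
```

With a newest-first drain, the most recent push gets its build first and the oldest one waits the longest, which is exactly the trade being proposed.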

Breaking third-party code

Breaking third-party code is always a risk here, but is anyone relying on the order of GitHub build hooks?

I could imagine a system that does – maybe I have some kind of build caching that rebuilds from scratch when the order is off, taking way longer than normal and causing a delay (but the system is already delayed). It’s a big world out there – likely there are other cases I can’t imagine.

That said, breaking code with weird assumptions is something you have to be willing to do in order to enable change on a large platform. It’s not like order is guaranteed now.
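For what it's worth, a consumer doesn't have to care about delivery order at all. Below is a rough sketch of one that keeps only the newest event per branch, whatever order events arrive in. The field names (`branch`, `pushed_at`, `sha`) are assumptions for the example, not GitHub's actual webhook payload.

```python
def latest_per_branch(events):
    """Return the newest event for each branch, regardless of delivery order."""
    latest = {}
    for event in events:
        current = latest.get(event["branch"])
        # Keep this event only if it is newer than what we already hold.
        if current is None or event["pushed_at"] > current["pushed_at"]:
            latest[event["branch"]] = event
    return latest


# Hypothetical events, delivered out of order (newest "main" push first).
events = [
    {"branch": "main", "pushed_at": 3, "sha": "c3"},
    {"branch": "main", "pushed_at": 1, "sha": "a1"},
    {"branch": "dev", "pushed_at": 2, "sha": "b2"},
]

latest = latest_per_branch(events)
print({branch: event["sha"] for branch, event in latest.items()})
```

A consumer written this way gives the same answer under FIFO, LIFO, or anything in between, so reordering the drain wouldn't break it.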

Economics approach

There’s an economics paper (on the ‘curse of the FIFO queue discipline’) that uses game theory to argue that FIFO reduces ‘equilibrium welfare’ compared to LIFO.

I’ve always had a problem with this result because it feels like it creates chaos and uncertain property rights during a minor resource crunch. ‘Chaos is a ladder’ say my friends who pay for HBO, but perhaps maintaining the fiction that it isn’t

On the flip side, efficient queue draining may be a net good for the group as a whole, even if some people don’t get first pick. For example, creating a fast lane on the highway allows fast and slow drivers to mix and both get what they want. (Though lane efficiency is also emergent and based on reaction speed, so this is a tricky claim.) A fast lane might provide for a more orderly evacuation, increasing the total number of people who can exit a building (but it could also do the opposite).

I bring this up because spending my day chasing draining queues has made me see queue-draining metaphors in everything. This is a core computer science problem, but it feels like it can speak to fairness in economics, sociology, and organizational design.

Conclusion – operability

Even if it were safe to do this, it’s likely GitHub’s queueing system doesn’t have the ability to change the order in which it drains. Kafka doesn’t seem to.

Things get backed up all the time and have different priorities. I wouldn’t mind having more manual controls & options in cases like this.

I just finished reading Lisanne Bainbridge’s ‘Ironies of Automation’, which says:

High process controllability, good interface ergonomics and a rich pattern of activities were all associated with high feeling of achievement.

Cheers to that.