The Attribution Problem

By: Justin Dragos

This post makes several references to double-entry ledgering. If you aren't familiar with traditional ledgering methods you may want to skim the Wikipedia article before reading.

The attribution problem is an issue that arises from double entry systems relying on authoritative sums for balances, and the cumulative property of addition. The problem can be summarized as:

Money is only deterministically traceable through a single money movement.

You might be tempted to extrapolate from this. If you can trace money through one movement, then tracing any money back one movement at a time should eventually give you a deterministic trail! Unfortunately, this approach falls short. It's an insidious issue: it won't cause problems until the system scales up, but it's present even in simple systems. Let's look at an example.

A liquid analogy

Before we dive into the illustration we're going to set up an analogy. You can think of a traditional double entry account as a bucket filled with liquid. You can put any label you want on the bucket, but when you pour new liquid in, it mixes with what is already there. This is because any new money that enters an account is summed into a balance. Once they are part of the balance, the individual amounts are indistinguishable from each other.

For this example, we'll track any liquid added to or removed from each bucket (the transactions), and we'll track balances as the volume of liquid in each bucket. This mimics the two primary pieces of data in a double entry system: transactions and balances.
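
To make the bucket concrete, here is a minimal Python sketch. The `Account` class and its `pour_in` and `pour_out` methods are hypothetical names invented for this illustration, not part of any real ledger library. It shows two buckets with different histories ending up with the same balance, which is all the sum can tell you.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """A bucket: a label, a running balance, and a log of transactions."""
    name: str
    balance: int = 0  # volume in cups; the authoritative sum
    transactions: list = field(default_factory=list)

    def pour_in(self, cups: int, memo: str) -> None:
        self.transactions.append((memo, +cups))
        self.balance += cups  # the new liquid is summed into the balance

    def pour_out(self, cups: int, memo: str) -> None:
        if cups > self.balance:
            raise ValueError("not enough liquid in the bucket")
        self.transactions.append((memo, -cups))
        self.balance -= cups  # nothing here says WHICH liquid left


# Two buckets filled in different ways (labels are placeholders).
mixed = Account("Held")
mixed.pour_in(1, "liquid A")
mixed.pour_in(1, "liquid B")

plain = Account("Held")
plain.pour_in(2, "liquid A")

# Different histories, identical balances: the sum keeps no trace of its parts.
same_balance = mixed.balance == plain.balance
```

Note that `pour_out` only takes a volume. There is no argument through which the caller could say which liquid is leaving, and that gap is where the attribution problem lives.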

The problem, illustrated

To illustrate the attribution problem, we'll use three buckets. We'll call them In, Held, and Out. Our In bucket will be wherever we get the liquid from. Our Out bucket will be customers asking for a drink. We'll start by making 4 deposits into the Held bucket:

  1. 1 cup of vodka
  2. 1 cup of water
  3. 1 cup of kool-aid
  4. 1 cup of lime juice

Diagram: Buckets after initial balance

Alright, now we've got a balance of 4 cups of liquid in our Held bucket. We're ready to start handing it out to thirsty people!

The first customer

Our first customer of the day is a middle-aged man who asks for something to drink. We pour a cup from our Held bucket into his glass.

Now to ask the question that is the crux of the attribution problem:

The attribution problem: What did we just serve our guest?

The answer to this in our analogy is pretty complicated. It depends on fluid dynamics: what is the relative density of each liquid? How long have they been mixed together? Was the mixture shaken, stirred, or agitated? We could presumably test a sample of the cup we poured and determine the percentage of each liquid in the mixture, but that's going to take time and tools we probably don't have.

How could you do it in a double entry system though? You can't take a sample of a balance... every number is perfectly interchangeable. It turns out this is just not possible, thanks to the mathematical principles that govern sums.

In most accounting systems this kind of attribution is handled by a generic after-the-fact accounting rule that attempts to fix the problem by making an uninformed, but consistent, guess. In other words, we can't possibly tell what we served the customer, so after we pour the glass we'll just decide what was in it, and go with that. The simplest and most common rule is FIFO (first in, first out), which assumes that the first money in is the first money out.

We'll apply that rule here and claim we served our first customer 1 cup of straight vodka, much to their delight. Problem solved!
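
As a rough sketch of what that after-the-fact rule looks like in code, the Python below applies FIFO to the four deposits. The function name `fifo_attribute` and the use of a `deque` of `(label, cups)` pairs are assumptions made for this example.

```python
from collections import deque


def fifo_attribute(deposits, cups):
    """Consume `cups` from the oldest deposits first.

    `deposits` is a deque of (label, cups) pairs, oldest on the left.
    Returns the list of (label, cups) pairs we CLAIM were poured.
    """
    remaining = cups
    attributed = []
    while remaining > 0:
        if not deposits:
            raise ValueError("not enough liquid in the bucket")
        label, available = deposits[0]
        take = min(available, remaining)
        attributed.append((label, take))
        if take == available:
            deposits.popleft()  # this deposit is used up
        else:
            deposits[0] = (label, available - take)
        remaining -= take
    return attributed


held = deque([
    ("vodka", 1),
    ("water", 1),
    ("kool-aid", 1),
    ("lime juice", 1),
])

first_cup = fifo_attribute(held, 1)
```

The rule never inspects the bucket. It only reads the order of the deposits, which is why it is a consistent guess rather than a measurement.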

The second customer

With our first customer happily served, we welcome our second guest: a 5-year-old child. He asks for a drink, pleadingly holding out his empty glass. So we pour him a cup.

The attribution problem: What did we just serve our guest?

That pesky question again. One easily solved though. We decided (without verification) that the cup we served our first customer was 1 cup of vodka, so there is no alcohol left in the bucket... right? Right!?

No rational person would consider pouring from our Held bucket into the child's glass ethical. Yet under the standard FIFO approach to attribution, it is supposedly safe. A drunk 5-year-old is a scary prospect, but if you can believe it, there is actually something even crazier (though less ethically questionable) about this situation.

The true evil of the attribution problem

Here is the dark underbelly of the attribution problem.

Maybe you are a proponent of FIFO and think our first customer got 1 cup of vodka. If so: you are right.

Maybe you think the second cup must have contained some vodka and we served alcohol to a child. If so: you are also right.

Maybe you think we served the first customer 1 cup of lime juice, and the child 1 cup of kool-aid. If so: you are still right.

All of these claims are demonstrably true in equal measure. So is any other combination of the liquids we have in the bucket. This means that it's possible to have a perfectly valid understanding of your finances that doesn't map in any way to the actions taken or business rules used by your system. That's insane!
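
A small Python sketch can show how many stories the ledger allows. It makes one simplifying assumption that is not in the example itself: each customer's cup is attributed to exactly one whole deposit. If you allow fractional mixes, as in the second claim, the number of valid stories becomes unlimited. The names `candidate_stories` and `is_consistent` are invented for this illustration.

```python
from itertools import permutations

deposits = ["vodka", "water", "kool-aid", "lime juice"]  # 1 cup each
total_in = len(deposits)  # 4 cups poured into Held
total_out = 2             # two 1-cup pours to customers

# Every ordered pair: (first customer's cup, second customer's cup).
candidate_stories = list(permutations(deposits, 2))


def is_consistent(story):
    """A story is consistent if the totals the ledger records still add up.

    The ledger only knows volumes, so this is the only check available.
    """
    left_in_bucket = [d for d in deposits if d not in story]
    return len(left_in_bucket) == total_in - total_out


consistent = [s for s in candidate_stories if is_consistent(s)]
```

Every candidate passes, because the check can only compare volumes. The FIFO story `("vodka", "water")` and the `("lime juice", "kool-aid")` story are both in the list, and nothing the balance records can rank one above the other.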

Note: This is why forensic accounting exists as a profession. Trying to reverse engineer what actually happened from a view of double entry accounts is very hard, and requires extensive corroborating information. Money launderers love this property of standard accounting systems. A lot of money laundering is trying to find holes where the way you understand money movement doesn't match the actual actions being taken.

Does it matter?

So what if you can't tell how money actually went through your system? I can use after-the-fact rules like FIFO to account for it all. Why should I even care about deterministic attribution?

This is a valid question. FIFO and rules like it have been the standard for accounting for a century or so. So what's broken? Let's go back to our example above with the liquids, but give them some more real world context.

These are complicated scenarios, but they are all real world problems. Deterministic attribution gives you the tools to solve them. If you are a domestic-only business and work in a low-fraud industry, you might be able to get away with vague attribution for a while. Eventually your business will need to answer questions just like these, either for compliance reasons or to be able to understand the flow of your own money better.

Common solutions

Ok, this is a potentially big problem, but how do you solve it?

Here we'll talk about some common ways to solve this problem and what to expect if you pursue them.

Smaller accounts

(Sometimes called subledgers)

In our example we could solve the attribution problem in a pretty straightforward way: make a new account for each type of liquid. This approach works, but it has some serious drawbacks you'll need to be ready to deal with.
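
Here is a minimal sketch of the idea in Python, assuming one sub-account per liquid. The `SubAccounts` class and its method names are hypothetical and exist only for this illustration.

```python
class SubAccounts:
    """One small bucket per liquid instead of a single Held bucket."""

    def __init__(self):
        self.balances = {}  # label -> cups

    def deposit(self, label, cups):
        self.balances[label] = self.balances.get(label, 0) + cups

    def withdraw(self, label, cups):
        available = self.balances.get(label, 0)
        if cups > available:
            raise ValueError(f"not enough {label}")
        self.balances[label] = available - cups

    def total(self):
        # The old "Held" balance now has to be derived by summing.
        return sum(self.balances.values())


held = SubAccounts()
for liquid in ["vodka", "water", "kool-aid", "lime juice"]:
    held.deposit(liquid, 1)

held.withdraw("vodka", 1)  # first customer: we know exactly what he got
held.withdraw("water", 1)  # second customer: safe, and provably so
```

One thing the sketch makes visible is that every withdrawal now has to name its source, and the number of accounts grows with every kind of liquid you need to tell apart.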

Transaction ledger

This approach uses a secondary data model other than just the balance. It keeps track of the inbound transactions and matches them to outbound transactions in order to patch around the attribution problem.

Diagram: Example transaction ledger

This approach is pretty solid, actually. It solves the attribution problem by deterministically enforcing FIFO (or whatever attribution rule you want to use). The main downside is that the transaction ledger becomes the real source of truth. The balance row is no longer authoritative, and doesn't much matter. If there is a conflict between the balance and the transaction ledger, the transaction ledger wins because it has significantly more detail than the balance. If you have already built an account system and you are trying to bolt on a solution to the attribution problem, this is the route we would suggest.
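
The following Python sketch shows one way such a ledger might look, under the assumption that FIFO is the rule being enforced. `Lot`, `TransactionLedger`, and every method name here are invented for illustration and do not describe any particular product. Each outbound movement is matched to inbound lots at the moment it happens, and the balance is derived from the lots rather than stored.

```python
from dataclasses import dataclass, field


@dataclass
class Lot:
    """One inbound transaction and how much of it is still unspent."""
    lot_id: int
    label: str
    remaining: int


@dataclass
class TransactionLedger:
    lots: list = field(default_factory=list)     # inbound, oldest first
    matches: list = field(default_factory=list)  # (outbound_id, lot_id, cups)

    @property
    def balance(self):
        # Derived from the lots, so it cannot disagree with them.
        return sum(lot.remaining for lot in self.lots)

    def record_inbound(self, label, cups):
        lot = Lot(len(self.lots) + 1, label, cups)
        self.lots.append(lot)
        return lot.lot_id

    def record_outbound(self, outbound_id, cups):
        if cups > self.balance:
            raise ValueError("not enough liquid in the bucket")
        needed = cups
        for lot in self.lots:  # oldest first, so this enforces FIFO
            if needed == 0:
                break
            take = min(lot.remaining, needed)
            if take:
                lot.remaining -= take
                self.matches.append((outbound_id, lot.lot_id, take))
                needed -= take

    def sources_of(self, outbound_id):
        labels = {lot.lot_id: lot.label for lot in self.lots}
        return [(labels[lot_id], cups)
                for oid, lot_id, cups in self.matches
                if oid == outbound_id]


ledger = TransactionLedger()
for liquid in ["vodka", "water", "kool-aid", "lime juice"]:
    ledger.record_inbound(liquid, 1)

ledger.record_outbound("customer-1", 1)
ledger.record_outbound("customer-2", 1)
```

Because the match rows are written when the pour happens, FIFO stops being an after-the-fact guess and becomes the rule the system actually follows. Swapping the loop order in `record_outbound` would enforce a different rule.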

However, this still won't be easy. You need to design the system carefully and be ready to handle the problems listed here.

Product level ledger

This approach tends to be where most businesses start. The order system, billing system, or some other product model is responsible for "the truth" of the financials, and then records it in the financial system. The financial system is expected to blindly accept any information the product system gives it, and if there is ever a conflict, the product system is used as the source of truth to resolve it.

This is a pretty straightforward, business-first approach. It's exceedingly popular amongst startups for exactly that reason, but it has some serious drawbacks.

The solution

It is nearly impossible to avoid the attribution problem in a double entry system. Blockchain ledgers have better traceability, as they can trace money back to the last time it was merged, but they still suffer from the attribution problem at merges. The only surefire way to avoid the problem is by using a ledger that guarantees deterministic attribution.

String Theory is the only modern ledger system with provable attribution for all money in the system. We'll discuss the properties of deterministic attribution and its uses in the next article.
