Monday morning, somewhere in a state audit office. A senior auditor has 40 infrastructure projects on her desk and three days to decide which ones to investigate. Two project files sit on top of the stack.
File A says: risk score: 73.
File B says: Orange. Investigate within 90 days. Triggers: single-bid award, late high-value amendment, missing justification.
Which one moves?
File B moves. File A goes into the second pile, the one the auditor will get to next quarter, or not. That is the whole argument of this article in one image. A score ranks projects. A traffic light tells someone what to do next. After a decade of watching audit offices receive risk-scored lists every quarter and seeing the same high-risk projects on the same lists the next quarter, I think this distinction is the most important interface decision a procurement risk system makes.
- 0 flags: Monitor
- 1 to 5 flags: Review
- 6 to 10 flags: Investigate
- 11 or more flags: Escalate
What CoST and the Data Use Manual actually are
Before going further, two definitions, because this article assumes neither.
CoST, the Infrastructure Transparency Initiative, is the international body that defines how governments should publish data about the infrastructure projects they fund. Member countries across Africa, Asia, and Latin America have committed to the standard. They publish project data through national portals; CoST defines what gets published, how it is structured, and how it is used. The current member list is on the CoST website.
The Data Use Manual is CoST's practitioner handbook. A separate document, the disclosure standard called OC4IDS, defines what governments must publish. The Data Use Manual picks up from there. It tells the people who read the published data (auditors, procurement regulators, investigative journalists, and civil-society monitors) how to turn it into action. Earlier versions in 2018, 2020, and 2023 progressively added red-flag definitions, country case studies, and analysis recipes. Each version reflected what member countries had learned about what works and what does not.
Version 4, published in March 2026, changed the interface: it defined a working set of red-flag categories, then grouped projects into four coloured tiers. The manual calls each tier a band: a colour with a defined threshold and a defined action attached to it. Green means zero flags fired. Yellow means one to five. Orange means six to ten. Red means more than ten. The colours are coarser than a 0-to-100 score. That is the point, and the next section explains why.
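The banding rule is small enough to write down directly. What follows is a minimal Python sketch, not the manual's own code. It assumes the action labels (Monitor, Review, Investigate, Escalate) pair with the colours in the order this article's legend lists them. The names `Band` and `band_for` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    colour: str
    action: str


def band_for(flag_count: int) -> Band:
    """Map a red-flag count to a band, using the thresholds quoted in this article."""
    if flag_count < 0:
        raise ValueError("flag_count must be non-negative")
    if flag_count == 0:
        return Band("Green", "Monitor")
    if flag_count <= 5:
        return Band("Yellow", "Review")
    if flag_count <= 10:
        return Band("Orange", "Investigate")
    return Band("Red", "Escalate")  # more than ten flags


if __name__ == "__main__":
    for n in (0, 3, 6, 11):
        print(n, band_for(n))
```

The function returns the action together with the colour, so a caller cannot show one without the other.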
Why colours work where scores fail
A score lets a reader debate the threshold. 73 is below 80. The auditor who chose not to investigate can defend the choice with a line item: the threshold was 80 and project 47 did not cross it. The score has done its job as decoration and the project goes back into the pile.
A red project cannot be argued with in the same way. The system has declared it red. The auditor's name will appear next to it in the next quarterly report. Inaction becomes visible. The interface has moved the cost of doing nothing from "we ranked it and didn't get to it" to "we saw it was red and walked past it". This is the mechanism the v4 design relies on, and it is the part most often missed by people who read the manual as a cosmetic colour-coding exercise.
If the project shows as score 73
- What the auditor can say: "73 did not cross the 80 threshold."
- What the auditor's name does: Stays off the record. The score moves on.
- Consequence: Project returns to the pile. Same list next quarter.
If the project shows as red
- What the auditor can say: "The system flagged it. We did not investigate."
- What the auditor's name does: Appears next to a red project in the quarterly report.
- Consequence: Inaction becomes visible. Documentary risk exists.
There is a deeper move the colour interface makes possible, though I want to be careful here: v4 was only published in March 2026. No country has yet rolled it out long enough to produce evidence. So what follows is a hypothesis I will test publicly, not a finding.
A procurement director who logs in and sees four of their own projects in red has two choices. Clean them up before anyone else looks, or wait for the audit office to surface them. I expect most agencies to choose the first, because a score is dismissible as a model artefact while a red colour with its specific flag list visible underneath is not. Whether that expectation is right is one of the things the test described later in this article will tell us.
That reversal, in which the agency cleans up its own projects before the auditor arrives rather than after, is the genuinely novel claim in this design. It does not appear in the manual. It is the part the manual makes possible.
Three ways the light can mislead
The colour is useful because it forces a decision. It fails when readers treat the colour as truth instead of a prompt to ask better questions.
1. Green is not clearance
Green only means the available data triggered no flags. If ownership data is missing, ownership risk cannot appear. Put data completeness beside the colour, or the reader may mistake blindness for safety.
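One way to keep completeness beside the colour is to compute both from the same record and never render one without the other. The sketch below assumes a hypothetical list of required fields. The field names are placeholders for illustration and are not taken from OC4IDS or the manual.

```python
# Hypothetical field names, for illustration only.
REQUIRED_FIELDS = (
    "bidder_count",
    "amendments",
    "award_justification",
    "beneficial_owner",
)


def completeness(record: dict) -> float:
    """Fraction of required fields that are present (not None) in the record."""
    present = sum(1 for field in REQUIRED_FIELDS if record.get(field) is not None)
    return present / len(REQUIRED_FIELDS)


def label(colour: str, record: dict) -> str:
    """Render the colour together with data completeness as one string."""
    pct = round(completeness(record) * 100)
    return f"{colour} (data completeness {pct}%)"


if __name__ == "__main__":
    record = {
        "bidder_count": 1,
        "amendments": 0,
        "award_justification": "sole supplier",
        "beneficial_owner": None,  # ownership data missing
    }
    print(label("Green", record))
```

A green project with the ownership field missing then reads as green on three-quarters of the data, which is a different statement from green on all of it.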
2. Patterns beat counts
One flag can have a reasonable explanation. A single-bid award may happen in a narrow market. A late amendment may follow a real scope change. Together, they say something different. Show combinations, not just totals.
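Showing combinations can start with something as simple as counting which flags fire together across a portfolio. This is a rough sketch using only the Python standard library. The flag identifiers and project IDs are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def flag_pairs(projects: dict) -> Counter:
    """Count how many projects trigger each pair of flags together.

    `projects` maps a project ID to the set of flag names that fired on it.
    """
    pairs = Counter()
    for flags in projects.values():
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(flags), 2):
            pairs[pair] += 1
    return pairs


if __name__ == "__main__":
    portfolio = {
        "P1": {"single_bid", "late_amendment"},
        "P2": {"single_bid", "late_amendment", "missing_justification"},
        "P3": {"single_bid"},
    }
    for pair, count in flag_pairs(portfolio).most_common():
        print(pair, count)
```

Pairs that recur across many projects are candidates for display next to the colour. Which pairs are actually meaningful remains a judgement for the people who know the market.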
3. Test behaviour, not the palette
The traffic light matters only if red changes what people do. The test is documented action within 90 days: an inspection scheduled, a contract suspended, a referral filed, or a contractor required to respond. If red projects do not get faster documented responses across at least three programmes by the end of 2026, the light is decoration.
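The metric behind that test can be stated in a few lines. This sketch assumes each project record carries the date it was flagged and the date of its first documented action, if any. The key names `colour`, `flagged_on`, and `first_action` are hypothetical, and the sample data is invented.

```python
from datetime import date, timedelta
from typing import Optional

WINDOW = timedelta(days=90)  # the 90-day window named in the test


def response_rate(projects: list, colour: str) -> Optional[float]:
    """Share of projects in `colour` with a documented action inside WINDOW."""
    in_band = [p for p in projects if p["colour"] == colour]
    if not in_band:
        return None  # no projects in this band, so no rate to report
    acted = [
        p for p in in_band
        if p["first_action"] is not None
        and p["first_action"] - p["flagged_on"] <= WINDOW
    ]
    return len(acted) / len(in_band)


if __name__ == "__main__":
    sample = [
        {"colour": "Red", "flagged_on": date(2026, 4, 1), "first_action": date(2026, 5, 1)},
        {"colour": "Red", "flagged_on": date(2026, 4, 1), "first_action": None},
        {"colour": "Yellow", "flagged_on": date(2026, 4, 1), "first_action": date(2026, 8, 15)},
    ]
    for band in ("Red", "Yellow", "Green"):
        print(band, response_rate(sample, band))
```

Comparing the rate for red against the rate for the lower bands, programme by programme, is one way to check whether red actually changes what people do.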
What this means if you are building one
If your country programme is scoping a procurement risk interface in the next 12 months, three decisions matter more than the rest. Show the colour first, the score second, and the flag list third. Publish data-completeness in the same row as the colour. Define the action attached to each colour before you define the threshold for each colour: if "investigate within 90 days" is not a workflow your audit office can actually run, the traffic light is not the design that fits your institution.
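The display order can be fixed in the rendering code so that it is not left to each dashboard. Here is a minimal sketch of one row, assuming a plain-text layout. The function name, the separator, and the sample values are illustrative choices, not anything the manual prescribes.

```python
def render_row(project_id: str, colour: str, completeness_pct: int,
               score: int, flags: list) -> str:
    """Render one project row: colour (with completeness), then score, then flags."""
    flag_text = ", ".join(flags) if flags else "none"
    return (
        f"{project_id} | {colour} (data {completeness_pct}% complete)"
        f" | score {score} | flags: {flag_text}"
    )


if __name__ == "__main__":
    print(render_row("P-047", "Orange", 80, 73,
                     ["single-bid award", "late high-value amendment"]))
```

The colour and the completeness figure share one cell, so a reader cannot take in the colour without also seeing how much data stands behind it.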
And put the design on a clock. State the test, state the metric, state the window. A risk interface that no one has agreed to evaluate is an interface that will quietly become wallpaper, regardless of how cleanly the colours render.