Infrastructure transparency portals collect data from multiple government systems: e-procurement platforms, integrated financial management systems (IFMIS), project management databases, and beneficiary registration systems. Connecting these systems to an OC4IDS (Open Contracting for Infrastructure Data Standard, standard.open-contracting.org/infrastructure/) portal is the technical problem that determines whether a transparency initiative succeeds or fails. Based on implementations across Uganda, Kenya, and Nigeria, five integration patterns emerge. Each has different maintenance requirements, data quality outcomes, and failure modes.
Pattern 1: Direct Database Query
The simplest integration: the OC4IDS portal queries the source database directly via a read-only connection, transforming raw records to OC4IDS schema at query time. This pattern works when the source database has a stable, documented schema and a DBA can grant a read-only service account. The Uganda PPDA portal (gpp.ppda.go.ug) uses a variant of this pattern for its IFMIS integration: a scheduled ETL (extract-transform-load) job pulls contract award records from IFMIS nightly and maps them to OCDS fields, which then feed into the broader OC4IDS project record.
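As a rough illustration of the extract-and-map step, the sketch below reads award rows over a database connection and reshapes them into OCDS-style award objects grouped by project. The table `contract_awards`, its column names, and the sample data are hypothetical, as is the use of SQLite in place of a real IFMIS database; the output field names follow the general shape of an OCDS award (`id`, `title`, `date`, `value`, `suppliers`) and should be checked against the published schema before use.

```python
import sqlite3

# Hypothetical source table and columns; a real IFMIS schema will differ.
QUERY = """
SELECT award_no, project_code, title, amount, currency, award_date, supplier_name
FROM contract_awards
ORDER BY award_no
"""


def fetch_awards(conn):
    """Read raw award rows over an existing (ideally read-only) connection."""
    conn.row_factory = sqlite3.Row
    return [dict(row) for row in conn.execute(QUERY)]


def to_ocds_award(row):
    """Map one source row to an OCDS-style award object (assumed mapping)."""
    return {
        "id": str(row["award_no"]),
        "title": row["title"],
        "date": row["award_date"],
        "value": {"amount": float(row["amount"]), "currency": row["currency"]},
        "suppliers": [{"name": row["supplier_name"]}],
    }


def awards_by_project(rows):
    """Group mapped awards under the project code they belong to."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["project_code"], []).append(to_ocds_award(row))
    return grouped


def demo_connection():
    """Build an in-memory stand-in for the source database (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contract_awards (award_no INTEGER, project_code TEXT, "
        "title TEXT, amount REAL, currency TEXT, award_date TEXT, supplier_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO contract_awards VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, "PRJ-001", "Road works lot 1", 1500000, "UGX", "2024-01-10", "Supplier A"),
            (2, "PRJ-001", "Road works lot 2", 900000, "UGX", "2024-01-12", "Supplier B"),
            (3, "PRJ-002", "Bridge design", 400000, "UGX", "2024-02-01", "Supplier C"),
        ],
    )
    return conn
```

Calling `awards_by_project(fetch_awards(demo_connection()))` returns a dictionary keyed by project code. Because the query names every column explicitly, a renamed or dropped source column raises an error instead of silently producing empty fields.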
| Pattern | Technical Complexity | Maintenance Risk | Best For |
|---|---|---|---|
| 1. Direct database query (ETL) | Low–Medium | Medium: schema changes break it | Stable IFMIS with documented schema |
| 2. API-to-API (source system API) | Medium | Low: API versioning protects the consumer | Modern e-procurement systems (e.g. PPDA API) |
| 3. CSV/file drop (scheduled export) | Low | High: manual step in the chain | Legacy systems with no API or DB access |
| 4. Middleware broker | High | Low: decoupled from source system | Multiple heterogeneous source systems |
| 5. Manual data entry (last resort) | None | Very high: human compliance required | Only where no other option exists |
Pattern 2: API-to-API Integration
The source system exposes a REST API; the OC4IDS portal polls it on a schedule, transforms responses to OC4IDS schema, and persists records. This is the most maintainable pattern when the source system has a versioned API. The Open Contracting Partnership's OCDS tools (standard.open-contracting.org/latest/en/guidance/build/) provide reference implementations for consuming procurement APIs and mapping to OCDS format, which then feeds the OC4IDS procurement phase. The Kaduna State Infrastructure Data Portal (ipdata.kdsg.gov.ng) uses API polling from the state's procurement system to populate its OC4IDS project records.
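The poll, transform, and persist cycle might look something like the sketch below. It is not drawn from any of the portals named above: the `updated_since` query parameter, the response fields (`project_id`, `project_name`, `updated_at`, `tenders`), and the in-memory `store` are all assumptions made for illustration. The output loosely follows the OC4IDS project shape, with each linked process listed under `contractingProcesses`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_fetch(base_url, since):
    """Fetch items changed since a timestamp from a hypothetical REST endpoint."""
    url = f"{base_url}?{urlencode({'updated_since': since})}"
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def to_project_record(item):
    """Map one API item to an OC4IDS-style project record (assumed mapping)."""
    return {
        "id": item["project_id"],
        "title": item["project_name"],
        "contractingProcesses": [
            {"id": t["ocid"], "summary": {"ocid": t["ocid"], "title": t["title"]}}
            for t in item.get("tenders", [])
        ],
    }


def poll_once(fetch, since, store):
    """Run one polling cycle and return the cursor to use for the next one.

    `fetch` is any callable taking the cursor and returning a list of items,
    so the HTTP call can be swapped out for a stub.
    """
    items = fetch(since)
    for item in items:
        record = to_project_record(item)
        store[record["id"]] = record  # upsert keyed by project id
    # Assumes timestamps share one ISO 8601 format, so string order is time order.
    return max((item["updated_at"] for item in items), default=since)
```

In production, `fetch` would wrap `http_fetch` with the real base URL, and `store` would be a database table rather than a dictionary. Persisting the returned cursor between runs keeps each poll incremental.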
Pattern 3: Scheduled File Drop
| Risk Factor | Observed Failure Mode | Mitigation |
|---|---|---|
| Manual export dependency | Export skipped when the responsible officer changes (pattern documented by CoST Uganda, 2019) | Automate export via cron job if system permits |
| File format drift | Column names change silently between exports | Schema validation step in the ingestion pipeline |
| No delivery confirmation | Failed drops are invisible until data audit | Monitoring alert on file age in SFTP endpoint |
| Staff turnover | Named data steward leaves; no backup | Embed in civil service job description, not project role |
When a source system cannot expose a database connection or API, a scheduled CSV export, dropped to an SFTP endpoint or shared drive, is the fallback. The OC4IDS portal picks up the file on a schedule, validates the structure, and transforms it. This pattern introduces a manual step: a civil servant must run the export. CoST Uganda's 2019 Synthesis Report (infrastructuretransparency.org/programmes/uganda/) documented the predictable outcome of this dependency: file drops became irregular when the responsible officer changed. File drop integrations require a named, backed-up data steward role, not a project position.
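Two of the mitigations in the table, schema validation and a file-age alert, can be sketched in a few lines. The expected column names and the 36-hour threshold below are placeholders chosen for illustration, not values taken from any of the cited implementations.

```python
import csv
import io

# Hypothetical agreed export layout; replace with the documented column list.
EXPECTED_COLUMNS = ["project_code", "contract_ref", "amount", "currency", "award_date"]


def check_header(csv_text, expected=EXPECTED_COLUMNS):
    """Return (missing, unexpected) column names found in the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    missing = [col for col in expected if col not in header]
    unexpected = [col for col in header if col not in expected]
    return missing, unexpected


def is_stale(file_mtime, now, max_age_hours=36):
    """True when the newest drop is older than the allowed window.

    Both timestamps are seconds since the epoch, e.g. from os.path.getmtime()
    and time.time().
    """
    return (now - file_mtime) > max_age_hours * 3600
```

An ingestion job could reject a file when `check_header` reports any missing column, and a separate monitor could raise an alert when `is_stale` returns true for the most recent file. Either check turns a silent failure into a visible one.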
Pattern 4: Middleware Broker
A middleware layer sits between source systems and the OC4IDS portal, normalising data from multiple heterogeneous sources into a common schema. The World Bank GovTech Maturity Index 2022 (worldbank.org/en/programs/govtech) documents this pattern as common in low-income country implementations where procurement, financial, and project management systems run on incompatible platforms. The middleware layer accepts outputs in whatever format the source system can produce (database dump, CSV, XML, proprietary API) and emits standardised OC4IDS-compatible records. This architectural decoupling means that source system upgrades or replacements do not break the portal: only the affected adapter inside the middleware has to change.
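One way to picture the broker is as a set of per-format adapters that all emit the same record shape. The sketch below assumes two made-up source layouts (a CSV with `project_code` and `name` columns, and an XML feed of `<project code="...">` elements) and a deliberately tiny common record; a real broker would target the full OC4IDS project schema.

```python
import csv
import io
import xml.etree.ElementTree as ET


def from_csv(payload):
    """Adapter for a hypothetical CSV export with project_code and name columns."""
    for row in csv.DictReader(io.StringIO(payload)):
        yield {"id": row["project_code"], "title": row["name"], "source": "csv"}


def from_xml(payload):
    """Adapter for a hypothetical XML feed of <project code="..."> elements."""
    root = ET.fromstring(payload)
    for node in root.findall("project"):
        yield {"id": node.get("code"), "title": node.findtext("title"), "source": "xml"}


# The portal only ever sees the common record shape; each source gets one adapter.
ADAPTERS = {"csv": from_csv, "xml": from_xml}


def normalise(fmt, payload):
    """Route a payload to its adapter and return common-schema records."""
    try:
        adapter = ADAPTERS[fmt]
    except KeyError:
        raise ValueError(f"no adapter registered for format {fmt!r}") from None
    return list(adapter(payload))
```

When a source system is replaced, only its entry in `ADAPTERS` changes; the portal-facing output of `normalise` keeps the same shape.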
Pattern 5: Manual Entry (Last Resort)
| Time Period | Typical Compliance Rate | Trigger |
|---|---|---|
| Launch (0–3 months) | High: motivated team, donor visibility | Initial project energy |
| Mid-project (3–12 months) | Declining: officers prioritise other duties | No enforcement mechanism |
| Audit period | Temporary spike: retroactive entry | External pressure |
| Post-project | Near zero: no dedicated resource | Project team disbanded |
When no automated integration is technically feasible, a manual entry form is the only option. OC4IDS portals built on this pattern always fail eventually. CoST Uganda's assurance reports consistently documented the same pattern: manual entry compliance dropped within months of launch, recovered during audit periods, and declined again between audits. Manual entry is not a transparency system; it is compliance theatre that produces unreliable data. If a government agency cannot automate any of the first four patterns, the correct answer is to document why and set a deadline for automation rather than to build a manual portal.
The Counterargument: Automation Creates Brittleness
Critics argue that automated integrations create a different failure mode: brittleness. When the source system upgrades its schema or the API changes version, the automated pipeline silently breaks. Manual entry at least fails visibly: an officer who stops entering data is noticeable. A broken ETL job that inserts stale data undetected is worse than no data at all. The Uganda PPDA example, critics note, works because PPDA has institutional continuity. Most governments do not.
This objection is valid but mislocated. Brittleness is not a property of automation; it is a property of undocumented automation. The World Bank GovTech Maturity Index 2022 (worldbank.org/en/programs/govtech) documents that e-transparency implementations with the longest operational lives shared a common feature: schema change management built into the integration layer, not the source system. A middleware broker (Pattern 4) absorbs source system changes precisely because it decouples the portal from the source. ETL jobs that run against a documented, stable schema (Pattern 1) are no more brittle than the database they query. The real risk is not automation itself but the project-team assumption that the integration will maintain itself. OC4IDS implementations built on the Uganda PPDA model institutionalise the integration as infrastructure, not as a project deliverable. That distinction, not the choice between manual and automated entry, determines whether the system survives its first budget cycle.
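Schema change management in the integration layer can start as a pre-flight check that compares the live source schema with the documented one and stops the job loudly when they diverge. The sketch below uses SQLite's `PRAGMA table_info` as a stand-in for whatever catalogue query the real source database offers. The `DOCUMENTED` mapping and the `SchemaDrift` exception are hypothetical names introduced for illustration.

```python
import sqlite3

# Hypothetical documented schema for the source table: column name -> declared type.
DOCUMENTED = {"award_no": "INTEGER", "title": "TEXT", "amount": "REAL"}


class SchemaDrift(Exception):
    """Raised when the live source schema no longer matches the documented one."""


def check_schema(conn, table, documented=DOCUMENTED):
    """Compare the live table definition with the documented one.

    The table name is interpolated into the statement, so it must come from
    trusted configuration, never from user input.
    """
    actual = {
        row[1]: row[2].upper()  # PRAGMA table_info rows: (cid, name, type, ...)
        for row in conn.execute(f"PRAGMA table_info({table})")
    }
    missing = sorted(set(documented) - set(actual))
    retyped = sorted(
        col for col in documented if col in actual and actual[col] != documented[col]
    )
    if missing or retyped:
        raise SchemaDrift(f"{table}: missing={missing} retyped={retyped}")
    return True
```

Running this check before each load turns a silent break into a failed job that someone is alerted to, which addresses the critics' point that manual entry at least fails visibly.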
Choosing the Right Pattern
The decision depends on three factors: the source system's technical capabilities (database access, API availability), the IT capacity of the implementing team (can they maintain a middleware broker?), and the data stewardship model (is there a dedicated data officer, or will the integration depend on a project team?). A direct database query with a single dedicated DBA is more reliable in practice than a sophisticated middleware broker maintained by a rotating project team. Automation solves the technical barrier. Institutional design (dedicated roles with budget lines) solves the maintenance barrier. Both are required for any pattern to function beyond the initial project cycle.
Playbook
Decision Table
| Option | When to Use | Tradeoff |
|---|---|---|
| Adopt immediately | Low-risk process and clear team ownership | Fast progress, limited validation runway |
| Pilot first | Uncertain data quality or mixed institutional capacity | Slower scale-up, higher confidence |
| Defer pending controls | Missing governance, QA, or monitoring guardrails | Lower short-term output, better long-term durability |
Execution Checklist
Failure Modes
- Skipping the section "Pattern 1: Direct Database Query" during implementation.
- Skipping the section "Pattern 2: API-to-API Integration" during implementation.
- Skipping the section "Pattern 3: Scheduled File Drop" during implementation.