Codes methodology

How BuildCalc API curates US construction code metadata (IRC + IBC + NEC + IECC + IPC) — the legal framework, hand-curation process, dual-edition strategy, and 6 known limitations every caller should understand.

This page documents how /v1/codes/* endpoints produce their data, the legal posture that constrains the scope, and the limitations every caller should surface to end users.

If you arrived here from a response carrying amendment_notes: "Source: ICC Master I-Code Adoption Chart (Jan 2024 snapshot)..." or noticed the codes vertical does not return verbatim section text, this is the canonical reference for why those design choices were made.

TL;DR

  • Metadata only, no text — every section row carries section_number, section_title, chapter, parent_section_number, topic_tags, plus viewer_deep_link to the publisher's free portal (ICC Digital Codes, NFPA Link). The full code body text is NEVER republished — neither verbatim nor paraphrased.
  • State amendments are flagged, not summarized — every adoption row carries has_amendments BOOLEAN plus a short amendment_notes string; the substantive amendment text is NOT reproduced.
  • Two adoption sources: codecheck.com (NEC + IRC, quarterly refresh) and the ICC Master I-Code Adoption Chart, Jan 2024 snapshot (IBC + IPC + IECC, annual snapshot). Each row carries the source in its amendment_notes or source_url so auditors can trace provenance.
  • Five code families seeded — IRC (dual edition 2021 default + 2024 latest), IBC 2024, NEC (dual edition 2023 default + 2026 latest), IECC 2024, IPC 2024. 937 hand-curated sections total.
  • 52 US jurisdictions tracked — 50 states + DC + PR.

Per ADR-0014, the codes vertical operates under a deliberately conservative legal posture — short factual labels and deep links only, no verbatim or paraphrased code text. The framework draws from three doctrines:

1. Veeck v. SBCCI — code-as-law publicly accessible

Veeck v. Southern Building Code Congress International, Inc., 293 F.3d 791 (5th Cir. 2002), and ASTM v. Public.Resource.Org, 597 F. Supp. 3d 213 (D.D.C. 2022), establish that code text incorporated by law is publicly accessible. Once a jurisdiction adopts an ICC or NFPA code as its mandatory minimum, the code text becomes the law itself — which cannot be copyrighted.

However, Veeck does NOT authorize republishing:

  • Copyrighted formatting (tables, indices, cross-reference layouts)
  • Editorial commentary or interpretation
  • Paraphrased summaries that bundle creative selections

The codes vertical respects this boundary: we ship facts (section numbers + titles + chapter structure + adoption status) and link to the publisher's portal for the full text.

2. Feist Publications — facts are not copyrightable

Feist Publications, Inc. v. Rural Telephone Service Co., 499 U.S. 340 (1991): facts cannot be copyrighted. Section numbers are facts. Section titles ("Stair Treads and Risers") are short factual labels — Feist applies, no copyright concern. Chapter numbers, parent-section relationships, adoption-by-jurisdiction status — all facts.

3. CFAA — public data, no auth bypass

All codes vertical sources are publicly accessible. codecheck.com is a free public website. The ICC Master I-Code Adoption Chart is a publicly downloadable PDF. We don't bypass authentication, paywalls, or rate limits.

Following the T0-T4 framework established in ADR-0015 (products vertical) and inherited by ADR-0016 (benchmarks vertical):

| Tier | Definition | Codes vertical inclusion |
| --- | --- | --- |
| T0 | Federal public-domain (USG works §105) | None: ICC and NFPA are private standards-developing organizations |
| T1 | Federal-recognized cert registry, ToS permits research | None |
| T2 | Mfr-published catalog PDF | ICC Master I-Code Adoption Chart PDF (publicly downloadable factual table), used for adoption metadata only, never the underlying code text |
| T3 | Private paywalled report | Excluded. DO NOT access: ICC Digital Codes paid subscriptions, NFPA-fee tiers, RSMeans, Construction Specifications Institute (CSI) paywalls |
| T4 | Publicly-published industry guides with copyright | Same exclusions as T3 |

ICC and NFPA's free-access viewers (ICC Digital Codes free read-only, NFPA Link free-access portal) are linked from viewer_deep_link — we link, we don't copy.

What we ship per section

For every section in the 5 code families we store ONLY:

| Field | Risk | Included? | Why |
| --- | --- | --- | --- |
| section_number | none | yes | Fact (Feist applies) |
| section_title | low | yes | Short factual label (Feist applies) |
| chapter | none | yes | Fact |
| parent_section_number | none | yes | Fact (hierarchy of the document) |
| topic_tags | none | yes | BuildCalc's own taxonomy (kebab-case, fixed vocabulary) |
| viewer_deep_link | none | yes | Linking is not copying |
| source_url | none | yes | Linking |
| Verbatim section text | high | NO | Would invite ICC/NFPA takedown even if defensible |
| Paraphrased summaries | med | NO | Deferred to Phase 2 (post-revenue, after legal review) |
| State amendment text | med | NO | Same posture as base code text |

The viewer_deep_link field points to the publisher's free preview viewer. Users who need the full text follow the link; we don't redistribute it.
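As a rough illustration of the row shape described above, the sketch below models one section record as a Python dataclass. Only the field names come from this page; the class name SectionRow, the field types, the kebab-case check, and the placeholder URLs are assumptions made for illustration, not the actual schema.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Kebab-case: lowercase alphanumeric words joined by single hyphens.
_KEBAB = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


@dataclass(frozen=True)
class SectionRow:
    """Hypothetical model of one section row: metadata only, no code text."""

    section_number: str
    section_title: str
    chapter: str                          # type is an assumption
    parent_section_number: Optional[str]  # None for top-level sections
    viewer_deep_link: str
    source_url: str
    topic_tags: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject tags that are not kebab-case, per the fixed-vocabulary note.
        bad = [t for t in self.topic_tags if not _KEBAB.match(t)]
        if bad:
            raise ValueError(f"topic_tags must be kebab-case, got: {bad}")


# Example values; the URLs are placeholders, not real viewer links.
row = SectionRow(
    section_number="R311.7.5",
    section_title="Stair Treads and Risers",
    chapter="3",
    parent_section_number="R311.7",
    viewer_deep_link="https://example.invalid/viewer/irc-2021/R311.7.5",
    source_url="https://example.invalid/source",
    topic_tags=["egress", "residential"],
)
```

Note that the model deliberately has no field for section body text or amendment text, which mirrors the "NO" rows in the table.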

Editions and dual-edition strategy

IRC — dual edition (2021 default + 2024 latest)

IRC 2024 substantially renumbered chapter 3 — most notably "Means of Egress" moved from R311 to R318, taking the stairway sub-sections with it (R311.7 → R318.7). Our calculators (stairs.py, doors_windows.py) cite the 2021 numbering, which is also the dominant adoption in 2026 (~40-50% of US jurisdictions).

To keep calc citations aligned with reality AND give early-adopting jurisdictions a path forward:

  • IRC 2021 is the default edition (matches calculator citations and most jurisdictions).
  • IRC 2024 is also seeded under the same irc code id (the edition column distinguishes rows).
  • app/codes/calc_links.py contains entries for both numbering schemes so AI agents querying either edition land on the same calculator endpoints.

Clients pin to 2024 via ?edition=2024; default stays 2021.
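A minimal sketch of edition pinning from the client side, assuming the section lookup path documented on this page. The host api.example.invalid and the helper name section_url are placeholders; the real base URL is not stated here.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Placeholder host; the real API base URL is not given on this page.
BASE_URL = "https://api.example.invalid"


def section_url(code_id: str, section_number: str, edition: Optional[str] = None) -> str:
    """Build a section lookup URL, optionally pinned to an edition."""
    path = f"/v1/codes/{quote(code_id, safe='')}/sections/{quote(section_number, safe='')}"
    # Omit the query string entirely to get the server-side default edition.
    query = f"?{urlencode({'edition': edition})}" if edition else ""
    return f"{BASE_URL}{path}{query}"


default_url = section_url("irc", "R311.7")                 # default edition (2021)
pinned_url = section_url("irc", "R318.7", edition="2024")  # pinned to IRC 2024
```

The two calls use different section numbers on purpose: the stairway sub-sections live under R311.7 in the 2021 numbering and under R318.7 in the 2024 numbering.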

NEC — dual edition (2023 default + 2026 latest)

NFPA 70-2026 (published Oct 10, 2025) is the largest structural overhaul since 1975:

  • Article 220 → Article 120 (load calculations moved from Chapter 2 to Chapter 1). Every 220.x section renumbered to 120.x: 220.12 → 120.12, 220.42 → 120.42, 220.82 → 120.82.
  • Article 100 consolidated all definitions (no more x.2 article-local definitions).
  • Article 404 snap-switch rules moved to 406.30 / 406.32.

NEC 2023 stays the default edition (23 of the 49 tracked jurisdictions are on it; NEC 2026 has near-zero state adoption yet given its Oct 2025 publication). calc_links.py carries entries for both numbering schemes so /v1/calc/electrical/panel-load returns the right citation regardless of which edition the client pins to.
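The Article 220 to Article 120 renumbering can be sketched as a small translation helper. This is not the contents of calc_links.py; the function name nec_2023_to_2026 is hypothetical, and the sketch models only the 220.x to 120.x rule stated above. Other 2026 moves (such as the Article 404 snap-switch rules) are left unchanged by it.

```python
import re

# Matches "220" alone or "220.<digits>..." but not, e.g., "2200.1".
_ARTICLE_220 = re.compile(r"^220(\.\d+.*)?$")


def nec_2023_to_2026(section_number: str) -> str:
    """Translate an NEC 2023 section number to its 2026 number.

    Only the Article 220 -> Article 120 rule is modeled; any other
    section number is returned unchanged.
    """
    match = _ARTICLE_220.match(section_number)
    if match is None:
        return section_number
    return "120" + (match.group(1) or "")


for old in ("220.12", "220.42", "220.82"):
    print(old, "->", nec_2023_to_2026(old))
```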

IBC / IECC / IPC — single edition (2024)

Chapter structures were stable across the most-recent revisions. v1 ships only IBC 2024, IECC 2024, IPC 2024.

Adoption matrix sources

Two complementary sources feed code_adoptions (the per-jurisdiction adoption status table):

| Source | Codes covered | Refresh | Cron |
| --- | --- | --- | --- |
| codecheck.com/code-adoption-by-state | NEC + IRC | Quarterly | Render crn-d85h7l7aqgkc73bchgpg quarterly (1st of Jan/Apr/Jul/Oct, 06:00 UTC) |
| ICC Master I-Code Adoption Chart (Jan 2024 PDF) | IBC + IPC + IECC | Snapshot | Same cron; parses the PDF in the same run |

The cron scripts/scrape_code_adoption.py runs both sources in one pass. Failure of either source is now fatal (per 2026-05-18 post-launch punch list #17): losing IBC/IPC/IECC silently is exactly the "no alarm on URL break" gap that QA flagged. The next quarterly run surfaces any parse error via Sentry email + Render's deploy-status webhook within minutes of the cron firing.

codecheck.com — single source documented

codecheck.com is currently the only free, easily-machine-readable source we've identified for per-state NEC + IRC adoption + amendment notes:

  • NFPA's own adoption tracker isn't crawl-friendly.
  • ICC doesn't track NEC.
  • State Secretaries of State publish in inconsistent formats.

We accept this as a single point of failure for v1, with bounded exposure:

  • The code_adoptions table already holds the last successful scrape result. Production reads from the DB, not from codecheck on each request.
  • A codecheck outage pauses fresh updates; it does not make existing data go missing.
  • Worst case before manual mitigation: ~3 months of stale NEC/IRC data, since the cron runs quarterly and data only refreshes on successful runs.
  • Stale data is still labelled with has_amendments boolean and amendment_notes, so consumers know to re-verify time-sensitive questions against the jurisdiction's own publication.

If codecheck disappears (domain shutdown, paywall, or use-policy change), documented recovery options in priority order:

  1. Wait. Outages are often brief, and maintainers frequently restore the site quickly. The cron alarms via Sentry within minutes of the next failed run.
  2. Find a replacement crawler target — UpCodes, NFPA, IAEI, or NAHB are candidate fallback sources identified for evaluation; an internal playbook documents trigger conditions and migration steps.
  3. Pay for an authoritative feed — NFPA's commercial adoption tracker or RSMeans cross-reference. Triggered by ≥$500 MRR per the pay-as-you-grow tier matrix.

ICC PDF — pinned to Jan 2024 snapshot

The ICC Master I-Code Adoption Chart is a publicly downloadable PDF that ICC refreshes annually. We pin to Master-I-Code-Adoption-Chart-1.pdf (the Jan 2024 snapshot; the URL itself is stable because ICC numbers the chart, not the year).

When ICC publishes a newer chart (e.g., Master-I-Code-Adoption-Chart-2.pdf), the OLD URL will either 404 (fatal in our scraper — surfaces immediately) or keep serving stale data forever (silent — not detectable by the script). Annual manual mitigation: skim https://www.iccsafe.org/wp-content/uploads/ for a newer chart number each January and swap _ICC_CHART_URL accordingly.
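For the annual manual check, a small helper can compute which filename to look for next. This sketch assumes ICC increments the numeric suffix by one, which this page offers only as an example; the helper name is hypothetical, and it performs no network access.

```python
import re

_CHART = re.compile(r"^(?P<stem>Master-I-Code-Adoption-Chart-)(?P<num>\d+)\.pdf$")


def next_chart_filename(current: str) -> str:
    """Return the filename with the chart number incremented by one."""
    match = _CHART.match(current)
    if match is None:
        raise ValueError(f"unrecognized chart filename: {current!r}")
    return f"{match.group('stem')}{int(match.group('num')) + 1}.pdf"


candidate = next_chart_filename("Master-I-Code-Adoption-Chart-1.pdf")
print(candidate)
```

If a file with the candidate name exists in the uploads directory, that is the cue to review it and swap _ICC_CHART_URL by hand.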

ICC adoption cycles run 3-6 years per state, so an annual snapshot is mostly accurate but not perfect. Rows from the ICC PDF carry an explicit staleness note in amendment_notes:

"Source: ICC Master I-Code Adoption Chart (Jan 2024 snapshot). Re-verify against jurisdiction publications for time-sensitive use."

Hand-curated section seeding

Section data does NOT come from an API or from scraping. The sections for the 5 codes are hand-curated into 7 seed JSON files in app/codes/seed_data/:

  • irc_2021.json — ~180 sections
  • irc_2024.json — ~210 sections
  • ibc_2024.json — ~180 sections
  • nec_2023.json — ~98 sections
  • nec_2026.json — ~107 sections (added 2026-05-18 post-publication)
  • iecc_2024.json — ~80 sections
  • ipc_2024.json — ~60 sections

Total: 937 hand-curated sections. Curation drew on the publishers' free viewers + cross-references from app/calculators/*.py to ensure every calculator's code_reference resolves to a real section in our table.
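One way to cross-check the per-file counts against the stated total is to count entries in each seed file. The sketch below assumes each file is a top-level JSON array of section objects; the actual file layout is not documented on this page, so treat that as a guess.

```python
import json
from pathlib import Path
from typing import Dict


def count_sections(seed_dir: Path) -> Dict[str, int]:
    """Map each *.json file name in seed_dir to its number of entries."""
    counts: Dict[str, int] = {}
    for path in sorted(seed_dir.glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        # Assumed layout: a top-level JSON array, one object per section.
        if not isinstance(data, list):
            raise ValueError(f"{path.name}: expected a JSON array of sections")
        counts[path.name] = len(data)
    return counts
```

Calling count_sections(Path("app/codes/seed_data")) and summing the values would give a figure to compare with the total stated above.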

Why hand-curated and not scraped: the publishers' free viewers are designed for human browsing, not bulk extraction, and their use policies reflect that. Hand-curating a focused subset (sections that calculators cite, plus high-traffic reference sections) respects the boundary while still giving AI agents a stable factual lookup.

State amendments — flag-only in v1

Every code_adoptions row carries:

  • has_amendments BOOLEAN (default FALSE)
  • amendment_notes TEXT — free-form short string (e.g., "California amends via Title 24 Part 2.5")

v1 ships NO verbatim or paraphrased amendment text. Amendments are flagged, not summarized, so that AI agents know to redirect the user to the jurisdiction's own publication site. This is the same legal posture as the base code text: paraphrased amendment summaries are deferred to Phase 2 (post-revenue), gated by:

  • Legal review of fair-use posture
  • Per-jurisdiction copyediting (paraphrasing is creative work; can't be done purely by LLM without supervision)
  • Trigger: when customer count exceeds 50 paid AND ≥5 explicit requests reach [email protected]

The endpoint GET /v1/codes/{code_id}/sections/{section_number} returns related_calc_kinds, a list of /v1/calc/* paths whose formulas are grounded in that section. The mapping lives in app/codes/calc_links.py as a static dict — 30+ entries hand-curated from the citations already present in app/calculators/*.py.

This is purely additive: it lets an AI agent pivot from "what does NEC 220.82 require?" to "let me run the panel-load calculator that implements those requirements" in a single conversation. Every calculator response returns a code_reference (e.g., "IRC R311.7.5") which maps back to a section here, and that section's related_calc_kinds points back to the calc endpoints. AI agents can pivot in either direction.
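The two-way pivot can be sketched with a static dict and a reverse lookup. The key shape (code_id, edition, section_number) and the helper names are assumptions; the two entries are built only from the section numbers and the panel-load path mentioned on this page, and the real mapping in app/codes/calc_links.py may be structured differently.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

SectionKey = Tuple[str, str, str]  # (code_id, edition, section_number)

# Illustrative entries only; one per NEC numbering scheme.
CALC_LINKS: Dict[SectionKey, List[str]] = {
    ("nec", "2023", "220.82"): ["/v1/calc/electrical/panel-load"],
    ("nec", "2026", "120.82"): ["/v1/calc/electrical/panel-load"],
}


def related_calc_kinds(code_id: str, edition: str, section_number: str) -> List[str]:
    """Section -> calculator paths (empty list when nothing is linked)."""
    return list(CALC_LINKS.get((code_id, edition, section_number), []))


def sections_for_calc(calc_path: str) -> List[SectionKey]:
    """Calculator path -> sections, by inverting the static dict."""
    reverse: Dict[str, List[SectionKey]] = defaultdict(list)
    for key, paths in CALC_LINKS.items():
        for path in paths:
            reverse[path].append(key)
    return sorted(reverse.get(calc_path, []))


print(related_calc_kinds("nec", "2026", "120.82"))
print(sections_for_calc("/v1/calc/electrical/panel-load"))
```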

The 6 known limitations

The legal posture is defensible; the framing is the honest part. Every codes endpoint response is bounded by these caveats.

1. No verbatim or paraphrased section text

By design (per ADR-0014). To read the full text of a section, follow viewer_deep_link to the publisher's free portal (ICC Digital Codes, NFPA Link). An AI agent that needs to answer "what does IRC R311.7.5 actually say?" must call out to the linked viewer or admit the limitation.

Expected impact: AI agents grounding answers in our codes vertical will cite section numbers correctly but cannot synthesize the prescriptive requirements from this data alone.

2. State amendments flagged but not summarized

has_amendments: true tells you California (or whoever) modifies the adopted code, and amendment_notes gives you a 1-line breadcrumb. The substantive amendment text is NOT here. Time-sensitive questions (permit applications, compliance reviews) must re-verify against the jurisdiction's own publication.

Expected impact: AI agents must caveat any amendment-related answer with "verify with the state's published amendments" when has_amendments: true is in the row.
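A client-side sketch of how an agent might turn an adoption row into user-facing caveats. The helper name, the caveat wording, and the dict-shaped row are assumptions; the field names and the ICC note prefix come from this page.

```python
from typing import List, Mapping

# Prefix of the staleness note carried by rows sourced from the ICC PDF.
ICC_SNAPSHOT_PREFIX = "Source: ICC Master I-Code Adoption Chart"


def caveats_for(adoption_row: Mapping[str, object]) -> List[str]:
    """Return the caveats an agent should attach to an answer."""
    caveats: List[str] = []
    if adoption_row.get("has_amendments") is True:
        caveats.append("Verify with the state's published amendments.")
    notes = adoption_row.get("amendment_notes") or ""
    if isinstance(notes, str) and notes.startswith(ICC_SNAPSHOT_PREFIX):
        caveats.append(
            "Adoption status comes from a dated ICC snapshot; "
            "re-verify against jurisdiction publications."
        )
    return caveats


example = caveats_for(
    {"has_amendments": True, "amendment_notes": "California amends via Title 24 Part 2.5"}
)
print(example)
```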

3. ICC PDF snapshot is Jan 2024 — 16+ months stale

The ICC Master I-Code Adoption Chart we parse is pinned to the Jan 2024 snapshot. ICC adoption cycles run 3-6 years per state, so most rows are still accurate, but a state that adopted a new edition in 2025 is NOT reflected for IBC/IPC/IECC. Rows from the ICC PDF carry an explicit staleness note in amendment_notes.

Expected impact: state-by-state IBC/IPC/IECC adoption status may lag actual adoption by 16 months or more. The quarterly cron refresh does NOT improve this because the source PDF itself is refreshed only annually.

Mitigation: annual manual swap to newer ICC chart number when published (next expected: Jan 2026 ICC release).

4. codecheck.com single-source dependency

NEC + IRC state adoption + amendment notes depend on a single source (codecheck.com). If codecheck disappears, fresh updates pause until manual mitigation. The code_adoptions table holds the last successful scrape result, so production reads aren't affected immediately — but stale data accumulates.

Expected impact: worst case ~3 months of stale NEC/IRC adoption data before manual mitigation. Sentry alarms within minutes of the next failed cron run.

Mitigation: backup candidate sources (UpCodes, NFPA, IAEI, NAHB) identified with documented migration playbook. Activated only if codecheck primary fails.

5. Section coverage is curated, not exhaustive

937 sections is a focused subset, NOT every section in every code. Curation prioritized:

  • Sections that calculators cite (via related_calc_kinds)
  • High-traffic reference sections (egress, GFCI, structural minimums)
  • Frequently-asked sections from the BuildCalc Pro user base

If you query /v1/codes/irc/sections/R602.10.4 and get 404, that's not a bug — it means we haven't curated that section yet. Open an issue or email [email protected] if you need a specific section added.

Expected impact: AI agents may need to gracefully handle 404s on section lookups and either pivot to a related section in the same chapter or surface the gap to the user.
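One possible shape for graceful 404 handling is sketched below. Walking up to parent section numbers is just one interpretation of "a related section", assumed here for illustration. The fetch callable is injected so the sketch does not depend on any particular HTTP library, and all names are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# fetch(path) -> (status_code, parsed_json_body_or_None)
Fetch = Callable[[str], Tuple[int, Optional[Dict[str, Any]]]]


def parent_candidates(section_number: str) -> List[str]:
    """'R602.10.4' -> ['R602.10', 'R602'] (nearest ancestor first)."""
    parts = section_number.split(".")
    return [".".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


def lookup_with_fallback(fetch: Fetch, code_id: str, section_number: str) -> Dict[str, Any]:
    """Try the exact section, then its ancestors; report a gap if none exist."""
    for candidate in [section_number] + parent_candidates(section_number):
        status, body = fetch(f"/v1/codes/{code_id}/sections/{candidate}")
        if status == 200 and body is not None:
            return {"found": candidate, "exact": candidate == section_number, "section": body}
        if status != 404:
            # Only 404 means "not curated yet"; anything else is a real error.
            raise RuntimeError(f"unexpected status {status} for {candidate}")
    return {"found": None, "exact": False, "section": None}
```

When "exact" is False the caller should tell the user that the requested section is not curated and that a nearby section is being shown instead; when "found" is None, surface the gap directly.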

6. Topic-tag taxonomy is BuildCalc-authored

topic_tags values are drawn from a fixed vocabulary that BuildCalc designed. Axis-1: structural, electrical, plumbing, mechanical, energy, fire-protection, egress, accessibility, envelope, interior. Axis-2: residential, commercial, assembly, etc.

This taxonomy is NOT ICC- or NFPA-endorsed. Two sections that ICC groups under different chapters may share a topic_tag here, and vice-versa. The taxonomy is documented in topic_tags.md.

Expected impact: filtering by ?topic=stairs returns sections we tagged with stairs, not the official ICC chapter-3-stairs subset. Use ?chapter= for chapter-aligned filtering.

Refresh cadence

| Component | Cadence | Source |
| --- | --- | --- |
| Section seed data | Manual (re-seed on new edition release) | Hand-curated JSON in app/codes/seed_data/ |
| Adoption matrix (NEC + IRC) | Quarterly (1st of Jan/Apr/Jul/Oct) | codecheck.com via scripts/scrape_code_adoption.py |
| Adoption matrix (IBC + IPC + IECC) | Same quarterly cron | ICC Master I-Code Adoption Chart PDF (pinned URL) |

Cron crn-d85h7l7aqgkc73bchgpg (Render). Both sources must succeed for the cron to exit rc=0; either failing surfaces via Sentry + Render deploy-status webhook.
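To reason about how long adoption data may sit before the next refresh attempt, the quarterly schedule (1st of Jan/Apr/Jul/Oct at 06:00 UTC) can be computed client-side. This is an illustrative helper with a hypothetical name, not part of the cron itself.

```python
from datetime import datetime, timezone

QUARTER_MONTHS = (1, 4, 7, 10)


def next_quarterly_run(now: datetime) -> datetime:
    """Next 1st-of-quarter 06:00 UTC strictly after `now` (timezone-aware)."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now = now.astimezone(timezone.utc)
    # Checking this year and the next always yields a future candidate.
    for year in (now.year, now.year + 1):
        for month in QUARTER_MONTHS:
            candidate = datetime(year, month, 1, 6, 0, tzinfo=timezone.utc)
            if candidate > now:
                return candidate
    raise AssertionError("unreachable: next year always has a later run")


print(next_quarterly_run(datetime(2026, 5, 18, tzinfo=timezone.utc)).isoformat())
```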
