Codes methodology
How BuildCalc API curates US construction code metadata (IRC + IBC + NEC + IECC + IPC) — the legal framework, hand-curation process, dual-edition strategy, and 6 known limitations every caller should understand.
This page documents how /v1/codes/* endpoints produce their data, the
legal posture that constrains the scope, and the limitations every caller
should surface to end users.
If you arrived here from a response carrying amendment_notes: "Source: ICC Master I-Code Adoption Chart (Jan 2024 snapshot)..." or noticed the codes
vertical does not return verbatim section text, this is the canonical
reference for why those design choices were made.
TL;DR
- Metadata only, no text: every section row carries section_number, section_title, chapter, parent_section_number, topic_tags, plus viewer_deep_link to the publisher's free portal (ICC Digital Codes, NFPA Link). The full code body text is NEVER republished, neither verbatim nor paraphrased.
- State amendments are flagged, not summarized: every adoption row carries has_amendments BOOLEAN plus a short amendment_notes string; the substantive amendment text is NOT reproduced.
- Two adoption sources: codecheck.com (NEC + IRC, quarterly refresh) and the ICC Master I-Code Adoption Chart Jan 2024 snapshot (IBC + IPC + IECC, annual snapshot). Each row carries the source in its amendment_notes or source_url so auditors can trace provenance.
- Five code families seeded: IRC (dual edition 2021 default + 2024 latest), IBC 2024, NEC (dual edition 2023 default + 2026 latest), IECC 2024, IPC 2024. 937 hand-curated sections total.
- 52 US jurisdictions tracked: 50 states + DC + PR.
Legal framework
Per ADR-0014, the codes vertical operates under a deliberately conservative legal posture — short factual labels and deep links only, no verbatim or paraphrased code text. The framework draws from three doctrines:
1. Veeck v. SBCCI — code-as-law publicly accessible
Veeck v. Southern Building Code Congress International, Inc., 293 F.3d
791 (5th Cir. 2002), and ASTM v. Public.Resource.Org, 597 F. Supp. 3d
213 (D.D.C. 2022), establish that code text incorporated by law is
publicly accessible. Once a jurisdiction adopts an ICC or NFPA code as
its mandatory minimum, the code text becomes the law itself — which
cannot be copyrighted.
However, Veeck does NOT authorize republishing:
- Copyrighted formatting (tables, indices, cross-reference layouts)
- Editorial commentary or interpretation
- Paraphrased summaries that bundle creative selections
The codes vertical respects this boundary: we ship facts (section numbers + titles + chapter structure + adoption status) and link to the publisher's portal for the full text.
2. Feist Publications — facts are not copyrightable
Feist Publications, Inc. v. Rural Telephone Service Co., 499 U.S. 340
(1991): facts cannot be copyrighted. Section numbers are facts. Section
titles ("Stair Treads and Risers") are short factual labels — Feist
applies, no copyright concern. Chapter numbers, parent-section
relationships, adoption-by-jurisdiction status — all facts.
3. CFAA — public data, no auth bypass
All codes vertical sources are publicly accessible. codecheck.com is a free public website. The ICC Master I-Code Adoption Chart is a publicly downloadable PDF. We don't bypass authentication, paywalls, or rate limits.
Per-source legal tier classification
Following the T0-T4 framework established in ADR-0015 (products vertical) and inherited by ADR-0016 (benchmarks vertical):
| Tier | Definition | Codes vertical inclusion |
|---|---|---|
| T0 | Federal public-domain (U.S. Government works, 17 U.S.C. § 105) | None (ICC and NFPA are private standards-developing organizations) |
| T1 | Federal-recognized cert registry, ToS permits research | None |
| T2 | Mfr-published catalog PDF | ICC Master I-Code Adoption Chart PDF (publicly downloadable factual table) — used for adoption metadata only, never the underlying code text |
| T3 | Private paywalled report | Excluded — DO NOT access: ICC Digital Codes paid subscriptions, NFPA-fee tiers, RSMeans, Construction Specifications Institute (CSI) paywalls |
| T4 | Publicly-published industry guides with copyright | Same exclusions as T3 |
ICC and NFPA's free-access viewers (ICC Digital Codes free read-only,
NFPA Link free-access portal) are linked from viewer_deep_link — we
link, we don't copy.
What we ship per section
For every section in the 5 code families we store ONLY:
| Field | Risk | Included? | Why |
|---|---|---|---|
| section_number | none | yes | Fact (Feist applies) |
| section_title | low | yes | Short factual label (Feist applies) |
| chapter | none | yes | Fact |
| parent_section_number | none | yes | Fact (hierarchy of the document) |
| topic_tags | none | yes | BuildCalc's own taxonomy (kebab-case, fixed vocabulary) |
| viewer_deep_link | none | yes | Linking is not copying |
| source_url | none | yes | Linking |
| Verbatim section text | high | NO | Would invite ICC/NFPA takedown even if defensible |
| Paraphrased summaries | med | NO | Deferred to Phase 2 (post-revenue, after legal review) |
| State amendment text | med | NO | Same posture as base code text |
The viewer_deep_link field points to the publisher's free preview
viewer. Users who need the full text follow the link; we don't
redistribute it.
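As a rough illustration of the metadata-only rule, the sketch below filters an arbitrary row down to the seven shipped fields. The function name, the sample row, and the placeholder link are assumptions made for this example; they are not taken from the BuildCalc codebase.

```python
# The seven fields this page lists as shipped for every section.
ALLOWED_FIELDS = frozenset({
    "section_number", "section_title", "chapter", "parent_section_number",
    "topic_tags", "viewer_deep_link", "source_url",
})


def strip_to_metadata(row: dict) -> dict:
    """Keep only the shipped metadata fields; drop everything else (e.g. body text)."""
    return {key: value for key, value in row.items() if key in ALLOWED_FIELDS}


# Hypothetical row; the link values are placeholders, not real viewer URLs.
raw_row = {
    "section_number": "R311.7.5",
    "section_title": "Stair Treads and Risers",
    "chapter": "3",
    "parent_section_number": "R311.7",
    "topic_tags": ["egress", "residential"],
    "viewer_deep_link": "https://example.invalid/viewer",
    "source_url": "https://example.invalid/source",
    "section_text": "(body text that must never be shipped)",
}

clean_row = strip_to_metadata(raw_row)
```

An allow-list (rather than a block-list of forbidden keys) means a newly added column stays out of responses until someone deliberately approves it.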
Editions and dual-edition strategy
IRC — dual edition (2021 default + 2024 latest)
IRC 2024 substantially renumbered chapter 3 — most notably "Means of
Egress" moved from R311 to R318, taking the stairway sub-sections with
it (R311.7 → R318.7). Our calculators (stairs.py, doors_windows.py)
cite the 2021 numbering, which is also the dominant adoption in 2026
(~40-50% of US jurisdictions).
To keep calc citations aligned with reality AND give early-adopting jurisdictions a path forward:
- IRC 2021 is the default edition (matches calculator citations and most jurisdictions).
- IRC 2024 is also seeded under the same irc code id (the edition column distinguishes rows). app/codes/calc_links.py contains entries for both numbering schemes so AI agents querying either edition land on the same calculator endpoints.
Clients pin to 2024 via ?edition=2024; default stays 2021.
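A minimal sketch of building an edition-pinned request URL follows. The host is a placeholder, the helper name is hypothetical, and it assumes the ?edition= parameter is accepted on the section lookup path documented further down this page.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.example.invalid"  # placeholder host, not the real API host


def section_url(code_id: str, section_number: str, edition: Optional[str] = None) -> str:
    """Build a section lookup URL; omit `edition` to get the default edition."""
    path = f"/v1/codes/{quote(code_id)}/sections/{quote(section_number)}"
    query = f"?{urlencode({'edition': edition})}" if edition else ""
    return f"{BASE_URL}{path}{query}"


default_url = section_url("irc", "R311.7")         # IRC 2021 numbering (default)
pinned_url = section_url("irc", "R318.7", "2024")  # IRC 2024 numbering
```

Note that the section number changes along with the edition: the stairway sub-sections are R311.7 in 2021 and R318.7 in 2024.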
NEC — dual edition (2023 default + 2026 latest)
NFPA 70-2026 (published Oct 10, 2025) is the largest structural overhaul since 1975:
- Article 220 → Article 120 (load calculations moved from Chapter 2 to Chapter 1). Every 220.x section renumbered to 120.x: 220.12 → 120.12, 220.42 → 120.42, 220.82 → 120.82.
- Article 100 consolidated all definitions (no more x.2 article-local definitions).
- Article 404 snap-switch rules moved to 406.30 / 406.32.
NEC 2023 stays the default edition (23 of the 49 tracked jurisdictions
are on it; NEC 2026 has near-zero state adoption so far, given its Oct 2025
publication). calc_links.py carries entries for both numbering schemes
so /v1/calc/electrical/panel-load returns the right citation regardless
of which edition the client pins to.
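The Article 220 to Article 120 move described above is a pure prefix swap, so it can be sketched in a few lines. This helper is hypothetical and is not how calc_links.py is implemented (that file carries explicit entries for both schemes). It deliberately ignores the Article 404 changes, which are not a simple prefix swap.

```python
def nec_2023_to_2026(section: str) -> str:
    """Translate a NEC 2023 Article 220 reference to its NEC 2026 Article 120 number.

    Sections outside Article 220 are returned unchanged.
    """
    article, dot, rest = section.partition(".")
    if article == "220":
        return f"120{dot}{rest}"
    return section


examples = {s: nec_2023_to_2026(s) for s in ("220.12", "220.42", "220.82", "210.8")}
```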
IBC / IECC / IPC — single edition (2024)
Chapter structures were stable across the most-recent revisions. v1 ships only IBC 2024, IECC 2024, IPC 2024.
Adoption matrix sources
Two complementary sources feed code_adoptions (the per-jurisdiction
adoption status table):
| Source | Codes covered | Refresh | Cron |
|---|---|---|---|
| codecheck.com/code-adoption-by-state | NEC + IRC | Quarterly | Render crn-d85h7l7aqgkc73bchgpg quarterly (1st of Jan/Apr/Jul/Oct, 06:00 UTC) |
| ICC Master I-Code Adoption Chart (Jan 2024 PDF) | IBC + IPC + IECC | Snapshot | Same cron — parses the PDF in the same run |
The cron scripts/scrape_code_adoption.py runs both sources in one
pass. Failure of either source is now fatal (per 2026-05-18 post-launch
punch list #17): losing IBC/IPC/IECC silently is exactly the "no alarm
on URL break" gap that QA flagged. The next quarterly run surfaces any
parse error via Sentry email + Render's deploy-status webhook within
minutes of the cron firing.
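The "either source failing is fatal" behaviour can be sketched as below. This is an assumed shape, not the contents of scripts/scrape_code_adoption.py: the function and source names are hypothetical, and each scraper is modelled as a callable that returns a row count or raises.

```python
import sys
from typing import Callable, Dict


def run_sources(sources: Dict[str, Callable[[], int]]) -> int:
    """Run every source and return a process exit code: 0 only if all succeed."""
    failures = []
    for name, scrape in sources.items():
        try:
            row_count = scrape()
            print(f"{name}: {row_count} rows")
        except Exception as exc:  # any failure in any source is fatal for the run
            failures.append(name)
            print(f"{name} FAILED: {exc}", file=sys.stderr)
    return 1 if failures else 0


def _broken_pdf() -> int:
    raise RuntimeError("chart URL returned 404")


exit_ok = run_sources({"codecheck": lambda: 52, "icc_pdf": lambda: 52})
exit_bad = run_sources({"codecheck": lambda: 52, "icc_pdf": _broken_pdf})
```

Running every source before deciding the exit code means one alert reports all broken sources, instead of stopping at the first failure.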
codecheck.com — single source documented
codecheck.com is currently the only free, easily-machine-readable source we've identified for per-state NEC + IRC adoption + amendment notes:
- NFPA's own adoption tracker isn't crawl-friendly.
- ICC doesn't track NEC.
- State Secretaries of State publish in inconsistent formats.
We accept this as a single point of failure for v1, with bounded exposure:
- The code_adoptions table already holds the last successful scrape result. Production reads from the DB, not from codecheck on each request.
- A codecheck outage means fresh updates pause, not data goes missing.
- Worst case before manual mitigation: ~3 months of stale NEC/IRC data, since the cron runs quarterly and data only refreshes on successful runs.
- Stale data is still labelled with has_amendments boolean and amendment_notes, so consumers know to re-verify time-sensitive questions against the jurisdiction's own publication.
If codecheck disappears (domain shutdown, paywall, or use-policy change), documented recovery options in priority order:
- Wait: maintainers often restore service after a brief outage. The cron alarms via Sentry within minutes of the next failed run.
- Find a replacement crawler target: UpCodes, NFPA, IAEI, or NAHB are candidate fallback sources identified for evaluation; an internal playbook documents trigger conditions and migration steps.
- Pay for an authoritative feed: NFPA's commercial adoption tracker or RSMeans cross-reference. Triggered by ≥$500 MRR per the pay-as-you-grow tier matrix.
ICC PDF — pinned to Jan 2024 snapshot
The ICC Master I-Code Adoption Chart is a publicly downloadable PDF
that ICC refreshes annually. We pin to Master-I-Code-Adoption-Chart-1.pdf
(the Jan 2024 snapshot; the URL itself is stable because ICC numbers the
chart, not the year).
When ICC publishes a newer chart (e.g., Master-I-Code-Adoption-Chart-2.pdf),
the OLD URL will either 404 (fatal in our scraper — surfaces immediately)
or keep serving stale data forever (silent — not detectable by the
script). Annual manual mitigation: skim
https://www.iccsafe.org/wp-content/uploads/ for a newer chart number
each January and swap _ICC_CHART_URL accordingly.
ICC adoption cycles run 3-6 years per state, so an annual snapshot is
mostly accurate but not perfect. Rows from the ICC PDF carry an explicit
staleness note in amendment_notes:
"Source: ICC Master I-Code Adoption Chart (Jan 2024 snapshot). Re-verify against jurisdiction publications for time-sensitive use."
Hand-curated section seeding
Section data does NOT come from an API or from scraping. The 5 code
families are hand-curated into 7 seed JSON files in app/codes/seed_data/:
- irc_2021.json: ~180 sections
- irc_2024.json: ~210 sections
- ibc_2024.json: ~180 sections
- nec_2023.json: ~98 sections
- nec_2026.json: ~107 sections (added 2026-05-18 post-publication)
- iecc_2024.json: ~80 sections
- ipc_2024.json: ~60 sections
Total: 937 hand-curated sections. Curation drew on the publishers' free
viewers + cross-references from app/calculators/*.py to ensure every
calculator's code_reference resolves to a real section in our table.
Why hand-curated and not scraped: the publishers' free viewers are designed for human browsing, not bulk extraction, and their use policies reflect that. Hand-curating a focused subset (sections that calculators cite, plus high-traffic reference sections) respects the boundary while still giving AI agents a stable factual lookup.
State amendments — flag-only in v1
Every code_adoptions row carries:
- has_amendments BOOLEAN (default FALSE)
- amendment_notes TEXT: free-form short string (e.g., "California amends via Title 24 Part 2.5")
v1 ships NO verbatim or paraphrased amendment text. Amendments are flagged so AI agents know to redirect the user to the jurisdiction's own publication site, not summarized. This is the same legal posture as the base code text — paraphrased amendment summaries are deferred to Phase 2 (post-revenue), gated by:
- Legal review of fair-use posture
- Per-jurisdiction copyediting (paraphrasing is creative work; can't be done purely by LLM without supervision)
- Trigger: when customer count exceeds 50 paid AND ≥5 explicit requests reach [email protected]
Calculator cross-references (related_calc_kinds)
The endpoint GET /v1/codes/{code_id}/sections/{section_number} returns
related_calc_kinds, a list of /v1/calc/* paths whose formulas are
grounded in that section. The mapping lives in app/codes/calc_links.py
as a static dict — 30+ entries hand-curated from the citations already
present in app/calculators/*.py.
This is purely additive: it lets an AI agent pivot from "what does
NEC 220.82 require?" to "let me run the panel-load calculator that
implements those requirements" in a single conversation. Every
calculator response returns a code_reference (e.g., "IRC R311.7.5")
which maps back to a section here, and that section's related_calc_kinds
points back to the calc endpoints. AI agents can pivot in either direction.
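The two-way pivot can be sketched with a static dict and a reverse lookup. The key shape (code id, edition, section number) is an assumption made for illustration; the real structure of app/codes/calc_links.py is not shown on this page. Only the panel-load path and the 220.82 / 120.82 pairing come from the text above.

```python
from typing import Dict, List, Tuple

SectionKey = Tuple[str, str, str]  # (code_id, edition, section_number), assumed shape

# Hypothetical excerpt: both NEC numbering schemes point at the same calculator.
CALC_LINKS: Dict[SectionKey, List[str]] = {
    ("nec", "2023", "220.82"): ["/v1/calc/electrical/panel-load"],
    ("nec", "2026", "120.82"): ["/v1/calc/electrical/panel-load"],
}


def related_calc_kinds(code_id: str, edition: str, section_number: str) -> List[str]:
    """Section -> calculators. Returns an empty list for sections with no linked calc."""
    return CALC_LINKS.get((code_id, edition, section_number), [])


def sections_for_calc(calc_path: str) -> List[SectionKey]:
    """Calculator -> sections, the reverse direction of the same mapping."""
    return sorted(key for key, paths in CALC_LINKS.items() if calc_path in paths)


forward = related_calc_kinds("nec", "2026", "120.82")
backward = sections_for_calc("/v1/calc/electrical/panel-load")
```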
The 6 known limitations
The legal posture is defensible; the framing is the honest part. Every codes endpoint response is bounded by these caveats.
1. No verbatim or paraphrased section text
By design (per ADR-0014). To read the full text of a section, follow
viewer_deep_link to the publisher's free portal (ICC Digital Codes,
NFPA Link). An AI agent that needs to answer "what does IRC R311.7.5
actually say?" must call out to the linked viewer or admit the
limitation.
Expected impact: AI agents grounding answers in our codes vertical will cite section numbers correctly but cannot synthesize the prescriptive requirements from this data alone.
2. State amendments flagged but not summarized
has_amendments: true tells you California (or whoever) modifies the
adopted code, and amendment_notes gives you a 1-line breadcrumb. The
substantive amendment text is NOT here. Time-sensitive questions
(permit applications, compliance reviews) must re-verify against the
jurisdiction's own publication.
Expected impact: AI agents must caveat any amendment-related
answer with "verify with the state's published amendments" when
has_amendments: true is in the row.
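One way a client could enforce this caveat is sketched below. The helper name and exact caveat wording are assumptions; only the has_amendments and amendment_notes field names come from this page.

```python
CAVEAT = "Verify with the state's published amendments."


def with_amendment_caveat(answer: str, adoption_row: dict) -> str:
    """Append the caveat (and the breadcrumb note, if any) when the row is flagged."""
    if not adoption_row.get("has_amendments"):
        return answer
    note = adoption_row.get("amendment_notes")
    suffix = f"{CAVEAT} ({note})" if note else CAVEAT
    return f"{answer} {suffix}"


flagged = with_amendment_caveat(
    "California has adopted this code.",
    {"has_amendments": True, "amendment_notes": "California amends via Title 24 Part 2.5"},
)
unflagged = with_amendment_caveat("No amendments are flagged.", {"has_amendments": False})
```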
3. ICC PDF snapshot is Jan 2024 — 16+ months stale
The ICC Master I-Code Adoption Chart we parse is pinned to the Jan 2024
snapshot. ICC adoption cycles run 3-6 years per state, so most rows are
still accurate, but a state that adopted a new edition in 2025 is NOT
reflected for IBC/IPC/IECC. Rows from the ICC PDF carry an explicit
staleness note in amendment_notes.
Expected impact: state-by-state IBC/IPC/IECC adoption status may lag actual adoption by up to 16+ months. Quarterly cron refresh does NOT improve this because the source PDF itself is annual.
Mitigation: annual manual swap to newer ICC chart number when published (next expected: Jan 2026 ICC release).
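Because the lag grows with every month the pinned snapshot stays in place, a consumer may prefer to compute it rather than hard-code it. The sketch below counts whole calendar months since January 2024; the function name and the choice of the 1st as the snapshot day are assumptions.

```python
from datetime import date

SNAPSHOT = date(2024, 1, 1)  # Jan 2024 snapshot; the day of month is an assumption


def months_since_snapshot(today: date, snapshot: date = SNAPSHOT) -> int:
    """Whole calendar months elapsed between the snapshot month and `today`."""
    return (today.year - snapshot.year) * 12 + (today.month - snapshot.month)


lag_may_2025 = months_since_snapshot(date(2025, 5, 1))
```

On this count, May 2025 corresponds to the 16-month figure quoted above, and any later date yields a larger lag until the pinned chart is swapped.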
4. codecheck.com single-source dependency
NEC + IRC state adoption + amendment notes depend on a single source
(codecheck.com). If codecheck disappears, fresh updates pause until
manual mitigation. The code_adoptions table holds the last
successful scrape result, so production reads aren't affected
immediately — but stale data accumulates.
Expected impact: worst case ~3 months of stale NEC/IRC adoption data before manual mitigation. Sentry alarms within minutes of the next failed cron run.
Mitigation: backup candidate sources (UpCodes, NFPA, IAEI, NAHB) identified with documented migration playbook. Activated only if codecheck primary fails.
5. Section coverage is curated, not exhaustive
937 sections is a focused subset, NOT every section in every code. Curation prioritized:
- Sections that calculators cite (via related_calc_kinds)
- High-traffic reference sections (egress, GFCI, structural minimums)
- Frequently-asked sections from the BuildCalc Pro user base
If you query /v1/codes/irc/sections/R602.10.4 and get 404, that's
not a bug — it means we haven't curated that section yet. Open an
issue or email [email protected] if you need a specific
section added.
Expected impact: AI agents may need to gracefully handle 404s on section lookups and either pivot to a related section in the same chapter or surface the gap to the user.
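A sketch of one graceful-degradation strategy: on a 404, walk up the dotted section number until a curated ancestor is found, and report whether the match was exact. The helper names are hypothetical, the HTTP call is abstracted behind a `fetch` callable that returns None on 404, and deriving the parent by trimming the last dotted component is an assumption (a real client could instead use parent_section_number from a row it already holds).

```python
from typing import Callable, Optional, Tuple


def parent_of(section_number: str) -> Optional[str]:
    """'R602.10.4' -> 'R602.10'; returns None once no dotted parent remains."""
    head, dot, _ = section_number.rpartition(".")
    return head if dot else None


def lookup_with_fallback(
    section_number: str,
    fetch: Callable[[str], Optional[dict]],
) -> Tuple[Optional[dict], bool]:
    """Return (row, is_exact_match). (None, False) means the gap must be surfaced."""
    current: Optional[str] = section_number
    while current is not None:
        row = fetch(current)
        if row is not None:
            return row, current == section_number
        current = parent_of(current)
    return None, False


# Stand-in for the API: only the parent section has been curated.
curated = {"R602.10": {"section_number": "R602.10"}}
row, exact = lookup_with_fallback("R602.10.4", curated.get)
```

When `exact` is False, the caller should tell the user that the answer refers to the enclosing section rather than the one they asked about.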
6. Topic-tag taxonomy is BuildCalc-authored
topic_tags values are drawn from a fixed vocabulary that BuildCalc
designed. Axis-1: structural, electrical, plumbing, mechanical,
energy, fire-protection, egress, accessibility, envelope,
interior. Axis-2: residential, commercial, assembly, etc.
This taxonomy is NOT ICC- or NFPA-endorsed. Two sections that ICC
groups under different chapters may share a topic_tag here, and
vice-versa. The taxonomy is documented in
topic_tags.md.
Expected impact: filtering by ?topic=stairs returns sections we
tagged with stairs, not the official ICC chapter-3-stairs subset.
Use ?chapter= for chapter-aligned filtering.
Refresh cadence
| Component | Cadence | Source |
|---|---|---|
| Section seed data | Manual (re-seed on new edition release) | Hand-curated JSON in app/codes/seed_data/ |
| Adoption matrix (NEC + IRC) | Quarterly (1st of Jan/Apr/Jul/Oct) | codecheck.com via scripts/scrape_code_adoption.py |
| Adoption matrix (IBC+IPC+IECC) | Same quarterly cron | ICC Master I-Code Adoption Chart PDF — pinned URL |
Cron crn-d85h7l7aqgkc73bchgpg (Render). Both sources must succeed for
the cron to exit rc=0; either failing surfaces via Sentry + Render
deploy-status webhook.
Sources
- ICC (International Code Council) publishes IRC, IBC, IECC, IPC: https://www.iccsafe.org/ — free read-only viewer at https://codes.iccsafe.org/
- NFPA (National Fire Protection Association) publishes NEC (NFPA 70): https://www.nfpa.org/ — free read-only access via NFPA Link
- ICC Master I-Code Adoption Chart: https://www.iccsafe.org/wp-content/uploads/Master-I-Code-Adoption-Chart-1.pdf (Jan 2024 snapshot)
- codecheck.com adoption tracker: https://codecheck.com/code-adoption-by-state/
ADR cross-reference
- ADR-0014: Codes vertical scope + hybrid legal model — defines the metadata-only scope, dual-edition strategy (IRC + NEC), and Phase 2 deferral of paraphrased summaries.
- ADR-0015: Products vertical legal posture — defines the T0-T4 tier framework inherited here.