Products methodology

How BuildCalc API's products vertical sources data, normalizes specs, classifies confidence, and handles per-source legal posture across the 10 categories.

The products vertical is a federated factual catalog of construction SKUs across 10 categories. This page documents the per-category source authorities, the normalization choices baked into the JSONB spec keys, the 4-level confidence taxonomy, and the legal posture per source (ADR-0015).

Category coverage matrix

Category | Source authority | Status | Spec key examples
hvac | ENERGY STAR Central AC + Heat Pumps + Geothermal + Boilers + Furnaces | live | specs.seer2, specs.eer2, specs.hspf2, specs.btu_h_cooling, specs.afue
windows | ENERGY STAR Storm Windows (NFRC residential pending) | live (subset) | specs.u_factor, specs.shgc, specs.vt, specs.air_leakage_cfm_sf
plumbing | EPA WaterSense (toilets, faucets, showerheads, urinals, flushometers, irrigation, sprinklers, RO) | live | specs.fixture_type, specs.gpf, specs.gpm, specs.ada
insulation | ENERGY STAR Certified Insulation | live | specs.r_value, specs.material, specs.thickness_in, specs.fire_class
electrical | ENERGY STAR Ceiling Fans + Ventilating Fans | live (subset) | specs.device_type, specs.watts_high, specs.efficiency_cfm_per_w
doors | NFRC CPD | coming soon (Phase 7.5) |
roofing | per-mfr PDFs (GAF, CertainTeed, OC, IKO, Tamko) | coming soon (Phase 7.5) |
drywall | per-mfr PDFs (USG, CertainTeed, NatGyp, Continental) | coming soon (Phase 7.5) |
lumber | per-mfr engineered-wood PDFs | coming soon (Phase 7.5) |
hardware | ICC-ES ESR + Simpson Strong-Tie | deferred (LLC + USCO DMCA designation; see ADR-0015) |

4-level confidence taxonomy

Every product row carries a confidence value indicating how much trust the caller should put in the spec values:

Value | Meaning | Used for
certified | Issued by a federal/national authority (ENERGY STAR, AHRI, NFRC, WaterSense, ICC-ES) | All live verticals; every spec value cites the certification authority
mfr_published | Extracted deterministically (Tier-A pdfplumber rules) from a mfr official PDF catalog | Reserved for sub-dels 7/9/10 when mfr-PDF parsers land
ollama_extracted | Extracted via the local Qwen 2.5-VL vision fallback (Tier-B) when Tier-A rules couldn't recognize the PDF shape | Reserved; values should be cross-checked against a primary source
legacy_or_stale | Source was authoritative when fetched but is no longer current (e.g., SEER1 ratings post-2023 SEER2 transition) | Marker for catalog reconciliation

Live verticals today all land as certified. The agent SHOULD trust these for spec selection but MUST always cite the source_url to the end user (it links back to the authority of record).
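As a rough illustration of how a caller might apply the taxonomy, the sketch below models the four values as an enum and accepts a row for spec selection only when it is certified and carries a source_url to cite. The dict-shaped row and the helper name usable_for_spec_selection are assumptions made for this example, not part of the API.

```python
from enum import Enum


class Confidence(str, Enum):
    """The four confidence values listed in the table above."""
    CERTIFIED = "certified"
    MFR_PUBLISHED = "mfr_published"
    OLLAMA_EXTRACTED = "ollama_extracted"
    LEGACY_OR_STALE = "legacy_or_stale"


def usable_for_spec_selection(row: dict) -> bool:
    """Hypothetical helper: True only for certified rows with a citable source_url."""
    try:
        level = Confidence(row.get("confidence"))
    except ValueError:
        # Unknown or missing confidence value: do not trust the row.
        return False
    return level is Confidence.CERTIFIED and bool(row.get("source_url"))
```

A stricter or looser policy (for example, accepting mfr_published once those parsers land) would only change the final comparison.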

Per-category source detail

HVAC

Four ENERGY STAR Socrata datasets feed the hvac category:

Dataset | Subset | URL
83eb-xbyy | Central AC + Air-Source Heat Pumps | data.energystar.gov/api/views/83eb-xbyy/rows.csv
acvd-5wvz | Geothermal Heat Pumps | data.energystar.gov/api/views/acvd-5wvz/rows.csv
6rww-hpns | Boilers | data.energystar.gov/api/views/6rww-hpns/rows.csv
i97v-e8au | Furnaces | data.energystar.gov/api/views/i97v-e8au/rows.csv

The cron streams each CSV to a tempfile and upserts on UNIQUE (mfr, model_number, category='hvac', revision='current'). When the same outdoor unit pairs with multiple certified indoor coils, the rated SEER2 can differ between pairings. The UNIQUE constraint collapses these into one canonical product row plus one product_certifications row per AHRI cert pairing. Agents that need per-pairing efficiency should filter on product_certifications.cert_type='ahri' and look up the AHRI Reference Number in the AHRI Directory directly.
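The collapse can be pictured with a small in-memory sketch: rows that share the unique key become one product entry, and each pairing is kept as its own certification entry. This is not the cron's actual code. The input column names (ahri_ref, seer2) and the output field ahri_reference_number are hypothetical, and a real ingest would issue a database upsert rather than build dicts.

```python
from collections import defaultdict


def collapse_pairings(rows):
    """Group rows into one product per unique key plus one cert entry per pairing."""
    products = {}
    certs = defaultdict(list)
    for row in rows:
        # Mirrors UNIQUE (mfr, model_number, category='hvac', revision='current').
        key = (row["mfr"], row["model_number"], "hvac", "current")
        # The first row seen for a key creates the canonical product entry.
        products.setdefault(key, {
            "mfr": row["mfr"],
            "model_number": row["model_number"],
            "category": "hvac",
            "revision": "current",
        })
        # Every row, including the first, contributes a pairing-specific cert.
        certs[key].append({
            "cert_type": "ahri",
            "ahri_reference_number": row["ahri_ref"],  # hypothetical column name
            "seer2": row["seer2"],                     # pairing-specific rating
        })
    return products, dict(certs)
```

Keeping the pairing-specific SEER2 on the cert entries is what lets the product entry stay unique per outdoor unit.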

Plumbing (WaterSense)

EPA's public JSON API at api.epa.gov/watersense. The endpoints use a mix of lowercase (/products/toilets/) and camelCase (/products/IrrigationControllers/, /products/reverseOsmosisSystems/) slugs — the cron paginates each in turn. WBIC (Weather-Based) + SMS (Soil-Moisture) controllers share the IrrigationControllers endpoint; the productType field per row disambiguates.

specs.fixture_type discriminates the 8 ingested types: toilet, faucet, showerhead, urinal, flushometer_valve, irrigation_controller, spray_sprinkler, reverse_osmosis.

Windows (Storm subset)

Only the Storm Windows subset (qaxz-ikcb) ships in v1. The residential primary-window market (sliders, casements, double-hung) lives in the NFRC CPD; ingesting it requires the ASP.NET WebForms ViewState flow, which is deferred to Phase 7.5.

Spec keys mirror the NFRC five-rating system (U-factor, SHGC, VT, AL, CR) plus the storm-window specifics (frame material, glazing layers, emissivity, solar transmittance).

Electrical (Fans subset)

Two ENERGY STAR Socrata datasets:

  • 2te3-nmxp Ceiling Fans
  • 8dv7-nngq Ventilating Fans

Breaker/panel/switch coverage (Eaton, Square D, Siemens, Leviton) requires mfr PDF parsing and ships in Phase 7.5.

Insulation

ENERGY STAR Certified Insulation (kphf-22jd), primarily bag/batt R-value ratings. The dataset is thin (~36 SKUs currently); broader insulation coverage lives in mfr PDFs (OC, Knauf, JM, Rockwool, CertainTeed), which is Phase 7.5 work.

JSONB filter envelope

Per-category spec keys are declared in app/routes/v1/products/_filters.py. The endpoint validates each filter[specs.<field>.<op>]=<value> against the allowlist; unknown keys return HTTP 400 with invalid_filter_key.

Supported ops: gte, lte, gt, lt, eq. Numeric values are cast to double inside the jsonb_path_exists predicate; booleans are serialized as JSON literals.
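A minimal sketch of the validation step, assuming a per-category allowlist shaped like the one below. The real declarations live in app/routes/v1/products/_filters.py and may look different. ALLOWED_FIELDS, parse_filter, coerce, and InvalidFilterKey are hypothetical names, and reporting an unsupported op with the same invalid_filter_key code is an assumption of this sketch.

```python
import re

ALLOWED_OPS = {"gte", "lte", "gt", "lt", "eq"}

# Hypothetical allowlist, seeded with the hvac spec keys from the coverage matrix.
ALLOWED_FIELDS = {"hvac": {"seer2", "eer2", "hspf2", "btu_h_cooling", "afue"}}

_FILTER_RE = re.compile(r"^filter\[specs\.([a-z0-9_]+)\.([a-z]+)\]$")


class InvalidFilterKey(ValueError):
    """Stand-in for the HTTP 400 response carrying invalid_filter_key."""
    code = "invalid_filter_key"


def coerce(raw: str):
    """Map 'true'/'false' to bool, numeric strings to float, else keep the string."""
    if raw in ("true", "false"):
        return raw == "true"
    try:
        return float(raw)
    except ValueError:
        return raw


def parse_filter(category: str, param: str, raw: str):
    """Validate one query parameter and return (field, op, typed value)."""
    match = _FILTER_RE.match(param)
    if match is None:
        raise InvalidFilterKey(param)
    field, op = match.groups()
    if field not in ALLOWED_FIELDS.get(category, set()) or op not in ALLOWED_OPS:
        raise InvalidFilterKey(param)
    return field, op, coerce(raw)
```

For example, parse_filter("hvac", "filter[specs.seer2.gte]", "16") yields the field seer2, the op gte, and the float 16.0, which a query builder could then bind into the JSONB predicate.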

Source attribution + crawler policy

Every row's source_url resolves to the authority's deep-link page or dataset root. The BuildCalcAPI-Crawler/1.0 user-agent (/crawler page) self-throttles to 1 req/sec per host and honors robots.txt unconditionally. DMCA notices go to [email protected] with a 24h ack + 72h removal SLA.
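The per-host throttle could look roughly like the sketch below, which spaces requests to the same host at least one second apart. It illustrates the stated policy and is not the crawler's implementation. HostThrottle is a hypothetical name, and robots.txt handling is left out of the example.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Hypothetical helper: allow at most one request per min_interval per host."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the logic can be exercised without waiting
        self._sleep = sleep
        self._next_allowed = {}  # host -> earliest time the next request may start

    def wait(self, url: str) -> float:
        """Block until the URL's host is allowed again; return the delay applied."""
        host = urlsplit(url).netloc.lower()
        now = self._clock()
        ready_at = self._next_allowed.get(host, now)
        delay = max(0.0, ready_at - now)
        if delay:
            self._sleep(delay)
        # The request starts at max(now, ready_at); schedule the following one after it.
        self._next_allowed[host] = max(now, ready_at) + self.min_interval
        return delay
```

Calling wait(url) before each fetch keeps hosts independent, so a slow ENERGY STAR download does not delay WaterSense pagination.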

Known limitations

  1. HVAC same-model AHRI cert collapse. A single outdoor unit model with multiple indoor-coil pairings gets one product row; pairing-specific SEER2 ratings live on the cert rows, not the product spec.
  2. Storm windows only. Residential primary-window catalog (sliders, casements, etc.) is pending NFRC ingest.
  3. Plumbing IrrigationControllers consolidation. WBIC + SMS share fixture_type='irrigation_controller'; agents needing the distinction look at specs.product_type.
  4. Electrical fans only. Breaker/panel/switch SKUs await mfr PDF parsers.
  5. Insulation thin. ENERGY STAR's dataset covers ~36 SKUs of fiberglass + rigid board; not exhaustive for all insulation chemistries.
  6. 1 category remains gated. Hardware (Simpson Strong-Tie + ICC-ES ESR drill-down) is unblocked by USCO DMCA designation (DMCA-1073500 ACTIVE 2026-05-28) but awaits LLC formation for ICC-ES vendor master agreement signing. Other 4 categories (doors, roofing, drywall, lumber) now have real SKUs shipping monthly via NFRC dynamic discovery and per-mfr PDF parsers (post-Wave-8 2026-05-28). Live total: 9 of 10 categories, 40,723 SKUs as of 2026-05-29.
