SDTM Knowledge Base — Release v1.4 Changelog (EN)
Tag: v1.4-company-release (cut 2026-05-22)
Previous tag: v1.3-company-release (2026-05-20)
Driver: Prompt full-stack refactor (4-platform v3/v9 clean rewrite: fossil-layer removal + KB-grounding default) + 4 minor v1.3 carries (C1-C4) + Claude bundle pipeline architectural fix. Gemini MAINTAINED_NO_SANITY_TEST per user 2026-05-22 decision.
Summary
v1.4 is a prompt-pass-level release of SDTM Pedia — the three maintained AI platforms (ChatGPT GPTs, Claude Projects, NotebookLM) receive a full-stack clean rewrite of their system prompts / instructions, stripping the v1.0-v1.3 iteration fossil layers and restoring KB-grounding as the primary path. The Gemini Gems platform shifts to “maintained-but-no-sanity-test” mode from v1.4 (per user decision 2026-05-22). Four v1.3 minor carries (C1-C4) fold in. A Claude bundle pipeline architectural fix repairs the §N.N.N capture gap (a Phase 6.5 reorg-A path regression caught during the C4 rebuild).
v1.4 (2026-05-22) — Prompt pass + Minor carries + Pipeline fix
Type: prompt-pass + minor carries (vs v1.3 KB-pass; the main work sits in the prompt layer; only 1 KB file changed)
Prompt changes (4-platform system_prompt / instructions)
Driver: v1.3 RETRO §二 (section 2) 8-carry list. The main carry is the 4-platform prompt full-stack refactor: the Gemini v8.1 prompt (525 lines) contains 17 CO-N fossil rules, and the other 3 platforms accumulated similar layers in parallel. v1.4 therefore puts fossil removal and KB-grounding first.
- ChatGPT v3 system_prompt (self_deploy/chatgpt/system_prompt.md): 120→119 lines + a 4-line Method label anchor mapping (L77-80): A=Many-Many (PCGRPID/PPGRPID) / B=One-Many (PCSEQ/PPGRPID) / C=Many-One (PCGRPID/PPSEQ) / D=One-One (PCSEQ/PPSEQ). Removes v1.0-v1.3 iteration annotations and makes KB-grounding the default.
- Claude v3 system_prompt (self_deploy/claude/system_prompt.md): 125→133 lines. 5 essential rules + regex-gated CO-N + the Files A-S 19-file table preserved in full (the critic reviewer caught attempt 1 truncating the table from 19 to 7 files; the attempt 2 surgical fix landed PASS_WITH_OBSERVATIONS).
- NotebookLM v3 instructions (self_deploy/notebooklm/instructions.md): 157→156 lines. Footer Sources citation kept semantically equivalent (not byte-exact; behavior preserved). v1.0-v1.3 iteration fossils removed.
- Gemini v9 system_prompt (self_deploy/gemini/system_prompt.md): 525→292 lines (including the 2026-05-22 Method label anchor increment). MAINTAINED_NO_SANITY_TEST: optimization continues, testing stops (see KNOWN_LIMITATIONS §0.A).
KB changes (1 file modified)
- PP/examples.md — §6.3.5.9.3 gains a 13-line explicit Method label mapping table (A/B/C/D × IDVAR1+IDVAR2 in 4 rows + header). KB + prompt dual-layer anchor resolving the v1.3 Q-S2 ChatGPT PARTIAL label drift (C4).
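As a rough illustration, the four-row mapping can be held as a small lookup table. The values below are copied from the A/B/C/D anchor listed under Prompt changes in this changelog; the names `RELREC_METHODS` and `describe_method` are hypothetical and are not part of the KB or of any platform prompt.

```python
# Method label -> (cardinality, PC-side IDVAR, PP-side IDVAR), values as stated in this changelog.
# RELREC_METHODS and describe_method are illustrative names only.
RELREC_METHODS = {
    "A": ("Many-Many", "PCGRPID", "PPGRPID"),
    "B": ("One-Many", "PCSEQ", "PPGRPID"),
    "C": ("Many-One", "PCGRPID", "PPSEQ"),
    "D": ("One-One", "PCSEQ", "PPSEQ"),
}


def describe_method(label: str) -> str:
    """Return a one-line description for a Method label (case-insensitive)."""
    key = label.strip().upper()
    cardinality, pc_idvar, pp_idvar = RELREC_METHODS[key]
    return f"Method {key}: {cardinality} (PC IDVAR={pc_idvar}, PP IDVAR={pp_idvar})"


print(describe_method("c"))
```

A fixed table like this is one way to check an answer for label drift: the IDVAR pair can be correct while the letter attached to it is wrong, which is the failure mode reported for Q-S2 in v1.3.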
Platform bundle changes
- chatgpt: 06_domain_examples_all.md rebuilt (now contains the PP/examples §6.3.5.9.3 Method label table); manifest updated.
- claude: 09_examples_data_high.md rebuilt (2922→3268 lines; A3.1 §N.N.N capture working for the first time after the pipeline fix). Includes the Method label table + 3 newly captured §N.N.N segments.
- notebooklm: bucket 16 (16_fnd_pharma_pc_pp.md) rebuilt (includes the Method label table).
- gemini: bundle delta syncs the KB change (gem auto-pulls via KB grounding).
Pipeline / build script fixes (critical)
ai_platforms/claude_projects/dev/scripts/extract_examples_data.py: parents[3]→parents[4] path bugfix. This was a Phase 6.5 reorg-A path regression: the script moved one level deeper, but the parents[] index was not updated to match. Impact: all 28 domains were reported missing. The A3.1 smoke test missed it because it ran from a different working configuration (smoke test in dev/, production path different). The bug surfaced during the C4 rebuild; after the fix, 3 bundles rebuilt successfully, plus the 3268-line examples bundle (with §N.N.N capture).
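The effect of a stale parents[] index can be sketched with pathlib. Only the script's relative path comes from this changelog; the `/repo` prefix, the pre-move location, and the helper name `root_from` are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical absolute locations: "/repo" and the pre-move directory are assumptions.
OLD_SCRIPT = PurePosixPath("/repo/ai_platforms/claude_projects/scripts/extract_examples_data.py")
NEW_SCRIPT = PurePosixPath("/repo/ai_platforms/claude_projects/dev/scripts/extract_examples_data.py")


def root_from(script: PurePosixPath, index: int) -> PurePosixPath:
    """Return script.parents[index]; parents[0] is the script's own directory."""
    return script.parents[index]


old_root = root_from(OLD_SCRIPT, 3)    # index that was right before the move
stale_root = root_from(NEW_SCRIPT, 3)  # same index after the move: one level short
fixed_root = root_from(NEW_SCRIPT, 4)  # index bumped to match the deeper location

print(old_root, stale_root, fixed_root)
```

Under these assumptions the stale index resolves to a directory below the repository root, so any KB path built from it points at a location that does not exist, which is consistent with every domain being reported missing.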
Verification
- B1 UI sanity (Chrome MCP fire-and-forget): 4 questions × 3 platforms = 12 cells = 10 PASS+ + 2 PASS + 0 PARTIAL + 0 FAIL = 100% PASS (Gemini 4 cells excluded per §0.A).
- Q-S1 BECAT EXTRACTION: 3/3 PASS
- Q-S2 PP RELREC Method (v1.3→v1.4 main trigger): 3/3 PASS+ (Claude paper PARTIAL → UI PASS+ post A3.1 pipeline fix)
- Q-S3 TR TRSTRESN/TRSTRESU typo: 3/3 PASS
- Q-S4 DI domain (NotebookLM bucket 25): 3/3 PASS
- B2 R4 17-question full regression (Gemini-only scope originally): N/A — Gemini sanity testing has stopped (see §0.A; optimization continues, testing does not).
- Q-S2 post-rebuild sanity recheck: SKIPPED per user (grep-level content verification deemed sufficient).
- C2 UNSOURCED N=80 sampling: 75 RI + 0 XLSX + 0 HALLUCINATED + 5 NEEDS_REVIEW (cumulative N=80; v1.3 N=40 HIGH + v1.4 +40 LOW; HIGH pool was exhausted in v1.3).
- C1 section_coverage: P4b deterministic rerun done — FULL_COVERAGE 101→137, SKELETON 67→46. md_atoms remain in their pre-v1.3 state (LLM pipeline rerun deferred to v1.5 as C1-bis).
Known issues (deferred to v1.5 — see KNOWN_LIMITATIONS §0)
- C1-bis full pipeline LLM rerun: P2 increment + P4a forward matcher + P4b (the deterministic portion landed in C1).
- C1-ter post-P6 Makefile gate: section_coverage stability gate.
- C2 KB_INTERNAL_CROSSREF new classification category: the N=80 sample exposes the need for a new category beyond the current 4-class classifier.
- C2 3 deep-paraphrase atoms manual review: 3 of the 5 NEEDS_HUMAN_REVIEW atoms are deep paraphrases pending human judgment.
- C3 NotebookLM screenshot tutorial (Chrome MCP): v1.4 DEPLOY_GUIDE carries a text-level reminder; screenshot capture deferred.
- Tier B 156 sections + full 437 UNSOURCED + Phase 7 RAG+KG: all carried over from v1.3 §二 (section 2).
Migration from v1.3 → v1.4
For self-hosting users:
- 3 sanity-covered platforms (ChatGPT / Claude / NotebookLM): replace the platform’s system prompt with self_deploy/<platform>/system_prompt.md (or instructions.md); replace uploads with the corresponding bundle files in self_deploy/<platform>/uploads/. Step-by-step in .work/07_release_v1_4/V1_4_DEPLOY_GUIDE.md.
- Gemini: user self-verify. v1.4 ships a v9 system_prompt + Method label anchor + KB delta increment, but this platform has no sanity test coverage; please self-verify answer correctness, and for high-correctness use cases prefer the other 3 platforms.
- NotebookLM bucket 25 (v1.3 carry): if your existing deployment still contains the old source 25_td_meta_ti_ts_oi.md, after uploading 25_td_meta_ti_ts_oi_di.md please manually delete the old source (source count 43 → 42). v1.4 DEPLOY_GUIDE carries a prominent reminder.
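The bucket 25 cleanup can be expressed as a small decision helper. This is a sketch only: it assumes you have copied the source file names from the NotebookLM source list by hand, and `bucket25_action` is a hypothetical name, not a script shipped with the release.

```python
OLD_SOURCE = "25_td_meta_ti_ts_oi.md"
NEW_SOURCE = "25_td_meta_ti_ts_oi_di.md"


def bucket25_action(source_names):
    """Suggest the next manual step given the source names currently in the notebook."""
    names = set(source_names)
    has_old = OLD_SOURCE in names
    has_new = NEW_SOURCE in names
    if has_old and has_new:
        return "delete old source"
    if has_old:
        return "upload new source, then delete old source"
    if has_new:
        return "ok"
    return "upload new source"


print(bucket25_action([OLD_SOURCE, NEW_SOURCE]))
```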
Tag: v1.4-company-release
SDTM Knowledge Base — Release v1.3 Changelog (EN)
Tag: v1.3-company-release (cut 2026-05-20)
Previous tag: v1.2-company-release (2026-05-19)
Driver: KB pass: PP RELREC OA-4 gap + BECAT extraction prompt-KB drift + Tier B partial repair + 4-platform rebuild + light sanity 14-15/16 PASS
Summary
v1.3 is a knowledge-base pass release, the largest content update since v1.0. It directly modifies the KB, rebuilds all four platform bundles from the updated source, and verifies end-to-end delivery across the four deployed AI platforms. The Gemini system prompt (v8.1, 525 lines) is unchanged from v1.2; the prompt full-stack refactor is deferred to v1.4.
v1.3 (2026-05-20) — KB pass + 4-platform rebuild + light sanity
Type: knowledge-base pass (largest content change since v1.0; not a prompt-only refresh)
KB changes (11 files modified)
- PP/examples.md — added §6.3.5.9.3 RELREC Method Quick Reference (Method A/B/C/D table + 1 abbreviated relrec.xpt for Method C). Closes OA-4 carry from 06 Deep Verification project. (+2,620 lines in bundle)
- BE/spec.md (L111): added EXTRACTION as a sponsor-extensible 4th example alongside CDISC canonical COLLECTION / PREPARATION / TRANSPORT. Aligns KB with Gemini v8.1 prompt L272.
- TR/spec.md (§6.3.12.2): corrected column-header typo, TRSTRESN → TRSTRESU, in the TR result display table.
- Tier B repairs (8 additional files): 10 high-density shall/must sections repaired across: §2.7 SDTM variable rules, §6.4.2 FA naming, §7.2.1 Trial Arms Example 4, §7.3.2/§7.3.3 TD/TM, §4.5.1.2 Tests Not Done, §6.4.3 FA --OBJ, §7.2.1.1 TA Distinguishing, §4.3.5.
Platform bundle changes
- chatgpt: 3 files updated: 04_specs_and_context.md (+284), 05_domain_assumptions.md (+333), 06_domain_examples_all.md (+5,432)
- gemini: 3 files updated: 01_navigation_and_routing.md (+3,103), 02_specs_and_assumptions.md (+617), 03_domains_examples.md (+5,432)
- notebooklm: 7 files updated + 1 renamed (25_td_meta_ti_ts_oi.md → 25_td_meta_ti_ts_oi_di.md; DI now reflected in bucket name)
- claude: 5 files updated: 02_chapters.md, 03_model_structure.md, 06_assumptions_all.md, 09_examples_data_high.md (+592), 10_examples_data_others.md
Build script changes (defensive hardening)
- ChatGPT merge_for_chatgpt.py: expected_segments changed from hardcoded 63/64/63 to dynamic len(_collect_domain_assumptions()). No segment-count regression; warn-not-fail for >5% delta.
- NotebookLM new validate_bucket_coverage.py: 190/190 KB files reach buckets, 0 stale references, 0 unrouted domains.
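A warn-not-fail check of this kind might look like the sketch below. It is not the code in merge_for_chatgpt.py: `check_segment_count` and its parameters are hypothetical, and the 5% tolerance is taken from the bullet above. In the real script the expected value would come from `len(_collect_domain_assumptions())` rather than a literal.

```python
import warnings


def check_segment_count(actual: int, expected: int, tolerance: float = 0.05) -> bool:
    """Warn (do not raise) when `actual` deviates from `expected` by more than `tolerance`."""
    if expected <= 0:
        raise ValueError("expected must be a positive segment count")
    delta = abs(actual - expected) / expected
    if delta > tolerance:
        # Warn instead of failing so a legitimate KB change does not block the build.
        warnings.warn(
            f"segment count {actual} deviates {delta:.1%} from expected {expected}"
        )
        return False
    return True


within = check_segment_count(actual=63, expected=64)   # 1/64 is inside the tolerance
outside = check_segment_count(actual=60, expected=64)  # 4/64 is outside, emits a warning
```

Deriving the expected count from the collected inputs avoids the maintenance cost of a hardcoded number, while the relative tolerance still flags a large unexplained drop.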
Verification
- Phase B cross-platform delta oracle: 4 byte-exact equations PASS (ChatGPT 04 delta = NotebookLM bucket 10 delta = Gemini 02 partial delta for BE/spec, etc.). 0 silent loss.
- Phase C light sanity (4 questions × 4 platforms = 16 cells): 14-15/16 PASS.
- Q-S1 BECAT EXTRACTION: 4/4 PASS (2 PASS+)
- Q-S2 PP RELREC 4 Methods: Claude PASS+, NotebookLM PASS+, ChatGPT PARTIAL (IDVAR combos correct, Method labels shuffled), Gemini FAIL (prompt bloat — v1.4 carry)
- Q-S3 TR TRSTRESN/TRSTRESU typo: 4/4 PASS (2 PASS+)
- Q-S4 DI domain / bucket 25 rename: NotebookLM PASS+ (footer cites 25_td_meta_ti_ts_oi_di.md), Claude PASS+, Gemini PASS, ChatGPT assumed PASS
- UNSOURCED_MANUAL N=40 sample: 0% HALLUCINATED (80% REASONABLE_INFERENCE + 20% DERIVED_FROM_XLSX). Rule D scientist reviewer confirmed.
- system_prompt audit: 20/20 grep probes PASS across 4 platforms (0 stale numeric references).
Known issues (deferred to v1.4 — see KNOWN_LIMITATIONS §0)
- Gemini PP RELREC retrieval weak (Phase C Q-S2 FAIL): Gemini v8.1 prompt bloat (525 lines, 17 CO-N rules as fossil layers) causes KB-grounding failure on PP RELREC. v1.4 main carry: full-stack prompt refactor for all 4 platforms (~200 lines clean, regex-gated CO-N rules).
- Full 437 UNSOURCED_MANUAL classification: v1.3 sampled N=40 only. Full 437 deferred.
- Tier B sections 11-25 + all level-2 Tier B: v1.3 repaired ranks 11-20 (10 sections). Remaining ~156 sections deferred.
- R4 full 17-question Gemini regression: v1.3 used 4-question light sanity × 4 platforms; R4 Pro-only deferred due to quota constraint.
- section_coverage.jsonl full pipeline rerun: baseline backed up; full regen deferred to v1.4.
Migration from v1.2 → v1.3
For self-hosting users:
- All 4 platforms: replace uploads with files from self_deploy/<platform>/uploads/.
- NotebookLM: upload 25_td_meta_ti_ts_oi_di.md and delete the old 25_td_meta_ti_ts_oi.md from your NotebookLM source list (source count should drop from 43 to 42).
- System prompts / instructions: no change required (v8.1 Gemini + other 3 prompts unchanged from v1.2).
Tag: v1.3-company-release
v1.2 (2026-05-19) — Gemini-only prompt refresh v7.1 → v8.1
Tag: v1.2-company-release (cut 2026-05-19)
Previous tag: v1.1-company-release (2026-05-15)
Driver: SMOKE_V4 R3 (2026-05-19) regression on Gemini v7.1 → v8.1 system prompt fix
v1.2 is a Gemini-only system prompt refresh of v1.1. The knowledge base, all four AI-platform upload bundles, all metadata documents, and three of the four platform system prompts (Claude / ChatGPT / NotebookLM) are unchanged from v1.1. Only self_deploy/gemini/system_prompt.md has been replaced (v7.1 → v8.1, 422 → 525 lines, +24%).
Driver: SMOKE_V4 R3 regression on Gemini
After deploying v1.1 to the four AI platforms, a full regression test (SMOKE_V4 R3) was run on 2026-05-19. Three of four platforms held their R1 baseline:
- Claude v2.6: 17/17 (sustained)
- ChatGPT v2.2: 17/17 (slight improvement over R1)
- NotebookLM v2: 15.5/17 (RAG architectural limits on Q9 PUNT and Q11 PARTIAL — both expected)
- Gemini v7.1: 13/17 (4 FAIL) — regression vs R1 16/17
The four Gemini failures:
| # | Topic | v7.1 failure mode |
|---|---|---|
| Q3 | BE / BS / RELSPEC (biospecimen handling) | Off-topic response on AE / AESEV / AEGRPID (1,541 chars) |
| Q4 Scenario A | Anti-measles IgG titer | Routed to LB instead of IS (regression of R2 fix) |
| Q11 | Dataset-JSON v1.1 vs XPT v5 | Off-topic response on AE / CM (1,436 chars) |
| AHP1 | LBCLINSIG variable hallucination probe | Off-topic response on CM polypharmacy / MH (1,485 chars) |
Independent reviewer (oh-my-claudecode:scientist, Rule D #15) confirmed the matrix and identified four anchor coverage gaps in v7.1.
What changed in v8.1
4-prong fix: CO-4 entry guard (biospecimen keywords) + CO-2f file-format ground rule + CO-1e IS scope shift v3.3→v3.4 + CO-5 default reflection (SDTM-regex KB double-check). 6 reviewer-driven refinements (H1/H2/M1/M2/L1/L2). See full CHANGELOG.en.md for detail.
Verification
- v8.1 dry-run: 4/4 PASS on Gemini 3.1 Pro (same model as R3 baseline).
- Rule D #16 (pr-review-toolkit:code-reviewer): PASS_WITH_OBSERVATIONS, 6 reconcile fixes applied.
- Rule D #17 (oh-my-claudecode:verifier): APPROVE 0 blockers.
Tag: v1.2-company-release