Domain IV: Manage AI Model Development and Evaluation — Study Game

How to Play

Pick a game mode and test yourself. Cover answers and try to recall before peeking.


GAME MODE 1: Rapid Fire Flashcards

The Two Gates (IV.5 + IV.6)

Card 1 — Front: What's the IV.5 gate question?
Answer: Is the prepared data quality sufficient to train on? End of Phase III.
Card 2 — Front: What's the IV.6 gate question?
Answer: Is the model ready to operate in production? End of Phase V.
Card 3 — Front: Distinguish III.8 from IV.5.
Answer: III.8 = end of Phase II ("do we have what we need?"). IV.5 = end of Phase III ("is prepared data sufficient to train?"). Different artifacts, different boundaries.
Card 4 — Front: What does QCBVR stand for?
Answer: Quality dimensions, Coverage of attributes, Bias within tolerance, Volume sufficient, Reproducibility verified (IV.5 criteria).
Card 5 — Front: What does PBRBARTAO stand for?
Answer: Performance, Bias, Robustness, Baseline, Audit, Reproducibility, Trustworthy AI, Operational fit (IV.6's 8 criteria).
Card 6 — Front: Three outcomes at IV.5 and IV.6 gates?
Answer: GO / ITERATE / DESCOPE.
Card 7 — Front: What does IV.6 GO authorize?
Answer: Domain V (deployment) work to begin.

Technique Selection (IV.1)

Card 8 — Front: What's the PM's role in IV.1?
Answer: Oversee — ensure technique is documented, justified against AI pattern + success criteria, aligned with operational constraints. PM doesn't pick the technique.
Card 9 — Front: Three ML categories?
Answer: Supervised, Unsupervised, Reinforcement.
Card 10 — Front: Difference between algorithm and model?
Answer: Algorithm = procedure. Model = trained artifact. You apply an algorithm to data to produce a model.
Card 11 — Front: Three patterns of pretrained AI?
Answer: Pretrained model (adapt for task), Foundation model (very large pretrained), GenAI (generates new content).
Card 12 — Front: What's transfer learning?
Answer: Pretrained + fine-tune on your task data.
Card 13 — Front: What's RAG?
Answer: Retrieval-Augmented Generation — retrieve relevant context + generate from foundation model.
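The algorithm-vs-model distinction from Card 10 can be made concrete with a toy sketch. The "mean predictor" below is invented purely for illustration; any real technique follows the same shape:

```python
# Algorithm = procedure; Model = the trained artifact it produces.
def train_mean_predictor(ys):
    """The 'algorithm': a training procedure run over data."""
    return {"mean": sum(ys) / len(ys)}  # the 'model': a stored artifact

def predict(model):
    """Inference only reads the artifact; the algorithm is no longer involved."""
    return model["mean"]

model = train_mean_predictor([2, 4, 6])
print(predict(model))  # 4.0
```

Same procedure, different data, different artifact — which is why the model, not the algorithm, is what gates IV.5/IV.6 evaluate.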

Training (IV.3)

Card 14 — Front: What does DTHR stand for in training triage?
Answer: Data, Technique, Hardware, Results — review when training overruns.
Card 15 — Front: Overfit vs underfit?
Answer: Overfit = memorizes training data, fails on new. Underfit = doesn't learn even on training.
Card 16 — Front: Typical train/validation/test split?
Answer: ~70%/15%/15%.
Card 17 — Front: What does generalization mean?
Answer: Model performs well on data it hasn't seen — the goal of training.
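Card 16's ~70/15/15 split can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation; the seed and sizes are arbitrary, and fixing the seed is what makes the split reproducible (same data + same pipeline = same output):

```python
import random

def split_dataset(rows, train=0.7, val=0.15, seed=42):
    """Shuffle with a fixed seed, then cut into ~70/15/15 partitions."""
    rng = random.Random(seed)   # fixed seed -> reproducible split
    rows = rows[:]              # copy so the caller's list is untouched
    rng.shuffle(rows)
    n = len(rows)
    i, j = int(n * train), int(n * (train + val))
    return rows[:i], rows[i:j], rows[j:]

train_set, val_set, test_set = split_dataset(list(range(100)))
print(len(train_set), len(val_set), len(test_set))  # 70 15 15
```

Overfitting (Card 15) shows up as strong scores on `train_set` but weak scores on the held-out `val_set`/`test_set` — generalization is measured only on data the model never saw.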

Data Preparation (IV.4)

Card 18 — Front: What does TRIM stand for?
Answer: Transform formats, Reconcile inconsistencies, Impute missing values, Map fields.
Card 19 — Front: What % of project time is typically spent on data prep?
Answer: 70-80%.
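The four TRIM categories from Card 18 can be sketched on a couple of records. The field names and values here are invented for illustration only:

```python
from statistics import mean

# Hypothetical raw records with mixed formats and a missing value.
raw = [
    {"state": "CA",     "income": "52000"},
    {"state": "Calif.", "income": None},
    {"state": "CA",     "income": "48000"},
]

# Transform formats: numeric strings -> floats.
for r in raw:
    r["income"] = float(r["income"]) if r["income"] is not None else None

# Reconcile inconsistencies: one canonical spelling per state.
canon = {"Calif.": "CA"}
for r in raw:
    r["state"] = canon.get(r["state"], r["state"])

# Impute missing values: fill gaps with the mean of the known incomes.
known = [r["income"] for r in raw if r["income"] is not None]
for r in raw:
    if r["income"] is None:
        r["income"] = mean(known)

# Map fields: rename to the schema the model expects.
prepared = [{"region": r["state"], "annual_income": r["income"]} for r in raw]
print(prepared[1])  # {'region': 'CA', 'annual_income': 50000.0}
```

Each step is a governance point for the PM: the imputation rule and field mappings chosen here are exactly what IV.5's reproducibility criterion requires to be documented.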

QA/QC (IV.2)

Card 20 — Front: What does IV.2 QA/QC cover?
Answer: Configuration management + performance verification + bias measurement + documentation throughout development.
Card 21 — Front: Three transparency dimensions?
Answer: Systemic (how built), Decision (why this prediction), Algorithmic (algorithm-level).
Card 22 — Front: XAI vs Interpretability?
Answer: XAI = post-hoc explanations for any model. Interpretability = inherently understandable models. High-stakes domains favor interpretability.

GAME MODE 2: Scenario Showdown — What Should the PM Do?

Scenario 1: The Training Overrun

Answer: Pause training. Conduct a DTHR (Data/Technique/Hardware/Results) root-cause review. Document the decision: continue, change approach, or escalate. A 2.5x overrun is a project event, not a technical hiccup.

Scenario 2: The Black-Box Healthcare Decision

Answer: Document the technique selection; ensure the performance-vs-explainability trade-off is presented to stakeholders for decision; consider interpretable-by-design alternatives. IV.1 + Domain I.2 cross-pull.

Scenario 3: The Operational Mismatch

Answer: ITERATE — this is an operational-fit failure (a PBRBARTAO criterion). Loop back to V.1 (infrastructure) or IV.1 (technique change) with a stakeholder decision.

Scenario 4: The Bias Discovery During QA

Answer: Treat as an IV.2 + I.3 issue: document, escalate per the accountability structure, engage stakeholders for remediation, and do not authorize an IV.6 GO until bias is within tolerance.

Scenario 5: The Parallel Work Request

Answer: Confirm IV.6 must complete before Domain V work begins — sequential, not parallel. The gate authorizes the transition.

Scenario 6: The Plateau Concern

Answer: Coordinate investigation of the early plateau (data quality, technique fit, hyperparameter tuning). Engage IV.2 QA/QC. Don't accept "acceptable" without root-cause analysis.

Scenario 7: The Iteration Trap

Answer: Pause and review the iteration trajectory: are improvements converging or plateauing? Is the technique a fit? Is the data sufficient? Document the decision: continue, change technique, descope, or escalate.

Scenario 8: The Quality Gate with Bias

Answer: ITERATE — bias is a QCBVR criterion. Outside tolerance = no GO. Loop back to IV.4 to remediate or III.1 to redefine.


GAME MODE 3: Pattern Match Challenge

Match each scenario to its ECO task:

  1. Overseeing model technique selection → IV.1
  2. Overseeing model QA/QC → IV.2
  3. Managing model training execution → IV.3
  4. Managing data transformation → IV.4
  5. Verifying data quality (gate) → IV.5
  6. Verifying model ready for ops (gate) → IV.6
  7. The PBRBARTAO gate → IV.6
  8. The QCBVR gate → IV.5
  9. DTHR triage when training overruns → IV.3
  10. TRIM categories of data prep → IV.4
Scoring: 9-10 = Expert | 7-8 = Solid | Below 7 = Review

GAME MODE 4: Fill-in-the-Blank Speed Round

  1. IV.5 evaluates ________ Coverage Bias Volume Reproducibility (QCBVR).
  2. IV.6 evaluates Performance Bias ________ Baseline Audit Reproducibility Trustworthy-AI Operational-fit (PBRBARTAO).
  3. III.8 = "do we have what we need?" IV.5 = "is the ________ data sufficient to train?"
  4. AutoML automates the technical decision but doesn't replace ________ documentation (IV.1).
  5. The PM doesn't pick the technique — the ________ does.
  6. Training overrun by 2.5x = project event. Apply DTHR review: Data, Technique, ________, Results.
  7. ~70-80% of project time is typically spent on ________ ________.
  8. IV.6 GO authorizes ________ ________ work to begin.
  9. RAG = ________-Augmented Generation.
  10. Reproducibility means same data + same pipeline = ________ output.

Answers
  1. Quality
  2. Robustness
  3. prepared
  4. governance
  5. data scientist
  6. Hardware
  7. data preparation
  8. Domain V
  9. Retrieval
  10. same


GAME MODE 5: True or False Lightning Round

Mark each statement TRUE or FALSE:

  1. III.8 and IV.5 are the same gate. FALSE — different gates at adjacent boundaries.
  2. The PM picks the model technique. FALSE — the data scientist picks; the PM oversees governance.
  3. IV.6 has 8 criteria (PBRBARTAO). TRUE.
  4. Domain V work can begin in parallel with the IV.6 gate. FALSE — sequential.
  5. AutoML bypasses the IV.1 documentation requirement. FALSE — automation ≠ governance.
  6. Operational fit is part of IV.6 PBRBARTAO. TRUE.
  7. Production validation substitutes for IV.6 evaluation. FALSE — the gate is pre-deployment.
  8. Reproducibility means inference reproducibility only. FALSE — training reproducibility too.
  9. The PM declares IV.6 GO unilaterally. FALSE — multi-stakeholder sign-off.
  10. A failed contingency test still satisfies V.7 if documented. FALSE — V.7 requires tested plans.
  11. Performance vs baseline is part of IV.6. TRUE.
  12. "Reflecting real-world differences" excuses bias. FALSE — amplification/perpetuation matter.
Scoring: 11-12 = Exam ready | 9-10 = Almost | <9 = Review

GAME MODE 6: Mnemonic Speed Recall

Expand each mnemonic:

  TRIM: Transform, Reconcile, Impute, Map (data prep)
  QCBVR: Quality, Coverage, Bias, Volume, Reproducibility (IV.5)
  DTHR: Data, Technique, Hardware, Results (training triage)
  PBRBARTAO: Performance, Bias, Robustness, Baseline, Audit, Reproducibility, Trustworthy AI, Operational fit (IV.6)
  3 ML Categories: Supervised, Unsupervised, Reinforcement
  Algorithm vs Model: Algorithm = procedure. Model = trained artifact.
  3 Gates: III.8 (data ↔ needs) · IV.5 (prepared data quality) · IV.6 (model ↔ ops)
  Overfit vs Underfit: Overfit = memorize. Underfit = doesn't learn.

Scoring Summary

Mode (your score / max):

  Flashcards: ___/22
  Scenarios: ___/8
  Pattern Match: ___/10
  Fill-in: ___/10
  True/False: ___/12
  Mnemonic: ___/8
  TOTAL: ___/70
Rating: 60+ = mastered · 45-59 = strong · 30-44 = review · <30 = re-study.