Samsung Health NL test recorder · 2026-07-01 · DFS traversal of the whole menu tree, collecting per-node TC counts, custom-tool targets, and failure causes before internal rollout.
| ID / severity | Problem | Where | Fix / custom tool |
|---|---|---|---|
| IB-7 MAJOR | Home-tile reachability / re-entry gap (NEW, dominant blocker): the SHealth home shows a user-customized, tab-based, scrollable subset of tracker cards, so a node whose entry tile is not on the current feed returns non_arrival via NL->select_card. The same NL reached Stress on one run and failed on another (dashboard non-determinism). | Spo2, HeartRate, Daily-activity, Programs (all non_arrival). Live screenshot capture of these screens was itself blocked, which is visual proof of the gap. | A stable non-home entry (the QuickAdd hub is proven stable), search, or add-tile navigation; or pin the home tile-set as a test precondition. This is the #1 blocker for internal regression replay. |
| IB-1 MAJOR | General edge-set fragmentation (identity): the same screen splits into multiple nodes when dynamic content or the selected value changes the clickable edge-set. Part A fixed this for the DASHBOARD only. | workout-title (#001), mood-state (#002), food meal/time (#005). | A general identity rule (scrollable/dynamic-region anchor or a per-activity stable key). |
| IB-5 MAJOR | Closed feature vocabulary blocks growth: the finalize step rejects any feature outside a fixed 13-item set, so screens that were actually reached were refused. | water / period / medication (#009/#011/#012). | Extend the vocabulary or add a domain->feature mapping for new trackers. |
| IB-6 MEDIUM | Chart / value read, no read/assert tool (refined): two shapes, both needing a read/assert tool that the pipeline lacks: (1) pure canvas (the value is not in a11y at all); (2) the value is present in a11y, but an NL 'read' degrades to a tap because the pipeline can only actuate. | BodyComp/Water = canvas; Stress = value in a11y but tap-degraded. | A read_value / assert_text tool that extracts a11y text as an oracle. |
| IB-2 MEDIUM | set_weight composite picker tool: the weight input is a kg wheel plus a decimal wheel, and a single NL 'set weight to 70.5' sets only the integer part. | TrackerWeightInput (#004). | A composite setter that decomposes X.Y, like set_duration. |
| IB-4 MEDIUM | Fast clock-face setter: setting the circular sleep clock-face timed out after more than 2 min (heavy multi-drag / vision). | SleepEditClockFace (#006). | A fast, deterministic clock-face setter. |
| IB-3 MINOR | Toggle / select is non-extending (U-8): selecting a mood is a state change that folds to needs_review; the flow needs select + save to extend. | MoodCheckIn (#002). | Multi-step toggle handling. |
| NOTE INFO | Pipeline robustness, empty VLM content: value-set probes returned 'vlm_enrich: empty content (finish_reason may indicate token limit)'. This is a reliability artifact, not a custom-tool gap. | BloodGlucose / Breathe value-set probes. | Retry, or raise max_tokens, on empty VLM content. |
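
The decomposition that IB-2 asks for can be sketched as follows. This is a minimal illustration, not the pipeline's actual tool: the function names, the `set_wheel` action, and the wheel labels are hypothetical, and it assumes the decimal wheel holds a single digit (tenths of a kg).

```python
from decimal import Decimal, ROUND_HALF_UP


def decompose_weight(value: str) -> tuple[int, int]:
    """Split a weight such as '70.5' into (kg wheel, decimal wheel) values.

    Assumption: the decimal wheel holds one digit (tenths of a kg).
    """
    # Decimal avoids binary-float artifacts when parsing values like 70.3.
    rounded = Decimal(value).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    if rounded < 0:
        raise ValueError("weight must be non-negative")
    whole = int(rounded)                   # integer part for the kg wheel
    tenth = int((rounded - whole) * 10)    # single digit for the decimal wheel
    return whole, tenth


def plan_set_weight(value: str) -> list[dict]:
    """Expand one NL request into two wheel actions (hypothetical action schema)."""
    whole, tenth = decompose_weight(value)
    return [
        {"action": "set_wheel", "wheel": "kg", "value": whole},
        {"action": "set_wheel", "wheel": "decimal", "value": tenth},
    ]


if __name__ == "__main__":
    for step in plan_set_weight("70.5"):
        print(step)
```

Emitting two explicit wheel actions, rather than one free-form instruction, is what keeps the decimal part from being dropped; how those actions map onto the real picker is left open here.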
The Samsung Health home is a user-customized, tab-based, scrollable subset of tracker cards. A recorded node can be re-driven only if its entry tile is present in the current home configuration: the same NL reached Stress on one run and failed on another (dashboard non-determinism, FINDING-C). Live screenshot capture of the blocked screens was itself impossible in this pass, which is direct proof of the gap.
Reachable (on-feed): Stress, Body composition, Breathe, Blood glucose (QuickAdd)
NON_ARRIVAL (tile not on home): Blood oxygen, Heart rate, Daily activity, Programs
Mitigation: re-enter through the QuickAdd hub (proven stable), use search or add-tile navigation, or pin the home tile-set as a test precondition.
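
The mitigation options above can be sketched as a precondition check plus an entry-route choice. Everything here is illustrative: `missing_tiles`, `plan_entry`, the route names, and the example tile sets are assumptions, not part of the recorder. The preference for QuickAdd over a home tile reflects only that the QuickAdd hub is the one entry reported as stable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryPlan:
    route: str   # "quick_add", "home_tile", or "search" (hypothetical route names)
    target: str


def missing_tiles(required: set[str], home_tiles: set[str]) -> set[str]:
    """Precondition check: tiles a replay needs that the current home lacks."""
    return required - home_tiles


def plan_entry(target: str, home_tiles: set[str], quick_add: set[str]) -> EntryPlan:
    """Pick an entry route, preferring the entry reported as stable."""
    if target in quick_add:
        return EntryPlan("quick_add", target)
    if target in home_tiles:
        return EntryPlan("home_tile", target)
    # Tile is not on the current feed: fall back to search / add-tile navigation.
    return EntryPlan("search", target)


if __name__ == "__main__":
    # Illustrative home configuration based on the lists above.
    home = {"Stress", "Body composition", "Breathe"}
    quick = {"Blood glucose"}
    print(sorted(missing_tiles({"Heart rate", "Stress"}, home)))
    for name in ("Blood glucose", "Stress", "Heart rate"):
        print(plan_entry(name, home, quick))
```

Running the precondition check before replay turns a silent non_arrival into an explicit list of tiles to pin or route around.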