CGM Guide Series – Step 2
CGM Data Sufficiency – How Much Evidence Is Enough?
A device can pass all five framework criteria with a study of ten participants. That is not the same as passing with three hundred. This step sets the minimum threshold for how much data is required before a CGM earns its place in the guide.
Part of: CGM for T1D – The GNL Framework → Step 2 of 3
The loophole
The original DSNFUK CGM Comparison Chart scored devices on five criteria. A 5/5 score meant a device had addressed each of the five risk areas in its accuracy data. That was meaningful progress.
But a loophole emerged. A device could score 5/5 with ten people in the study. Another could score 5/5 with three hundred. Both scored identically, but the generalisability of the evidence – whether it represents how the device performs across a real population – was completely different.
The January 2026 framework update introduced a data sufficiency requirement to close that gap.
The sufficiency threshold
To meet data sufficiency, an accuracy study must satisfy at least one of the following:
Option A – Minimum participant count
At least 50 participants in the accuracy study. This is the clearest route: enough people that the accuracy figures can be considered reasonably generalisable to a broader population.
Option B – High data point density
Fewer than 50 participants, but with a very high number of paired CGM-to-reference data points per participant. Some intensive study designs achieve this. If the total number of matched pairs is sufficient to produce stable accuracy statistics, a study with 40 participants can meet the threshold.
Both options aim at the same thing: accuracy data that is stable enough to generalise. A small study with sparse data points produces accuracy numbers that can shift significantly with a handful of outliers. A study that meets either option produces figures that are less sensitive to individual variation.
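The two routes can be sketched as a simple either/or check. This is an illustrative sketch, not framework code: the Option A minimum comes from the text, but the Option B pair-count cutoff is a hypothetical number, since the framework defines that route qualitatively ("stable accuracy statistics") rather than publishing a fixed figure.

```python
# Sketch of the Step 2 sufficiency rule. MIN_PARTICIPANTS is Option A from
# the text; MIN_TOTAL_PAIRS is a placeholder assumption for Option B.

MIN_PARTICIPANTS = 50     # Option A: at least 50 participants
MIN_TOTAL_PAIRS = 5_000   # Option B cutoff -- hypothetical, for illustration

def meets_sufficiency(participants: int, matched_pairs: int) -> bool:
    """True if a study satisfies Option A or Option B."""
    return participants >= MIN_PARTICIPANTS or matched_pairs >= MIN_TOTAL_PAIRS

print(meets_sufficiency(72, 6_845))   # → True (Option A: 72 >= 50)
print(meets_sufficiency(10, 500))     # → False (neither route met)
```

Under these assumptions, a 40-participant intensive study still passes via Option B once its total matched pairs clear the density cutoff.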
Why this matters clinically
CGM accuracy claims from small studies are real claims from real measurements. But a MARD of 9% from a study of twelve people does not carry the same confidence as a MARD of 9% from a study of three hundred.
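For reference, MARD (mean absolute relative difference) is the average of |CGM − reference| / reference across all matched pairs, expressed as a percentage. A minimal computation, using made-up glucose values purely for illustration:

```python
def mard(cgm, reference):
    """Mean Absolute Relative Difference, in percent, over paired readings."""
    diffs = [abs(c - r) / r for c, r in zip(cgm, reference)]
    return 100 * sum(diffs) / len(diffs)

# Three hypothetical matched pairs (mmol/L, illustrative values only)
print(round(mard([5.2, 9.8, 3.9], [5.0, 10.5, 4.4]), 1))  # → 7.3
```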
Small studies are prone to sampling effects – if the participants happened to have more stable glucose patterns, the device looks better than it would in a broader population. If they had unusually variable glucose, it looks worse. Only larger studies smooth out these effects.
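The sampling effect can be made concrete with a toy simulation: the same hypothetical device (true MARD around 9%) measured in repeated synthetic studies of 12 versus 300 participants. Every number and the error model below are assumptions, chosen only to show how study-level MARD estimates spread at each size:

```python
import random
import statistics

random.seed(0)

def study_mard(n_participants: int, pairs_each: int = 50) -> float:
    """Study-level MARD (as a fraction) for one synthetic study."""
    means = []
    for _ in range(n_participants):
        # Assumed model: each participant's typical error level varies
        bias = abs(random.gauss(0.09, 0.03))
        errors = [abs(random.gauss(bias, 0.04)) for _ in range(pairs_each)]
        means.append(statistics.mean(errors))
    return statistics.mean(means)

for n in (12, 300):
    estimates = [study_mard(n) for _ in range(200)]
    print(f"n={n:3d}: spread of study MARD (sd) = {statistics.stdev(estimates):.4f}")
```

With this assumed error model, the twelve-participant studies scatter far more widely around the true value than the three-hundred-participant ones, which is exactly the instability the sufficiency threshold is designed to rule out.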
For a device used by hundreds of thousands of people to make insulin dosing decisions, the evidence base needs to be big enough that the numbers are meaningful for the population, not just the sample.
Current status of each device
| Device | 5/5 framework | Data sufficient | Key study |
|---|---|---|---|
| Dexcom G7 | Yes | Yes | Garg et al. (2022) – n=316, 619 sensors, 77,774 matched pairs |
| Abbott FreeStyle Libre 3 | Yes | Yes | Abbott pivotal (2022) – n=72, 6,845+ matched pairs |
| Roche Accu-Chek SmartGuide | Yes | Yes | ATTD 2024 accuracy study – n=48, 3 sensors per participant |
| Medtronic Simplera Sync | Yes | Yes | CIP330 pivotal trial – n=243, ages 2-80 years |
| CareSens Air | Yes | Pending full publication | 15-day accuracy study (DTT, 2024) – threshold review pending |
| GlucoMen iCan | Pending | Pending | Sinocare pivotal (2024) – additional real-world data needed |
Table reflects published evidence as of April 2026. Updated when new studies are published.
What comes next
Data sufficiency answers whether there is enough evidence. The next step – accuracy – looks at what that evidence actually shows: what MARD means in practice, why the 40/40 agreement rate is the critical number, and why the 1% of readings that fall well outside any accuracy window deserve their own serious attention.
