I Got My VO2Max Tested in a Lab to See Which of These Nine Fitness Devices Was Most Accurate

I’m not surprised that Garmin did well, but I am surprised by some of the watches at the bottom of the list.

Jun 11, 2025 - 21:50


I have, at my disposal, at least nine different devices that can estimate my cardio fitness. They all put it in terms of a number scientists call VO2max. But the only way to find your actual VO2max is to get a test done in a lab, so I knew what I had to do. From my results, I’ll tell you which devices gave me the best and worst readings—and what that means for my (and your) training going forward.

My test included devices from Apple, Coros, Fitbit, Garmin, Oura, Suunto, Ultrahuman, Withings, and Whoop. I wasn’t surprised that Garmin scored well, but I was expecting better from some of the other brands, like Apple. There were also a few serious outliers—you’ll have to read on to see which devices did the worst.

What is VO2max and why does it matter?

VO2max is a measure of cardio fitness, so athletes and their coaches have long been interested in knowing their VO2max numbers. But more recently, VO2max has become a wellness buzzword, for some reasons that make sense and some that are probably a bit overblown. 

I say overblown because VO2max is just one measure of fitness, not the be-all and end-all, even for athletes. And, like pretty much any number you get from fitness tech, it’s on your watch because it’s easy for a device to estimate, not because it’s necessarily the best thing to focus on. (Never mind that the estimate may not even be accurate.)

Anyway, a big reason for the buzz around VO2max is that it’s been associated with longevity. Fitter people tend to be healthier and live longer, and VO2max puts a simple number on the otherwise nebulous concept of fitness. A 2016 statement from the American Heart Association pointed out that cardiorespiratory fitness may be a better predictor of mortality than traditional risk factors like cholesterol levels. 

VO2max is also handy to track if you’re interested in your fitness for fitness’ sake. If you like to run or play sports, your VO2max tells you something about how good your body is at aerobic exercise, which is directly relevant to your improvement as an athlete. 

So if your VO2max goes up over time, that’s a good sign, whether you’re interested in winning races or just living a healthy life. Smartwatches will often estimate your VO2max based on workout data, so pretty much every wearable these days will give you a VO2max estimate, sometimes labeled as a “cardio fitness” score. 

How do you get your VO2max tested for real? 

Beth on the treadmill with a mask, Dr. Wisniewski in the background
Credit: Dr. Michelle Stehman

The easiest way to get a VO2max estimate is to glance at your watch, assuming it does the calculation for you. The next easiest way is to do a field test like the Cooper test, which asks how far you can run in 12 minutes. But these are all estimates that may or may not get close to the truth. 
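To make the field-test idea concrete, here is a minimal sketch of the commonly cited Cooper formula, which turns the distance you cover in 12 minutes into a VO2max estimate. The function name and the 2,400-meter example distance are my own illustrative assumptions, and like any field test, the output is an estimate rather than a measurement.

```python
def cooper_vo2max(distance_m: float) -> float:
    """Estimate VO2max (mL/kg/min) from the distance run in 12 minutes.

    Uses the commonly cited Cooper formula: (distance in meters - 504.9) / 44.73.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return (distance_m - 504.9) / 44.73


# Hypothetical example: someone who covers 2,400 meters in 12 minutes.
print(round(cooper_vo2max(2400), 1))
```

The formula is a straight line, so every extra 45 meters or so in your 12 minutes adds roughly one point to the estimate.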

To actually test your VO2max, you need to go to a lab. And that’s why I drove out one sunny Tuesday to the Human Performance Lab at St. Francis University, where Dr. Kristofer Wisniewski and Dr. Michelle Stehman put me through a treadmill test. 

I’ll describe how the testing went for me, but if you get your own VO2max test done, things may be a little different. You might end up on a bike rather than a treadmill, for example, or you might do a walking-only test, or you might have your VO2max session combined with other health or fitness tests.

For the 48 hours before the test, I was instructed not to have any alcohol. For the last 12 hours, no intense exercise. For the last three hours, no caffeine or food. That last part panicked me a little bit, until I realized I had plenty of time for a normal breakfast before my midday appointment. I showed up in exercise clothes and I brought a water bottle, although I couldn’t drink from it during the test. In hindsight, I should have also brought a snack to eat afterward while I awaited my results.

At the lab, I confirmed my answers on a health form I had filled out when booking the appointment, and before we got started I took two puffs of my inhaler (I have mild asthma, which can sometimes be triggered by hard exercise). The scientists took my weight and height, and then began hooking me up to the equipment that would monitor me during the test. 

There was a chest strap to measure my heart rate, which they wanted to make sure was “uncomfortably snug” and tucked underneath my sports bra band. Then there was a mask over my mouth and nose, measured to fit and secured in place with straps that went tightly behind my head. You can see this in the photo above, and it too was, by design, uncomfortably tight. 

The tube attached to the mask didn’t actually pump oxygen into my mouth, as I had mistakenly assumed. Instead, I was breathing normal air from the room, and the air I exhaled was sampled to see how much oxygen and how much carbon dioxide it contained. The tube was stiff and supported by a stand, so from time to time I’d have to ask the scientists to move it a little to the left or the right so I could stay centered on the treadmill.

Before the treadmill started, there were lots of little things to be aware of. For example, I wouldn’t be able to see the treadmill while I was running, which turned out to be more disconcerting than all the physically uncomfortable stuff. A sign on the wall in front of me was perfectly centered on the treadmill, so I could use that as my visual anchor. If I got off-center, Dr. Stehman would tell me to move a little to the right or left. If I wanted to steady myself on the handrail, I needed to do that with my hand palm-up, since a palm-down grip could affect my blood pressure readings. Dr. Stehman would, yes, be taking my blood pressure with a cuff and stethoscope at a few points during the test. And every few minutes, Dr. Wisniewski would ask how I was feeling, and how hard I felt I was working on a scale of 1 to 10.

We started at a brisk walk, 3.5 miles per hour. Every three minutes, it got harder: a slow jog at 4.5 mph, then a more comfortable jog at 5.5, then up to my usual easy run pace at 5.7 mph. After that, the incline increased instead of the speed. First 5% at 5.7, then 10% at 5.7. About a minute into that last stage, I gave up, grabbed the handrail, and signaled that it was time to stop. The rest was a blur—I recall a walking cooldown and at least one more blood pressure reading. Dr. Wisniewski analyzed my results while I recovered and sipped some water. Not counting the cooldown, I was on the treadmill for just over 16 minutes.
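If you want to check the arithmetic on that protocol, here is a small sketch that lays out the stages as I described them and adds up the time. The names are just illustrative, and the "one minute into the last stage" figure is my rough recollection, not a lab-recorded value.

```python
# (speed in mph, incline in percent) for each stage, in the order I ran them
STAGES = [(3.5, 0), (4.5, 0), (5.5, 0), (5.7, 0), (5.7, 5), (5.7, 10)]
STAGE_MINUTES = 3  # every stage was scheduled to last three minutes


def elapsed_minutes(completed_stages: int, minutes_into_next: float) -> float:
    """Total time on the treadmill, not counting the cooldown."""
    if not 0 <= completed_stages <= len(STAGES):
        raise ValueError("completed_stages is out of range for this protocol")
    return completed_stages * STAGE_MINUTES + minutes_into_next


# Five full stages, plus roughly a minute of the sixth before I stopped.
print(elapsed_minutes(completed_stages=5, minutes_into_next=1.0))
```

Five complete stages plus about a minute of the sixth lands right around the 16 minutes I spent on the treadmill.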

Why is VO2max measured this way? 

I’m going to get just a tiny bit more technical in my explanation, to make sense of why I had to be hooked up to all this stuff on a treadmill. VO2max literally means the volume (V) of oxygen (O2) that your body can use per minute, at maximum (max), during exercise. It’s measured in milliliters of oxygen per minute, per kilogram of your body weight. (Bigger people breathe more air than smaller people, even if they aren’t necessarily fitter, so the equation accounts for that.) 
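Here is a quick sketch of why the per-kilogram part matters. The function name and the two example people are hypothetical; the only thing taken from the definition above is that relative VO2max is oxygen use per minute divided by body weight.

```python
def relative_vo2(absolute_ml_per_min: float, body_mass_kg: float) -> float:
    """Convert absolute oxygen uptake (mL/min) to relative uptake (mL/kg/min)."""
    if body_mass_kg <= 0:
        raise ValueError("body_mass_kg must be positive")
    return absolute_ml_per_min / body_mass_kg


# Two hypothetical people who both use 3,000 mL of oxygen per minute at max effort.
lighter = relative_vo2(3000, 60)  # 60 kg person
heavier = relative_vo2(3000, 90)  # 90 kg person
print(round(lighter, 1), round(heavier, 1))
```

Both people process the same total amount of oxygen, but the lighter one ends up with the higher score because there is less body to supply.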

In common parlance, we often write this as “VO2max” but I will format it scientifically just this once, so you can see: “V̇O2 max.” The dot on the V means it’s volume per unit of time, not total volume. If you hear runners talking about their VDOT scores, that also refers to an estimate of VO2max.

Why do we care about the amount of oxygen you breathe? Because it corresponds to how much work your body is doing. If you remember that respiration equation from high school biology—glucose plus oxygen feeds into a system that gives you energy in the form of ATP—knowing your oxygen consumption tells us how much energy your body is making and using aerobically. 

So if you put an elite athlete on a treadmill and crank up the speed and incline, their body will be able to do an enormous amount of work, consuming plenty of oxygen to match, and a test will register that they have a high VO2max. 

On the other hand, an out-of-shape, sedentary person would not be able to do what the elite athlete does. They’d manage a brisk walk, maybe a little jog, but they wouldn’t be able to work nearly as hard as the athlete, and so they wouldn’t consume nearly as much oxygen. They would be measured as having a lower VO2max.

Your VO2max can change over time. If that sedentary person starts training consistently and they take the treadmill test again in a few months, they will likely find they can walk or run faster, maybe handle more of an incline. The test would show their VO2max has improved. Heck, maybe someday they will be an elite athlete.

On average, younger people tend to have a higher (better) VO2max than older people, and men tend to have a higher VO2max than women. Elite athletes have been recorded with VO2max numbers in the 70s and 80s, but among recreational athletes, many of us will have numbers in the 30s and 40s, maybe 50s. (For context, Garmin has a chart that breaks down what’s considered “good” by age and sex.) 

How smartwatches and fitness trackers measure VO2max

Apple Health, Garmin, and Suunto displaying VO2max estimates
Apple Health on the phone, Garmin watch at left, Suunto watch at right Credit: Beth Skwarecki

Your smartwatch (or tracking ring or band) doesn’t know how much oxygen you’re breathing. Most of these devices use an algorithm that compares how hard you’re working—for example, how fast you’re running—with how fast your heart is beating. 

Garmin devices, for example, use GPS-tracked activities that last at least 10 minutes. Garmin can trim out parts of your activity that aren’t helpful—say, times you stopped to tie your shoe or chat with a neighbor. 

From the GPS data, the device knows your speed. And from your heart rate, it knows how hard your body is working to keep up that speed. This approach is sometimes called a “submaximal” algorithm, since you don’t have to run at top speed to get usable data. Even an easy jog can tell your Garmin or Apple Watch a lot about your fitness. If you can move at a good clip while your heart beats at a chill, easy rhythm, you’re likely a lot fitter than someone whose heart is beating out of their chest to keep up that same pace.
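To show the general shape of a submaximal estimate, here is a rough sketch. It is not Garmin’s, Apple’s, or any other company’s actual algorithm. It assumes heart rate rises in a straight line with running speed on flat ground, extends that line out to a known max heart rate, and then converts the resulting speed to oxygen use with the widely published ACSM running equation (0.2 mL/kg/min per meter per minute of speed, plus 3.5 for resting needs). The sample paces, heart rates, and function name are all made up for illustration.

```python
def estimate_vo2max(samples, max_hr):
    """Extrapolate flat-ground (speed, heart rate) pairs to max heart rate.

    samples: list of (speed in meters/minute, heart rate in bpm) pairs
    max_hr:  the runner's maximum heart rate in bpm
    Returns an estimated VO2max in mL/kg/min.
    """
    if len(samples) < 2:
        raise ValueError("need at least two (speed, heart rate) samples")
    n = len(samples)
    mean_speed = sum(s for s, _ in samples) / n
    mean_hr = sum(h for _, h in samples) / n

    # Ordinary least-squares fit of: heart_rate = slope * speed + intercept
    sxx = sum((s - mean_speed) ** 2 for s, _ in samples)
    sxy = sum((s - mean_speed) * (h - mean_hr) for s, h in samples)
    if sxx == 0 or sxy <= 0:
        raise ValueError("heart rate must rise with speed to extrapolate")
    slope = sxy / sxx
    intercept = mean_hr - slope * mean_speed

    # Speed the fitted line predicts at maximum heart rate
    speed_at_max = (max_hr - intercept) / slope

    # ACSM running equation on level ground (grade term omitted)
    return 0.2 * speed_at_max + 3.5


# Hypothetical easy-run data: three steady paces and the heart rate at each.
runs = [(150, 140), (170, 155), (190, 170)]
print(round(estimate_vo2max(runs, max_hr=185), 1))
```

Notice how much depends on `max_hr`: give the same runs a max heart rate that is 10 beats too low and the estimate drops by several points, which is why a bad max heart rate guess can throw a watch’s number off.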

Each device has its own algorithm to turn the data it collects into a VO2max estimate, and that starts with recognizing when an activity provides enough data for the algorithm to use. This varies from device to device; Garmin wants a 10-minute minimum activity, while Coros wants 25 minutes. You often need to have a certain minimum heart rate for the algorithm to kick in. Here’s an example from Apple’s developer documentation that describes when and how it calculates VO2max from an activity: 

“The system can generate VO2max samples after an outdoor walk, outdoor run, or hiking workout. During the outdoor activity, the user must cover relatively flat ground (a grade of less than 5% incline or decline) with adequate GPS, heart rate signal quality, and sufficient exertion. The user must maintain a heart rate approximately greater than or equal to 130% of their resting heart rate. The system can estimate VO2max ranges from 14-60 ml/kg/min.” 
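The quantifiable parts of that description can be written out as a simple checklist. This is only my reading of the quoted text, not Apple’s code or any HealthKit API: the function names, activity labels, and the use of an average heart rate are hypothetical, and I have left out GPS and heart rate signal quality because the documentation gives no numbers for them.

```python
# Activity labels are hypothetical stand-ins for the three workout types quoted above.
ELIGIBLE_ACTIVITIES = {"outdoor_walk", "outdoor_run", "hike"}


def workout_qualifies(activity, grade_percent, avg_hr, resting_hr):
    """Return True if a workout meets the numeric conditions in the quoted text."""
    flat_enough = abs(grade_percent) < 5.0     # under 5% incline or decline
    working_hard = avg_hr >= 1.3 * resting_hr  # roughly 130% of resting heart rate
    return activity in ELIGIBLE_ACTIVITIES and flat_enough and working_hard


def in_reportable_range(vo2max):
    """Return True if a value falls within the 14-60 mL/kg/min range quoted above."""
    return 14.0 <= vo2max <= 60.0


print(workout_qualifies("outdoor_run", 2.0, avg_hr=130, resting_hr=60))
print(workout_qualifies("outdoor_run", 6.0, avg_hr=130, resting_hr=60))
```

With a resting heart rate of 60, the exertion threshold works out to only about 78 beats per minute, so in practice the flat-ground and outdoor-activity conditions are more likely to be the ones that disqualify a workout.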

These details vary from device to device. Some Garmin watches can use power meter data from a bike in place of GPS. These algorithms generally require the device to know your maximum heart rate, which they are notoriously bad at estimating, but which they can measure directly if the device is programmed to do so. For a deep dive into what one of these algorithms looks like, here is a paper published by Firstbeat Analytics, which built Garmin’s VO2max algorithm. (It’s not clear if the details described here are exactly the same as what Garmin watches currently use.) 

But some devices don’t give you much detail on how they estimate your VO2max, and some seem to say they may offer a number without collecting any exercise data at all. Whoop, for example, says that “To calculate your score, the algorithm factors your continuous physiological data (including resting heart rate and heart rate variability), your exercise patterns, and GPS-tracked performance metrics (when enabled). It also accounts for how VO2 Max naturally changes with age and incorporates physical factors that influence oxygen utilization, like height, weight, and biological sex.” My Whoop app tells me to do more GPS-tracked activities to improve my VO2max estimate, but according to statements from the company, the app may provide a number even if it doesn’t have GPS data to work from. 

Oura is a bit different from the other devices I tested. Instead of calculating a VO2max estimate from your regular workouts, it prompts you to take a six-minute walking test. This type of test is well known in the medical field, and has been used to estimate VO2max, if imperfectly. 

But there’s a depressing thing to remember about all this. When it comes to knowing how accurate fitness watches actually are, we don’t have enough information to make a scientific judgment. The problem, as I’ve discussed before, is that device makers aren’t required to validate their metrics or to publish their methodology. They just put whatever algorithm they want into whatever device they want, and leave the rest of us to investigate it if we feel like it. By the time scientists are able to design a study, carry it out, and report the results, often enough time has passed that the model they tested is obsolete. 

Studies on smartwatch VO2max estimates generally find that they correlate with tested VO2max results—the higher the smartwatch estimate, the higher the tested VO2max for the same person—but that the exact number can be off by quite a bit. For example, this study on the Apple Watch Series 9 and Ultra 2 concluded that “For individuals with good or excellent fitness, Apple Watch demonstrated a propensity to underestimate VO2 max, whereas among those with poor fitness, there was a tendency to overestimate.”

My results, and the winners

Handwritten result sheet giving my maximum oxygen consumption at 42.8 mL/kg/min
Credit: Beth Skwarecki

I got my official lab result shortly after finishing the treadmill test, and then at home I surveyed the various fitness trackers I’d been wearing lately. Some I had been testing for a review, like the Garmin Forerunner 570; some I wear out of habit because they are my personal devices, like the Oura ring; and some I still had around from previous review testing. You’ll also see a few devices I haven’t finished reviewing yet, so consider this a sneak peek.

For any devices that didn’t have recent data, I made sure to take them for a run or two so they could recalibrate. Where I had multiple devices of the same brand, they all fed data into the same app or algorithm, so I’m organizing the results by brand rather than device. A full list of the devices I used is at the bottom of this article. 

My lab-tested VO2max turned out to be 42.8 mL/kg/min. That was higher than most of the estimates I got from my wearables, so I seem to be in better shape than many of them believe. That said, a few overestimated me—Garmin by just one point, Whoop by about three, Ultrahuman by a bewildering amount. Here’s the full list, sorted by how close they were:

  • Tested VO2max: 42.8

  • Garmin: 44 (1.2 points high)

  • Fitbit: 41 (1.8 points low)

  • Suunto: 40 (2.8 points low)

  • Whoop: 46 (3.2 points high)

  • Apple Watch: 37.9 (4.9 points low)

  • Coros: 37 (5.8 points low)

  • Oura: 37 (5.8 points low)

  • Withings: 36 (6.8 points low)

  • Ultrahuman: 61 (18.2 points high)
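If you want to reproduce the ranking, or plug in your own lab result and device readings, the arithmetic is simple enough to sketch in a few lines. The numbers are the ones from the list above; the variable and function names are just illustrative.

```python
LAB_VO2MAX = 42.8  # my lab-tested value, in mL/kg/min

ESTIMATES = {
    "Garmin": 44, "Fitbit": 41, "Suunto": 40, "Whoop": 46, "Apple Watch": 37.9,
    "Coros": 37, "Oura": 37, "Withings": 36, "Ultrahuman": 61,
}


def signed_errors(lab, estimates):
    """Estimate minus lab value for each brand; positive means the device read high."""
    return {brand: round(value - lab, 1) for brand, value in estimates.items()}


errors = signed_errors(LAB_VO2MAX, ESTIMATES)

# Rank by distance from the lab value, ignoring direction.
ranked = sorted(errors.items(), key=lambda item: abs(item[1]))

for brand, err in ranked:
    direction = "high" if err > 0 else "low"
    print(f"{brand}: {ESTIMATES[brand]} ({abs(err)} points {direction})")
```

Sorting on the absolute error treats a point too high the same as a point too low, which is the only sense of "closest" I am using here.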

Garmin came out on top, estimating a VO2max of 44, just 1.2 points over the actual value. I was expecting Garmin to be pretty good, since it knows my exact max heart rate and I’ve already seen that its 5K race time estimate was pretty close to my actual time. 

I was not expecting Fitbit to be next in line, but hey, good job, Fitbit. I’ve seen other reviews that pegged Suunto as having a reasonably accurate VO2max estimate, so it was nice to see Suunto performing well here, even if it was still a few points off. 

After that, Whoop stands out with its three-points-high estimate of 46. Whoop won’t reveal exactly how it estimates VO2max, but since it supposedly doesn’t require exercise data at all, I don’t trust it very far. (I did make sure to feed it some GPS data during my testing, which it said improved the accuracy of my estimate.) If it’s a guess, at least it’s a flattering guess. 

Ultrahuman’s estimate is so far off I almost didn’t include it. I only started testing the Ultrahuman ring a few days ago, and only did two workouts with it so far—but the other devices on my list were all able to give a plausible estimate the first time a number showed up. I checked my settings, and found that I can’t edit the max heart rate Ultrahuman calculates for me, which is probably affecting the accuracy of the VO2max estimate. But if the Ultrahuman app is working from poor data as a design choice, I’m hardly being unfair by using the number it gives me. So it’s on the list, and I’ve voiced my reservations.

The rest are all around five or more points too low. If I had trusted my Apple Watch, I would think I’m a lot less fit than I really am. Along with Coros, Oura, and Withings, it gave a number in the 30s. I can’t say I’m impressed by any of them.

Limitations

The biggest caveat on my results is this: I’m only one person. If you did this same experiment with 100 different people, we probably wouldn’t all get identical results. Some devices might be more accurate with young athletes, some with ordinary folks, some with people who have naturally higher or lower heart rates, and so on. Devices change. Software gets updated. Please view my results as a snapshot of one person on one day with this specific collection of devices. 

The VO2max estimates from each device have their own parameters that I don’t necessarily know about. I did my best to have a correct (or close-enough) weight, age, and where possible, max heart rate entered in each app. But since the companies don’t all disclose what variables they use in their calculations, I don’t have a full list of numbers to go in and double-check.

There is also no such thing as a perfect test, even when done as well as possible. If I had gotten my VO2max test done on a different day, or at a different lab, my result might have been slightly different, and the rankings might not have come out in quite the same order. 

How useful is the VO2max score on your device? 

I’m going to be honest here: after all that science, I can condense the practical advice into about four words: “Make number go up.” Whether your VO2max comes from a lab test or a smartwatch estimate, it will tend to get higher as you do more exercise, more consistently. 

If the number is increasing, or if it stays steady at a relatively high number, you’re probably doing something right. If it decreases over time, you could take that as a nudge to do a little more cardio.

(If your watch’s estimate isn’t getting higher as you feel you’re getting more in shape, I’d check it by testing your fitness another way, like timing yourself running a certain distance, or even gauging how you feel during a workout you’ve done before, and seeing if that improves over time. But normally we’d expect changes in these VO2max estimates to keep pace with fitness improvements.)

Besides a VO2max estimate, most of these devices also tell you how good your VO2max is relative to other people of your gender and age group. Garmin has me as “excellent” and once, briefly, I had a score of “superior.” Apple says my cardio fitness is “high,” Oura says my cardio capacity is “peak,” and Suunto says I’m “excellent.” 

Without quibbling too much about where the borders of these ranges might lie, I think these are fair judgments given that the lab said I’m in the 96th percentile of my cohort of middle-aged women. That sounds impressive on paper, but in real life I'm a pretty average runner. That "for your age and gender" asterisk is doing a lot of work.

But let’s take a step back for a minute. VO2max is just a number. My real goals in life involve being healthy and happy, and maybe improving my 5K time as a treat. If I were a true masochist like some people around here, I might add wanting to run marathons faster and faster. 

Your VO2max is connected to all of that, but it’s not literally the same thing. You can have a high VO2max and still have health problems. Athletes often find that their real-life race times are faster or slower than their VO2max test results would suggest. Coaches don’t just say “let’s get your VO2max up.” They’ll have runners work on their lactate threshold, their running economy, their mental toughness, their leg strength, and dozens of other things. 

Health and fitness are multifaceted and can’t be boiled down to a single number. So while you can use VO2max (or its smartwatch estimate) as a shorthand for cardio fitness, it’s certainly not a direct measurement, nor does reaching a certain VO2max number unlock a certain level of health or longevity. 

The specific device models I used

In some cases, multiple devices fed data to the same app or algorithm. For example, even if you have three Garmin watches linked to the same account, you only get one VO2max score that will display in the Garmin Connect app and on any of the watches. The watches will not disagree with each other in their scores.

In the past I have tested other devices of these brands, and never saw a significant difference from one device to another within the same brand. For example, I recall similar cardio fitness scores from the Fitbit app whether I was wearing the Charge 6 or the Pixel Watch 3. So I feel pretty confident reporting these scores per app rather than per device.

With that in mind, the list below includes the devices I used around the time of my VO2max lab test as the primary sources for each brand’s estimate. 

  • Apple Watch: Series 10 (GPS + cellular, 42 mm)

  • Coros: Pace 3 (Used less recently: Pace Pro)

  • Suunto: Suunto Run

  • Withings: Scanwatch 2

  • Whoop: Whoop 4.0 

  • Ultrahuman: Ring AIR

I made sure to get an updated VO2max estimate from each device within about a week of my VO2max lab test (either before or after the test, as convenient). The only exception was Whoop, which requires 14 days of recent sleep data to give you an up-to-date VO2max estimate. My last VO2max estimate from Whoop was three weeks prior to my VO2max lab test.