III. Demand Modeling Overview | Bureau of Health Workforce

Published 2025

In this module:

Constructing the population databases
Modeling demand for health care services
Modeling the workforce demand implications of COVID-19
Staffing to meet demand for health care services
Demand scenarios

The demand component of the Health Workforce Simulation Model (HWSM) first projects demand for health care services, and then estimates the number and mix of health care workers required to meet projected demand for services. Demand for health care workers is reported as full-time equivalents (FTEs) using the same 40 hours/week definition as supply. Therefore, supply and demand are directly comparable.

There are three major elements for modeling demand:

A population database contains demographic, socioeconomic, health status, and health risk information for a representative sample of the current and projected future population in each county. County data sum to the state and national levels.
Prediction equations of the demand for health care services relate each individual’s characteristics (in the population database) to annual health service use by care delivery setting and by health profession seen or diagnosis category.
Staffing patterns convert demand for services into demand for providers.

Exhibit III‑1 presents a flow diagram for the demand component of the HWSM. Not all care delivery sites pertain to every health occupation modeled. The drivers of growth in demand for hospital-based occupations are projected growth in inpatient days and emergency visits. Growth in ambulatory visits is the demand driver for growth in demand for health care workers in office and outpatient-based settings. For the “other employment” settings, the workload metric for demand is included in parentheses. For example, growth in the population age 5-17 is the demand driver for growth in demand for care in school-based care or counseling.

Exhibit III-1: Flow Diagram for the Demand Component of the HWSM

This diagram shows how various inputs are synthesized into demand projections. It gives a high-level view of the model. The individual components are explained in more detail in this module. The basic equation is that combining Utilization Patterns and Population Data translates into a Demand for Services. Demand for Services is then translated into Demand for Health Workers using Staffing Ratios.

Sources: MEPS=Medical Expenditure Panel Survey, NIS=National Inpatient Sample, NHAMCS=National Hospital Ambulatory Medical Care Survey, ACS=American Community Survey, BRFSS=Behavioral Risk Factor Surveillance System, CMS MDS=Centers for Medicare and Medicaid Services Long-Term Care Minimum Data Set, MCBS=Medicare Current Beneficiary Survey. Population projections come from states, S&P Global, and the U.S. Census Bureau.

Utilization patterns are the relationships between patient characteristics and health care use. The population data include demographic, socioeconomic, and health risk factors. Both the utilization patterns and the population data draw on a variety of sources. Population data are also valuable for quantifying specific population types, like the school-age population, that might demand specific types of care.

Demand for services is derived by combining characteristics of the population with data on how parts of the population use services. Demand for services includes items like hospital inpatient days by diagnosis, provider visits by occupation/specialty, and hospice visits by population, among others.

Demand for services is then converted into demand for health care workers by applying staffing ratios. A staffing ratio takes a demand for services, like the annual number of visits to a dentist, and converts it into the number of dentists needed to meet that demand.
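As a rough illustration of this conversion, the sketch below divides projected annual visits by the number of visits one FTE is assumed to handle in a year. The function name and both input values are hypothetical placeholders chosen for the example; they are not HWSM parameters.

```python
def fte_demand(annual_services: float, services_per_fte: float) -> float:
    """Convert annual service demand into FTE demand using a staffing ratio."""
    if services_per_fte <= 0:
        raise ValueError("services_per_fte must be positive")
    return annual_services / services_per_fte


# Hypothetical inputs, chosen only for illustration.
projected_dental_visits = 1_500_000
visits_per_dentist_fte = 3_000

dentist_ftes = fte_demand(projected_dental_visits, visits_per_dentist_fte)
print(dentist_ftes)  # 500.0
```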

Constructing the population databases

General approach

The microsimulation approach models demand for health care services separately for individual people and then aggregates projected service demand to the population level. This approach requires individual level (micro) data on the predictors of health care use for each person in a representative sample of a designated geographic region (national, state, or county-equivalent).

Prior to 2019, the HWSM produced projections at the state and national levels based on constructed state-level population databases. Population files were constructed starting in 2019 for each of the approximately 3,142 counties or county equivalents (for example, parishes, boroughs, independent cities) in the United States, excluding U.S. territories. Modeling at the county level facilitates evaluation of supply and demand by rurality across states and the nation. This allows for better modeling of health workforce supply for medically vulnerable communities and populations. County population files can be combined to produce state files, which in turn combine to produce the national file.

County level population files start with combining data from multiple sources, as specified later, to create preliminary state population files.

These files contain a representative sample of the population in each state by:

demographics
household income level
medical insurance type
institutional residency status (for example, resides in the community, in a residential care facility, or in a nursing home)

Then, the population data are re-calibrated to produce a representative sample of the population in each county with the prevalence of health care use demand determinants (demographics, disease, lifestyle choices, and medical insurance) benchmarked to external sources.

The core microdata file on which the HWSM’s baseline population databases are built is the most recent year (2023) of the American Community Survey (ACS). The ACS provides the demographic and socioeconomic characteristics of a representative sample of the population in each state. The ACS reports information on medical insurance type, household income, and whether the person lives in a community or institutional setting. A statistical matching process, described later, is used to add health risk factors and information on disease presence. Using random sampling with replacement, each person in the ACS is matched with a similar person in the Behavioral Risk Factor Surveillance System (BRFSS), the Medicare Current Beneficiary Survey (MCBS),^1 or the Centers for Medicare and Medicaid Services (CMS) Long-Term Care Minimum Data Set (MDS).^2 This process preserves the number of records from the ACS file as well as each record’s ACS sample weight, and thus produces a preliminary population file for each state with population characteristics representative of that state. Each record has a person’s demographics, health-related lifestyle indicators, health conditions, socioeconomic and insurance characteristics, and residency setting.

Demographics
- Children (age groups 0-2, 3-5, 6-13, 14-17 years)
- Adults (age groups 18-34, 35-44, 45-64, 65-74, 75+ years)
- Sex (male, female)
- Race/ethnicity (non-Hispanic White, non-Hispanic Black, non-Hispanic other, Hispanic)
Health-related lifestyle indicators
- Body weight status (normal, overweight, obese)
- Current smoker status (yes, no)
Health conditions (diagnosis coded as yes, no)
- Arthritis, asthma, cardiovascular disease, diabetes, hypertension
- History of cancer, history of heart attack, history of stroke
Socioeconomic conditions and insurance
- Household annual income (<$10,000, $10,000 to <$20,000, $20,000 to < $35,000, $35,000 to < $50,000, $50,000 to < $75,000, $75,000 to < $100,000, $100,000 to < $150,000, $150,000+)
- In managed care plan (yes, no)
Residency setting
- Non-institutionalized in the community
- Group quarters (which includes residential care facilities and nursing homes)
Geographic location
- State
- 2013 NCHS Urban-Rural Classification Scheme for Counties^3

As illustrated in Exhibit III‑2, for the community-based population, each individual in the ACS file is matched with someone in the BRFSS of the same sex, age group (17 age groups used), race, ethnicity, medical insurance type, household income level (eight income categories), and state of residence.^4

Individuals residing in a group setting are randomly matched to a person in the MCBS or Nursing Home MDS in the same state, age group, sex, and race and ethnicity strata. The total number of people living in nursing homes and residential care, by state and age group, is constructed to match published numbers from the Centers for Disease Control and Prevention (CDC), showing nearly 1.2 million nursing home residents and 1.01 million people living in residential care nationally.^5,6
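The matching step can be sketched as a stratified draw with fallback rounds. This is a minimal sketch under stated assumptions, not the HWSM implementation: the field names, the three illustrative rounds in `MATCH_ROUNDS`, and the toy records are all hypothetical. It shows the two properties the text describes: donors are sampled with replacement within a stratum, and each recipient record keeps its own sample weight.

```python
import random

# Match keys from most to least specific. Later rounds relax the strata.
# The exact keys and number of rounds here are illustrative assumptions.
MATCH_ROUNDS = [
    ("sex", "age_group", "race_eth", "insurance", "income", "state"),
    ("sex", "age_group", "race_eth", "insurance", "state"),
    ("sex", "age_group"),
]


def build_index(donors, keys):
    """Group donor records by their values on the given keys."""
    index = {}
    for d in donors:
        index.setdefault(tuple(d[k] for k in keys), []).append(d)
    return index


def match(recipients, donors, health_vars, rng):
    """Attach donor health variables to each recipient; keep recipient weights."""
    indexes = [(keys, build_index(donors, keys)) for keys in MATCH_ROUNDS]
    matched = []
    for r in recipients:
        for keys, index in indexes:
            pool = index.get(tuple(r[k] for k in keys))
            if pool:
                donor = rng.choice(pool)  # sampling with replacement
                matched.append({**r, **{v: donor[v] for v in health_vars}})
                break
        else:
            raise LookupError("no donor found in any round")
    return matched


# Toy records standing in for ACS (recipients) and BRFSS (donors).
acs = [
    {"sex": "F", "age_group": "45-64", "race_eth": "Hispanic", "insurance": "private",
     "income": "$50k-<$75k", "state": "TX", "weight": 120.0},
    {"sex": "M", "age_group": "18-34", "race_eth": "NH White", "insurance": "Medicaid",
     "income": "<$10k", "state": "OH", "weight": 95.0},
]
brfss = [
    {"sex": "F", "age_group": "45-64", "race_eth": "Hispanic", "insurance": "private",
     "income": "$50k-<$75k", "state": "TX", "smoker": False, "diabetes": True},
    {"sex": "M", "age_group": "18-34", "race_eth": "NH Black", "insurance": "private",
     "income": "$35k-<$50k", "state": "CA", "smoker": True, "diabetes": False},
]

matched = match(acs, brfss, ["smoker", "diabetes"], random.Random(0))
print(len(matched), [m["weight"] for m in matched])
```

In this toy data the first record matches in the first round, while the second finds a donor only after the strata are relaxed to sex and age group.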

1 Multiple years of Medicare Current Beneficiary Survey (MCBS) data are used to increase sample size. The combined 2021 and 2022 MCBS files are the most recent data available.
2 The 2021 Long-Term Care Minimum Data Set (MDS) file is the most recent data available.
3 Centers for Disease Control and Prevention. 2013 NCHS Urban-Rural Classification Scheme for Counties.
4 The first round of BRFSS-ACS matching produced a match in the same strata for 91% of the population. To match the remaining 9%, the eight income levels were collapsed into four (3% matched), then the race/ethnicity dimension was dropped (2% matched), then the first-round criteria were applied with state removed as a stratum (4% matched), and finally, in the fifth round, only demographics were used (remaining 0.05% matched).
5 Kaiser Family Foundation. Total Number of Residents in Certified Nursing Facilities. Published 2024. Accessed May 6, 2025.
6 Melekin A, Sengupta M, Caffrey C. Residential care community resident characteristics: United States, 2022. NCHS Data Brief, no 506. Hyattsville, MD: National Center for Health Statistics. 2024. Accessed May 6, 2025.

After creating the preliminary state population file, the county-level population file is constructed and calibrated. The U.S. Census Bureau reports data on the aggregate number of people in each county in 2023 by five-year age group, sex, and race/ethnicity. Using the NCHS urban-rural classifications, each county is categorized as metropolitan or nonmetropolitan.^7 In the constructed preliminary state population file, the population is first divided into metropolitan/nonmetropolitan location using the metropolitan designation in BRFSS.^8 In this file, the number of people in nonmetropolitan areas is understated relative to other published statistics.^9

The sample weights are then re-weighted for people identified as metropolitan to match the demographics of the population in each metropolitan county. Sample weights are also re-weighted for people identified as nonmetropolitan to match the demographics in each nonmetropolitan county. This produces a weighted sample for each county that is representative of the demographics in each county. The other variables (for example, household income, insurance coverage, disease prevalence, and prevalence of health risk factors) in this weighted sample are representative of the demographically-adjusted metropolitan and nonmetropolitan populations. The county population files are calibrated to match data from external sources on disease prevalence, lifestyle choices, and medical insurance status.
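The re-weighting step can be pictured as a simple post-stratification: within each demographic cell, weights are scaled so the weighted count equals the county total. The sketch below assumes hypothetical field names, cell definitions, and county totals. It illustrates why the other variables end up reflecting the demographically adjusted sample: the share with a condition inside each cell is unchanged by the scaling.

```python
from collections import defaultdict


def reweight_to_county(records, county_targets, keys=("age_group", "sex")):
    """Scale sample weights so weighted counts match county totals per cell."""
    current = defaultdict(float)
    for r in records:
        current[tuple(r[k] for k in keys)] += r["weight"]
    out = []
    for r in records:
        cell = tuple(r[k] for k in keys)
        if current[cell] <= 0:
            raise ValueError(f"cell {cell} has no positive weight to scale")
        # Cells absent from the county targets receive zero weight.
        factor = county_targets.get(cell, 0.0) / current[cell]
        out.append({**r, "weight": r["weight"] * factor})
    return out


# Hypothetical nonmetropolitan sample and one county's demographic totals.
nonmetro_sample = [
    {"age_group": "65-74", "sex": "F", "weight": 100.0, "diabetes": True},
    {"age_group": "65-74", "sex": "F", "weight": 300.0, "diabetes": False},
    {"age_group": "18-34", "sex": "M", "weight": 200.0, "diabetes": False},
]
county_targets = {("65-74", "F"): 50.0, ("18-34", "M"): 80.0}

county_file = reweight_to_county(nonmetro_sample, county_targets)
county_total = sum(r["weight"] for r in county_file)
print(round(county_total, 6))
```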

County-level estimates of disease prevalence are calibrated at the individual level so that the population prevalence numbers exactly match published statistics. Calibration is achieved by first estimating a series of logistic regression equations using BRFSS data.

The dependent variable is whether the person has the modeled condition or risk factor, with separate regressions^10 used to model:

arthritis
asthma
hypertension
cardiovascular disease
diabetes
history of cancer
history of heart attack
history of stroke
obesity
current smoker

Independent variables in the regression equations are:

demographic variables used for demand modeling (age group, sex, race/ethnicity)
dichotomous variable indicating whether the person has exercised or participated in physical activity other than their regular job in the past 30 days
body weight status—normal, overweight, obese (except for the obesity regression)
current smoker status (except for the smoker regression)
hypertension—included for modeling cardiovascular disease, history of cancer, history of heart attack, and history of stroke

Applying the prediction equation to each person in the constructed population file creates a probability that the person has the condition or risk factor. This probability is compared to a random number drawn from a uniform distribution between 0 and 1 to determine whether the person is assigned the condition or risk factor. The population file prevalence for a specific condition or risk factor is then adjusted (if needed) until the population prevalence exactly matches published statistics for that county in the 2024 CDC PLACES database (which is based on 2021/2022 BRFSS data).^11
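The text does not specify how the adjustment is carried out, so the following is only one plausible sketch: each person is flagged when a fixed uniform draw falls below the predicted probability, and a single shift on the logit scale is found by bisection so that weighted prevalence just reaches the target. The probabilities, weights, target value, and all function names are hypothetical. With a finite sample, the match is exact only up to the weight of one record.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def weighted_prevalence(flags, weights):
    """Share of the weighted population flagged as having the condition."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)


def assign(probs, draws, shift):
    """Flag a person when the uniform draw falls below the shifted probability."""
    return [u < sigmoid(logit(p) + shift) for p, u in zip(probs, draws)]


def calibrate(probs, weights, target, rng, iters=60):
    """Bisect on a logit-scale shift until prevalence just reaches the target."""
    draws = [rng.random() for _ in probs]  # fixed draws keep the search monotone
    lo, hi = -20.0, 20.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if weighted_prevalence(assign(probs, draws, mid), weights) < target:
            lo = mid
        else:
            hi = mid
    return assign(probs, draws, hi)


rng = random.Random(42)
n = 5000
probs = [rng.uniform(0.02, 0.40) for _ in range(n)]  # stand-in for regression output
weights = [1.0] * n
target = 0.12  # hypothetical published county prevalence

flags = calibrate(probs, weights, target, rng)
calibrated = weighted_prevalence(flags, weights)
print(round(calibrated, 4))
```

Shifting on the logit scale preserves the ranking of individuals by predicted risk, so people with higher predicted probabilities remain more likely to be assigned the condition after calibration.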

County-level data on “history of heart attack” prevalence are unavailable in CDC PLACES, so heart attack prevalence estimates come from state health departments (Exhibit III‑3). The availability of county-level data varies by state, and for some states no such data are available.^12 When published prevalence is unavailable, the HWSM uses the prevalence rate created in the constructed population file.

7 Centers for Disease Control and Prevention. 2013 NCHS Urban-Rural Classification Scheme for Counties.
8 The BRFSS, administered annually by the CDC, collects data on a sample of over 500,000 individuals. Like the ACS, the BRFSS includes demographics, household income, and medical insurance status on a stratified random sample of households in each state. The BRFSS also collects detailed information on the presence of chronic conditions and other health risk factors (for example, obesity, smoking). The 2022 and 2023 files are combined to provide records for approximately one million individuals. The 2022 file lacks data on hypertension and hypercholesterolemia (variables that are omitted from the even-year BRFSS files). A predictive equation, based on analysis of the 2023 file, is applied to the 2022 BRFSS to estimate the probability of having hypertension or hypercholesterolemia as a function of other known characteristics of the person (for example, demographics, family income, obesity status, and smoking status). To create the health risk factor dataset, health status prevalence percentages are gathered for each county and county equivalent in the United States (approximately 3,142 counties within 50 states and the District of Columbia). The prevalence of 12 health risk factors/conditions in the county-level population databases is representative of the prevalence of 12 risk factor categories from BRFSS. They are coronary heart disease, stroke, current smoking, heart attack, current asthma, obesity, diabetes, high blood pressure, arthritis, cancer, high cholesterol, and current insurance status. Smoking, asthma, obesity, and insurance status reflect the individual’s current status, while the other eight categories reflect lifetime status. Obesity status is calculated based on the individual’s current weight.
9 U.S. Department of Agriculture. USDA ERS - Population & Migration. Published June 12, 2025. Accessed July 23, 2025.
10 The logistic regression equations are unweighted because the independent variables (for example, demographics) are the same variables used in the development of the BRFSS sample weights.
11 Centers for Disease Control and Prevention. PLACES: Local Data for Better Health. Published October 2024. Accessed December 31, 2024.
12 County level prevalence data were not available for Alabama, Arkansas, Idaho, Indiana, Maine, Minnesota, Mississippi, Montana, Nebraska, Nevada, New Mexico, New York, North Dakota, Ohio, Tennessee, Texas, Utah, Vermont, Washington, Wisconsin, and Wyoming.

Exhibit III-3: State Sources for County Prevalence of History of Heart Attack

State	Source
Alaska	Alaska BRFSS
Arizona	Arizona BRFSS
California	California BRFSS
Colorado	Colorado BRFSS
Connecticut	Requested from Connecticut Dept. of Health^a
Delaware	Delaware BRFSS
Florida	Florida BRFSS
Georgia	Requested from Georgia BRFSS^a
Hawaii	Hawaii BRFSS
Illinois	Illinois BRFSS
Iowa	Iowa BRFSS
Kansas	Kansas BRFSS
Kentucky	Kentucky BRFSS
Louisiana	Requested from Louisiana BRFSS^a
Maryland	Maryland BRFSS
Michigan	Michigan BRFSS, supplemented by special request^a
Missouri	Missouri Public Health Information Management System
New Hampshire	Requested from New Hampshire Division of Health Services^a
New Jersey	New Jersey BRFSS
New York	New York BRFSS
North Carolina	North Carolina BRFSS
Oklahoma	Oklahoma BRFSS
Oregon	Oregon BRFSS
Pennsylvania	Pennsylvania BRFSS
Rhode Island	Requested from Rhode Island Dept. of Health^a
South Carolina	Requested from South Carolina BRFSS^a
West Virginia	West Virginia BRFSS
Massachusetts	Massachusetts BRFSS
South Dakota	South Dakota BRFSS
Virginia	Requested from Virginia Dept. of Health^a

Note^a: Received data via email from respective office

Developing demand forecasts requires creating population databases for future populations. Sample weights of the starting year population are adjusted to match population demographics (age group, sex, race and ethnicity) in the projections. The implicit assumption is that baseline prevalence rates of health and health behavior characteristics remain the same within each demographic stratum (by age, sex, race and ethnicity) into the future.

Population projections

The projections start with year 2023 county-level population estimates, by demographic characteristics, published by the U.S. Census Bureau. State and county-level population projections through 2038 come from government agencies and universities, as well as from S&P Global demographers^13 for those states that publish projections with a vintage older than 2021 (Exhibit III-4). While each state data center as well as the S&P Global demographers use different assumptions and approaches for projecting future population, most population projections use a stock-and-flow type model called the ‘Cohort-Component’ method. This method projects fertility, mortality, and migration rates by breaking populations down into at least nativity, age, sex, and race subsets, then assigning flow rates based on historical data. These population projections incorporate much of the excess deaths through June 2023 as well as the most recent data on birth rates and migration patterns. The updated national projection for 2037 based on the recent data is about 840,000 higher than the projection produced last year.
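To make the stock-and-flow idea concrete, the toy sketch below advances a population by one step of a heavily simplified cohort-component calculation. It uses age groups only, whereas the methods described above also split by nativity, sex, and race, and every rate and count is a hypothetical value chosen for readability. It is not the method of any particular state data center or of S&P Global.

```python
def project_one_step(pop, survival, fertility, net_migration):
    """Advance a population vector one step (step length = age-group width).

    pop[i]            people in age group i at the start of the step
    survival[i]       share of group i still alive at the end of the step
    fertility[i]      births per person in group i over the step
    net_migration[i]  net migrants added to group i at the end of the step
    """
    n = len(pop)
    births = sum(p * f for p, f in zip(pop, fertility))
    nxt = [0.0] * n
    nxt[0] = births                      # newborns enter the youngest group
    for i in range(n - 1):
        nxt[i + 1] += pop[i] * survival[i]   # survivors age into the next group
    nxt[-1] += pop[-1] * survival[-1]    # open-ended oldest group keeps its survivors
    return [x + m for x, m in zip(nxt, net_migration)]


# Hypothetical three-group population (young, working age, older).
pop_start = [100.0, 100.0, 100.0]
survival = [1.0, 0.9, 0.5]
fertility = [0.0, 0.3, 0.0]
net_migration = [5.0, 0.0, -5.0]

pop_next = project_one_step(pop_start, survival, fertility, net_migration)
print([round(x, 6) for x in pop_next])
```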

13 © 2025 by S&P Global Market Intelligence, a division of S&P Global. County and State Population Projections. https://www.spglobal.com/market-intelligence/en/solutions/products/regional-explorer-economics-data-analytics#research-analysis

Exhibit III-4: Sources for County Population Projections

State	Source
Alabama	S&P Projections
Alaska	Alaska Dept. of Labor and Workforce Development
Arizona	Arizona Office of Economic Opportunity
Arkansas	Arkansas Economic Development Institute
California	California Dept. of Finance
Colorado	Colorado State Demography Office
Connecticut	Connecticut State Data Center
Delaware	Delaware Population Consortium
District of Columbia	S&P Projections
Florida	Florida Bureau of Economic and Business Research
Georgia	Georgia Governor's Office of Planning and Budget
Hawaii	Hawaii Dept. of Business, Economic Development & Tourism
Idaho	S&P Projections
Illinois	S&P Projections
Indiana	S&P Projections
Iowa	S&P Projections
Kansas	Kansas Center for Economic Development and Business Research
Kentucky	Kentucky State Data Center
Louisiana	S&P Projections
Maine	Maine Dept. of Administrative and Financial Services
Maryland	Maryland State Data Center
Massachusetts	University of Massachusetts Amherst Population Estimates Program
Michigan	Michigan Dept. of Technology, Management & Budget
Minnesota	Minnesota State Demographic Center
Mississippi	S&P Projections
Missouri	S&P Projections
Montana	S&P Projections
Nebraska	S&P Projections
Nevada	S&P Projections
New Hampshire	New Hampshire Dept. of Business and Economic Affairs
New Jersey	S&P Projections
New Mexico	S&P Projections
New York	S&P Projections
North Carolina	North Carolina Office of State Budget and Management
North Dakota	S&P Projections
Ohio	Ohio Dept. of Development
Oklahoma	S&P Projections
Oregon	Portland State University Population Research Center
Pennsylvania	S&P Projections
Rhode Island	S&P Projections
South Carolina	S&P Projections
South Dakota	S&P Projections
Tennessee	Tennessee State Data Center
Texas	Texas Demographic Center
Utah	University of Utah Kem C. Gardner Policy Institute
Vermont	S&P Projections
Virginia	University of Virginia Weldon Cooper Center for Public Service
Washington	Washington State Office of Financial Management
West Virginia	West Virginia University John Chambers College of Business and Economics
Wisconsin	S&P Projections
Wyoming	S&P Projections

The U.S. population is projected to grow by 26.5 million people (7.9%), from 334.9 million in 2023 to 361.4 million in 2038 (Exhibit III-5). Almost half of this growth is among the population aged 75 years and older, with projected growth of 12.7 million (52.0%), suggesting high growth in demand for health care services used by older patients. The population under age 18 years is projected to decline by 767,270 (a 1.1% decline), reflecting low birth rates and suggesting slow growth in demand for pediatric care.

Detailed description

Population file validation

A key demand component of the HWSM is the set of constructed population files containing person-level data for a representative sample of the current and projected future population. Within these population files, the variables most highly correlated with use of health care services are age, having medical insurance, whether the medical insurance is Medicaid, and presence of chronic diseases. Other variables correlated with use of many health care services are race/ethnicity, rurality of the county in which the person resides, and whether the insured person is in a managed care plan.

Sex is correlated with use of some health care services, as are current smoking status and body weight status. Household income is correlated with use of oral health services. For most health care services, after controlling for whether the person has medical insurance, the correlation between household income and care utilization diminishes. Consequently, it is more important that some patient characteristics be accurately reflected in the population file than others.

Prior to 2019, population files were constructed at the state level. One challenge is that ACS does not have a metropolitan/nonmetropolitan variable, whereas BRFSS contains this information. Consequently, metropolitan could not be a stratum when statistically matching a person in ACS with a similar person in BRFSS (to add the health-related variables in BRFSS that were absent in ACS). This match process understated the size of the population in nonmetropolitan areas. States generally do not provide data to calibrate/validate the population characteristics by metropolitan/nonmetropolitan location. Another challenge is that states generally do not produce population projections by metropolitan/nonmetropolitan designation. However, population projections and characteristics that can be used to calibrate/validate the population file are available at the county level.

Starting in 2019, there was increased policy interest in modeling at the sub-state level. Therefore, the population files used to model demand were constructed to be representative of each county. This allows aggregation by level of rurality based on each county’s urban-rural designation.^14 The approach used to construct the population files ensures that demographics of the state’s population are identical whether one constructs state-level files or constructs county-level files and then aggregates to the state level.

However, sampling issues with surveys such as BRFSS and ACS can result in slightly different estimates of prevalence for population characteristics (for example, disease prevalence) depending on whether state population files are constructed directly or county population files are constructed and then aggregated to the state level. This is particularly true when projecting into future years because some counties within a state are growing faster than other counties. The characteristics of faster-growing counties will have a larger impact on the state-wide prevalence of select characteristics.

Two approaches were explored to develop the county population files. In addition to the approach described earlier, an alternative approach would use Public Use Microdata Areas (PUMAs) as the sampling unit from ACS and build county-level population files up from each PUMA. Conceptually, this alternative improves on the current approach because the county sample would be drawn from a geographic area narrower than the state-wide data files.

However, there are drawbacks to using this approach.

Multiple years of data are required (for example, three-year or five-year files) to increase sample size, so the population file would be constructed with slightly older data instead of the most recent available data.
The ACS sample might be small for some demographic groups even after combining multiple years of ACS data.
The contiguous counties that constitute some PUMAs can cross urban-rural designations.

Extensive validation exercises were conducted on the county-level files to determine if use of PUMA designations improved the county-level population files. Both approaches produce almost identical counts of population demographics (age, sex, race/ethnicity) because demographic characteristics are calibrated to Census Bureau county population statistics. Both approaches required that disease prevalence, medical insurance prevalence, and prevalence of health risk factors (obesity and smoking) be calibrated to match estimates from published sources.

Published data on household income by county do not lend themselves to validating whether one of the two approaches performs better. Household income for each person in the sample is reported in ranges that cannot be averaged. In summary, there is no strong evidence that one approach performed better than the other to construct the population files. The current approach takes advantage of more recent data while the PUMA approach might better capture intra-state variation in household income. After controlling for medical insurance, household income appears to have only a small impact on annual use of health care services.

These evaluations revealed that the constructed county population files are representative of the counties’ characteristics described by published statistics.

Modeling demand for health care services

Demand for health care workers derives from the demand for the services that they provide. The enormous number and variety of services provided is captured by approximately 68,000 ICD-10-CM diagnostic codes and 87,000 ICD-10-PCS procedure codes. For modeling, the HWSM projects future demand for health care services using broad categories of ICD-10 codes, as well as information on the occupation or specialty of the health care worker who provided a service type when provider information is available.

The Medical Expenditure Panel Survey (MEPS) is the primary source of data on annual use of health care services and patterns of care use by patient characteristics. MEPS data consist of both self-reported information obtained during the survey and medical record extraction from care providers. MEPS reports the occupation and/or specialty of the highest-trained clinician seen by the patient during an office or outpatient visit. That is, an office visit to a cardiologist, dentist, or physical therapist will indicate the provider type. MEPS does not record the provider type for care provided by other health care workers seen during the visit (advanced practice providers, nurses, medical assistants, phlebotomists, etc.) working under the direction of a doctor or other higher-trained provider. For hospitalizations and emergency visits, MEPS does not indicate what type of provider was seen but does have ICD-10-CM diagnosis codes to model the broad category for why the person was admitted. Analysis of MEPS is supplemented by analysis of the National Inpatient Sample (NIS), which has a much larger number of hospitalizations for modeling length of stay, as well as the National Hospital Ambulatory Medical Care Survey (NHAMCS) discussed later.

Exhibit III‑6 summarizes the health care use metrics for modeling demand for health care services. Under a status quo scenario where care delivery patterns remain unchanged, the rate of projected growth in demand for health workers is assumed to be the same as the rate of projected growth in demand for services. For some care delivery or employment settings, population data are used as a proxy for service demand.
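Under the status quo assumption, worker demand scales in proportion to service demand. A minimal sketch of that proportionality is shown below; the function name and the figures for baseline FTEs and inpatient days are hypothetical and serve only to show the arithmetic.

```python
def project_fte_status_quo(base_fte, base_services, future_services):
    """Scale baseline FTE demand by the growth in service demand."""
    if base_services <= 0:
        raise ValueError("base_services must be positive")
    return base_fte * future_services / base_services


# Hypothetical hospital-based occupation driven by inpatient days.
base_fte = 10_000
base_inpatient_days = 2_000_000
future_inpatient_days = 2_300_000  # 15% growth in the workload measure

future_fte = project_fte_status_quo(base_fte, base_inpatient_days, future_inpatient_days)
print(future_fte)  # 11500.0
```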

14 Centers for Disease Control and Prevention. 2013 NCHS Urban-Rural Classification Scheme for Counties.

Exhibit III-6: Care Delivery/Employment Settings and Health Care Utilization Measures

Ambulatory care
Setting and Service Type	Health Care Utilization and Population Measures
Physician and other provider offices	Annual total visits by provider type and specialty; Rx scripts
Outpatient departments and clinics^a	Annual total visits by provider type and specialty; Rx scripts
Dental offices	Annual dental non-cleaning visits distributed to general and specialty dentists; annual dental cleaning visits assigned to dental hygienists

Hospital inpatient and emergency care
Setting and Service Type	Health Care Utilization and Population Measures
Hospital inpatient (includes skilled nursing facility [SNF] units of hospitals)	Hospitalizations by primary diagnosis (ICD-9/ICD-10); Rx scripts
Hospital emergency department	Emergency visits by primary diagnosis (ICD-9/ICD-10); Rx scripts

Home health and hospice care
Setting and Service Type	Health Care Utilization and Population Measures
Home health/hospice	Annual total visits by provider type

Post-acute care and long-term care
Setting and Service Type	Health Care Utilization and Population Measures
Nursing home (includes free standing SNF)	Total nursing home residents
Residential care facilities	Total population in residential care facilities

Other settings
Setting and Service Type	Health Care Utilization and Population Measures
Educational institutions	Number of health care workers trained annually
Public/community health	Total population
School health	Population aged 5-18 years
All other settings (for example, insurance companies, life sciences companies)	Total population

Note^a: Examples of outpatient clinics include well-baby clinics/pediatric outpatient departments; obesity clinics; eye, ear, nose, and throat clinics; family planning clinics; cardiology clinics; internal medicine departments; alcohol and drug abuse clinics; physical therapy clinics; and radiation therapy clinics.

To model health care-seeking behavior, we pooled five years of MEPS data (2018-2022) to provide a sufficient sample size for regression analysis.

Regression analyses yielded predicted probabilities and intensity of health care use by care delivery setting and type of services, based on a person’s:

age group
race/ethnicity
smoking status
body weight category (normal, overweight, obese)
presence of chronic conditions (diagnosed with arthritis, asthma, coronary heart disease, diabetes, or hypertension, and history of cancer, heart attack, or stroke)
insurance type
enrollment in a managed care plan
household income level
rurality of residence
MEPS survey year (included to test for systematic changes in utilization over the five years of MEPS data analyzed)

Predicted probabilities are then applied to the relevant population databases to estimate market demand in the given year if the local population had care use patterns similar to a national peer group (that is, a population with similar demographics, risk factors, and other characteristics).

Summing predicted probabilities across individuals provides estimated annual health care use for:

office visits
outpatient visits
emergency department visits
hospitalizations
home health visits
hospice visits
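
The aggregation step can be sketched as follows. This is a minimal illustration rather than the HWSM implementation: the record layout, the sample weights, and the predicted values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from a population database. Each person carries a
# sample weight (number of residents represented) and model-predicted
# annual use per service type. All values are illustrative assumptions.
population = [
    {"weight": 120.0, "predicted": {"office_visits": 3.2, "ed_visits": 0.15}},
    {"weight": 80.0, "predicted": {"office_visits": 5.0, "ed_visits": 0.30}},
    {"weight": 200.0, "predicted": {"office_visits": 1.1, "ed_visits": 0.05}},
]


def aggregate_demand(people):
    """Sum weighted predicted use across individuals, by service type."""
    totals = defaultdict(float)
    for person in people:
        for service, expected_use in person["predicted"].items():
            totals[service] += person["weight"] * expected_use
    return dict(totals)


demand = aggregate_demand(population)
```

Because each record is a person rather than a population cell, the same loop can produce county, state, or national totals by filtering the records before summing.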

Modeled care delivery and health care worker employment settings are discussed below.

Office/clinic visits

MEPS data are used to quantify the relationship between patient characteristics and number of annual office/clinic visits or hospital outpatient visits with a provider. MEPS contains data on visits to many types of providers, including physicians, psychologists, dentists, optometrists, opticians, physical therapists, occupational therapists, and other types of providers.

Negative binomial regression is used to model annual visits. This regression type was chosen because the distribution of annual visits is skewed, with large numbers of people having zero visits during the year with a particular provider type.15 Separate regressions are estimated by provider type. Adults and children are modeled separately because the available explanatory variables and care use patterns differ between the two groups.

Explanatory variables in the regressions are those available in both the constructed population file and MEPS. These are the same variables listed above.
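
To illustrate why a skewed count distribution with many zeros favors negative binomial over Poisson regression, the sketch below compares the probability of zero visits under each distribution at the same mean. The mean and dispersion values are hypothetical, and the NB2 parameterization (variance = mean + alpha × mean²) is an assumption made for illustration; the source does not state which parameterization the HWSM uses.

```python
import math


def poisson_zero_prob(mean):
    """P(Y = 0) for a Poisson count with the given mean."""
    return math.exp(-mean)


def negbin_zero_prob(mean, alpha):
    """P(Y = 0) for a negative binomial count (NB2 parameterization)."""
    return (1.0 + alpha * mean) ** (-1.0 / alpha)


def negbin_variance(mean, alpha):
    """NB2 variance; it exceeds the Poisson variance (= mean) when alpha > 0."""
    return mean + alpha * mean ** 2


mean_visits = 2.0  # hypothetical mean annual visits to one provider type
alpha = 1.5        # hypothetical dispersion parameter

p0_poisson = poisson_zero_prob(mean_visits)
p0_negbin = negbin_zero_prob(mean_visits, alpha)
variance_negbin = negbin_variance(mean_visits, alpha)
```

At the same mean, the negative binomial distribution places more probability on zero visits and has a larger variance, which is the overdispersion pattern described above.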

Because MEPS reports only the highest-trained person seen during an ambulatory visit, separate analysis was conducted using the National Ambulatory Medical Care Survey (NAMCS) to determine the likelihood that a patient would see additional health care workers (for example, PAs, RNs) during a clinical visit. Data from NAMCS were also used to estimate the number of prescriptions that were generated during an ambulatory care visit. This number was used in the demand projections for pharmacy-related professions.

Hospital-related services

Regressions predicting demand for hospital inpatient and emergency services employ the five latest years of MEPS files, along with the latest National Inpatient Sample (NIS) and National Hospital Ambulatory Medical Care Survey (NHAMCS) files.16 Multiple years of MEPS data were used to increase the sample size and provide reliable estimates for hospitalization and emergency department (ED) visits by medical and surgical conditions.

Hospital inpatient services

Utilization patterns of inpatient services by individual characteristics were modeled in three parts:

Annual probability that an individual would experience at least one hospitalization for each of 26 broad diagnosis categories (with categories defined using ICD-10 codes)17
The expected length of stay (LOS) for the hospitalization
Specialty services and prescriptions received during the hospitalization

The probability of hospitalization in general, acute care, long term, or specialty hospitals for each of the 26 diagnosis categories is modeled with logistic regression using MEPS data. The explanatory variables are the same as those described previously for modeling office and outpatient visits to providers.

LOS for the hospitalization is analyzed with Poisson regression using discharge records in NIS. Separate regressions were modeled for each of the 26 diagnosis categories.

The dependent variable is total days in the hospital, and the explanatory variables are:

patient age group
sex
race
ethnicity
insurance type
presence of diabetes among the diagnosis codes

Because NIS contains over eight million hospital stays, estimates derived from NIS were stable even for hospitalizations for the condition categories with fewer hospitalizations. Expected LOS calculated from NIS is applied to the individuals in the population database and multiplied by hospitalization probability. This estimates each person’s expected number of inpatient days during the year for the modeled medical or surgical condition categories.
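
The multiplication described above can be sketched as follows. The probabilities and lengths of stay are hypothetical values for a single person, not HWSM estimates.

```python
def expected_inpatient_days(hospitalization_probability, expected_los_days):
    """Expected annual inpatient days for one diagnosis category."""
    return hospitalization_probability * expected_los_days


# Hypothetical inputs for one person: (annual probability of hospitalization,
# expected length of stay in days) for two diagnosis categories.
person_inputs = {
    "Cardiology": (0.02, 4.5),
    "Pulmonology": (0.01, 5.0),
}

days_by_category = {
    category: expected_inpatient_days(probability, los)
    for category, (probability, los) in person_inputs.items()
}
total_expected_days = sum(days_by_category.values())
```

Summing these per-person values across the population database gives projected inpatient days by diagnosis category.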

NIS is also used to determine the expected number of prescriptions for hospitalized individuals, which is an input for modeling demand for pharmacists.

Hospital emergency department services

Logistic regression with MEPS data estimates the probability that a person with given characteristics would have at least one emergency visit during the year for each of the 25 emergency department categories of services defined by ICD-10 codes.18

MEPS does not identify the medical specialty of providers seen during an ED visit. Therefore, the NHAMCS is used to identify the number and types of providers seen. If only one physician is encountered, this physician is assumed to be an emergency physician. If the records indicate that a second physician encounter occurred, the second encounter is assumed to be a specialist consultation, with the physician specialty aligned with the primary diagnosis code for the visit. For example, if the primary diagnosis code is neurology related, then any second physician encounter is designated as a consult with a neurologist. The NHAMCS record also indicates whether a PA, RN, or select other type of health care worker was seen during the visit, as well as the medications prescribed and lab tests or exams performed. This information is used to model demand for pharmacists and various allied health occupations.
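
The physician assignment rule can be sketched as a small function. The function name and specialty labels are hypothetical. Treating the first physician as an emergency physician when two are recorded is an assumption implied by, but not stated in, the text.

```python
def assign_ed_physician_specialties(physicians_seen, primary_diagnosis_category):
    """Assign specialties to the physician encounters in one ED visit.

    Assumptions (illustrative): the first physician is always counted as an
    emergency physician; a second physician is counted as a consultant in
    the specialty matching the primary diagnosis category. The text does
    not address a third or later physician, so those are ignored here.
    """
    if physicians_seen < 1:
        return []
    specialties = ["Emergency Medicine"]
    if physicians_seen >= 2:
        specialties.append(primary_diagnosis_category)
    return specialties


single_physician_visit = assign_ed_physician_specialties(1, "Neurology")
consult_visit = assign_ed_physician_specialties(2, "Neurology")
```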

Post-acute care services

Demand for post-acute care in hospitals and skilled nursing facilities (SNFs) that are a part of a hospital is modeled as inpatient services. Demand for nursing home care in free-standing nursing homes is linked to the size of the population in nursing homes.

Home health and hospice services

The pooled five-year MEPS files (n ≈ 22,000) were used to model home visits. The files contain annual use of home health services, including information on the type of provider seen during the visit (home health aide, physical therapist, etc.). As with office visits, negative binomial regression is used, with annual visits from a specific provider type as the dependent variable. Explanatory variables consist of the same variables used to model demand for office, outpatient, hospital inpatient, and emergency department care.

Utilization of health care worker resources not captured in MEPS

Some health care workers provide services that are not captured in MEPS or not in traditional clinical settings. The HWSM models demand for these workers as a provider-to-population ratio (see Exhibit III‑6). This includes occupations such as nurses, counselors, and physicians who are employed by schools, employed by insurance companies or life sciences companies, work in public health departments, or are involved in teaching or research.

Demand is modeled based on the size of the population who might use such services. For example, the demand for school-based services is derived by the HWSM directly from the projected size of the population of school-aged children. Under the status quo demand scenario, if the size of the population of school-aged children increased by 5%, then demand for school-based health care would increase by 5%.
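
A minimal sketch of the provider-to-population approach, using hypothetical FTE and population counts, reproduces the 5% example:

```python
def project_ratio_based_demand(base_fte, base_population, projected_population):
    """Hold the base year provider-to-population ratio constant."""
    providers_per_person = base_fte / base_population
    return providers_per_person * projected_population


# Hypothetical base year values for school-based health care workers.
base_fte = 1_000.0
base_school_age_population = 50_000_000
projected_school_age_population = base_school_age_population * 1.05  # +5%

projected_fte = project_ratio_based_demand(
    base_fte, base_school_age_population, projected_school_age_population
)
growth = projected_fte / base_fte - 1.0
```

Because the ratio is fixed, projected demand grows at the same rate as the relevant population.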

Modeling the workforce demand implications of COVID-19

Perhaps the largest impact of COVID-19 on future demand for health care workers is the lower population projections that result from excess deaths and reduced immigration. Two additional demand impacts are modeled: (1) care needs associated with future incidence as COVID-19 becomes endemic, and (2) care needs associated with Long COVID.

Despite availability of a vaccine, incidence of COVID-19 continues as the SARS-CoV-2 virus evolves.19 Increasingly, it appears that COVID-19 is shifting from pandemic to endemic, like seasonal influenza.20 A McKinsey & Company analysis of the potential long-term cost implications of COVID-19 predicts 110 to 220 million annual COVID-19 cases over the foreseeable future.21 The authors estimate that 10 to 15% of these new COVID-19 cases will require outpatient treatment. The CDC reports that 225,874 COVID-19 hospitalizations occurred between December 1, 2024, and November 30, 2025.22

For modeling, McKinsey & Company’s lower bound estimate of 110 million new COVID-19 cases was used. This bound was scaled down to reflect lower hospitalizations, which yields a revised estimate of about 37.4 million cases annually, with 10% accessing ambulatory treatment. This suggests about 3.7 million ambulatory visits per year. If such visits were mainly to primary care providers, demand for primary care office visits would increase by about 1.09%. By extension, demand for health care workers in primary care ambulatory settings (for example, primary care physicians, nurse practitioners, physician assistants, nurses, and other office staff) would also increase by 1.09%.

If the nation continued to have 225,874 hospitalizations annually as COVID-19 becomes endemic, and if the average length of stay (8.9 days per COVID-related hospitalization) continued, then the estimated 2 million inpatient days raise hospital-based care by about 1.3% above current levels.23 For modeling, it is assumed that this increase raises demand for hospital-based health care workers (for example, hospitalists, critical care doctors, nurses, allied health workers) by a corresponding 1.3%.
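
The ambulatory visit and inpatient day figures in the two preceding paragraphs can be checked with the arithmetic below. All inputs are taken from the text. The baseline volumes behind the 1.09% and 1.3% increases are not stated here, so the sketch stops at the added volumes.

```python
# Endemic COVID-19 ambulatory care (inputs from the text).
annual_cases = 37_400_000   # revised annual case estimate
ambulatory_share = 0.10     # share of cases accessing ambulatory treatment
ambulatory_visits = annual_cases * ambulatory_share

# Endemic COVID-19 inpatient care (inputs from the text).
annual_hospitalizations = 225_874
average_los_days = 8.9      # days per COVID-related hospitalization
inpatient_days = annual_hospitalizations * average_los_days

print(f"Ambulatory visits: {ambulatory_visits / 1e6:.2f} million")
print(f"Inpatient days: {inpatient_days / 1e6:.2f} million")
```

The products are consistent with the "about 3.7 million" visits and "estimated 2 million" inpatient days reported above.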

Projected future demand for COVID-19 related care is in line with the 2010-2016 annual numbers of influenza-associated outpatient medical visits (4.3 million to 16.7 million) and hospitalizations (139,000 to 708,000).24

The CDC Household Pulse survey reports that 6.9% of COVID-19 cases will result in Long COVID lasting more than three months, which equates to 2.5 million cases annually.25 Long COVID effects may require visits to both primary care providers and to specialists who treat these conditions (for example, neurology, pulmonology, cardiology, infectious diseases, nephrology, and endocrinology). It is assumed that patients with Long COVID will have two doctor visits per case, or 5.1 million visits annually. If these visits were proportionately distributed across primary care physicians and the above-mentioned specialists, this would suggest about a 1.2% increase in annual visits to each of these specialties and, by extension, the health care workers required to provide this additional care. These estimates of increased demand for health care services and providers might be conservative, as Long COVID could cause permanent damage that necessitates lifetime treatment.
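
The Long COVID figures can be approximated as follows. This sketch assumes the 6.9% rate is applied to the 37.4 million annual cases used earlier in this section; the text does not state the base explicitly.

```python
annual_cases = 37_400_000   # assumed base: annual case estimate used earlier
long_covid_share = 0.069    # share of cases resulting in Long COVID
visits_per_case = 2         # assumed doctor visits per Long COVID case

long_covid_cases = annual_cases * long_covid_share
long_covid_visits = long_covid_cases * visits_per_case

print(f"Long COVID cases: {long_covid_cases / 1e6:.2f} million")
print(f"Long COVID visits: {long_covid_visits / 1e6:.2f} million")
```

Under this assumption the products come to roughly 2.6 million cases and 5.2 million visits, slightly above the 2.5 million and 5.1 million reported in the text. The difference may reflect rounding or a slightly different case base.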

Staffing to meet demand for health care services

By applying information on staffing patterns, the HWSM converts demand for visits and other utilization measures (described previously) into demand for FTEs by occupation or specialty.

The base year staffing ratio is calculated by dividing the national volume of service used by the number of health care professionals employed in each setting. In prior years, this approach assumed the base year demand for services in each setting was fully met by the available professionals in that setting. The exceptions were primary care and psychiatry, where the number of providers required to remove the health professional shortage area (HPSA) designations was used as a proxy for the national shortfall. There is growing concern about current shortfalls in many occupations and care settings. Estimates of shortfalls, vacancy rates, or other information indicating shortfalls of health care workers for select physician specialties, nurses, and allied health and oral health occupations are discussed in individual chapters.

For occupations that provide services in a single setting, base year utilization is divided by the base year supply to derive the staffing ratio for that occupation. The staffing ratio is then applied to the projected volume of services to obtain the projected demand for providers in every year after the base year.
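
For a single-setting occupation, the calculation can be sketched as follows. The visit and FTE counts are hypothetical.

```python
def staffing_ratio(base_utilization, base_supply_fte):
    """Units of service per FTE in the base year."""
    return base_utilization / base_supply_fte


def projected_fte_demand(projected_utilization, ratio):
    """FTEs needed to deliver projected services at the base year ratio."""
    return projected_utilization / ratio


# Hypothetical single-setting occupation.
base_visits = 1_000_000
base_fte = 500
ratio = staffing_ratio(base_visits, base_fte)  # visits per FTE

projected_visits = 1_200_000
fte_needed = projected_fte_demand(projected_visits, ratio)
```

Holding the ratio fixed means a 20% increase in projected visits translates to a 20% increase in FTE demand.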

For occupations that provide services across multiple settings (for example, nurses and therapists), information from the ACS or from the OEWS on the employment distribution of the care providers in the base year determines the number of individuals working in each setting. In general, ACS data are used for occupations where some health care workers might be self-employed (for example, chiropractors), and OEWS data are used for occupations where health care workers are primarily employees (for example, allied health occupations, nurses).
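
For a multi-setting occupation, the allocation can be sketched as follows. The employment shares, utilization volumes, and FTE total are hypothetical, standing in for values that would come from the ACS or OEWS and from the demand model.

```python
# Hypothetical base year inputs for one multi-setting occupation.
total_base_fte = 2_000.0
employment_shares = {"hospital": 0.60, "office": 0.25, "nursing_home": 0.15}
base_utilization = {
    "hospital": 1_200_000,   # inpatient days
    "office": 1_000_000,     # visits
    "nursing_home": 30_000,  # residents
}

# Step 1: distribute base year FTEs across settings by employment share.
fte_by_setting = {
    setting: total_base_fte * share
    for setting, share in employment_shares.items()
}

# Step 2: compute a setting-specific staffing ratio (workload per FTE).
ratio_by_setting = {
    setting: base_utilization[setting] / fte_by_setting[setting]
    for setting in employment_shares
}
```

Each setting's ratio is then applied to that setting's projected workload, and the resulting FTE demands are summed across settings.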

The modeled staffing ratios for health occupations are summarized in occupation-specific modules of this report.

Different types of health care workers overlap in their ability to provide services, and the status quo demand projections assume that care use and delivery patterns will remain unchanged over the projection horizon. When comparing demand to supply, one should often look at categories of providers rather than specific occupations or specialties. For example, family physicians provide care that overlaps with that provided by other primary care physicians (pediatricians, general internists, and geriatricians) and some specialist physicians, as well as by providers in other occupations (for example, physician assistants and nurse practitioners). Likewise, some services provided by higher trained providers (for example, RNs or physicians) could be provided by less trained providers (for example, LPNs or advanced practice providers). Hence, there is some flexibility within the health care system to shift some care activities between occupations, both for cost effectiveness reasons and when there is a shortfall of a particular provider type.

Demand scenarios

Status quo scenario

The status quo demand scenario in the HWSM assumes current national patterns of care use and delivery to the modeled population remain relatively unchanged over time. This scenario models demand considering population demographics, health risk factors, disease prevalence, and economic factors correlated with demand for health care services. It captures population growth and aging over time, as well as geographic variation in demand determinants. When compared against supply projections, this scenario helps inform whether there will be sufficient supply to provide a level of care at least consistent with current levels. The main demand drivers of this scenario are population growth and aging. Changing racial/ethnic diversity also affects demand.

Reduced barriers scenario

A hypothetical reduced barriers scenario was added to the HWSM in 2019. National and state goals, as described in initiatives such as Healthy People 2030, are to remove barriers that affect utilization of services and health outcomes. Removing these barriers will improve access to affordable, high quality care, especially preventive services and chronic disease management.26

This scenario first identifies a population that likely faces few access barriers to care. For modeling, this population is assumed to be non-Hispanic White people with insurance who live in a metropolitan area. For oral health, this scenario also includes people in the top income level modeled in the HWSM (household income of $150,000 or greater).

Using the health care use prediction equations estimated with MEPS data, a simulation is conducted to estimate demand if individuals not in this group had care utilization rates similar to those in the population likely experiencing fewer access barriers. Examples of people outside this group are racial or ethnic minorities, those without insurance, and people living in a nonmetropolitan area.
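
The counterfactual simulation can be sketched as follows: each person's barrier-related characteristics are reset to those of the reference group, and utilization is re-predicted with the same equation. The log-linear form, the coefficient values, and the variable names are hypothetical and are not HWSM estimates.

```python
import math

# Hypothetical coefficients for a log-linear model of annual visits.
COEFFICIENTS = {
    "intercept": 1.0,
    "age_65_plus": 0.4,
    "uninsured": -0.5,
    "nonmetro": -0.1,
    "minority": -0.2,
}
BARRIER_VARIABLES = ("uninsured", "nonmetro", "minority")


def predict_visits(person):
    """Expected annual visits from the hypothetical prediction equation."""
    linear = COEFFICIENTS["intercept"]
    for name, value in person.items():
        linear += COEFFICIENTS[name] * value
    return math.exp(linear)


def remove_barriers(person):
    """Copy the person with barrier variables set to the reference group."""
    counterfactual = dict(person)
    for name in BARRIER_VARIABLES:
        counterfactual[name] = 0
    return counterfactual


person = {"age_65_plus": 1, "uninsured": 1, "nonmetro": 0, "minority": 1}
status_quo_visits = predict_visits(person)
reduced_barriers_visits = predict_visits(remove_barriers(person))
```

Characteristics unrelated to access barriers, such as age, are left unchanged, so the difference between the two predictions isolates the modeled effect of the barriers.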

For women’s health, the metropolitan/nonmetropolitan component of the reduced barriers scenario was omitted because women in nonmetropolitan areas use slightly more services than their peers in metropolitan areas. This scenario only models the additional demand for providers associated with gaining insurance and if minority populations had care use patterns like those of non-Hispanic White women and adolescent girls.

Modeled scenarios in prior reports

Prior reports for general surgeons, allied health, and select other occupations modeled an evolving care delivery system scenario that builds on the status quo scenario. This scenario was later replaced by the reduced barriers scenario for two reasons. First, there were data limitations on the potential workforce impact of evolving trends in care delivery. Second, the reduced barriers scenario more clearly and rigorously models key national goals around improving health care access for populations with limited access to services.

The prior evolving care delivery system scenario assumed that the health care system would continue to evolve, reflecting: innovation and evidence-based medicine; economic considerations including payment reform and aligning patient incentives and health plan incentives; growing use of team-based care with each occupation contributing based on their specialization and evolving scope of practice; and public expectations and policies around population health, care access, and quality. Modeling for physicians suggests that some components of an evolving care delivery system will have contradictory effects on demand.27 Some components will increase demand (for example, improving access to care, and increasing longevity through improved population health). Some components will decrease demand (for example, improved preventive care reducing disease onset). Some components will shift care between providers (for example, from physicians to advanced practice providers). Some components will shift care between settings (for example, shifting care from emergency departments or hospitals to appropriate ambulatory settings).

15 Prior to 2019, the prediction equations modeled annual visits using Poisson regression. In response to inquiries about issues of overdispersion, potential alternative regression models were evaluated, and negative binomial regression replaced Poisson regression. This change had minimal impact on demand projections.
16 The model currently uses the 2022 NIS and 2021 and 2022 NHAMCS files.
17 Categories modeled in the hospital inpatient setting include: Allergy & Immunology, Cardiology, Colorectal Surgery, Dermatology, Endocrinology, Gastroenterology, General Surgery, Hematology/Oncology, Infectious Disease, Nephrology, Neurological Surgery, Neurology, Ob/Gyn, Ophthalmology, Orthopedic Surgery, Otolaryngology, Physical Medicine & Rehabilitation, Plastic Surgery, Psychiatry, Pulmonology, Neonatal Medicine, Rheumatology, Thoracic Surgery, Urology, Vascular Surgery, Other Specialties.
18 Categories modeled in the emergency setting include: Allergy & Immunology, Cardiology, Dermatology, Endocrinology, Gastroenterology, Infectious Diseases, General Surgery, Ob/Gyn, Hematology & Oncology, Nephrology, Neurology, Ophthalmology, Orthopedic Surgery, Otolaryngology, Physical Medicine and Rehabilitation, Plastic Surgery, Psychiatry, Pulmonology, Rheumatology, Thoracic Surgery, Urology, Neurological Surgery, Vascular Surgery, Neonatal Medicine, Other Specialties.
19 Markov PV, Ghafari M, Beer M, et al. The evolution of SARS-CoV-2. Nat Rev Microbiol. 2023;21(6):361-379.
20 Klobucista C, Ferragamo M. When Will COVID-19 Become Endemic? Council on Foreign Relations. May 24, 2023. Accessed July 25, 2025.
21 Patel N, Singhal S. What to Expect in US Healthcare in 2023 and Beyond. McKinsey & Company; 2023. Accessed July 25, 2025.
22 Centers for Disease Control and Prevention. COVID-NET Laboratory-confirmed COVID-19 hospitalizations. Accessed May 6, 2025.
23 Kapinos KA, Peters RM, Murphy RE, Hohmann SF, Podichetty A, Greenberg RS. Inpatient Costs of Treating Patients With COVID-19. JAMA Netw Open. 2024;7(1):e2350145. doi:10.1001/jamanetworkopen.2023.50145.
24 Rolfes MA, Foppa IM, Garg S, et al. Annual estimates of the burden of seasonal influenza in the United States: A tool for strengthening influenza surveillance and preparedness. Influenza Other Respir Viruses. 2018;12(1):132-137.
25 National Center for Health Statistics. U.S. Census Bureau, Household Pulse Survey, 2022–2024. Long COVID. Generated interactively: May 6, 2025, from https://www.cdc.gov/nchs/covid19/pulse/long-covid.htm.
26 Department of Health and Human Services. Healthy People 2030. Accessed July 23, 2025.
27 Association of American Medical Colleges. The Complexities of Physician Supply and Demand: Projections From 2021 to 2036. AAMC; 2024. Accessed July 23, 2025.

Date Last Reviewed:

June 2026