III. Demand Modeling Overview

Published 2023

In this module:

Constructing the population databases
Modeling demand for health care services
Modeling the workforce demand implications of COVID-19
Staffing to meet demand for health care services
Demand scenarios

The demand component of the Health Workforce Simulation Model (HWSM) first projects demand for health care services, and then estimates the number and mix of health care workers required to meet projected demand for services. We report demand for health care workers as full-time equivalents (FTEs) using the same 40 hours/week definition as supply. Therefore, supply and demand are directly comparable.

There are three major elements for modeling demand:

A population database contains demographic, socioeconomic, health status, and health risk information for a representative sample of the current and projected future population in each county. County data sum to the state and national levels.
Prediction equations of the demand for health care services relate each individual’s characteristics (in the population database) to annual health service use by care delivery setting and by health profession seen or diagnosis category.
Staffing patterns convert demand for services into demand for providers.

Exhibit III‑1 presents a flow diagram for the demand component of HWSM. Not all care delivery sites pertain to every health occupation modeled. The drivers of growth in demand for hospital-based occupations are projected growth in inpatient days and emergency visits. Growth in ambulatory visits is the demand driver for growth in demand for health care workers in office and outpatient-based settings. For the “other employment” settings, in parentheses we list the workload metric for demand. For example, growth in the population age 5-17 is the demand driver for growth in demand for care in school-based care or counseling.

Exhibit III-1: Flow Diagram for the Demand Component of HWSM

This diagram shows how various inputs are synthesized into demand projections. It gives a high-level view of the model. The individual components are explained in more detail in this module. The basic equation is that combining Utilization Patterns and Population Data translates into a Demand for Services. Demand for Services is then translated into Demand for Health Workers using Staffing Ratios.

Sources: MEPS=Medical Expenditure Panel Survey, NIS=National Inpatient Sample, NAMCS=National Ambulatory Medical Care Survey, NHAMCS=National Hospital Ambulatory Medical Care Survey, NHATS=National Health and Aging Trends Study, ACS=American Community Survey, BRFSS=Behavioral Risk Factor Surveillance System, CMS MDS=Centers for Medicare and Medicaid Services Long-Term Care Minimum Data Set, MCBS=Medicare Current Beneficiary Survey. Population projections come from states and the U.S. Census Bureau.

Utilization patterns are the relationship between patient characteristics and health care use. These data come from a variety of sources. The population data include demographic, socioeconomic, and health risk factors. These data also come from a variety of sources. Population data are also valuable for quantifying specific population types, like the school-age population, that might demand specific types of care.

By combining characteristics of the population with data on how parts of the population use services, we then derive a demand for services. Demand for services includes items like hospital inpatient days by diagnosis, provider visits by occupation/specialty, and hospice visits by population, among others.

We then convert this demand for services into a demand for health care workers by applying staffing ratios. Staffing ratios take a demand for services, like the number of visits to a dentist annually, and convert that into the number of dentists needed to meet that demand.
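The staffing-ratio conversion can be sketched as follows. This is a minimal illustration rather than HWSM's implementation; the function name and the numbers (projected visits and visits handled per FTE) are hypothetical placeholders chosen only to show the arithmetic.

```python
def providers_needed(annual_service_demand: float, services_per_fte: float) -> float:
    """Convert annual service demand into FTE demand using a staffing ratio."""
    if services_per_fte <= 0:
        raise ValueError("services_per_fte must be positive")
    return annual_service_demand / services_per_fte


# Hypothetical inputs: 3,000,000 projected dental visits per year and
# 2,500 visits handled per dentist FTE per year.
dentist_ftes = providers_needed(3_000_000, 2_500)
print(dentist_ftes)  # 1200.0
```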

Constructing the population databases

General approach

The microsimulation approach models demand for health care services separately for individual people and then aggregates projected service demand to the population level. This approach requires individual level (micro) data on the predictors of health care use for each person in a representative sample of a designated geographic region (national, state, or county-equivalent).

Prior to 2019, HWSM produced projections at the state and national levels based on constructed state-level population databases. Starting in 2019, we constructed population files for each of the approximately 3,142 counties or county equivalents (e.g., parishes, boroughs, independent cities) in the United States (excluding U.S. territories). Modeling at the county level facilitates evaluation of supply and demand by rurality across states and the nation. This allows for better modeling of health workforce supply for underserved communities and populations. County population files can be combined to produce state files, which in turn combine to produce the national file.

County level population files start with combining data from multiple sources, as specified later, to create preliminary state population files. These files contain a representative sample of the population in each state by:

demographics
household income level
medical insurance type
institutional residency status (i.e., resides in the community, in a residential care facility, or in a nursing home)

Then, the population data are re-calibrated to produce a representative sample of the population in each county with the prevalence of health care use demand determinants (demographics, disease, lifestyle choices, and medical insurance) benchmarked to external sources.

The core microdata file on which HWSM’s baseline population databases are built is the most recent year (2021) of the American Community Survey (ACS). The ACS provides the demographic and socioeconomic characteristics of a representative sample of the population in each state. ACS reports information on medical insurance type, household income, and whether the person lives in a community or institutional setting. We use a statistical matching process, described later, to add health risk factors and information on disease presence. Using random sampling with replacement, we match each person in ACS with a similar person in the Behavioral Risk Factor Surveillance System (BRFSS), the Medicare Current Beneficiary Survey (MCBS),1 or the Centers for Medicare and Medicaid Services (CMS) Long-Term Care Minimum Data Set (MDS).2 This process preserves the number of records from the ACS file as well as each record’s ACS sample weight, and thus produces a preliminary population file for each state with population characteristics representative of that state. Each record has a person’s demographics, health-related lifestyle indicators, health conditions, socioeconomic and insurance characteristics, and residency setting.

Demographics
- Children (age groups 0-2, 3-5, 6-13, 14-17 years)
- Adults (age groups 18-34, 35-44, 45-64, 65-74, 75+ years)
- Sex (male, female)
- Race/ethnicity (non-Hispanic White, non-Hispanic Black, non-Hispanic other, Hispanic)
Health-related lifestyle indicators
- Body weight status (normal, overweight, obese)
- Current smoker status (yes, no)
Health conditions (diagnosis coded as yes, no)
- Arthritis, asthma, cardiovascular disease, diabetes, hypertension
- History of cancer, history of heart attack, history of stroke
Socioeconomic conditions and insurance
- Household annual income (<$10,000, $10,000 to <$15,000, $15,000 to < $20,000, $20,000 to < $25,000, $25,000 to < $35,000, $35,000 to < $50,000, $50,000 to < $75,000, $75,000+)
- Medical insurance status (private, public, self-pay)
- In managed care plan (yes, no)
Residency setting
- Non-institutionalized in the community
- Group quarters (which includes residential care facilities and nursing homes)
Geographic location
- State
- 2013 NCHS Urban-Rural Classification Scheme for Counties3

As illustrated in Exhibit III‑2, for the community-based population, each individual in the ACS file is matched with someone in the BRFSS of the same sex, age group (17 age groups used), race, ethnicity, medical insurance type, household income level (eight income categories), and state of residence.4

Individuals residing in a group setting are randomly matched to a person in the MCBS or Nursing Home MDS in the same state, age group, sex, and race and ethnicity strata. The total number of people living in nursing homes and residential care, by state and age group, is constructed to match published numbers from the Centers for Disease Control and Prevention (CDC), showing nearly 1.2 million nursing home residents and 918,700 people living in residential care nationally.5 6
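The matching step described above can be sketched as stratified random sampling with replacement. This is an illustrative sketch only: the record layout, field names, and the handling of strata with no donor are assumptions, not details taken from HWSM.

```python
import random
from collections import defaultdict


def match_records(acs_records, donor_records, strata_keys, seed=0):
    """Attach a randomly drawn donor record (sampled with replacement)
    from the same stratum to each ACS record."""
    rng = random.Random(seed)
    donors_by_stratum = defaultdict(list)
    for donor in donor_records:
        donors_by_stratum[tuple(donor[k] for k in strata_keys)].append(donor)

    matched = []
    for person in acs_records:
        pool = donors_by_stratum.get(tuple(person[k] for k in strata_keys))
        if not pool:
            # Assumption: strata without donors are flagged for a fallback rule.
            matched.append({**person, "donor_found": False})
            continue
        donor = rng.choice(pool)  # the same donor may be drawn more than once
        # Copy only fields the ACS record lacks, so the ACS weight is preserved.
        added = {k: v for k, v in donor.items() if k not in person}
        matched.append({**person, **added, "donor_found": True})
    return matched


# Hypothetical records for illustration.
acs = [
    {"sex": "F", "age_group": "45-64", "state": "OH", "weight": 120.0},
    {"sex": "M", "age_group": "18-34", "state": "OH", "weight": 95.0},
]
brfss = [
    {"sex": "F", "age_group": "45-64", "state": "OH", "smoker": False, "diabetes": True},
    {"sex": "F", "age_group": "45-64", "state": "OH", "smoker": True, "diabetes": False},
    {"sex": "M", "age_group": "18-34", "state": "OH", "smoker": False, "diabetes": False},
]
matched = match_records(acs, brfss, ["sex", "age_group", "state"])
```

The output has one record per ACS record with the original ACS sample weight, which mirrors the property the text describes.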

After creating the preliminary state population file, we construct and calibrate the county level population files. The U.S. Census Bureau reports data on the aggregate number of people in each county in 2021 by five-year age group, sex, and race/ethnicity. Using the NCHS urban-rural classifications, we categorize each county as metropolitan or nonmetropolitan.3 In the constructed preliminary state population file, we first divide the population into metropolitan/nonmetropolitan location using the metropolitan designation in BRFSS.7 In this file, the number of people in nonmetropolitan areas is understated relative to published estimates.8

We then re-weight sample weights for people identified as metropolitan to match the demographics of the population in each metropolitan county. We also re-weight sample weights for people identified as nonmetropolitan to match the demographics in each nonmetropolitan county. This produces a weighted sample for each county that is representative of the demographics in each county. The other variables (e.g., household income, insurance coverage, disease prevalence, and prevalence of health risk factors) in this weighted sample are representative of the demographically-adjusted metropolitan and nonmetropolitan populations. We calibrate the county population files to match data from external sources on disease prevalence, lifestyle choices, and medical insurance status.
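The re-weighting step can be illustrated with a simple post-stratification sketch: within each demographic stratum, sample weights are scaled so that weighted totals equal the county's published count. The function and field names are hypothetical, and the sketch assumes every stratum in the pool has positive total weight.

```python
from collections import defaultdict


def reweight_to_county(records, county_counts, strata_keys):
    """Scale sample weights so the weighted total in each demographic
    stratum equals the county's published count for that stratum."""
    totals = defaultdict(float)
    for r in records:
        totals[tuple(r[k] for k in strata_keys)] += r["weight"]

    reweighted = []
    for r in records:
        stratum = tuple(r[k] for k in strata_keys)
        # Assumption: strata absent from the county data receive zero weight.
        factor = county_counts.get(stratum, 0.0) / totals[stratum]
        reweighted.append({**r, "weight": r["weight"] * factor})
    return reweighted


# Hypothetical metropolitan pool and county demographic counts.
pool = [
    {"age_group": "65-74", "sex": "F", "weight": 100.0, "diabetes": True},
    {"age_group": "65-74", "sex": "F", "weight": 300.0, "diabetes": False},
    {"age_group": "18-34", "sex": "M", "weight": 200.0, "diabetes": False},
]
county = {("65-74", "F"): 1000.0, ("18-34", "M"): 500.0}
county_file = reweight_to_county(pool, county, ["age_group", "sex"])
print([r["weight"] for r in county_file])  # [250.0, 750.0, 500.0]
```

Non-demographic variables such as diabetes status are carried along unchanged, so their prevalence in the county file reflects the demographically adjusted pool until the later calibration step.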

County-level estimates of disease prevalence are calibrated at the individual level so that the population prevalence numbers exactly match published statistics. Calibration is achieved by first estimating a series of logistic regression equations using BRFSS data. The dependent variable is whether the person has the modeled condition or risk factor, with separate regressions9 used to model:

arthritis
asthma
hypertension
cardiovascular disease
diabetes
history of cancer
history of heart attack
history of stroke
obesity
current smoker

Independent variables in the regression equations are:

demographic variables used for demand modeling (age group, sex, race/ethnicity)
dichotomous variable indicating whether the person has exercised or participated in physical activity other than their regular job in the past 30 days
body weight status—normal, overweight, obese (except for the obesity regression)
current smoker status (except for the smoker regression)
hypertension—included for modeling cardiovascular disease, history of cancer, history of heart attack, and history of stroke

Applying the prediction equation to each person in the constructed population file creates a probability that the person has the condition or risk factor. This probability is compared with a random number drawn from a uniform distribution on 0 to 1 to determine whether the person is assigned the condition or risk factor. The population file prevalence for a specific condition or risk factor is then adjusted (if needed) until the population prevalence exactly matches published statistics for that county in the 2022 CDC Places database (which is based on 2019/2020 BRFSS data).10
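One way such a calibration could work is sketched below. The text does not specify the adjustment mechanism, so the bisection on a common logit-scale shift is an assumption introduced for illustration, as are all names and inputs. With discrete records the target can be matched only to within one record's weight.

```python
import math
import random


def calibrate_condition(probs, weights, target_prevalence, seed=0, iters=60):
    """Assign yes/no flags so weighted prevalence approximates a target.

    Each predicted probability (strictly between 0 and 1) is compared with
    a fixed uniform draw; a common logit-scale shift is tuned by bisection.
    """
    rng = random.Random(seed)
    draws = [rng.random() for _ in probs]
    total_w = sum(weights)

    def flags(shift):
        out = []
        for p, u in zip(probs, draws):
            shifted = 1.0 / (1.0 + math.exp(-(math.log(p / (1.0 - p)) + shift)))
            out.append(shifted > u)
        return out

    def prevalence(shift):
        return sum(w for w, f in zip(weights, flags(shift)) if f) / total_w

    lo, hi = -20.0, 20.0  # assumed wide enough to bracket any target
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if prevalence(mid) < target_prevalence:
            lo = mid
        else:
            hi = mid
    return flags(hi)


# Hypothetical predicted probabilities and equal sample weights.
probs = [0.05 + 0.03 * (i % 10) for i in range(1000)]
weights = [1.0] * 1000
has_condition = calibrate_condition(probs, weights, target_prevalence=0.12)
achieved = sum(w for w, f in zip(weights, has_condition) if f) / sum(weights)
```

Holding the uniform draws fixed while shifting the probabilities keeps the ranking of who is most likely to have the condition, so people with higher predicted risk are flagged first.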

County-level data on “history of heart attack” prevalence are unavailable in CDC Places. Heart attack prevalence estimates come from state health departments (Exhibit III‑3). The availability of county-level data varies by state, and for some states these data are unavailable.11 When published prevalence is unavailable, HWSM uses the prevalence rate created in the constructed population file.

Exhibit III-3: State Sources for County Prevalence of History of Heart Attack

State	Source
Alaska	Alaska Dept. of Health and Social Services
Arizona	Arizona Behavioral Risk Factor Surveillance System
California	Received from CSU Sacramento, PH Survey Research Program
Colorado	Colorado Dept. of Health, BRFSS County data
Connecticut	Connecticut Dept. of Health
Delaware	Delaware Division of Public Health, BRFSS data by county (PDF - 424 KB)
Florida	Florida Dept. of Health website
Georgia	Received from GA Dept. of Public Health
Hawaii	Hawaii Health Data Warehouse, Indicator based information system
Illinois	Illinois BRFSS
Iowa	Iowa Community Health Needs Assessment & Health Improvement Planning
Kansas	Kansas BRFSS
Kentucky	Kentucky Area District Profiles, BRFSS data 2019
Maine	BRFSS coordinator from Maine Dept. of Public Health
Maryland	Maryland BRFSS
Michigan	Received from Michigan DHHS
Missouri	Missouri data website
Montana	BRFSS Annual Report 2016 (PDF - 2 MB)
Nevada	Received from Nevada Dept. of Public Health
New Hampshire	Received from NH Dept. of Public Health
New Jersey	NJ state health assessment data
New Mexico	New Mexico Dept. of Health data website
Ohio	Received from OH BRFSS Coordinator
Oklahoma	Oklahoma Dept. of Health
Oregon	Oregon Dept. of Health
Pennsylvania	Pennsylvania Dept. of Health Information/ Data Exchange
Rhode Island	Received from Rhode Island Dept. of Public Health BRFSS
South Carolina	Received from SC Dept. of Health
Tennessee	Received from TN Dept. of Health
Texas	Prepared by Center for Health Statistics, Texas Department of State Health Services
Washington	Received from WA Dept. of Health
West Virginia	Received from WV State

Developing demand forecasts requires creating population databases for future populations. We adjust sample weights of the starting year population to match population demographics (age group, sex, race and ethnicity) in the projections. The implicit assumption is that baseline prevalence rates of health and health behavior characteristics remain the same within each demographic stratum (by age, sex, race and ethnicity) into the future.

Population projections

The projections start with year 2021 county-level population estimates, by demographic characteristics, published by the U.S. Census Bureau. State and county-level population projections through 2036 come from government agencies and universities, as well as from S&P Global demographers for those states that do not publish projections. Last year’s health workforce demand projections used population projections that for many states were developed pre-COVID-19. Hence, we adjusted the published projections to account for excess deaths attributed to COVID-19, the opioid crisis, and other factors, as well as declining birth rates and changing immigration patterns. While each state, as well as the S&P Global demographers, uses different assumptions and approaches for projecting future population, the population projections incorporate much of the excess deaths through June 2021 as well as the most recent data on birth rates and migration patterns. The updated national population projection for 2036 is several million people lower than the population projection based on pre-pandemic data.

The U.S. population is projected to grow by 27.8 million people (8.4%), growing from 331.9 million in 2021 to 359.7 million in 2036 (Exhibit III-4). Almost half of this growth is among the population aged 75 years and older, with projected growth of 12.1 million (54.7% growth), suggesting high growth in demand for health care services used by older patients. The population under age 18 years is projected to grow by only 387,000 (0.5% growth), reflecting low birth rates and suggesting slow growth in demand for pediatric care.

Population file validation

A key demand component of HWSM is the constructed population files containing person-level data for a representative sample of the current and projected future population. Within this population file, the variables most highly correlated with health care services are age, having medical insurance, whether the medical insurance is Medicaid, and presence of chronic diseases. Other variables correlated with use of many health care services are race/ethnicity, rurality of the county in which the person resides, and whether the insured person is in a managed care plan.

Sex is correlated with use of some health care services, as are current smoking status and body weight status. Household income is correlated with use of oral health services. For most health care services, after controlling for whether the person has medical insurance, the correlation between household income and care utilization diminishes. Thus, to model demand for health care services precisely, it is more important that some patient characteristics be accurately reflected in the population file than others.

Prior to 2019, population files were constructed at the state level. One challenge is that ACS does not have a metropolitan/nonmetropolitan variable, unlike BRFSS. Consequently, metropolitan could not be a stratum when statistically matching a person in ACS with a similar person in BRFSS (to add the health-related variables in BRFSS that were absent in ACS). This match process understated the size of the population in nonmetropolitan areas. States generally do not provide data to calibrate/validate the population characteristics by metropolitan/nonmetropolitan location. Another challenge is that states generally do not produce population projections by metropolitan/nonmetropolitan designation. However, population projections and characteristics that can be used to calibrate/validate the population file are available at the county level.

Starting in 2019, there was increased policy interest in modeling at the sub-state level. Therefore, the population files used to model demand were constructed to be representative of each county. This allows aggregation by level of rurality based on each county’s urban-rural designation.12 The approach used to construct the population files ensures that the demographics of the state’s population are identical whether one constructs state-level files or constructs county-level files and then aggregates to the state level.

However, sampling issues with surveys such as BRFSS and ACS can result in slightly different estimates of prevalence for population characteristics (e.g., disease prevalence). This occurs when constructing state population files versus constructing county population files and aggregating to the state level. This is particularly true when projecting into future years because some counties within a state are growing faster than other counties. The characteristics of faster-growing counties will have a larger impact on the state-wide prevalence of select characteristics.

Two approaches were explored to develop the county population files. In addition to the approach described earlier, an alternative approach would use Public Use Microdata Areas (PUMA) as the sampling unit from ACS and build county-level population files up from each PUMA. Conceptually, this alternative improves on the current approach because the county sample would be drawn from a geographic area narrower than the state. However, there are drawbacks to using this approach.

Multiple years of data are required (e.g., three-year or five-year files) to increase sample size
- Instead of using the most recent available data, the population file would be constructed with slightly older data.
The ACS sample might be small for some demographic groups even after combining multiple years of ACS data.
The contiguous counties that constitute some PUMAs can cross urban-rural designations.

We conducted extensive validation exercises on the county-level files to determine if use of PUMA designations improved the county-level population files. Both approaches produce almost identical counts of population demographics (age, sex, race/ethnicity) because demographic characteristics are calibrated to Census Bureau county population statistics. Both approaches required that disease prevalence, medical insurance prevalence, and prevalence of health risk factors (obesity and smoking) be calibrated to match estimates from published sources.

Published data on household income by county do not lend themselves to validating whether one of the two approaches performs better. Household income for each person in the sample is reported in ranges that cannot be averaged. In summary, there is no strong evidence that one approach performed better than the other to construct the population files. The current approach takes advantage of more recent data while the PUMA approach might better capture intra-state variation in household income. After controlling for medical insurance, household income appears to have only a small impact on annual use of health care services.

These evaluations revealed that the constructed county population files are representative of the counties’ characteristics described by published statistics.

Modeling demand for health care services

Demand for health care workers derives from the demand for the services that they provide. The enormous number and variety of services provided is captured by 68,000 ICD-10-CM diagnostic codes and 87,000 ICD-10-PCS procedure codes. For modeling, HWSM projects future demand for health care services using broad categories of ICD-10 codes (with ICD-9 codes used prior to 2016), as well as information on occupation or specialty of the health care worker who provided a service type when provider information is available.

The Medical Expenditure Panel Survey (MEPS) is the primary source of data on annual use of health care services and patterns of care use by patient characteristics. MEPS data consist of both self-reported information obtained during the survey and medical record extraction from care providers. MEPS reports the occupation and/or specialty of the highest-trained clinician seen by the patient during an office or outpatient visit. That is, an office visit to a cardiologist, dentist, or a physical therapist will indicate the provider type. MEPS does not record the provider type for care provided by other health care workers seen during the visit (advanced practice providers, nurses, medical assistants, phlebotomists, etc.) working under the direction of a doctor or other higher-trained provider. For hospitalizations and emergency visits, MEPS does not indicate what type of provider was seen but does have ICD-10-CM diagnosis codes to model the broad category for why the person was admitted. Analysis of MEPS is supplemented by analysis of the National Inpatient Sample (NIS), which has a much larger number of hospitalizations for modeling length of stay, as well as the National Hospital Ambulatory Medical Care Survey (NHAMCS) discussed later.

Exhibit III‑5 summarizes the health care use metrics for modeling demand for health care services. Under a status quo scenario where care delivery patterns remain unchanged, the rate of projected growth in demand for health workers is assumed to be the same as the rate of projected growth in demand for services. For some care delivery or employment settings, population data are used as a proxy for service demand.

Exhibit III-5: Care Delivery/Employment Settings and Health Care Utilization Measures

Ambulatory care
Setting and Service Type	Health Care Utilization and Population Measures
Physician and other provider offices	Annual total visits by provider type and specialty; Rx scripts
Outpatient departments and clinics ^a	Annual total visits by provider type and specialty; Rx scripts
Dental offices	Annual dental non-cleaning visits distributed to general and specialty dentists; annual dental cleaning visits assigned to dental hygienists

Hospital inpatient and emergency care
Setting and Service Type	Health Care Utilization and Population Measures
Hospital inpatient (includes skilled nursing facility [SNF] units of hospitals)	Hospitalizations by primary diagnosis (ICD-9/ICD-10); Rx scripts
Hospital emergency department	Emergency visits by primary diagnosis (ICD-9/ICD-10); Rx scripts

Home health and hospice care
Setting and Service Type	Health Care Utilization and Population Measures
Home health/hospice	Annual total visits by provider type

Post-acute care and long-term care
Setting and Service Type	Health Care Utilization and Population Measures
Nursing home (includes free standing SNF)	Total nursing home residents
Residential care facilities	Total population in residential care facilities

Other settings
Setting and Service Type	Health Care Utilization and Population Measures
Educational institutions	Number of health care workers trained annually
Public/community health	Total population
School health	Population aged 5-18 years
All other settings (e.g., insurance companies, life sciences companies)	Total population

Note^a: Examples of outpatient clinics include well-baby clinics/pediatric outpatient departments; obesity clinics; eye, ear, nose, and throat clinics; family planning clinics; cardiology clinics; internal medicine departments; alcohol and drug abuse clinics; physical therapy clinics; and radiation therapy clinics.

To model health care-seeking behavior, we pooled five years of MEPS data (2015-2019) to provide a sufficient sample size for regression analysis. Regression analyses yielded predicted probabilities and intensity of health care use by care delivery setting and type of services, based on a person’s:

age group
race/ethnicity
smoking status
body weight category (normal, overweight, obese)
presence of chronic conditions (diagnosed with arthritis, asthma, coronary heart disease, diabetes, or hypertension, and history of cancer, heart attack, or stroke)
insurance type
enrollment in a managed care plan
household income level
rurality of residence
MEPS survey year (included to test for systematic changes in utilization over the 5 years of MEPS data analyzed)

Predicted probabilities are then applied to the relevant population databases to estimate market demand in the given year if the local population had care use patterns similar to a national peer group (i.e., a population with similar demographics, risk factors, etc.). Summing predicted probabilities across individuals provides estimated annual health care use for:

office visits
outpatient visits
emergency department visits
hospitalizations
home health visits
hospice visits
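The aggregation from individuals to population-level demand can be sketched as a weighted sum. This is an illustration under stated assumptions: the prediction function and its coefficients are hypothetical stand-ins for a fitted regression, and weighting each person's predicted use by their sample weight is assumed here because the population file is a weighted sample.

```python
def total_expected_use(population, predict):
    """Sum each person's predicted annual use, weighted by sample weight."""
    return sum(person["weight"] * predict(person) for person in population)


def predicted_office_visits(person):
    """Hypothetical stand-in for a fitted prediction equation."""
    return 2.0 + 1.5 * person["diabetes"]


# Hypothetical two-record population file.
population = [
    {"weight": 250.0, "diabetes": True},
    {"weight": 750.0, "diabetes": False},
]
office_visits = total_expected_use(population, predicted_office_visits)
print(office_visits)  # 2375.0
```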

We discuss the modeled care delivery and health care worker employment settings below.

Office/clinic visits

MEPS data are used to quantify the relationship between patient characteristics and number of annual office/clinic visits or hospital outpatient visits with a provider. MEPS contains data on visits to many types of providers, including physicians, psychologists, dentists, optometrists, opticians, physical therapists, occupational therapists, and other types of providers.

Negative binomial regression is used to model annual visits, with this regression type chosen because of the skewed nature of annual visits with large numbers of people having zero visits during the year with a particular provider type.13 Separate regressions are estimated by provider type. Adults and children are modeled separately because the set of explanatory variables available and care use patterns differ for adults and children.
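To illustrate why a negative binomial model suits visit counts with many zeros, the sketch below compares the probability of zero visits under a negative binomial (NB2) and a Poisson model with the same mean. The coefficients and the dispersion value are hypothetical, not estimates from MEPS, and the log link is the conventional choice rather than a detail stated in the text.

```python
import math


def nb_mean(coefs, features):
    """Expected annual visits under a log link: exp(intercept + sum(b * x))."""
    linear = coefs["intercept"] + sum(coefs[k] * v for k, v in features.items())
    return math.exp(linear)


def nb_prob_zero(mu, alpha):
    """P(zero visits) for an NB2 model with variance mu + alpha * mu**2."""
    return (1.0 + alpha * mu) ** (-1.0 / alpha)


# Hypothetical coefficients for one provider type.
coefs = {"intercept": -1.0, "diabetes": 0.7}
mu = nb_mean(coefs, {"diabetes": 1})

p_zero_nb = nb_prob_zero(mu, alpha=2.0)  # overdispersed count model
p_zero_poisson = math.exp(-mu)           # Poisson with the same mean
print(p_zero_nb > p_zero_poisson)        # True
```

For the same expected number of visits, the negative binomial assigns more probability to zero visits, which is the pattern described for annual visits to a particular provider type.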

Explanatory variables in the regressions were variables available in both the constructed population file and in MEPS. These variables are the same as listed above.

Because MEPS reports only the highest-trained person seen during an ambulatory visit, separate analysis was conducted using the National Ambulatory Medical Care Survey (NAMCS) to determine the likelihood that a patient would see additional health care workers (e.g., PAs, RNs) during a clinical visit. Data from NAMCS were also used to estimate the number of prescriptions that were generated during an ambulatory care visit. This number was used in the demand projections for pharmacy-related professions.

Hospital-Related Services

Regressions predicting demand for hospital inpatient and emergency services employ the five latest years of MEPS files, along with the latest National Inpatient Sample (NIS) and National Hospital Ambulatory Medical Care Survey (NHAMCS) files.14 Multiple years of MEPS data were used to increase the sample size and provide reliable estimates for hospitalization and emergency department (ED) visits by medical and surgical conditions.

Hospital Inpatient Services

Utilization patterns of inpatient services by individual characteristics were modeled in three parts:

Annual probability that an individual would experience at least one hospitalization for each of 28 broad diagnosis categories (with categories defined using ICD-9 and ICD-10 codes)
The expected length of stay (LOS) for the hospitalization
Specialty services and prescriptions received during the hospitalization

The probability of hospitalization in general, acute care, long term, or specialty hospitals for each of the 28 diagnosis categories is modeled with logistic regression using MEPS data. Explanatory variables were the same explanatory variables described previously for modeling office and outpatient visits to providers.

LOS for the hospitalization is analyzed with Poisson regression using discharge records in NIS. Separate regressions were modeled for each of the 28 diagnosis categories. The dependent variable is total days in the hospital, and the explanatory variables were:

patient age group
sex
race
ethnicity
insurance type
presence of diabetes among the diagnosis codes

Because NIS contains over 8 million hospital stays, estimates derived from NIS were stable even for hospitalizations for the condition categories with fewer hospitalizations. Expected LOS calculated from NIS is applied to the individuals in the population database and multiplied by hospitalization probability. This estimates each person’s expected number of inpatient days during the year for the modeled medical or surgical condition categories.
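The expected-inpatient-days calculation can be sketched as the product of hospitalization probability and expected length of stay, summed over diagnosis categories. The category names and values below are hypothetical; in HWSM the probabilities come from the MEPS logistic regressions and the LOS from the NIS regressions.

```python
def expected_inpatient_days(prob_by_category, los_by_category):
    """Expected annual inpatient days for one person: the sum over diagnosis
    categories of P(hospitalization) times expected length of stay."""
    return sum(
        prob_by_category[c] * los_by_category[c] for c in prob_by_category
    )


# Hypothetical values for two of the diagnosis categories.
probs = {"circulatory": 0.02, "respiratory": 0.01}
los = {"circulatory": 5.0, "respiratory": 4.0}
days = expected_inpatient_days(probs, los)  # about 0.14 days per year
```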

NIS also is used to determine the expected number of prescriptions for hospitalized individuals (which is a component to model demand for pharmacists).

Hospital Emergency Department Services

Logistic regression with MEPS data estimates the probability that a person with given characteristics would have at least one emergency visit during the year for each of 20 categories of services defined by ICD-10 (with earlier studies using ICD-9 codes in older NHAMCS files).

MEPS does not identify the medical specialty of providers seen during an ED visit. Therefore, the NHAMCS is used to identify the number and types of providers seen. If only one physician is encountered, this physician is assumed to be an emergency physician. If the records indicate a second physician encounter occurred, the second encounter is assumed to be a specialist consultant with physician specialty aligned with the primary diagnosis code for the visit. For example, if the primary diagnosis code was neurology related, then any second physician encounter would be designated as a consult with a neurologist. The NHAMCS record also indicates whether a PA, RN, or select other type of health care worker was seen during the visit. The NHAMCS data also indicate medications prescribed and lab tests/exams performed. This information is used to model demand for pharmacists and various allied health occupations.
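The physician assignment rule for ED visits can be written as a small function. This sketch covers only the first and second physician encounters described in the text; the function name and the specialty labels are hypothetical.

```python
def assign_ed_physicians(n_physicians_seen, primary_dx_specialty):
    """First physician seen is assumed to be an emergency physician; a second
    physician is assumed to be a consultant in the specialty aligned with
    the visit's primary diagnosis."""
    roles = []
    if n_physicians_seen >= 1:
        roles.append("emergency medicine")
    if n_physicians_seen >= 2:
        roles.append(primary_dx_specialty)
    return roles


print(assign_ed_physicians(1, "neurology"))  # ['emergency medicine']
print(assign_ed_physicians(2, "neurology"))  # ['emergency medicine', 'neurology']
```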

Post-Acute Care Services

Demand for post-acute care in hospitals and skilled nursing facilities (SNFs) that are a part of a hospital is modeled as inpatient services. Demand for nursing home care in free-standing nursing homes is linked to the size of the population in nursing homes.

Home Health and Hospice Services

The pooled five-year MEPS files (n~22,000) were used to model home visits. The files contain annual use of home health services, including information on the type of provider seen during the visit (home health aide, physical therapist, etc.). Like the regression for office visits, negative binomial regression is used with annual visits from a specific provider type as the dependent variable. Explanatory variables consist of the same variables used to model demand for office, outpatient, hospital inpatient, and emergency department care.

Utilization of Health Care Worker Resources Not Captured in MEPS

Some health care workers provide services that are not captured in MEPS or not in traditional clinical settings. HWSM models demand for these workers as a provider-to-population ratio (see Exhibit III‑5). This includes occupations such as nurses, counselors, and physicians who are employed by schools, employed by insurance companies or life sciences companies, work in public health departments, or are involved in teaching or research.

Demand is modeled based on the size of the population who might use such services. For example, the demand for school-based services is derived by HWSM directly from the projected size of the population of school-aged children. Under the status quo demand scenario, if the size of the population of school-aged children increased by 5%, then demand for school-based health care would increase by 5%.
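A minimal sketch of this proportional scaling follows. The base-year demand and population counts are hypothetical; only the 5% growth example comes from the text.

```python
def scale_demand(base_demand_fte, base_population, projected_population):
    """Scale base-year FTE demand in proportion to the relevant population."""
    return base_demand_fte * projected_population / base_population


# Hypothetical base year: 1,000 FTEs serving 50 million school-aged children.
# A 5% larger population (52.5 million) raises demand by 5%.
school_demand = scale_demand(1000.0, 50_000_000, 52_500_000)
print(school_demand)
```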

Modeling the workforce demand implications of COVID-19

Perhaps the largest impact of COVID-19 on future demand for health care workers is the lowered population projections due to COVID-19 impacts on excess deaths and reduced immigration levels. Two additional demand impacts modeled are: (1) care needs associated with future incidence if COVID-19 becomes endemic, and (2) care needs associated with Long COVID.

Despite availability of a vaccine, incidence of COVID-19 continues as the SARS-CoV-2 virus evolves.15 Increasingly, it appears that COVID-19 is shifting from pandemic to endemic like seasonal influenza.16 A McKinsey & Company analysis of the potential long-term cost implications of COVID-19 predicts 110 to 220 million annual COVID-19 cases over the foreseeable future.17 The authors estimate that 10 to 15% of these new COVID-19 cases will require outpatient treatment. The CDC reports that 428,000 COVID-19 hospitalizations occurred between December 1, 2021, and November 30, 2022 (a time frame when COVID-19 vaccination was widely available at little or no cost to the public).18

For modeling, we use McKinsey’s lower bound estimates of 110 million new COVID-19 cases annually and 10% accessing outpatient treatment. This would suggest 11 million outpatient visits per year. If such visits were mainly to primary care providers, this would increase by about 2.2% the demand for primary care office visits and, by extension, increase by 2.2% the demand for health care workers (e.g., primary care physicians, nurse practitioners, physician assistants, nurses, and other office staff) working in primary care outpatient settings.

If the nation continued to have 428,000 hospitalizations annually as COVID-19 becomes endemic, and if the average length of stay (5.5 days per COVID-related hospitalization) continued, then the estimated 2.35 million inpatient days raise hospital-based care by about 1.5% above current levels.19 For modeling, we assume that this increase raises demand for hospital-based health care workers (e.g., hospitalists, critical care doctors, nurses, allied health workers) by a corresponding 1.5%.

Projected future demand for COVID-19 related care is in line with 2010-2016 annual number of influenza-associated outpatient medical visits (4.3 million to 16.7 million) and hospitalizations (139,000 to 708,000).20

The McKinsey study estimates that 3% of COVID-19 cases will result in Long COVID lasting 3-12 months, which equates to 3.3 million cases annually. Long COVID effects may require visits to both primary care providers and to specialists who treat these conditions (e.g., neurology, pulmonology, cardiology, infectious diseases, nephrology, and endocrinology). We assume that patients with Long COVID will have two doctor visits per case, or 6.6 million visits annually. If these visits were proportionately distributed across primary care physicians and the above-mentioned specialists, this would suggest about a 1% increase in annual visits to each of these specialties and, by extension, in demand for the health care workers required to provide this additional care. These estimates of increased demand for health care services and providers might be conservative, as Long COVID could cause permanent damage that necessitates lifetime treatment.
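The volume estimates used in this section can be reproduced with simple arithmetic, sketched below with the figures cited in the text. The variable names are illustrative. The percentage increases (2.2%, 1.5%, and about 1%) depend on baseline volumes that are not stated in this section, so they are not recomputed here.

```python
# Endemic COVID-19: outpatient visits (McKinsey lower-bound assumptions).
annual_cases = 110_000_000
outpatient_percent = 10
outpatient_visits = annual_cases * outpatient_percent // 100

# Endemic COVID-19: inpatient days (CDC hospitalizations, 5.5-day average stay).
hospitalizations = 428_000
average_los_days = 5.5
inpatient_days = hospitalizations * average_los_days

# Long COVID: 3% of cases, two doctor visits assumed per case.
long_covid_percent = 3
visits_per_long_covid_case = 2
long_covid_cases = annual_cases * long_covid_percent // 100
long_covid_visits = long_covid_cases * visits_per_long_covid_case

# Comparison with the 2010-2016 influenza ranges cited in the text.
within_flu_outpatient_range = 4_300_000 <= outpatient_visits <= 16_700_000
within_flu_hospitalization_range = 139_000 <= hospitalizations <= 708_000

print(outpatient_visits, inpatient_days, long_covid_cases, long_covid_visits)
```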

Staffing to meet demand for health care services

By applying information on staffing patterns, HWSM converts demand for visits and other utilization measures (described previously) into demand for FTEs by occupation or specialty.

The base year staffing ratio is calculated by dividing the national volume of service used by the number of health care professionals employed in each setting. In prior years, this approach assumed the base year demand for services in each setting was fully met by the available professionals in that setting, with the exception of primary care and psychiatry, where the number of providers required to remove the health professional shortage area (HPSA) designations was used as a proxy for national shortfall. There is growing concern about current shortfalls in many occupations and care settings. In individual chapters, we discuss estimates of shortfalls, vacancy rates, or other information indicating shortfalls of health care workers for select physician specialties, nurses, and allied health and oral health occupations.

For occupations that provide services in a single setting, base year utilization is divided by the base year supply to derive the staffing ratio for that occupation. The staffing ratio is then applied to the projected volume of services to obtain the projected demand for providers in every year after the base year.
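For a single-setting occupation, the calculation can be sketched as below. The staffing ratio is expressed as units of service per FTE, so projected demand is the projected volume divided by the ratio. The function names and the visit and FTE counts are hypothetical.

```python
def staffing_ratio(base_year_utilization, base_year_supply_fte):
    """Units of service (e.g., visits) delivered per FTE in the base year."""
    return base_year_utilization / base_year_supply_fte


def projected_demand_fte(projected_utilization, ratio):
    """FTEs required to deliver the projected volume at the base-year ratio."""
    return projected_utilization / ratio


# Hypothetical occupation: 2,000,000 visits delivered by 500 FTEs in the base year.
ratio = staffing_ratio(2_000_000, 500)
# If projected visits grow 10% to 2,200,000, demand grows 10% to 550 FTEs.
demand = projected_demand_fte(2_200_000, ratio)
print(ratio, demand)
```

Because the ratio is held constant, projected FTE demand grows at the same rate as the projected volume of services.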

For occupations that provide services across multiple settings (e.g., nurses and therapists), information from the ACS or from the OEWS on the employment distribution of the care providers in the base year determines the number of individuals working in each setting. In general, ACS data is used for occupations where some health care workers might be self-employed (e.g., chiropractors) and OEWS data is used for occupations where health care workers are primarily employees (e.g., allied health occupations, nurses).
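For a multi-setting occupation, the base-year split by setting can be sketched as follows. The setting names, shares, and FTE total are hypothetical and merely stand in for an ACS- or OEWS-derived employment distribution.

```python
def allocate_supply_by_setting(total_supply_fte, employment_shares):
    """Split base-year FTEs across settings using employment shares."""
    if abs(sum(employment_shares.values()) - 1.0) > 1e-9:
        raise ValueError("employment shares must sum to 1")
    return {
        setting: total_supply_fte * share
        for setting, share in employment_shares.items()
    }


# Hypothetical distribution for an occupation with 1,000 base-year FTEs.
shares = {"hospital": 0.60, "office": 0.25, "nursing home": 0.15}
supply_by_setting = allocate_supply_by_setting(1000.0, shares)
print(supply_by_setting)
```

Each setting's FTE count can then serve as the base-year supply when a setting-specific staffing ratio is derived.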

The modeled staffing ratios for health occupations are summarized in occupation-specific modules of this report.

Different types of health care workers overlap in their ability to provide services, and the status quo demand projections assume that care use and delivery patterns will remain unchanged over the projection horizon. When comparing demand to supply, one should often look at categories of providers rather than specific occupations or specialties. For example, family physicians provide care that overlaps that provided by other primary care physicians (pediatricians, general internists, and geriatricians) and some specialist physicians, as well as by providers in other occupations (e.g., physician assistants and nurse practitioners). Likewise, some services provided by more highly trained providers (e.g., RNs or physicians) could be provided by less trained providers (e.g., LPNs or advanced practice providers). Hence, there is some flexibility within the health care system to shift some care activities between occupations, both for cost effectiveness reasons and if there is a shortfall of a particular provider type.

Demand scenarios

Status quo scenario

The status quo demand scenario in HWSM assumes current national patterns of care use and delivery to the modeled population remain relatively unchanged over time. This scenario models demand considering population demographics, health risk factors, disease prevalence, and economic factors correlated with demand for health care services. It captures population growth and aging over time, as well as geographic variation in demand determinants. When compared against supply projections, this scenario helps inform whether there will be sufficient supply to provide a level of care at least consistent with current levels. The main demand drivers of this scenario are population growth and aging. Changing racial/ethnic diversity also affects demand.

Reduced barriers scenario

A hypothetical reduced barriers scenario was added to HWSM in 2019. National and state goals, as described in initiatives such as Healthy People 2030, are to remove barriers that contribute to inequities in use of services and health outcomes. This will improve access to affordable, high quality care, especially preventive services.21 This is also part of HRSA’s strategic plan: To improve health outcomes and address health disparities through access to quality services, a skilled health workforce, and innovative, high-value programs.22

This scenario first identifies a population that likely faces few access barriers to care. For modeling, we assume this is non-Hispanic White, with insurance, living in a metropolitan area. For oral health, this scenario also includes people in the top income level modeled in HWSM—household income of $75,000 or greater.

Then, using the health care use prediction equations estimated with MEPS data, we simulate if people not in this group had care utilization rates similar to the population likely experiencing fewer access barriers. Examples of people outside the likely group are racial or ethnic minorities, those without insurance, and people living in a nonmetropolitan area.
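The counterfactual step can be sketched as follows: each person's access-related characteristics are set to those of the reference group, all other characteristics are held fixed, and the prediction equation is applied again. The log-link form, the variable names, and every coefficient below are hypothetical placeholders, not estimates from MEPS or HWSM.

```python
import math

# Hypothetical coefficients for a log-link count model (illustration only).
COEFFICIENTS = {
    "intercept": 0.5,
    "age_65_plus": 0.30,
    "uninsured": -0.40,
    "nonmetro": -0.10,
    "minority": -0.20,
}
ACCESS_VARIABLES = ("uninsured", "nonmetro", "minority")


def predicted_visits(person):
    """Expected annual visits under the (hypothetical) prediction equation."""
    linear = COEFFICIENTS["intercept"]
    for name, value in person.items():
        linear += COEFFICIENTS[name] * value
    return math.exp(linear)


def reduced_barriers_visits(person):
    """Expected visits if access-related characteristics matched the reference group."""
    counterfactual = dict(person)
    for name in ACCESS_VARIABLES:
        counterfactual[name] = 0
    return predicted_visits(counterfactual)


# Hypothetical person: age 65+, uninsured, metropolitan, minority.
person = {"age_65_plus": 1, "uninsured": 1, "nonmetro": 0, "minority": 1}
status_quo = predicted_visits(person)
reduced_barriers = reduced_barriers_visits(person)
print(round(status_quo, 3), round(reduced_barriers, 3))
```

The difference between the two predictions, summed over the population, would represent the additional service use attributed to reduced barriers in this sketch.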

For women’s health, the metropolitan/nonmetropolitan component of the reduced barriers scenario was omitted because women in nonmetropolitan areas use slightly more services than their peers in metropolitan areas. This scenario only models the additional demand for providers associated with gaining insurance and if minority populations had care use patterns like those of non-Hispanic White women and adolescent girls.

Modeled scenarios in prior reports

Prior reports for general surgeons and allied health and select other occupations modeled an evolving care delivery system scenario that builds on the status quo scenario. This scenario was later replaced by the reduced barriers scenario for two reasons. First, there were data limitations on the potential workforce impact of evolving trends in care delivery. Also, the reduced barriers scenario more clearly and rigorously models key national goals around improving equity in health care access for vulnerable populations.

The prior evolving care delivery system scenario modeled that the health care system continues to evolve reflecting: innovation and evidence-based medicine; economic considerations including payment reform and aligning patient incentives and health plan incentives; growing use of team-based care with each occupation contributing based on their specialization and evolving scope of practice; and public expectations and policies around population health, care access, and quality. Modeling for physicians suggests that some components of an evolving care delivery system will have contradictory effects on demand.23 Some components will increase demand (e.g., improving access to care, and increasing longevity through improved population health). Some components will decrease demand (e.g., improved preventive care reducing disease onset). Some components will shift care between providers (e.g., from physicians to advanced practice providers). Some components will shift care between settings (e.g., shifting care from emergency departments or hospitals to appropriate ambulatory settings).

Date Last Reviewed:

October 2023

1 Multiple years of MCBS data are sometimes used to increase sample size. The combined 2019 and 2018 MCBS are the most recent data available.
2 The 2019 MDS file is the most recent data available.
3 Centers for Disease Control and Prevention. NCHS Urban-Rural Classification Scheme for Counties.
4 The first round of BRFSS-ACS matching produced a match in the same strata for 92% of the population. To match the remaining 8%, the eight income levels were collapsed into four (2% matched), then the race/ethnicity dimension was dropped (2% matched), then the same criteria as the first round were applied except that State was removed as a stratum (remaining 4% matched), and finally, for the fifth round, only demographics were included (remaining 0.05% matched).
5 Kaiser Family Foundation. Total Number of Residents in Certified Nursing Facilities. Published 2022. Accessed May 31, 2023.
6 Caffrey C, Sengupta M, Melekin A. Residential care community resident characteristics: United States, 2018. National Center for Health Statistics; 2021. Accessed May 31, 2023.
7 The BRFSS, administered annually by the CDC, collects data on a sample of over 500,000 individuals. Like the ACS, the BRFSS includes demographics, household income, and medical insurance status on a stratified random sample of households in each state. The BRFSS also collects detailed information on the presence of chronic conditions and other health risk factors (e.g., obesity, smoking). We combined the 2019 and 2021 files to provide records for approximately one million individuals. We used the 2021 BRFSS to model asthma probability for children, as it is the most recent survey where child age is identified and excluded in the 2019 file.
To create the health risk factor dataset, we gathered health status prevalence percentages for each individual county and county equivalent in the United States (approximately 3,142 counties within 50 states and the District of Columbia). The prevalence of 12 health risk factors/conditions in the county-level population databases is representative of the prevalence of 12 risk factor categories from BRFSS. They are coronary heart disease, stroke, current smoking, heart attack, current asthma, obesity, diabetes, high blood pressure, arthritis, cancer, high cholesterol, and current insurance status. Smoking, asthma, obesity, and insurance status reflect the individual’s current status, while the other 8 categories reflect lifetime status. Obesity status is calculated based on the individual’s current weight.
8 U.S. Department of Agriculture. USDA ERS - Population & Migration. Published May 6, 2022. Accessed July 12, 2022.
9 The logistic regression equations are unweighted because the independent variables (e.g., demographics) are the same variables used in the development of the BRFSS sample weights.
10 Centers for Disease Control and Prevention. PLACES: Local Data for Better Health. Published December 2022. Accessed December 31, 2022.
11 County level prevalence data are not published for Alabama, Arkansas, District of Columbia, Idaho, Indiana, Louisiana, Massachusetts, Minnesota, Mississippi, Nebraska, New York, North Carolina, North Dakota, South Dakota, Utah, Vermont, Virginia, Wisconsin, and Wyoming.
12 National Center for Health Statistics. NCHS Urban-Rural Classification Scheme for Counties.
13 Prior to 2019, the prediction equations modeled annual visits using Poisson regression. In response to inquiries about issues of over dispersion, potential alternative regression models were evaluated, and negative binomial regression replaced Poisson regression. This change had minimal impact on demand projections.
14 The model currently uses the 2019 NIS and 2018 and 2019 NHAMCS files.
15 Markov PV, Ghafari M, Beer M, et al. The evolution of SARS-CoV-2. Nat Rev Microbiol. 2023;21(6):361-379.
16 Klobucista C, Ferragamo M. When Will COVID-19 Become Endemic? Council on Foreign Relations. May 24, 2023. Accessed June 29, 2023.
17 Patel N, Singhal S. What to Expect in US Healthcare in 2023 and Beyond. McKinsey & Company; 2023. Accessed June 29, 2023.
18 Centers for Disease Control and Prevention. COVID Data Tracker: Hospitalizations. Centers for Disease Control and Prevention. Published June 29, 2023. Accessed June 29, 2023.
19 Zeleke AJ, Moscato S, Miglio R, Chiari L. Length of Stay Analysis of COVID-19 Hospitalizations Using a Count Regression Model and Quantile Regression: A Study in Bologna, Italy. Int J Environ Res Public Health. 2022;19(4):2224.
20 Rolfes MA, Foppa IM, Garg S, et al. Annual estimates of the burden of seasonal influenza in the United States: A tool for strengthening influenza surveillance and preparedness. Influenza Other Respir Viruses. 2018;12(1):132-137.
21 U.S. Department of Health and Human Services. Healthy People 2030.
22 Health Resources & Services Administration. Strategic Plan FY 2019-2022. Department of Health and Human Services; 2019. Accessed June 8, 2020.
23 Association of American Medical Colleges. The Complexities of Physician Supply and Demand: Projections From 2019 to 2034. AAMC; 2021. Accessed October 4, 2022.