New Mexico Primary Care Physician Accessibility Models

Preliminary Analysis with R - 2002 Data

Larry Spear, UNM (11/20/2018)

This preliminary analysis in R will eventually compare the results from all of the generalized two-step and one-step models. Four distance decay methods have been used: exponential, power, Gaussian, and DGR power. Several analytical techniques are employed, including exploratory data analysis (graphics), ANOVA, T-Tests and related diagnostics, Moran’s I test for spatial autocorrelation, and eventually a spatially oriented ANOVA. Only the results and a brief discussion are presented here. A more comprehensive version including the data and R code will be prepared later using Jupyter Notebook. An ArcGIS Online Story Map with a more in-depth discussion of results will also be developed in the future.

The following Group or Item names designate the individual methods (Note: the prefix Phys_ appears in some R output because of variable naming conventions):

2SEE - Two Step Hybrid Zonal, Exponential Function

2SEG - Two Step Hybrid Zonal, Gaussian Function

2SEP - Two Step Hybrid Zonal, Std. Power Function

2SED - Two Step Hybrid Zonal, DGR Power Function

1SED - One Step Hybrid Zonal, DGR Power Function

1SEE - One Step Hybrid Zonal, Exponential Function

1SEG - One Step Hybrid Zonal, Gaussian Function

1SEP - One Step Hybrid Zonal, Std. Power Function

Two-Step Models Compared with One-Step (DGR) Model

This preliminary analysis compares the generalized two-step methods using exponential, power, and Gaussian distance decay with the one-step model using the DGR power distance decay method. Additional comparisons of the two-step models with the other one-step models (exponential, power, and Gaussian distance decay) are presented after the analysis procedures have first been tested and refined here.

Summary Statistics - table shows the resulting minimum, quartile, median, mean, and maximum values (physicians per 1000 population) for the one-step (DGR) and two-step accessibility models with various distance decay methods:

Phys_1SED        Phys_2SEE        Phys_2SEG        Phys_2SEP

 Min.   :0.0437   Min.   :0.0000   Min.   :0.0000   Min.   :0.0000

 1st Qu.:0.3522   1st Qu.:0.4648   1st Qu.:0.4648   1st Qu.:0.4648

 Median :0.8060   Median :1.1644   Median :1.1678   Median :1.2384

 Mean   :0.6304   Mean   :1.0100   Mean   :1.0100   Mean   :1.0107

 3rd Qu.:0.8327   3rd Qu.:1.4666   3rd Qu.:1.4666   3rd Qu.:1.4666

 Max.   :2.7440   Max.   :5.4660   Max.   :5.5191   Max.   :5.5201

Note: The one-step model (1SED) with the DGR power decay method has both a lower mean (0.6304) and a smaller standard deviation (0.305) than all the two-step models. Two other mean values are important to consider. The overall statewide mean, derived by dividing the estimated number of primary care physicians in 2002 (1,167) by the state population estimate for 2002 (1,874,591) and expressing the result per 1000 population, is 0.62235. The county-based service area (COSVAR) mean is 0.437665. The one-step model yields the mean value closest to the overall statewide mean (Note: the small difference may be due to round-off).
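As a quick arithmetic check, the statewide ratio and the comparison of model means can be sketched in Python (this is not the R workflow used for the analysis). The function name and the dictionary of means are illustrative assumptions; the numbers are the ones quoted in the note and in the summary table above.

```python
# Figures quoted in the note above (2002 estimates).
PHYSICIANS = 1167
POPULATION = 1_874_591


def rate_per_1000(providers, people):
    """Providers per 1,000 residents (hypothetical helper name)."""
    return providers / people * 1000


statewide_rate = rate_per_1000(PHYSICIANS, POPULATION)

# Model means copied from the summary table (physicians per 1000 population).
model_means = {"1SED": 0.6304, "2SEE": 1.0100, "2SEG": 1.0100, "2SEP": 1.0107}

# The model whose mean has the smallest absolute gap to the statewide rate.
closest_model = min(model_means, key=lambda m: abs(model_means[m] - statewide_rate))

print(f"statewide rate is roughly {statewide_rate:.2f} per 1000")
print("closest model mean:", closest_model)
```

Computed this way the statewide rate comes out at roughly 0.62 physicians per 1000 population, in line with the figure quoted above up to rounding.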

Boxplots, Histograms – and related plots are useful for visualizing the differences between the one-step (DGR power distance decay) and the two-step models:

Note: These plots suggest that there may be a significant difference between the two-step model results and the one-step model results. The median values, interquartile ranges, and outliers (maximum values) are very different. It is also apparent from the histograms that none of the resulting distributions appears to be normally distributed.

ANOVA (one-way) – test and related results are shown below. The Null Hypothesis (H₀) is that the means from the various models are the same. The Alternative Hypothesis (H_a) is that at least one of the model means is not equal to the others.

              Df Sum Sq Mean Sq F value Pr(>F)

Group          3     54  17.998   53.75 <2e-16 ***

Residuals   1992    667   0.335

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Tukey multiple comparisons of means

95% family-wise confidence level

Fit: aov(formula = Phys_per_P ~ Group, data = data_test1_df)

$`Group`

                   diff         lwr         upr     p adj

2SEG-2SEE  0.0000242485 -0.09416672  0.09421522 1.0000000

2SEP-2SEE  0.0006913828 -0.09349959  0.09488236 0.9999976

1SED-2SEE -0.3795959920 -0.47378697 -0.28540502 0.0000000

2SEP-2SEG  0.0006671343 -0.09352384  0.09485811 0.9999978

1SED-2SEG -0.3796202405 -0.47381121 -0.28542927 0.0000000

1SED-2SEP -0.3802873747 -0.47447835 -0.28609640 0.0000000

Levene's Test for Homogeneity of Variance (center = median)

        Df F value    Pr(>F)

group    3  59.671 < 2.2e-16 ***

      1992

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

One-way analysis of means (not assuming equal variances)

data: Phys_per_P and Group

F = 103.65, num df = 3.0, denom df = 1032.2, p-value < 2.2e-16

Pairwise comparisons using t tests with non-pooled SD

data: data_test1_df$Phys_per_P and data_test1_df$Group

     2SEE   2SEG   2SEP

2SEG 1      -      -

2SEP 1      1      -

1SED <2e-16 <2e-16 <2e-16

P value adjustment method: BH

Shapiro-Wilk normality test

data: aov_residuals

W = 0.85362, p-value < 2.2e-16

Kruskal-Wallis rank sum test

data: Phys_per_P by Group

Kruskal-Wallis chi-squared = 170.93, df = 3, p-value < 2.2e-16

Note: Because the p-value (<2e-16 ***) is so small, the ANOVA test indicates that the Null Hypothesis (H₀) can be rejected in favor of the Alternative Hypothesis (H_a). There appears to be a significant difference between at least one of the model means and the others. However, three important assumptions should be considered when applying ANOVA: 1) the data are independent and obtained randomly from the population; 2) the data are normally distributed; and 3) the data have common variances. None of these assumptions has been met here, so these results should be interpreted with caution. The results are not independent or obtained from a random experiment, and there is evidence of more than moderate spatial autocorrelation (see Moran’s I test). The previous histograms show that the data are not normally distributed. The summary statistics show a lack of common variances. If necessary, routine measures such as data transformations can be employed subsequently. Regardless, it is important to present these results using standard ANOVA and related diagnostic techniques, which are routinely used even in geographically and spatially oriented research studies. Research is also underway to eventually conduct a spatial ANOVA test to see whether the results change noticeably (see below).
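For readers who want to see how the F value in the ANOVA table is formed, the following is a minimal Python sketch of a one-way ANOVA (the analysis itself was run with aov in R). The function name and the toy data are hypothetical; only the two mean squares at the end are taken from the table above.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group (residual) sum of squares: spread inside each group.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical toy data: three small groups.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
toy_f, toy_df_between, toy_df_within = one_way_anova(toy_groups)

# The same ratio applied to the rounded mean squares printed in the table.
table_f = 17.998 / 0.335
print(toy_f, toy_df_between, toy_df_within, round(table_f, 1))
```

The ratio of the printed mean squares reproduces the reported F value of 53.75 up to the rounding of the residual mean square.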

The additional routine diagnostic tests confirm the initial observations and the standard ANOVA results. The Tukey multiple comparison of means gives an adjusted p-value that rounds to zero (0.0000000) whenever the one-step model is compared with any of the two-step methods, while the two-step methods are indistinguishable from one another (adjusted p-values near 1). Levene’s test for homogeneity of variance also has a low p-value (< 2.2e-16 ***), which suggests that the variances are not common across models. The pairwise t-tests with no assumption of equal variance also indicate that the one-step model is significantly different from the two-step models (p-values < 2e-16). The Shapiro-Wilk normality test p-value (< 2.2e-16) also indicates a lack of normality. The Kruskal-Wallis rank sum test (non-parametric), which can be used when ANOVA assumptions are not met, does not change the outcome: its p-value (< 2.2e-16) confirms that the Null Hypothesis (H₀) can be rejected in favor of the Alternative Hypothesis (H_a). The need for caution in interpreting the ANOVA results is further confirmed by the Normal QQ plot of standardized residuals, which should be approximately normally distributed. The residuals deviate considerably from a straight line, confirming a lack of the desired normality.
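The rank-based logic behind the Kruskal-Wallis test can be illustrated with a short Python sketch. It is a textbook formulation with the usual tie correction, written here for illustration on hypothetical toy data; it is not the R code that produced the output above, and all function names are assumptions.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, df): the tie-corrected statistic and its degrees of freedom."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    weighted = 0.0
    offset = 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        weighted += rank_sum ** 2 / len(g)
        offset += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Tie correction: equals 1 when every pooled value is distinct.
    tie_sizes = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    return h / correction, len(groups) - 1


# Hypothetical toy data with fully separated groups and no ties.
toy_h, toy_df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(toy_h, toy_df)
```

H is compared against a chi-squared distribution with (number of groups minus 1) degrees of freedom, which is why the output above reports df = 3 for the four models.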

Moran’s I – results of the global test for spatial autocorrelation, using a queen’s case neighbors list and row-standardized weights, are shown below for each method:

Neighbour list object: Queen’s case

Number of regions: 499

Number of nonzero links: 2960

Percentage nonzero weights: 1.18875

Average number of links: 5.931864

Weights style: W

Weights constants summary:

n nn S0 S1 S2

W 499 249001 499 185.3664 2095.07

moran.range(Results.lw)

[1] -0.7214727 1.0623680

Moran I test under randomisation

data: Results_Pop_Phys_spdf$Phys_2SEE

weights: Results.lw

Moran I statistic standard deviate = 17.861, p-value < 2.2e-16

alternative hypothesis: greater

sample estimates:

Moran I statistic Expectation Variance

0.4786685352 -0.0020080321 0.0007242812

Moran I test under randomisation

data: Results_Pop_Phys_spdf$Phys_2SEG

weights: Results.lw

Moran I statistic standard deviate = 17.715, p-value < 2.2e-16

alternative hypothesis: greater

sample estimates:

Moran I statistic Expectation Variance

0.4746291857 -0.0020080321 0.0007239152

Moran I test under randomisation

data: Results_Pop_Phys_spdf$Phys_2SEP

weights: Results.lw

Moran I statistic standard deviate = 17.825, p-value < 2.2e-16

alternative hypothesis: greater

sample estimates:

Moran I statistic Expectation Variance

0.4776201197 -0.0020080321 0.0007239826

Moran I test under randomisation

data: Results_Pop_Phys_spdf$Phys_1SED

weights: Results.lw

Moran I statistic standard deviate = 21.084, p-value < 2.2e-16

alternative hypothesis: greater

sample estimates:

Moran I statistic Expectation Variance

0.5673121641 -0.0020080321 0.0007291042

Note: There is significant spatial autocorrelation for the one-step model and all the two-step models (positive Moran’s I statistics, very low p-values, and large standard deviates). These results indicate strong clustering, and it is extremely unlikely (less than 1%) that these clustered patterns could be the result of random chance. The one-step model is somewhat more clustered (Moran’s I of 0.567) than the two-step models (about 0.475 to 0.479). This lack of independence violates a major assumption of standard ANOVA. Spatial ANOVA is an alternative test method, not widely used or well documented, that can take non-independence or spatial autocorrelation into consideration.
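The quantities in the Moran’s I output can be traced with a small Python sketch. This is a generic formulation of global Moran’s I with row-standardized weights, not the spdep code used for the analysis. The four-region neighbor list is a hypothetical toy example and the function names are assumptions. The last lines use only values printed above for the 2SEE model.

```python
import math


def row_standardize(neighbors):
    """Turn a neighbor list into weights that sum to 1 for each region.

    Assumes every region has at least one neighbor.
    """
    return {i: {j: 1.0 / len(js) for j in js} for i, js in neighbors.items()}


def morans_i(values, weights):
    """Global Moran's I for values indexed 0..n-1 and nested weight dicts."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(w for row in weights.values() for w in row.values())
    cross = sum(w * dev[i] * dev[j]
                for i, row in weights.items() for j, w in row.items())
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical toy example: four regions in a row with steadily rising values.
toy_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
toy_i = morans_i([1.0, 2.0, 3.0, 4.0], row_standardize(toy_neighbors))

# Expectation under no spatial autocorrelation is -1 / (n - 1); n = 499 regions.
expected_i = -1.0 / (499 - 1)

# Standard deviate for 2SEE from the printed statistic and variance.
z_2see = (0.4786685352 - expected_i) / math.sqrt(0.0007242812)
print(toy_i, expected_i, round(z_2see, 3))
```

The expectation and the standard deviate computed this way agree with the printed values (-0.0020080321 and 17.861).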

Spatial ANOVA (one-way) – currently being prepared!

ANOVA is also a linear model that uses categorical variables instead of the continuous independent or predictor variables used in regression. For non-spatial models in R, similar results can be obtained from ANOVA (anova or aov) and a linear model (lm) fitted to the same dataset with categorical independent variables. An ANOVA is constructed from a linear model; it is just a linear model whose results are reported differently from those of a linear regression model. However, I have not yet been able to get spatial regression models in R (using the spdep library) to work with categorical independent data. More research is currently being conducted and hopefully results will be available soon. The results from a non-spatial linear model (lm) are shown below:

Call:

lm(formula = Phys_per_P ~ Group, data = data_test4_df)

Residuals:

Min 1Q Median 3Q Max

-1.0107 -0.4774 0.1755 0.4501 4.5094

Coefficients:

               Estimate Std. Error t value Pr(>|t|)

(Intercept)     0.63043    0.02590   24.34   <2e-16 ***

GroupPhys_2SEE  0.37960    0.03663   10.36   <2e-16 ***

GroupPhys_2SEG  0.37962    0.03663   10.36   <2e-16 ***

GroupPhys_2SEP  0.38029    0.03663   10.38   <2e-16 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5786 on 1992 degrees of freedom

Multiple R-squared: 0.07489, Adjusted R-squared: 0.0735

F-statistic: 53.75 on 3 and 1992 DF, p-value: < 2.2e-16

Note: The intercept is the mean (0.63043) for the one-step model (1SED) using the DGR power-based distance decay method. The other coefficients are the differences from this mean (for example, the mean of the 2SEE method is 0.63043 + 0.37960 = 1.01003). The F-statistic (53.75) is the same as that derived from the ANOVA.
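The link between the regression coefficients and the group means can be illustrated with a small Python sketch. It assumes treatment (dummy) coding with one reference group, which is what the output above suggests. The group labels, data, and function names are hypothetical. The check at the end verifies the least-squares condition that residuals sum to zero within every design column.

```python
def dummy_coded_fit(groups, reference):
    """Intercept = reference-group mean; effects = other means minus it."""
    means = {name: sum(v) / len(v) for name, v in groups.items()}
    intercept = means[reference]
    effects = {name: m - intercept for name, m in means.items() if name != reference}
    return intercept, effects


def residual_sums_by_column(groups, reference, intercept, effects):
    """Sum of residuals for the intercept column and each dummy column.

    All sums are zero exactly when the coefficients solve the normal equations.
    """
    sums = {"(Intercept)": 0.0}
    sums.update({name: 0.0 for name in effects})
    for name, values in groups.items():
        fitted = intercept + effects.get(name, 0.0)
        for y in values:
            residual = y - fitted
            sums["(Intercept)"] += residual
            if name != reference:
                sums[name] += residual
    return sums


# Hypothetical toy data; "ref" plays the role of the reference group.
toy = {"ref": [1.0, 2.0, 3.0], "b": [2.0, 3.0, 4.0], "c": [5.0, 6.0, 7.0]}
intercept, effects = dummy_coded_fit(toy, "ref")
column_sums = residual_sums_by_column(toy, "ref", intercept, effects)

# Same arithmetic with the printed coefficients: intercept plus the 2SEE effect.
implied_2see_mean = 0.63043 + 0.37960
print(intercept, effects, round(implied_2see_mean, 5))
```

Adding the intercept to a group's coefficient recovers that group's mean, which is the arithmetic shown in the note for 2SEE.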

Two-Step Exponential Model Compared with One-Step Exponential Model

Summary Statistics - table shows the resulting means, standard deviations, minimum and maximum values (physicians per 1000 population) for the two-step and one-step accessibility models with an exponential distance decay method:

  Group count  mean    sd    min   max

  <ord> <int> <dbl> <dbl>  <dbl> <dbl>

1 2SEE    499 1.01  0.642 0       5.47

2 1SEE    499 0.621 0.198 0.163   1.11

Note: The one-step model (1SEE) with the exponential decay has both a lower mean (0.621) and a smaller standard deviation (0.198) than the two-step model with the exponential distance decay method. The one-step model mean is closer to the statewide mean (0.62235).

Boxplots, Histograms – and related plots are useful for visualizing the differences between the one-step and the two-step models:

Note: These plots suggest that there may be a significant difference between the two-step model results and the one-step model results.

T-Test – results are shown below. The Null Hypothesis (H₀) is that the means of the two models are the same (difference equal to 0). The Alternative Hypothesis (H_a) is that the means of these models are not the same (difference not equal to 0).

Welch Two Sample t-test

data: G2SFCAEE_df$Phys_per_P and G1SHGMEE_df$Phys_per_P

t = 12.93, df = 591.97, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

0.3298831 0.4480436

sample estimates:

mean of x mean of y

1.0100251 0.6210617

Two Sample t-test

data: G2SFCAEE_df$Phys_per_P and G1SHGMEE_df$Phys_per_P

t = 12.93, df = 996, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

0.3299321 0.4479945

sample estimates:

mean of x mean of y

1.0100251 0.6210617

Note: The p-values (< 2.2e-16) from the Welch t-test (which allows for unequal variances) and the standard t-test are the same and are so small that the Null Hypothesis (H₀) can clearly be rejected in favor of the Alternative Hypothesis (H_a).
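The Welch statistic and its degrees of freedom can be approximately reproduced from the summary statistics alone, as the following Python sketch shows. It uses the standard Welch-Satterthwaite formula, with the rounded standard deviations from the summary table and the means from the t-test output above. The function name is an assumption, and small departures from the printed values are expected because the inputs are rounded.

```python
import math


def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# 2SEE versus 1SEE: means from the t-test output, sds and counts from the table.
welch_t, welch_df = welch_from_summary(1.0100251, 0.642, 499,
                                       0.6210617, 0.198, 499)

# The standard (pooled) t-test simply uses n1 + n2 - 2 degrees of freedom.
pooled_df = 499 + 499 - 2
print(round(welch_t, 2), round(welch_df, 1), pooled_df)
```

With equal group sizes the Welch and pooled t statistics coincide, which is why both tests above report t = 12.93 and differ only in their degrees of freedom.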

Moran’s I – results of the global test for spatial autocorrelation, using a queen’s case neighbors list and row-standardized weights, are shown below for each model:

Characteristics of weights list object:

Neighbour list object: Queen’s case

Number of regions: 499

Number of nonzero links: 2960

Percentage nonzero weights: 1.18875

Average number of links: 5.931864

Weights style: W

Weights constants summary:

n nn S0 S1 S2

W 499 249001 499 185.3664 2095.07

moran.range(Results.lw)

[1] -0.7214727 1.0623680

Moran I test under randomisation

data: Results_Pop_Phys_spdf$Phys_2SEE

weights: Results.lw

Moran I statistic standard deviate = 17.861, p-value < 2.2e-16

alternative hypothesis: greater

sample estimates:

Moran I statistic Expectation Variance

0.4786685352 -0.0020080321 0.0007242812

Moran I test under randomisation

data: Results_Pop_Phys_spdf$Phys_1SEE

weights: Results.lw

Moran I statistic standard deviate = 33.136, p-value < 2.2e-16

alternative hypothesis: greater

sample estimates:

Moran I statistic Expectation Variance

0.897786920 -0.002008032 0.000737373

Note: There is significant spatial autocorrelation for both the one-step and the two-step models (positive Moran’s I statistics, very low p-values, and large standard deviates). These results indicate strong clustering, and it is extremely unlikely (less than 1%) that these clustered patterns could be the result of random chance. The one-step model is considerably more clustered (Moran’s I of 0.898) than the two-step model (0.479). This lack of independence violates an important T-Test assumption. A more appropriate spatial statistical test method is currently being researched, and eventually those results (which may not be very different) will be presented.

Two-Step Power Model Compared with One-Step Power Model

  Group count  mean    sd    min   max

  <ord> <int> <dbl> <dbl>  <dbl> <dbl>

1 2SEP    499 1.01  0.646 0       5.52

2 1SEP    499 0.633 0.367 0.0171  4.35

Note: The one-step model (1SEP) has both a lower mean (0.633) and a smaller standard deviation (0.367) than the two-step model with power distance decay. The one-step model mean is closer to the statewide mean (0.62235).

Boxplots, Histograms – and related plots are useful for visualizing the differences between the one-step and the two-step models:

Note: These plots suggest that there may be a significant difference between the two-step model results and the one-step model results.

Welch Two Sample t-test

data: G2SFCAEP_df$Phys_per_P and G1SHGMEP_df$Phys_per_P

t = 11.357, df = 789.32, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

0.3125421 0.4431561

sample estimates:

mean of x mean of y

1.0107164 0.6328673

Two Sample t-test

data: G2SFCAEP_df$Phys_per_P and G1SHGMEP_df$Phys_per_P

t = 11.357, df = 996, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

0.3125629 0.4431353

sample estimates:

mean of x mean of y

1.0107164 0.6328673

Moran’s I – results of the global test for spatial autocorrelation, using a queen’s case neighbors list and row-standardized weights, are shown below for each model:

Characteristics of weights list object:

Neighbour list object: Queen’s case

Number of regions: 499

Number of nonzero links: 2960

Percentage nonzero weights: 1.18875

Average number of links: 5.931864

Weights style: W

Weights constants summary:

n nn S0 S1 S2

W 499 249001 499 185.3664 2095.07

moran.range(Results.lw)

[1] -0.7214727 1.0623680

Moran I test under randomisation

data: Results_Pop_Phys_spdf$Phys_2SEP

weights: Results.lw

Moran I statistic standard deviate = 17.825, p-value < 2.2e-16

alternative hypothesis: greater

sample estimates:

Moran I statistic Expectation Variance

0.4776201197 -0.0020080321 0.0007239826

Moran I test under randomisation

data: Results_Pop_Phys_spdf$Phys_1SEP

weights: Results.lw

Moran I statistic standard deviate = 13.788, p-value < 2.2e-16

alternative hypothesis: greater

sample estimates:

Moran I statistic Expectation Variance

0.3636143537 -0.0020080321 0.0007032187

Note: There is significant spatial autocorrelation for both the one-step and the two-step models (positive Moran’s I statistics, very low p-values, and large standard deviates). These results indicate strong clustering, and it is extremely unlikely (less than 1%) that these clustered patterns could be the result of random chance. Interestingly, the two-step model is more clustered (Moran’s I of 0.478) than the one-step model (0.364), which is not the case for the other decay methods (exponential and Gaussian). This lack of independence violates an important T-Test assumption. A more appropriate spatial statistical test method is currently being researched, and eventually those results (which may not be very different) will be presented.

Two-Step Gaussian Model Compared with One-Step Gaussian Model

  Group count  mean    sd    min   max

  <ord> <int> <dbl> <dbl>  <dbl> <dbl>

1 2SEG    499 1.01  0.646 0       5.52

2 1SEG    499 0.576 0.203 0.0131  1.08

Note: The one-step model (1SEG) with the Gaussian distance decay has both a lower mean (0.576) and a smaller standard deviation (0.203) than the two-step model with Gaussian distance decay. The one-step model mean is closer to the statewide mean (0.62235).

Boxplots, Histograms – and related plots are useful for visualizing the differences between the one-step and the two-step models:

Note: These plots suggest that there may be a significant difference between the two-step model results and the one-step model results.

Welch Two Sample t-test

data: G2SFCAEG_df$Phys_per_P and G1SHGMEG_df$Phys_per_P

t = 14.342, df = 595.45, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

0.3749920 0.4939867

sample estimates:

mean of x mean of y

1.0100493 0.5755599

Two Sample t-test

data: G2SFCAEG_df$Phys_per_P and G1SHGMEG_df$Phys_per_P

t = 14.342, df = 996, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

0.3750407 0.4939380

sample estimates:

mean of x mean of y

1.0100493 0.5755599

Moran’s I – results of the global test for spatial autocorrelation, using a queen’s case neighbors list and row-standardized weights, are shown below for each model:

Characteristics of weights list object:

Neighbour list object: Queen’s case

Number of regions: 499

Number of nonzero links: 2960

Percentage nonzero weights: 1.18875

Average number of links: 5.931864

Weights style: W

Weights constants summary:

n nn S0 S1 S2

W 499 249001 499 185.3664 2095.07

moran.range(Results.lw)

[1] -0.7214727 1.0623680

Moran I test under randomisation

data: Results_Pop_Phys_spdf$Phys_2SEG

weights: Results.lw

Moran I statistic standard deviate = 17.715, p-value < 2.2e-16

alternative hypothesis: greater

sample estimates:

Moran I statistic Expectation Variance

0.4746291857 -0.0020080321 0.0007239152

Moran I test under randomisation

data: Results_Pop_Phys_spdf$Phys_1SEG

weights: Results.lw

Moran I statistic standard deviate = 26.433, p-value < 2.2e-16

alternative hypothesis: greater

sample estimates:

Moran I statistic Expectation Variance

0.7149166391 -0.0020080321 0.0007356133

Note: There is significant spatial autocorrelation for both the one-step and the two-step models (positive Moran’s I statistics, very low p-values, and large standard deviates). These results indicate strong clustering, and it is extremely unlikely (less than 1%) that these clustered patterns could be the result of random chance. The one-step model is considerably more clustered (Moran’s I of 0.715) than the two-step model (0.475). This lack of independence violates an important T-Test assumption. A more appropriate spatial statistical test method is currently being researched, and eventually those results (which may not be very different) will be presented.

Summary of Results

Preliminary Version for Discussion Only

My initial goal for this research project was to compare the one-step and two-step models given an assumption or hypothesis that there would not be any significant differences as both are essentially gravity models. However, I have been surprised by the results. There is clear evidence of a significant difference between the one-step and two-step models that have similar constructs (hybrid-zonal) and employ the same distance decay methods (power, exponential, or Gaussian).

It is important to note that there is no indication that either model is a better measure of reality than the other. This can perhaps only be determined when actual observational provider data records are reviewed, or statistical patient-based surveys are conducted. As these models are only approximations of reality, a given model can only be evaluated as somewhat more appropriate or suitable than the other given the availability and quality of input data, plus how well it matches the desired usage or purpose.

This project has been an exceptional learning exercise that has enabled me to gain a better understanding of statistical, spatial, and GIS software and techniques. I have focused on presenting the results as objectively as possible. However, I do have some personal observations that are worth presenting for discussion purposes. Hopefully other researchers will review these results and make future suggestions that will aid in my interpretations.

The one-step models are easier to operationalize but conceptually provide lower resolution than the two-step models. They can require fewer computational resources and GIS facilities. However, it is still necessary to calculate road-based distances and to use a GIS to display results on a map. Standard statistical packages and programming environments can be used to perform most of the calculations. The results seem to be more similar to the traditional county-based service capacity standards used by the federal government for measuring physician shortage areas. The mean values are similar to the statewide mean and have smaller variances. Although the one-step models appear to spread out the potential accessibility measures more evenly, they can also over-estimate in actual shortage areas. Regardless, a one-step model may be a good first step beyond the traditional county-based methods. One-step models are easier to explain to decision makers such as state legislators, as they are a simple extension of the standard service capacity ratio measure. A one-step model may be more suitable at the state level of geography than a similarly constructed two-step model. As such, one-step models may be a more appropriate choice for healthcare planners seeking necessary increased funding.

The two-step models require more effort to operationalize but conceptually provide higher resolution than the one-step models. More computational resources and GIS facilities are required. A more involved programming effort within the GIS and related scripting environments is necessary because the calculations are more complicated. The resulting mean values are noticeably larger, and the variances greater, than those of the traditional county-based service capacity standards used by the federal government for measuring physician shortage areas. The two-step models do not seem to spread out potential accessibility measures as evenly as the one-step models and may over-estimate or severely under-estimate in some areas. Regardless, they represent a definite technological advancement that should be applied when appropriate. However, they are more difficult to explain to a non-professional audience. A two-step model may be more suitable at the regional level of geography than a similarly constructed one-step model. Several varieties of two-step models continue to be developed by academic researchers, and their practical applications have been demonstrated.

There is more research work to be undertaken in evaluating the similarities and differences between the one-step and two-step models. This study used ZIP Code locations (post office locations or centroids) to locate physicians because these older data were all that was available. More recent physician data, if available, could be used to geocode addresses more accurately. From a more technical perspective, the effects of both the modifiable areal unit problem (MAUP) and the mathematical aspects of the calculations need more evaluation. It is also apparent that healthcare planners could benefit if better scripting applications were developed. These developments could make it easier to review results and allow for better selection of an appropriate model to employ.

I plan on completing a more in-depth review of the one-step and two-step models. These more detailed results will be presented using Esri’s ArcGIS Online Story Map facility. Further development of a Python-based scripting tool will also be completed. It will calculate selected models, compare results using both exploratory and statistical data analysis methods, and hopefully include a spatial version of ANOVA.

Larry Spear

Sr. Research Scientist (Ret.)

Division of Government Research

University of New Mexico

lspear@unm.edu

https://www.unm.edu/~lspear