New Mexico Primary Care Physician Accessibility Models
Preliminary Analysis with R - 2002 Data
Larry Spear, UNM (11/20/2018)
This preliminary analysis using R will eventually compare all the results from
the generalized two-step and the one-step models. Several distance decay
methods have been used: exponential, power, Gaussian, and DGR power. Several
analytical techniques will be employed, including exploratory data analysis
(graphics), ANOVA, t-tests and related diagnostics, Moran’s I test for spatial
autocorrelation, and eventually a spatially oriented ANOVA. Only the results
and a brief discussion are presented here. A more comprehensive version
including the data and R code will be prepared later using Jupyter Notebook.
An ArcGIS Online Story Map with a more in-depth discussion of results will
also be developed in the future.
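As a rough illustration of what the distance decay methods named above do, the sketch below uses the common textbook forms of the exponential, power, and Gaussian functions. It is written in Python rather than R, and it is only a sketch: this report does not state the functional forms or parameter values that were actually used, so the function names, the parameters (beta, sigma), and the sample distances are all assumptions. The DGR power variant is left out because its definition is not given here.

```python
import math

def exponential_decay(d, beta):
    """Weight falls off as exp(-beta * d); equals 1 at distance 0."""
    return math.exp(-beta * d)

def power_decay(d, beta):
    """Weight falls off as d ** -beta; undefined at distance 0."""
    if d <= 0:
        raise ValueError("power decay needs a positive distance")
    return d ** -beta

def gaussian_decay(d, sigma):
    """Weight falls off as exp(-d^2 / (2 * sigma^2)); equals 1 at distance 0."""
    return math.exp(-(d * d) / (2.0 * sigma * sigma))

# Hypothetical distances and parameters, chosen only to show the shapes.
for d in (1.0, 10.0, 30.0):
    print(d,
          round(exponential_decay(d, 0.1), 4),
          round(power_decay(d, 1.5), 4),
          round(gaussian_decay(d, 15.0), 4))
```

All three functions give less weight to more distant physicians; they differ in how quickly the weight drops and in how they behave near zero distance.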
The following Group or Item names have been used to designate the individual
methods (Note: the prefix Phys_ may be necessary in some output from R given
variable naming conventions):
2SEE - Two Step Hybrid Zonal, Exponential Function
2SEG - Two Step Hybrid Zonal, Gaussian Function
2SEP - Two Step Hybrid Zonal, Std. Power Function
2SED - Two Step Hybrid Zonal, DGR Power Function
1SED - One Step Hybrid Zonal, DGR Power Function
1SEE - One Step Hybrid Zonal, Exponential Function
1SEG - One Step Hybrid Zonal, Gaussian Function
1SEP - One Step Hybrid Zonal, Std. Power Function
Two-Step Models Compared with One-Step (DGR) Model
This preliminary analysis using R is based on a comparison of the generalized
two-step models, using exponential, power, and Gaussian distance decay, with
the one-step model using the DGR power distance decay method. Additional
comparisons of the two-step models with the other one-step models (exponential,
power, and Gaussian distance decay methods) are presented after the analysis
procedures have first been tested and refined here.
Summary Statistics
- table shows the resulting means, standard deviations, minimum and maximum
values, and quartiles (physicians per 1000 population) for the two-step and
one-step (DGR) accessibility models with various distance decay methods:
   Phys_1SED         Phys_2SEE         Phys_2SEG         Phys_2SEP
Min.   :0.0437    Min.   :0.0000    Min.   :0.0000    Min.   :0.0000
1st Qu.:0.3522    1st Qu.:0.4648    1st Qu.:0.4648    1st Qu.:0.4648
Median :0.8060    Median :1.1644    Median :1.1678    Median :1.2384
Mean   :0.6304    Mean   :1.0100    Mean   :1.0100    Mean   :1.0107
3rd Qu.:0.8327    3rd Qu.:1.4666    3rd Qu.:1.4666    3rd Qu.:1.4666
Max.   :2.7440    Max.   :5.4660    Max.   :5.5191    Max.   :5.5201
Note: The one-step model (1SED) with the DGR power decay method has both a
lower mean (0.6304) and less variability (standard deviation 0.305) than all
the two-step models. There are two other important mean values to be
considered. The overall statewide mean, derived by dividing the estimated
number of primary care physicians in 2002 (1,167) by the state population
estimate for 2002 (1,874,591) and expressing the result per 1000 population,
is 0.62235. The county-based service area (COSVAR) mean is 0.437665. The mean
value closest to the overall statewide mean is the one derived from the
one-step model (Note: the small difference may be due to rounding).
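The statewide benchmark is simple arithmetic, and a short Python sketch makes the comparison explicit. It takes the physician and population counts quoted in the note and the model means from the summary table; the variable and function names are only illustrative. With these rounded inputs the ratio comes out at about 0.6225 physicians per 1000 population, very close to the 0.62235 reported in the note.

```python
physicians_2002 = 1167        # estimated primary care physicians, 2002
population_2002 = 1874591     # state population estimate, 2002

def rate_per_1000(providers, population):
    """Providers per 1000 population."""
    return 1000.0 * providers / population

statewide = rate_per_1000(physicians_2002, population_2002)
print("statewide rate per 1000:", round(statewide, 4))

# Model means copied from the summary statistics table.
model_means = {"1SED": 0.6304, "2SEE": 1.0100, "2SEG": 1.0100, "2SEP": 1.0107}

# Absolute distance of each model mean from the statewide rate.
gaps = {name: abs(m - statewide) for name, m in model_means.items()}
closest = min(gaps, key=gaps.get)
print("closest model:", closest, "gap:", round(gaps[closest], 4))
```

The one-step DGR model mean is within about 0.01 of the statewide rate, while each two-step mean is roughly 0.39 above it.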
Boxplots, Histograms – and related plots are useful for
visualizing the differences between the one-step (DGR power distance decay) and
the two-step models:
Note: These plots suggest that the two-step model results may differ
significantly from the one-step model results. The median values,
interquartile ranges, and outliers (maximum values) are very different. It is
also apparent from the histograms that none of the resulting distributions
appears to be normally distributed.
ANOVA (one-way) – test and related results are shown below. The Null
Hypothesis (H0) is that the means from the various models are the same. The
Alternative Hypothesis (Ha) is that at least one of the model means is not
equal to the others.
              Df Sum Sq Mean Sq F value Pr(>F)
Group          3     54  17.998   53.75 <2e-16 ***
Residuals   1992    667   0.335
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Phys_per_P ~ Group, data = data_test1_df)
$`Group`
                   diff         lwr         upr     p adj
2SEG-2SEE  0.0000242485 -0.09416672  0.09421522 1.0000000
2SEP-2SEE  0.0006913828 -0.09349959  0.09488236 0.9999976
1SED-2SEE -0.3795959920 -0.47378697 -0.28540502 0.0000000
2SEP-2SEG  0.0006671343 -0.09352384  0.09485811 0.9999978
1SED-2SEG -0.3796202405 -0.47381121 -0.28542927 0.0000000
1SED-2SEP -0.3802873747 -0.47447835 -0.28609640 0.0000000
Levene's Test for Homogeneity of Variance (center = median)
        Df F value    Pr(>F)
group    3  59.671 < 2.2e-16 ***
      1992
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
One-way analysis of means (not assuming equal variances)
data: Phys_per_P and Group
F = 103.65, num df = 3.0, denom df = 1032.2, p-value < 2.2e-16
Pairwise comparisons using t tests with non-pooled SD
data: data_test1_df$Phys_per_P and data_test1_df$Group
     2SEE   2SEG   2SEP
2SEG 1      -      -
2SEP 1      1      -
1SED <2e-16 <2e-16 <2e-16
P value adjustment method: BH
Shapiro-Wilk normality test
data: aov_residuals
W = 0.85362, p-value < 2.2e-16
Kruskal-Wallis rank sum test
data: Phys_per_P by Group
Kruskal-Wallis chi-squared = 170.93, df = 3, p-value < 2.2e-16
Note: Because the p-value (<2e-16 ***) is so small, the ANOVA test indicates
that the Null Hypothesis (H0) can be rejected in favor of the Alternative
Hypothesis (Ha). There appears to be a significant difference between at least
one of the model means and the others. However, there are three important
assumptions or requirements that should be considered when applying ANOVA:
1) the data are independent and obtained randomly from the population; 2) the
data are normally distributed; and 3) the data have common variances. None of
these assumptions is met here, so these results should be interpreted with
caution. The results are not independent or obtained from a random experiment:
there is evidence of more than moderate spatial autocorrelation (see Moran’s I
test). The previous histograms show that the data are not normally
distributed. The summary statistics show a lack of common variances. If
necessary, routine measures such as data transformations can be subsequently
employed. Regardless, it is important to present these results using standard
ANOVA and related diagnostic techniques, which are routinely used even in
geographically and spatially oriented research studies. Research is also
underway to eventually conduct a spatial ANOVA test to see if there is any
noticeable change in the results (see below).
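For readers who want to see what the one-way ANOVA F value measures, the Python sketch below computes it from first principles as the ratio of the between-group mean square to the within-group mean square. It is not the R code used for this analysis. The toy groups are made up for illustration, and the last lines only check that the rounded mean squares and group sizes reported in the ANOVA table are consistent with the reported F value and degrees of freedom.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical toy data: three small groups.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova(toy_groups)
print(f_stat, df_between, df_within)

# Consistency check against the reported table (rounded values).
reported_f = 17.998 / 0.335          # Mean Sq (Group) / Mean Sq (Residuals)
reported_df_within = 4 * 499 - 4     # four models, 499 areas each
print(round(reported_f, 2), reported_df_within)
```

The ratio of the rounded mean squares is about 53.7, which agrees with the reported F value of 53.75 to within rounding.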
The additional routine diagnostic tests confirm the initial observations and
standard ANOVA results. The Tukey multiple comparison of means indicates that
the one-step model always has a very low adjusted p-value (reported as
0.0000000) when compared with any of the two-step methods, while the two-step
methods do not differ from one another (adjusted p-values close to 1).
Levene’s test for homogeneity of variance also has a low p-value
(< 2.2e-16 ***), which suggests that the variances are not common across
models. The pairwise t-test with no assumption of equal variance also
indicates that the one-step model is significantly different from the two-step
models (p-values < 2e-16). The Shapiro-Wilk normality test p-value (< 2.2e-16)
also indicates a lack of normality. The Kruskal-Wallis rank sum test
(non-parametric), which can be used when ANOVA assumptions are not met, does
not change the outcome: its p-value (< 2.2e-16) confirms that the Null
Hypothesis (H0) can be rejected in favor of the Alternative Hypothesis (Ha).
A further reason for caution in interpreting the ANOVA results is apparent
from the Normal QQ plot of standardized residuals, which should be
approximately normally distributed. The residuals deviate considerably from a
straight line, confirming a lack of the desired normality.
Moran’s I – results of the global test for spatial autocorrelation, using a
queen’s case neighbors list and row standardization, are shown below for each
method:
Neighbour list object:
Queen’s case
Number of regions: 499
Number of nonzero links: 2960
Percentage nonzero weights: 1.18875
Average number of links: 5.931864
Weights style: W
Weights constants summary:
    n     nn  S0       S1      S2
W 499 249001 499 185.3664 2095.07
moran.range(Results.lw)
[1] -0.7214727  1.0623680
Moran I test under randomisation
data: Results_Pop_Phys_spdf$Phys_2SEE
weights: Results.lw
Moran I statistic standard deviate = 17.861, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
     0.4786685352     -0.0020080321      0.0007242812
Moran I test under randomisation
data: Results_Pop_Phys_spdf$Phys_2SEG
weights: Results.lw
Moran I statistic standard deviate = 17.715, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
     0.4746291857     -0.0020080321      0.0007239152
Moran I test under randomisation
data: Results_Pop_Phys_spdf$Phys_2SEP
weights: Results.lw
Moran I statistic standard deviate = 17.825, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
     0.4776201197     -0.0020080321      0.0007239826
Moran I test under randomisation
data: Results_Pop_Phys_spdf$Phys_1SED
weights: Results.lw
Moran I statistic standard deviate = 21.084, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
     0.5673121641     -0.0020080321      0.0007291042
Note: There is significant spatial autocorrelation for the one-step and all
the two-step models (similar Moran’s I statistics, very low p-values, and
large standard deviates). These results indicate strong clustering, and it is
extremely unlikely (less than 1%) that these clustered patterns could be the
result of random chance. The one-step model is somewhat more clustered (a
larger Moran’s I statistic, 0.567 compared with about 0.48) than the two-step
models. This lack of independence is a violation of a major standard ANOVA
assumption. An alternative test method that can take non-independence or
spatial autocorrelation into consideration, although it is not widely used or
well documented, is spatial ANOVA.
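To make the Moran’s I output easier to read, the Python sketch below computes the global statistic with row-standardized weights from a plain neighbor list. It is an illustration only, not the spdep code used for this report. The four-unit chain is a hypothetical toy example, and the function and variable names are assumptions. The final lines reuse the reported 2SEE estimates to show how the standard deviate follows from the statistic, its expectation, and its variance.

```python
import math

def morans_i(values, neighbors):
    """Global Moran's I with row-standardized weights.

    neighbors[i] is the list of indices adjacent to unit i.
    """
    n = len(values)
    mean_value = sum(values) / n
    z = [v - mean_value for v in values]
    cross = 0.0   # weighted sum of cross-products of deviations
    s0 = 0.0      # sum of all weights
    for i, nbrs in enumerate(neighbors):
        if not nbrs:
            continue  # units without neighbors contribute nothing
        w = 1.0 / len(nbrs)  # row standardization: each row sums to 1
        for j in nbrs:
            cross += w * z[i] * z[j]
            s0 += w
    return (n / s0) * cross / sum(zi * zi for zi in z)

# Hypothetical toy example: four units in a row with a steady gradient.
toy_values = [1.0, 2.0, 3.0, 4.0]
toy_neighbors = [[1], [0, 2], [1, 3], [2]]
toy_i = morans_i(toy_values, toy_neighbors)
print(round(toy_i, 4))

# Expectation under no spatial autocorrelation for 499 regions.
expectation = -1.0 / (499 - 1)

# Standard deviate for the reported 2SEE estimates.
deviate = (0.4786685352 - expectation) / math.sqrt(0.0007242812)
print(round(expectation, 10), round(deviate, 2))
```

The expectation of -1/(n - 1) reproduces the -0.0020080321 shown in every test above, and the deviate reproduces the reported value of about 17.86.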
Spatial ANOVA (one-way) – currently being prepared!
ANOVA is also a linear model, one that uses categorical variables instead of
the continuous independent or predictor variables used in regression. For
non-spatial models in R, similar results can be obtained from ANOVA (anova or
aov) and a linear model (lm) fitted to the same dataset with categorical
independent variables. An ANOVA is constructed from a linear model; it is just
a linear model whose results are reported differently from those of a linear
regression model. However, I have not yet been able to get spatial regression
models in R (using the spdep library) to work with categorical independent
data. More research is currently being conducted and hopefully results will be
available soon. The results from a non-spatial linear model (lm) are shown
below:
Call:
lm(formula = Phys_per_P ~ Group, data = data_test4_df)
Residuals:
    Min      1Q  Median      3Q     Max
-1.0107 -0.4774  0.1755  0.4501  4.5094
Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)     0.63043    0.02590   24.34   <2e-16 ***
GroupPhys_2SEE  0.37960    0.03663   10.36   <2e-16 ***
GroupPhys_2SEG  0.37962    0.03663   10.36   <2e-16 ***
GroupPhys_2SEP  0.38029    0.03663   10.38   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.5786 on 1992 degrees of freedom
Multiple R-squared: 0.07489, Adjusted R-squared: 0.0735
F-statistic: 53.75 on 3 and 1992 DF, p-value: < 2.2e-16
Note: The intercept is the mean (0.63043) for the one-step model (1SED) using
the DGR power-based distance decay method. The other coefficients are the
differences from this mean (for example, the mean of the 2SEE method is
0.63043 + 0.37960 = 1.01003). The F-statistic (53.75) is the same as that
derived from the ANOVA.
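The relationship between the lm coefficients and the group means can be checked with a few lines of arithmetic. The Python sketch below uses only values printed in the lm output; the names are illustrative. It also assumes the balanced design shown elsewhere in this report (499 areas per model) to show where the two standard errors come from.

```python
import math

intercept = 0.63043  # mean of the reference group (1SED)
offsets = {"2SEE": 0.37960, "2SEG": 0.37962, "2SEP": 0.38029}

# With a categorical predictor, each group mean is intercept + offset.
group_means = {"1SED": intercept}
for name, offset in offsets.items():
    group_means[name] = intercept + offset
print({name: round(m, 5) for name, m in group_means.items()})

# Standard errors implied by the residual standard error and n = 499.
residual_se = 0.5786
n_per_group = 499
se_intercept = residual_se / math.sqrt(n_per_group)
se_offset = residual_se * math.sqrt(2.0 / n_per_group)
print(round(se_intercept, 5), round(se_offset, 5))

# t value for the 2SEE offset: estimate divided by its standard error.
t_2see = offsets["2SEE"] / se_offset
print(round(t_2see, 2))
```

The recovered means match the summary statistics table, and the implied standard errors and t value agree with the Coefficients table to within rounding.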
Two-Step Exponential Model Compared with One-Step Exponential Model
Summary Statistics
- table shows the resulting means, standard deviations, minimum and maximum
values (physicians per 1000 population) for the two-step and one-step
accessibility models with an exponential distance decay method:
  Group count  mean    sd   min   max
  <ord> <int> <dbl> <dbl> <dbl> <dbl>
1 2SEE    499 1.01  0.642 0      5.47
2 1SEE    499 0.621 0.198 0.163  1.11
Note: The one-step model (1SEE) with the exponential decay has both a lower
mean (0.621) and less variability (standard deviation 0.198) than the two-step
model with the exponential distance decay method. The one-step model mean is
closer to the statewide mean (0.62235).
Boxplots, Histograms – and related plots are useful for
visualizing the differences between the one-step and the two-step models:
Note: These plots suggest that the two-step model results may differ
significantly from the one-step model results.
T-Test – results are shown below. The Null Hypothesis (H0) is that the means
from both the models are the same (differences equal 0). The Alternative
Hypothesis (Ha) is that the means of these models are not the same
(differences not equal 0).
Welch Two Sample t-test
data: G2SFCAEE_df$Phys_per_P and G1SHGMEE_df$Phys_per_P
t = 12.93, df = 591.97, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.3298831 0.4480436
sample estimates:
mean of x mean of y
1.0100251 0.6210617
Two Sample t-test
data: G2SFCAEE_df$Phys_per_P and G1SHGMEE_df$Phys_per_P
t = 12.93, df = 996, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.3299321 0.4479945
sample estimates:
mean of x mean of y
1.0100251 0.6210617
Note: The p-values from the Welch t-test (which allows for unequal variances)
and the standard t-test are the same (< 2.2e-16) and are so small that the
Null Hypothesis (H0) can clearly be rejected in favor of the Alternative
Hypothesis (Ha).
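The Welch statistic can be reproduced approximately from the summary table alone. The Python sketch below applies the standard Welch formulas to the reported means, the rounded standard deviations, and the group sizes; the function name is illustrative and the inputs are the rounded published values, so the result only approximates the R output.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    a = sd1 ** 2 / n1   # squared standard error of the first mean
    b = sd2 ** 2 / n2   # squared standard error of the second mean
    t = (mean1 - mean2) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# 2SEE versus 1SEE, using values from the summary table and t-test output.
t_stat, df = welch_t(1.0100251, 0.642, 499, 0.6210617, 0.198, 499)
print(round(t_stat, 2), round(df, 1))
```

With these inputs the statistic and degrees of freedom land close to the reported t = 12.93 and df = 591.97. Because both groups have 499 areas, the pooled t statistic has the same value and only the degrees of freedom differ (996).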
Moran’s I – results of the global test for spatial autocorrelation, using a
queen’s case neighbors list and row standardization, are shown below for each
model:
Characteristics of weights list object:
Neighbour list object:
Queen’s case
Number of regions: 499
Number of nonzero links: 2960
Percentage nonzero weights: 1.18875
Average number of links: 5.931864
Weights style: W
Weights constants summary:
    n     nn  S0       S1      S2
W 499 249001 499 185.3664 2095.07
moran.range(Results.lw)
[1] -0.7214727  1.0623680
Moran I test under randomisation
data: Results_Pop_Phys_spdf$Phys_2SEE
weights: Results.lw
Moran I statistic standard deviate = 17.861, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
     0.4786685352     -0.0020080321      0.0007242812
Moran I test under randomisation
data: Results_Pop_Phys_spdf$Phys_1SEE
weights: Results.lw
Moran I statistic standard deviate = 33.136, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
      0.897786920      -0.002008032       0.000737373
Note: There is significant spatial autocorrelation for both the one-step and
the two-step models (positive Moran’s I statistics, very low p-values, and
large standard deviates). These results indicate strong clustering, and it is
extremely unlikely (less than 1%) that these clustered patterns could be the
result of random chance. The one-step model is considerably more clustered (a
Moran’s I statistic of 0.898 compared with 0.479) than the two-step model.
This lack of independence is a violation of an important t-test assumption. A
more appropriate spatial statistical test method is currently being
researched, and eventually those results (which may not be very different)
will be presented.
Two-Step Power Model Compared with One-Step Power Model
Summary Statistics
- table shows the resulting means, standard deviations, minimum and maximum
values (physicians per 1000 population) for the two-step and one-step
accessibility models with a power distance decay method:
  Group count  mean    sd    min   max
  <ord> <int> <dbl> <dbl>  <dbl> <dbl>
1 2SEP    499 1.01  0.646 0       5.52
2 1SEP    499 0.633 0.367 0.0171  4.35
Note: The one-step model (1SEP) has both a lower mean (0.633) and less
variability (standard deviation 0.367) than the two-step model with power
distance decay. The one-step model mean is closer to the statewide mean
(0.62235).
Boxplots, Histograms – and related plots are useful for
visualizing the differences between the one-step and the two-step models:
Note: These plots suggest that the two-step model results may differ
significantly from the one-step model results.
T-Test – results are shown below. The Null
Hypothesis (H0) is that the means from both the models are
the same (differences equal 0). The Alternative Hypothesis (Ha)
is that the means of these models are not the same (differences not equal 0).
Welch Two Sample t-test
data: G2SFCAEP_df$Phys_per_P and G1SHGMEP_df$Phys_per_P
t = 11.357, df = 789.32, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.3125421 0.4431561
sample estimates:
mean of x mean of y
1.0107164 0.6328673
Two Sample t-test
data: G2SFCAEP_df$Phys_per_P and G1SHGMEP_df$Phys_per_P
t = 11.357, df = 996, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.3125629 0.4431353
sample estimates:
mean of x mean of y
1.0107164 0.6328673
Note: The p-values from the Welch t-test (which allows for unequal variances)
and the standard t-test are the same (< 2.2e-16) and are so small that the
Null Hypothesis (H0) can clearly be rejected in favor of the Alternative
Hypothesis (Ha).
Moran’s I – results of the global test for spatial autocorrelation, using a
queen’s case neighbors list and row standardization, are shown below for each
model:
Characteristics of weights list object:
Neighbour list object:
Queen’s case
Number of regions: 499
Number of nonzero links: 2960
Percentage nonzero weights: 1.18875
Average number of links: 5.931864
Weights style: W
Weights constants summary:
    n     nn  S0       S1      S2
W 499 249001 499 185.3664 2095.07
moran.range(Results.lw)
[1] -0.7214727  1.0623680
Moran I test under randomisation
data: Results_Pop_Phys_spdf$Phys_2SEP
weights: Results.lw
Moran I statistic standard deviate = 17.825, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
     0.4776201197     -0.0020080321      0.0007239826
Moran I test under randomisation
data: Results_Pop_Phys_spdf$Phys_1SEP
weights: Results.lw
Moran I statistic standard deviate = 13.788, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
     0.3636143537     -0.0020080321      0.0007032187
Note: There is significant spatial autocorrelation for both the one-step and
the two-step models (positive Moran’s I statistics, very low p-values, and
large standard deviates). These results indicate strong clustering, and it is
extremely unlikely (less than 1%) that these clustered patterns could be the
result of random chance. It is interesting that the two-step model is somewhat
more clustered (a Moran’s I statistic of 0.478 compared with 0.364) than the
one-step model, which is not the case for the other models (exponential and
Gaussian). This lack of independence is a violation of an important t-test
assumption. A more appropriate spatial statistical test method is currently
being researched, and eventually those results (which may not be very
different) will be presented.
Two-Step Gaussian Model Compared with One-Step Gaussian Model
Summary Statistics
- table shows the resulting means, standard deviations, minimum and maximum
values (physicians per 1000 population) for the two-step and one-step
accessibility models with a Gaussian distance decay method:
  Group count  mean    sd    min   max
  <ord> <int> <dbl> <dbl>  <dbl> <dbl>
1 2SEG    499 1.01  0.646 0       5.52
2 1SEG    499 0.576 0.203 0.0131  1.08
Note: The one-step model (1SEG) with the Gaussian distance decay has both a
lower mean (0.576) and less variability (standard deviation 0.203) than the
two-step model with Gaussian distance decay. The one-step model mean is closer
to the statewide mean (0.62235).
Boxplots, Histograms – and related plots are useful for
visualizing the differences between the one-step and the two-step models:
Note: These plots suggest that the two-step model results may differ
significantly from the one-step model results.
T-Test – results are shown below. The Null Hypothesis (H0) is that the means
from both the models are the same (differences equal 0). The Alternative
Hypothesis (Ha) is that the means of these models are not the same
(differences not equal 0).
Welch Two Sample t-test
data: G2SFCAEG_df$Phys_per_P and G1SHGMEG_df$Phys_per_P
t = 14.342, df = 595.45, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.3749920 0.4939867
sample estimates:
mean of x mean of y
1.0100493 0.5755599
Two Sample t-test
data: G2SFCAEG_df$Phys_per_P and G1SHGMEG_df$Phys_per_P
t = 14.342, df = 996, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.3750407 0.4939380
sample estimates:
mean of x mean of y
1.0100493 0.5755599
Note: The p-values from the Welch t-test (which allows for unequal variances)
and the standard t-test are the same (< 2.2e-16) and are so small that the
Null Hypothesis (H0) can clearly be rejected in favor of the Alternative
Hypothesis (Ha).
Moran’s I – results of the global test for spatial autocorrelation, using a
queen’s case neighbors list and row standardization, are shown below for each
model:
Characteristics of weights list object:
Neighbour list object:
Queen’s case
Number of regions: 499
Number of nonzero links: 2960
Percentage nonzero weights: 1.18875
Average number of links: 5.931864
Weights style: W
Weights constants summary:
    n     nn  S0       S1      S2
W 499 249001 499 185.3664 2095.07
moran.range(Results.lw)
[1] -0.7214727  1.0623680
Moran I test under randomisation
data: Results_Pop_Phys_spdf$Phys_2SEG
weights: Results.lw
Moran I statistic standard deviate = 17.715, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
     0.4746291857     -0.0020080321      0.0007239152
Moran I test under randomisation
data: Results_Pop_Phys_spdf$Phys_1SEG
weights: Results.lw
Moran I statistic standard deviate = 26.433, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance
     0.7149166391     -0.0020080321      0.0007356133
Note: There is significant spatial autocorrelation for both the one-step and
the two-step models (positive Moran’s I statistics, very low p-values, and
large standard deviates). These results indicate strong clustering, and it is
extremely unlikely (less than 1%) that these clustered patterns could be the
result of random chance. The one-step model is more clustered (a Moran’s I
statistic of 0.715 compared with 0.475) than the two-step model. This lack of
independence is a violation of an important t-test assumption. A more
appropriate spatial statistical test method is currently being researched, and
eventually those results (which may not be very different) will be presented.
Summary of Results
Preliminary Version for Discussion Only
My initial goal for this research
project was to compare the one-step and two-step models given an assumption or
hypothesis that there would not be any significant differences as both are
essentially gravity models. However, I have been surprised by the results. There
is clear evidence of a significant difference between the one-step and two-step
models that have similar constructs (hybrid-zonal) and employ the same distance
decay methods (power, exponential, or Gaussian).
It is important to note that there is
no indication that either model is a better measure of reality than the other. This
can perhaps only be determined when actual observational provider data records
are reviewed, or statistical patient-based surveys are conducted. As these models
are only approximations of reality, a given model can only be evaluated as
somewhat more appropriate or suitable than the other given the availability and
quality of input data, plus how well it matches the desired usage or purpose.
This project has been an exceptional
learning exercise that has enabled me to gain a better understanding of
statistical, spatial, and GIS software and techniques. I have focused on
presenting the results as objectively as possible. However, I do have some
personal observations that are worth presenting for discussion purposes.
Hopefully other researchers will review these results and make future
suggestions that will aid in my interpretations.
The one-step models are easier to operationalize but conceptually provide
lower resolution than the two-step models. They can require fewer
computational resources and GIS facilities, although it is still necessary to
calculate road-based distances and use a GIS to display results on a map.
Standard statistical packages and programming environments can be used to
perform most of the calculations. The results seem to be more similar to the
traditional county-based service capacity standards used by the federal
government for measuring physician shortage areas. The mean values are similar
to the statewide mean and have smaller variances. Although the one-step models
appear to spread out the potential accessibility measures more evenly, they
can also over-estimate in actual shortage areas. Regardless, the one-step
model may be a good first step beyond the traditional county-based methods.
One-step models are easier to explain to decision makers such as state
legislators, as they are a simple extension of the standard service capacity
ratio measure. A one-step model may be more suitable at the state level of
geography than a similarly constructed two-step model. As such, one-step
models may be a more appropriate choice for healthcare planners seeking
necessary increased funding.
The two-step models require more effort to operationalize but conceptually
provide higher resolution than the one-step models. More computational
resources and GIS facilities are required. A more involved programming effort
within the GIS and related scripting environments is necessary, as the
calculations are more complicated. The resulting mean values are noticeably
larger, and the variances greater, than the traditional county-based service
capacity standards used by the federal government for measuring physician
shortage areas. The two-step models do not seem to spread out potential
accessibility measures as evenly as the one-step models and may over-estimate
in some areas and severely under-estimate in others. Regardless, they
represent a definite technological advancement that should be applied when
appropriate. However, they are more difficult to explain to a non-professional
audience. A two-step model may be more suitable at the regional level of
geography than a similarly constructed one-step model. Several varieties of
two-step models continue to be developed by academic researchers, and their
practical applications have been demonstrated.
There is more research work to be
undertaken in evaluating the similarities and differences (comparing and
contrasting) between the one-step and two-step models. This study used ZIP Code (post office
locations or centroids) to locate physicians because this older data was all
that was available. More recent physician data, if available, could be used to
more accurately geocode addresses. From a more technical perspective, the
effects of both the modifiable areal unit problem (MAUP) and mathematical
aspects of the calculations need more evaluation. It is also apparent that healthcare
planners could benefit if better scripting applications were developed. These
developments could make it easier to review results and allow for better selection
of an appropriate model to employ.
I plan on completing a more in-depth
review of the one-step and two-step models. These more detailed results will be
presented using Esri’s ArcGIS Online Story Map
facility. Further development of a Python based scripting tool will also be
completed. It will calculate selected models, compare results using both exploratory
and statistical data analysis methods, plus hopefully include a spatial version
of ANOVA.
Larry Spear
Sr. Research Scientist (Ret.)
Division of Government Research
University of New Mexico