This section explores whether calculated acres, estimated cost,
discovery acres, and total incident personnel are correlated with one
another.
The first step in this regression analysis was to create a scatterplot
matrix to check visually for any correlation between these parameters.
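A scatterplot matrix of this kind can be sketched with base R's pairs(). The example below is only an illustration: it runs on a small synthetic data frame (toyFires, with made-up values), and the column names Wildfire.DiscoveryAcres and Wildfire.TotalPersonnel are assumed names for the discovery-acres and personnel fields, not names confirmed by the data set.

```r
# Synthetic stand-in for westFires; all values are invented for illustration.
set.seed(1)
n <- 200
acres <- rlnorm(n, meanlog = 6, sdlog = 2)
toyFires <- data.frame(
  Wildfire.CalculatedAcres     = acres,
  Wildfire.EstimatedCostToDate = acres * runif(n, 100, 2000),
  Wildfire.DiscoveryAcres      = acres * runif(n, 0, 0.2),            # assumed column name
  Wildfire.TotalPersonnel      = round(sqrt(acres) * runif(n, 1, 10)) # assumed column name
)

# Drop rows with any missing value, then draw every pairwise scatterplot.
completeFires <- na.omit(toyFires)
pairs(completeFires, pch = 16, cex = 0.3)

# Matching table of pairwise Pearson correlations.
corMatrix <- cor(completeFires)
print(round(corMatrix, 2))
```

With the real data, the same two calls would be applied to the four numeric columns of westFires in place of toyFires.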
4.1 - Wildfire Cost Versus Calculated Acres
ggplot(westFires, aes(x = Wildfire.CalculatedAcres, y = Wildfire.EstimatedCostToDate)) + geom_point(size = 0.3)

Of the quantitative parameters, wildfire suppression cost and calculated
acres come closest to showing a correlation, but any correlation is
likely weak.
summary(lm(westFires$Wildfire.CalculatedAcres ~ westFires$Wildfire.EstimatedCostToDate))
##
## Call:
## lm(formula = westFires$Wildfire.CalculatedAcres ~ westFires$Wildfire.EstimatedCostToDate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -472788 -6903 -6002 -2918 484995
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.216e+03 7.876e+02 7.892 4.89e-15 ***
## westFires$Wildfire.EstimatedCostToDate 1.156e-03 3.168e-05 36.491 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 34090 on 1981 degrees of freedom
## (154528 observations deleted due to missingness)
## Multiple R-squared: 0.402, Adjusted R-squared: 0.4017
## F-statistic: 1332 on 1 and 1981 DF, p-value: < 2.2e-16
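To relate an R-squared value like the one above to a correlation coefficient, recall that in a regression with a single predictor, R-squared equals the squared Pearson correlation and is the same whichever variable is placed on the left of the formula. The sketch below checks this on simulated data (x and y are stand-ins, not the wildfire columns) and then takes the square root of the reported multiple R-squared.

```r
# Simulated stand-ins: x plays the role of cost, y the role of acres.
set.seed(42)
x <- rlnorm(500, meanlog = 10, sdlog = 1.5)
y <- 0.001 * x + rnorm(500, sd = 2000)

r         <- cor(x, y)
r2_y_on_x <- summary(lm(y ~ x))$r.squared
r2_x_on_y <- summary(lm(x ~ y))$r.squared

# Both comparisons hold up to floating-point error.
print(all.equal(r^2, r2_y_on_x))
print(all.equal(r2_y_on_x, r2_x_on_y))

# Applied to the multiple R-squared reported above (0.402): about 0.63.
print(sqrt(0.402))
```

Because the fitted slope above is positive, the implied correlation between acres and cost is positive as well.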
With an adjusted R-squared of 0.4017, fire cost explains about 40% of
the variance in fire size across the 1,983 fires that report both
values, which corresponds to a correlation coefficient of roughly 0.63.
The relationship is statistically significant, but most of the variance
remains unexplained. (The model above regresses acres on cost; with a
single predictor, R-squared is the same in either direction.) Although
fire size does affect fire cost, other factors such as fuel type and
location also play large roles. For example, wildfires in lighter fuels
such as grass and brush spread much more quickly, so they often burn
more acres than fires in heavier, less readily ignitable fuels such as
timber. However, fire suppression costs in timber are usually higher per
acre, as suppression there is much more time and resource intensive than
in lighter fuels. If we separate the data by primary fuel model, will
the correlation between fire cost and fire size be more apparent?
library(dplyr)
# Subset fires by primary fuel model; grepl() matches the pattern anywhere in the label
Grass <- westFires %>% filter(grepl('Short Grass|Tall Grass', Wildfire.PrimaryFuelModel))
Brush <- westFires %>% filter(grepl('Brush', Wildfire.PrimaryFuelModel))
Timber <- westFires %>% filter(grepl('Timber', Wildfire.PrimaryFuelModel))
Chaparral <- westFires %>% filter(grepl('Chaparral', Wildfire.PrimaryFuelModel))
Note: Chaparral is a highly flammable scrubland plant community
composed of broad-leaved evergreen shrubs, bushes, and small trees
usually less than 2.5 metres (about 8 feet) tall. It is commonly found
in much of California.
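The per-fuel fits in the subsections that follow repeat the same steps for each subset. As a compact alternative, the same comparison could be sketched as a loop over the fuel patterns. This is an illustration under stated assumptions, not the method used in this report: toyFires is a synthetic data frame with invented values, and fitOne is a hypothetical helper name.

```r
# Synthetic stand-in for westFires; all values are invented for illustration.
set.seed(7)
fuels <- c("Short Grass", "Tall Grass", "Brush", "Timber", "Chaparral")
toyFires <- data.frame(
  Wildfire.PrimaryFuelModel    = sample(fuels, 300, replace = TRUE),
  Wildfire.EstimatedCostToDate = rlnorm(300, meanlog = 12, sdlog = 1)
)
toyFires$Wildfire.CalculatedAcres <-
  0.001 * toyFires$Wildfire.EstimatedCostToDate + rnorm(300, sd = 200)

# Same patterns as the filter() calls above.
patterns <- c(Grass = "Short Grass|Tall Grass", Brush = "Brush",
              Timber = "Timber", Chaparral = "Chaparral")

# Hypothetical helper: fit acres ~ cost on one fuel subset and keep the fit statistics.
fitOne <- function(pattern) {
  subsetFires <- toyFires[grepl(pattern, toyFires$Wildfire.PrimaryFuelModel), ]
  fit <- summary(lm(Wildfire.CalculatedAcres ~ Wildfire.EstimatedCostToDate,
                    data = subsetFires))
  data.frame(n             = nrow(subsetFires),  # toy data has no missing values
             r.squared     = fit$r.squared,
             adj.r.squared = fit$adj.r.squared)
}

# One row of fit statistics per fuel group.
results <- do.call(rbind, lapply(patterns, fitOne))
print(results)
```

Collecting the statistics in one table makes the fuel groups easier to compare side by side than reading four separate summary() printouts.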
4.1.1 - Grass Fire Acreage vs. Cost
plot(Grass$Wildfire.CalculatedAcres, Grass$Wildfire.EstimatedCostToDate)

summary(lm(Grass$Wildfire.CalculatedAcres ~ Grass$Wildfire.EstimatedCostToDate))
##
## Call:
## lm(formula = Grass$Wildfire.CalculatedAcres ~ Grass$Wildfire.EstimatedCostToDate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -196772 -9262 -8031 -2070 286882
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.550e+03 1.456e+03 6.561 1.57e-10 ***
## Grass$Wildfire.EstimatedCostToDate 1.008e-03 1.271e-04 7.936 1.91e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 29130 on 421 degrees of freedom
## (1519 observations deleted due to missingness)
## Multiple R-squared: 0.1301, Adjusted R-squared: 0.1281
## F-statistic: 62.98 on 1 and 421 DF, p-value: 1.908e-14
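With an adjusted R-squared of just 0.1281, there is only a very weak
linear relationship between fire size and fire cost for grass fires: the
slope is statistically significant, but cost explains only about 13% of
the variance in acreage.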
4.1.2 - Brush Fire Acreage vs. Cost
plot(Brush$Wildfire.CalculatedAcres, Brush$Wildfire.EstimatedCostToDate)

summary(lm(Brush$Wildfire.CalculatedAcres ~ Brush$Wildfire.EstimatedCostToDate))
##
## Call:
## lm(formula = Brush$Wildfire.CalculatedAcres ~ Brush$Wildfire.EstimatedCostToDate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -47401 -7800 -6356 -2660 351602
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.569e+03 2.995e+03 2.193 0.029471 *
## Brush$Wildfire.EstimatedCostToDate 2.230e-03 5.772e-04 3.864 0.000152 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 37500 on 196 degrees of freedom
## (1156 observations deleted due to missingness)
## Multiple R-squared: 0.07077, Adjusted R-squared: 0.06603
## F-statistic: 14.93 on 1 and 196 DF, p-value: 0.0001518
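With an adjusted R-squared of just 0.06603, the linear relationship
between fire size and fire cost for brush fires is weaker still: cost
explains only about 7% of the variance in acreage, although the slope
remains statistically significant (p = 0.00015).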
4.1.3 - Timber Fire Acreage vs. Cost
plot(Timber$Wildfire.CalculatedAcres, Timber$Wildfire.EstimatedCostToDate)

summary(lm(Timber$Wildfire.CalculatedAcres ~ Timber$Wildfire.EstimatedCostToDate))
##
## Call:
## lm(formula = Timber$Wildfire.CalculatedAcres ~ Timber$Wildfire.EstimatedCostToDate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -205610 -6337 -4045 -1967 485636
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.978e+03 1.087e+03 3.659 0.000266 ***
## Timber$Wildfire.EstimatedCostToDate 1.525e-03 4.041e-05 37.730 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 33550 on 1018 degrees of freedom
## (1819 observations deleted due to missingness)
## Multiple R-squared: 0.5831, Adjusted R-squared: 0.5826
## F-statistic: 1424 on 1 and 1018 DF, p-value: < 2.2e-16
With an adjusted R-squared of 0.5826, there is a moderate correlation
between fire cost and fire size in timber fires. This value is notably
higher than the adjusted R-squared for all fuel types combined
(0.4017).
4.1.4 - Chaparral Fire Acreage vs. Cost
plot(Chaparral$Wildfire.CalculatedAcres, Chaparral$Wildfire.EstimatedCostToDate)

summary(lm(Chaparral$Wildfire.CalculatedAcres ~ Chaparral$Wildfire.EstimatedCostToDate))
##
## Call:
## lm(formula = Chaparral$Wildfire.CalculatedAcres ~ Chaparral$Wildfire.EstimatedCostToDate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -70251 -5017 -3231 -1236 260597
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.130e+03 4.370e+03 0.716 0.476
## Chaparral$Wildfire.EstimatedCostToDate 1.066e-03 1.053e-04 10.123 2.4e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 34530 on 70 degrees of freedom
## (445 observations deleted due to missingness)
## Multiple R-squared: 0.5941, Adjusted R-squared: 0.5883
## F-statistic: 102.5 on 1 and 70 DF, p-value: 2.399e-15
With an adjusted R-squared of 0.5883, there is a moderate correlation
between fire cost and fire size in chaparral fires, although this fit
rests on only 72 fires. This value is comparable to that of timber
fires and notably higher than the adjusted R-squared for all fuel types
combined (0.4017).
4.1.5 - Summary
From this analysis, we can conclude that fire size has a moderate
linear correlation with fire cost for heavier fuel types such as
chaparral and timber, but only a very weak one for lighter fuel types
such as grass and brush.
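For side-by-side comparison, the fit statistics reported in the regression outputs of this section can be gathered into one table. The sketch below copies the R-squared values from those outputs, takes each sample size as the residual degrees of freedom plus two, and adds the implied correlation coefficient as the square root of the multiple R-squared (all fitted slopes are positive). The names fitSummary and implied.r are chosen here for illustration.

```r
# Values copied from the summary(lm(...)) outputs in sections 4.1 to 4.1.4.
fitSummary <- data.frame(
  fuel          = c("All fuels", "Grass", "Brush", "Timber", "Chaparral"),
  n             = c(1983, 423, 198, 1020, 72),   # residual df + 2
  r.squared     = c(0.402, 0.1301, 0.07077, 0.5831, 0.5941),
  adj.r.squared = c(0.4017, 0.1281, 0.06603, 0.5826, 0.5883)
)

# Implied Pearson correlation for a one-predictor fit with positive slope.
fitSummary$implied.r <- round(sqrt(fitSummary$r.squared), 2)

# Strongest fit first.
fitSummary <- fitSummary[order(-fitSummary$r.squared), ]
print(fitSummary, row.names = FALSE)
```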