Urbanization and Climate in Arizona, USA

INFO 523 - Summer 2025 - Final Project

Vera Jackson, Molly Kerwick, Brooke Pacheco

Topic

  • Traffic and car emissions contribute to local air pollution and climate change (Niemeier et al., 2006)
  • Precipitation and temperature changes and an increase in storms are known indicators of climate change (Trenberth, 2007; EPA, 2024; Lindsey & Dahlman, 2025)

Research Questions

  • How does traffic congestion impact climate indicators in Arizona?
  • Can we predict the impact of traffic congestion on climate?

Data Sources and Metrics

  • Urbanization Metric: Traffic count data (USDOT Federal Highway Administration)
  • Climate Metrics: Temperature, precipitation, and storm data (NOAA)
  • Data cleaned and merged for analysis and predictive modeling
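
The cleaning and merging step could look roughly like the sketch below. It assumes the traffic and climate tables share a county-level `fips_code` key (a column name that appears later in the slides). The table layouts and every value are hypothetical placeholders, not the project's data.

```python
import pandas as pd

# Hypothetical per-county rows; all values are invented for illustration.
traffic = pd.DataFrame({
    "fips_code": ["04013", "04019", "04005"],
    "traffic_counts_2022": [125000, 48000, 21000],
})
climate = pd.DataFrame({
    "fips_code": ["04013", "04019", "04021"],
    "avg_temp_2022": [75.1, 70.3, 72.8],
    "rainfall_2022": [7.2, 10.9, 8.4],
})

# Inner join keeps only counties present in both sources;
# validate raises an error if a key is duplicated on either side.
merged = traffic.merge(climate, on="fips_code", how="inner", validate="one_to_one")

# Drop rows with missing values before modeling.
merged = merged.dropna().reset_index(drop=True)
print(merged)
```

An inner join silently drops counties missing from either source, so comparing row counts before and after the merge is a cheap integrity check.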

Exploratory Data Analysis

Traffic distribution in Arizona

Exploratory Data Analysis

Temperature and precipitation patterns throughout Arizona

Exploratory Data Analysis

Storm patterns in Arizona (2019-2023)

Feature Scaling

Relationships for 5-year-average

Relationships for 2022
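
The slides do not say which scaling method was used. As a minimal sketch, assuming z-score standardization (zero mean, unit variance per column), the step could be written as follows. The input rows are hypothetical.

```python
import numpy as np

def standardize(X):
    """Return column-wise z-scores plus the means and standard deviations used."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled instead of dividing by zero
    return (X - mean) / std, mean, std

# Hypothetical rows: [traffic_counts_2022, avg_temp_2022, rainfall_2022]
X = np.array([
    [125000.0, 75.1, 7.2],
    [48000.0, 70.3, 10.9],
    [21000.0, 62.4, 21.6],
])
X_scaled, mean, std = standardize(X)
print(X_scaled.round(3))
```

In a train/test workflow, the mean and standard deviation would normally be computed on the training rows only and then reused for the test rows.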

Data Reduction

Component 1: ['max_temp_2022', 'highmagstorm_events_2022', 'highmagstorm_5yavg', 'highmagstorm_events_above_5yavg']
Component 2: ['traffic_counts_2022', 'fips_code', 'highmagstorm_events_2022', 'highmagstorm_5yavg', 'highmagstorm_events_above_5yavg']
Component 3: ['traffic_counts_2022', 'traffic_counts_above_5yavg', 'max_temp_above_5yavg', 'avg_temp_above_5yavg', 'rainfall_2022']
Component 4: ['gridspace', 'traffic_counts_2022', 'traffic_counts_above_5yavg', 'rainfall_2022', 'average_storm_mag_2022']
Component 5: ['gridspace', 'avg_temp_2022', 'rainfall_below_5yavg', 'lowmagstorm_5yavg']
Component 6: ['fips_code', 'avg_temp_above_5yavg', 'average_storm_mag_2022']
Component 7: ['max_temp_2022', 'rainfall_2022', 'lowmagstorm_5yavg', 'average_storm_mag_2022']
Component 8: ['max_temp_above_5yavg', 'avg_temp_2022', 'avg_temp_above_5yavg']
Component 9: ['avg_temp_2022', 'rainfall_below_5yavg', 'lowmagstorm_5yavg']
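
One way to produce per-component feature lists like those above is to run PCA on standardized data and keep the features whose absolute loading exceeds a cutoff. The sketch below does this with a singular value decomposition. The cutoff of 0.3 and the synthetic data are assumptions; the slides do not state how the lists were derived.

```python
import numpy as np

def pca_top_features(X, feature_names, n_components, threshold=0.3):
    """List, for each principal component, the features with |loading| >= threshold."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0
    Z = centered / std
    # Rows of Vt are the principal axes; entries are the feature loadings.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)
    components = []
    for k in range(n_components):
        names = [name for name, w in zip(feature_names, Vt[k]) if abs(w) >= threshold]
        components.append(names)
    return components, explained[:n_components]

# Synthetic data: the first two columns are built to be strongly correlated.
rng = np.random.default_rng(0)
n = 200
temp = rng.normal(size=n)
storms = temp + 0.1 * rng.normal(size=n)
rain = rng.normal(size=n)
X = np.column_stack([temp, storms, rain])
names = ["max_temp_2022", "highmagstorm_events_2022", "rainfall_2022"]

components, explained = pca_top_features(X, names, n_components=2)
for i, (feats, ratio) in enumerate(zip(components, explained), start=1):
    print(f"Component {i} ({ratio:.1%} of variance): {feats}")
```

Reporting the explained-variance ratio next to each list helps show which components actually matter.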

Linear Regression Model Results

Environmental Data (5 Year Averages)
Traffic Data (5 Year Averages)
OLS Regression Model:
    Mean-squared error: 0.544387914704966
    Root mean-squared error: 0.7378264800784572
    R-squared value: -0.21408882314897082

Ridge best alpha selected: 1000 with CV MSE: 1.174985534201412
Ridge Regression Model:
    Mean-squared error: 0.5596105056182925
    Root mean-squared error: 0.7480711902073843
    R-squared value: -0.24803810267560222

Lasso best alpha selected: 1000 with CV MSE: 1.174985534201412
Lasso Regression Model:
    Mean-squared error: 0.5601840743214328
    Root mean-squared error: 0.7484544570790082
    R-squared value: -0.2493172702195181

Environmental Data (2022)
Traffic Data (2022)
OLS Regression Model:
    Mean-squared error: 0.5310351617487781
    Root mean-squared error: 0.7287215941282227
    R-squared value: -0.7615209941768197

Ridge best alpha selected: 100 with CV MSE: 1.1758120989820233
Ridge Regression Model:
    Mean-squared error: 0.41491285307157283
    Root mean-squared error: 0.6441372936506415
    R-squared value: -0.37632637927870594

Lasso best alpha selected: 100 with CV MSE: 1.1758120989820233
Lasso Regression Model:
    Mean-squared error: 0.4315963501853747
    Root mean-squared error: 0.6569599304260304
    R-squared value: -0.43166796970271903

Linear Regression Model Results (continued)

Environmental Data (2022)
Traffic Data (5 Year Averages)
OLS Regression Model:
    Mean-squared error: 0.5553423826213792
    Root mean-squared error: 0.7452129780280127
    R-squared value: -0.23851937478616003

Ridge best alpha selected: 10000 with CV MSE: 1.1760355262786755
Ridge Regression Model:
    Mean-squared error: 0.5590780965942956
    Root mean-squared error: 0.7477152510109016
    R-squared value: -0.2468507290621953

Lasso best alpha selected: 1 with CV MSE: 1.1759802832153639
Lasso Regression Model:
    Mean-squared error: 0.5601840743214328
    Root mean-squared error: 0.7484544570790082
    R-squared value: -0.2493172702195181

Environmental Data (5 Year Averages)
Traffic Data (2022)
OLS Regression Model:
    Mean-squared error: 0.44783955314134766
    Root mean-squared error: 0.669208153821625
    R-squared value: -0.48554904026195533

Ridge best alpha selected: 100 with CV MSE: 1.1569723795267826
Ridge Regression Model:
    Mean-squared error: 0.4109393466705784
    Root mean-squared error: 0.6410455106079275
    R-squared value: -0.3631456796753172

Lasso best alpha selected: 100 with CV MSE: 1.1569723795267826
Lasso Regression Model:
    Mean-squared error: 0.4315963501853747
    Root mean-squared error: 0.6569599304260304
    R-squared value: -0.43166796970271903
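
The workflow behind output like this (fit, choose alpha by cross-validation, report MSE, RMSE, and R-squared on held-out rows) can be sketched with a closed-form ridge solver. This is an illustration under stated assumptions, not the project's code: the alpha grid, the 5-fold split, the train/test split, and the pure-noise synthetic data are all hypothetical.

```python
import numpy as np

def fit_ridge(X, y, alpha):
    """Closed-form ridge fit on centered data; alpha=0 reduces to OLS."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    A = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(A, Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def predict(X, coef, intercept):
    return X @ coef + intercept

def metrics(y_true, y_pred):
    """MSE, RMSE, and R-squared (1 - SS_res / SS_tot)."""
    ss_res = float(np.sum((y_true - y_pred) ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    mse = ss_res / len(y_true)
    return {"mse": mse, "rmse": mse ** 0.5, "r2": 1.0 - ss_res / ss_tot}

def cv_mse(X, y, alpha, k=5):
    """Mean validation MSE over k contiguous folds."""
    indices = np.arange(len(y))
    errors = []
    for fold in np.array_split(indices, k):
        train = np.setdiff1d(indices, fold)
        coef, intercept = fit_ridge(X[train], y[train], alpha)
        errors.append(np.mean((y[fold] - predict(X[fold], coef, intercept)) ** 2))
    return float(np.mean(errors))

# Synthetic data in which the outcome is unrelated to the predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = rng.normal(size=60)
X_train, X_test = X[:45], X[45:]
y_train, y_test = y[:45], y[45:]

alphas = [0.01, 0.1, 1, 10, 100, 1000, 10000]  # hypothetical grid
best_alpha = min(alphas, key=lambda a: cv_mse(X_train, y_train, a))

ols = metrics(y_test, predict(X_test, *fit_ridge(X_train, y_train, 0.0)))
ridge = metrics(y_test, predict(X_test, *fit_ridge(X_train, y_train, best_alpha)))
print("OLS:", ols)
print(f"Ridge (alpha={best_alpha}):", ridge)
```

When the predictors carry little signal, cross-validation can favor very large alpha values, which shrink the coefficients toward zero so that the model approaches predicting the training mean.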

Polynomial Regression Results

Environmental Data (5 Year Averages)
Traffic Data (5 Year Averages)
Mean Squared Error: 8.21679388945711
R-squared: -17.32501669092919
 
Environmental Data (2022)
Traffic Data (2022)
Mean Squared Error: 0.7942820147920174
R-squared: -1.6347491562434593
 
Environmental Data (2022)
Traffic Data (5 Year Averages)
Mean Squared Error: 1.2166376469368712
R-squared: -1.7133338729035614
 
Environmental Data (5 Year Averages)
Traffic Data (2022)
Mean Squared Error: 6.005052083350367
R-squared: -18.919632592900072
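
One possible contributor to such large negative R-squared values, offered as an assumption and not a finding, is how quickly a polynomial expansion multiplies the number of model terms relative to the available data points. The slides do not state the polynomial degree or the number of input features, so the counts below are illustrative only.

```python
from math import comb

def n_polynomial_terms(n_features, degree, include_bias=True):
    """Number of monomials of total degree <= `degree` in `n_features` variables."""
    total = comb(n_features + degree, degree)
    return total if include_bias else total - 1

# Hypothetical feature counts and degrees.
for n_features in (5, 9, 15):
    for degree in (2, 3):
        terms = n_polynomial_terms(n_features, degree)
        print(f"{n_features} features, degree {degree}: {terms} terms")
```

If the number of terms approaches or exceeds the number of training rows, the fit can track noise in the training set and generalize poorly.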

Conclusion

Model Results

  • Overall low predictive power
  • PCA suggests that extreme storm events and temperature have the most significant interaction
  • Linear Regression models show that the predictors do not explain the outcome variable (negative R-squared values, i.e. worse than predicting the mean)
  • Polynomial Regression also did not capture any meaningful relationships (negative R-squared values)
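
To make the negative R-squared values concrete, the reported metrics can be checked against two identities: RMSE = sqrt(MSE), and R-squared = 1 - MSE / Var(y), where Var(y) is the variance of the outcome on the evaluation rows. The sketch below uses the figures reported for the 5-year-average linear models. That the metrics were computed on a single shared evaluation set is an assumption.

```python
from math import sqrt, isclose

# (MSE, RMSE, R-squared) as reported for the 5-year-average linear models.
reported = {
    "OLS":   (0.544387914704966, 0.7378264800784572, -0.21408882314897082),
    "Ridge": (0.5596105056182925, 0.7480711902073843, -0.24803810267560222),
    "Lasso": (0.5601840743214328, 0.7484544570790082, -0.2493172702195181),
}

implied_variance = {}
for name, (mse, rmse, r2) in reported.items():
    # RMSE should equal the square root of MSE.
    assert isclose(sqrt(mse), rmse, rel_tol=1e-6)
    # Rearranging R^2 = 1 - MSE / Var(y) gives Var(y) = MSE / (1 - R^2).
    implied_variance[name] = mse / (1.0 - r2)
    print(f"{name}: implied outcome variance = {implied_variance[name]:.4f}")
```

All three models imply the same outcome variance of about 0.448, which is smaller than every reported MSE. In other words, each model's squared error exceeds that of simply predicting the mean of the evaluation data, which is exactly what a negative R-squared indicates.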

Improvements

  • Combine more datasets (only two were combined at a time)
  • Use a longer time period (more data points)
  • Address low data integrity across multiple public data sources

Thank you!