Abstract
Public agencies aiming to enforce environmental regulation have limited resources to achieve their objectives. We demonstrate how machine-learning methods can inform the efficient use of these limited resources while accounting for real-world concerns, such as gaming the system and institutional constraints. Here, we predict the likelihood of a facility failing a water-pollution inspection and propose alternative inspection allocations that would target high-risk facilities. Implementing such a data-driven inspection allocation could detect over seven times as many violations, in expectation, as current practices. When we impose constraints, such as maintaining a minimum probability of inspection for all facilities and accounting for state-level differences in inspection budgets, our reallocation regimes still double the number of violations detected through inspections. Leveraging increasing amounts of electronic data can help public agencies enhance their regulatory effectiveness and remedy environmental harms. Although employing algorithm-based resource allocation rules requires care to avoid manipulation and unintentional error propagation, the principled use of predictive analytics can extend the beneficial reach of limited resources.
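The allocation idea in the abstract can be illustrated with a small sketch. This is not the authors' code (which is available at the repository listed under Data availability); it is a minimal example assuming each facility already has a predicted failure probability. The function names `allocate_inspections` and `expected_detections`, the toy risk scores, and the greedy top-up rule are all hypothetical choices made for illustration.

```python
def allocate_inspections(risk, budget, min_prob=0.0):
    """Assign each facility a probability of being inspected.

    risk     -- predicted probability that each facility fails an inspection
    budget   -- expected number of inspections available
    min_prob -- floor on every facility's inspection probability, so that
                no facility can count on never being inspected

    Hypothetical greedy rule: give every facility the floor, then spend the
    remaining budget on the highest-risk facilities first.
    """
    n = len(risk)
    if not 0.0 <= min_prob <= 1.0:
        raise ValueError("min_prob must lie in [0, 1]")
    if budget < min_prob * n or budget > n:
        raise ValueError("budget must lie between min_prob * n and n")

    probs = [min_prob] * n
    remaining = budget - min_prob * n
    # Visit facilities from highest to lowest predicted risk.
    for i in sorted(range(n), key=lambda j: risk[j], reverse=True):
        if remaining <= 0:
            break
        top_up = min(1.0 - min_prob, remaining)
        probs[i] += top_up
        remaining -= top_up
    return probs


def expected_detections(risk, probs):
    """Expected number of violations found: sum of P(inspect) * P(fail)."""
    return sum(r * p for r, p in zip(risk, probs))


if __name__ == "__main__":
    risk = [0.9, 0.6, 0.3, 0.1, 0.1]  # toy scores, not real data
    budget = 2
    uniform = [budget / len(risk)] * len(risk)
    targeted = allocate_inspections(risk, budget)
    with_floor = allocate_inspections(risk, budget, min_prob=0.2)
    for name, probs in [("uniform", uniform),
                        ("targeted", targeted),
                        ("targeted with floor", with_floor)]:
        print(name, round(expected_detections(risk, probs), 3))
```

Under these assumptions, the floor trades away some expected detections in exchange for keeping every facility at risk of inspection, which is the kind of constraint the abstract describes. State-level budget differences could be approximated by calling the same function once per state with that state's own budget.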
Data availability
The raw data used in this analysis can be downloaded from the EPA’s ECHO website (https://echo.epa.gov/). The processed datasets are also available with code at the Stanford Digital Repository (https://purl.stanford.edu/hr919hp5420).
Acknowledgements
We thank S. Athey, M. Burke, F. Burlig, K. Mach, A. D’Agostino, C. Anderson, K. Green, S. Hasan, D. Jiménez, H. Kim, A. R. Siders and A. Stock for comments. E.B. receives funding from the National Science Foundation Graduate Research Fellowship Program (DGE-114747), M.H. from the Department of Earth System Science at Stanford University, and N.B. from the Stanford Graduate Fellowship/David and Lucile Packard Foundation.
Author information
Contributions
All three authors collaboratively designed the study, developed the methodology, assembled the data, wrote the code, performed the analysis, interpreted the results, and wrote the manuscript. E.B. and M.H. conducted the final analysis, with substantial input from N.B.
Ethics declarations
Competing interests
The authors declare no competing interests.
Supplementary information
Supplementary Note 1, Supplementary Figures 1–6, Supplementary Tables 1–6, Supplementary References 1–4
About this article
Cite this article
Hino, M., Benami, E. & Brooks, N. Machine learning for environmental monitoring. Nat Sustain 1, 583–588 (2018). https://doi.org/10.1038/s41893-018-0142-9