TY - JOUR
T1 - AI-Driven Deployable Water Quality Management
T2 - Benchmarking Effluent BOD Prediction at a European Pulp and Paper Mill
AU - Naeijian, Fatemeh
AU - Arefinia, Ali
AU - Zaki, Mohammed Tamim
AU - Orner, Kevin D.
AU - Tafazzoli, Mohammadsoroush
AU - Rowles, Lewis S.
PY - 2026/1/17
Y1 - 2026/1/17
N2 - The pulp and paper industry consumes 10–300 m³ of water per ton of product and generates substantial wastewater volumes. These facilities collect extensive operational data that remain underutilized for treatment optimization. However, systematic long-horizon comparisons of multiple machine learning algorithms for effluent prediction under full-scale operating conditions remain scarce. Using 1,033 daily records from a full-scale European mill (January 2020–October 2022), we evaluated five machine learning models to predict effluent biochemical oxygen demand (BOD) for environmental compliance monitoring. Input variables included flow rate, influent total suspended solids (TSS), influent BOD, temperature, aeration rate, hydraulic retention time, and chemical dosing. We benchmarked support vector machines (SVMs), artificial neural networks (ANNs), genetic programming (GP), decision trees (DTs), and random forests (RFs). SVM achieved the best performance (R² = 0.85, NSE = 0.82, RMSE = 5.5 mg/L), while ANN delivered competitive accuracy (R² = 0.82, RMSE = 6.8 mg/L) with the fastest runtime. GP produced interpretable mathematical equations, and DT/RF provided high operational transparency despite lower accuracy. These models can inform operational decisions related to aeration control, chemical dosing, and compliance management. By combining routinely monitored process variables with laboratory-measured influent BOD, our models demonstrated strong retrospective predictive performance across a three-year data set. This work establishes a benchmark for pulp and paper wastewater management and clarifies pathways for integrating real-time sensors to enable predictive process control.
AB - The pulp and paper industry consumes 10–300 m³ of water per ton of product and generates substantial wastewater volumes. These facilities collect extensive operational data that remain underutilized for treatment optimization. However, systematic long-horizon comparisons of multiple machine learning algorithms for effluent prediction under full-scale operating conditions remain scarce. Using 1,033 daily records from a full-scale European mill (January 2020–October 2022), we evaluated five machine learning models to predict effluent biochemical oxygen demand (BOD) for environmental compliance monitoring. Input variables included flow rate, influent total suspended solids (TSS), influent BOD, temperature, aeration rate, hydraulic retention time, and chemical dosing. We benchmarked support vector machines (SVMs), artificial neural networks (ANNs), genetic programming (GP), decision trees (DTs), and random forests (RFs). SVM achieved the best performance (R² = 0.85, NSE = 0.82, RMSE = 5.5 mg/L), while ANN delivered competitive accuracy (R² = 0.82, RMSE = 6.8 mg/L) with the fastest runtime. GP produced interpretable mathematical equations, and DT/RF provided high operational transparency despite lower accuracy. These models can inform operational decisions related to aeration control, chemical dosing, and compliance management. By combining routinely monitored process variables with laboratory-measured influent BOD, our models demonstrated strong retrospective predictive performance across a three-year data set. This work establishes a benchmark for pulp and paper wastewater management and clarifies pathways for integrating real-time sensors to enable predictive process control.
UR - https://doi.org/10.1021/acsengineeringau.5c00088
U2 - 10.1021/acsengineeringau.5c00088
DO - 10.1021/acsengineeringau.5c00088
M3 - Article
SN - 2694-2488
JO - ACS Engineering Au
JF - ACS Engineering Au
ER -