A Lightweight Machine Learning Model for Real-Time Water Contamination Detection on Edge Devices

Authors

  • ايمان سالم سعيد المشلوم, Higher Institute of Science and Technology, Al-Mashashiya, Libya

Keywords:

Lightweight Machine Learning, Edge Computing, Water Quality, Contamination Detection, Model Quantization, TensorFlow Lite, Environmental IoT

Abstract

Real-time monitoring of drinking water quality remains a critical challenge, particularly in remote and resource-constrained environments where conventional infrastructure cannot provide continuous surveillance. This paper presents the design and implementation of a lightweight machine learning (ML) model that classifies water as potable or non-potable for human consumption while remaining small enough to deploy on low-power edge devices.

The proposed methodology uses the "Water Potability" dataset from Kaggle, which comprises 3,276 samples with 9 physicochemical features and a binary potability label (10 columns in total). Missing values were filled by median imputation, and class imbalance was corrected with the SMOTE algorithm. A compact sequential neural network was designed with three dense layers (32→16→8 neurons), using L2 regularization, BatchNormalization, and Dropout to mitigate overfitting. To reduce its memory and compute footprint, the model was converted to TensorFlow Lite format with post-training INT8 quantization.
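The two data-preparation steps named above can be illustrated with a minimal, standard-library-only sketch. It is not the paper's code: the toy rows, the function names `impute_median` and `smote_like`, and the choice of `k` are assumptions made for illustration, and the oversampler implements only the core SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours) rather than a full library implementation.

```python
import random
import statistics


def impute_median(rows):
    """Replace None entries in each column with that column's median."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        medians.append(statistics.median(observed))
    return [
        [medians[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by linear interpolation
    between a random minority sample and one of its k nearest
    minority neighbours (squared Euclidean distance)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two features per sample, None marks a missing value.
rows = [[7.0, 200.0], [None, 180.0], [6.5, None], [8.1, 210.0], [7.4, 190.0]]
labels = [0, 0, 0, 1, 1]

clean = impute_median(rows)

# Oversample class 1 until both classes have the same number of samples.
minority = [r for r, y in zip(clean, labels) if y == 1]
n_needed = labels.count(0) - len(minority)
new_points = smote_like(minority, n_needed)

balanced_rows = clean + new_points
balanced_labels = labels + [1] * len(new_points)
```

In practice, synthetic samples are usually generated from the training split only, so that the held-out test set contains only real measurements.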

Experimental results on the held-out test set (n=656) showed a classification accuracy of 61.13%, an F1-score of 51.80%, and a ROC-AUC of 65.69%. Most significantly from a deployment perspective, the model size was reduced from 59.47 KB to 6.07 KB (a 9.8× compression ratio), making it suitable for deployment on microcontroller boards such as the ESP32, Arduino Nano 33 BLE, and Raspberry Pi Pico.
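To make the size reduction concrete, the following sketch shows how a generic affine INT8 scheme maps floating-point values to 8-bit integers using a scale and a zero point, and then recomputes the compression ratio from the two reported file sizes. The weight values and function names are hypothetical, and the scheme is a textbook illustration; it is not claimed to match the exact per-tensor choices the TensorFlow Lite converter made for this model.

```python
def quantize_int8(values):
    """Affine quantization of floats to integers in [-128, 127]."""
    # The representable range must contain 0 so that 0.0 maps exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all values are zero; any scale works
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map int8 codes back to approximate float values."""
    return [scale * (x - zero_point) for x in q]


# Hypothetical float weights, for illustration only.
weights = [-0.82, -0.10, 0.0, 0.37, 1.25]

q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

# Each weight now needs 1 byte instead of the 4 bytes of a float32.
bytes_float32 = 4 * len(weights)
bytes_int8 = 1 * len(q)

# Compression ratio recomputed from the sizes reported in the abstract.
ratio = 59.47 / 6.07
print(f"codes={q} max_error={max_error:.4f} ratio={ratio:.1f}x")
```

The rounding error per value stays within about half a quantization step (`scale / 2`), which is the accuracy cost paid for the smaller footprint.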

This study demonstrates the feasibility of lightweight, quantized AI models for field-based environmental monitoring, where classification accuracy is deliberately traded for resource efficiency. The findings represent a practical step toward autonomous, low-cost early-warning systems for water contamination detection, and they open avenues for future research in temporal data fusion, federated learning, and hardware-aware quantization.

Published

2026-03-17

How to Cite

A Lightweight Machine Learning Model for Real-Time Water Contamination Detection on Edge Devices. (2026). Al-Farooq Journal of Sciences, 2(1), 497-514. https://www.afjs.histr.edu.ly/index.php/afjs/article/view/69