Analyzing Air Quality Data by Machine Learning
DOI:
https://doi.org/10.63595/vetor.v35i1.18205
Keywords:
Air quality, Clustering, Classification
Abstract
Machine learning (ML) allows for the continuous analysis of large volumes of data, including information on consumption, public health, and industrial processes. The parameters produced by air quality monitoring are one example of such datasets. This study used ML tools to assess air quality at CETESB station 66 - Parisi in Cubatão, São Paulo, Brazil. One year of data, from January 1, 2022 to January 1, 2023, was examined for Inhalable Particulate Matter (PM10), nitrogen oxides (NO, NO2, and NOx), and SO2. Feature engineering, clustering, and classification were conducted, yielding analyses that can support the control of atmospheric pollutants. The dendrogram indicated the presence of four clusters, which was confirmed by the k-means method. The k-nearest neighbors algorithm was the best-performing classifier, with a coefficient of 0.953138. Protecting the environment should be a collective responsibility; even small initiatives can contribute significantly to movements and public policies.
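The clustering and classification steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not use the CETESB data: the readings are synthetic, the four group centres are invented for illustration, and the function names (`make_synthetic_readings`, `kmeans`, `knn_predict`) are hypothetical. The sketch uses only the standard library, omits the dendrogram and feature-engineering steps, and assumes that the classifier is trained to reproduce the cluster labels.

```python
import math
import random
from collections import Counter


def make_synthetic_readings(n_per_group=30, seed=0):
    """Synthetic stand-in for (PM10, NO2, SO2)-like readings; all values are invented."""
    rng = random.Random(seed)
    centers = [(20.0, 10.0, 5.0), (60.0, 15.0, 8.0),
               (25.0, 50.0, 10.0), (70.0, 60.0, 40.0)]
    points = [tuple(rng.gauss(mean, 1.5) for mean in center)
              for center in centers for _ in range(n_per_group)]
    rng.shuffle(points)
    return points


def kmeans(points, k, n_iter=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from all centres chosen so far.
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(axis) / len(members)
                                   for axis in zip(*members))
    return labels


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


points = make_synthetic_readings()
labels = kmeans(points, k=4)

# Hold out the last 25% of the shuffled points to score the classifier.
split = int(0.75 * len(points))
train_x, test_x = points[:split], points[split:]
train_y, test_y = labels[:split], labels[split:]

predicted = [knn_predict(train_x, train_y, q) for q in test_x]
accuracy = sum(p == t for p, t in zip(predicted, test_y)) / len(test_y)
print(f"clusters: {len(set(labels))}, held-out kNN accuracy: {accuracy:.3f}")
```

On real monitoring data the features would normally be standardized first, since pollutant concentrations differ in scale and both k-means and k-nearest neighbors rely on Euclidean distance.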
