Processing Big Data: Air Quality and Weather Integration for Environmental Insights (2023)

Big Data Processing

This project integrates large air quality and weather datasets and analyzes them with Apache Hadoop, Apache Spark, and Apache Hive. The goal of combining the datasets is to explore relationships between weather patterns and air quality. Hadoop provides distributed storage, Spark handles fast data processing, and Hive supplies the query layer, so the combined data can be handled efficiently at scale and used to better understand environmental dynamics.
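As a rough illustration of the kind of integration described above, the sketch below joins air quality and weather records on a shared key and measures how strongly the two series move together. It is plain Python, not the project's code: the sample records, the choice of fields (station, date, PM2.5, wind speed), and the helper names `join_on_key` and `pearson` are all assumptions made for the example. At the project's scale the same join would presumably be expressed as a Spark or Hive query over data stored in Hadoop.

```python
from math import sqrt

# Hypothetical sample data: (station_id, date, value).
# Air quality values stand in for PM2.5 readings, weather values for wind speed.
air_quality = [
    ("S1", "2023-01-01", 40.0),
    ("S1", "2023-01-02", 30.0),
    ("S1", "2023-01-03", 20.0),
    ("S1", "2023-01-04", 10.0),
    ("S2", "2023-01-01", 55.0),  # no matching weather record
]
weather = [
    ("S1", "2023-01-01", 1.0),
    ("S1", "2023-01-02", 2.0),
    ("S1", "2023-01-03", 3.0),
    ("S1", "2023-01-04", 4.0),
    ("S3", "2023-01-01", 6.0),  # no matching air quality record
]


def join_on_key(left, right):
    """Inner-join two record lists on (station_id, date)."""
    right_index = {(station, date): value for station, date, value in right}
    return [
        (station, date, value, right_index[(station, date)])
        for station, date, value in left
        if (station, date) in right_index
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


joined = join_on_key(air_quality, weather)
r = pearson([row[2] for row in joined], [row[3] for row in joined])
print(f"matched rows: {len(joined)}, correlation: {r:.2f}")
```

Only records present in both inputs survive the join, which mirrors an inner join in Spark or Hive. A correlation near -1 or 1 on real data would only suggest a relationship worth investigating, not establish a cause.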

Basic Info:

Team: Haoyu Yang, Zeyi Song
