An end-to-end data pipeline that extracts cryptocurrency price data from the CoinGecko API, transforms it with pandas, and loads it into PostgreSQL. The pipeline is orchestrated and scheduled with Apache Airflow running in Docker.
- Python (pandas, requests, SQLAlchemy)
- PostgreSQL
- Apache Airflow (Docker)
- REST API (CoinGecko)
- Extracts the top 5 cryptocurrency prices from the CoinGecko API.
- Transforms data (formats timestamps, selects relevant fields).
- Loads data into a PostgreSQL table, `crypto_prices`.
- Airflow DAG schedules the pipeline to run daily.
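
The extract, transform, and load steps listed above could look roughly like the sketch below. It is a minimal illustration, not the repository's actual code: the function names, the selected columns, the request parameters, and the "top 5 by market cap" ordering are all assumptions. Only the CoinGecko `/coins/markets` endpoint and the `crypto_prices` table name come from the description above.

```python
import pandas as pd
import requests

# CoinGecko's public markets endpoint. The query parameters below are
# assumptions about how "top 5" is selected (here: by market cap).
COINGECKO_URL = "https://api.coingecko.com/api/v3/coins/markets"


def extract(limit=5, currency="usd"):
    """Fetch the top `limit` coins as a list of dicts (hypothetical helper)."""
    params = {
        "vs_currency": currency,
        "order": "market_cap_desc",
        "per_page": limit,
        "page": 1,
    }
    response = requests.get(COINGECKO_URL, params=params, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


def transform(records):
    """Keep a few fields and parse the timestamp into a UTC datetime."""
    # Column choice is illustrative; the real pipeline may keep other fields.
    columns = ["id", "symbol", "current_price", "market_cap", "last_updated"]
    df = pd.DataFrame(records, columns=columns)
    df["last_updated"] = pd.to_datetime(df["last_updated"], utc=True)
    return df


def load(df, engine, table="crypto_prices"):
    """Append rows to the target table; `engine` is a SQLAlchemy engine."""
    df.to_sql(table, engine, if_exists="append", index=False)
```

A daily Airflow task would then call `load(transform(extract()), engine)`, where `engine` is created with SQLAlchemy's `create_engine` and a PostgreSQL connection string for your environment.
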
- Clone the repository.
- Set up Docker and navigate to the repo folder.
- Start Airflow with:
docker-compose up -d