📊 Python Data Workflows – 🗄️ SQL + Python 🐍
Posted on: May 29, 2026
Description:
Data workflows often involve more than CSV files.
As datasets grow, SQL becomes one of the most important tools for querying and analyzing data efficiently. Combining SQL with Python gives you the flexibility of programming along with the power of structured querying.
Why SQLite?
SQLite is lightweight, fast, and built directly into Python.
It is perfect for:
- local analysis
- learning SQL
- testing workflows
- small-to-medium datasets
Loading Data into SQL
The workflow starts by loading a CSV file into pandas and storing it in SQLite.
df.to_sql("sales", conn, if_exists="replace", index=False)
This converts the DataFrame into a SQL table.
Running SQL Queries
Once the data is stored, SQL queries can be executed directly from Python.
SELECT category, SUM(sales)
FROM sales
GROUP BY category
This makes aggregation and filtering much cleaner for large datasets.
Combining SQL + pandas
The best part is that query results come back as DataFrames.
result = pd.read_sql(query, conn)
So you can:
- query with SQL
- analyze with pandas
- visualize with matplotlib/seaborn
all in one workflow.
Real-World Importance
SQL + Python is widely used in:
- analytics engineering
- ETL pipelines
- reporting systems
- data platforms
- dashboards
Learning this combination is a major step toward real-world data workflows.
Key Takeaways
- SQLite is excellent for lightweight database workflows
- SQL simplifies filtering and aggregation
- pandas integrates smoothly with SQL queries
- SQL + Python is a core real-world data skill
Code Snippet:
import pandas as pd
import sqlite3
# Step 1 — Load Dataset
df = pd.read_csv("sample_data.csv")
print("✅ Dataset Loaded\n")
print(df.head(), "\n")
# Step 2 — Create SQLite Database Connection
conn = sqlite3.connect("sales_data.db")
print("✅ SQLite database connected\n")
# Step 3 — Store Data in SQLite Table
df.to_sql("sales", conn, if_exists="replace", index=False)
print("✅ Dataset stored in SQLite table\n")
# Step 4 — Query Total Sales by Category
query_1 = """
SELECT
category,
SUM(sales) AS total_sales
FROM sales
GROUP BY category
ORDER BY total_sales DESC
"""
category_sales = pd.read_sql(query_1, conn)
print("📊 Total Sales by Category:")
print(category_sales, "\n")
# Step 5 — Query Average Sales by Region
query_2 = """
SELECT
region,
AVG(sales) AS avg_sales
FROM sales
GROUP BY region
ORDER BY avg_sales DESC
"""
region_sales = pd.read_sql(query_2, conn)
print("📈 Average Sales by Region:")
print(region_sales, "\n")
# Step 6 — Query Top 5 Highest Orders
query_3 = """
SELECT
order_id,
category,
region,
sales
FROM sales
ORDER BY sales DESC
LIMIT 5
"""
top_orders = pd.read_sql(query_3, conn)
print("🏆 Top 5 Highest Sales Orders:")
print(top_orders, "\n")
# Step 7 — Filter Electronics Sales Above 1500
query_4 = """
SELECT
order_id,
category,
region,
sales
FROM sales
WHERE category = 'Electronics'
AND sales > 1500
"""
electronics_sales = pd.read_sql(query_4, conn)
print("⚡ Electronics Orders Above 1500:")
print(electronics_sales, "\n")
# Step 8 — Count Orders by Region
query_5 = """
SELECT
region,
COUNT(*) AS total_orders
FROM sales
GROUP BY region
ORDER BY total_orders DESC
"""
region_orders = pd.read_sql(query_5, conn)
print("🗂️ Total Orders by Region:")
print(region_orders, "\n")
# Step 9 — Close Database Connection
conn.close()
print("🔒 Database connection closed\n")
# Key Takeaways
print("📌 Key Takeaways:")
print("🔹 SQLite is useful for lightweight database workflows")
print("🔹 SQL makes filtering and aggregation easier")
print("🔹 pandas integrates smoothly with SQL queries")
print("🔹 SQL + Python is a powerful combination for data analysis\n")
# Final Note
print("🚀 Combining SQL and Python helps build scalable and structured data workflows!")
No comments yet. Be the first to comment!