🧠 AI with Python – 🌐 Build a FastAPI Prediction API
Posted on: December 11, 2025
Description:
Once a model is saved to disk, the next logical step is to deploy it so external systems can use it.
FastAPI provides a fast, modern, and developer-friendly way to turn a machine learning model into a live REST API that can accept input data and return predictions instantly.
In this guide, we build a simple prediction service by loading our saved .joblib model and exposing a /predict endpoint.
Understanding the Problem
Notebooks are great for experimentation, but production systems need:
- an interface to send data
- a predictable response format
- a server hosting the ML model
- consistent, fast predictions
FastAPI enables exactly this with minimal code and excellent performance.
1. Load the Model & Initialize FastAPI
The model is loaded once, when the server process starts, so every request reuses the same in-memory model instead of reloading it from disk.
import joblib
from fastapi import FastAPI

model = joblib.load("iris_model.joblib")
app = FastAPI(title="Iris Prediction API")
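For context, the iris_model.joblib file could have been produced by a short training script like the following (a minimal sketch that assumes scikit-learn's built-in Iris dataset and a LogisticRegression classifier; your own training code may differ):

```python
import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a simple classifier on the Iris dataset
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

# Persist the fitted model to disk for the API to load at startup
joblib.dump(clf, "iris_model.joblib")
```

Any estimator with a predict() method can be saved this way; the API code stays the same.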
2. Health Check Endpoint
A simple route to confirm the API is running.
@app.get("/")
def home():
    return {"message": "Iris ML Model API is running!"}
3. Create the Prediction Endpoint
The /predict route accepts a list of numeric features and returns the model output.
@app.post("/predict")
def predict(features: list[float]):
    data = np.array(features).reshape(1, -1)
    pred = model.predict(data).tolist()
    return {"prediction": pred}
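The reshape(1, -1) call is worth noting: scikit-learn models expect a 2-D array of shape (n_samples, n_features), even when predicting for a single sample, so the flat feature list has to become a one-row matrix:

```python
import numpy as np

features = [5.1, 3.5, 1.4, 0.2]  # one flower's measurements

flat = np.array(features)   # shape (4,): 1-D, which model.predict() rejects
data = flat.reshape(1, -1)  # shape (1, 4): one row, column count inferred

print(flat.shape)  # (4,)
print(data.shape)  # (1, 4)
```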
This endpoint is now ready to serve predictions from any front-end, mobile app, or automated script.
4. Run the FastAPI Server
uvicorn main:app --reload
You can now test predictions via Postman, curl, or FastAPI's auto-generated interactive docs at /docs.
Key Takeaways
- FastAPI turns ML models into real-time APIs effortlessly.
- Loading the model once on startup ensures fast, efficient inference.
- JSON requests and responses make the model usable by any system.
- This approach is the foundation for deploying ML in production.
- FastAPI + joblib is perfect for prototypes, demos, and lightweight systems.
Conclusion
Deploying an ML model as an API is a huge step toward production readiness.
FastAPI provides a clean, high-performance framework for serving predictions at scale. Once your model is accessible through an endpoint, it can power applications, automation, dashboards, and more — turning your ML code into real-world functionality.
Code Snippet:
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

# Load the trained model once at startup
model = joblib.load("iris_model.joblib")
app = FastAPI(title="Iris Prediction API")

# Pydantic model for validated JSON input
class InputData(BaseModel):
    features: list[float]

@app.get("/")
def home():
    return {"message": "Iris ML Model API is running!"}

@app.post("/predict")
def predict(data: InputData):
    # Reshape the flat feature list into the 2-D array scikit-learn expects
    arr = np.array(data.features).reshape(1, -1)
    pred = model.predict(arr).tolist()
    return {"prediction": pred}