🧠 AI with Python – 🌐 Build a FastAPI Prediction API
Posted on: December 11, 2025
Description:
Once a model is saved to disk, the next logical step is to deploy it so external systems can use it.
FastAPI provides a fast, modern, and developer-friendly way to turn a machine learning model into a live REST API that can accept input data and return predictions instantly.
In this guide, we build a simple prediction service by loading our saved .joblib model and exposing a /predict endpoint.
Understanding the Problem
Notebooks are great for experimentation, but production systems need:
- an interface to send data
- a predictable response format
- a server hosting the ML model
- consistent, fast predictions
FastAPI enables exactly this with minimal code and excellent performance.
1. Load the Model & Initialize FastAPI
The model is loaded once, when the server process starts, so every request reuses the same in-memory model instead of reloading it from disk.
import joblib
from fastapi import FastAPI

model = joblib.load("iris_model.joblib")
app = FastAPI(title="Iris Prediction API")
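For context, the iris_model.joblib file could have been produced by a short training script like the following (a minimal sketch that assumes scikit-learn's built-in Iris dataset and a LogisticRegression classifier; your own training code may differ):

```python
import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a simple classifier on the Iris dataset
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

# Persist the fitted model to disk for the API to load at startup
joblib.dump(clf, "iris_model.joblib")
```

Any estimator with a predict() method can be saved this way; the API code stays the same.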
2. Health Check Endpoint
A simple route to confirm the API is running.
@app.get("/")
def home():
    return {"message": "Iris ML Model API is running!"}
3. Create the Prediction Endpoint
The /predict route accepts a list of numeric features and returns the model output.
@app.post("/predict")
def predict(features: list[float]):
    data = np.array(features).reshape(1, -1)
    pred = model.predict(data).tolist()
    return {"prediction": pred}
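The reshape(1, -1) call is worth noting: scikit-learn models expect a 2-D array of shape (n_samples, n_features), even when predicting for a single sample, so the flat feature list has to become a one-row matrix:

```python
import numpy as np

features = [5.1, 3.5, 1.4, 0.2]  # one flower's measurements

flat = np.array(features)   # shape (4,): 1-D, which model.predict() rejects
data = flat.reshape(1, -1)  # shape (1, 4): one row, column count inferred

print(flat.shape)  # (4,)
print(data.shape)  # (1, 4)
```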
This endpoint is now ready to serve predictions from any front-end, mobile app, or automated script.
4. Run the FastAPI Server
uvicorn main:app --reload
You can now test predictions via Postman, curl, or FastAPI's auto-generated interactive docs at /docs.
Key Takeaways
- FastAPI turns ML models into real-time APIs effortlessly.
- Loading the model once on startup ensures fast, efficient inference.
- JSON requests and responses make the model usable by any system.
- This approach is the foundation for deploying ML in production.
- FastAPI + joblib is perfect for prototypes, demos, and lightweight systems.
Conclusion
Deploying an ML model as an API is a huge step toward production readiness.
FastAPI provides a clean, high-performance framework for serving predictions at scale. Once your model is accessible through an endpoint, it can power applications, automation, dashboards, and more — turning your ML code into real-world functionality.
Code Snippet:
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

# Load the trained model once at startup
model = joblib.load("iris_model.joblib")
app = FastAPI(title="Iris Prediction API")

# Pydantic model for validated JSON input
class InputData(BaseModel):
    features: list[float]

@app.get("/")
def home():
    return {"message": "Iris ML Model API is running!"}

@app.post("/predict")
def predict(data: InputData):
    # Reshape the flat feature list into the 2-D array scikit-learn expects
    arr = np.array(data.features).reshape(1, -1)
    pred = model.predict(arr).tolist()
    return {"prediction": pred}