⚡️ Saturday AI Sparks 🤖 - 🏷️ Zero-Shot Classification (Classify without training)


Description:

Normally, text classification requires a labeled dataset and a training process. But what if you want to classify text into categories without training at all?

That’s where Zero-Shot Classification comes in. Using Hugging Face’s facebook/bart-large-mnli model, you can assign custom labels to text on the fly.


Why Zero-Shot Classification?

  • No training needed → You just provide text + candidate labels.
  • Flexible → Works with any label set you define (e.g., “positive”, “negative”, “neutral” or “acting”, “cinematography”, “soundtrack”).
  • Fast prototyping → Great for tagging, routing, or feedback analysis.

Installing Requirements

pip install transformers torch

Minimal Implementation

Load the pipeline and run classification:

from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

text = "I loved the cinematography, but the plot was too slow and predictable."
candidate_labels = ["positive", "negative", "story", "acting", "cinematography", "soundtrack"]

result = classifier(text, candidate_labels)

This returns a dictionary with three keys: sequence (the input text), labels (the candidate labels sorted by confidence), and scores (the corresponding probabilities).
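Because the labels come back already sorted, reading the top prediction is a one-liner. Here is a sketch of that output shape — the scores below are placeholder values for illustration, not real model output:

```python
# Illustrative shape of a zero-shot pipeline result.
# Scores here are placeholders, NOT real model output.
result = {
    "sequence": "I loved the cinematography, but the plot was too slow and predictable.",
    "labels": ["cinematography", "negative", "story", "positive", "acting", "soundtrack"],
    "scores": [0.84, 0.69, 0.64, 0.32, 0.15, 0.10],
}

# Labels arrive sorted by score, so index 0 is the top prediction
top_label, top_score = result["labels"][0], result["scores"][0]
print(f"Top label: {top_label} ({top_score:.2f})")
```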


Multi-Label Option

If a text can belong to more than one category, enable multi-label:

result = classifier(text, candidate_labels, multi_label=True)

This scores each label independently, so the scores no longer sum to 1 and several labels can receive high confidence at once.
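Since no single "winner" exists in multi-label mode, a common pattern is to keep every label above a confidence threshold. A minimal sketch, again using placeholder scores rather than real model output:

```python
# Illustrative multi-label result (placeholder scores, NOT real model output).
result_multi = {
    "labels": ["cinematography", "negative", "story", "positive", "acting", "soundtrack"],
    "scores": [0.93, 0.88, 0.71, 0.22, 0.12, 0.05],
}

# Unlike single-label mode, these scores need not sum to 1,
# so select every label above a threshold instead of just the top one.
THRESHOLD = 0.5
selected = [
    lbl for lbl, sc in zip(result_multi["labels"], result_multi["scores"])
    if sc >= THRESHOLD
]
print(selected)
```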


Batch Classification

You can also pass a list of texts at once. The pipeline returns a list of results, one per input.
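The batched output preserves input order, so each text can be paired with its result by position. A sketch of that shape, using placeholder results in place of real model output:

```python
# Illustrative batched output: one result dict per input text, in the
# same order (placeholder scores, NOT real model output).
texts = [
    "The battery life is amazing, but the camera is mediocre.",
    "Support was unhelpful. I'm really disappointed.",
]
batch = [
    {"labels": ["positive", "negative"], "scores": [0.81, 0.74]},
    {"labels": ["negative", "positive"], "scores": [0.97, 0.03]},
]

# Pair each input with its top predicted label by position
for text, res in zip(texts, batch):
    print(f"{res['labels'][0]:10s} <- {text}")
```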


Sample Output (illustrative scores)

Single-label result (top label): cinematography | score: 0.8421

All labels ranked:
cinematography  -> 0.8421
negative        -> 0.6910
story           -> 0.6395
positive        -> 0.3212
acting          -> 0.1453
soundtrack      -> 0.1024

Key Takeaways

  • Zero-shot classification lets you skip training while still categorizing text.
  • Candidate labels are defined on the fly, making it highly flexible.
  • Great for early experiments, tagging pipelines, and quick insights before building supervised models.

Code Snippet:

# Import pipeline utility
from transformers import pipeline

# Create zero-shot classifier (MNLI-based)
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")


# Text we want to classify
text = "I loved the cinematography, but the plot was too slow and predictable."

# Candidate labels can be anything you choose
candidate_labels = ["positive", "negative", "story", "acting", "cinematography", "soundtrack"]


# Single-label classification (default)
result = classifier(text, candidate_labels)
print("Single-label result (top label):", result["labels"][0], "| score:", round(result["scores"][0], 4))

# Optional: view all labels with scores (sorted by confidence)
print("\nAll labels ranked:")
for lbl, sc in zip(result["labels"], result["scores"]):
    print(f"{lbl:15s} -> {sc:.4f}")


# Multi-label classification (multiple labels can be correct)
result_multi = classifier(text, candidate_labels, multi_label=True)

print("\nMulti-label results (labels with scores ≥ 0.5):")
for lbl, sc in zip(result_multi["labels"], result_multi["scores"]):
    if sc >= 0.5:
        print(f"{lbl:15s} -> {sc:.4f}")


# Batch classification: pass a list of texts to get one result per input
texts = [
    "The battery life is amazing, but the camera is mediocre.",
    "Support was unhelpful. I’m really disappointed.",
    "This album’s production quality is superb!",
]

batch = classifier(texts, candidate_labels, multi_label=True)
print("\nBatch classification (first item shown):")
print(batch[0])
