Hybrid quantum-enhanced ensemble classification (grid stability workflow)
Usage estimate: 20 minutes of QPU time per job on an Eagle r3 processor. (NOTE: This is an estimate only; your actual runtime may vary.)
Background
This tutorial demonstrates a hybrid quantum–classical workflow that enhances a classical ensemble with a quantum optimization step. Using Multiverse Computing’s “Singularity Machine Learning – Classification” (a Qiskit Function), we train a pool of conventional learners (for example, decision trees, k-NN, logistic regression) and then refine that pool with a quantum layer to improve diversity and generalization. The objective is practical: on a real grid-stability prediction task, we compare a strong classical baseline with a quantum-optimized alternative under the same data splits, so you can see where the quantum step helps and what it costs.
Why this matters: selecting a good subset from many weak learners is a combinatorial problem that grows quickly with ensemble size. Classical heuristics like boosting, bagging, and stacking perform well at moderate scales but can struggle to explore large, redundant libraries of models efficiently. The function integrates quantum algorithms - specifically QAOA (and optionally VQE in other configurations) - to search that space more effectively after the classical learners are trained, increasing the chance of finding a compact, diverse subset that generalizes better.
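To make the combinatorial cost concrete, the following sketch (plain Python and NumPy, not the Singularity API) brute-forces the best size-k subset of a toy learner pool by majority-vote accuracy. The synthetic "learner predictions" and the pool size are illustrative assumptions; the point is that exhaustive search over C(n, k) subsets is only feasible for very small pools, which is the gap the quantum selection step targets.

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(0)
n_models, n_samples, k = 8, 200, 3

# Ground-truth labels and toy "learner" predictions: each simulated model
# agrees with the truth with a randomly drawn per-model accuracy in [0.6, 0.8]
y = rng.integers(0, 2, n_samples)
preds = np.array(
    [np.where(rng.random(n_samples) < acc, y, 1 - y)
     for acc in rng.uniform(0.6, 0.8, n_models)]
)

def majority_vote(subset):
    """Hard majority vote over the selected subset of learners."""
    votes = preds[list(subset)].mean(axis=0)
    return (votes >= 0.5).astype(int)

# Exhaustive search over all C(8, 3) = 56 subsets -- already 2^n-style
# growth in general, which is why larger pools need smarter search
best = max(
    combinations(range(n_models), k),
    key=lambda s: (majority_vote(s) == y).mean(),
)
print("best subset:", best)
print("ensemble accuracy:", (majority_vote(best) == y).mean())
```

Even at n = 8 and k = 3 there are 56 candidate ensembles; at n = 60 and k = 20 there are roughly 4 × 10^15, which is where heuristic or quantum search becomes relevant.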
Crucially, data scale is not limited by qubits. The heavy lifting on data — preprocessing, training the learner pool, and evaluation — remains classical and can handle millions of examples. Qubits only determine the ensemble size used in the quantum selection step. This decoupling is what makes the approach viable on today’s hardware: you keep familiar scikit-learn workflows for data and model training while calling the quantum step through a clean action interface in Qiskit Functions.
In practice, while different learner types can be provided to the ensemble (for example, decision trees, logistic regression, or k-NN), decision trees tend to perform best. The optimizer consistently favors stronger ensemble members: when heterogeneous learners are supplied, weaker models such as linear classifiers are typically pruned in favor of more expressive ones like decision trees.
What you will do here: prepare and balance the grid-stability dataset; establish a classical AdaBoost baseline; run several quantum configurations that vary ensemble width and regularization; execute on IBM® simulators or QPUs via Qiskit Serverless; and compare accuracy, precision, recall, and F1 across all runs. Along the way, you will use the function’s action pattern (create, fit, predict, fit_predict, create_fit_predict) and key controls:
- Regularization types: onsite (λ) for direct sparsity, and alpha for a ratio-based trade-off between interaction and onsite terms
- Auto-regularization: set regularization="auto" with a target selection ratio to adapt sparsity automatically
- Optimizer options: simulator versus QPU, repetitions, classical optimizer and its options, transpilation depth, and runtime sampler/estimator settings
Benchmarks in the documentation show that accuracy improves as the number of learners (qubits) increases on challenging problems, with the quantum classifier matching or exceeding a comparable classical ensemble. In this tutorial, you will reproduce the workflow end-to-end and examine when increasing ensemble width or switching to adaptive regularization yields better F1 at reasonable resource usage. The result is a grounded view of how a quantum optimization step can complement, rather than replace, classical ensemble learning in real applications.
Requirements
Before starting this tutorial, ensure you have the following packages installed in your Python environment:
qiskit[visualization]~=2.1.0
qiskit-serverless~=0.24.0
qiskit-ibm-runtime~=0.40.1
qiskit-ibm-catalog~=0.8.0
scikit-learn==1.5.2
pandas>=2.0.0,<3.0.0
imbalanced-learn~=0.12.3
Setup
In this section, we initialize the Qiskit Serverless client and load the Singularity Machine Learning – Classification function provided by Multiverse Computing. With Qiskit Serverless, you can run hybrid quantum–classical workflows on IBM managed cloud infrastructure without worrying about resource management. You will need an IBM Quantum Platform API key and your cloud resource name (CRN) to authenticate and access Qiskit Functions.
Download the dataset
To run this tutorial, we use a preprocessed grid stability classification dataset containing labeled power system sensor readings.
The following cell automatically creates the required folder structure and downloads both the training and test files directly into your environment using wget.
If you already have these files locally, this step will safely overwrite them to ensure version consistency.
# Install the packages required for this notebook
!pip install -q imbalanced-learn matplotlib numpy pandas qiskit-ibm-catalog qiskit-ibm-runtime scikit-learn
## Download dataset for Grid Stability Classification
# Create data directory if it doesn't exist
!mkdir -p data_tutorial/grid_stability
# Download the training and test sets from the official Qiskit documentation repo
!wget -q --show-progress -O data_tutorial/grid_stability/train.csv \
https://raw.githubusercontent.com/Qiskit/documentation/main/datasets/tutorials/grid_stability/train.csv
!wget -q --show-progress -O data_tutorial/grid_stability/test.csv \
https://raw.githubusercontent.com/Qiskit/documentation/main/datasets/tutorials/grid_stability/test.csv
# Check the files have been downloaded
!echo "Dataset files downloaded:"
!ls -lh data_tutorial/grid_stability/*.csv
data_tutorial/grid_ 100%[===================>] 612.94K --.-KB/s in 0.01s
data_tutorial/grid_ 100%[===================>] 108.19K --.-KB/s in 0.006s
Dataset files downloaded:
-rw-r--r-- 1 coder coder 109K Nov 8 18:50 data_tutorial/grid_stability/test.csv
-rw-r--r-- 1 coder coder 613K Nov 8 18:50 data_tutorial/grid_stability/train.csv
Import required packages
In this section, we import all Python packages and Qiskit modules used throughout the tutorial.
These include core scientific libraries for data handling and model evaluation - such as NumPy, pandas, and scikit-learn - along with visualization tools and Qiskit components for running the quantum-enhanced model.
We also import the QiskitRuntimeService and QiskitFunctionsCatalog to connect with IBM Quantum® services and access the Singularity Machine Learning function.
from typing import Tuple
import warnings
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from imblearn.over_sampling import RandomOverSampler
from qiskit_ibm_catalog import QiskitFunctionsCatalog
from qiskit_ibm_runtime import QiskitRuntimeService
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import (
accuracy_score,
f1_score,
precision_score,
recall_score,
)
from sklearn.model_selection import train_test_split
warnings.filterwarnings("ignore")
Set constant variables
IBM_TOKEN = ""  # Your IBM Quantum Platform API key
IBM_INSTANCE_TEST = ""  # CRN of the instance used to access Qiskit Functions
IBM_INSTANCE_QUANTUM = ""  # CRN of the instance used for QPU access
FUNCTION_NAME = "multiverse/singularity"
RANDOM_STATE: int = 123
TRAIN_PATH = "data_tutorial/grid_stability/train.csv"
TEST_PATH = "data_tutorial/grid_stability/test.csv"
Connect to IBM Quantum and load the Singularity function
Next, we authenticate with IBM Quantum services and load the Singularity Machine Learning – Classification function from the Qiskit Functions Catalog.
The QiskitRuntimeService establishes a secure connection to IBM Quantum Platform using your API token and instance CRN, allowing access to quantum backends.
The QiskitFunctionsCatalog is then used to retrieve the Singularity function by name ("multiverse/singularity"), enabling us to call it later for hybrid quantum–classical computation.
If the setup is successful, you will see a confirmation message indicating that the function has been loaded correctly.
service = QiskitRuntimeService(
token=IBM_TOKEN,
channel="ibm_quantum_platform",
instance=IBM_INSTANCE_QUANTUM,
)
backend = service.least_busy()
catalog = QiskitFunctionsCatalog(
token=IBM_TOKEN,
instance=IBM_INSTANCE_TEST,
channel="ibm_quantum_platform",
)
singularity = catalog.load(FUNCTION_NAME)
print(
"Successfully connected to IBM Qiskit Serverless and loaded the Singularity function."
)
print("Catalog:", catalog)
print("Singularity function:", singularity)
Successfully connected to IBM Qiskit Serverless and loaded the Singularity function.
Catalog: <QiskitFunctionsCatalog>
Singularity function: QiskitFunction(multiverse/singularity)
Define helper functions
Before running the main experiments, we define a few small utility functions that streamline data loading and model evaluation.
- load_data() reads the input CSV files into NumPy arrays, splitting features and labels for compatibility with scikit-learn and quantum workflows.
- evaluate_predictions() computes and prints key performance metrics: accuracy, precision, recall, and F1-score.
These helper functions simplify repeated operations later in the notebook and ensure consistent metric reporting across both classical and quantum classifiers.
def load_data(data_path: str) -> Tuple[np.ndarray, np.ndarray]:
"""Load data from the given path to X and y arrays."""
df: pd.DataFrame = pd.read_csv(data_path)
return df.iloc[:, :-1].values, df.iloc[:, -1].values
def evaluate_predictions(predictions, y_true):
"""Compute and print accuracy, precision, recall, and F1 score."""
accuracy = accuracy_score(y_true, predictions)
precision = precision_score(y_true, predictions)
recall = recall_score(y_true, predictions)
f1 = f1_score(y_true, predictions)
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1:", f1)
return accuracy, precision, recall, f1
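As a quick sanity check, the metrics helper can be exercised on toy labels before running any experiments. The sketch below re-declares a minimal copy of evaluate_predictions so it is self-contained; the toy arrays are illustrative, not tutorial data.

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
)

def evaluate_predictions(predictions, y_true):
    """Compute and print accuracy, precision, recall, and F1 score."""
    accuracy = accuracy_score(y_true, predictions)
    precision = precision_score(y_true, predictions)
    recall = recall_score(y_true, predictions)
    f1 = f1_score(y_true, predictions)
    print("Accuracy:", accuracy)
    print("Precision:", precision)
    print("Recall:", recall)
    print("F1:", f1)
    return accuracy, precision, recall, f1

# Toy check: 3 of 4 predictions correct, one positive missed
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]
acc, prec, rec, f1 = evaluate_predictions(y_pred, y_true)
# Accuracy 0.75; precision 1.0 (no false positives); recall 2/3; F1 0.8
```

Returning the metrics as a tuple, in addition to printing them, lets later cells collect results from several runs into a comparison table.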
Step 1: Map classical inputs to a quantum problem
We begin by preparing the dataset for hybrid quantum–classical experimentation. The goal of this step is to convert the raw grid-stability data into balanced training, validation, and test splits that can be used consistently by both classical and quantum workflows. Maintaining identical splits ensures that later performance comparisons are fair and reproducible.
Data loading and preprocessing
We first load the training and test CSV files, create a validation split, and balance the dataset using random over-sampling. Balancing prevents bias toward the majority class and provides a more stable learning signal for both classical and quantum ensemble models.
# Load the data and carve out a validation split
X_train, y_train = load_data(TRAIN_PATH)
X_test, y_test = load_data(TEST_PATH)
X_train, X_val, y_train, y_val = train_test_split(
X_train, y_train, test_size=0.2, random_state=RANDOM_STATE
)
# Balance the dataset by over-sampling the minority class
ros = RandomOverSampler(random_state=RANDOM_STATE)
X_train_bal, y_train_bal = ros.fit_resample(X_train, y_train)
print("Shapes:")
print(" X_train_bal:", X_train_bal.shape)
print(" y_train_bal:", y_train_bal.shape)
print(" X_val:", X_val.shape)
print(" y_val:", y_val.shape)
print(" X_test:", X_test.shape)
print(" y_test:", y_test.shape)
Shapes:
X_train_bal: (5104, 12)
y_train_bal: (5104,)
X_val: (850, 12)
y_val: (850,)
X_test: (750, 12)
y_test: (750,)
Classical baseline: AdaBoost reference
Before running any quantum optimization, we train a strong classical baseline - a standard AdaBoost classifier - on the same balanced data. This provides a reproducible reference point for later comparison, helping to quantify whether quantum optimization improves generalization or efficiency beyond a well-tuned classical ensemble.
# ----- Classical baseline: AdaBoost -----
baseline = AdaBoostClassifier(n_estimators=60, random_state=RANDOM_STATE)
baseline.fit(X_train_bal, y_train_bal)
baseline_pred = baseline.predict(X_test)
print("Classical AdaBoost baseline:")
_ = evaluate_predictions(baseline_pred, y_test)
Classical AdaBoost baseline:
Accuracy: 0.7893333333333333
Precision: 1.0
Recall: 0.7893333333333333
F1: 0.8822652757078987
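Note the pattern in these baseline numbers: precision is exactly 1.0 while recall equals accuracy. That combination arises when the classifier produces no false positives on the evaluated split; one scenario consistent with it is a split containing only positive labels, where false positives are impossible by construction. The toy arithmetic below reproduces the pattern with illustrative labels (not the actual grid-stability test set):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# All-positive ground truth: every predicted 1 is a true positive, so
# precision is trivially 1.0, and accuracy reduces to recall because the
# only possible errors are false negatives
y_true = [1] * 10
y_pred = [1] * 8 + [0] * 2  # two false negatives

print("Accuracy:", accuracy_score(y_true, y_pred))    # 0.8
print("Precision:", precision_score(y_true, y_pred))  # 1.0
print("Recall:", recall_score(y_true, y_pred))        # 0.8
```

When comparing against the quantum-optimized runs later, keep this in mind: on such a split, F1 is driven almost entirely by recall, so recall improvements are what move the headline metric.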