주 콘텐츠로 건너뛰기

암묵적 용매 모델을 사용한 전자 구조 시뮬레이션 템플릿 배포 및 실행

Cleveland Clinic과의 협력으로 개발된 이 템플릿은 암묵적 용매 내 분자의 바닥 상태 에너지와 용매화 자유 에너지를 계산하는 워크플로로 구성되어 있습니다 [1]. 이 시뮬레이션은 샘플 기반 양자 대각화(SQD) 방법 [2-6]과 용매의 적분 방정식 형식 편극 가능 연속 모델(IEF-PCM) [7]을 기반으로 합니다.

이 가이드는 용질로 메탄올 분자를 사용하는 템플릿을 활용하며, 그 전자 구조는 명시적으로 시뮬레이션되고, 용매로는 물을 사용하여 연속 유전 매질로 근사합니다. 메탄올의 전자 상관 효과를 고려하면서 계산 비용과 정확도의 균형을 유지하기 위해, SQD IEF-PCM으로 시뮬레이션되는 활성 공간에는 σ\sigma, σ\sigma^{*}, 그리고 고독 쌍(lone pair) 오비탈만 포함합니다. 이 오비탈 선택은 C[2s,2p], O[2s,2p], H[1s] 원자 오비탈 구성 요소를 사용하여 원자 원자가 활성 공간(AVAS) 방법으로 수행되며, 그 결과 14개의 전자와 12개의 오비탈로 이루어진 활성 공간(14e,12o)이 됩니다. 참조 오비탈은 cc-pvdz 기저 집합을 사용하여 폐각 Hartree Fock으로 계산됩니다.

워크플로 소개

이 인터랙티브 가이드는 이 함수 템플릿을 Qiskit Serverless에 업로드하고 예제 워크로드를 실행하는 방법을 보여줍니다. 이 템플릿은 네 단계의 Qiskit 패턴으로 구성되어 있습니다:

1. 입력 수집 및 문제 매핑

이 단계는 분자의 기하 구조, 선택된 활성 공간, 용매화 모델, LUCJ 옵션, SQD 옵션을 입력으로 받습니다. 그런 다음 Hartree-Fock(HF) IEF-PCM 데이터를 포함하는 PySCF 체크포인트 파일을 생성합니다. 이 데이터는 워크플로의 SQD 부분에서 사용됩니다. 워크플로의 LUCJ 부분의 경우, 입력 섹션에서는 PySCF FCIDUMP 형식으로 내부에 저장되는 기체상 HF 데이터도 생성합니다.

기체상 HF 시뮬레이션에서 얻은 정보와 활성 공간의 정의가 입력으로 사용됩니다. 중요한 것은, 오류 억제, 측정 횟수(shots), Circuit Transpiler 최적화 수준, Qubit 레이아웃에 관한 사용자 정의 정보도 입력 섹션에서 가져온다는 점입니다.

정의된 활성 공간 내에서 단전자 및 이전자 적분이 생성됩니다. 이 적분은 고전적인 CCSD 계산을 수행하는 데 사용되며, LUCJ Circuit을 매개변수화하는 데 사용할 t2 진폭을 반환합니다.

2. Circuit 최적화

LUCJ Circuit은 대상 하드웨어용 ISA Circuit으로 Transpile됩니다. 그런 다음 Sampler 프리미티브가 실행을 관리하기 위한 기본 오류 완화 옵션 세트와 함께 인스턴스화됩니다.

3. Circuit 실행

LUCJ 계산은 각 측정에 대한 비트 문자열을 반환하며, 이 비트 문자열은 연구 중인 시스템의 전자 배치에 해당합니다. 비트 문자열은 후처리의 입력으로 사용됩니다.

4. SQD를 사용한 후처리

이 마지막 단계는 HF IEF-PCM 정보가 포함된 PySCF 체크포인트 파일, LUCJ에 의해 예측된 전자 배치를 나타내는 비트 문자열, 그리고 입력 섹션에서 선택된 사용자 정의 SQD 옵션을 입력으로 받습니다. 출력으로는 최저 에너지 배치의 SQD IEF-PCM 총 에너지와 해당 용매화 자유 에너지를 생성합니다.

옵션

이 템플릿을 사용하려면 LUCJ Circuit 생성을 위한 옵션과 SQD 실행 매개변수를 지정해야 합니다.

LUCJ 옵션

LUCJ 양자 Circuit이 실행되면 분자 시스템의 확률 분포에서 계산 기저 상태를 나타내는 샘플 집합이 생성됩니다. LUCJ Circuit의 깊이와 표현력의 균형을 맞추기 위해, 반대 스핀의 스핀 오비탈에 해당하는 Qubit는 단일 앙실라 Qubit를 통해 이웃할 때 두-Qubit 게이트가 적용됩니다. heavy-hex 토폴로지를 가진 IBM 하드웨어에서 이 방법을 구현하기 위해, 동일한 스핀의 스핀 오비탈을 나타내는 Qubit는 대상 하드웨어의 heavy-hex 연결성으로 인해 각 라인이 지그재그 형태를 취하는 라인 토폴로지로 연결되며, 반대 스핀의 스핀 오비탈을 나타내는 Qubit는 네 번째 Qubit마다만 연결됩니다.

필수 옵션에 대한 자세한 내용을 보려면 클릭하세요:

사용자는 SQD IEF-PCM 함수의 lucj_options 섹션에서 이 지그재그 패턴을 충족하는 Qubit에 해당하는 initial_layout 배열을 제공해야 합니다. 메탄올의 SQD IEF-PCM (14e,12o)/cc-pvdz 시뮬레이션의 경우, Eagle R3 QPU의 주 대각선에 해당하는 초기 Qubit 레이아웃을 선택했습니다. 여기서 initial_layout 배열의 처음 12개 요소 [0, 14, 18, 19, 20, 33, 39, 40, 41, 53, 60, 61, ...]는 알파 스핀 오비탈에 해당합니다. 마지막 12개 요소 [... 2, 3, 4, 15, 22, 23, 24, 34, 43, 44, 45, 54]는 베타 스핀 오비탈에 해당합니다.

중요하게도, 사용자는 LUCJ Circuit의 측정 횟수에 해당하는 number_of_shots를 결정해야 합니다. S-CORE 절차의 첫 번째 단계는 바닥 상태 점유 수 분포의 초기 근사를 얻기 위해 올바른 입자 섹터의 샘플에 의존하기 때문에, 측정 횟수는 충분히 커야 합니다.

측정 횟수는 시스템과 하드웨어에 따라 크게 달라지지만, 비공유 결합, 단편 기반, 암묵적 용매 SQD 연구에서는 다음 지침을 따르면 화학적 정확도에 도달할 수 있음을 제안합니다:

  • 분자 오비탈이 16개 미만(스핀 오비탈 32개)인 시스템의 경우 20,000 - 200,000 shots
  • 분자 오비탈이 16 - 18개인 시스템의 경우 200,000 shots
  • 분자 오비탈이 18개 초과인 시스템의 경우 200,000 - 2,000,000 shots

필요한 측정 횟수는 연구 중인 시스템의 스핀 오비탈 수와 해당 시스템 내 선택된 활성 공간에 해당하는 힐베르트 공간의 크기에 따라 영향을 받습니다. 일반적으로 힐베르트 공간이 작은 경우 더 적은 수의 측정이 필요합니다. 사용 가능한 다른 LUCJ 옵션으로는 Circuit Transpiler 최적화 수준오류 억제 옵션이 있습니다. 이 옵션들도 필요한 측정 횟수와 결과 정확도에 영향을 미친다는 점에 유의하세요.

SQD 옵션

SQD 시뮬레이션의 중요한 옵션으로는 sqd_iterations, number_of_batches, samples_per_batch가 있습니다. 일반적으로 배치당 샘플 수가 적을 경우, 더 많은 배치(number_of_batches)와 S-CORE의 더 많은 반복(sqd_iterations)으로 보완할 수 있습니다. 배치가 많을수록 구성 부분 공간의 더 많은 변형을 샘플링할 수 있습니다. 최저 에너지 배치가 시스템의 바닥 상태 에너지 해로 사용되므로, 배치가 많을수록 더 좋은 통계를 통해 결과를 개선할 수 있습니다. S-CORE의 추가 반복은 올바른 입자 섹터의 샘플 수가 적은 경우 원래 LUCJ 분포에서 더 많은 구성을 복원하는 데 도움이 됩니다. 이를 통해 배치당 샘플 수를 줄일 수 있습니다.

SQD 옵션 구성에 대한 자세한 내용은 클릭하세요:

대안적인 전략은 배치당 더 많은 샘플을 사용하는 것으로, 이는 S-CORE 절차 중 올바른 입자 공간의 초기 LUCJ 샘플 대부분이 사용되도록 하고, 개별 부분 공간이 충분한 다양한 전자 구성을 포함하도록 합니다. 이를 통해 필요한 S-CORE 단계의 수를 줄일 수 있으며, 배치당 샘플 수가 충분히 크면 SQD 반복이 두 번 또는 세 번만 필요합니다. 그러나 배치당 샘플이 많을수록 각 대각화 단계의 계산 비용이 높아집니다. 따라서 SQD 시뮬레이션에서 정확도와 계산 비용 간의 균형은 sqd_iterations, number_of_batches, samples_per_batch를 최적으로 선택함으로써 달성할 수 있습니다.

SQD IEF-PCM 연구에 따르면, S-CORE의 세 번 반복을 사용하는 경우 다음 지침을 따르면 화학적 정확도에 도달할 수 있습니다:

  • 메탄올 SQD IEF-PCM (14e,12o) 시뮬레이션에서 배치당 600개 샘플
  • 메틸아민 SQD IEF-PCM (14e,13o) 시뮬레이션에서 배치당 1500개 샘플
  • 물 SQD IEF-PCM (8e,23o) 시뮬레이션에서 배치당 6000개 샘플
  • 에탄올 SQD IEF-PCM (20e,18o) 시뮬레이션에서 배치당 16000개 샘플

LUCJ에서 필요한 측정 횟수와 마찬가지로, S-CORE 절차에서 사용되는 배치당 필요한 샘플 수도 시스템과 하드웨어에 따라 크게 달라집니다. 위의 예시는 배치당 필요한 샘플 수의 벤치마크를 위한 초기 시작점을 추정하는 데 사용할 수 있습니다. 배치당 필요한 샘플 수의 체계적인 벤치마크에 관한 튜토리얼은 여기에서 찾을 수 있습니다.

템플릿 SQD IEF-PCM 함수 배포 및 실행

# Added by doQumentation — required packages for this notebook
!pip install -q ffsim numpy pyscf qiskit qiskit-addon-sqd qiskit-ibm-catalog qiskit-ibm-runtime qiskit-serverless solve-solvent

인증

qiskit-ibm-catalog을 사용하여 IBM Quantum Platform 대시보드에서 찾을 수 있는 API 키(토큰)로 QiskitServerless에 인증합니다. 이를 통해 선택한 함수를 업로드하거나 실행하기 위한 serverless 클라이언트를 인스턴스화할 수 있습니다:

from qiskit_ibm_catalog import QiskitServerless

serverless = QiskitServerless(
channel="ibm_quantum_platform",
instance="INSTANCE_CRN",
token="YOUR_API_KEY" # Use the 44-character API_KEY you created and saved from the IBM Quantum Platform Home dashboard
)

선택적으로, save_account()를 사용하여 로컬 환경에 자격 증명을 저장할 수 있습니다(IBM Cloud 계정 설정 가이드 참조). 이 명령은 자격 증명을 QiskitRuntimeService.save_account()와 동일한 파일에 기록한다는 점에 유의하세요:

QiskitServerless.save_account(token="YOUR_API_KEY", channel="ibm_quantum_platform", instance="INSTANCE_CRN")

계정이 저장된 경우 인증을 위한 토큰을 제공할 필요가 없습니다:

from qiskit_ibm_catalog import QiskitServerless

serverless = QiskitServerless()

템플릿 업로드

사용자 정의 Qiskit Function을 업로드하려면 먼저 함수 소스 코드를 정의하는 QiskitFunction 객체를 인스턴스화해야 합니다. 제목은 원격 클러스터에 있는 함수를 식별하는 데 사용됩니다. 주 진입점은 if __name__ == "__main__"을 포함하는 파일입니다. 워크플로에 추가적인 소스 파일이 필요한 경우, 진입점과 함께 업로드될 작업 디렉토리를 정의할 수 있습니다.

from qiskit_ibm_catalog import QiskitFunction

template = QiskitFunction(
title="sqd_pcm_template",
entrypoint="sqd_pcm_entrypoint.py",
working_dir="./source_files/", # all files in this directory will be uploaded
dependencies=[
"ffsim==0.0.54",
"pyscf==2.9.0",
"qiskit_addon_sqd==0.10.0",
],
)
print(template)
QiskitFunction(sqd_pcm_template)

인스턴스가 준비되면 serverless에 업로드합니다:

serverless.upload(template)
QiskitFunction(sqd_pcm_template)

프로그램이 성공적으로 업로드되었는지 확인하려면 serverless.list()를 사용합니다:

serverless.list()
[QiskitFunction(sqd_pcm_template),
QiskitFunction(hamiltonian_simulation_template)]

원격으로 템플릿 로드 및 실행

함수 템플릿이 업로드되었으므로 Qiskit Serverless를 사용하여 원격으로 실행할 수 있습니다. 먼저 이름으로 템플릿을 로드합니다:

template = serverless.load("sqd_pcm_template")
print(template)
QiskitFunction(sqd_pcm_template)

다음으로, SQD-IEF PCM의 도메인 수준 입력으로 템플릿을 실행합니다. 이 예시에서는 메탄올 기반 워크로드를 지정합니다.

molecule = {
"atom": """
O -0.04559 -0.75076 -0.00000;
C -0.04844 0.65398 -0.00000;
H 0.85330 -1.05128 -0.00000;
H -1.08779 0.98076 -0.00000;
H 0.44171 1.06337 0.88811;
H 0.44171 1.06337 -0.88811
""", # Must be specified
"basis": "cc-pvdz", # default is "sto-3g"
"spin": 0, # default is 0
"charge": 0, # default is 0
"verbosity": 0, # default is 0
"number_of_active_orb": 12, # Must be specified
"number_of_active_alpha_elec": 7, # Must be specified
"number_of_active_beta_elec": 7, # Must be specified
"avas_selection": [
"%d O %s" % (k, x) for k in [0] for x in ["2s", "2px", "2py", "2pz"]
]
+ ["%d C %s" % (k, x) for k in [1] for x in ["2s", "2px", "2py", "2pz"]]
+ ["%d H 1s" % k for k in [2, 3, 4, 5]], # default is None
}

solvent_options = {
"method": "IEF-PCM", # other available methods are COSMO, C-PCM, SS(V)PE, see https://manual.q-chem.com/5.4/topic_pcm-em.html
"eps": 78.3553, # value for water
}

lucj_options = {
"initial_layout": [
0,
14,
18,
19,
20,
33,
39,
40,
41,
53,
60,
61,
2,
3,
4,
15,
22,
23,
24,
34,
43,
44,
45,
54,
],
"dynamical_decoupling_choice": True,
"twirling_choice": True,
"number_of_shots": 200000,
"optimization_level": 2,
}

sqd_options = {
"sqd_iterations": 3,
"number_of_batches": 10,
"samples_per_batch": 1000,
"max_davidson_cycles": 200,
}

backend_name = "ibm_sherbrooke"
job = template.run(
backend_name=backend_name,
molecule=molecule,
solvent_options=solvent_options,
lucj_options=lucj_options,
sqd_options=sqd_options,
)
print(job.job_id)
39f8fb70-79b2-43ca-b723-84e6b6135821

작업의 상세 상태를 확인합니다:

import time

t0 = time.time()
status = job.status()
if status == "QUEUED":
print(f"time = {time.time()-t0:.2f}, status = QUEUED")
while True:
status = job.status()
if status == "QUEUED":
continue
print(f"time = {time.time()-t0:.2f}, status = {status}")
if status == "DONE" or status == "ERROR":
break
time = 2.35, status = DONE

작업이 실행되는 동안 logger.info 출력에서 생성된 로그를 가져올 수 있습니다. 이 로그는 SQD IEF-PCM 워크플로의 진행 상황에 대한 유용한 정보를 제공합니다. 예를 들어, 동일 스핀 오비탈 연결이나 하드웨어에서 실행하기 위한 최종 ISA Circuit의 두-Qubit 깊이 등을 확인할 수 있습니다.

print(job.logs())

작업 결과를 요청하면 결과가 사용 가능해질 때까지 프로그램의 나머지 부분이 차단됩니다. 작업이 완료된 후 결과를 검색할 수 있습니다. 여기에는 용매화 자유 에너지뿐만 아니라 최저 에너지 배치, 최저 에너지 값, 총 솔버 소요 시간 등의 유용한 정보가 포함됩니다.

result = job.result()

result
{'total_energy_hist': array([[-115.14768518, -115.1368396 , -114.19181692, -115.13745429,
-115.1445012 , -114.19673326, -115.1547003 , -114.20563866,
-115.13748344, -115.14764974],
[-115.15768392, -115.15850126, -115.15857275, -115.15770916,
-115.15801684, -115.15822125, -115.15833521, -115.15844051,
-115.15735538, -115.15862354],
[-115.15795148, -115.15847925, -115.15856677, -115.15811156,
-115.15815602, -115.15785171, -115.1583672 , -115.1585533 ,
-115.15833528, -115.15808791]]),
'spin_squared_value_hist': array([[5.37327508e-03, 1.32981759e-02, 1.36214922e-02, 8.84413615e-03,
7.26723578e-03, 1.94875195e-02, 3.03153152e-03, 6.07543106e-03,
1.04951849e-02, 5.36529204e-03],
[6.39397528e-04, 1.36814350e-04, 9.09054260e-05, 5.99361358e-04,
3.64261739e-04, 2.54905866e-04, 2.32540370e-04, 1.53181990e-04,
7.23519739e-04, 6.80737671e-05],
[4.53776416e-04, 1.63043449e-04, 1.05317263e-04, 3.82912836e-04,
3.41047803e-04, 5.18620393e-04, 2.06819142e-04, 1.17086537e-04,
2.32357159e-04, 4.26071537e-04]]),
'solvation_free_energy_hist': array([[-0.00725018, -0.00743955, -0.01132905, -0.0073377 , -0.00722221,
-0.01136705, -0.00719279, -0.01072829, -0.00733404, -0.00725961],
[-0.00719252, -0.00718315, -0.00718074, -0.00719325, -0.00717703,
-0.00718391, -0.00718354, -0.00717928, -0.00719887, -0.0071801 ],
[-0.00719351, -0.00718255, -0.00718198, -0.00718429, -0.00718349,
-0.00718329, -0.0071882 , -0.00718363, -0.00718549, -0.00718814]]),
'occupancy_hist': [[array([0.99712298, 0.99278936, 0.99083163, 0.97328469, 0.98959809,
0.98922134, 0.720333 , 0.25683194, 0.01939338, 0.02840332,
0.00946988, 0.0327204 ]),
array([0.99712298, 0.99278936, 0.99083163, 0.97328469, 0.98959809,
0.98922134, 0.720333 , 0.25683194, 0.01939338, 0.02840332,
0.00946988, 0.0327204 ])],
[array([0.9959042 , 0.9922607 , 0.99018862, 0.99265843, 0.98927447,
0.9900833 , 0.99403876, 0.00989025, 0.01120814, 0.01137717,
0.01152871, 0.01158725]),
array([0.9959042 , 0.9922607 , 0.99018862, 0.99265843, 0.98927447,
0.9900833 , 0.99403876, 0.00989025, 0.01120814, 0.01137717,
0.01152871, 0.01158725])],
[array([0.99590079, 0.99222193, 0.99016753, 0.99265045, 0.98927264,
0.99007179, 0.99407207, 0.00986684, 0.01125181, 0.01141439,
0.01150733, 0.01160243]),
array([0.99590079, 0.99222193, 0.99016753, 0.99265045, 0.98927264,
0.99007179, 0.99407207, 0.00986684, 0.01125181, 0.01141439,
0.01150733, 0.01160243])]],
'lowest_energy_batch': 2,
'lowest_energy_value': -115.1585667736213,
'solvation_free_energy': -0.007181981952470838,
'sci_solver_total_duration': 493.997501373291,
'metadata': {'resources_usage': {'RUNNING: MAPPING': {'CPU_TIME': 6.080063343048096},
'RUNNING: OPTIMIZING_FOR_HARDWARE': {'CPU_TIME': 1.999896764755249},
'RUNNING: WAITING_FOR_QPU': {'CPU_TIME': 6.2850868701934814},
'RUNNING: EXECUTING_QPU': {'QPU_TIME': 21.639373540878296},
'RUNNING: POST_PROCESSING': {'CPU_TIME': 495.40831995010376}},
'num_iterations_executed': 3}}

결과 메타데이터에는 각 워크로드에 필요한 QPU 및 CPU 시간을 더 잘 추정할 수 있는 리소스 사용량 요약이 포함되어 있습니다(이 예시는 더미 장치에서 실행되었으므로 실제 리소스 사용 시간은 다를 수 있습니다). 작업이 완료되면 전체 로그 출력을 사용할 수 있습니다.

print(job.logs())
2025-06-27 08:42:41,358	INFO job_manager.py:531 -- Runtime env is setting up.
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:42:45,015: Starting runtime service
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:42:45,621: Backend: ibm_sherbrooke
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:42:46,809: Initializing molecule object
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:42:51,599: Performing CCSD
Parsing /tmp/ray/session_2025-06-27_08-42-13_898146_1/runtime_resources/working_dir_files/_ray_pkg_4bc93dcc58c04b91/output_sqd_pcm/2025-06-27_08-42-45.fcidump.txt
Overwritten attributes get_ovlp get_hcore of <class 'pyscf.scf.hf_symm.SymAdaptedRHF'>
/usr/local/lib/python3.11/site-packages/pyscf/gto/mole.py:1293: UserWarning: Function mol.dumps drops attribute energy_nuc because it is not JSON-serializable
warnings.warn(msg)
/usr/local/lib/python3.11/site-packages/pyscf/gto/mole.py:1293: UserWarning: Function mol.dumps drops attribute intor_symmetric because it is not JSON-serializable
warnings.warn(msg)
converged SCF energy = -115.049680672847
E(CCSD) = -115.1519910037652 E_corr = -0.1023103309180226
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:42:51,694: Same spin orbital connections: [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9), (9, 10), (10, 11)]
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:42:51,694: Opposite spin orbital connections: [(0, 0), (4, 4), (8, 8)]
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:42:53,718: Optimization level: 2, ops: OrderedDict([('rz', 2438), ('sx', 1496), ('ecr', 766), ('x', 185), ('measure', 24), ('barrier', 1)]), depth: 391
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:42:53,736: Two-qubit gate depth: 94
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:42:53,737: Submitting sampler job
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:42:54,273: Job ID: d1f5j3lqbivc73ebqpj0
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:42:54,313: Job Status: QUEUED
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:43:24,813: Starting configuration recovery iteration 0
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:43:24,841: Batch 0 subspace dimension: 531441
2025-06-27 08:43:24,844 INFO worker.py:1588 -- Using address 172.17.16.124:6379 set in the environment variable RAY_ADDRESS
2025-06-27 08:43:24,847 INFO worker.py:1723 -- Connecting to existing Ray cluster at address: 172.17.16.124:6379...
2025-06-27 08:43:24,876 INFO worker.py:1908 -- Connected to Ray cluster. View the dashboard at http://172.17.16.124:8265 
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:43:24,945: Batch 1 subspace dimension: 519841
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:43:24,950: Batch 2 subspace dimension: 543169
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:43:24,955: Batch 3 subspace dimension: 532900
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:43:24,960: Batch 4 subspace dimension: 534361
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:43:24,964: Batch 5 subspace dimension: 531441
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:43:24,969: Batch 6 subspace dimension: 540225
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:43:24,974: Batch 7 subspace dimension: 524176
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:43:24,979: Batch 8 subspace dimension: 537289
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:43:24,983: Batch 9 subspace dimension: 540225
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:09,006: Lowest energy batch: 6
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:09,007: Lowest energy value: -115.15470029849135
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:09,007: Corresponding g_solv value: -0.0071927910374866375
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:09,007: -----------------------------------
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:09,007: Starting configuration recovery iteration 1
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:40,564: Batch 0 subspace dimension: 413449
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:40,572: Batch 1 subspace dimension: 399424
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:40,578: Batch 2 subspace dimension: 438244
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:40,583: Batch 3 subspace dimension: 422500
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:40,589: Batch 4 subspace dimension: 409600
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:40,596: Batch 5 subspace dimension: 404496
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:40,601: Batch 6 subspace dimension: 410881
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:40,605: Batch 7 subspace dimension: 442225
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:40,611: Batch 8 subspace dimension: 409600
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:48:40,618: Batch 9 subspace dimension: 405769
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:49:54,917: Lowest energy batch: 9
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:49:54,917: Lowest energy value: -115.15862353596414
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:49:54,917: Corresponding g_solv value: -0.0071800982859467006
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:49:54,918: -----------------------------------
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:49:54,918: Starting configuration recovery iteration 2
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:50:25,501: Batch 0 subspace dimension: 399424
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:50:25,508: Batch 1 subspace dimension: 412164
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:50:25,514: Batch 2 subspace dimension: 432964
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:50:25,519: Batch 3 subspace dimension: 400689
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:50:25,524: Batch 4 subspace dimension: 432964
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:50:25,529: Batch 5 subspace dimension: 418609
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:50:25,533: Batch 6 subspace dimension: 418609
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:50:25,538: Batch 7 subspace dimension: 425104
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:50:25,543: Batch 8 subspace dimension: 404496
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:50:25,548: Batch 9 subspace dimension: 429025
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:51:37,900: Lowest energy batch: 2
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:51:37,900: Lowest energy value: -115.1585667736213
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:51:37,901: Corresponding g_solv value: -0.007181981952470838
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:51:37,901: -----------------------------------
sqd_pcm_entrypoint.run_function:INFO:2025-06-27 08:51:37,901: SCI_solver totally takes: 493.997501373291 seconds

다음 단계

추천 사항

참고문헌

[1] Danil Kaliakin, Akhil Shajan, Fangchun Liang, and Kenneth M. Merz Jr. Implicit Solvent Sample-Based Quantum Diagonalization, The Journal of Physical Chemistry B, 2025, DOI: 10.1021/acs.jpcb.5c01030

[2] Javier Robledo-Moreno, et al., Chemistry Beyond Exact Solutions on a Quantum-Centric Supercomputer, arXiv:2405.05068 [quant-ph].

[3] Jeffery Yu, et al., Quantum-Centric Algorithm for Sample-Based Krylov Diagonalization, arXiv:2501.09702 [quant-ph].

[4] Keita Kanno, et al., Quantum-Selected Configuration Interaction: classical diagonalization of Hamiltonians in subspaces selected by quantum computers, arXiv:2302.11320 [quant-ph].

[5] Kenji Sugisaki, et al., Hamiltonian simulation-based quantum-selected configuration interaction for large-scale electronic structure calculations with a quantum computer, arXiv:2412.07218 [quant-ph].

[6] Mathias Mikkelsen, Yuya O. Nakagawa, Quantum-selected configuration interaction with time-evolved state, arXiv:2412.13839 [quant-ph].

[7] Herbert, John M. Dielectric continuum methods for quantum chemistry. WIREs Computational Molecular Science, 2021, 11, 1759-0876.

[8] Saki, A. A.; Barison, S.; Fuller, B.; Garrison, J. R.; Glick, J. R.; Johnson, C.; Mezzacapo, A.; Robledo-Moreno, J.; Rossmannek, M.; Schweigert, P. et al. Qiskit addon: sample-based quantum diagonalization, 2024; https://github.com/Qiskit/qiskit-addon-sqd

[9] Asun, Q.; Zhang, X.; Banerjee, S.; Bao, P.; Barbry, M.; Blunt, N. S.; Bogdanov, N. A.; Booth, G. H.; Chen, J.; Cui, Z.-H. PySCF: Python-based Simulations of Chemistry Framework, 2025; https://github.com/pyscf/pyscf

[10] Kevin J. Sung; et al., FFSIM: Faster simulations of fermionic quantum circuits, 2024. https://github.com/qiskit-community/ffsim

%%writefile ./source_files/__init__.py
%%writefile ./source_files/solve_solvent.py

# This code is part of a Qiskit project.
#
# (C) Copyright IBM and Cleveland Clinic 2025
#
# This code is licensed under the Apache License, Version 2.0. You may
# obtain a copy of this license in the LICENSE.txt file in the root directory
# of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.
#
# Any modifications or derivative works of this code must retain this
# copyright notice, and modified files need to carry a notice indicating
# that they have been altered from the originals.

"""Functions for the study of fermionic systems."""

from __future__ import annotations

import warnings

import numpy as np

# DSK Add imports needed for CASCI wrapper
from pyscf import ao2mo, scf, fci
from pyscf.mcscf import avas, casci
from pyscf.solvent import pcm
from pyscf.lib import chkfile, logger

from qiskit_addon_sqd.fermion import (
SCIState,
bitstring_matrix_to_ci_strs,
_check_ci_strs,
)

# DSK Below is the modified CASCI kernel compatible with SQD.
# It utilizes the "fci.selected_ci.kernel_fixed_space"
# as well as enables passing the "batch" and "max_davidson"
# input arguments from "solve_solvent".
# The "batch" contains the CI addresses corresponding to subspaces
# derived from LUCJ and S-CORE calculations.
# The "max_davidson" controls the maximum number of cycles of Davidson's algorithm.

# pylint: disable = unused-argument
def kernel(casci_object, mo_coeff=None, ci0=None, verbose=logger.NOTE, envs=None):
"""CASCI solver compatible with SQD.

Args:
casci_object: CASCI or CASSCF object.
In case of SQD, only CASCI instance is currently incorporated.

mo_coeff : ndarray
orbitals to construct active space Hamiltonian.
In context of SQD, these are either AVAS mo_coeff
or all of the MOs (with option to exclude core MOs).

ci0 : ndarray or custom types FCI solver initial guess.
For SQD the usage of ci0 was not tested.

For external FCI-like solvers, it can be
overloaded different data type. For example, in the state-average
FCI solver, ci0 is a list of ndarray. In other solvers such as
DMRGCI solver, SHCI solver, ci0 are custom types.

kwargs:
envs: dict
In case of SQD this option was not explored,
but in principle this can facilitate the incorporation of the external solvers.

The variable envs is created (for PR 807) to passes MCSCF runtime
environment variables to SHCI solver. For solvers which do not
need this parameter, a kwargs should be created in kernel method
and "envs" pop in kernel function.
"""
if mo_coeff is None:
mo_coeff = casci_object.mo_coeff
if ci0 is None:
ci0 = casci_object.ci

log = logger.new_logger(casci_object, verbose)
t0 = (logger.process_clock(), logger.perf_counter())
log.debug("Start CASCI")

ncas = casci_object.ncas
nelecas = casci_object.nelecas

# The start of SQD version of kernel
# DSK add the read of configurations for batch
ci_strs_sqd = casci_object.batch

# DSK add the input for the maximum number of cycles of Davidson's algorithm
max_davidson = casci_object.max_davidson

# DSK add electron up and down count and norb = ncas
n_up = nelecas[0]
n_dn = nelecas[1]
norb = ncas

# DSK Eigenstate solver info
sqd_verbose = verbose

# DSK ERI read
eri_cas = ao2mo.restore(1, casci_object.get_h2eff(), casci_object.ncas)
t1 = log.timer("integral transformation to CAS space", *t0)

# DSK 1e integrals
h1eff, energy_core = casci_object.get_h1eff()
log.debug("core energy = %.15g", energy_core)
t1 = log.timer("effective h1e in CAS space", *t1)

if h1eff.shape[0] != ncas:
raise RuntimeError(
"Active space size error. nmo=%d ncore=%d ncas=%d" # pylint: disable=consider-using-f-string
% (mo_coeff.shape[1], casci_object.ncore, ncas)
)

# DSK fcisolver needs to be defined in accordance with SQD
# in this software stack it is done in the "solve_solvent" portion of the code.
myci = casci_object.fcisolver
e_cas, sqdvec = fci.selected_ci.kernel_fixed_space(
myci,
h1eff,
eri_cas,
norb,
(n_up, n_dn),
ci_strs=ci_strs_sqd,
verbose=sqd_verbose,
max_cycle=max_davidson,
)

# DSK fcivec is the general name for CI vector assigned by PySCF.
# Depending on type of solver it is either FCI or SCI vector.
# In case of sqd we can call it "sqdvec" for clarity.
# Nonetheless, for further processing PySCF expects
# this data structure to be called fcivec, regardless of the used solver.

fcivec = sqdvec

t1 = log.timer("CI solver", *t1)
e_tot = energy_core + e_cas

# Returns either standard CASCI data or SQD data. Return depends on "sqd_run" True/False.
return e_tot, e_cas, fcivec

# Replace standard CASCI kernel with the SQD-compatible CASCI kernel defined above
casci.kernel = kernel

def solve_solvent(
bitstring_matrix: tuple[np.ndarray, np.ndarray] | np.ndarray,
/,
myeps: float,
mysolvmethod: str,
myavas: list,
num_orbitals: int,
*,
spin_sq: int | None = None,
max_davidson: int = 100,
verbose: int | None = 0,
checkpoint_file: str,
) -> tuple[float, SCIState, list[np.ndarray], float]:
"""Approximate the ground state given molecular integrals and a set of electronic configurations.

Args:
bitstring_matrix: A set of configurations defining the subspace onto which the Hamiltonian
will be projected and diagonalized. This is a 2D array of ``bool`` representations of bit
values such that each row represents a single bitstring. The spin-up configurations
should be specified by column indices in range ``(N, N/2]``, and the spin-down
configurations should be specified by column indices in range ``(N/2, 0]``, where ``N``
is the number of qubits.

(DEPRECATED) The configurations may also be specified by a length-2 tuple of sorted 1D
arrays containing unsigned integer representations of the determinants. The two lists
should represent the spin-up and spin-down orbitals, respectively.

To build PCM model PySCF needs the structure of the molecule. Hence, the electron integrals
(hcore and eri) are not enough to form IEF-PCM simulation. Instead the "start.chk" file is used.
This workflow also requires additional information about solute and solvent,
which is reflected by additional arguments below

myeps: Dielectric parameter of the solvent.
mysolvmethod: Solvent model, which can be IEF-PCM, COSMO, C-PCM, SS(V)PE,
see https://manual.q-chem.com/5.4/topic_pcm-em.html
At the moment only IEF-PCM was tested.
In principle two other models from PySCF "solvent" module can be used as well,
namely SMD and polarizable embedding (PE).
The SMD and PE were not tested yet and their usage requires addition of more
input arguments for "solve_solvent".
myavas: This argument allows user to select active space in solute with AVAS.
The corresponding list should include target atomic orbitals.
If myavas=None, then active space selected based on number of orbitals
derived from ci_strs.
It is assumed that if myavas=None, then the target calculation is either
a) corresponds to full basis case.
b) close to full basis case and only few core orbitals are excluded.
num_orbitals: Number of orbitals, which is essential when myavas = None.
In AVAS case number of orbitals and electrons is derived by AVAS procedure itself.
spin_sq: Target value for the total spin squared for the ground state.
If ``None``, no spin will be imposed.
max_davidson: The maximum number of cycles of Davidson's algorithm
verbose: A verbosity level between 0 and 10
checkpoint_file: Name of the checkpoint file

NOTE: For now open shell functionality is not supported in SQD PCM calculations.
Hence, at the moment solve_solvent does not include open_shell as one of the arguments.

Returns:
- Minimum energy from SCI calculation
- The SCI ground state
- Average occupancy of the alpha and beta orbitals, respectively
- Expectation value of spin-squared
- Solvation free energy

"""
# Unlike the "solve_fermion", the "solve_solvent" utilizes the "checkpoint" file to
# get the starting HF information, which means that "solve_solvent" does not accept
# "hcore" and "eri" as the input arguments.
# Instead "hcore" and "eri" are generated inside of the custom SQD-compatible
# CASCI kernel (defined above).
# The generation of "hcore" and "eri" is based on the information from "checkpoint" file
# as well as "myavas" and "num_orbitals" input arguments.

# DSK this part handles addresses and is identical to "solve_fermion"
if isinstance(bitstring_matrix, tuple):
warnings.warn(
"Passing the input determinants as integers is deprecated. "
"Users should instead pass a bitstring matrix defining the subspace.",
DeprecationWarning,
stacklevel=2,
)
ci_strs = bitstring_matrix
else:
# This will become the default code path after the deprecation period.
ci_strs = bitstring_matrix_to_ci_strs(bitstring_matrix, open_shell=False)
ci_strs = _check_ci_strs(ci_strs)

num_up = format(ci_strs[0][0], "b").count("1")
num_dn = format(ci_strs[1][0], "b").count("1")

# DSK assign verbosity
verbose_ci = verbose

# DSK add information about solute and solvent.
# Since PCM model needs the information about the structure of the molecule
# one cannot use only FCIDUMP. Instead converged HF data can be passed from "checkpoint" file
# along with "mol" object containing the geometry and other information about the solute.

############################################
# This section is specific to "solve_solvent" and is not present in "solve_fermion".
# In case of "solve_fermion" the "eri" and "hcore" are passed directly to
# "fci.selected_ci.kernel_fixed_space".
# In case of "solve_solvent" the incorporation of the polarizable continuum model
# requires utilization of "CASCI.with_solvent"
# data object from PySCF, where underlying CASCI.base_kernel has to be replaced
# with SQD-compatible version.
# Due to these differences in the implementation the "solve_solvent" recovers
# the converged mean field results and "molecule" object from "checkpoint" file
# (instead of using FCIDUMP),
# followed by passing of solute, solvent, and active space information to "CASCI.with_solvent".
# This includes the initiation of "mol", "cm", "mf", and "mc" data structures.

mol = chkfile.load_mol(checkpoint_file)

# DSK Initiation of the solvent model
cm = pcm.PCM(mol)
cm.eps = myeps # solute eps value
cm.method = mysolvmethod # IEF-PCM, COSMO, C-PCM, SS(V)PE,
# see https://manual.q-chem.com/5.4/topic_pcm-em.html

# DSK Read-in converged RHF solution
scf_result_dic = chkfile.load(checkpoint_file, "scf")
mf = scf.RHF(mol).PCM(cm)
mf.__dict__.update(scf_result_dic)

# Identify the active space based on the user input of AVAS or number of orbitals and electrons
if myavas is not None:
orbs = myavas
avas_obj = avas.AVAS(mf, orbs, with_iao=True)
avas_obj.kernel()
ncas, nelecas, _, _, _ = (
avas_obj.ncas,
avas_obj.nelecas,
avas_obj.mo_coeff,
avas_obj.occ_weights,
avas_obj.vir_weights,
)
else:
ncas = num_orbitals
nelecas = (num_up, num_dn)

# Initiate the "CASCI.with_solvent" object
mc = casci.CASCI(mf, ncas=ncas, nelecas=nelecas).PCM(cm)
# Replace mo_coeff with ones produced by AVAS if AVAS is utilized
if myavas is not None:
mc.mo_coeff = avas_obj.mo_coeff
# Read-in the configuration interaction subspace derived from LUCJ and S-CORE
mc.batch = ci_strs
# Assign number of maximum Davidson steps
mc.max_davidson = max_davidson

####### The definition of "fcisolver" object is identical to "solve_fermion" case ########
myci = fci.selected_ci.SelectedCI()
if spin_sq is not None:
myci = fci.addons.fix_spin_(myci, ss=spin_sq)
mc.fcisolver = myci
mc.verbose = verbose_ci
#########################################################################################

# Initiate the "CASCI.with_solvent" simulation with SQD-compatible based CASCI kernel.
mc_result = mc.kernel()

# Get data out of the "CASCI.with_solvent" object
e_sci = mc_result[0]
sci_vec = mc_result[2]
# Here we get additional output comparing to "solve_fermion",
# which is the solvation free energy (G_solv)
g_solv = mc.with_solvent.e

#####################################################
# The remainder of the code in solve_solvent is nearly identical to solve_fermion code.

# However, there are two exceptions in "solve_solvent":

# 1) The dm2 is currently not computed, but can be included if needed
# 2) e_sci is directly output as the result of CASCI.with_solvent object.

# Hence, the two following lines of code are not present in "solve_solvent"
# comparing to the "solve_fermion" code:

# dm2 = myci.make_rdm2(sci_vec, norb, (num_up, num_dn))
# e_sci = np.einsum("pr,pr->", dm1, hcore) + 0.5 * np.einsum("prqs,prqs->", dm2, eri)

# Calculate the avg occupancy of each orbital
dm1 = myci.make_rdm1s(sci_vec, ncas, (num_up, num_dn))
avg_occupancy = [np.diagonal(dm1[0]), np.diagonal(dm1[1])]

# Compute total spin
spin_squared = myci.spin_square(sci_vec, ncas, (num_up, num_dn))[0]

# Convert the PySCF SCIVector to internal format. We access a private field here,
# so we assert that we expect the SCIVector output from kernel_fixed_space to
# have its _strs field populated with alpha and beta strings.
assert isinstance(sci_vec._strs[0], np.ndarray) and isinstance(sci_vec._strs[1], np.ndarray)
assert sci_vec.shape == (len(sci_vec._strs[0]), len(sci_vec._strs[1]))
sci_state = SCIState(
amplitudes=np.array(sci_vec),
ci_strs_a=sci_vec._strs[0],
ci_strs_b=sci_vec._strs[1],
)

return e_sci, sci_state, avg_occupancy, spin_squared, g_solv
%%writefile ./source_files/sqc_pcm_entrypoint.py

# This code is part of a Qiskit project.
#
# (C) Copyright IBM and Cleveland Clinic 2025
#
# This code is licensed under the Apache License, Version 2.0. You may
# obtain a copy of this license in the LICENSE.txt file in the root directory
# of this source tree or at http://www.apache.org/licenses/LICENSE-2.0.
#
# Any modifications or derivative works of this code must retain this
# copyright notice, and modified files need to carry a notice indicating
# that they have been altered from the originals.

"""
SQD-PCM Function Template source code.
"""
from pathlib import Path
from typing import Any
from datetime import datetime
import os
import sys
import json
import logging
import time
import traceback
import numpy as np

import ffsim

from pyscf import gto, scf, mcscf, ao2mo, tools, cc
from pyscf.lib import chkfile
from pyscf.mcscf import avas
from pyscf.solvent import pcm

from qiskit import QuantumCircuit, QuantumRegister
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
from qiskit.primitives import BackendSamplerV2

from qiskit_addon_sqd.counts import counts_to_arrays
from qiskit_addon_sqd.configuration_recovery import recover_configurations
from qiskit_addon_sqd.fermion import bitstring_matrix_to_ci_strs
from qiskit_addon_sqd.subsampling import postselect_and_subsample

from qiskit_ibm_runtime import QiskitRuntimeService, SamplerV2
from qiskit_serverless import get_arguments, save_result, distribute_task, get, update_status, Job

current_dir = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, current_dir)
from solve_solvent import solve_solvent # pylint: disable=wrong-import-position

logger = logging.getLogger(__name__)

def run_function(
backend_name: str,
molecule: dict,
solvent_options: dict,
sqd_options: dict,
lucj_options: dict | None = None,
**kwargs,
) -> dict[str, Any]:
"""
Main entry point for the SQD-PCM (Polarizable Continuum Model) workflow.

This function encapsulates the end-to-end execution of the algorithm.

Args:
backend_name: Identifier for the target backend, required for all
workflows that access IBM Quantum hardware.

molecule: dictionary with molecule information:
- "atom" (str): required field, follows pyscf specification for atomic geometry.
For example, for methanol the value would be::

'''
O -0.04559 -0.75076 -0.00000;
C -0.04844 0.65398 -0.00000;
H 0.85330 -1.05128 -0.00000;
H -1.08779 0.98076 -0.00000;
H 0.44171 1.06337 0.88811;
H 0.44171 1.06337 -0.88811;
'''

- "number_of_active_orb" (int): required field
- "number_of_active_alpha_elec" (int): required field
- "number_of_active_beta_elec" (int): required field
- "basis" (str): optional field, default is "sto-3g"
- "verbosity" (int): optional field, default is 0
- "charge" (int): optional field, default is 0
- "spin" (int): optional field, default is 0
- "avas_selection" (list[str] | None): optional field, default is None

solvent_options: dictionary with solvent options information:
- "method" (str): required field. Method for computing solvent reaction field
for the PCM. Accepted values are: "IEF-PCM", "COSMO",
"C-PCM", "SS(V)PE", see https://manual.q-chem.com/5.4/topic_pcm-em.html
- "eps" (float): required field. Dielectric constant of the solvent in the PCM.

sqd_options: dictionary with sqd options information:
- "sqd_iterations" (int): required field.
- "number_of_batches" (int): required field.
- "samples_per_batch" (int): required field.
- "max_davidson_cycles" (int): required field.

lucj_options: optional dictionary with lucj options information:
- "optimization_level" (int): optional field, default is 2
- "initial_layout" (list[int]): optional field, default is None
- "dynamical_decoupling" (bool): optional field, default is True
- "twirling" (bool): optional field, default is True
- "number_of_shots" (int): optional field, default is 10000

**kwargs
Optional keyword arguments to customize behavior. Existing kwargs include:
- "files_name" (str): optional name for output files (enabled for local testing)
- "testing_backend" (FakeBackendV2): optional fake backend instance to bypass
qiskit runtime service instantiation (enabled for local testing)
- "count_dict_file_name" (str): path to a count dict file to bypass primitive
execution and jump directly to SQD section (enabled for local testing)

Returns:
The function should return the execution results as a dictionary with string keys.
This is to ensure compatibility with ``qiskit_serverless.save_result``.
"""

# Preparation Step: Input validation.
# Do this at the top of the function definition so it fails early if any required
# arguments are missing or invalid.

# Molecule parsing
# Required:
geo = molecule["atom"]
num_active_orb = molecule["number_of_active_orb"]
num_active_alpha = molecule["number_of_active_alpha_elec"]
num_active_beta = molecule["number_of_active_beta_elec"]
# Optional:
input_basis = molecule.get("basis", "sto-3g")
input_verbosity = molecule.get("verbosity", 0)
input_charge = molecule.get("charge", 0)
input_spin = molecule.get("spin", 0)
myavas = molecule.get("avas_selection", None)

# Solvent options parsing
myeps = solvent_options["eps"]
mymethod = solvent_options["method"]

# LUCJ options parsing
if lucj_options is None:
lucj_options = {}
opt_level = lucj_options.get("optimization_level", 2)
initial_layout = lucj_options.get("initial_layout", None)
use_dd = lucj_options.get("dynamical_decoupling", True)
use_twirling = lucj_options.get("twirling", True)
num_shots = lucj_options.get("number_of_shots", True)

# SQD options parsing
iterations = sqd_options["sqd_iterations"]
n_batches = sqd_options["number_of_batches"]
samples_per_batch = sqd_options["samples_per_batch"]
max_davidson_cycles = sqd_options["max_davidson_cycles"]

# kwarg parsing (local testing)
testing_backend = kwargs.get("testing_backend", None)
count_dict_file_name = kwargs.get("count_dict_file_name", None)

files_name = kwargs.get("files_name", datetime.now().strftime("%Y-%m-%d_%H-%M-%S"))
output_path = Path.cwd() / "output_sqd_pcm"
output_path.mkdir(exist_ok=True)
datafiles_name = str(output_path) + "/" + files_name

# --
# Preparation Step: Qiskit Runtime & primitive configuration for
# execution on IBM Quantum hardware.

if testing_backend is None:
# Initialize Qiskit Runtime Service
logger.info("Starting runtime service")
service = QiskitRuntimeService(
channel=os.environ["QISKIT_IBM_CHANNEL"],
instance=os.environ["IBM_CLOUD_INSTANCE"],
token=os.environ["your-API_KEY"], # Use the 44-character API_KEY you created and saved from the IBM Quantum Platform Home dashboard
)
backend = service.backend(backend_name)
logger.info(f"Backend: {backend.name}")

# Set up sampler and corresponding options
sampler = SamplerV2(backend)
sampler.options.dynamical_decoupling.enable = use_dd
sampler.options.twirling.enable_measure = False
sampler.options.twirling.enable_gates = use_twirling
sampler.options.default_shots = num_shots
else:
backend = testing_backend
logger.info(f"Testing backend: {backend.name}")

# Set up backend sampler.
# This doesn't allow running with twirling and dd
sampler = BackendSamplerV2(backend=testing_backend)

# Once the preparation steps are completed, the algorithm can be structured following a
# Qiskit Pattern workflow:
# https://docs.quantum.ibm.com/guides/intro-to-patterns

# --
# Step 1: Map
# In this step, input arguments are used to construct relevant quantum circuits and operators

start_mapping = time.time()
update_status(Job.MAPPING)

# Initialize the molecule object (pyscf)
logger.info("Initializing molecule object")
mol = gto.Mole()
mol.build(
atom=geo,
basis=input_basis,
verbose=input_verbosity,
charge=input_charge,
spin=input_spin,
symmetry=False,
) # Not tested for symmetry calculations

cm = pcm.PCM(mol)
cm.eps = myeps
cm.method = mymethod

mf = scf.RHF(mol).PCM(cm)
# Generation of checkpoint file for the solute and solvent
# which will be used reused in all subsequent sections
checkpoint_file_name = str(datafiles_name + ".chk")
mf.chkfile = checkpoint_file_name
mf.kernel()

# Read-in the information about the molecule
mol = chkfile.load_mol(checkpoint_file_name)

# Read-in RHF data
scf_result_dic = chkfile.load(checkpoint_file_name, "scf")
mf = scf.RHF(mol)
mf.__dict__.update(scf_result_dic)

# LUCJ uses isolated solute
mf.kernel()

# Initialize orbital selection based on user input
if myavas is not None:
orbs = myavas
avas_out = avas.AVAS(mf, orbs, with_iao=True)
avas_out.kernel()
ncas, nelecas = (avas_out.ncas, avas_out.nelecas)
else:
ncas = num_active_orb
nelecas = (
num_active_alpha,
num_active_beta,
)

# LUCJ Step:
# Generate active space
mc = mcscf.CASCI(mf, ncas=ncas, nelecas=nelecas)
if myavas is not None:
mc.mo_coeff = avas_out.mo_coeff
mc.batch = None
# Reliable and most convenient way to do the CCSD on only the active space
# is to create the FCIDUMP file and then run the CCSD calculation only on the
# orbitals stored in the FCIDUMP file.

h1e_cas, ecore = mc.get_h1eff()
h2e_cas = ao2mo.restore(1, mc.get_h2eff(), mc.ncas)

fcidump_file_name = str(datafiles_name + ".fcidump.txt")
tools.fcidump.from_integrals(
fcidump_file_name,
h1e_cas,
h2e_cas,
ncas,
nelecas,
nuc=ecore,
ms=0,
orbsym=[1] * ncas,
)

logger.info("Performing CCSD")
# Read FCIDUMP and perform CCSD on only active space
mf_as = tools.fcidump.to_scf(fcidump_file_name)
mf_as.kernel()

mc_cc = cc.CCSD(mf_as)
mc_cc.kernel()
mc_cc.t1 # pylint: disable=pointless-statement
t2 = mc_cc.t2

n_reps = 2
norb = ncas

if myavas is not None:
nelec = (int(nelecas / 2), int(nelecas / 2))
else:
nelec = nelecas

alpha_alpha_indices = [(p, p + 1) for p in range(norb - 1)]
alpha_beta_indices = [(p, p) for p in range(0, norb, 4)]

logger.info(f"Same spin orbital connections: {alpha_alpha_indices}")
logger.info(f"Opposite spin orbital connections: {alpha_beta_indices}")

# Construct LUCJ op
ucj_op = ffsim.UCJOpSpinBalanced.from_t_amplitudes(
t2, n_reps=n_reps, interaction_pairs=(alpha_alpha_indices, alpha_beta_indices)
)
# Construct circuit
qubits = QuantumRegister(2 * norb, name="q")
circuit = QuantumCircuit(qubits)
circuit.append(ffsim.qiskit.PrepareHartreeFockJW(norb, nelec), qubits)
circuit.append(ffsim.qiskit.UCJOpSpinBalancedJW(ucj_op), qubits)
circuit.measure_all()
end_mapping = time.time()

# --
# Step 2: Optimize
# Transpile circuits to match ISA

start_optimizing = time.time()
update_status(Job.OPTIMIZING_HARDWARE)

pass_manager = generate_preset_pass_manager(
optimization_level=opt_level,
backend=backend,
initial_layout=initial_layout,
)

pass_manager.pre_init = ffsim.qiskit.PRE_INIT
transpiled = pass_manager.run(circuit)

end_optimizing = time.time()
logger.info(
f"Optimization level: {opt_level}, ops: {transpiled.count_ops()}, depth: {transpiled.depth()}"
)

two_q_depth = transpiled.depth(lambda x: x.operation.num_qubits == 2)
logger.info(f"Two-qubit gate depth: {two_q_depth}")

# --
# Step 3: Execute on Hardware
# Submit the underlying Sampler job. Note that this is not the
# actual function job.
if count_dict_file_name is None:
# Submit the LUCJ job
logger.info("Submitting sampler job")
job = sampler.run([transpiled])
logger.info(f"Job ID: {job.job_id()}")
logger.info(f"Job Status: {job.status()}")

start_waiting_qpu = time.time()
while job.status() == "QUEUED":
update_status(Job.WAITING_QPU)
time.sleep(5)

end_waiting_qpu = time.time()
update_status(Job.EXECUTING_QPU)

# Wait until job is complete
result = job.result()
end_executing_qpu = time.time()

pub_result = result[0]
counts_dict = pub_result.data.meas.get_counts()

waiting_qpu_time = end_waiting_qpu - start_waiting_qpu
executing_qpu_time = end_executing_qpu - end_waiting_qpu
else:
# read LUCJ samples from count_dict
logger.info("Skipping sampler, loading counts dict from file")
with open(count_dict_file_name, "r") as file:
count_dict_string = file.read().replace("\n", "")
counts_dict = json.loads(count_dict_string.replace("'", '"'))
waiting_qpu_time = 0
executing_qpu_time = 0

# --
# Step 4: Post-process

start_pp = time.time()
update_status(Job.POST_PROCESSING)

# SQD-PCM section
start = time.time()

# Orbitals, electron, and spin initialization
num_orbitals = ncas
if myavas is not None:
num_elec_a = num_elec_b = int(nelecas / 2)
else:
num_elec_a, num_elec_b = nelecas
spin_sq = input_spin

# Convert counts into bitstring and probability arrays
bitstring_matrix_full, probs_arr_full = counts_to_arrays(counts_dict)

# We set qiskit_serverless to explicitly reserve 1 cpu per thread, as
# the task is CPU-bound and might degrade in performance when sharing
# a core at scale (this might not be the case with smaller examples)
@distribute_task(target={"cpu": 1})
def solve_solvent_parallel(
batches,
myeps,
mysolvmethod,
myavas,
num_orbitals,
spin_sq,
max_davidson,
checkpoint_file,
):
return solve_solvent( # sqd for pyscf
batches,
myeps,
mysolvmethod,
myavas,
num_orbitals,
spin_sq=spin_sq,
max_davidson=max_davidson,
checkpoint_file=checkpoint_file,
)

e_hist = np.zeros((iterations, n_batches)) # energy history
s_hist = np.zeros((iterations, n_batches)) # spin history
g_solv_hist = np.zeros((iterations, n_batches)) # g_solv history
occupancy_hist = []
avg_occupancy = None

num_ran_iter = 0
for i in range(iterations):
logger.info(f"Starting configuration recovery iteration {i}")
# On the first iteration, we have no orbital occupancy information from the
# solver, so we begin with the full set of noisy configurations.
if avg_occupancy is None:
bs_mat_tmp = bitstring_matrix_full
probs_arr_tmp = probs_arr_full

# If we have average orbital occupancy information, we use it to refine the full
# set of noisy configurations
else:
bs_mat_tmp, probs_arr_tmp = recover_configurations(
bitstring_matrix_full, probs_arr_full, avg_occupancy, num_elec_a, num_elec_b
)

# Create batches of subsamples. We post-select here to remove configurations
# with incorrect hamming weight during iteration 0, since no config recovery was performed.
batches = postselect_and_subsample(
bs_mat_tmp,
probs_arr_tmp,
hamming_right=num_elec_a,
hamming_left=num_elec_b,
samples_per_batch=samples_per_batch,
num_batches=n_batches,
)

# Run eigenstate solvers in a loop. This loop should be parallelized for larger problems.
e_tmp = np.zeros(n_batches)
s_tmp = np.zeros(n_batches)
g_solvs_tmp = np.zeros(n_batches)
occs_tmp = []
coeffs = []

res1 = []
for j in range(n_batches):
strs_a, strs_b = bitstring_matrix_to_ci_strs(batches[j])
logger.info(f"Batch {j} subspace dimension: {len(strs_a) * len(strs_b)}")

res1.append(
solve_solvent_parallel(
batches[j],
myeps,
mymethod,
myavas,
num_orbitals,
spin_sq=spin_sq,
max_davidson=max_davidson_cycles,
checkpoint_file=checkpoint_file_name,
)
)

res = get(res1)

for j in range(n_batches):
energy_sci, coeffs_sci, avg_occs, spin, g_solv = res[j]
e_tmp[j] = energy_sci
s_tmp[j] = spin
g_solvs_tmp[j] = g_solv
occs_tmp.append(avg_occs)
coeffs.append(coeffs_sci)

# Combine batch results
avg_occupancy = tuple(np.mean(occs_tmp, axis=0))

# Track optimization history
e_hist[i, :] = e_tmp
s_hist[i, :] = s_tmp
g_solv_hist[i, :] = g_solvs_tmp
occupancy_hist.append(avg_occupancy)

lowest_e_batch_index = np.argmin(e_hist[i, :])

logger.info(f"Lowest energy batch: {lowest_e_batch_index}")
logger.info(f"Lowest energy value: {np.min(e_hist[i, :])}")
logger.info(f"Corresponding g_solv value: {g_solv_hist[i, lowest_e_batch_index]}")
logger.info("-----------------------------------")
num_ran_iter += 1

end_pp = time.time()
end = time.time()
duration = end - start
logger.info(f"SCI_solver totally takes: {duration} seconds")

metadata = {
"resources_usage": {
"RUNNING: MAPPING": {
"CPU_TIME": end_mapping - start_mapping,
},
"RUNNING: OPTIMIZING_FOR_HARDWARE": {
"CPU_TIME": end_optimizing - start_optimizing,
},
"RUNNING: WAITING_FOR_QPU": {
"CPU_TIME": waiting_qpu_time,
},
"RUNNING: EXECUTING_QPU": {
"QPU_TIME": executing_qpu_time,
},
"RUNNING: POST_PROCESSING": {
"CPU_TIME": end_pp - start_pp,
},
},
"num_iterations_executed": num_ran_iter,
}

output = {
"total_energy_hist": e_hist,
"spin_squared_value_hist": s_hist,
"solvation_free_energy_hist": g_solv_hist,
"occupancy_hist": occupancy_hist,
"lowest_energy_batch": lowest_e_batch_index,
"lowest_energy_value": np.min(e_hist[i, :]),
"solvation_free_energy": g_solv_hist[i, lowest_e_batch_index],
"sci_solver_total_duration": duration,
"metadata": metadata,
}

return output

def set_up_logger(my_logger: logging.Logger, level: int = logging.INFO) -> None:
"""Logger setup to communicate logs through serverless."""

log_fmt = "%(module)s.%(funcName)s:%(levelname)s:%(asctime)s: %(message)s"
formatter = logging.Formatter(log_fmt)

# Set propagate to `False` since handlers are to be attached.
my_logger.propagate = False

stream_handler = logging.StreamHandler()
stream_handler.setFormatter(formatter)
my_logger.addHandler(stream_handler)
my_logger.setLevel(level)

# This is the section where `run_function` is called, it's boilerplate code and can be used
# without customization.
if __name__ == "__main__":

# Use serverless helper function to extract input arguments,
input_args = get_arguments()

# Allow to configure logging level
logging_level = input_args.get("logging_level", logging.INFO)
set_up_logger(logger, logging_level)

try:
func_result = run_function(**input_args)
# Use serverless function to save the results that
# will be returned in the job.
save_result(func_result)
except Exception:
save_result(traceback.format_exc())
raise

sys.exit(0)
# This cell is hidden from users.  It verifies both source listings are identical then deletes the working folder we created
import shutil

with open("./source_files/sqd_pcm_entrypoint.py") as f1:
with open("./source_files/sqd_pcm_entrypoint.py") as f2:
assert f1.read() == f2.read()

with open("./source_files/solve_solvent.py") as f1:
with open("./source_files/solve_solvent.py") as f2:
assert f1.read() == f2.read()

with open("./source_files/__init__.py") as f1:
with open("./source_files/__init__.py") as f2:
assert f1.read() == f2.read()

shutil.rmtree("./source_files/")