Author: Jiangming Yao (FRIB/NSCL, MSU)
In the Generator-Coordinate Method (GCM), the model space for nuclear low-lying states is truncated based on the choice of collective coordinates ($q_i$), which can be either quadrupole deformation $\beta_2$, pairing amplitude $\Delta$ or cranking frequency $\omega$, etc. This truncation scheme provides an effective way to capture relevant correlations in the model space. For the matrix elements of neutrinoless double beta decay, both the collective orrelation (deformation $\beta$) and neutron-proton isoscalar pairing amplitude $\phi_{np}$ should be chosen. The key ingredients of the GCM are the kernels related to the observables of interest. The number of the kernels to be calculated scales as $O(N^{2d})$, where $N$ is the number of configurations for each type of collective coordinates and $d$ is the number of types of collective coordinates. Therefore, there is a trade off between the number of collective coordinates and the computation time. Different from what the Monte-Carlo shell model does based on the stochastic sampling of configurations, we introduce machine-learning techniques to reduce the computation time. The procedure can be as follows:
Discretize the $d$-dimensional collective space with a relatively large step along each direction and calculate the kernels on the mesh points exactly with the nuclear models. The values of these kernels serve as the dataset for both model training and model validation.
Use machine-learning techniques to train a model for the kernels (both Hamiltonian and norm kernels) in the $d$-dimensional space using parameterized Gaussian functions.
Use the trained model to predict/interpolate the kernels on a denser mesh in the collective space.
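A minimal, self-contained sketch of these three steps, with a placeholder standing in for the exact kernel calculation and a generic scikit-learn regressor; the actual pipelines used in this notebook are defined further below:
import numpy as np
from sklearn.svm import SVR
# step 1: coarse mesh in the collective coordinates and "exact" kernels on it
coarse_q = np.linspace(-0.4, 0.4, 9)
Q1, Q2 = np.meshgrid(coarse_q, coarse_q)
X_coarse = np.column_stack([Q1.ravel(), Q2.ravel()])
y_coarse = np.exp(-10.0*(Q1 - Q2).ravel()**2)   # placeholder for a norm-like kernel from the nuclear model
# step 2: train a regression model on the coarse-mesh kernels
model = SVR(kernel='rbf', C=1000, gamma=1).fit(X_coarse, y_coarse)
# step 3: predict/interpolate the kernels on a denser mesh
fine_q = np.linspace(-0.4, 0.4, 33)
F1, F2 = np.meshgrid(fine_q, fine_q)
y_fine = model.predict(np.column_stack([F1.ravel(), F2.ravel()]))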
As a demonstration, this notebook uses the following setup for the nuclear-model calculation:
We take $^{48}$Ti as an example.
We adopt both a conventional Machine Learning (ML) approach and a deep-learning approach for comparison.
An introduction to the nuclear models can be found on arXiv or in Physical Review C.
The results of the nuclear-model calculation are stored in the file "data4ML.dat".
We adopt three methods for the interpolation: kernel/polynomial regression with scikit-learn, bivariate spline interpolation with scipy, and a shallow artificial neural network (ANN) with one hidden layer.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler,PolynomialFeatures,RobustScaler,MinMaxScaler
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.interpolate as interp
from matplotlib.ticker import MultipleLocator, FormatStrFormatter
from matplotlib.ticker import AutoMinorLocator
import matplotlib.ticker as ticker
minorLocator = AutoMinorLocator()
blacksolid ={'color':'k',
'linestyle': '-'}
blackdot ={'color':'k',
'linestyle': ':'}
blackdashdot ={'color':'k',
'linestyle': '-.'}
redsolid ={'color':'r',
'linestyle': '-'}
bluedash ={'color':'b',
'linestyle': '--'}
fs= 18
plt.rcParams['figure.figsize'] = (7, 7) # set default size of plots
from SNN import *
from gcm import *
Introduction to the data:
data = pd.read_csv("data4ML.dat")
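# Columns (as used below): JJ = angular momentum J of the projected kernels,
# q1/q2 = integer mesh indices, bet1/bet2 = quadrupole deformations beta of the two configurations,
# nn = norm kernel N(q1,q2), hh = normalized Hamiltonian kernel H(q1,q2)/N(q1,q2).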
features = ['JJ','q1','q2','bet1','bet2','nn','hh']
data = data[features]
data0 = data
data.head()
plot_dist_nn_hh(data,6)
Plot the normalized Hamiltonian kernel H/N and the norm kernel N for J=0.
data0 = data[data['JJ']==0]
q1 = data0['bet1']
q2 = data0['bet2']
hh = data0['hh']
nn = data0['nn']
levels=np.arange(-26,-15,1)
plot_surface(q1,q2,hh,levels)
plt.show()
levels=np.arange(0,0.2,0.02)
# fig2 =plt.figure(figsize=(8,6))
# ax1 = plt.gca()
# cont2 = ax1.contourf(xi, yi, zi2,levels2,cmap="RdBu_r")
# cbar = fig2.colorbar(cont2,shrink=0.8,aspect=10)
plot_surface(q1,q2,nn,levels)
data = data[abs(data['nn'])>0.01]
data = data[(abs(data['bet1'])<0.40) & (abs(data['bet2'])<0.40)]
data.shape
data_dic = split_data4J(data,4)
data_dic['JJ=0'].head()
rbf_Robust_regression = Pipeline([
("scaler",RobustScaler()),
("lin_reg",SVR(kernel='rbf', degree=5, C=1000, gamma=1)) # C: regularization coeff.; note: 'degree' is ignored for the RBF kernel
])
poly5_Std_regression = Pipeline([
("poly_features",PolynomialFeatures(degree=5)),
("scaler",StandardScaler()),
("lin_reg",LinearRegression())
])
Hamiltonian matrix elements
X_features=['bet1','bet2']
X = np.array(data_dic['JJ=0'][X_features])
y = np.array(data_dic['JJ=0']['hh'])
mlmodel4hh_rbf5 = plot_learning_curves(rbf_Robust_regression,X,y,
'SVR-rbf-Robust','hh','2',
log=False)
plt.ylim(0,0.5)
plt.show()
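For reference, a minimal sketch of what a learning-curve routine like plot_learning_curves (imported from gcm) might do internally; this is an illustration under that assumption, not the actual gcm implementation:
from sklearn.base import clone
def learning_curve_sketch(pipeline, X, y, step=10):
    # hypothetical stand-in: fit on growing training subsets and track the validation rms error
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2)
    models, rmse_val = {}, []
    sizes = list(range(step, len(X_tr) + 1, step))
    for m in sizes:
        fitted = clone(pipeline).fit(X_tr[:m], y_tr[:m])
        models[m] = fitted
        rmse_val.append(np.sqrt(mean_squared_error(y_te, fitted.predict(X_te))))
    plt.plot(sizes, rmse_val, 'b--', label='validation rms error')
    plt.xlabel('training set size'); plt.ylabel('rms error'); plt.legend()
    best = models[sizes[-1]]
    return {'models': models,
            'y_train': y_tr, 'y_train_pred': best.predict(X_tr),
            'y_test': y_te, 'y_test_pred': best.predict(X_te)}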
results_rbf = mlmodel4hh_rbf5
yh_test = results_rbf['y_test']
yh_pred = results_rbf['y_test_pred']
df = pd.DataFrame({'Column1':yh_test,'Column2':yh_pred})
df_train = pd.DataFrame({'Column1':results_rbf['y_train'],'Column2':results_rbf['y_train_pred']})
plt.scatter(df['Column1'],df['Column2']-df['Column1'],c='b',label='test')
plt.scatter(df_train['Column1'],df_train['Column2']-df_train['Column1'],c='r',label='train')
plt.axhline(0,c='k')
plt.legend()
set_figstyle(-18,-25,-0.5,+0.5,'hh','difference')
The norm matrix elements: taking log(nn) as the training target y.
X_features=['bet1','bet2']
X = np.array(data_dic['JJ=0'][X_features])
y = np.array(data_dic['JJ=0']['nn'])
ylog = np.log(y)
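Taking the logarithm is motivated by the roughly Gaussian shape of the norm kernel: in the spirit of the Gaussian overlap approximation one expects approximately
$$ N(q_1,q_2) \approx \exp\left[-\frac{(q_1-q_2)^2}{2\gamma^2}\right], $$
so $\log N$ is close to a low-order polynomial in the coordinates and is much easier to fit than $N$ itself.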
The following three scalers give similar performance.
poly5_Robust_regression = Pipeline([
("poly_features",PolynomialFeatures(degree=5)),
("scaler",RobustScaler()),
("lin_reg",LinearRegression())
])
poly5_MM_regression = Pipeline([
("poly_features",PolynomialFeatures(degree=5)),
("scaler",MinMaxScaler()),
("lin_reg",LinearRegression())
])
poly5_Std_regression = Pipeline([
("poly_features",PolynomialFeatures(degree=5)),
("scaler",StandardScaler()),
("lin_reg",LinearRegression())
])
poly6_Std_regression = Pipeline([
("poly_features",PolynomialFeatures(degree=6)),
("scaler",StandardScaler()),
("lin_reg",LinearRegression())
])
mlmodel4nn_poly5 = plot_learning_curves(poly5_MM_regression,
X,ylog,'poly5-MM','nn','2',
log=False)
plt.ylim(0,0.1)
mlmodel4nn_poly6 = plot_learning_curves(poly6_Std_regression,
X,ylog,'poly6-Std','nn','2',
log=True)
plt.ylim(0,0.02)
mlmodel4nn_poly5 = plot_learning_curves(poly5_Std_regression,
X,ylog,'poly5-Std','nn','2',
log=True)
plt.ylim(0,0.02)
The above figure shows the significant improvement achieved by using log(nn) in the training.
results4nn = mlmodel4nn_poly5
df4nn_test = pd.DataFrame({'Column1':results4nn['y_test'],'Column2':results4nn['y_test_pred']})
df4nn_train = pd.DataFrame({'Column1':results4nn['y_train'],'Column2':results4nn['y_train_pred']})
plt.scatter(df4nn_test['Column1'],df4nn_test['Column2']-df4nn_test['Column1'],c='b',label='test')
plt.scatter(df4nn_train['Column1'],df4nn_train['Column2']-df4nn_train['Column1'],c='r',label='train')
plt.axhline(0,c='k')
plt.legend()
set_figstyle(0.02,0.2,-0.02,+0.02,'norm','difference')
results4nn = mlmodel4nn_poly5
df4nn_test = pd.DataFrame({'Column1':results4nn['y_test'],'Column2':results4nn['y_test_pred']})
df4nn_train = pd.DataFrame({'Column1':results4nn['y_train'],'Column2':results4nn['y_train_pred']})
plt.scatter(df4nn_test['Column1'],100*(df4nn_test['Column2']-df4nn_test['Column1'])/df4nn_test['Column1'],c='b',label='test')
plt.scatter(df4nn_train['Column1'],100*(df4nn_train['Column2']-df4nn_train['Column1'])/df4nn_train['Column1'],c='r',label='train')
plt.axhline(0,c='k')
plt.legend()
set_figstyle(0.02,0.2,-20,+20,'norm','difference [%]')
m_samples =50 #-1 #200
n_mesh = 19 #18 # 30
model4hh_use = mlmodel4hh_rbf5['models'][m_samples]
model4nn_use = mlmodel4nn_poly5['models'][m_samples]
pData = Create_XData_OnMesh(n_mesh)
#pData = Create_XData_OnMeshByRange(n_mesh,-0.4,0.4)
pfeatures = ['bet1','bet2']
pData['hhp'] = model4hh_use.predict(pData[pfeatures])
pData['lognnp'] = model4nn_use.predict(pData[pfeatures])
pData['nnp'] = np.exp(pData['lognnp'])
# delete the data outlier
pDatar = pData[(pData['bet1']<0.4) & (pData['bet2']<0.4)]
#pDatar = pData
#pDatar.tail(10)
# exact
dataJ0 = data[data['JJ']==0]
hhmat = df2matrix(dataJ0,'q1','q2','hh' )
nnmat = df2matrix(dataJ0,'q1','q2','nn')
nn_exact = nnmat
hh_exact = hhmat
HH0 = np.multiply(hh_exact,nn_exact)
solution = HWG_Solution(HH0,nn_exact,1,'nos',redsolid)
solution = HWG_Solution(HH0,nn_exact,2,'nos',redsolid)
solution = HWG_Solution(HH0,nn_exact,3,'nos',redsolid)
# prediction
HH = df2matrix(pDatar,'q1','q2','hhp')
NN = df2matrix(pDatar,'q1','q2','nnp')
HHn = np.multiply(HH,NN)
solution = HWG_Solution(HHn,NN,1,'nos',bluedash)
solution = HWG_Solution(HHn,NN,2,'nos',bluedash)
solution = HWG_Solution(HHn,NN,3,'nos',bluedash)
set_figstyle(1,HHn.shape[0],-23,-16,'NoS','E [MeV]')
plt.xlim(0,15)
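HWG_Solution (from gcm) solves the Hill-Wheeler-Griffin generalized eigenvalue problem $H f = E\,N f$ as a function of the number of natural states (NoS) kept. A minimal sketch of the standard procedure, under the assumption that the helper works this way (diagonalize the norm kernel, keep the largest natural states, diagonalize the collective Hamiltonian in that basis):
def hwg_sketch(H, N, n_states):
    # diagonalize the norm kernel: N u_i = lam_i u_i
    lam, U = np.linalg.eigh(N)
    energies = []
    for k in range(1, n_states + 1):
        idx = np.argsort(lam)[::-1][:k]          # keep the k largest norm eigenvalues
        basis = U[:, idx] / np.sqrt(lam[idx])    # natural states
        H_coll = basis.T @ H @ basis             # collective Hamiltonian in the natural basis
        energies.append(np.sort(np.linalg.eigvalsh(H_coll)))
    return energies                              # energies[k-1] holds the spectrum for NoS = k
For example, hwg_sketch(HH0, nn_exact, 15) would return the spectra for NoS = 1...15, the quantity plotted above.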
data0 = data0[abs(data0['nn'])>0.001]
data0 = data0[(abs(data0['bet1'])<0.40) & (abs(data0['bet2'])<0.40)]
dataJ0 = data0
coords = dataJ0['bet1'].unique()
# hamiltonian
hhmat = df2matrix(dataJ0,'q1','q2','hh')
# norm
nnmat = df2matrix(dataJ0,'q1','q2','nn')
hhi = interp.RectBivariateSpline(coords, coords, hhmat)
nni = interp.RectBivariateSpline(coords, coords, nnmat)
print("rms error for hh: {}".format(np.sqrt(mean_squared_error(hhi(*np.ix_(coords,coords)), hhmat ))))
print("rms error for nn: {}".format(np.sqrt(mean_squared_error(nni(*np.ix_(coords,coords)), nnmat ))))
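# Now fit the splines on a sparser grid, keeping only every 2nd point in each direction (about 1/4 of the kernels), and evaluate them back on the full grid.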
hhi_spare = interp.RectBivariateSpline(coords[::2], coords[::2], hhmat[::2,::2])
nni_spare = interp.RectBivariateSpline(coords[::2], coords[::2], nnmat[::2,::2])
print("rms error for hh: {}".format(np.sqrt(mean_squared_error(hhi_spare(*np.ix_(coords,coords)), hhmat ))))
print("rms error for nn: {}".format(np.sqrt(mean_squared_error(nni_spare(*np.ix_(coords,coords)), nnmat ))))
plt.contourf(coords, coords, hhi_spare(*np.ix_(coords,coords)) - hhmat)
plt.xlim(-.35,.35)
plt.ylim(-.35,.35)
plt.colorbar()
plt.title("Error in Normalized Hamiltonian Kernels")
plt.show()
plt.contourf(coords, coords, nni_spare(*np.ix_(coords,coords)) - nnmat)
plt.xlim(-.35,.35)
plt.ylim(-.35,.35)
plt.colorbar()
plt.title("Error in Norm Kernels")
# exact
nn_exact = nnmat
hh_exact = hhmat
HH0 = np.multiply(hh_exact,nn_exact)
solution = HWG_Solution(HH0,nn_exact,1,'nos',redsolid)
solution = HWG_Solution(HH0,nn_exact,2,'nos',redsolid)
# half
# nnh = nnmat[::2,::2]
# hhh = hhmat[::2,::2]
# HHh = np.multiply(hhh,nnh)
# solution = HWG_Solution(HHh,nnh,1,'nos',blackdashdot)
# solution = HWG_Solution(HHh,nnh,2,'nos',blackdashdot)
# prediction
nn_pred = nni_spare(*np.ix_(coords,coords))
hh_pred = hhi_spare(*np.ix_(coords,coords))
HHp = np.multiply(hh_pred,nn_pred)
solution = HWG_Solution(HHp,nn_pred,1,'nos',bluedash)
solution = HWG_Solution(HHp,nn_pred,2,'nos',bluedash)
plt.ylim(-25,-16)
plt.xlim(0,15)
plt.xlabel(r'NOS',fontsize=24)
plt.ylabel('E (MeV)', fontsize=24)
plt.legend(['Full(k=1)','Full(k=2)',
'Spline1/4(k=1)','Spline1/4(k=2)'],fontsize=14)
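# Repeat with an even sparser grid: keep only every 3rd point in each direction (roughly 1/9 of the kernels).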
hhi_spare3 = interp.RectBivariateSpline(coords[::3], coords[::3], hhmat[::3,::3])
nni_spare3 = interp.RectBivariateSpline(coords[::3], coords[::3], nnmat[::3,::3])
print("rms error for hh: {}".format(np.sqrt(mean_squared_error(hhi_spare3(*np.ix_(coords,coords)), hhmat ))))
print("rms error for nn: {}".format(np.sqrt(mean_squared_error(nni_spare3(*np.ix_(coords,coords)), nnmat ))))
hhmat[::3,::3].shape
plt.contourf(coords, coords, hhi_spare3(*np.ix_(coords,coords)) - hhmat)
plt.xlim(-.35,.35)
plt.ylim(-.35,.35)
plt.colorbar()
plt.title("Error in Normalized Hamiltonian Kernels")
plt.show()
plt.contourf(coords, coords, nni_spare3(*np.ix_(coords,coords)) - nnmat)
plt.xlim(-.35,.35)
plt.ylim(-.35,.35)
plt.colorbar()
plt.title("Error in Norm Kernels")
# exact
nn_exact = nnmat
hh_exact = hhmat
HH0 = np.multiply(hh_exact,nn_exact)
solution = HWG_Solution(HH0,nn_exact,1,'nos',redsolid)
solution = HWG_Solution(HH0,nn_exact,2,'nos',redsolid)
# half
# nn3 = nnmat[::3,::3]
# hh3 = hhmat[::3,::3]
# HH3 = np.multiply(hh3,nn3)
# solution = HWG_Solution(HH3,nn3,1,'nos',blackdashdot)
# solution = HWG_Solution(HH3,nn3,2,'nos',blackdashdot)
# prediction
nn_pred = nnmat   # note: the exact norm kernel is used here; only hh comes from the 1/9 spline (nni_spare3 left commented out)
hh_pred = hhi_spare3(*np.ix_(coords,coords))
HHp = np.multiply(hh_pred,nn_pred)
solution = HWG_Solution(HHp,nn_pred,1,'nos',bluedash)
solution = HWG_Solution(HHp,nn_pred,2,'nos',bluedash)
plt.ylim(-25,-16)
plt.xlim(0,15)
plt.xlabel(r'NOS',fontsize=24)
plt.ylabel('E (MeV)', fontsize=24)
plt.legend(['Full(k=1)','Full(k=2)',
'Spline1/9(k=1)','Spline1/9(k=2)'],fontsize=14)
Conclusion from the direct spline interpolation:
Shallow neural network (ANN): Hamiltonian kernels
X = np.array(data_dic['JJ=0'][X_features])
y = np.array(data_dic['JJ=0']['hh'])
X_train,X_test,y_train, y_test = train_test_split(X,y,test_size=0.2)
X_train.shape,y_train.shape
X4train = X_train.T
yhh4train = y_train.reshape(1,y_train.shape[0])
X4test = X_test.T
yhh4test = y_test.reshape(1,y_test.shape[0])
# Build a model with an n_h-dimensional hidden layer
parameters4hh,error = nn_model(X4train, yhh4train, n_h = 20,
num_iterations = 1e6,
learning_rate=0.05, print_cost=True, log=False)
# Print accuracy
predictions4hh = predict(parameters4hh, X4train)
print('rms error={}'.format(np.sqrt(mean_squared_error(yhh4train,predictions4hh))))
plt.plot(error['steps'],error['rmse'])
plt.xscale('log')
set_figstyle(1,1e7,0,0.6,'steps','rms error')
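nn_model and predict are imported from the local SNN module. A minimal sketch of what a one-hidden-layer regression network of this kind typically looks like (tanh hidden layer, linear output, mean-squared-error loss, plain gradient descent); this is an assumption about SNN, not its actual code:
def nn_model_sketch(X, Y, n_h=20, num_iterations=10000, learning_rate=0.05):
    # X has shape (n_features, m_samples), Y has shape (1, m_samples), as in the cells above
    n_x, m = X.shape
    rng = np.random.default_rng(0)
    W1, b1 = 0.1*rng.standard_normal((n_h, n_x)), np.zeros((n_h, 1))
    W2, b2 = 0.1*rng.standard_normal((1, n_h)), np.zeros((1, 1))
    for _ in range(int(num_iterations)):
        A1 = np.tanh(W1 @ X + b1)                  # hidden layer
        Y_hat = W2 @ A1 + b2                       # linear output
        dY = (Y_hat - Y) / m                       # gradient of the (half) mean-squared error
        dW2, db2 = dY @ A1.T, dY.sum(axis=1, keepdims=True)
        dZ1 = (W2.T @ dY) * (1.0 - A1**2)          # back-propagate through tanh
        dW1, db1 = dZ1 @ X.T, dZ1.sum(axis=1, keepdims=True)
        W1 -= learning_rate*dW1; b1 -= learning_rate*db1
        W2 -= learning_rate*dW2; b2 -= learning_rate*db2
    return {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}
def predict_sketch(parameters, X):
    # forward pass only
    A1 = np.tanh(parameters['W1'] @ X + parameters['b1'])
    return parameters['W2'] @ A1 + parameters['b2']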
# train_data
orig = pd.DataFrame(yhh4train.T,columns=['calc.'])
predictions = predict(parameters4hh, X4train)
pred = pd.DataFrame(predictions.T,columns=['pred.'])
#predictions4hh = predict(parameters4hh, X4train)
#pred = pd.DataFrame(predictions4hh.T,columns=['pred.'])
df4hh = pd.concat([pred,orig],axis=1)
plt.scatter(df4hh['calc.'],df4hh['pred.']-df4hh['calc.'],c='r',label='train')
# test data
orig_test = pd.DataFrame(yhh4test.T,columns=['calc.'])
pred_test = pd.DataFrame(predict(parameters4hh, X4test).T,columns=['pred.'])
df4hh_test = pd.concat([pred_test,orig_test],axis=1)
plt.scatter(df4hh_test['calc.'],df4hh_test['pred.']-df4hh_test['calc.'],
c='b',label='test')
plt.axhline(0,c='k')
plt.legend()
#set_figstyle(0.02,0.2,-0.5,0.5,'hh','difference')
Norm kernels
X = np.array(data_dic['JJ=0'][X_features])
y = np.log(np.array(data_dic['JJ=0']['nn']))
X.shape,y.shape
X_train,X_test,y_train, y_test = train_test_split(X,y,test_size=0.2)
X4train = X_train.T
ynn4train = y_train.reshape(1,y_train.shape[0])
X4test = X_test.T
ynn4test = y_test.reshape(1,y_test.shape[0])
X4test.shape
# Build a model with an n_h-dimensional hidden layer
parameters4nn, error_nn = nn_model(X4train, ynn4train, n_h = 20,
num_iterations = 1e6,
learning_rate=0.1, print_cost=True, log=True)
# Print accuracy
predictions4nn = predict(parameters4nn, X4train)
print('rms error={}'.format(np.sqrt(mean_squared_error(ynn4train,predictions4nn))))
plt.plot(error_nn['steps'],error_nn['rmse'])
plt.xscale('log')
set_figstyle(1,1e7,0,0.06,'steps','rms error')
# train data
orig = pd.DataFrame(np.exp(ynn4train).T,columns=['calc.']) # note the np.exp: undo the log transform used in training
#pred = pd.DataFrame(np.exp(predictions4nn).T,columns=['pred.'])
predictions = predict(parameters4nn, X4train)
pred = pd.DataFrame(np.exp(predictions).T,columns=['pred.'])
df4nn = pd.concat([pred,orig],axis=1)
plt.scatter(df4nn['calc.'],df4nn['pred.']-df4nn['calc.'],c='r',label='train')
# test data
orig_test = pd.DataFrame(np.exp(ynn4test).T,columns=['calc.'])
predictions = predict(parameters4nn, X4test)
pred_test = pd.DataFrame(np.exp(predictions).T,columns=['pred.'])
df4nn_test = pd.concat([pred_test,orig_test],axis=1)
plt.scatter(df4nn_test['calc.'],df4nn_test['pred.']-df4nn_test['calc.'],
c='b',label='test')
plt.axhline(0,c='k')
plt.legend()
set_figstyle(0.02,0.2,-0.02,+0.02,'norm','difference')
X = np.array(data_dic['JJ=0'][X_features])
yhh = np.array(data_dic['JJ=0']['hh'])
ynn = np.array(data_dic['JJ=0']['nn'])
# log for n
ynnlog = np.log(np.array(data_dic['JJ=0']['nn']))
X4all = X.T
yhh4all = yhh.reshape(1,yhh.shape[0])
ynn4all = ynn.reshape(1,ynn.shape[0])
ynn_pred = np.exp(predict(parameters4nn, X4all))
#yhh_pred = predict(parameters4hh, X4all)
def Create_XData_OnFineMesh(N,step):
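# Build a square (bet1, bet2) mesh with spacing `step`, centered at zero (bet = q*step - N*step/2);
# Creat_Twolist4Mesh(N) (from gcm) is assumed to return the flattened lists of integer mesh indices.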
q1list,q2list = Creat_Twolist4Mesh(N)
dftest = pd.DataFrame([q1list,q2list])
dftest = dftest.T
dftest.columns = ['q1','q2']
dftest['bet1'] = dftest['q1']*step-N*step/2
dftest['bet2'] = dftest['q2']*step-N*step/2
return dftest
n_mesh = 20
pData = Create_XData_OnFineMesh(n_mesh,0.1)
pfeatures = ['bet1','bet2']
X4all = pData[pfeatures].T
pData['hhp'] = predict(parameters4hh, X4all).T
pData['lognnp'] = predict(parameters4nn, X4all).T
pData['nnp'] = np.exp(pData['lognnp'])
# delete the data outlier
pDatar = pData[(pData['bet1']<0.4) & (pData['bet2']<0.4)]
#pDatar = pData
pDatar.shape
pDatar['bet1'].unique()
# exact
dataJ0 = data[data['JJ']==0]
hhmat = df2matrix(dataJ0,'q1','q2','hh' )
nnmat = df2matrix(dataJ0,'q1','q2','nn')
nn_exact = nnmat
hh_exact = hhmat
HH0 = np.multiply(hh_exact,nn_exact)
solution = HWG_Solution(HH0,nn_exact,1,'nos',redsolid)
solution = HWG_Solution(HH0,nn_exact,2,'nos',redsolid)
# prediction
HH = df2matrix(pDatar,'q1','q2','hhp')
NN = df2matrix(pDatar,'q1','q2','nnp')
HHn = np.multiply(HH,NN)
solution = HWG_Solution(HHn,NN,1,'nos',bluedash)
solution = HWG_Solution(HHn,NN,2,'nos',bluedash)
set_figstyle(1,HHn.shape[0],-23,-16,'NoS','E [MeV]')
plt.legend(['Exact(k=1)','Exact(k=2)','ANN(k=1)','ANN(k=2)'],fontsize=16)
plt.xlim(0,15)
plt.imshow(np.log(nnmat))
plt.colorbar()
We have used three different techniques (SVR with an RBF kernel for the Hamiltonian kernels and degree-5 polynomial regression for the norm kernels, bivariate spline interpolation, and an ANN with one hidden layer of 20 neurons) to interpolate both the Hamiltonian kernels and the norm kernels. As a first step, we consider the kernels only as functions of two variables ($q_1=\beta_1, q_2=\beta_2$),
$$ N^{J=0}(q_1, q_2) = \langle\Phi(q_1)\vert \hat P^J \hat P^N \vert\Phi(q_2)\rangle, $$
and
$$ H^{J=0}(q_1, q_2) = \langle\Phi(q_1)\vert \hat H\hat P^J \hat P^N \vert\Phi(q_2)\rangle, $$ where $\hat H$ is the Hamiltonian operator. The findings are summarized below:
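These kernels enter the Hill-Wheeler-Griffin equation solved above with HWG_Solution,
$$ \sum_{q_2}\left[ H^{J}(q_1, q_2) - E^{J}_k\, N^{J}(q_1, q_2)\right] f^{J}_k(q_2) = 0, $$
whose solutions give the energies $E^J_k$ plotted against the number of natural states (NoS).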
The next step will be to extend the approach to higher-dimensional cases.