Machine Learning in Nuclear Physics (1)

Author: Jiangming Yao (FRIB/NSCL, MSU)

Introduction

In the Generator-Coordinate Method (GCM), the model space for nuclear low-lying states is truncated through the choice of collective coordinates ($q_i$), which can be the quadrupole deformation $\beta_2$, the pairing amplitude $\Delta$, the cranking frequency $\omega$, etc. This truncation scheme provides an effective way to capture the relevant correlations in the model space. For the matrix elements of neutrinoless double-beta decay, both the collective correlation (deformation $\beta$) and the neutron-proton isoscalar pairing amplitude $\phi_{np}$ should be included. The key ingredients of the GCM are the kernels of the observables of interest. The number of kernels to be calculated scales as $O(N^{2d})$, where $N$ is the number of configurations for each type of collective coordinate and $d$ is the number of types of collective coordinates. There is therefore a trade-off between the number of collective coordinates and the computation time. In contrast to the Monte-Carlo shell model, which relies on stochastic sampling of configurations, we introduce machine-learning techniques to reduce the computation time. The procedure is as follows:

  • Discretize the $d$-dimensional collective space with a relatively large step along each direction and calculate the kernels on the mesh points exactly using nuclear models. These exact kernels serve as the dataset for both model training and model validation.

  • Use machine learning techniques to train a model for the kernels (including Hamiltonian and norm kernels) in the $d$-dimensional space using parameterized Gaussian functions.

  • Use the trained model to predict/interpolate the kernels in the collective space with denser mesh points.

  • Solve the Hill-Wheeler-Griffin (HWG) equation and calculate the observables of interest using all the kernels (both the exactly calculated and the model-predicted ones).
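
To make the $O(N^{2d})$ scaling above concrete, here is a small counting sketch (the numbers are purely illustrative):

# Number of kernels to evaluate when each of the d collective coordinates is
# discretized with N points: every pair of configurations (q, q') contributes,
# so the count grows as (N**d)**2 = N**(2*d).
def n_kernels(N, d):
    return (N ** d) ** 2

for d in (1, 2, 3):
    print(f"d = {d}: {n_kernels(10, d):,} kernels for N = 10")
# d = 1: 100    d = 2: 10,000    d = 3: 1,000,000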

As a demonstration, in this notebook we use the following nuclear model in the calculations:

  • GCM+HFB (with constraints on the quadrupole deformation $\beta_2$)
  • Effective interaction KB3G defined within $fp$ shell

We take $^{48}$Ti as an example.

We adopt both a conventional machine-learning (ML) approach and a deep-learning approach for comparison.

The Methods

Nuclear models

An introduction to the nuclear models can be found on arXiv and in Physical Review C.

The results of the nuclear model calculation are stored in the file "data4ML.dat".

Interpolation Methods

We adopt three methods for the interpolation:

  • conventional machine learning
  • Spline interpolation
  • Neural network.
In [1]:
from sklearn.pipeline import  Pipeline
from sklearn.preprocessing import StandardScaler,PolynomialFeatures,RobustScaler,MinMaxScaler
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import scipy.interpolate as interp

from matplotlib.ticker import MultipleLocator, FormatStrFormatter
from matplotlib.ticker import AutoMinorLocator 
import matplotlib.ticker as ticker 
minorLocator = AutoMinorLocator()

blacksolid ={'color':'k',
        'linestyle': '-'}

blackdot ={'color':'k',
        'linestyle': ':'}

blackdashdot ={'color':'k',
        'linestyle': '-.'}

redsolid ={'color':'r',
        'linestyle': '-'}
bluedash ={'color':'b',
        'linestyle': '--'}

fs= 18


plt.rcParams['figure.figsize'] = (7, 7) # set default size of plots
In [2]:
from SNN import *
from gcm import *

Dataset

Introduction to the data:

  • bet1, bet2 are the quadrupole deformations of the two reference states.
  • nn is the norm kernel, $$ \langle\beta_1 | \hat P^J \hat P^N \hat P^Z| \beta_2\rangle. $$
  • hh is the Hamiltonian kernel normalized by the norm kernel, $$ \dfrac{\langle\beta_1 | \hat H \hat P^J \hat P^N \hat P^Z| \beta_2\rangle}{\langle\beta_1 | \hat P^J \hat P^N \hat P^Z| \beta_2\rangle}. $$
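
Since hh is stored as the Hamiltonian kernel divided by the norm kernel, the unnormalized Hamiltonian kernel needed for the HWG equation is recovered as the product of the two columns,

$$ \langle\beta_1 | \hat H \hat P^J \hat P^N \hat P^Z| \beta_2\rangle = \mathrm{hh}(\beta_1,\beta_2)\times \mathrm{nn}(\beta_1,\beta_2), $$

which is what the calls to np.multiply(hh, nn) below reconstruct before the kernels enter the Hill-Wheeler-Griffin equation.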

Load data

In [83]:
data = pd.read_csv("data4ML.dat") 
features = ['JJ','q1','q2','bet1','bet2','nn','hh']
data = data[features]

data0 = data
data.head()
Out[83]:
JJ q1 q2 bet1 bet2 nn hh
0 0 1 1 -0.35 -0.35 0.056751 -19.248030
1 0 2 1 -0.30 -0.35 0.055657 -20.415483
2 0 3 1 -0.25 -0.35 0.049036 -21.459291
3 0 4 1 -0.20 -0.35 0.040537 -22.473485
4 0 5 1 -0.15 -0.35 0.032881 -23.320063

Visualization

In [4]:
plot_dist_nn_hh(data,6)

Plot the normalized Hamiltonian kernel (H/N) and the norm kernel (N) for J=0.

In [5]:
data0 = data[data['JJ']==0]
q1 = data0['bet1']
q2 = data0['bet2']
hh = data0['hh']
nn = data0['nn']
levels=np.arange(-26,-15,1)
plot_surface(q1,q2,hh,levels)

plt.show()

levels=np.arange(0,0.2,0.02)

# fig2 =plt.figure(figsize=(8,6))
# ax1 = plt.gca()
# cont2 = ax1.contourf(xi, yi, zi2,levels2,cmap="RdBu_r")
# cbar = fig2.colorbar(cont2,shrink=0.8,aspect=10)


plot_surface(q1,q2,nn,levels)
/Users/jmyao/Documents/WorkSpace/Data_Analysis/Machine_Learning/gcm.py:41: MatplotlibDeprecationWarning: The griddata function was deprecated in version 2.2.
  zi = griddata(xx,yy,zz,xi,yi,interp='linear')
/Users/jmyao/Documents/WorkSpace/Data_Analysis/Machine_Learning/gcm.py:41: MatplotlibDeprecationWarning: The griddata function was deprecated in version 2.2.
  zi = griddata(xx,yy,zz,xi,yi,interp='linear')
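
The deprecation warnings come from the old matplotlib griddata call inside gcm.py. As a side note, an equivalent surface plot can be produced without that call, e.g. with triangulation-based contouring; this is only a sketch and assumes the q1, q2, hh columns defined in the cell above:

import numpy as np
import matplotlib.pyplot as plt

levels = np.arange(-26, -15, 1)
# tricontourf works on scattered points directly, so no explicit gridding step is needed
plt.tricontourf(q1, q2, hh, levels=levels, cmap="RdBu_r")
plt.colorbar()
plt.xlabel(r"$\beta_1$")
plt.ylabel(r"$\beta_2$")
plt.show()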

Results and discussions

Interpolation with Conventional ML

Filter out the outliers in the dataset

In [6]:
data = data[abs(data['nn'])>0.01]
data = data[(abs(data['bet1'])<0.40) & (abs(data['bet2'])<0.40)]
data.shape
Out[6]:
(630, 7)

Prepare the data: split it according to the angular momentum J

In [7]:
data_dic = split_data4J(data,4)
data_dic['JJ=0'].head()
Out[7]:
JJ q1 q2 bet1 bet2 nn hh
0 0 1 1 -0.35 -0.35 0.056751 -19.248030
1 0 2 1 -0.30 -0.35 0.055657 -20.415483
2 0 3 1 -0.25 -0.35 0.049036 -21.459291
3 0 4 1 -0.20 -0.35 0.040537 -22.473485
4 0 5 1 -0.15 -0.35 0.032881 -23.320063

Build Machine-Learning models

  • SVR works better with the 'rbf' kernel than with 'poly'.
  • If one wants a polynomial model, it is better to transform the features with PolynomialFeatures and then use LinearRegression().
  • No significant difference is found between RobustScaler() and StandardScaler() in this problem.
In [8]:
rbf_Robust_regression = Pipeline([ 
    ("scaler",RobustScaler()),
    ("lin_reg",SVR(kernel='rbf', degree=5, C=1000, gamma=1)) # C: regularization coeff.
])  

poly5_Std_regression = Pipeline([ 
    ("poly_features",PolynomialFeatures(degree=5)),
    ("scaler",StandardScaler()),
    ("lin_reg",LinearRegression())
])

Hamiltonian matrix elements

In [9]:
X_features=['bet1','bet2']
X = np.array(data_dic['JJ=0'][X_features])
y = np.array(data_dic['JJ=0']['hh']) 
In [10]:
mlmodel4hh_rbf5  = plot_learning_curves(rbf_Robust_regression,X,y,
                                        'SVR-rbf-Robust','hh','2',
                                        log=False) 
plt.ylim(0,0.5)
plt.show()  
In [76]:
results_rbf = mlmodel4hh_rbf5
yh_test = results_rbf['y_test']
yh_pred = results_rbf['y_test_pred']
df = pd.DataFrame({'Column1':yh_test,'Column2':yh_pred})
df_train = pd.DataFrame({'Column1':results_rbf['y_train'],'Column2':results_rbf['y_train_pred']})
plt.scatter(df['Column1'],df['Column2']-df['Column1'],c='b',label='test')
plt.scatter(df_train['Column1'],df_train['Column2']-df_train['Column1'],c='r',label='train')
plt.axhline(0,c='k') 
plt.legend()
set_figstyle(-18,-25,-0.5,+0.5,'hh','difference')

The norm matrix elements: taking log(nn) as the target y during training

In [12]:
X_features=['bet1','bet2']
X = np.array(data_dic['JJ=0'][X_features])
y = np.array(data_dic['JJ=0']['nn'])
ylog = np.log(y)

The following pipelines use three different scalers (Robust, MinMax, Standard), which give similar performance; a degree-6 pipeline is also defined for comparison.

In [13]:
poly5_Robust_regression = Pipeline([ 
    ("poly_features",PolynomialFeatures(degree=5)),
    ("scaler",RobustScaler()),
    ("lin_reg",LinearRegression())
])
 
poly5_MM_regression = Pipeline([ 
    ("poly_features",PolynomialFeatures(degree=5)),
    ("scaler",MinMaxScaler()),
    ("lin_reg",LinearRegression())
])
  
poly5_Std_regression = Pipeline([ 
    ("poly_features",PolynomialFeatures(degree=5)),
    ("scaler",StandardScaler()),
    ("lin_reg",LinearRegression())
])
  

poly6_Std_regression = Pipeline([ 
    ("poly_features",PolynomialFeatures(degree=6)),
    ("scaler",StandardScaler()),
    ("lin_reg",LinearRegression())
])
In [14]:
mlmodel4nn_poly5  = plot_learning_curves(poly5_MM_regression,
                                         X,ylog,'poly5-MM','nn','2',
                                         log=False)  
plt.ylim(0,0.1)  # for nn
Out[14]:
(0, 0.1)
In [71]:
mlmodel4nn_poly6  = plot_learning_curves(poly6_Std_regression,
                                         X,ylog,'poly6-Std','nn','2',
                                         log=True)
plt.ylim(0,0.02)  # for nn
Out[71]:
(0, 0.02)
In [16]:
mlmodel4nn_poly5  = plot_learning_curves(poly5_Std_regression,
                                         X,ylog,'poly5-Std','nn','2',
                                         log=True)
plt.ylim(0,0.02)  # for nn
Out[16]:
(0, 0.02)

The above figure shows the significant improvement achieved by using log(nn) in the training.

In [17]:
results4nn = mlmodel4nn_poly5
df4nn_test  = pd.DataFrame({'Column1':results4nn['y_test'],'Column2':results4nn['y_test_pred']})
df4nn_train = pd.DataFrame({'Column1':results4nn['y_train'],'Column2':results4nn['y_train_pred']})

plt.scatter(df4nn_test['Column1'],df4nn_test['Column2']-df4nn_test['Column1'],c='b',label='test')
plt.scatter(df4nn_train['Column1'],df4nn_train['Column2']-df4nn_train['Column1'],c='r',label='train')
plt.axhline(0,c='k') 
plt.legend()

set_figstyle(0.02,0.2,-0.02,+0.02,'norm','difference')
In [18]:
results4nn = mlmodel4nn_poly5
df4nn_test  = pd.DataFrame({'Column1':results4nn['y_test'],'Column2':results4nn['y_test_pred']})
df4nn_train = pd.DataFrame({'Column1':results4nn['y_train'],'Column2':results4nn['y_train_pred']})

plt.scatter(df4nn_test['Column1'],100*(df4nn_test['Column2']-df4nn_test['Column1'])/df4nn_test['Column1'],c='b',label='test')
plt.scatter(df4nn_train['Column1'],100*(df4nn_train['Column2']-df4nn_train['Column1'])/df4nn_train['Column1'],c='r',label='train')
plt.axhline(0,c='k') 
plt.legend()

set_figstyle(0.02,0.2,-20,+20,'norm','difference [%]')

HWG equation
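
For reference, the Hill-Wheeler-Griffin (HWG) equation solved in this section is the generalized eigenvalue problem

$$ \sum_{q'}\left[ H^{J}(q,q') - E^{J\sigma}\, N^{J}(q,q')\right] f^{J\sigma}(q') = 0, $$

where $H^J$ and $N^J$ are the Hamiltonian and norm kernels on the mesh points and $\sigma$ labels the states. Since the norm kernel is nearly singular, a standard way to solve the equation is to work in the eigenbasis of $N$ and discard components with small eigenvalues; a minimal sketch of that procedure is given after the results below.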

In [19]:
m_samples =50 #-1 #200
n_mesh = 19 #18 # 30


model4hh_use =   mlmodel4hh_rbf5['models'][m_samples]
model4nn_use =   mlmodel4nn_poly5['models'][m_samples]
pData        =   Create_XData_OnMesh(n_mesh) 
#pData        = Create_XData_OnMeshByRange(n_mesh,-0.4,0.4)


pfeatures    = ['bet1','bet2']
pData['hhp'] = model4hh_use.predict(pData[pfeatures])
pData['lognnp'] = model4nn_use.predict(pData[pfeatures])

pData['nnp'] = np.exp(pData['lognnp'])

# delete the data outlier
pDatar = pData[(pData['bet1']<0.4) & (pData['bet2']<0.4)] 
#pDatar = pData
#pDatar.tail(10)
In [20]:
# exact
 
dataJ0 = data[data['JJ']==0]  
hhmat = df2matrix(dataJ0,'q1','q2','hh' ) 
nnmat = df2matrix(dataJ0,'q1','q2','nn')

nn_exact = nnmat
hh_exact = hhmat 


HH0      = np.multiply(hh_exact,nn_exact) 
solution = HWG_Solution(HH0,nn_exact,1,'nos',redsolid)
solution = HWG_Solution(HH0,nn_exact,2,'nos',redsolid)
solution = HWG_Solution(HH0,nn_exact,3,'nos',redsolid)  


# prediction

HH       = df2matrix(pDatar,'q1','q2','hhp')
NN       = df2matrix(pDatar,'q1','q2','nnp') 
HHn      = np.multiply(HH,NN)
solution = HWG_Solution(HHn,NN,1,'nos',bluedash)
solution = HWG_Solution(HHn,NN,2,'nos',bluedash)
solution = HWG_Solution(HHn,NN,3,'nos',bluedash) 

set_figstyle(1,HHn.shape[0],-23,-16,'NoS','E [MeV]') 

plt.xlim(0,15)
The eigenvalues of norm: [-2.13095120e-07 -1.24874923e-07 -2.27940377e-08  4.06258018e-07
  2.66373857e-06  3.74198885e-06  1.02132212e-05  8.15083783e-05
  1.37870928e-04  1.76111481e-03  2.37105594e-03  1.57995039e-02
  5.08046758e-02  1.24043844e-01  1.43236900e+00]
The energy of the state: [-20.652369615782895, -21.75228090237229, -21.830596982083218, -21.977477485251566, -21.996928699861535, -22.0071621870279, -22.023496362923062, -22.040467905716277, -22.04549543737467, -22.051011574582997, -22.103317360552722, -22.124030373301295]
The eigenvalues of norm: [-2.13095120e-07 -1.24874923e-07 -2.27940377e-08  4.06258018e-07
  2.66373857e-06  3.74198885e-06  1.02132212e-05  8.15083783e-05
  1.37870928e-04  1.76111481e-03  2.37105594e-03  1.57995039e-02
  5.08046758e-02  1.24043844e-01  1.43236900e+00]
The energy of the state: [-15.960670328647392, -16.95661934112475, -16.99012833244533, -16.995134846709895, -17.08246935833585, -17.1031882583619, -17.122147310649368, -17.13146813062061, -17.173481870455944, -17.266982454486552, -18.936746646130075]
The eigenvalues of norm: [-2.13095120e-07 -1.24874923e-07 -2.27940377e-08  4.06258018e-07
  2.66373857e-06  3.74198885e-06  1.02132212e-05  8.15083783e-05
  1.37870928e-04  1.76111481e-03  2.37105594e-03  1.57995039e-02
  5.08046758e-02  1.24043844e-01  1.43236900e+00]
The energy of the state: [-15.751517409612568, -16.60134770123154, -16.619621632998175, -16.95070267302671, -16.95876279816134, -16.969090361972384, -16.970487671209987, -16.984970733217146, -17.06394992032665, -17.171629809285438]
The eigenvalues of norm: [-9.40383100e-03 -5.19295987e-04 -3.29552485e-04 -2.21769589e-04
 -9.41776480e-05 -7.87229048e-05 -2.28542968e-05  5.25510294e-05
  1.65907734e-04  1.91944341e-04  6.21238227e-04  6.97459654e-03
  4.94743868e-02  1.17860809e-01  1.43023591e+00]
The energy of the state: [-20.666277217494162, -21.668019146486202, -21.726278756710805, -21.84173733740657, -21.88175306747207, -21.881940965817318, -21.895530929235562, -21.949325155071403]
The eigenvalues of norm: [-9.40383100e-03 -5.19295987e-04 -3.29552485e-04 -2.21769589e-04
 -9.41776480e-05 -7.87229048e-05 -2.28542968e-05  5.25510294e-05
  1.65907734e-04  1.91944341e-04  6.21238227e-04  6.97459654e-03
  4.94743868e-02  1.17860809e-01  1.43023591e+00]
The energy of the state: [-16.185047540912564, -16.664875765241366, -17.334273429258992, -17.393727145761506, -17.426082174041298, -17.441365169598924, -17.519692758958378]
The eigenvalues of norm: [-9.40383100e-03 -5.19295987e-04 -3.29552485e-04 -2.21769589e-04
 -9.41776480e-05 -7.87229048e-05 -2.28542968e-05  5.25510294e-05
  1.65907734e-04  1.91944341e-04  6.21238227e-04  6.97459654e-03
  4.94743868e-02  1.17860809e-01  1.43023591e+00]
The energy of the state: [-16.008906767201058, -16.65244973883094, -16.65829350805017, -16.680047060044572, -16.680163352939406, -16.68457953332698]
/Users/jmyao/Documents/WorkSpace/Data_Analysis/Machine_Learning/gcm.py:214: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
  matrix = df[col_matrix].as_matrix().reshape(dim,dim)
Out[20]:
(0, 15)
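
The lists of norm eigenvalues printed above are the input to the natural-basis construction mentioned under the HWG heading. Below is a minimal sketch of that standard procedure; it is not necessarily identical to HWG_Solution in gcm.py, and the cutoff eps is an illustrative choice.

import numpy as np

def solve_hwg(H, N, eps=1e-3):
    """Solve sum_q' [H(q,q') - E N(q,q')] f(q') = 0 in the natural basis of N."""
    w, U = np.linalg.eigh(N)              # eigenvalues/eigenvectors of the norm kernel
    keep = w > eps                        # discard (near-)singular natural states
    T = U[:, keep] / np.sqrt(w[keep])     # map to an orthonormal collective basis
    Hc = T.T @ H @ T                      # collective Hamiltonian in that basis
    E, g = np.linalg.eigh(Hc)             # energies and weights
    return E, g

# e.g. with the exact kernels built above (HH0 = hh * nn is the unnormalized kernel):
E_exact, _ = solve_hwg(HH0, nn_exact, eps=1e-3)
print(E_exact[:3])

Varying the number of retained natural states corresponds to the NoS axis used in the plots above.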

Interpolation with Spline

Filter out the outliers

In [21]:
data0 = data0[abs(data0['nn'])>0.001]
data0 = data0[(abs(data0['bet1'])<0.40) & (abs(data0['bet2'])<0.40)]
In [22]:
 

dataJ0 = data0
coords = dataJ0['bet1'].unique()

# hamiltonian
hhmat = df2matrix(dataJ0,'q1','q2','hh')
# norm
nnmat = df2matrix(dataJ0,'q1','q2','nn')
/Users/jmyao/Documents/WorkSpace/Data_Analysis/Machine_Learning/gcm.py:214: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
  matrix = df[col_matrix].as_matrix().reshape(dim,dim)

Include the full dataset in the interpolation

In [23]:
hhi = interp.RectBivariateSpline(coords, coords, hhmat)
nni = interp.RectBivariateSpline(coords, coords, nnmat)
print("rms error for hh: {}".format(np.sqrt(mean_squared_error(hhi(*np.ix_(coords,coords)),  hhmat ))))
print("rms error for nn: {}".format(np.sqrt(mean_squared_error(nni(*np.ix_(coords,coords)),  nnmat ))))
rms error for hh: 4.431012459792239e-15
rms error for nn: 2.1736945807150597e-17
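
The fitted splines can also be evaluated on a denser mesh, which is how they would supply additional kernels for the HWG equation. A minimal sketch (the mesh size of 50 is an arbitrary choice):

# evaluate the full-data splines on a finer deformation grid
fine = np.linspace(coords.min(), coords.max(), 50)
hh_fine = hhi(*np.ix_(fine, fine))   # (50, 50) normalized Hamiltonian kernels
nn_fine = nni(*np.ix_(fine, fine))   # (50, 50) norm kernels
print(hh_fine.shape, nn_fine.shape)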

Include 1/4 of the dataset in the interpolation (every second point in each direction)

In [24]:
hhi_spare = interp.RectBivariateSpline(coords[::2], coords[::2], hhmat[::2,::2])
nni_spare = interp.RectBivariateSpline(coords[::2], coords[::2], nnmat[::2,::2])
print("rms error for hh: {}".format(np.sqrt(mean_squared_error(hhi_spare(*np.ix_(coords,coords)),  hhmat ))))
print("rms error for nn: {}".format(np.sqrt(mean_squared_error(nni_spare(*np.ix_(coords,coords)),  nnmat ))))
rms error for hh: 0.06714349353905456
rms error for nn: 0.0029656296135663705
In [25]:
plt.contourf(coords, coords, hhi_spare(*np.ix_(coords,coords)) - hhmat)
plt.xlim(-.35,.35)
plt.ylim(-.35,.35)
plt.colorbar()
plt.title("Error in Normalized Hamiltonian Kernels")
plt.show()

plt.contourf(coords, coords, nni_spare(*np.ix_(coords,coords)) - nnmat)
plt.xlim(-.35,.35)
plt.ylim(-.35,.35)
plt.colorbar()
plt.title("Error in Norm Kernels")
 
Out[25]:
Text(0.5,1,'Error in Norm Kernels')
In [26]:
# exact

nn_exact = nnmat
hh_exact = hhmat

HH0      = np.multiply(hh_exact,nn_exact)
solution = HWG_Solution(HH0,nn_exact,1,'nos',redsolid)
solution = HWG_Solution(HH0,nn_exact,2,'nos',redsolid) 

# half

# nnh = nnmat[::2,::2]
# hhh = hhmat[::2,::2]

# HHh      = np.multiply(hhh,nnh)
# solution = HWG_Solution(HHh,nnh,1,'nos',blackdashdot)
# solution = HWG_Solution(HHh,nnh,2,'nos',blackdashdot)


# prediction

nn_pred  = nni_spare(*np.ix_(coords,coords))
hh_pred  = hhi_spare(*np.ix_(coords,coords))

HHp      = np.multiply(hh_pred,nn_pred)
solution = HWG_Solution(HHp,nn_pred,1,'nos',bluedash)
solution = HWG_Solution(HHp,nn_pred,2,'nos',bluedash)

plt.ylim(-25,-16)
plt.xlim(0,15)


plt.xlabel(r'NOS',fontsize=24)
plt.ylabel('E (MeV)', fontsize=24)

plt.legend(['Full(k=1)','Full(k=2)',
            'Spline1/4(k=1)','Spline1/4(k=2)'],fontsize=14)
The eigenvalues of norm: [-2.13095120e-07 -1.24874923e-07 -2.27940377e-08  4.06258018e-07
  2.66373857e-06  3.74198885e-06  1.02132212e-05  8.15083783e-05
  1.37870928e-04  1.76111481e-03  2.37105594e-03  1.57995039e-02
  5.08046758e-02  1.24043844e-01  1.43236900e+00]
The energy of the state: [-20.652369615782895, -21.75228090237229, -21.830596982083218, -21.977477485251566, -21.996928699861535, -22.0071621870279, -22.023496362923062, -22.040467905716277, -22.04549543737467, -22.051011574582997, -22.103317360552722, -22.124030373301295]
The eigenvalues of norm: [-2.13095120e-07 -1.24874923e-07 -2.27940377e-08  4.06258018e-07
  2.66373857e-06  3.74198885e-06  1.02132212e-05  8.15083783e-05
  1.37870928e-04  1.76111481e-03  2.37105594e-03  1.57995039e-02
  5.08046758e-02  1.24043844e-01  1.43236900e+00]
The energy of the state: [-15.960670328647392, -16.95661934112475, -16.99012833244533, -16.995134846709895, -17.08246935833585, -17.1031882583619, -17.122147310649368, -17.13146813062061, -17.173481870455944, -17.266982454486552, -18.936746646130075]
The eigenvalues of norm: [-7.62436670e-17 -2.40932552e-17 -2.38837410e-17  3.03890684e-18
  1.17782171e-17  1.68214153e-17  4.62955169e-17  2.19019814e-06
  6.53702186e-05  1.91596956e-03  2.98868014e-03  1.66571899e-02
  5.03617172e-02  1.22976972e-01  1.41957407e+00]
The energy of the state: [-20.66893640012859, -21.76883550283229, -21.845439311925944, -22.000913128956732, -22.013117764272934, -22.035995515017873, -22.05993456871574, -22.07154461292473, -22.18889833981669, -26.98217270842582, -30.834310831996657, -361590101080.947]
The eigenvalues of norm: [-7.62436670e-17 -2.40932552e-17 -2.38837410e-17  3.03890684e-18
  1.17782171e-17  1.68214153e-17  4.62955169e-17  2.19019814e-06
  6.53702186e-05  1.91596956e-03  2.98868014e-03  1.66571899e-02
  5.03617172e-02  1.22976972e-01  1.41957407e+00]
The energy of the state: [-15.843297424892667, -16.712806756616114, -16.714017341568827, -16.736126615085666, -17.0619215449819, -17.125581242744243, -18.548142926893508, -19.508334096132838, -21.974914583299224, -22.00820789672313, -22.00852175637609]
Out[26]:
<matplotlib.legend.Legend at 0x10488c518>

Include 1/9 of the dataset in the interpolation (every third point in each direction)

In [27]:
hhi_spare3 = interp.RectBivariateSpline(coords[::3], coords[::3], hhmat[::3,::3])
nni_spare3 = interp.RectBivariateSpline(coords[::3], coords[::3], nnmat[::3,::3])
print("rms error for hh: {}".format(np.sqrt(mean_squared_error(hhi_spare(*np.ix_(coords,coords)),  hhmat ))))
print("rms error for nn: {}".format(np.sqrt(mean_squared_error(nni_spare(*np.ix_(coords,coords)),  nnmat ))))
rms error for hh: 0.06714349353905456
rms error for nn: 0.0029656296135663705
In [28]:
hhmat[::3,::3].shape
Out[28]:
(5, 5)
In [29]:
plt.contourf(coords, coords, hhi_spare3(*np.ix_(coords,coords)) - hhmat)
plt.xlim(-.35,.35)
plt.ylim(-.35,.35)
plt.colorbar()
plt.title("Error in Normalized Hamiltonian Kernels")
plt.show()

plt.contourf(coords, coords, nni_spare3(*np.ix_(coords,coords)) - nnmat)
plt.xlim(-.35,.35)
plt.ylim(-.35,.35)
plt.colorbar()
plt.title("Error in Norm Kernels")
 
Out[29]:
Text(0.5,1,'Error in Norm Kernels')
In [30]:
# exact

nn_exact = nnmat
hh_exact = hhmat

HH0      = np.multiply(hh_exact,nn_exact)
solution = HWG_Solution(HH0,nn_exact,1,'nos',redsolid)
solution = HWG_Solution(HH0,nn_exact,2,'nos',redsolid) 


# half

# nn3 = nnmat[::3,::3]
# hh3 = hhmat[::3,::3]

# HH3      = np.multiply(hh3,nn3)
# solution = HWG_Solution(HH3,nn3,1,'nos',blackdashdot)
# solution = HWG_Solution(HH3,nn3,2,'nos',blackdashdot)




# prediction

nn_pred  = nnmat  # note: the exact norm kernel is kept here; only hh comes from the 1/9 spline (alternative: nni_spare3(*np.ix_(coords,coords)))
hh_pred  = hhi_spare3(*np.ix_(coords,coords))

HHp      = np.multiply(hh_pred,nn_pred)
solution = HWG_Solution(HHp,nn_pred,1,'nos',bluedash)
solution = HWG_Solution(HHp,nn_pred,2,'nos',bluedash)

plt.ylim(-25,-16)
plt.xlim(0,15)


plt.xlabel(r'NOS',fontsize=24)
plt.ylabel('E (MeV)', fontsize=24)
 

plt.legend(['Full(k=1)','Full(k=2)',
            'Spline1/9(k=1)','Spline1/9(k=2)'],fontsize=14)
The eigenvalues of norm: [-2.13095120e-07 -1.24874923e-07 -2.27940377e-08  4.06258018e-07
  2.66373857e-06  3.74198885e-06  1.02132212e-05  8.15083783e-05
  1.37870928e-04  1.76111481e-03  2.37105594e-03  1.57995039e-02
  5.08046758e-02  1.24043844e-01  1.43236900e+00]
The energy of the state: [-20.652369615782895, -21.75228090237229, -21.830596982083218, -21.977477485251566, -21.996928699861535, -22.0071621870279, -22.023496362923062, -22.040467905716277, -22.04549543737467, -22.051011574582997, -22.103317360552722, -22.124030373301295]
The eigenvalues of norm: [-2.13095120e-07 -1.24874923e-07 -2.27940377e-08  4.06258018e-07
  2.66373857e-06  3.74198885e-06  1.02132212e-05  8.15083783e-05
  1.37870928e-04  1.76111481e-03  2.37105594e-03  1.57995039e-02
  5.08046758e-02  1.24043844e-01  1.43236900e+00]
The energy of the state: [-15.960670328647392, -16.95661934112475, -16.99012833244533, -16.995134846709895, -17.08246935833585, -17.1031882583619, -17.122147310649368, -17.13146813062061, -17.173481870455944, -17.266982454486552, -18.936746646130075]
The eigenvalues of norm: [-2.13095120e-07 -1.24874923e-07 -2.27940377e-08  4.06258018e-07
  2.66373857e-06  3.74198885e-06  1.02132212e-05  8.15083783e-05
  1.37870928e-04  1.76111481e-03  2.37105594e-03  1.57995039e-02
  5.08046758e-02  1.24043844e-01  1.43236900e+00]
The energy of the state: [-20.560086189488253, -21.581182305285697, -21.616124137032394, -21.799517024565244, -21.833518115598476, -21.866225133466813, -21.92035305487783, -21.963913813700923, -21.967858373659478, -21.97631982894286, -22.263964436694387, -44.73681984428305]
The eigenvalues of norm: [-2.13095120e-07 -1.24874923e-07 -2.27940377e-08  4.06258018e-07
  2.66373857e-06  3.74198885e-06  1.02132212e-05  8.15083783e-05
  1.37870928e-04  1.76111481e-03  2.37105594e-03  1.57995039e-02
  5.08046758e-02  1.24043844e-01  1.43236900e+00]
The energy of the state: [-16.483721602420978, -18.28907404558693, -19.586057226328233, -19.94512974322492, -21.391469364678926, -21.393383676900605, -21.430267902892076, -21.477168782299117, -21.491393806409587, -21.92576114003155, -21.989148417345174]
Out[30]:
<matplotlib.legend.Legend at 0x10ab1ed68>

Conclusions from doing the interpolation directly with splines:

  • When the full dataset is used, the precision is extremely high (rms errors at the level of numerical precision).
  • When every second point in each direction is kept (1/4 of the dataset), the rms error is 0.067 for the Hamiltonian kernels and 0.003 for the norm kernels, and the solution of the HWG equation is well reproduced.
  • When every third point in each direction is kept (1/9 of the dataset), the rms error for hh is 0.499 and for nn is 0.011.

Artificial Neural Network

For Hamiltonian kernels

In [31]:
X = np.array(data_dic['JJ=0'][X_features])
y = np.array(data_dic['JJ=0']['hh'])
In [32]:
X_train,X_test,y_train, y_test = train_test_split(X,y,test_size=0.2)
X_train.shape,y_train.shape

X4train      = X_train.T
yhh4train    = y_train.reshape(1,y_train.shape[0])

X4test      = X_test.T
yhh4test    = y_test.reshape(1,y_test.shape[0])
In [33]:
# Build a model with a n_h-dimensional hidden layer

parameters4hh,error = nn_model(X4train, yhh4train, n_h = 20, 
                         num_iterations = 1e6,
                         learning_rate=0.05, print_cost=True, log=False)
100%|██████████| 2000000/2000000 [08:51<00:00, 3759.97it/s]
In [34]:
# Print accuracy
predictions4hh = predict(parameters4hh, X4train)
print('rms error={}'.format(np.sqrt(mean_squared_error(yhh4train,predictions4hh))))
rms error=0.08668861946181737
In [77]:
plt.plot(error['steps'],error['rmse'])
plt.xscale('log') 

set_figstyle(1,1e7,0,0.6,'steps','rms error')
In [36]:
# train_data
orig        = pd.DataFrame(yhh4train.T,columns=['calc.'])

predictions = predict(parameters4hh, X4train)
pred        = pd.DataFrame(predictions.T,columns=['pred.']) 

#predictions4hh = predict(parameters4hh, X4train)
#pred = pd.DataFrame(predictions4hh.T,columns=['pred.']) 

df4hh = pd.concat([pred,orig],axis=1)
plt.scatter(df4hh['calc.'],df4hh['pred.']-df4hh['calc.'],c='r',label='train')

# test data
orig_test = pd.DataFrame(yhh4test.T,columns=['calc.'])
pred_test = pd.DataFrame(predict(parameters4hh, X4test).T,columns=['pred.']) 
df4hh_test = pd.concat([pred_test,orig_test],axis=1)
plt.scatter(df4hh_test['calc.'],df4hh_test['pred.']-df4hh_test['calc.'],
            c='b',label='test')

plt.axhline(0,c='k') 
plt.legend()

#set_figstyle(0.02,0.2,-0.5,0.5,'hh','difference')
Out[36]:
<matplotlib.legend.Legend at 0x1a16d5ce80>

Norm kernels

In [37]:
X = np.array(data_dic['JJ=0'][X_features])
y = np.log(np.array(data_dic['JJ=0']['nn']))
X.shape,y.shape
Out[37]:
((225, 2), (225,))
In [38]:
X_train,X_test,y_train, y_test = train_test_split(X,y,test_size=0.2)

X4train      = X_train.T
ynn4train    = y_train.reshape(1,y_train.shape[0])

X4test      = X_test.T
ynn4test    = y_test.reshape(1,y_test.shape[0])
In [39]:
X4test.shape
Out[39]:
(2, 45)

Information on the dataset:

  • X4train, X4test -- dimension (n_x = 2, m), where n_x is the number of features and m is the number of samples (m = 180 for training and m = 45 for testing after the 80/20 split).
  • ynn4train, ynn4test -- dimension (1, m).
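
The nn_model and predict functions used below come from the local SNN module and consume data in exactly this layout (features along the rows, samples along the columns). For orientation, a minimal sketch of a one-hidden-layer regression network with the same layout is given here; it assumes a tanh hidden activation and plain gradient descent on the mean-squared error, which may differ from the actual implementation in SNN.py.

import numpy as np

def init_params(n_x, n_h, seed=0):
    rng = np.random.default_rng(seed)
    return {"W1": rng.normal(scale=0.1, size=(n_h, n_x)), "b1": np.zeros((n_h, 1)),
            "W2": rng.normal(scale=0.1, size=(1, n_h)),  "b2": np.zeros((1, 1))}

def forward(p, X):
    A1 = np.tanh(p["W1"] @ X + p["b1"])        # hidden layer, shape (n_h, m)
    Y_hat = p["W2"] @ A1 + p["b2"]             # linear output, shape (1, m)
    return Y_hat, A1

def train(X, Y, n_h=20, iters=20000, lr=0.05):
    m = X.shape[1]
    p = init_params(X.shape[0], n_h)
    for _ in range(iters):
        Y_hat, A1 = forward(p, X)
        dZ2 = (Y_hat - Y) / m                    # gradient of 0.5*MSE at the linear output
        grads = {"W2": dZ2 @ A1.T, "b2": dZ2.sum(axis=1, keepdims=True)}
        dZ1 = (p["W2"].T @ dZ2) * (1.0 - A1**2)  # back-propagate through tanh
        grads["W1"] = dZ1 @ X.T
        grads["b1"] = dZ1.sum(axis=1, keepdims=True)
        for k in grads:
            p[k] -= lr * grads[k]
    return p

# e.g. p = train(np.asarray(X4train, float), np.asarray(ynn4train, float))
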
In [40]:
# Build a model with a n_h-dimensional hidden layer

parameters4nn, error_nn = nn_model(X4train, ynn4train, n_h = 20, 
                                   num_iterations = 1e6, 
                                   learning_rate=0.1, print_cost=True, log=True)
100%|██████████| 500000/500000 [02:07<00:00, 3911.27it/s]
In [41]:
# Print accuracy
predictions4nn = predict(parameters4nn, X4train)
print('rms error={}'.format(np.sqrt(mean_squared_error(ynn4train,predictions4nn))))
rms error=0.03326924619773885
In [78]:
plt.plot(error_nn['steps'],error_nn['rmse'])
plt.xscale('log')

set_figstyle(1,1e7,0,0.06,'steps','rms error')
In [43]:
# train data

orig = pd.DataFrame(np.exp(ynn4train).T,columns=['calc.']) # undo the log transform: the targets were log(nn)

#pred = pd.DataFrame(np.exp(predictions4nn).T,columns=['pred.']) 

predictions = predict(parameters4nn, X4train)
pred        = pd.DataFrame(np.exp(predictions).T,columns=['pred.']) 


df4nn = pd.concat([pred,orig],axis=1)
plt.scatter(df4nn['calc.'],df4nn['pred.']-df4nn['calc.'],c='r',label='train')

 
# test data
orig_test = pd.DataFrame(np.exp(ynn4test).T,columns=['calc.'])

predictions = predict(parameters4nn, X4test)
pred_test   = pd.DataFrame(np.exp(predictions).T,columns=['pred.']) 

df4nn_test = pd.concat([pred_test,orig_test],axis=1)
plt.scatter(df4nn_test['calc.'],df4nn_test['pred.']-df4nn_test['calc.'],
            c='b',label='test')
 
    
    
plt.axhline(0,c='k') 
plt.legend()

set_figstyle(0.02,0.2,-0.02,+0.02,'norm','difference')

HWG equation

In [44]:
X   = np.array(data_dic['JJ=0'][X_features])
yhh = np.array(data_dic['JJ=0']['hh'])
ynn = np.array(data_dic['JJ=0']['nn'])

# log for n
ynnlog = np.log(np.array(data_dic['JJ=0']['nn']))
In [45]:
X4all      = X.T
yhh4all    = yhh.reshape(1,yhh.shape[0])
ynn4all    = ynn.reshape(1,ynn.shape[0])
In [46]:
ynn_pred = np.exp(predict(parameters4nn, X4all))
#yhh_pred = predict(parameters4hh, X4all)
In [58]:
def Create_XData_OnFineMesh(N,step):
    # Build all (q1, q2) index pairs on an N x N mesh (Creat_Twolist4Mesh is from gcm.py)
    q1list,q2list = Creat_Twolist4Mesh(N)
    dftest = pd.DataFrame([q1list,q2list])
    dftest = dftest.T
    dftest.columns = ['q1','q2']
    # Map the integer mesh indices to deformation values with spacing `step`, centered around zero
    dftest['bet1'] = dftest['q1']*step-N*step/2
    dftest['bet2'] = dftest['q2']*step-N*step/2
    return dftest
In [68]:
n_mesh  =   20

pData           =   Create_XData_OnFineMesh(n_mesh,0.1)  

pfeatures       = ['bet1','bet2'] 
X4all           = pData[pfeatures].T 
pData['hhp']    = predict(parameters4hh, X4all).T
pData['lognnp'] = predict(parameters4nn, X4all).T
pData['nnp']    = np.exp(pData['lognnp'])
# delete the data outlier
pDatar = pData[(pData['bet1']<0.4) & (pData['bet2']<0.4)] 
#pDatar = pData
pDatar.shape
Out[68]:
(169, 7)
In [69]:
pDatar['bet1'].unique()
Out[69]:
array([-0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1,  0. ,  0.1,
        0.2,  0.3])
In [70]:
# exact
 
dataJ0 = data[data['JJ']==0]  
hhmat = df2matrix(dataJ0,'q1','q2','hh' ) 
nnmat = df2matrix(dataJ0,'q1','q2','nn')

nn_exact = nnmat
hh_exact = hhmat 


HH0      = np.multiply(hh_exact,nn_exact) 
solution = HWG_Solution(HH0,nn_exact,1,'nos',redsolid)
solution = HWG_Solution(HH0,nn_exact,2,'nos',redsolid)  


# prediction

HH       = df2matrix(pDatar,'q1','q2','hhp')
NN       = df2matrix(pDatar,'q1','q2','nnp') 
HHn      = np.multiply(HH,NN)
solution = HWG_Solution(HHn,NN,1,'nos',bluedash)
solution = HWG_Solution(HHn,NN,2,'nos',bluedash)  

set_figstyle(1,HHn.shape[0],-23,-16,'NoS','E [MeV]') 

plt.legend(['Exact(k=1)','Exact(k=2)','ANN(k=1)','ANN(k=2)'],fontsize=16)
plt.xlim(0,15)
The eigenvalues of norm: [-2.13095120e-07 -1.24874923e-07 -2.27940377e-08  4.06258018e-07
  2.66373857e-06  3.74198885e-06  1.02132212e-05  8.15083783e-05
  1.37870928e-04  1.76111481e-03  2.37105594e-03  1.57995039e-02
  5.08046758e-02  1.24043844e-01  1.43236900e+00]
The energy of the state: [-20.652369615782895, -21.75228090237229, -21.830596982083218, -21.977477485251566, -21.996928699861535, -22.0071621870279, -22.023496362923062, -22.040467905716277, -22.04549543737467, -22.051011574582997, -22.103317360552722, -22.124030373301295]
The eigenvalues of norm: [-2.13095120e-07 -1.24874923e-07 -2.27940377e-08  4.06258018e-07
  2.66373857e-06  3.74198885e-06  1.02132212e-05  8.15083783e-05
  1.37870928e-04  1.76111481e-03  2.37105594e-03  1.57995039e-02
  5.08046758e-02  1.24043844e-01  1.43236900e+00]
The energy of the state: [-15.960670328647392, -16.95661934112475, -16.99012833244533, -16.995134846709895, -17.08246935833585, -17.1031882583619, -17.122147310649368, -17.13146813062061, -17.173481870455944, -17.266982454486552, -18.936746646130075]
The eigenvalues of norm: [-4.69937369e-02 -6.44116708e-03 -2.26392463e-03 -1.34064901e-03
 -3.35352621e-04 -9.94548147e-05  1.29169335e-05  1.09928304e-03
  5.10491662e-03  2.09659355e-02  4.55606798e-02  7.18053819e-02
  7.20694735e-01]
The energy of the state: [-20.399037841621805, -21.090080954831148, -22.059969433236212, -22.0673642537879, -22.09304033084606, -22.139411098789214, -29.16981432990069]
The eigenvalues of norm: [-4.69937369e-02 -6.44116708e-03 -2.26392463e-03 -1.34064901e-03
 -3.35352621e-04 -9.94548147e-05  1.29169335e-05  1.09928304e-03
  5.10491662e-03  2.09659355e-02  4.55606798e-02  7.18053819e-02
  7.20694735e-01]
The energy of the state: [-15.547714521828498, -16.652295079609587, -17.422052634986006, -17.42238301803042, -17.567238452775122, -22.075358999333808]
/Users/jmyao/Documents/WorkSpace/Data_Analysis/Machine_Learning/gcm.py:214: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
  matrix = df[col_matrix].as_matrix().reshape(dim,dim)
Out[70]:
(0, 15)
In [79]:
plt.imshow(np.log(nnmat))
plt.colorbar()
Out[79]:
<matplotlib.colorbar.Colorbar at 0x1a19b44a58>

Conclusion

We have used three different techniques to interpolate both the Hamiltonian kernels and the norm kernels: conventional ML (SVR with an rbf kernel for the Hamiltonian kernel and a degree-5 polynomial regression for the norm kernel), spline interpolation, and an ANN with one hidden layer of 20 neurons. As a first step, we only consider the kernels as functions of two variables ($q_1=\beta_1, q_2=\beta_2$),

$$ N^{J=0}(q_1, q_2) = \langle\Phi(q_1)\vert \hat P^J \hat P^N \hat P^Z \vert\Phi(q_2)\rangle, $$

and

$$ H^{J=0}(q_1, q_2) = \langle\Phi(q_1)\vert \hat H \hat P^J \hat P^N \hat P^Z \vert\Phi(q_2)\rangle, $$ where $\hat H$ is the Hamiltonian operator. The findings are summarized below:

  • For this 2D case, the spline does a better job than the other two methods.
  • The Hamiltonian and norm kernels can be fitted rather well even without scaling the data.
  • Taking the log of the norm kernel as the target $y$ gives a much better performance, which can be understood from the physics point of view: the norm kernel behaves approximately as a Gaussian overlap, $$ N(q_1, q_2) \sim \exp\left[-(q_1 - q_2)\, G(q_1, q_2)\, (q_1 - q_2)\right], $$ so $\log N$ is a smooth, slowly varying function of $(q_1, q_2)$ that is much easier to fit with a low-degree polynomial.

The next step will be to consider the higher-dimensional case.