Overview of example use

This is an example workflow for doing the whole calculation, except calculating potential itself that has been explained else where.

Molecular dynamics itself consist of 4 phases:

Equilibration of initial geometry in gas phase
Condensing gas to solid.
Equilibration to DFT
Spectrum calculation

Fitting Potential

First we need to fit a potential. In this example we fit a potential for trans-formic acid and argon. PotentialDB has pre calculated data than we can use for fitting the potential.

using PotentialDB
r = defaultregistry()
data = loadpotential(r, "4")

Dict{String, Any} with 6 entries:
  "Method"   => "CCSD(T)-F12/RI"
  "cluster2" => Cluster{AtomOnlySymbol} of 1 atoms
  "Energy"   => [0.0172099 0.0232891 … 0.0210108 0.0160872; 0.0104979 0.0147825…
  "Points"   => Cluster{AtomOnlySymbol}[Cluster{AtomOnlySymbol} of 6 atoms Clus…
  "cluster1" => Cluster{AtomOnlySymbol} of 5 atoms
  "Basis"    => "cc-pVDZ-F12 cc-pVDZ-F12-CABS cc-pVTZ/C TIGHTSCF"

The fitting procedure is the same as explained in Usage section.

First we create a potential for fitting.

using PotentialFitting, PotentialCalculation

m1, m2 = get_molecules(data)

mpp = MoleculePairPotential(m1,m2, GeneralPowers(-6,-8,-10,-12,-7,-9,-11))

Molecule Pair Potential
Molecule 1 has 5 atoms (5 unique) 
Molecule 2 has 1 atoms (1 unique) 

Potential topology:

r^-6  r^-8  r^-10  r^-12  r^-7  r^-9  r^-11   *** 0.0  0.0  0.0  0.0  0.0  0.0  0.0   :  1C :  6Ar
r^-6  r^-8  r^-10  r^-12  r^-7  r^-9  r^-11   *** 0.0  0.0  0.0  0.0  0.0  0.0  0.0   :  2O :  6Ar
r^-6  r^-8  r^-10  r^-12  r^-7  r^-9  r^-11   *** 0.0  0.0  0.0  0.0  0.0  0.0  0.0   :  3O :  6Ar
r^-6  r^-8  r^-10  r^-12  r^-7  r^-9  r^-11   *** 0.0  0.0  0.0  0.0  0.0  0.0  0.0   :  4H :  6Ar
r^-6  r^-8  r^-10  r^-12  r^-7  r^-9  r^-11   *** 0.0  0.0  0.0  0.0  0.0  0.0  0.0   :  5H :  6Ar

Then we prepare fitting data and apply some constrains. The reason why we applying constrains is that our potential is under fitting and as a result depends heavily on the constrains. You can verify this, by taking constrains off and comparing to constrained potential, or by changing the constrains considerably.

fdata = FitData(mpp, data["Points"], data["Energy"])

# setweight_e_more!(fdata, weight, minimum_energy)
setweight_e_more!(fdata, 0.1, 1000u"cm^-1")
setweight_e_more!(fdata, 0.01, 5000u"cm^-1")
setweight_e_less!(fdata, 2, -50u"cm^-1")
setweight_e_less!(fdata, 4, -75u"cm^-1")
setweight_e_less!(fdata, 16, -100u"cm^-1");

Optional defining a model

Then we prepare fitting tools for fitting. Standard linear regression is most likely good enough. But you can try others as well.

Here we load LinearRegressor that does not fit intercept

using MLJ
LinearRegressor = @load LinearRegressor pkg=GLM

model = LinearRegressor(fit_intercept=false)

LinearRegressor(
  fit_intercept = false, 
  dropcollinear = false, 
  offsetcol = nothing, 
  report_keys = [:deviance, :dof_residual, :stderror, :vcov, :coef_table])

Fitting potential

Fit using default model

fit_potential!(mpp, fdata)

Molecule Pair Potential
Molecule 1 has 5 atoms (5 unique) 
Molecule 2 has 1 atoms (1 unique) 

Potential topology:

r^-6  r^-8  r^-10  r^-12  r^-7  r^-9  r^-11   *** -22538.764664060964  -5.243025376695279e6  -3.2391005886259116e7  1.5156798055682138e8  551011.5920472831  2.2660963253455512e7  -5.316652440075644e7   :  1C :  6Ar
r^-6  r^-8  r^-10  r^-12  r^-7  r^-9  r^-11   *** 13740.236148585838  5.472515665297194e6  1.6959907070386767e8  3.752328806901236e8  -416091.7439865846  -3.998932225141039e7  -3.899070213341327e8   :  2O :  6Ar
r^-6  r^-8  r^-10  r^-12  r^-7  r^-9  r^-11   *** 9792.504974051504  3.139013079090014e6  7.626585584376974e7  1.3066879800125138e8  -268152.3469762912  -2.0302156130230505e7  -1.550009093303048e8   :  3O :  6Ar
r^-6  r^-8  r^-10  r^-12  r^-7  r^-9  r^-11   *** -2432.4139113706888  -633362.37890601  -1.0428608883981721e7  -1.0705245941539187e7  60927.96548650013  3.458886999782353e6  1.6484328349663341e7   :  4H :  6Ar
r^-6  r^-8  r^-10  r^-12  r^-7  r^-9  r^-11   *** 4524.673108913862  858781.5222092883  7.271737286364344e6  -3.2132918849226506e6  -99324.2346945478  -3.629952540388673e6  -4.340602429711378e6   :  5H :  6Ar

or if you defined your own model

fit_potential!(model, mpp, fdata)

Check root mean square deviation for potential.

rmsd(data["Points"], data["Energy"], mpp, emax=-10u"cm^-1")

8.36797599293565 cm^-1

Note

We are under fitting our potential, so we can safely compare the potential to the data we used for the fitting. But it would be a good idea to confirm this, by comparing to some data that was not used for fitting.

To inspect the potential we can use use scan_potential which will calculate potential for given points and visualizes the geometry.

scan_potential(data, mpp)

You can call external programs like VMD to to visualize the data also.

# Visualize line 32 using vmd
visualize_points(data["Points"][:,32], command="vmd")

Preparing CP2K input files

Geometry inputs with AtomsBase (new recommended)

To generate input data for simulations we supply a set of tools. To generate structures you can use

# structure in 86u"Å" cubic box where cluster2 has been added 1000 times
# to cluster 1
str = generate_initial_structure(data["cluster1"], data["cluster2"], 86u"Å", 1000)

FlexibleSystem(Ar₁₀₀₀CH₂O₂, periodic = TTT):
    bounding_box      : [      86        0        0;
                                0       86        0;
                                0        0       86]u"Å"

                      .-------------------------------------------------------------------------.  
                     /|       AAr   Ar                    Ar                                    |  
                    / | Ar                                                                      |  
                   /  |                                            Ar          ArAr             |  
                  /   |                                 Ar     Ar    Ar                    Ar   |  
                 /    |            Ar         Ar     Ar                       ArAr              |  
                /    Ar      Ar              Ar     AAr   ArAr              Ar      Ar   Ar     |  
               /      ArAr    Ar   Ar    Ar      ArAr   Arr            Ar     AArAr             |  
              /       |      Ar Ar Ar Ar  Ar          Ar        Ar    Ar      Ar      Arr       |  
             /    Ar AArr            Ar           Arr  Ar Ar          Ar   Ar          Ar       |  
            /         |            Ar       ArAr    ArAr   Ar  Ar     Ar   Ar     Ar     Ar     |  
           Ar       Arr ArAr   AArr    Ar  AAr   Ar           Ar      Ar   Ar Ar                |  
          /           |Ar Ar   Ar     Ar Ar   Ar   AArrr  Ar         Ar ArAr   ArAArAr  Ar      |  
         /  Ar        |AAr                  Ar  Ar   Ar Ar  Ar AAr                 Ar           |  
        /       Ar  AAr  Ar    AArAr Ar   Ar   Ar  Ar   Ar      Ar AAr              Ar          |  
       /        Ar    |Ar AArAr ArrAr  Arr      ArArAr      Ar     AArr Arr  Ar   AAr       Ar  |  
      /               |              Ar      Arr  Arr      Ar   ArAr Ar  ArArr ArAArr  Ar       |  
     /      ArAr      |   ArAr   Ar  Ar Ar   Ar ArArAr Arr  Ar          AAr     Ar  Ar          |  
    *  Ar AArAr      ArAr    AAArr  Ar  Ar Arr ArAr Arr    Ar    AArrArr ArArr          Ar      |  
    |         Ar Arr  |     ArAr Ar    ArAr     ArAr Ar Arr      Ar    ArAr  Ar    Ar Ar        |  
    |        Ar     ArArAr    Ar    ArrArAr ArAr         ArArr    ArAr   AArAr AAr   Ar   Ar  Ar|  
    |       Ar        Ar  Ar    ArAr      Ar  Arr         Ar    ArAr      AAr        Ar Ar      |  
    |  Ar   Ar Ar     |AArr         Ar        AArAArAr          Ar        Arr   Ar   Ar         |  
    | Ar       AAr    |Ar  Arr  Arr   Ar  ArrrAr  Ar  AAr   ArArAr    AAr       Ar  Ar          |  
    |         Ar      Ar          ArArAr    AArArArAr      Ar  Ar    ArrArrrAr   Arr     Ar     |  
    |       Ar Arr  ArAr          ArrAr      ArAr ArAAArArArArr               Ar     Ar  Ar     |  
    |     Ar     Ar ArrAr Ar   ArArArAr   AArAr  ArAAr        Ar AArrAArr          Ar ArAArAr   |  
    |          Ar  Ar |Ar Ar    ArArAAr   ArAr  ArC Ar Ar Ar Ar    Ar  ArrArrAr Ar   Ar  Ar     |  
    |     Ar   ArrAr Ar   ArAr ArAAr      AArr   ArH  AAr        Arr   Ar      AArr  Ar         |  
    | Ar    Arr  Ar ArAr     Ar Ar     Ar   AAr ArrAAr Ar   ArrrAr     Ar  ArrArArAAr   Arr     |  
    |  Ar    Ar   Ar  |     Ar Ar ArAr     AArr   Ar  Ar ArAr Arr  Ar    Ar     ArAr    ArAArr  |  
    |   Ar Ar     Ar  |Ar  Ar   AAr  Ar Arr   Arr   Ar  ArrAAr AArAr   ArAr  AAAr Ar            |  
    |        Ar       AAr    Arr  Ar Ar   Ar                    Ar     ArArAr  AAr ArrAr        |  
    |Ar   Ar  Ar      Ar AAr     Ar  Ar ArAr Ar Ar Ar  ArAr   Ar    AAArr    AArAr ArAAr        |  
    |     Ar Ar Arr  Ar Ar    Ar   Arr AArAr  Ar     ArArArAr  Arrr Arr Ar ArAr      Ar   Ar    |  
    |     Ar        Ar|    ArrArArr            ArAr ArAr AArAArr       ArArAr ArArr  ArAr AAr   |  
    |      Ar Ar   Ar |  AAArr    Ar      ArAr     Ar      Ar Ar AAr  Arrr Ar          Ar       |  
    |        Ar  Ar AArAr--Ar-Ar--Ar-Ar-Ar-Ar-----Ar----ArAr-------AArArAr----Ar-Ar------Ar-----.  
    |    Ar Ar AAr  Ar  Ar    AAAr  Ar Ar       Arr    Ar         ArrAr  Ar                    /   
    |            Ar / AAr          Arr    ArAr      Arr       ArAr   Ar Ar  Ar           Ar   /    
    |           Arr/              ArArrr      Arr   Ar   Ar  Ar       ArAAr     Ar           /     
    |Ar      Ar   /      Ar  Ar Ar    Ar AAr  ArrAr           Ar  ArAr          AAr Ar      /      
    |         Ar /     AArArArAArAr Ar  Ar Arr     Ar        Ar   ArArrAAr                 /       
    |        ArAr     Ar    Arr             ArAr    Arr       ArArAr  Ar           Ar     /        
    |          Ar            Ar  Arr           Ar Ar        ArArAr        Ar      Ar     /         
    |      Ar /      Ar     Ar  ArAr         Ar  AAr Ar AAr     Ar    Ar   Ar           /          
    |     Ar /  Ar     Ar   Ar     Ar    Ar     Ar  Ar Ar                Ar       Ar   /           
    |       /       Ar    Ar   Ar          AAr             Ar       Ar   Ar           /            
    |Ar    /     Ar         AAr     Ar   Ar    Ar Ar               Ar      Ar        /             
    | AAr /   Ar Ar                       Ar            ArAr  Ar                    /              
    |    /   Ar   Ar  Arr   AAr     Ar      ArArArAr          Ar    Ar  Ar         /               
    | Ar/ Ar    AAr      Ar        Ar         Ar             Ar                   /                
    |  /      AArAr                   Ar                                         /                 
    | /                                   Ar           Ar        ArAr           /                  
    |/    Ar         Ar            Ar                                  Ar      /                   
    *-------------------------------------------------------------------------*

You can edit the atomic symbols to reflect the potential

set_atomic_symbol!(str, :Oh, 2)
set_atomic_symbol!(str, :Oo, 3)
set_atomic_symbol!(str, :Ho, 4)
set_atomic_symbol!(str, :Hc, 5)
show(str[1:6])

Atom{3, Quantity{Float64, 𝐋, Unitful.FreeUnits{(Å,), 𝐋, nothing}}, Quantity{Float64, 𝐋 𝐓^-1, Unitful.FreeUnits{(a₀, s^-1), 𝐋 𝐓^-1, nothing}}, Quantity{Float64, 𝐌, Unitful.FreeUnits{(u,), 𝐌, nothing}}}[Atom(C,  [ 42.6718,  43.0575,  43.2551]u"Å"), Atom(Oh, [ 43.8511,  43.3774,  42.7534]u"Å"), Atom(Oo, [ 42.1775,  42.0041,  43.0524]u"Å"), Atom(Ho, [ 44.1017,  42.7007,  42.0086]u"Å"), Atom(Hc, [ 42.1978,  43.8603,  43.9304]u"Å"), Atom(Ar, [ 57.9096,  35.7209,  43.5668]u"Å")]

You can then save the structure using AtomsIO

using AtomsIO

save_system("init_struct.xyz", str) # xyz-file
save_system("init_struct.pdb", str) # pdb-file

CP2K needs residue information for QM and MM parts. This can be done by using

# Save pdb file that has residue name "tfa" for atoms 1 till 5
# residue name "LIG" for other atoms
save_with_residue_data("tfa-ar-init.pdb", str, "tfa", 1:5)

This file can be used as an initial file in CP2K.

Geometry inputs with Gromacs tools (old way)

First grab your favorite and make two file, one for the molecule/cluster you are studying and one for matrix and save them in PDB-format. Here we have a file for formic acid and an other for argon.

COMPND    UNNAMED
AUTHOR    GENERATED BY OPEN BABEL 2.3.2
HETATM    1  C   tfa     1      42.585  42.840  42.933  1.00  0.00           C
HETATM    2  Oh  tfa     1      43.930  42.838  42.829  1.00  0.00           O
HETATM    3  Oo  tfa     1      42.014  43.738  43.531  1.00  0.00           O
HETATM    4  Ho  tfa     1      44.463  43.543  43.222  1.00  0.00           H
HETATM    5  Hc  tfa     1      42.008  42.039  42.484  1.00  0.00           H
CONECT    1    2    3    5                                            
CONECT    2    1    4                                                 
CONECT    3    1                                                      
CONECT    4    2                                                      
CONECT    5    1                                                      
MASTER        0    0    0    0    0    0    0    0    5    0    5    0
END

COMPND    UNNAMED
AUTHOR    GENERATED BY OPEN BABEL 2.3.2
HETATM    6 AR   LIG     2       1.472  80.636  12.825  1.00  0.00           Ar
MASTER        0    0    0    0    0    0    0    0    1    0    1    0
END

Note

You need to have different residue names for QM and MM atoms, here "tfa" (trans-formic acid) and "LIG". This is demanded by CP2K.

You also need to put different identifiers to atoms at this point. Here we have:

Oh : hydroxyl group oxygen
Oo : carbonyl group oxygen
Ho : hydroxyl group hydrogen
Hc : hydrogen connected to carbon

Next we need to decide initial simulation box size, which depends on number of atoms in simulation. If we have 1000 argon atoms then good initial box size is 86 Å. To put the molecule in box we need extra tools. Gromacs in example has that kind of tools.

To to put the molecule into the box we use gmx editconf command

gmx editconf -f tfa.pdb -box 8.6 8.6 8.6 -o tfa-b86.pdb

Here we use 86x86x86 Å box, which is ok when using 1000-matrix atoms. But you can adjust this, depending on how many matrix atoms you have.

Next we need to add argon atoms into the box. For this we use gmx insert-molecules command

gmx insert-molecules -f tfa-b86.pdb -ci ar.pdb -nmol 1000 -o tfa-ar.pdb

And now we have the final geometry input.

CP2K input file

The best way to make CP2K input file is to split the file to several parts and using @include-commands to gather them to one file.

To do this we make

MM-potential file
QM-calculation file
QMMM-boundary file
CP2K file to gather

First we make the MM-potential file.

&MM
    &FORCEFIELD
      &CHARGE
         ATOM Ar
         CHARGE 0.0
      &END CHARGE
      &CHARGE
         ATOM C
         CHARGE 0.0
      &END CHARGE
      &CHARGE
         ATOM Oo
         CHARGE 0.0
      &END CHARGE
      &CHARGE
         ATOM Ho
         CHARGE 0.0
      &END CHARGE
      &CHARGE
         ATOM Hc
         CHARGE 0.0
      &END CHARGE
      &CHARGE
         ATOM Oh
         CHARGE 0.0
      &END CHARGE

      IGNORE_MISSING_CRITICAL_PARAMS .TRUE.
      &NONBONDED
        # Argon-argon potential from http://www.sklogwiki.org/SklogWiki/index.php/Argon
        &LENNARD-JONES
          ATOMS Ar Ar
          EPSILON [K_e] 125.7
          SIGMA [angstrom] 3.345
          RCUT [angstrom] 9.0
        &END LENNARD-JONES

        # trans Formic acid - argon
        # Just copy the potential terms from potential fit here
        &GENPOT
            atoms Ar C
            FUNCTION Cvi/(r^6) + Cviii/(r^8) + Cx/(r^10) +  Cxii/(r^12) + Cvii/(r^7) + Cix/(r^9) + Cxi/(r^11)
            VARIABLES r
            PARAMETERS Cvi Cviii Cx Cxii Cvii Cix Cxi
            VALUES -22540.021633788096  -5.243549409354186e6  -3.2406578590178587e7  1.515360096536201e8  551050.9077903178  2.266474955203762e7  -5.3132065936126545e7
            RCUT [angstrom] 9.0
        &END GENPOT
        &GENPOT
            atoms Ar Oh
            FUNCTION Cvi/(r^6) + Cviii/(r^8) + Cx/(r^10) +  Cxii/(r^12) + Cvii/(r^7) + Cix/(r^9) + Cxi/(r^11)
            VARIABLES r
            PARAMETERS Cvi Cviii Cx Cxii Cvii Cix Cxi
            VALUES 13739.083360114846  5.471838619459709e6  1.6957303994470203e8  3.7516840650039065e8  -416048.44688283757  -3.998370403083761e7  -3.898432560739321e8
            RCUT [angstrom] 9.0
        &END GENPOT
        &GENPOT
            atoms Ar Oo
            FUNCTION Cvi/(r^6) + Cviii/(r^8) + Cx/(r^10) +  Cxii/(r^12) + Cvii/(r^7) + Cix/(r^9) + Cxi/(r^11)
            VARIABLES r
            PARAMETERS Cvi Cviii Cx Cxii Cvii Cix Cxi
            VALUES 9793.326640804324  3.1394020151115586e6  7.627842607574251e7  1.3069576949556419e8  -268179.8831127547  -2.0305102118835907e7  -1.550294791017888e8
            RCUT [angstrom] 9.0
        &END GENPOT
        &GENPOT
            atoms Ar Ho
            FUNCTION Cvi/(r^6) + Cviii/(r^8) + Cx/(r^10) +  Cxii/(r^12) + Cvii/(r^7) + Cix/(r^9) + Cxi/(r^11)
            VARIABLES r
            PARAMETERS Cvi Cviii Cx Cxii Cvii Cix Cxi
            VALUES -2432.1617929808244  -633286.4451121971  -1.042711058845794e7  -1.070334816166878e7  60921.18782589104  3.4584354851854523e6  1.6481702154332923e7
            RCUT [angstrom] 9.0
        &END GENPOT
        &GENPOT
            atoms Ar Hc
            FUNCTION Cvi/(r^6) + Cviii/(r^8) + Cx/(r^10) +  Cxii/(r^12) + Cvii/(r^7) + Cix/(r^9) + Cxi/(r^11)
            VARIABLES r
            PARAMETERS Cvi Cviii Cx Cxii Cvii Cix Cxi
            VALUES 4525.254249630937  858980.4449115618  7.2762912659367565e6  -3.2064841129619866e6  -99340.84352925561  -3.6312235927495877e6  -4.349257632528463e6
            RCUT [angstrom] 9.0
        &END GENPOT

      &END NONBONDED

    &END FORCEFIELD
    &POISSON
      PERIODIC XYZ
      # CP2K needs ewald summation section in input. But for us it is useless,
      # and you can put whatever values here.
      &EWALD
        EWALD_TYPE spme
        ALPHA .44
        GMAX  40
      &END EWALD
    &END POISSON

  &END MM

Then we make a QM-file with semi-empirical calculation method to condense the system. Here we use AM1-method that is fine for formic acid.

&DFT
    BASIS_SET_FILE_NAME ${LIBDIR}/BASIS_MOLOPT
    POTENTIAL_FILE_NAME ${LIBDIR}/POTENTIAL
    &MGRID
      NGRIDS 4
      CUTOFF 350
      REL_CUTOFF 50
    &END MGRID

    &QS
       # We use semi-empirical to condence the system
       METHOD AM1
       &SE
         &COULOMB
           CUTOFF [angstrom] 10.0
         &END
         &EXCHANGE
           CUTOFF [angstrom] 10.0
         &END
      &END
    &END

    &SCF
      SCF_GUESS ATOMIC
      EPS_SCF 1.0E-6         # convergence threshold for total energy
      MAX_SCF 30
      &OUTER_SCF
        MAX_SCF 10
        EPS_SCF 1.0E-6
      &END
    &END SCF

    # Our QM-box is non periodic, so we need a Poison-solver
    &POISSON
      POISSON_SOLVER MT
      PERIODIC NONE
    &END POISSON

&END DFT

and a second file for DFT. We use BLYP-D3BJ here.

&DFT
    BASIS_SET_FILE_NAME ${LIBDIR}/BASIS_MOLOPT
    POTENTIAL_FILE_NAME ${LIBDIR}/POTENTIAL
    &MGRID
      NGRIDS 4
      CUTOFF 350
      REL_CUTOFF 50
    &END MGRID

    &QS
       METHOD GPW
       EPS_DEFAULT 1.0E-10
       EXTRAPOLATION ASPC
    &END


    &SCF
      SCF_GUESS ATOMIC
      EPS_SCF 1.0E-6         # convergence threshold for total energy
      MAX_SCF 30
      &OUTER_SCF
        MAX_SCF 10
        EPS_SCF 1.0E-6
      &END
    &END SCF

    &XC
      &XC_FUNCTIONAL BLYP
      &END XC_FUNCTIONAL
      &VDW_POTENTIAL
         POTENTIAL_TYPE PAIR_POTENTIAL
         &PAIR_POTENTIAL
            TYPE DFTD3(BJ)
            R_CUTOFF 16
            REFERENCE_FUNCTIONAL BLYP
            PARAMETER_FILE_NAME ${LIBDIR}/dftd3.dat
         &END
      &END
    &END XC

    &POISSON
      POISSON_SOLVER MT
      PERIODIC NONE
    &END POISSON

&END DFT

Then we need to make the QM-MM file

&QMMM
    CENTER_GRID TRUE

    # For formic acid 10Å box is enough
    &CELL                    
      ABC [angstrom] 10 10 10
      PERIODIC NONE
    &END CELL

    # Define QM atoms
    &QM_KIND C
      MM_INDEX 1
    &END QM_KIND
    &QM_KIND O
      MM_INDEX 2 3
    &END QM_KIND
    &QM_KIND H
      MM_INDEX 4 5
    &END QM_KIND

    # Apply periodic potential for MM system (default)
    &PERIODIC
      # We can turn multipole off, because argon has no partial charge
      &MULTIPOLE OFF   
      &END
    &END PERIODIC

&END QMMM

Note

CENTER_GRID TRUE is very important option. Without it QM system will heat up on its own and calculations will fail.

Finally we make a file that gathers all the other files. The idea here is that you only need to edit this file in the future.

@SET SYSTEM equilib
@SET LIBDIR #add path to cp2k-data directory

&GLOBAL
  PROJECT ${SYSTEM}
  PRINT_LEVEL LOW
  RUN_TYPE MD
&END GLOBAL

&FORCE_EVAL
  STRESS_TENSOR ANALYTICAL
  METHOD QMMM

  # These are the files we created previously
  @include qm-am1.inc
  @include mm.inc
  @include qmmm.inc

  &SUBSYS
    &CELL
      # box containing the system,
      # using 86x86x86 box we made earlier for the inputs
      ABC [angstrom] 86 86 86
    &END CELL
    &TOPOLOGY
      COORD_FILE_NAME tfa-ar.pdb
      COORD_FILE_FORMAT PDB
      CONN_FILE_FORMAT USER
      &GENERATE
         &ISOLATED_ATOMS
            # This need to span to all atoms in the simulation
            # We don't have any unbreakable mm-bonds in simulation
            LIST 1..1005
         &END
      &END
    &END TOPOLOGY
    &KIND Ho
      BASIS_SET TZVP-MOLOPT-GTH
      POTENTIAL GTH-BLYP-q1
      ELEMENT H
    &END KIND
    &KIND H                  
      BASIS_SET TZVP-MOLOPT-GTH
      POTENTIAL GTH-BLYP-q1
      ELEMENT H
    &END KIND
    &KIND Hc                 
      BASIS_SET TZVP-MOLOPT-GTH
      POTENTIAL GTH-BLYP-q1
      ELEMENT H
    &END KIND
    &KIND Oh
      BASIS_SET TZVP-MOLOPT-GTH
      POTENTIAL GTH-BLYP-q6
      ELEMENT O
    &END KIND
    &KIND Oo
      BASIS_SET TZVP-MOLOPT-GTH
      POTENTIAL GTH-BLYP-q6
      ELEMENT O
    &END KIND
    &KIND O
      BASIS_SET TZVP-MOLOPT-GTH
      POTENTIAL GTH-BLYP-q6
      ELEMENT O
    &END KIND
    &KIND C
      BASIS_SET TZVP-MOLOPT-GTH
      POTENTIAL GTH-BLYP-q4
    &END KIND
  &END SUBSYS

&END FORCE_EVAL

&MOTION

    &MD
        ENSEMBLE NVT
        TEMPERATURE [K] 80
        TEMP_TOL 50
        TIMESTEP [fs] 0.5
        STEPS 20000
        &THERMOSTAT
            REGION MASSIVE
            TYPE CSVR
            &CSVR
              TIMECON [fs] 100
            &END CSVR
        &END THERMOSTAT
    &END MD

    &PRINT
        &TRAJECTORY
            &EACH
                MD 20
            &END EACH
            FORMAT DCD
        &END TRAJECTORY
        &RESTART_HISTORY
            &EACH
                MD 5000
            &END EACH
        &END RESTART_HISTORY
    &END PRINT

&END MOTION

Molecular dynamics

Overview here is to start from gas phase and have the system to condense to solid matrix. To do this the simulation consist of 4 phases.

Equilibration from initial coordinates to some kind of gas.
Condensing gas solid.
Changing calculation method from semi-empirical to DFT and equilibrate it.
Calculate the spectrum.

The first equilibration

Aim here is to create a super cooled gas that will compress to solid/liquid in the next phase. To do this we need to set simulation temperature to lower than boiling point. For argon 80 K is a good temperature. This can be done with the input file that was constructed earlier.

Condensing

To condense the system we need to change the dynamics to NTP, so that the external pressure will compress it. To do it we'll use following input file

@SET SYSTEM condence
@SET LIBDIR # add path to cp2k-data directory

&GLOBAL
  PROJECT ${SYSTEM}
  PRINT_LEVEL LOW
  RUN_TYPE MD
&END GLOBAL

&FORCE_EVAL
  STRESS_TENSOR ANALYTICAL
  METHOD QMMM


  @include qm-am1.inc
  @include mm.inc
  @include qmmm.inc

  @include subsys.inc


&MOTION

    &MD
        ENSEMBLE NPT_I
        TEMPERATURE [K] 20
        TIMESTEP [fs] 0.5
        STEPS 200000
        &THERMOSTAT
            REGION MASSIVE
            TYPE CSVR
            &CSVR
              TIMECON [fs] 250
            &END CSVR
        &END THERMOSTAT
        &BAROSTAT
             PRESSURE [bar] 1.0
             TEMPERATURE [K] 20
             TEMP_TOL [K] 20
             TIMECON [fs] 250
        &END BAROSTAT
    &END MD

    &PRINT
        &TRAJECTORY
            &EACH
                MD 40
            &END EACH
            FORMAT DCD
        &END TRAJECTORY
        &RESTART_HISTORY
            &EACH
                MD 50000
            &END EACH
        &END RESTART_HISTORY
    &END PRINT

&END MOTION

Note

The rate at which the system condenses is heavily affected by thermostats coupling constant, here 250 fs. The system will condense slower, if the constant is smaller. 250 fs has proven to be a good value for many different systems.

This input file has additional file subsys.inc that includes &SUBSYSTEM section form the previous run. You can get the section by using grep on restart file.

grep -A 10000 "&SUBSYS"

In this phase we also set the final temperature, now 20 K.

Change to DFT

Next we'll change QM calculation method to DFT. To do this we'll use slightly modified input file from the previous phase

@SET SYSTEM dftequilib
@SET LIBDIR # add path to cp2k-data directory

&GLOBAL
  PROJECT ${SYSTEM}
  PRINT_LEVEL LOW
  RUN_TYPE MD
&END GLOBAL

&FORCE_EVAL
  STRESS_TENSOR ANALYTICAL
  METHOD QMMM


  @include qm-blyp-d3bj.inc
  @include mm.inc
  @include qmmm.inc

  @include condensed_subsys.inc


&MOTION

    &MD
        ENSEMBLE NPT_I
        TEMPERATURE [K] 20
        TIMESTEP [fs] 0.5

        # You can increase this if needed
        STEPS 10000
        &THERMOSTAT
            REGION MASSIVE
            TYPE CSVR
            &CSVR
              # This constant is not that important anymore
              TIMECON [fs] 25
            &END CSVR
        &END THERMOSTAT
        &BAROSTAT
             PRESSURE [bar] 1.0
             TEMPERATURE [K] 20
             TEMP_TOL [K] 20
             TIMECON [fs] 250
        &END BAROSTAT
    &END MD

    &PRINT
        &TRAJECTORY
            &EACH
                MD 40
            &END EACH
            FORMAT DCD
        &END TRAJECTORY
        &RESTART_HISTORY
            &EACH
                MD 5000
            &END EACH
        &END RESTART_HISTORY
    &END PRINT

&END MOTION

IR spectrum calculation

To calculate spectrum we need dipole moments, to do this we need to add command to print them to our QM-file

&DFT
    BASIS_SET_FILE_NAME ${LIBDIR}/BASIS_MOLOPT
    POTENTIAL_FILE_NAME ${LIBDIR}/POTENTIAL
    &MGRID
      NGRIDS 4
      CUTOFF 350
      REL_CUTOFF 50
    &END MGRID

    &QS
       METHOD GPW
       EPS_DEFAULT 1.0E-10
       EXTRAPOLATION ASPC
    &END


    &SCF
      SCF_GUESS ATOMIC
      EPS_SCF 1.0E-6         # convergence threshold for total energy
      MAX_SCF 30
      &OUTER_SCF
        MAX_SCF 10
        EPS_SCF 1.0E-6
      &END
    &END SCF

    &XC
      &XC_FUNCTIONAL BLYP
      &END XC_FUNCTIONAL
      &VDW_POTENTIAL
         POTENTIAL_TYPE PAIR_POTENTIAL
         &PAIR_POTENTIAL
            TYPE DFTD3(BJ)
            R_CUTOFF 16
            REFERENCE_FUNCTIONAL BLYP
            PARAMETER_FILE_NAME ${LIBDIR}/dftd3.dat
         &END
      &END
    &END XC

    &POISSON
      POISSON_SOLVER MT
      PERIODIC NONE
    &END POISSON

    &PRINT
      &MOMENTS
       PERIODIC FALSE
       &EACH
         MD 1
       &END
       MAX_MOMENT 1
       FILENAME =dipole.traj
      &END MOMENTS
     &END

&END DFT

We'll also need to edit input file to change the mode to NVE

@SET SYSTEM spectrum
@SET LIBDIR # add path to cp2k-data directory

&GLOBAL
  PROJECT ${SYSTEM}
  PRINT_LEVEL LOW
  RUN_TYPE MD
&END GLOBAL

&FORCE_EVAL
  STRESS_TENSOR ANALYTICAL
  METHOD QMMM


  @include qm-blyp-d3bj-dipoles.inc
  @include mm.inc
  @include qmmm.inc

  @include dft_subsys.inc


&MOTION

    &MD
        ENSEMBLE NVE
        TIMESTEP [fs] 0.5
        STEPS 80000
    &END MD

    &PRINT
        &TRAJECTORY
            &EACH
                MD 5
            &END EACH
            FORMAT DCD
        &END TRAJECTORY
        &RESTART_HISTORY
            &EACH
                MD 5000
            &END EACH
        &END RESTART_HISTORY
    &END PRINT

&END MOTION

This will create dipole.traj-file that contains dipole moments. You can use IRSpectrum.jl to calculate IR-spectrum form the dipole trajectory file.