Gaussian 16 Rev. C.01/C.02 Release Notes

Features and changes introduced in Revs. B.01 and C.01 are indicated by [REV B] and [REV C], respectively.

New Modeling Capabilities

[REV C] NBO version 7 is supported. There are new options to the Population keyword: Pop=NPA7, Pop=NBO7, Pop=NBO7Read and Pop=NBO7Delete request Natural Population Analysis, full Natural Bond Orbital Analysis, full NBO with NBO input read from the input stream and NBO analysis of the effects of deletion of some interactions (respectively), using NBO7 via the external interface. In addition, NEDA=n is used to perform Natural Energy Decomposition Analysis. The analysis uses the same input information about fragments as counterpoise calculations. Deletions and optimizations with deletions now work with either NBO6 or NBO7.
[REV C] The RESP (restrained electrostatic potential) constraint can be included in computing potential-derived charges. For example, Pop=(MK,Resp=N) applies a weight of N x 10^-6 Hartrees to the squared charges. Other electrostatic potential-derived charge schemes also accept this option (e.g., CHelp, HLY). N defaults to 2.
[REV C] Pop=SaveHirshfeld and Pop=SaveCM5 cause the specified charges to be saved as the MM charges to be used in a subsequent calculation.
[REV B] Static Raman intensities can be computed for excited states at the CIS and TD levels of theory. TD Freq=Raman computes the polarizability by numerical differentiation with respect to an electric field, so the cost of Freq=Raman for these methods is 7x that of the frequencies without Raman intensities.
TD-DFT analytic second derivatives for predicting vibrational frequencies/IR and Raman spectra and performing transition state optimizations and IRC calculations for excited states.
EOMCC analytic gradients for performing geometry optimizations.
Anharmonic vibrational analysis for VCD and ROA spectra: see Freq=Anharmonic.
Vibronic spectra and intensities: see Freq=FCHT and related options.
Resonance Raman spectra: see Freq=ReadFCHT.
New DFT functionals: M08HX, MN15, MN15L, PW6B95, PW6B95D3.
New double-hybrid methods: DSDPBEP86, PBE0DH and PBEQIDH.
PM7 semi-empirical method.
Ciofini excited state charge transfer diagnostic: see Pop=DCT.
The EOMCC solvation interaction models of Caricato: see SCRF=PTED.
Generalized internal coordinates, a facility which allows arbitrary redundant internal coordinates to be defined and used for optimization constraints and other purposes. See Geom=GIC and GIC Info.

Performance Enhancements

NVIDIA K40, K80, P100 (Pascal), V100 (Volta) and A100 (Ampere) GPUs are supported under Linux for Hartree-Fock and DFT calculations. A100 support is new with Revision C.02, V100 support was new with Revision C.01, and P100 support was new with [REV B]. Revisions B.01 and C.01 also provided performance improvements for all supported GPU types. See Using GPUs for details on GPU support and usage.
Parallel performance on larger numbers of processors has been improved. See the Parallel Performance tab for information about how to get optimal performance on multiple CPUs and clusters.
[REV B] Dynamic allocation of tasks among Linda workers is now the default, improving parallel efficiency.
Gaussian 16 uses an optimized memory algorithm to avoid I/O during CCSD iterations.
There are several enhancements to the GEDIIS optimization algorithm.
CASSCF improvements for active spaces ≥ (10,10) increase performance and make active spaces of up to 16 orbitals feasible (depending on the molecular system).
Significant speedup of the core correlation energies for W1 compound model.
Gaussian 16 incorporates algorithmic improvements for significant speedup of the diagonal, second-order self-energy approximation (D2) component of composite electron propagator (CEP) methods as described in [DiazTinoco16]. See EPT.

Usage Enhancements

[REV C] The ROA invariants for each vibrational mode are now only printed by G16 or by freqchk if normal mode derivatives were requested, rather than by default.
[REV C] Utilities can now take the -m command-line argument to specify the amount of memory available to the utility. For example:
```
formchk -m=1gb myfile
```
The -m option must precede any file name or other arguments.
[REV C] The %SSH Link 0 command and its equivalents can be used to name a command to run to start Linda workers, rather than either rsh or ssh.
[REV C] Some defaults when Geom=AllCheck is specified can now be overridden:
- Field=NoChk can be used to suppress reading external field coefficients from the checkpoint file.
- Geom=GenConnectivity forces the connectivity to be recomputed rather than using the information in the checkpoint file.
- Geom=UseStandardOrientation uses the coordinates in the standard orientation from the checkpoint file as the input orientation for the new job.
[REV C] Some defaults during geometry optimizations to a minimum can now be overridden:
- Opt=NGoUp=N allows the energy to increase N times before doing only linear searches. The default is 1 (only linear searches are performed after the second time in a row that the energy increases); N=-1 forces only linear searches whenever the energy rises.
- When near a saddle point, Opt=NGoDown=N causes the program to mix at most N eigenvectors of the Hessian with negative eigenvalues to form a step away from the saddle point. The default is 3; N=-1 turns this feature off, and the algorithm takes only the regular RFO step.
- Opt=MaxEStep=N says to take a step of length N/1000 (Bohr or radians) when moving away from a saddle point. The default is N=600 (0.6) for regular optimizations and N=100 (0.1) for ONIOM Opt=Quadmac calculations.
[REV C] Information on multidimensional relaxed scans is now stored on the formatted checkpoint file with details about the axes, rather than flattened, so these can be displayed in GaussView and other programs.
[REV C] The program now stores and checks a version number in checkpoint files. This avoids obscure failure modes when an obsolete checkpoint is named. The c8616 utility can be used to update checkpoint files, and there is a -fixver option to unfchk to mark a checkpoint file it creates as current even if there was no version in the input formatted checkpoint file.
[REV B] The ChkChk utility now reports the job status (whether the job completed normally, failed, is in progress, etc.)
[REV B] The optional parameters in the input line for an atom can now specify the radius to use when finite (non-point) nuclei are used. The radius is specified as a floating point value in atomic units using the RadNuclear=val item. For example:
```
    C(RadNucl=0.001) 0.0 0.0 3.0
```
The GauOpen tools for interfacing Gaussian with other programs, both in compiled languages such as Fortran and C and with interpreted languages such as Python and Perl. Refer to GauOpen: Interfacing to Gaussian 16 for details.
- [REV C] supports raw binary files using either 4- or 8-byte integers. The former is the default except on NEC systems. Support for this feature includes new options to the Output keyword and the formchk utility, new Link 0 commands and new command line options and environment variables.
- [REV C] adds information about ONIOM layers and optimization and trajectory results to the matrix element file. It also adds new options to the Output keyword for including AO two-electron integrals, derivatives of the overlap, core Hamiltonian and other matrices and/or the AO 2-electron integral derivatives.
- [REV B] added many additional quantities to the matrix element file, including atomic populations, one-electron and property operator matrices and the non-adiabatic coupling vector. The new items are the labeled sections QUADRUPOLE INTEGRALS, OCTOPOLE INTEGRALS, HEXADECAPOLE INTEGRALS, [MULLIKEN,ESP,AIM,NPA,MBS] CHARGES, DIP VEL INTEGRALS, R X DEL INTEGRALS, OVERLAP DERIVATIVES, CORE HAMILTONIAN DERIVATIVES, F(X), DENSITY DERIVATIVES, FOCK DERIVATIVES, ALPHA UX, BETA UX, ALPHA MO DERIVATIVES, BETA MO DERIVATIVES, [Alpha,Beta] [SCF,MP2,MP3,MP4,CI Rho(1),CI,CC] DENSITY and TRANS MO COEFFICIENTS and the scalars 63-64.
[REV C] Enhancements to facilitate scripting:
- The AllAtoms and ActiveAtoms options to the External keyword are used to provide information on all atoms or only those in the model system (high layer) when using an external program/script with ONIOM.
- The file $g16root/g16/bsd/inp2mat is a script which takes a Gaussian input file and generates a matrix element file with the information implied by the input file (coordinates, basis set, etc.) without running the full calculation. This is used by the Python interface in GauOpen to import this information into a matrix element file object, but can also be used in other scripts to avoid any need to parse Gaussian input files.
- The testrt utility now prints the integer size used by G16 so that scripts can check what size of integers will be used by default in matrix element files.
Parameters specified in Link 0 (%) input lines and/or in a Default.Route file can now also be specified via either command-line arguments or environment variables. [REV B] introduces command-line options to specify input and/or data using a checkpoint or matrix element file (the equivalent of the %OldChk or %OldMatrix Link 0 commands for input). See the Equivalencies tab for details.
You can now compute the force constants at every nth step of a geometry optimization: see Opt=Recalc.
[REV B] DFTB parameters are now read in Link 301 before the basis set is constructed, so that the presence or absence of d functions for an element can be taken from the parameter file.

Changes between G16 Revision C.01 and G16 Revision C.02

Revision C.02 is an update to support NVIDIA A100 (Ampere) GPUs and the NVIDIA SDK compiler version 21.3. The build procedure from source code has changed for all x86_64 platforms to use the new compiler. Apart from A100 GPU support, the resulting binaries offer the same functionality as Revision C.01.

Changes from Gaussian 16 Rev. A.03

There have been minor modifications to the procedure for building from source code, which is documented here.

Changes from Gaussian 09

Calculation Defaults

The following calculation defaults are different in Gaussian 16:

Integral accuracy is 10^-12 rather than 10^-10 in Gaussian 09.
The default DFT grid for general use is UltraFine rather than FineGrid in G09; the default grid for CPHF is SG1 rather than CoarseGrid. See the discussion of the Integral keyword for details.
SCRF defaults to the symmetric form of IEFPCM [Lipparini10] (not present in Gaussian 09) rather than the non-symmetric version.
Physical constants use the 2010 values rather than the 2006 values in Gaussian 09.

The first two items were changed to ensure accuracy in several new calculation types (e.g., TD-DFT frequencies, anharmonic ROA). For these reasons, Integral=(UltraFine,Acc2E=12) was made the default. Using these settings generally improves the reliability of calculations involving numerical integration, e.g., DFT optimizations in solution. There is a modest increase in the CPU requirements for these options compared to the Gaussian 09 defaults of Integral=(FineGrid,Acc2E=10).

The G09Defaults keyword sets all four of these defaults back to the Gaussian 09 values. It is provided for compatibility with previous calculations, but the new defaults are strongly recommended for new studies.

Default Memory Use

Gaussian 16 defaults memory usage to %Mem=100MW (800MB). Even larger values are appropriate for calculations on larger molecules and when using many processors; refer to the Parallel Jobs tab for details.

TD-DFT Frequencies

TD-DFT frequency calculations compute second derivatives analytically by default, since these are much faster than the numerical derivatives (the only choice in Gaussian 09).

Using GPUs

Gaussian 16 can use NVIDIA K40, K80, P100 (Rev. B.01), V100 (Rev. C.01) and A100 (Rev. C.02) GPUs under Linux. Earlier GPUs do not have the computational capabilities or memory size to run the algorithms in Gaussian 16.

Allocating Memory for Jobs

Allocating sufficient amounts of memory to jobs is even more important when using GPUs than for CPUs, since larger batches of work must be done at the same time in order to use the GPUs efficiently. The K40 and K80 units can have up to 16 GB of memory. Typically, most of this should be made available to Gaussian. Giving Gaussian 8-9 GB works well when there is 12 GB total on each GPU; similarly, allocating Gaussian 11-12 GB is appropriate for a 16 GB GPU. In addition, at least an equal amount of memory must be available for each CPU thread which is controlling a GPU.

About Control CPUs

When using GPUs, each GPU must be controlled by a specific CPU. The controlling CPU should be as physically close as possible to the GPU it is controlling. GPUs cannot share controlling CPUs. Note that CPUs used as GPU controllers do not participate as compute nodes during the parts of the calculation that are GPU-parallel.

The hardware arrangement on a system with GPUs can be checked using the nvidia-smi utility. For example, this output is for a machine with two 16-core Haswell CPU chips and four K80 boards, each of which has two GPUs:

     GPU0 GPU1 GPU2 GPU3 GPU4 GPU5 GPU6 GPU7  CPU Affinity
GPU0    X  PIX  SOC  SOC  SOC  SOC  SOC  SOC  0-15          cores on first chip
GPU1  PIX    X  SOC  SOC  SOC  SOC  SOC  SOC  0-15
GPU2  SOC  SOC    X  PIX  PHB  PHB  PHB  PHB  16-31         cores on second chip
GPU3  SOC  SOC  PIX    X  PHB  PHB  PHB  PHB  16-31
GPU4  SOC  SOC  PHB  PHB    X  PIX  PXB  PXB  16-31
GPU5  SOC  SOC  PHB  PHB  PIX    X  PXB  PXB  16-31
GPU6  SOC  SOC  PHB  PHB  PXB  PXB    X  PIX  16-31
GPU7  SOC  SOC  PHB  PHB  PXB  PXB  PIX    X  16-31

The important part of this output is the CPU affinity. This example shows that GPUs 0 and 1 (on the first K80 card) are connected to the CPUs on chip 0 while GPUs 2-7 (on the other three K80 cards) are connected to the CPUs on chip 1.

Specifying GPUs & Control CPUs for a Gaussian Job

The GPUs to use for a calculation and their controlling CPUs are specified with the %GPUCPU Link 0 command. This command takes one parameter:

%GPUCPU=gpu-list=control-cpus

where gpu-list is a comma-separated list of GPU numbers, possibly including numerical ranges (e.g., 0-4,6), and control-cpus is a similarly-formatted list of controlling CPU numbers. The corresponding items in the two lists are the GPU and its controlling CPU.

For example, on a 32-processor system with 6 GPUs, a job which uses all the CPUs—26 CPUs serving solely as compute nodes and 6 CPUs used for controlling GPUs—would use the following Link 0 commands:

%CPU=0-31                               Control CPUs are included in this list. 	 
%GPUCPU=0,1,2,3,4,5=0,1,16,17,18,19

These commands state that CPUs 0-31 will be used in the job. GPUs 0 through 5 will be used, with GPU0 controlled by CPU 0, GPU1 controlled by CPU 1, GPU2 controlled by CPU 16, GPU3 controlled by CPU 17, and so on. Note that the controlling CPUs are included in %CPU.

In the preceding example, the GPU and CPU lists could be expressed more tersely as:

%CPU=0-31	 	 
%GPUCPU=0-5=0-1,16-19

Normally one uses consecutive processors in the obvious way, but things can be associated differently in special cases. For example, suppose the same machine already had a job using 6 CPUs, running with %CPU=16-21. Then, in order to use the other 26 CPUs, with 6 of them controlling GPUs, you would specify:

%CPU=0-15,22-31	 	 
%GPUCPU=0-5=0-1,22-25

This job would use a total of 26 processors, employing 20 of them solely for computation, along with the six GPUs controlled by CPUs 0, 1, 22, 23, 24 and 25 (respectively).

In [REV B], the lists of CPUs and GPUs are both sorted and then matched up. This ensures that the lowest numbered threads are executed on CPUs that have GPUs. Doing so ensures that if a part of a calculation has to reduce the number of processors used (e.g., because of memory limitations), it will preferentially use/retain the threads with GPUs (since it removes threads in reverse order).
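
The list syntax and the sort-then-match rule described above can be illustrated with a minimal Python sketch. This is not Gaussian's own parser: the function names `expand_list` and `pair_gpus_with_cpus` are hypothetical, and the sketch handles only the comma-and-range forms shown in this section.

```python
def expand_list(spec):
    """Expand a comma-separated list with ranges, e.g. '0-1,22-25'."""
    numbers = []
    for item in spec.split(","):
        if "-" in item:
            first, last = item.split("-")
            numbers.extend(range(int(first), int(last) + 1))
        else:
            numbers.append(int(item))
    return numbers


def pair_gpus_with_cpus(gpucpu):
    """Map each GPU number to its controlling CPU for a 'gpus=cpus' string.

    Both lists are sorted before matching, mirroring the [REV B]
    behavior described in the text (an illustration, not Gaussian code).
    """
    gpu_spec, cpu_spec = gpucpu.split("=")
    gpus = sorted(expand_list(gpu_spec))
    cpus = sorted(expand_list(cpu_spec))
    if len(gpus) != len(cpus):
        raise ValueError("GPU and control-CPU lists differ in length")
    return dict(zip(gpus, cpus))


# The value from the %GPUCPU=0-5=0-1,22-25 example
print(pair_gpus_with_cpus("0-5=0-1,22-25"))
```

For the example value, the mapping pairs GPUs 0 and 1 with CPUs 0 and 1 and GPUs 2 through 5 with CPUs 22 through 25, matching the assignment described above.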

GPUs and Overall Job Performance

GPUs are effective for larger molecules when doing DFT energies, gradients and frequencies (for both ground and excited states), but they are not effective for small jobs. They are also not used effectively by post-SCF calculations such as MP2 or CCSD.

Each GPU is several times faster than a CPU. However, on modern machines, there are typically many more CPUs than GPUs. The best performance comes from using all the CPUs as well as the GPUs.

In some circumstances, the potential speedup from GPUs can be limited because many CPUs are also used effectively by Gaussian 16. For example, if the GPU is 5x faster than a CPU, then the speedup of using the GPU versus the CPU alone would be 5x. However, the potential speedup resulting from using GPUs on a larger computer with 32 CPUs and 8 GPUs is 2x:

Without GPUs: 32*1 = 32
With GPUs: (24*1) + (8*5) = 64 Remember that control CPUs are not used for computation.
Speedup: 64/32 = 2

Note that this analysis assumes that the GPU-parallel portion of the calculation dominates the total execution time.
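
The estimate above can be reproduced for other machine sizes with a small Python sketch. The function name `estimated_gpu_speedup` is hypothetical, and the model makes the same simplifying assumptions as the text: each CPU counts as one unit of throughput, control CPUs do no computation, and the GPU-parallel portion dominates the run time.

```python
def estimated_gpu_speedup(n_cpus, n_gpus, gpu_vs_cpu):
    """Idealized speedup from adding GPUs to a job that uses all CPUs.

    n_cpus      total CPUs given to the job (control CPUs included)
    n_gpus      GPUs used, each taking one CPU as its controller
    gpu_vs_cpu  assumed throughput of one GPU relative to one CPU
    """
    without_gpus = n_cpus * 1.0
    with_gpus = (n_cpus - n_gpus) * 1.0 + n_gpus * gpu_vs_cpu
    return with_gpus / without_gpus


# 32 CPUs, 8 GPUs, each GPU assumed 5x faster than a CPU
print(estimated_gpu_speedup(32, 8, 5.0))  # 2.0
```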

Allocation of memory. GPUs can have up to 16 GB of memory. One typically tries to make most of this available to Gaussian. Be aware that there must be at least an equal amount of memory given to the CPU thread running each GPU as is allocated for computation. Using 8-9 GB works well on a 12 GB GPU, or 11-12 GB on a 16 GB GPU (reserving some memory for the system). Since Gaussian gives equal shares of memory to each thread, this means that the total memory allocated should be the number of threads times the memory required to use a GPU efficiently. For example, when using 4 CPUs and 2 GPUs each with 16 GB of memory, you should use 4 × 12 GB = 48 GB of total memory:

%Mem=48GB
%CPU=0-3
%GPUCPU=0-1=0,2
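
The sizing rule (threads times the memory needed per GPU-controlling thread) can be sketched in Python as follows. The helper name `gpu_job_link0` is hypothetical; it only formats the three Link 0 lines from values the user has already chosen.

```python
def gpu_job_link0(n_threads, mem_per_thread_gb, cpu_list, gpucpu):
    """Build Link 0 lines for a GPU job.

    Total memory is n_threads * mem_per_thread_gb because, as described
    in the text, memory is shared equally among the threads.
    """
    total_gb = n_threads * mem_per_thread_gb
    return [f"%Mem={total_gb}GB", f"%CPU={cpu_list}", f"%GPUCPU={gpucpu}"]


# 4 CPU threads, 12 GB each, two 16 GB GPUs controlled by CPUs 0 and 2
for line in gpu_job_link0(4, 12, "0-3", "0-1=0,2"):
    print(line)
```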

You will need to analyze the characteristics of your own environment carefully when making decisions about which processors and GPUs to use and how much memory to allocate.

GPUs in a Cluster

GPUs on nodes in a cluster can be used. Since the %CPU and %GPUCPU specifications are applied to each node in the cluster, the nodes must have identical configurations (number of GPUs and their affinity to CPUs); since most clusters are collections of identical nodes, this restriction is not usually a problem.

Parallel Usage and Performance Notes

Shared-memory parallelism

Memory allocation. Calculations involving larger molecules and basis sets benefit from larger memory allocations. 4 GB or more per processor is recommended for calculations involving 50 or more atoms and/or 500 or more basis functions. The freqmem utility estimates the optimal memory size per thread for ground-state frequency calculations, and the same value is reasonable for excited-state frequencies and is more than sufficient for ground and excited state optimizations.

The amount of memory allowed should rise with the number of processors: if 4 GB is reasonable for one processor, then the same job using 8 CPUs would run well in 32 GB. Of course, there may be limitations to smaller values imposed by the particular hardware, but scaling memory linearly with number of CPUs should be the goal. In particular, increasing only the number of CPUs with fixed memory size is unlikely to lead to good performance when using large numbers of processors.

For large frequency calculations and for large CCSD and EOM-CCSD energies, it is also desirable to leave enough memory to buffer the large disk files involved. Therefore, a Gaussian job should only be given 50-70% of the total memory on the system. For example, on a machine with a total of 128 GB, one should typically give 64-80 GB to a job which was using all the CPUs, and leave the remaining memory for the operating system to use as disk cache.

Pinning threads to CPUs under Linux. Efficiency is lost when threads are moved from one CPU to another, thereby invalidating the cache and causing other overhead. On most machines, Gaussian can tie threads to specific CPUs, and this is the recommended mode of operation, especially when using larger numbers of processors. The %CPU Link 0 line specifies the numbers of specific CPUs to be used. Thus, on a machine with one 8-core chip, one should use %CPU=0-7 rather than %NProc=8 because the former ties the first thread to CPU 0, the next to CPU 1, etc.

On some older Intel processors (Nehalem and before), there is not enough memory bandwidth to keep all the CPUs on a chip busy, and it is often preferable to use half the CPUs, each with twice as much memory as if all were used. For example, on such a machine with four 12-core chips and 128 GB of memory, with CPUs 0-11 on the first chip, 12-23 on the second, and so on, it is better to run using 24 processors (6 on each chip) and give them 72 GB/24 procs = 3 GB memory each, rather than use all 48 with only 1.5 GB of memory each. The required input directives would be:

%Mem=72GB
%CPU=0-47/2

where the /2 means to use every other core: i.e., cores 0, 2, 4, 6, 8, and 10 (on chip 0), 12, 14, 16, 18, 20, and 22 (on chip 1), etc.
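
To check which cores a strided %CPU value selects, a short Python sketch can expand it. This is an illustration only: `expand_cpu_spec` is a hypothetical helper, and it assumes the /n suffix applies to the range it is attached to, which covers the forms shown in this section.

```python
from collections import Counter


def expand_cpu_spec(spec):
    """Expand a %CPU-style value such as '0-47/2' or '0-15,22-31'."""
    cpus = []
    for item in spec.split(","):
        item, _, stride = item.partition("/")
        step = int(stride) if stride else 1
        first, _, last = item.partition("-")
        last = last or first  # a single number is a one-element range
        cpus.extend(range(int(first), int(last) + 1, step))
    return cpus


cpus = expand_cpu_spec("0-47/2")
print(len(cpus), "threads")                  # 24 threads
print(Counter(cpu // 12 for cpu in cpus))    # threads per 12-core chip
print(72 / len(cpus), "GB per thread")       # 3.0 GB per thread
```

With 12 cores per chip, the per-chip count confirms that six threads land on each of the four chips.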

With the most recent generations of Intel processors (Haswell and later), the memory bandwidth is better and using all the cores on each chip works well.

As long as sufficient memory is available and threads are tied to specific cores, then parallel efficiency on large molecules is good up to 64 or more cores.

Disable hyperthreading. Hyperthreading is not useful for Gaussian since it effectively divides resources such as memory bandwidth among threads on the same physical CPU. If hyperthreading cannot be turned off, Gaussian jobs should use only one hyperthread on each physical CPU. Under Linux, hyperthreads on different processors are grouped together. That is, if a machine has 2 chips each with 8 cores and 3-way hyperthreading, then “CPUs” 0-7 are across the 8 cores on chip 0, 8-15 are across the 8 cores on chip 1, and 16-23 are the second hyperthreads on the 8 cores of chip 0, and so on. So a job would run best with %CPU=0-15.

Under AIX, hyperthreads are grouped together with up to 8 hyperthread numbers for each CPU even if fewer hyperthreads are in use, so with two 8-core chips and 4-way hyperthreading, “CPUs” 0-3 are all on core 0 of chip 0, 8-11 are on core 1 of chip 0, etc. Thus, one would want to use %CPU=0-127/8 to select “CPUs” 0, 8, 16, … which are each using a distinct core.

Cluster (Linda) parallelism

Availability. Hartree-Fock and DFT energies, gradients and frequencies run in parallel across clusters, as do MP2 energies and gradients. MP2 frequencies, CCSD, and EOM-CCSD energies and optimizations are SMP parallel but not cluster parallel. Numerical derivatives, such as DFT anharmonic frequencies and CCSD frequencies, are parallelized across nodes of a cluster by doing a complete gradient or second derivative calculation on each node, splitting the directions of differentiation across workers in the cluster.

Combining with MP parallelism. Shared-memory and cluster parallelism can be combined. Generally, one uses shared-memory parallelism across all CPUs in each node of the cluster. Note that %CPU and %Mem apply to each node of the cluster. Thus, if one has 3 nodes named apple, banana and cherry, each with two chips which have 8 CPUs each, then one might specify:

%Mem=64GB
%CPU=0-15
%LindaWorkers=apple,banana,cherry
# B3LYP/6-311+G(2d,p) Freq …

This would run 16 threads, each pinned to a CPU, on each of the 3 nodes, giving 4 GB to each of the 48 threads.

For the special case of numerical differentiation only—e.g., Freq=Anharm, CCSD Freq, etc.—one extra worker is used to collect the results. So these jobs should be run with two workers on the master node (where Gaussian 16 is started). For the above example if the job was computing anharmonic frequencies, then one would use:

%Mem=64GB
%CPU=0-15
%LindaWorkers=apple:2,banana,cherry
# B3LYP/6-311+G(2d,p) Freq=Anharm …

where Gaussian 16 is assumed to be started on node apple. This will start 2 workers on node apple, one of which just collects results, and will do the computational work using the other worker on apple and those on banana and cherry.

Memory requirements for CCSD, CCSD(T) and EOM-CCSD calculations

These calculations can use memory to avoid I/O and will run much more efficiently if they are allowed enough memory to store the amplitudes and product vectors in memory. If there are NO active occupied orbitals (NOA in the output) and NV virtual orbitals (NVB in the output) then approximately 9NO²NV² words of memory are required. This does not depend on the number of processors used.
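
As a rough aid for choosing %Mem, the estimate above can be evaluated with a short Python sketch. The function names are hypothetical. The conversion assumes 8-byte words, which is consistent with the 100MW = 800MB default quoted earlier in these notes, and reports binary gigabytes (2^30 bytes), which is also an assumption.

```python
def ccsd_memory_words(n_occ, n_virt):
    """Approximate words needed to keep amplitudes and product vectors
    in memory: 9 * NO^2 * NV^2, as stated in the text."""
    return 9 * n_occ**2 * n_virt**2


def words_to_gib(words, bytes_per_word=8):
    """Convert words to binary gigabytes (assumes 8-byte words)."""
    return words * bytes_per_word / 2**30


# Illustrative orbital counts only; read NOA and NVB from your own output
words = ccsd_memory_words(n_occ=20, n_virt=200)
print(words, "words")
print(round(words_to_gib(words), 2), "GiB")
```

Because the requirement grows with the square of both orbital counts, doubling the number of virtual orbitals quadruples the estimate.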

Most options that control how Gaussian 16 operates can be specified in any of 4 ways. From highest to lowest precedence these are:

As Link 0 input (%-lines): This is the usual method to control a specific job and the only way to control a specific step within a multi-step input file. Example: %CPU=1,2,3,4. For full documentation on Link 0 commands, see Link 0 Commands.
As options on the command line: Command line options are useful when you want to define aliases or other shortcuts for different common ways of running the program. Example: g16 -c="1,2,3,4" …
As environment variables: This is most useful in standard scripts, for example for generating and submitting jobs to batch queuing systems. Example: export GAUSS_CDEF="1,2,3,4"
As directives in the Default.Route file: This is most useful when one wants to change the program defaults for all jobs. Example: -C- 1,2,3,4

When searching for a Default.Route file the current default directory is checked first, followed by the directories in the path for Gaussian 16 executables: environment variable GAUSS_EXEDIR, which normally points to $g16root/g16.
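
The precedence order among the four sources can be summarized in a few lines of Python. This sketch only illustrates the ordering described above. It is not how Gaussian itself resolves options, and `resolve_option` is a hypothetical name.

```python
def resolve_option(link0=None, command_line=None, environment=None,
                   default_route=None):
    """Return (source, value) for the highest-precedence source that is set.

    Order, highest first: Link 0 line, command-line option,
    environment variable, Default.Route directive.
    """
    candidates = (
        ("Link 0", link0),
        ("command line", command_line),
        ("environment", environment),
        ("Default.Route", default_route),
    )
    for source, value in candidates:
        if value is not None:
            return source, value
    return None, None


# GAUSS_CDEF is set and Default.Route has a -C- line, but no %CPU or -c
print(resolve_option(environment="1,2,3,4", default_route="0-7"))
```

In this example the environment variable wins because neither a Link 0 line nor a command-line option was supplied.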

The following table lists the equivalences among Link 0 commands, command line options, Default.Route items and environment variables. The -h, -o options and the -i and -o option classes were introduced in [REV B], as were their corresponding environment variables.

Default.Route	Link 0	Option	Env. Var.	Description
Gaussian 16 execution defaults
-R-		-r	GAUSS_RDEF	Route section keyword list.
-M-	%Mem	-m	GAUSS_MDEF	Memory amount for Gaussian jobs.
-C-	%CPU	-c	GAUSS_CDEF	Processor/core list for multiprocessor parallel jobs.
-G-	%GPUCPU	-g	GAUSS_GDEF	GPUs=Cores list for GPU parallel jobs.
-S-	%SSH=command	-s	GAUSS_SDEF	Program to start workers for network parallel jobs. %UseSSH is equivalent to %SSH=ssh and %UseRSH similarly specifies rsh.
-W-	%LindaWorkers	-w	GAUSS_WDEF	List of hostnames for network parallel jobs.
-P-	%NProcShared	-p	GAUSS_PDEF	#processors/cores for multiprocessor parallel jobs. Deprecated; use -C-.
-L-	%NProcLinda	-l	GAUSS_LDEF	#nodes for network parallel jobs. Deprecated; use -W-.
Archive entry data
-H-		-h	GAUSS_HDEF	Computer hostname.
-O-		-o	GAUSS_ODEF	Organization (site) name.
Utility program defaults
-F-			GAUSS_FDEF	Options for the formchk utility.
-U-			GAUSS_UDEF	Memory amount for utilities.
Parameters for scripts and external programs
	# section	-x	GAUSS_XDEF	Complete route for the job (route not read from input file).
	%Chk	-y	GAUSS_YDEF	Checkpoint file for job.
	%RWF	-z	GAUSS_ZDEF	Read-write file for job.
	%OldChk	-ic	GAUSS_ICDEF	Existing checkpoint file from which to read input.
	%OldMatrix	-im	GAUSS_IMDEF	Matrix element file from which to read input.
	%OldMatrix=(file,i4lab)	-im4	GAUSS_IM4DEF	Matrix element file using 4-byte integers from which to read input.
	%OldMatrix=(file,i8lab)	-im8	GAUSS_IM8DEF	Matrix element file using 8-byte integers from which to read input.
	%OldRaw	-ir	GAUSS_IRDEF	Raw matrix element file from which to read input.
	%OldRaw=(file,i4lab)	-ir4	GAUSS_IR4DEF	Raw matrix element file using 4-byte integers from which to read input.
	%OldRaw=(file,i8lab)	-ir8	GAUSS_IR8DEF	Raw matrix element file using 8-byte integers from which to read input.
		-oc	GAUSS_OCDEF	Output checkpoint file. Generally redundant with -y/GAUSS_YDEF.
		-om	GAUSS_OMDEF	Output matrix element file.
		-om4	GAUSS_OM4DEF	Output matrix element file using 4-byte integers.
		-om8	GAUSS_OM8DEF	Output matrix element file using 8-byte integers.
		-or	GAUSS_ORDEF	Output raw matrix element file.
		-or4	GAUSS_OR4DEF	Output raw matrix element file using 4-byte integers.
		-or8	GAUSS_OR8DEF	Output raw matrix element file using 8-byte integers.

Note that the quotation marks are normally required around the specified value for the command line and environment variables to avoid modification of the parameter string by the shell.

The following bugs are fixed in Rev. C.01:

Problems with Freq=Anharmonic when doing Raman or ROA with multiple incident light frequencies were fixed.
Fixes for memory allocation running in parallel with high angular momentum and pure DFT functionals and some unusual cases with cluster parallelism.
Documentation within DFTB parameter files is skipped properly.
A problem with running chkchk on a checkpoint file from a job which died early was fixed.
Performance problems with the hybridization term in PM7R6 for large molecules were fixed.
The limit on the number of occupied orbitals in the GVB code has been increased to 1000, and some problems with FMM and GVB for large molecules were fixed.
Problems with Grimme (D2 or D3) dispersion and ghost atoms were fixed.
A problem with the orbital energies printed by Punch=MO and chkchk -p was fixed.
The handling of the default file extension for the -fck= (/fck= on Windows systems) command line argument was fixed, so that the default is .fck but specifying other extensions such as .fchk also works.
Some errors in specifying a named basis in general basis input which were previously undetected are now recognized.
A problem with using / rather than - to specify the option selecting an ONIOM subcalculation to chkchk, copychk and formchk on Windows was fixed.
Problems with running formchk on files during an ONIOM model system calculation, or on a checkpoint file from an ONIOM job which stopped during a model system calculation were fixed.
A problem with MM parameter values being incomplete or wrong in formatted checkpoint files was fixed.
Field=Read when density fitting is also in use no longer tries to read the field values twice.
An error in parsing a bare CBSB7 on the route, treating this as implying CBS extrapolation rather than naming the basis set, was fixed.
Various wrong defaults in the route generated for CIS with SCF=Conventional were corrected.
A problem with Guess=Read when ghost atoms were present was fixed.
Punch=GAMESS now works with H and higher functions.
-2 instead of Tv for translation vectors in the atom specification input section works again.
Some problems in generating internal coordinates for molecules having long linear chains of atoms were fixed.
A problem in doing one-electron derivatives when a very small threshold for two-electron integrals was specified was fixed.
A problem which caused Opt+Freq jobs that specified a non-default post-SCF window (e.g., MP2=FreezeG2) to fail in the frequency step was fixed.
An underestimate of memory requirements for incore, which could cause jobs to default to incore and then run out of memory, was fixed.

The following bugs were fixed in Rev. B.01:

A problem with restarting in the middle of a job step (from the RWF) when using SCF=QC was fixed.
When doing the regular SCF part of SCF=XQC or SCF=YQC, the orbitals and density are only saved when a new lowest energy wavefunction is found. If L502 fails to converge and the calculation moves on to L508 (QC or steepest descent SCF) then the best wavefunction from the regular SCF iterations is used.
Problems with restarting from the RWF in the middle of EOM-CC calculations were fixed.
Problems with ROMP4 and with EOM-CC when there was an empty beta spin-space or a full alpha spin-space were fixed.
The erroneous labels in the summary table for G4 and G4MP2 jobs were corrected.
Problems with naming scratch files for NBO when the RWF was split across physical files were fixed.
An allocation problem which caused CIS and TD frequency jobs on large molecules using very small amounts of memory to fail was fixed. These jobs now complete, but they would run much more efficiently if given more memory (i.e., a larger value to %Mem).
A bug which caused jobs which used the FormCheck keyword to fail was fixed. This keyword is deprecated, and the -fchk command line option, which is more flexible, is the preferred alternative.
Unnecessary warnings which were printed by formchk when operating on a checkpoint file from a calculation which included PCM solvation were removed.
The route for Opt=(TS,ReCalcFC=N) was corrected.
Molecular mechanics parameters are now stored correctly in formatted checkpoint files.
The route for doing interaction deletions using NBO6 (Pop=NBO6Del) was corrected.
A bug which prevented GPUs from being enabled in later steps of a compound job was fixed.
A problem with parsing the obsolete keywords QMom and Magneton in atomic property lists was corrected.

Last updated: 31 August 2022