Model

This module is designed to include machine-learning models for interpolating energies and forces from either an atom-centered or image-centered fingerprint description.

Model

class amp.model.LossFunction(energy_coefficient=1.0, force_coefficient=0.04, convergence=None, parallel=None, overfit=0.0, raise_ConvergenceOccurred=True, log_losses=True, d=None)[source]

Basic loss function, which can be used by the model.get_loss method which is required in standard model classes.

This version is pure python and thus will be slow compared to a fortran/parallel implementation.

If parallel is None, it will pull it from the model itself. Only use this keyword to override the model’s specification.

Also has parallelization methods built in.

See self.default_parameters for the default values of parameters specified as None.

Parameters:
  • energy_coefficient (float) – Coefficient of the energy contribution in the loss function.
  • force_coefficient (float) – Coefficient of the force contribution in the loss function. Can set to None as shortcut to turn off force training.
  • convergence (dict) – Dictionary of keys and values defining convergence. Keys are ‘energy_rmse’, ‘energy_maxresid’, ‘force_rmse’, and ‘force_maxresid’. If ‘force_rmse’ and ‘force_maxresid’ are both set to None, force training is turned off and force_coefficient is set to None.
  • parallel (dict) – Parallel configuration dictionary. Will pull from model itself if not specified.
  • overfit (float) – Multiplier of the weights norm penalty term in the loss function.
  • raise_ConvergenceOccurred (bool) – If True will raise convergence notice.
  • log_losses (bool) – If True will log the loss function value in the log file else will not.
  • d (None or float) – If d is None, both loss function and its gradient are calculated analytically. If d is a float, then gradient of the loss function is calculated by perturbing each parameter plus/minus d.
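As a rough illustration of how the coefficients above could combine, the sketch below adds an energy term, a force term, and a weight-norm penalty. The normalization used by LossFunction is not stated in this section, so the per-atom scaling, the function name `sketch_loss`, and the dictionary keys are all assumptions made for the example.

```python
def sketch_loss(images, parameters, energy_coefficient=1.0,
                force_coefficient=0.04, overfit=0.0):
    """Illustrative loss; `images` is a list of dicts with hypothetical keys.

    Each dict holds 'natoms', 'energy', 'predicted_energy' and, when forces
    are trained, 'forces' and 'predicted_forces' ([fx, fy, fz] per atom).
    """
    energy_loss = 0.0
    force_loss = 0.0
    for image in images:
        natoms = image['natoms']
        # Assumption: the energy residual is taken per atom before squaring.
        residual = (image['predicted_energy'] - image['energy']) / natoms
        energy_loss += residual ** 2
        if force_coefficient is not None:
            pairs = zip(image['predicted_forces'], image['forces'])
            for f_pred, f_ref in pairs:
                for a, b in zip(f_pred, f_ref):
                    # Assumption: averaged over the 3 * natoms components.
                    force_loss += (a - b) ** 2 / (3.0 * natoms)
    loss = energy_coefficient * energy_loss
    if force_coefficient is not None:
        loss += force_coefficient * force_loss
    # Penalty on the squared norm of the parameters, scaled by `overfit`.
    loss += overfit * sum(p ** 2 for p in parameters)
    return loss
```

Passing `force_coefficient=None` skips the force term entirely, mirroring the shortcut described for that parameter.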
attach_model(model, fingerprints=None, fingerprintprimes=None, images=None)[source]

Attach the model to be used to the loss function.

fingerprints and training images need not be supplied if they are already attached to the model via model.trainingparameters.

Parameters:
  • model (object) – Class representing the regression model.
  • fingerprints (dict) – Dictionary with image hashes as keys and the corresponding fingerprints as values.
  • fingerprintprimes (dict) – Dictionary with image hashes as keys and the corresponding fingerprint derivatives as values.
  • images (list or str) – List of ASE atoms objects with positions, symbols, energies, and forces in ASE format. This is the training set of data. This can also be the path to an ASE trajectory (.traj) or database (.db) file. Energies can be obtained from any reference, e.g. DFT calculations.
calculate_loss(parametervector, lossprime)[source]

Method that calculates the loss, derivative of the loss with respect to parameters (if requested), and max_residual.

Parameters:
  • parametervector (list) – Parameters of the regression model in the form of a list.
  • lossprime (bool) – If True, will calculate and return dloss_dparameters, else will only return zero for dloss_dparameters.
check_convergence(loss, energy_loss, force_loss, energy_maxresid, force_maxresid)[source]

Check convergence

Checks to see whether convergence is met; if it is, raises ConvergenceException to stop the optimizer.

Parameters:
  • loss (float) – Value of the loss function.
  • energy_loss (float) – Value of the energy contribution of the loss function.
  • force_loss (float) – Value of the force contribution of the loss function.
  • energy_maxresid (float) – Maximum energy residual.
  • force_maxresid (float) – Maximum force residual.
default_parameters = {'convergence': {'energy_rmse': 0.001, 'force_rmse': None, 'energy_maxresid': None, 'force_maxresid': None}}
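The convergence dictionary can be read as a set of optional thresholds: a criterion set to None is ignored, and the others must all be satisfied. The sketch below shows that reading only. The function name `sketch_is_converged` is hypothetical, and how the RMSE values are derived from energy_loss and force_loss is not covered here.

```python
def sketch_is_converged(convergence, energy_rmse=None, force_rmse=None,
                        energy_maxresid=None, force_maxresid=None):
    """Return True when every criterion that is not None is satisfied."""
    observed = {
        'energy_rmse': energy_rmse,
        'force_rmse': force_rmse,
        'energy_maxresid': energy_maxresid,
        'force_maxresid': force_maxresid,
    }
    for key, target in convergence.items():
        if target is None:
            continue  # this criterion is switched off
        value = observed[key]
        # A required criterion with no observed value cannot be met.
        if value is None or value > target:
            return False
    return True


# The defaults listed above: only the energy RMSE is checked.
defaults = {'energy_rmse': 0.001, 'force_rmse': None,
            'energy_maxresid': None, 'force_maxresid': None}
print(sketch_is_converged(defaults, energy_rmse=0.0005))
```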
get_loss(parametervector, lossprime)[source]

Returns the current value of the loss function for a given set of parameters, or, if the convergence criteria are met, raises a ConvergenceException.

Parameters:
  • parametervector (list) – Parameters of the regression model in the form of a list.
  • lossprime (bool) – If True, will calculate and return dloss_dparameters, else will only return zero for dloss_dparameters.
process_parallels(vector, server, n_pids)[source]
Parameters:
  • vector (list) – Parameters of the regression model in the form of a list.
  • server (object) – Master session of parallel processing.
  • processes (list of objects) – Worker sessions for parallel processing.
class amp.model.Model[source]

Bases: object

Class that includes common methods between different models.

calculate_dEnergy_dParameters(fingerprints)[source]

Calculates a list of floats corresponding to the derivative of model-predicted energy of an image with respect to model parameters.

Parameters:fingerprints (dict) – Dictionary with image hashes as keys and the corresponding fingerprints as values.
calculate_dForces_dParameters(fingerprints, fingerprintprimes)[source]

Calculates an array of floats corresponding to the derivative of model-predicted atomic forces of an image with respect to model parameters.

Parameters:
  • fingerprints (dict) – Dictionary with image hashes as keys and the corresponding fingerprints as values.
  • fingerprintprimes (dict) – Dictionary with image hashes as keys and the corresponding fingerprint derivatives as values.
calculate_energy(fingerprints)[source]

Calculates the model-predicted energy for an image, based on its fingerprint.

Parameters:fingerprints (dict) – Dictionary with image hashes as keys and the corresponding fingerprints as values.
calculate_forces(fingerprints, fingerprintprimes)[source]

Calculates the model-predicted forces for an image, based on derivatives of fingerprints.

Parameters:
  • fingerprints (dict) – Dictionary with image hashes as keys and the corresponding fingerprints as values.
  • fingerprintprimes (dict) – Dictionary with image hashes as keys and the corresponding fingerprint derivatives as values.
calculate_numerical_dEnergy_dParameters(fingerprints, d=1e-05)[source]

Evaluates dEnergy_dParameters using finite difference.

This will trigger two calls to calculate_energy(), with each parameter perturbed plus/minus d.

Parameters:
  • fingerprints (dict) – Dictionary with image hashes as keys and the corresponding fingerprints as values.
  • d (float) – The amount of perturbation in each parameter.
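The plus/minus perturbation described above is a central finite difference. A minimal sketch, assuming the energy is available as a plain function of the parameter list (the names `sketch_numerical_gradient` and `energy_fn` are hypothetical and not part of the module):

```python
def sketch_numerical_gradient(energy_fn, parameters, d=1e-5):
    """Central-difference estimate of d(energy)/d(parameter) for each parameter."""
    gradient = []
    for i in range(len(parameters)):
        plus = list(parameters)
        minus = list(parameters)
        plus[i] += d    # parameter i perturbed by +d
        minus[i] -= d   # parameter i perturbed by -d
        # Two energy evaluations per parameter.
        gradient.append((energy_fn(plus) - energy_fn(minus)) / (2.0 * d))
    return gradient


# Toy energy: E(p) = p0**2 + 3 * p1, so the exact gradient at (2, 1) is (4, 3).
print(sketch_numerical_gradient(lambda p: p[0] ** 2 + 3.0 * p[1], [2.0, 1.0]))
```

Comparing such an estimate against the analytical derivative is a common way to check a gradient implementation.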
calculate_numerical_dForces_dParameters(fingerprints, fingerprintprimes, d=1e-05)[source]

Evaluates dForces_dParameters using finite difference. This will trigger two calls to calculate_forces(), with each parameter perturbed plus/minus d.

Parameters:
  • fingerprints (dict) – Dictionary with image hashes as keys and the corresponding fingerprints as values.
  • fingerprintprimes (dict) – Dictionary with image hashes as keys and the corresponding fingerprint derivatives as values.
  • d (float) – The amount of perturbation in each parameter.
log

Method to set or get a logger. Should be an instance of amp.utilities.Logger.

Parameters:log (Logger object) – Write function at which to log data. Note this must be a callable function.
tostring()[source]

Returns an evaluatable representation of the calculator that can be used to re-establish the calculator.

amp.model.calculate_fingerprints_range(fp, images)[source]

Calculates the range for the fingerprints corresponding to images, stored in fp. fp is a fingerprints object with the fingerprints data stored in a dictionary-like object at fp.fingerprints. (Typically this is a .utilities.Data structure.) images is a hashed dictionary of atoms for which to consider the range.

In image-centered mode, returns an array of (min, max) values for each fingerprint. In atom-centered mode, returns a dictionary of such arrays, one per element.
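For the atom-centered case, the idea can be sketched with plain dictionaries. The layout assumed here (each image hash maps to a list of (symbol, vector) pairs) and the function name `sketch_fingerprint_range` are assumptions for illustration; the real function works on the fp object described above.

```python
def sketch_fingerprint_range(fingerprints):
    """Return {symbol: [[min, max], ...]} with one pair per fingerprint component."""
    ranges = {}
    for atoms_fp in fingerprints.values():
        for symbol, vector in atoms_fp:
            if symbol not in ranges:
                # First atom of this element: min and max start at its values.
                ranges[symbol] = [[v, v] for v in vector]
                continue
            for bounds, v in zip(ranges[symbol], vector):
                bounds[0] = min(bounds[0], v)
                bounds[1] = max(bounds[1], v)
    return ranges


fps = {
    'hash_a': [('O', [0.1, 0.5]), ('H', [0.3])],
    'hash_b': [('O', [0.2, 0.4])],
}
print(sketch_fingerprint_range(fps))
```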

amp.model.ravel_data(train_forces, mode, images, fingerprints, fingerprintprimes)[source]

Reshapes data of images into lists.

Parameters:
  • train_forces (bool) – Determining whether forces are also trained or not.
  • mode (str) – Can be either ‘atom-centered’ or ‘image-centered’.
  • images (list or str) – List of ASE atoms objects with positions, symbols, energies, and forces in ASE format. This is the training set of data. This can also be the path to an ASE trajectory (.traj) or database (.db) file. Energies can be obtained from any reference, e.g. DFT calculations.
  • fingerprints (dict) – Dictionary with image hashes as keys and the corresponding fingerprints as values.
  • fingerprintprimes (dict) – Dictionary with image hashes as keys and the corresponding fingerprint derivatives as values.
amp.model.send_data_to_fortran(_fmodules, energy_coefficient, force_coefficient, overfit, train_forces, num_atoms, num_images, actual_energies, actual_forces, atomic_positions, num_images_atoms, atomic_numbers, raveled_fingerprints, num_neighbors, raveled_neighborlists, raveled_fingerprintprimes, model, d)[source]

Function that sends images data to fortran code. Is used just once on each core.

Neural Network

class amp.model.neuralnetwork.NeuralNetwork(hiddenlayers=(5, 5), activation='tanh', weights=None, scalings=None, fprange=None, regressor=None, mode=None, lossfunction=None, version=None, fortran=True, checkpoints=100)[source]

Bases: amp.model.Model

Class that implements a basic feed-forward neural network.

Parameters:
  • hiddenlayers (dict) –

    Dictionary of chemical element symbols and architectures of their corresponding hidden layers of the conventional neural network. The number of nodes in the last layer is always one, corresponding to the energy. The number of nodes in the first layer is equal to three times the number of atoms in the system in the case of no descriptor, and is equal to the length of the symmetry functions of the descriptor in the atom-centered mode. Can be fed using tuples as:

    >>> hiddenlayers = (3, 2,)
    

    for example, in which a neural network with two hidden layers, the first one having three nodes and the second one having two nodes, is assigned (to the whole atomic system in the no-descriptor case, and to each chemical element in the atom-centered mode). When setting only one hidden layer, the tuple can be fed as:

    >>> hiddenlayers = (3,)
    

    In the atom-centered mode, the neural network for each species can be assigned separately, as:

    >>> hiddenlayers = {"O":(3,5), "Au":(5,6)}
    

    for example.

  • activation (str) – Assigns the type of activation function. “linear” refers to linear function, “tanh” refers to tanh function, and “sigmoid” refers to sigmoid function.
  • weights (dict) – In the case of no descriptor, keys correspond to layers and values are two dimensional arrays of network weight. In the atom-centered mode, keys correspond to chemical elements and values are dictionaries with layer keys and network weight two dimensional arrays as values. Arrays are set up to connect node i in the previous layer with node j in the current layer with indices w[i,j]. The last value for index i corresponds to bias. If weights is not given, arrays will be randomly generated.
  • scalings (dict) – In the case of no descriptor, keys are “intercept” and “slope” and values are real numbers. In the fingerprinting scheme, keys correspond to chemical elements and values are dictionaries with “intercept” and “slope” keys and real number values. If scalings is not given, it will be randomly generated.
  • fprange (dict) –

    Range of fingerprints of each chemical species. Should be fed as a dictionary of chemical species and a list of minimum and maximum, e.g.:

    >>> fprange={"Pd": [0.31, 0.59], "O":[0.56, 0.72]}
    
  • regressor (object) – Regressor object for finding best fit model parameters, e.g. by loss function optimization via amp.regression.Regressor.
  • mode (str) – Can be either ‘atom-centered’ or ‘image-centered’.
  • lossfunction (object) – Loss function object, if at all desired by the user.
  • version (object) – Version of this class.
  • fortran (bool) – If True, uses the compiled Fortran modules for the calculation; if False, uses the pure-Python implementation.
  • checkpoints (int) – Frequency with which to save parameter checkpoints upon training. E.g., 100 saves a checkpoint on each 100th training step. Specify None for no checkpoints. Note: You can make this negative to not overwrite previous checkpoints.

Note: Dimensions of the two-dimensional weight arrays should be consistent with hiddenlayers.

Raises:RuntimeError, NotImplementedError
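The weight layout described above (w[i,j] connects node i of the previous layer to node j of the current layer, with the last row holding the biases) can be sketched as a small forward pass. This is only an illustration under stated assumptions: layer keys start at 1, the activation is applied on every layer including the output, and input scaling by fprange is left out. The name `sketch_forward` is hypothetical.

```python
import math


def sketch_forward(afp, weights, scaling, activation=math.tanh):
    """Propagate one fingerprint vector through the network and scale the output.

    weights: {1: matrix, 2: matrix, ...}; each matrix is a list of rows, where
    rows index previous-layer nodes (last row = bias) and columns index the
    nodes of the current layer.
    scaling: {'slope': float, 'intercept': float}.
    """
    outputs = list(afp)
    for layer in range(1, len(weights) + 1):
        w = weights[layer]
        augmented = outputs + [1.0]  # constant input that multiplies the bias row
        sums = [
            sum(augmented[i] * w[i][j] for i in range(len(augmented)))
            for j in range(len(w[0]))
        ]
        # Assumption: the same activation is used on all layers.
        outputs = [activation(s) for s in sums]
    # The single output node is mapped to an energy by a linear scaling.
    return scaling['slope'] * outputs[0] + scaling['intercept']


weights = {
    1: [[0.1, -0.2], [0.3, 0.4], [0.0, 0.1]],  # 2 inputs + bias -> 2 nodes
    2: [[0.5], [-0.5], [0.2]],                 # 2 nodes + bias -> 1 output
}
print(sketch_forward([0.31, 0.59], weights, {'slope': 2.0, 'intercept': 1.0}))
```

In the atom-centered mode, one such evaluation per atom would be summed to give the total energy.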
calculate_atomic_energy(afp, index, symbol)[source]

Given input to the neural network, output (which corresponds to energy) is calculated about the specified atom. The sum of these for all atoms is the total energy (in atom-centered mode).

Parameters:
  • afp (list) – Atomic fingerprints in the form of a list to be used as input to the neural network.
  • index (int) – Index of the atom for which atomic energy is calculated (only used in the atom-centered mode).
  • symbol (str) – Symbol of the atom for which atomic energy is calculated (only used in the atom-centered mode).
Returns:

Energy.

Return type:

float

calculate_dAtomicEnergy_dParameters(afp, index=None, symbol=None)[source]

Returns the derivative of energy square error with respect to variables.

Parameters:
  • afp (list) – Atomic fingerprints in the form of a list to be used as input to the neural network.
  • index (int) – Index of the atom for which atomic energy is calculated (only used in the atom-centered mode)
  • symbol (str) – Symbol of the atom for which atomic energy is calculated (only used in the atom-centered mode)
Returns:

The value of the derivative of energy square error with respect to variables.

Return type:

list of float

calculate_dForce_dParameters(afp, derafp, direction, nindex=None, nsymbol=None)[source]

Returns the derivative of force square error with respect to variables.

Parameters:
  • afp (list) – Atomic fingerprints in the form of a list to be used as input to the neural network.
  • derafp (list) – Derivatives of atomic fingerprints in the form of a list to be used as input to the neural network.
  • direction (int) – Direction of force.
  • nindex (int) – Index of the neighbor atom on which the force is acting (only used in the atom-centered mode).
  • nsymbol (str) – Symbol of the neighbor atom on which the force is acting (only used in the atom-centered mode).
Returns:

The value of the derivative of force square error with respect to variables.

Return type:

list of float

calculate_force(afp, derafp, direction, nindex=None, nsymbol=None)[source]

Given derivative of input to the neural network, derivative of output (which corresponds to forces) is calculated.

Parameters:
  • afp (list) – Atomic fingerprints in the form of a list to be used as input to the neural network.
  • derafp (list) – Derivatives of atomic fingerprints in the form of a list to be used as input to the neural network.
  • direction (int) – Direction of force.
  • nindex (int) – Index of the neighbor atom on which the force is acting (only used in the atom-centered mode).
  • nsymbol (str) – Symbol of the neighbor atom on which the force is acting (only used in the atom-centered mode).
Returns:

Force.

Return type:

float

fit(trainingimages, descriptor, log, parallel, only_setup=False)[source]

Fit the model parameters such that the fingerprints can be used to describe the energies in trainingimages. log is the logging object. descriptor is a descriptor object, as would be in calc.descriptor.

Parameters:
  • trainingimages (dict) – Hashed dictionary of training images.
  • descriptor (object) – Class representing local atomic environment.
  • log (Logger object) – Write function at which to log data. Note this must be a callable function.
  • parallel (dict) – Parallel configuration dictionary. Takes the same form as in amp.Amp.
  • only_setup (bool) – only_setup is primarily for debugging. It initializes all variables but skips the last line of starting the regressor.
forcetraining

Returns True if force training is turned on (as determined by examining the convergence criteria in the loss function), else returns False.

get_loss(vector)[source]

Method to be called by the regression master.

Takes one and only one input, a vector of parameters. Returns one output, the value of the loss (cost) function.

Parameters:vector (list) – Parameters of the regression model in the form of a list.
get_lossprime(vector)[source]

Method to be called by the regression master.

Takes one and only one input, a vector of parameters. Returns one output, the value of the derivative of the loss function with respect to model parameters.

Parameters:vector (list) – Parameters of the regression model in the form of a list.
lossfunction

Allows the user to set a custom loss function.

For example:

    >>> from amp.model import LossFunction
    >>> lossfxn = LossFunction(convergence={'energy_rmse': 0.0001})
    >>> calc.model.lossfunction = lossfxn

Parameters:lossfunction (object) – Loss function object, if at all desired by the user.
vector

Access to get or set the model parameters (weights, scaling for each network) as a single vector, useful in particular for regression.

Parameters:vector (list) – Parameters of the regression model in the form of a list.
class amp.model.neuralnetwork.NodePlot(calc)[source]

Creates plots to visualize the output of the nodes in the neural networks.

Initialize with a calculator that has parameters; e.g. a trained calculator or else one in which fit has been called with the only_setup flag turned on.

Call with the ‘plot’ method, which takes as argument a list of images.

plot(images, filename='nodeplot.pdf')[source]

Creates a plot of the output of each node, as a violin plot.

class amp.model.neuralnetwork.Raveler(weights, scalings)[source]

Class to ravel and unravel variable values into a single vector.

This is used for feeding into the optimizer. Feed in a list of dictionaries to initialize the shape of the transformation. Note no data is saved in the class; each time it is used it is passed either the dictionaries or vector. The dictionaries for initialization should be two levels deep.

weights, scalings are the variables to ravel and unravel

to_dicts(vector)[source]

Puts the vector back into weights and scalings dictionaries of the form initialized. vector must have the same length as the output of to_vector.

to_vector(weights, scalings)[source]

Puts the weights and scalings embedded dictionaries into a single vector and returns it. The dictionaries need to have the identical structure to those it was initialized with.
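The round trip between nested dictionaries and a flat vector can be sketched as follows. This is not the Raveler source: the class name `SketchRaveler`, the use of nested lists instead of arrays, and the sorted key order are assumptions chosen to keep the example self-contained.

```python
class SketchRaveler:
    """Flatten two-level dicts of weights and scalings into one list and back."""

    def __init__(self, weights, scalings):
        # Record only the shape: (kind, outer key, inner key, rows, cols).
        self.layout = []
        for symbol in sorted(weights):
            for layer in sorted(weights[symbol]):
                matrix = weights[symbol][layer]
                self.layout.append(('w', symbol, layer, len(matrix), len(matrix[0])))
        for symbol in sorted(scalings):
            for name in sorted(scalings[symbol]):
                self.layout.append(('s', symbol, name, 1, 1))

    def to_vector(self, weights, scalings):
        vector = []
        for kind, outer, inner, rows, cols in self.layout:
            if kind == 'w':
                for row in weights[outer][inner]:
                    vector.extend(row)  # row-major flattening
            else:
                vector.append(scalings[outer][inner])
        return vector

    def to_dicts(self, vector):
        weights, scalings = {}, {}
        pos = 0
        for kind, outer, inner, rows, cols in self.layout:
            if kind == 'w':
                matrix = [list(vector[pos + r * cols: pos + (r + 1) * cols])
                          for r in range(rows)]
                weights.setdefault(outer, {})[inner] = matrix
                pos += rows * cols
            else:
                scalings.setdefault(outer, {})[inner] = vector[pos]
                pos += 1
        return weights, scalings


w = {'O': {1: [[1.0, 2.0], [3.0, 4.0]], 2: [[5.0], [6.0], [7.0]]}}
s = {'O': {'intercept': 0.5, 'slope': 2.0}}
raveler = SketchRaveler(w, s)
vector = raveler.to_vector(w, s)
print(vector)
print(raveler.to_dicts(vector) == (w, s))
```

Keeping only the layout in the object, and none of the values, matches the note above that no data is saved in the class.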

amp.model.neuralnetwork.calculate_dOutputs_dInputs(parameters, derafp, outputs, nsymbol)[source]
Parameters:
  • parameters (dict) – ASE dictionary object.
  • derafp (list) – Derivatives of atomic fingerprints in the form of a list to be used as input to the neural network.
  • outputs (dict) – Outputs of neural network nodes.
  • nsymbol (str) – Symbol of the atom for which atomic energy is calculated (only used in the atom-centered mode)
Returns:

Derivatives of outputs of neural network nodes w.r.t. inputs.

Return type:

dict

amp.model.neuralnetwork.calculate_nodal_outputs(parameters, afp, symbol)[source]

Given input to the neural network, output (which corresponds to energy) is calculated about the specified atom. The sum of these for all atoms is the total energy (in atom-centered mode).

Parameters:
  • parameters (dict) – ASE dictionary object.
  • afp (list) – Atomic fingerprints in the form of a list to be used as input to the neural network.
  • symbol (str) – Symbol of the atom for which atomic energy is calculated (only used in the atom-centered mode)
Returns:

Outputs of neural network nodes

Return type:

dict

amp.model.neuralnetwork.calculate_ohat_D_delta(parameters, outputs, W)[source]

Calculates extra matrices ohat, D, delta needed in mathematical manipulations.

Notations are consistent with those of ‘Rojas, R. Neural Networks - A Systematic Introduction. Springer-Verlag, Berlin, first edition 1996’

Parameters:
  • parameters (dict) – ASE dictionary object.
  • outputs (dict) – Outputs of neural network nodes.
  • W (dict) – The same as weight dictionary, but the last rows associated with biases are deleted in W.
amp.model.neuralnetwork.get_random_scalings(images, activation, elements=None)[source]

Generates initial scaling matrices, such that the range of activation is scaled to the range of actual energies.

Parameters:
  • images (dict) – ASE atoms objects (the training set).
  • activation (str) – Assigns the type of activation function. “linear” refers to linear function, “tanh” refers to tanh function, and “sigmoid” refers to sigmoid function.
  • elements (list of str) – List of atom symbols; used in the atom-centered mode only.
Returns:scalings
Return type:float
amp.model.neuralnetwork.get_random_weights(hiddenlayers, activation, len_of_fps=None, no_of_atoms=None)[source]

Generates random weight arrays from variables.

Parameters:
  • hiddenlayers (dict) –

    Dictionary of chemical element symbols and architectures of their corresponding hidden layers of the conventional neural network. The number of nodes in the last layer is always one, corresponding to the energy. The number of nodes in the first layer is equal to three times the number of atoms in the system in the case of no descriptor, and is equal to the length of the symmetry functions in the atom-centered mode. Can be fed as:

    >>> hiddenlayers = (3, 2,)

    for example, in which a neural network with two hidden layers, the first one having three nodes and the second one having two nodes, is assigned (to the whole atomic system in the case of no descriptor, and to each chemical element in the atom-centered mode). In the atom-centered mode, the neural network for each species can be assigned separately, as:

    >>> hiddenlayers = {"O":(3,5), "Au":(5,6)}

    for example.

  • activation (str) – Assigns the type of activation function. “linear” refers to linear function, “tanh” refers to tanh function, and “sigmoid” refers to sigmoid function.
  • len_of_fps (dict) –

    Length of fingerprints of each element, e.g.:

    >>> len_of_fps={"O": 20, "Pd":20}

  • no_of_atoms (int) – Number of atoms in atomic systems; used only in the case of no descriptor.
Returns:weights
Return type:float

Tensorflow Neural Network

A work in progress, this module amp.model.tflow uses Google’s TensorFlow package to implement a neural network, which may provide GPU acceleration and other advantages.

class amp.model.tflow.NeuralNetwork(hiddenlayers=(5, 5), activation='tanh', keep_prob=1.0, maxTrainingEpochs=10000, importname=None, batchsize=2, initialTrainingRate=0.0001, miniBatch=False, tfVars=None, saveVariableName=None, parameters=None, sess=None, energy_coefficient=1.0, force_coefficient=0.04, scikit_model=None, convergenceCriteria=None, optimizationMethod='l-BFGS-b', input_keep_prob=0.8, ADAM_optimizer_params={'beta1': 0.9}, regularization_strength=None, numTrainingImages={}, elementFingerprintLengths=None, fprange=None, weights=None, scalings=None, unit_type='float', preLoadTrainingData=True, relativeForceCutoff=None)[source]

TensorFlow-based Neural Network model.

Uses Google’s machine-learning code to construct a neural network. This method also allows for GPU acceleration.

Parameters:
  • hiddenlayers – Structure of the neural network. Can either be in the format (int,int,int), where each element represents the size of a layer and the length of the tuple is the number of layers, or a dictionary of the network structure for each element type, e.g. {‘Cu’: (5, 5), ‘O’: (10, 5)}.
  • activation – Activation type. (XXX Provide list of possibilities.)
  • keep_prob (float) – Dropout rate for the neural network to reduce overfitting. (keep_prob=1. uses all nodes, keep_prob~0.5-0.8 better for training)
  • maxTrainingEpochs (int) – Maximum number of times to loop through the training data before giving up.
  • batchsize (int) – Batch size for minibatch (if miniBatch is set to True).
  • initialTrainingRate – Initial training rate for SGD optimizers like ADAM. See the TF documentation for guidance on choosing this value. Likely between 1e-2 and 1e-5, depending on use case, whether mini-batch is on, etc.
  • miniBatch (bool) – Whether to use minibatches in training.
  • tfVars – Tensorflow variables (used if restoring from a previous save).
  • saveVariableName (str) – Name used for the internal tensorflow variable naming scheme. If variables have the same name as another model in the same tensorflow session, there will be collisions.
  • parameters – Dictionary of parameters to be used in initialization. Mostly these are the same keywords as the keyword arguments in this function. This is primarily used to make saving/loading easier.
  • sess – tensorflow session to use (None means start a new session)
  • maxAtomsForces (int) – Number of atoms to be used in the force training. It sets the upper bound on the number of atoms that can be used to calculate the force for. E.g., if maxAtomsForces=40, then forces can only be calculated for images with less than 40 atoms.
  • energy_coefficient (float) – Used to adjust the loss function; this is the weight applied to the energy component.
  • force_coefficient (float or None) – Used to adjust the loss function; this is the weight applied to the force component. Note you can turn off force training by setting this to None.
  • convergenceCriteria (dict) – Dictionary of convergence criteria, analogous to the main AMP convergence criteria dictionary.
  • optimizationMethod (string) – Set the optimization method for the NN parameters. Currently either ‘ADAM’ for the ADAM optimizer in tensorflow, or ‘l-BFGS-b’ for the deterministic l-BFGS-b method. ADAM is usually faster per training step, has all of the benefits of being a stochastic optimizer, and allows for mini-batch operation, but has more tunable parameters and can be harder to get working well. l-BFGS-b usually works for small/moderate network sizes.
  • input_keep_prob – Dropout ratio on the first layer (from fingerprints to the neural network). Rule of thumb is this should be 0 to 0.2. Only applies when using a SGD optimizer like ADAM. BFGS ignores this.
  • ADAM_optimizer_params – Dictionary of parameters to pass to the ADAM optimizer. See https://www.tensorflow.org/versions/r0.11/api_docs/python/train.html#AdamOptimizer for documentation.
  • regularization_strength – Weight for L2-regularization in the cost function
  • fprange (dict) – This is a dictionary that contains the minimum and maximum values seen for each fingerprint of each element. These
  • weights (np array) – Input that allows the NN weights (and biases) to be set directly. This is only used for verifying that the calculation is working correctly in the CuOPd test case. In general, don’t use this except for testing the code. This argument is analogous to the original AMP NeuralNetwork module.
  • scalings – Input that allows the NN final scaling to be set directly. This is only used for verifying that the calculation is working correctly in the CuOPd test case. In general, don’t use this except for testing the code. This argument is analogous to the original AMP NeuralNetwork module.
  • unit_type (string) – Sets the internal datatype of the tensorflow model. Either “float” for 32-bit FP precision, or “double” for 64-bit FP precision.
  • preLoadTrainingData (bool) – Decides whether to run the training by preloading all training data into tensorflow. Doing so results in faster training if the entire dataset can fit into memory. This only works when not using mini-batch.
  • relativeForceCutoff (float) – Parameter for controlling whether the force contribution to the trained cost function is absolute (just differences of force compared to training forces) or relative for large values of the force. This basically sets the upper limit on the forces that should be fitted (e.g. if the force is >A, then the force is scaled). This helps when a small number of images have very large forces that don’t need to be reconstructed perfectly.
calculate_energy(fingerprint)[source]

Get the energy by feeding in a list to the get_list version (which is more efficient for anything greater than 1 image).

calculate_forces(fingerprint, derfingerprint)[source]
constructModel(sess, graph, preLoadData=False, numElements=None, numTrainingImages=None, num_dgdx_Eindices=None, numTrainingAtoms=None)[source]

Sets up the tensorflow neural networks for each atom type.

constructSessGraphModel(tfVars, sess, trainOnly=False, numElements=None, numTrainingImages=None, num_dgdx_Eindices=None, numTrainingAtoms=None)[source]
fit(trainingimages, descriptor, parallel, log=None)[source]

Fit takes a bunch of training images (which are assumed to have a working calculator attached), and fits the internal variables to the training images.

generateFeedInput(curinds, energies, atomArraysAll, dgdx, dgdx_Eindices, dgdx_Xindices, nAtomsDict, atomsIndsReverse, batchsize, trainingrate, keepprob, inputkeepprob, natoms, forcesExp=0.0, forces=False, energycoefficient=1.0, forcecoefficient=None, training=True)[source]

Generates the input dictionary that maps various inputs on the python side to placeholders for the tensorflow model.

getVariance(fingerprint, nSamples=10, l=1.0)[source]
get_energy_list(hashs, fingerprintDB, fingerprintDerDB=None, keep_prob=1.0, input_keep_prob=1.0, forces=False, nsamples=1)[source]

Methods to get the energy and forces for a set of configurations.

initializeVariables()[source]

Resets all of the variables in the current tensorflow model.

preLoadFeed(feedinput)[source]
setWeightsScalings(feedinput, weights, scalings)[source]
tostring()[source]

Dummy tostring to make things work.

amp.model.tflow.bias_variable(shape, name, unit_type, a=0.1)[source]

Helper functions taken from the MNIST tutorial to generate weight and bias variables with random initial weights.

amp.model.tflow.generateBatch(curinds, elements, atomArraysAll, nAtomsDict, atomsIndsReverse, dgdx, dgdx_Eindices, dgdx_Xindices)[source]

This method generates batches from a large dataset using a set of selected indices curinds.

amp.model.tflow.generateTensorFlowArrays(fingerprintDB, elements, keylist, fingerprintDerDB=None)[source]

This function generates the inputs to the tensorflow graph for the selected images. The essential problem is that each neural network is associated with a specific element type. Thus, atoms in each ASE image need to be sent to different networks.

Inputs:

fingerprintDB: a database of fingerprints, as taken from the descriptor

elements: a list of element types (e.g. ‘C’, ‘O’, etc.)

keylist: a list of hashes into the fingerprintDB that we want to create inputs for

fingerprintDerDB: a database of fingerprint derivatives, as taken from the descriptor

maxAtomsForces: the maximum length of the atoms

Outputs:

atomArraysAll: a dictionary of fingerprint inputs to each element’s neural network

nAtomsDict: a dictionary for each element with lists of the number of atoms of each type in each image

atomsIndsReverse: a dictionary that contains the index of each atom into the original keylist

nAtoms: the number of atoms in each image

atomArraysAllDerivs: dictionary of fingerprint derivatives for each element’s neural network
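The routing problem described above, where each atom must go to the network of its own element, can be sketched with plain dictionaries. The function name `sketch_group_by_element`, the (symbol, vector) layout of the fingerprints, and the two returned structures are assumptions for illustration and are simpler than the actual outputs listed above.

```python
def sketch_group_by_element(fingerprints, elements, keylist):
    """Group fingerprint vectors by element and remember their source image.

    Returns (atom_arrays, image_index): for each element, the list of
    fingerprint vectors and, in the same order, the position in `keylist`
    of the image each atom came from.
    """
    atom_arrays = {element: [] for element in elements}
    image_index = {element: [] for element in elements}
    for i, key in enumerate(keylist):
        for symbol, vector in fingerprints[key]:
            atom_arrays[symbol].append(list(vector))
            image_index[symbol].append(i)
    return atom_arrays, image_index


fps = {
    'hash_1': [('Cu', [0.1]), ('O', [0.2])],
    'hash_2': [('O', [0.3])],
}
arrays, index = sketch_group_by_element(fps, ['Cu', 'O'], ['hash_1', 'hash_2'])
print(arrays)
print(index)
```

The image index is what allows per-atom outputs to be summed back into per-image energies after each element's network has been evaluated.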
amp.model.tflow.model(x, segmentinds, keep_prob, input_keep_prob, batchsize, neuronList, activationType, fplength, mask, name, dgdx, dgdx_Xindices, dgdx_Eindices, element, unit_type, totalNumAtoms)[source]

Generates a multilayer neural network with variable number of neurons, so that we have a template for each atom’s NN.

amp.model.tflow.reorganizeForces(forces, natoms)[source]
amp.model.tflow.weight_variable(shape, name, unit_type, stddev=0.1)[source]

Helper functions taken from the MNIST tutorial to generate weight and bias variables with random initial weights.