Basic usage of `ANIDataset`

`ANIDataset` supersedes the obsolete anidataloader. There are also builtin datasets that live in moria, and they can be downloaded directly through torchani.

# To begin with, let's import the modules we will use:
import shutil
from pathlib import Path

from torchani.datasets import ANIDataset

Downloading the builtin datasets performs a checksum to make sure the files are correct. If the function is called again and the dataset is already on the path, only the checksum is performed; the data is not downloaded again. The result is an ANIDataset object. Uncomment the following code to download (watch out, it may take some time):

# import torchani
# ds_1x = torchani.datasets.ANI1x('./datasets/ani1x/', download=True)
# ds_comp6 = torchani.datasets.COMP6v1('./datasets/comp6v1/', download=True)
# ds_2x = torchani.datasets.ANI2x('./datasets/ani2x/', download=True)
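The check-then-skip behaviour described above can be pictured with a small standard-library sketch. This is not torchani's implementation: the helper names `file_digest` and `is_intact` are hypothetical, and SHA-256 is an assumption, since the text does not say which hash torchani uses.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def is_intact(path: Path, expected_digest: str) -> bool:
    """True if the file exists and its digest matches the expected one."""
    return path.is_file() and file_digest(path) == expected_digest


# Demo on a throwaway file standing in for a downloaded dataset
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.h5"
    demo_path.write_bytes(b"not a real dataset")
    expected = hashlib.sha256(b"not a real dataset").hexdigest()

    already_valid = is_intact(demo_path, expected)  # present and unchanged
    demo_path.write_bytes(b"corrupted")
    still_valid = is_intact(demo_path, expected)  # contents changed
    missing_valid = is_intact(Path(tmp) / "absent.h5", expected)  # not on disk
```

A downloader built this way would fetch the file only when `is_intact` returns `False`, which matches the "only the checksum is performed" behaviour for data already on disk.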

For the purposes of this example we will copy and modify two files inside torchani/dataset, which can be downloaded by running the download-dev-data.sh script.

file1_path = Path.cwd() / "file1.h5"
file2_path = Path.cwd() / "file2.h5"
data_source = Path.cwd().parent / "dev-data" / "hf-data" / "dataset" / "ani1-up_to_gdb4"
shutil.copy(data_source / "ani_gdb_s01.h5", file1_path)
shutil.copy(data_source / "ani_gdb_s02.h5", file2_path)

PosixPath('/home/runner/work/torchani/torchani/examples/file2.h5')
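The `PosixPath(...)` line above appears to be the value of the last `shutil.copy` call echoed by the gallery, since `shutil.copy` returns the destination it wrote to. A minimal sketch with temporary placeholder files (the file names are illustrative, not the dataset files):

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "source.h5"
    dst = Path(tmp) / "copy.h5"
    src.write_bytes(b"placeholder contents")

    # shutil.copy copies the file data and returns the destination path
    returned = shutil.copy(src, dst)

    same_contents = dst.read_bytes() == src.read_bytes()
    returned_is_dst = Path(returned) == dst
```

Working on copies like this keeps the original files untouched when the dataset is modified in place later on.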

ANIDataset accepts a path to an h5 file or a list of paths to many files (optionally with names).

ds = ANIDataset(locations=(file1_path, file2_path), names=("file1", "file2"))

Verifying format correctness: 0it [00:00, ?it/s]
Verifying format correctness: 22it [00:00, 1543.39it/s]
/opt/hostedtoolcache/Python/3.11.14/x64/lib/python3.11/site-packages/torchani/datasets/anidataset.py:351: UserWarning: {'energiesHE', 'smiles', 'coordinatesHE'} found in legacy dataset, this will generate unpredictable issues.
 Probably .items() and .values() will work but not much else. It is highly  recommended that you backup these properties (if needed) and *delete them* using dataset.delete_properties
  warnings.warn(

Verifying format correctness: 0it [00:00, ?it/s]
Verifying format correctness: 92it [00:00, 1552.27it/s]

An ANIDataset exposes the properties it holds. All conformers in the dataset have the same set of properties. Let's check which properties this dataset holds:

print(ds.properties)

{'species', 'coordinatesHE', 'energies', 'coordinates', 'energiesHE', 'smiles'}

Opening these files raises a warning because they contain some unsupported legacy properties, so the first thing we will do is delete them:

ds.delete_properties(("coordinatesHE", "energiesHE", "smiles"))
print(ds.properties)

Deleting properties:   0%|          | 0/3 [00:00<?, ?it/s]

Verifying format correctness: 0it [00:00, ?it/s]
Verifying format correctness: 13it [00:00, 2560.99it/s]

Deleting properties:   0%|          | 0/13 [00:00<?, ?it/s]

Verifying format correctness: 0it [00:00, ?it/s]
Verifying format correctness: 53it [00:00, 2518.36it/s]
{'energies', 'coordinates', 'species'}

Conformer groups

To access groups of conformers we can just use the dataset as an ordered dictionary:

group = ds["file2/gdb11_s02/gdb11_s02-8"]
print(group)

{'energies': tensor([-111.728, -111.746, -111.752,  ..., -111.602, -111.791, -111.720],
       dtype=torch.float64), 'coordinates': tensor([[[ 0.697,  0.092,  0.100],
         [-0.702,  0.075, -0.092],
         [ 1.195, -0.880,  0.527],
         [ 1.080,  0.211, -0.925],
         [-1.022,  0.201,  0.616],
         [-1.187, -0.807, -0.350]],

        [[ 0.699,  0.095,  0.083],
         [-0.706,  0.073, -0.090],
         [ 0.907, -0.947,  0.280],
         [ 1.251,  0.194, -0.585],
         [-1.218,  0.421,  0.924],
         [-0.849, -0.980, -0.496]],

        [[ 0.699,  0.060,  0.074],
         [-0.717,  0.094, -0.102],
         [ 1.058, -0.874,  0.373],
         [ 1.230,  0.318, -0.487],
         [-1.158,  0.331,  0.908],
         [-0.868, -0.888, -0.390]],

        ...,

        [[ 0.717,  0.075,  0.080],
         [-0.707,  0.064, -0.119],
         [ 0.914, -1.049,  0.358],
         [ 1.189,  0.395, -0.495],
         [-1.318,  0.402,  1.039],
         [-0.907, -0.637, -0.353]],

        [[ 0.700,  0.098,  0.092],
         [-0.716,  0.059, -0.095],
         [ 1.057, -0.761,  0.327],
         [ 1.206,  0.264, -0.606],
         [-1.129,  0.254,  0.811],
         [-0.898, -0.905, -0.476]],

        [[ 0.676,  0.118,  0.103],
         [-0.690,  0.034, -0.093],
         [ 1.337, -0.843,  0.466],
         [ 1.128,  0.253, -0.906],
         [-1.032,  0.308,  0.567],
         [-1.224, -0.803, -0.315]]]), 'species': tensor([[7, 7, 1, 1, 1, 1],
        [7, 7, 1, 1, 1, 1],
        [7, 7, 1, 1, 1, 1],
        ...,
        [7, 7, 1, 1, 1, 1],
        [7, 7, 1, 1, 1, 1],
        [7, 7, 1, 1, 1, 1]])}

We get a dictionary of tensors, one per property, but this access is not very convenient: the keys have mangled names that say little about what is in each group.

print(list(ds.keys()))

['file1/gdb11_s01/gdb11_s01-0', 'file1/gdb11_s01/gdb11_s01-1', 'file1/gdb11_s01/gdb11_s01-2', 'file2/gdb11_s02/gdb11_s02-0', 'file2/gdb11_s02/gdb11_s02-1', 'file2/gdb11_s02/gdb11_s02-10', 'file2/gdb11_s02/gdb11_s02-11', 'file2/gdb11_s02/gdb11_s02-12', 'file2/gdb11_s02/gdb11_s02-2', 'file2/gdb11_s02/gdb11_s02-3', 'file2/gdb11_s02/gdb11_s02-4', 'file2/gdb11_s02/gdb11_s02-5', 'file2/gdb11_s02/gdb11_s02-6', 'file2/gdb11_s02/gdb11_s02-7', 'file2/gdb11_s02/gdb11_s02-8', 'file2/gdb11_s02/gdb11_s02-9']
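The listing places `gdb11_s02-10` before `gdb11_s02-2`, which is the order plain string sorting produces. The sketch below reproduces that order with ordinary Python strings and contrasts it with a numeric sort. The helper `numeric_suffix` is hypothetical and not part of torchani.

```python
# Group names shaped like the ones in the listing above
names = [f"gdb11_s02-{i}" for i in range(13)]

# Plain string sorting compares character by character, so "-10" sorts before "-2"
string_order = sorted(names)


def numeric_suffix(name: str) -> int:
    """Hypothetical helper: the integer after the last '-' in a group name."""
    return int(name.rsplit("-", 1)[1])


# Sorting on the numeric suffix gives the order a reader might expect
numeric_order = sorted(names, key=numeric_suffix)

print(string_order[:5])
print(numeric_order[:5])
```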

This is because the dataset is in a legacy format. We can check that by querying the “grouping”:

print(ds.grouping)

legacy

Before moving on, let's reformat this dataset so that it is in a more standardized format:

ds.regroup_by_formula()
print(list(ds.keys()))

Regrouping by formulas:   0%|          | 0/3 [00:00<?, ?it/s]
Regrouping by formulas:  67%|██████▋   | 2/3 [00:00<00:00, 17.92it/s]


Regrouping by formulas:   0%|          | 0/13 [00:00<?, ?it/s]
Regrouping by formulas:   8%|▊         | 1/13 [00:00<00:01,  7.84it/s]
Regrouping by formulas:  15%|█▌        | 2/13 [00:00<00:01,  8.70it/s]
Regrouping by formulas:  38%|███▊      | 5/13 [00:00<00:00, 16.93it/s]
Regrouping by formulas:  54%|█████▍    | 7/13 [00:00<00:00, 14.73it/s]
Regrouping by formulas:  85%|████████▍ | 11/13 [00:00<00:00, 21.59it/s]

['file1/CH4', 'file1/H2O', 'file1/H3N', 'file2/C2H2', 'file2/C2H4', 'file2/C2H6', 'file2/CH2O', 'file2/CH4O', 'file2/CH5N', 'file2/H2N2', 'file2/H2O2', 'file2/H3NO', 'file2/H4N2', 'file2/HNO', 'file2/N2', 'file2/O2']

Now the dataset is organized by formulas, which makes access much easier. (If we only had one file, ds["CH4"] would have been enough.)

group = ds["file1/CH4"]
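To make the formula keys concrete, here is a minimal sketch of how a key such as `CH4` or `H3N` could be derived from the atomic numbers stored under `species`. It is an illustration only: `formula_from_atomic_numbers` and `SYMBOLS` are hypothetical names, and ordering the symbols alphabetically is an assumption that happens to reproduce the keys printed above, not a description of what `regroup_by_formula` does internally.

```python
from collections import Counter

# Only the elements that appear in this example's output
SYMBOLS = {1: "H", 6: "C", 7: "N", 8: "O"}


def formula_from_atomic_numbers(atomic_numbers):
    """Hypothetical helper: formula string with symbols in alphabetical order."""
    counts = Counter(SYMBOLS[z] for z in atomic_numbers)
    return "".join(
        symbol + (str(n) if n > 1 else "")  # omit the count when it is 1
        for symbol, n in sorted(counts.items())
    )


print(formula_from_atomic_numbers([6, 1, 1, 1, 1]))  # CH4
print(formula_from_atomic_numbers([7, 1, 1, 1]))     # H3N
print(formula_from_atomic_numbers([8, 1, 1]))        # H2O
```

Because every conformer of a molecule shares one species row, a key built this way is the same for all conformers in a group.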

items(), values() and keys() work as expected for groups of conformers. Here we print only the first 11 of each as a sample:

for j, (k, v) in enumerate(ds.items()):
    print(k, v)
    if j == 10:
        break

for j, k in enumerate(ds.keys()):
    print(k)
    if j == 10:
        break

for j, v in enumerate(ds.values()):
    print(v)
    if j == 10:
        break

file1/CH4 {'energies': tensor([-40.481, -40.483, -40.485,  ..., -40.496, -40.456, -40.465],
       dtype=torch.float64), 'coordinates': tensor([[[-0.003,  0.010,  0.019],
         [-0.795,  0.577, -0.547],
         [-0.394, -0.980,  0.272],
         [ 0.634,  0.447,  0.936],
         [ 0.596, -0.165, -0.892]],

        [[ 0.003, -0.020,  0.003],
         [-0.783,  0.792, -0.260],
         [-0.454, -1.030,  0.312],
         [ 0.447,  0.636,  0.768],
         [ 0.753, -0.159, -0.853]],

        [[-0.018, -0.022, -0.011],
         [-0.730,  0.682, -0.308],
         [-0.382, -0.868,  0.382],
         [ 0.569,  0.505,  0.804],
         [ 0.755, -0.059, -0.744]],

        ...,

        [[ 0.009,  0.002, -0.009],
         [-0.855,  0.627, -0.261],
         [-0.471, -0.946,  0.325],
         [ 0.488,  0.509,  0.859],
         [ 0.734, -0.216, -0.817]],

        [[ 0.003,  0.017,  0.030],
         [-0.764,  0.542, -0.671],
         [-0.425, -1.045,  0.293],
         [ 0.657,  0.410,  0.947],
         [ 0.500, -0.106, -0.922]],

        [[ 0.035, -0.002, -0.008],
         [-0.954,  0.572, -0.159],
         [-0.633, -0.924,  0.224],
         [ 0.454,  0.590,  0.751],
         [ 0.711, -0.216, -0.720]]]), 'species': tensor([[6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1],
        ...,
        [6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1]])}
file1/H2O {'energies': tensor([-76.353, -76.362, -76.387,  ..., -76.361, -76.374, -76.387],
       dtype=torch.float64), 'coordinates': tensor([[[    -0.000,     -0.006,      0.107],
         [     0.000,      0.777,     -0.421],
         [     0.000,     -0.678,     -0.343]],

        [[     0.000,     -0.005,      0.133],
         [    -0.000,      0.639,     -0.614],
         [     0.000,     -0.566,     -0.557]],

        [[     0.000,      0.001,      0.123],
         [    -0.000,      0.718,     -0.504],
         [    -0.000,     -0.727,     -0.511]],

        ...,

        [[     0.000,      0.005,      0.124],
         [    -0.000,      0.589,     -0.487],
         [    -0.000,     -0.667,     -0.548]],

        [[     0.000,      0.007,      0.131],
         [    -0.000,      0.666,     -0.528],
         [    -0.000,     -0.771,     -0.610]],

        [[     0.000,      0.001,      0.122],
         [    -0.000,      0.743,     -0.495],
         [    -0.000,     -0.764,     -0.512]]]), 'species': tensor([[8, 1, 1],
        [8, 1, 1],
        [8, 1, 1],
        ...,
        [8, 1, 1],
        [8, 1, 1],
        [8, 1, 1]])}
file1/H3N {'energies': tensor([-56.510, -56.502, -56.507,  ..., -56.508, -56.518, -56.521],
       dtype=torch.float64), 'coordinates': tensor([[[ 0.020,  0.006, -0.078],
         [ 0.385, -0.882,  0.067],
         [ 0.318,  0.931,  0.038],
         [-0.979, -0.132,  0.169]],

        [[ 0.003, -0.015, -0.143],
         [ 0.533, -0.736,  0.341],
         [ 0.229,  0.820,  0.439],
         [-0.803,  0.118,  0.400]],

        [[-0.007,  0.010, -0.095],
         [ 0.566, -0.902,  0.221],
         [ 0.528,  0.938,  0.149],
         [-0.991, -0.180,  0.147]],

        ...,

        [[-0.019,  0.005, -0.134],
         [ 0.636, -0.745,  0.409],
         [ 0.556,  0.791,  0.365],
         [-0.922, -0.118,  0.278]],

        [[-0.003,  0.007, -0.135],
         [ 0.450, -0.787,  0.383],
         [ 0.475,  0.874,  0.336],
         [-0.889, -0.184,  0.347]],

        [[ 0.003, -0.003, -0.135],
         [ 0.538, -0.818,  0.344],
         [ 0.344,  0.872,  0.365],
         [-0.919, -0.014,  0.368]]]), 'species': tensor([[7, 1, 1, 1],
        [7, 1, 1, 1],
        [7, 1, 1, 1],
        ...,
        [7, 1, 1, 1],
        [7, 1, 1, 1],
        [7, 1, 1, 1]])}
file2/C2H2 {'energies': tensor([-77.281, -77.286, -77.283,  ..., -77.295, -77.286, -77.294],
       dtype=torch.float64), 'coordinates': tensor([[[     0.001,      0.026,      0.628],
         [    -0.017,     -0.055,     -0.625],
         [     0.058,     -0.004,      1.644],
         [     0.134,      0.347,     -1.687]],

        [[     0.014,     -0.043,      0.589],
         [    -0.043,      0.034,     -0.586],
         [     0.049,      0.220,      1.604],
         [     0.293,     -0.113,     -1.640]],

        [[    -0.043,     -0.046,      0.615],
         [     0.056,      0.031,     -0.622],
         [     0.134,      0.256,      1.711],
         [    -0.290,     -0.078,     -1.639]],

        ...,

        [[     0.029,     -0.004,      0.608],
         [    -0.024,     -0.007,     -0.605],
         [    -0.147,      0.057,      1.645],
         [     0.081,      0.070,     -1.677]],

        [[     0.021,     -0.052,      0.617],
         [    -0.041,      0.039,     -0.614],
         [    -0.017,      0.275,      1.658],
         [     0.253,     -0.116,     -1.696]],

        [[     0.002,      0.011,      0.609],
         [     0.004,     -0.010,     -0.611],
         [    -0.031,     -0.056,      1.658],
         [    -0.039,      0.036,     -1.632]]]), 'species': tensor([[6, 6, 1, 1],
        [6, 6, 1, 1],
        [6, 6, 1, 1],
        ...,
        [6, 6, 1, 1],
        [6, 6, 1, 1],
        [6, 6, 1, 1]])}
file2/C2H4 {'energies': tensor([-78.555, -78.554, -78.528,  ..., -78.548, -78.544, -78.533],
       dtype=torch.float64), 'coordinates': tensor([[[-0.657,  0.071, -0.040],
         [ 0.653, -0.061,  0.045],
         [-1.287, -0.857,  0.017],
         [-1.173,  1.006, -0.253],
         [ 1.136, -1.000,  0.200],
         [ 1.365,  0.736, -0.031]],

        [[-0.667,  0.069, -0.045],
         [ 0.664, -0.072,  0.038],
         [-1.330, -0.812,  0.087],
         [-1.113,  1.049, -0.251],
         [ 1.227, -1.012,  0.223],
         [ 1.253,  0.812,  0.042]],

        [[-0.620,  0.050, -0.014],
         [ 0.603, -0.043,  0.025],
         [-1.254, -0.800, -0.028],
         [-1.109,  1.022, -0.362],
         [ 1.122, -1.068,  0.214],
         [ 1.445,  0.778,  0.043]],

        ...,

        [[-0.663,  0.078, -0.014],
         [ 0.677, -0.068,  0.036],
         [-1.306, -0.848, -0.067],
         [-1.212,  0.950, -0.393],
         [ 1.062, -1.026,  0.174],
         [ 1.307,  0.793, -0.005]],

        [[-0.661,  0.064, -0.070],
         [ 0.642, -0.051,  0.070],
         [-1.233, -0.805,  0.138],
         [-1.218,  1.025, -0.177],
         [ 1.199, -1.082,  0.111],
         [ 1.468,  0.723, -0.083]],

        [[-0.653,  0.052, -0.009],
         [ 0.671, -0.063,  0.042],
         [-1.524, -0.691, -0.098],
         [-1.163,  0.924, -0.422],
         [ 1.144, -1.035,  0.132],
         [ 1.373,  0.911, -0.027]]]), 'species': tensor([[6, 6, 1, 1, 1, 1],
        [6, 6, 1, 1, 1, 1],
        [6, 6, 1, 1, 1, 1],
        ...,
        [6, 6, 1, 1, 1, 1],
        [6, 6, 1, 1, 1, 1],
        [6, 6, 1, 1, 1, 1]])}
file2/C2H6 {'energies': tensor([-79.764, -79.783, -79.655,  ..., -79.781, -79.745, -79.745],
       dtype=torch.float64), 'coordinates': tensor([[[     0.759,     -0.034,      0.022],
         [    -0.740,      0.007,     -0.027],
         [     1.183,      0.953,     -0.303],
         ...,
         [    -1.167,      0.760,      0.679],
         [    -1.126,     -0.958,      0.512],
         [    -1.129,      0.260,     -0.968]],

        [[     0.760,     -0.009,     -0.005],
         [    -0.757,      0.005,      0.003],
         [     1.226,      0.934,     -0.248],
         ...,
         [    -1.145,      0.696,      0.668],
         [    -1.174,     -1.024,      0.384],
         [    -1.148,      0.345,     -1.018]],

        [[     0.749,     -0.019,     -0.013],
         [    -0.770,      0.001,     -0.003],
         [     1.387,      1.111,     -0.434],
         ...,
         [    -1.185,      0.910,      0.923],
         [    -1.098,     -0.926,      0.125],
         [    -1.177,      0.091,     -0.983]],

        ...,

        [[     0.771,     -0.007,     -0.021],
         [    -0.777,     -0.001,      0.005],
         [     1.191,      0.959,     -0.270],
         ...,
         [    -1.123,      0.736,      0.779],
         [    -1.168,     -0.916,      0.247],
         [    -1.126,      0.235,     -0.988]],

        [[     0.774,      0.000,     -0.017],
         [    -0.773,      0.002,      0.008],
         [     1.184,      0.918,     -0.116],
         ...,
         [    -1.129,      0.625,      0.631],
         [    -1.219,     -1.021,      0.420],
         [    -1.159,      0.411,     -1.041]],

        [[     0.782,      0.002,      0.027],
         [    -0.792,     -0.004,     -0.014],
         [     1.118,      0.943,     -0.228],
         ...,
         [    -1.246,      0.707,      0.636],
         [    -1.167,     -1.070,      0.509],
         [    -1.014,      0.365,     -1.068]]]), 'species': tensor([[6, 6, 1,  ..., 1, 1, 1],
        [6, 6, 1,  ..., 1, 1, 1],
        [6, 6, 1,  ..., 1, 1, 1],
        ...,
        [6, 6, 1,  ..., 1, 1, 1],
        [6, 6, 1,  ..., 1, 1, 1],
        [6, 6, 1,  ..., 1, 1, 1]])}
file2/CH2O {'energies': tensor([-114.457, -114.452, -114.453,  ..., -114.452, -114.452, -114.450],
       dtype=torch.float64), 'coordinates': tensor([[[    -0.515,      0.011,      0.019],
         [     0.666,     -0.003,     -0.005],
         [    -1.117,      0.960,     -0.079],
         [    -1.121,     -1.050,     -0.079]],

        [[    -0.510,      0.011,      0.020],
         [     0.648,     -0.009,     -0.005],
         [    -1.089,      0.981,     -0.083],
         [    -0.908,     -0.973,     -0.083]],

        [[    -0.530,     -0.025,     -0.019],
         [     0.671,      0.010,      0.004],
         [    -1.006,      1.103,      0.077],
         [    -1.132,     -0.970,      0.077]],

        ...,

        [[    -0.515,     -0.012,      0.028],
         [     0.669,      0.011,     -0.006],
         [    -1.024,      0.976,     -0.113],
         [    -1.263,     -1.005,     -0.113]],

        [[    -0.540,      0.018,     -0.005],
         [     0.693,     -0.014,      0.001],
         [    -1.336,      0.973,      0.020],
         [    -1.040,     -0.958,      0.020]],

        [[    -0.548,     -0.006,     -0.036],
         [     0.689,      0.008,      0.008],
         [    -1.008,      0.969,      0.147],
         [    -1.204,     -1.027,      0.147]]]), 'species': tensor([[6, 8, 1, 1],
        [6, 8, 1, 1],
        [6, 8, 1, 1],
        ...,
        [6, 8, 1, 1],
        [6, 8, 1, 1],
        [6, 8, 1, 1]])}
file2/CH4O {'energies': tensor([-115.604, -115.672, -115.631,  ..., -115.627, -115.663, -115.598],
       dtype=torch.float64), 'coordinates': tensor([[[ 0.631, -0.033,  0.018],
         [-0.718,  0.126, -0.050],
         [ 1.036,  1.003,  0.231],
         [ 0.961, -0.630, -0.938],
         [ 1.143, -0.393,  0.733],
         [-1.214, -0.767,  0.579]],

        [[ 0.679, -0.010, -0.006],
         [-0.756,  0.114,  0.016],
         [ 0.974,  0.960, -0.100],
         [ 1.119, -0.522, -0.861],
         [ 1.011, -0.534,  0.941],
         [-1.150, -0.737, -0.156]],

        [[ 0.601,  0.003, -0.027],
         [-0.706,  0.113, -0.022],
         [ 0.977,  0.957,  0.172],
         [ 1.227, -0.743, -0.826],
         [ 0.892, -0.460,  0.829],
         [-1.017, -0.754,  0.498]],

        ...,

        [[ 0.691, -0.044, -0.005],
         [-0.777,  0.133, -0.010],
         [ 1.264,  1.143,  0.013],
         [ 1.214, -0.658, -0.912],
         [ 0.929, -0.461,  0.853],
         [-1.278, -0.772,  0.264]],

        [[ 0.678, -0.015, -0.030],
         [-0.768,  0.118, -0.009],
         [ 1.065,  0.936,  0.192],
         [ 1.207, -0.530, -0.839],
         [ 0.936, -0.475,  0.841],
         [-1.075, -0.781,  0.297]],

        [[ 0.623, -0.042, -0.027],
         [-0.689,  0.139, -0.025],
         [ 0.987,  0.969,  0.247],
         [ 1.040, -0.608, -0.971],
         [ 0.715, -0.399,  0.911],
         [-1.192, -0.829,  0.545]]]), 'species': tensor([[6, 8, 1, 1, 1, 1],
        [6, 8, 1, 1, 1, 1],
        [6, 8, 1, 1, 1, 1],
        ...,
        [6, 8, 1, 1, 1, 1],
        [6, 8, 1, 1, 1, 1],
        [6, 8, 1, 1, 1, 1]])}
file2/CH5N {'energies': tensor([-95.805, -95.715, -95.634,  ..., -95.776, -95.779, -95.716],
       dtype=torch.float64), 'coordinates': tensor([[[     0.712,      0.008,      0.032],
         [    -0.749,     -0.008,     -0.127],
         [     1.198,     -0.838,     -0.455],
         ...,
         [     0.904,     -0.100,      1.049],
         [    -1.221,     -0.758,      0.196],
         [    -1.020,      0.826,      0.414]],

        [[     0.723,     -0.003,      0.012],
         [    -0.766,     -0.023,     -0.146],
         [     1.298,     -0.812,     -0.347],
         ...,
         [     1.169,     -0.146,      1.182],
         [    -1.361,     -0.687,      0.231],
         [    -1.048,      1.000,      0.730]],

        [[     0.687,     -0.021,     -0.008],
         [    -0.715,     -0.022,     -0.127],
         [     1.009,     -0.765,     -0.273],
         ...,
         [     1.199,     -0.069,      1.138],
         [    -1.243,     -0.613,      0.164],
         [    -1.099,      0.925,      0.678]],

        ...,

        [[     0.673,     -0.008,      0.009],
         [    -0.717,     -0.005,     -0.119],
         [     0.886,     -0.858,     -0.439],
         ...,
         [     1.118,      0.014,      1.105],
         [    -0.950,     -0.786,      0.277],
         [    -1.171,      0.859,      0.430]],

        [[     0.659,     -0.003,      0.000],
         [    -0.712,     -0.000,     -0.133],
         [     1.179,     -0.876,     -0.454],
         ...,
         [     1.279,     -0.106,      1.142],
         [    -1.176,     -0.749,      0.355],
         [    -1.202,      0.862,      0.510]],

        [[     0.702,     -0.009,      0.026],
         [    -0.768,      0.020,     -0.131],
         [     1.199,     -0.946,     -0.712],
         ...,
         [     1.169,      0.166,      1.057],
         [    -1.105,     -0.946,      0.663],
         [    -1.181,      0.741,      0.092]]]), 'species': tensor([[6, 7, 1,  ..., 1, 1, 1],
        [6, 7, 1,  ..., 1, 1, 1],
        [6, 7, 1,  ..., 1, 1, 1],
        ...,
        [6, 7, 1,  ..., 1, 1, 1],
        [6, 7, 1,  ..., 1, 1, 1],
        [6, 7, 1,  ..., 1, 1, 1]])}
file2/H2N2 {'energies': tensor([-110.583, -110.586, -110.583,  ..., -110.590, -110.588, -110.583],
       dtype=torch.float64), 'coordinates': tensor([[[ 0.593, -0.108,  0.008],
         [-0.600, -0.117, -0.008],
         [ 1.157,  0.750, -0.073],
         [-1.065,  0.738,  0.073]],

        [[ 0.630, -0.101, -0.007],
         [-0.644, -0.135,  0.007],
         [ 1.145,  0.755,  0.064],
         [-0.964,  0.870, -0.064]],

        [[ 0.630, -0.122, -0.012],
         [-0.628, -0.122,  0.012],
         [ 0.833,  0.846,  0.108],
         [-0.868,  0.877, -0.108]],

        ...,

        [[ 0.604, -0.117, -0.008],
         [-0.613, -0.125,  0.008],
         [ 1.123,  0.875,  0.069],
         [-0.998,  0.827, -0.069]],

        [[ 0.619, -0.112,  0.010],
         [-0.629, -0.130, -0.010],
         [ 1.171,  0.841, -0.087],
         [-1.040,  0.868,  0.087]],

        [[ 0.643, -0.133,  0.013],
         [-0.627, -0.109, -0.013],
         [ 0.989,  0.850, -0.110],
         [-1.205,  0.852,  0.110]]]), 'species': tensor([[7, 7, 1, 1],
        [7, 7, 1, 1],
        [7, 7, 1, 1],
        ...,
        [7, 7, 1, 1],
        [7, 7, 1, 1],
        [7, 7, 1, 1]])}
file2/H2O2 {'energies': tensor([-151.357, -151.426, -151.385,  ..., -151.464, -151.387, -151.416],
       dtype=torch.float64), 'coordinates': tensor([[[ 0.717,  0.125, -0.052],
         [-0.715, -0.139, -0.039],
         [ 0.939, -0.450,  0.285],
         [-0.937,  0.693,  0.285]],

        [[ 0.695,  0.157, -0.078],
         [-0.690, -0.161, -0.071],
         [ 1.175, -0.580,  0.736],
         [-1.230,  0.654,  0.702]],

        [[ 0.684,  0.151, -0.079],
         [-0.685, -0.157, -0.074],
         [ 1.217, -0.674,  0.758],
         [-1.201,  0.779,  0.767]],

        ...,

        [[ 0.711,  0.108, -0.041],
         [-0.712, -0.111, -0.039],
         [ 0.885, -0.675,  0.213],
         [-0.864,  0.739,  0.225]],

        [[ 0.715,  0.158, -0.034],
         [-0.716, -0.138, -0.050],
         [ 0.843, -0.857,  0.245],
         [-0.854,  0.527,  0.241]],

        [[ 0.712,  0.094, -0.037],
         [-0.714, -0.083, -0.048],
         [ 0.884, -0.779,  0.251],
         [-0.869,  0.587,  0.262]]]), 'species': tensor([[8, 8, 1, 1],
        [8, 8, 1, 1],
        [8, 8, 1, 1],
        ...,
        [8, 8, 1, 1],
        [8, 8, 1, 1],
        [8, 8, 1, 1]])}
file1/CH4
file1/H2O
file1/H3N
file2/C2H2
file2/C2H4
file2/C2H6
file2/CH2O
file2/CH4O
file2/CH5N
file2/H2N2
file2/H2O2
{'energies': tensor([-40.481, -40.483, -40.485,  ..., -40.496, -40.456, -40.465],
       dtype=torch.float64), 'coordinates': tensor([[[-0.003,  0.010,  0.019],
         [-0.795,  0.577, -0.547],
         [-0.394, -0.980,  0.272],
         [ 0.634,  0.447,  0.936],
         [ 0.596, -0.165, -0.892]],

        [[ 0.003, -0.020,  0.003],
         [-0.783,  0.792, -0.260],
         [-0.454, -1.030,  0.312],
         [ 0.447,  0.636,  0.768],
         [ 0.753, -0.159, -0.853]],

        [[-0.018, -0.022, -0.011],
         [-0.730,  0.682, -0.308],
         [-0.382, -0.868,  0.382],
         [ 0.569,  0.505,  0.804],
         [ 0.755, -0.059, -0.744]],

        ...,

        [[ 0.009,  0.002, -0.009],
         [-0.855,  0.627, -0.261],
         [-0.471, -0.946,  0.325],
         [ 0.488,  0.509,  0.859],
         [ 0.734, -0.216, -0.817]],

        [[ 0.003,  0.017,  0.030],
         [-0.764,  0.542, -0.671],
         [-0.425, -1.045,  0.293],
         [ 0.657,  0.410,  0.947],
         [ 0.500, -0.106, -0.922]],

        [[ 0.035, -0.002, -0.008],
         [-0.954,  0.572, -0.159],
         [-0.633, -0.924,  0.224],
         [ 0.454,  0.590,  0.751],
         [ 0.711, -0.216, -0.720]]]), 'species': tensor([[6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1],
        ...,
        [6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1]])}
{'energies': tensor([-76.353, -76.362, -76.387,  ..., -76.361, -76.374, -76.387],
       dtype=torch.float64), 'coordinates': tensor([[[    -0.000,     -0.006,      0.107],
         [     0.000,      0.777,     -0.421],
         [     0.000,     -0.678,     -0.343]],

        [[     0.000,     -0.005,      0.133],
         [    -0.000,      0.639,     -0.614],
         [     0.000,     -0.566,     -0.557]],

        [[     0.000,      0.001,      0.123],
         [    -0.000,      0.718,     -0.504],
         [    -0.000,     -0.727,     -0.511]],

        ...,

        [[     0.000,      0.005,      0.124],
         [    -0.000,      0.589,     -0.487],
         [    -0.000,     -0.667,     -0.548]],

        [[     0.000,      0.007,      0.131],
         [    -0.000,      0.666,     -0.528],
         [    -0.000,     -0.771,     -0.610]],

        [[     0.000,      0.001,      0.122],
         [    -0.000,      0.743,     -0.495],
         [    -0.000,     -0.764,     -0.512]]]), 'species': tensor([[8, 1, 1],
        [8, 1, 1],
        [8, 1, 1],
        ...,
        [8, 1, 1],
        [8, 1, 1],
        [8, 1, 1]])}
{'energies': tensor([-56.510, -56.502, -56.507,  ..., -56.508, -56.518, -56.521],
       dtype=torch.float64), 'coordinates': tensor([[[ 0.020,  0.006, -0.078],
         [ 0.385, -0.882,  0.067],
         [ 0.318,  0.931,  0.038],
         [-0.979, -0.132,  0.169]],

        [[ 0.003, -0.015, -0.143],
         [ 0.533, -0.736,  0.341],
         [ 0.229,  0.820,  0.439],
         [-0.803,  0.118,  0.400]],

        [[-0.007,  0.010, -0.095],
         [ 0.566, -0.902,  0.221],
         [ 0.528,  0.938,  0.149],
         [-0.991, -0.180,  0.147]],

        ...,

        [[-0.019,  0.005, -0.134],
         [ 0.636, -0.745,  0.409],
         [ 0.556,  0.791,  0.365],
         [-0.922, -0.118,  0.278]],

        [[-0.003,  0.007, -0.135],
         [ 0.450, -0.787,  0.383],
         [ 0.475,  0.874,  0.336],
         [-0.889, -0.184,  0.347]],

        [[ 0.003, -0.003, -0.135],
         [ 0.538, -0.818,  0.344],
         [ 0.344,  0.872,  0.365],
         [-0.919, -0.014,  0.368]]]), 'species': tensor([[7, 1, 1, 1],
        [7, 1, 1, 1],
        [7, 1, 1, 1],
        ...,
        [7, 1, 1, 1],
        [7, 1, 1, 1],
        [7, 1, 1, 1]])}
{'energies': tensor([-77.281, -77.286, -77.283,  ..., -77.295, -77.286, -77.294],
       dtype=torch.float64), 'coordinates': tensor([[[     0.001,      0.026,      0.628],
         [    -0.017,     -0.055,     -0.625],
         [     0.058,     -0.004,      1.644],
         [     0.134,      0.347,     -1.687]],

        [[     0.014,     -0.043,      0.589],
         [    -0.043,      0.034,     -0.586],
         [     0.049,      0.220,      1.604],
         [     0.293,     -0.113,     -1.640]],

        [[    -0.043,     -0.046,      0.615],
         [     0.056,      0.031,     -0.622],
         [     0.134,      0.256,      1.711],
         [    -0.290,     -0.078,     -1.639]],

        ...,

        [[     0.029,     -0.004,      0.608],
         [    -0.024,     -0.007,     -0.605],
         [    -0.147,      0.057,      1.645],
         [     0.081,      0.070,     -1.677]],

        [[     0.021,     -0.052,      0.617],
         [    -0.041,      0.039,     -0.614],
         [    -0.017,      0.275,      1.658],
         [     0.253,     -0.116,     -1.696]],

        [[     0.002,      0.011,      0.609],
         [     0.004,     -0.010,     -0.611],
         [    -0.031,     -0.056,      1.658],
         [    -0.039,      0.036,     -1.632]]]), 'species': tensor([[6, 6, 1, 1],
        [6, 6, 1, 1],
        [6, 6, 1, 1],
        ...,
        [6, 6, 1, 1],
        [6, 6, 1, 1],
        [6, 6, 1, 1]])}
{'energies': tensor([-78.555, -78.554, -78.528,  ..., -78.548, -78.544, -78.533],
       dtype=torch.float64), 'coordinates': tensor([[[-0.657,  0.071, -0.040],
         [ 0.653, -0.061,  0.045],
         [-1.287, -0.857,  0.017],
         [-1.173,  1.006, -0.253],
         [ 1.136, -1.000,  0.200],
         [ 1.365,  0.736, -0.031]],

        [[-0.667,  0.069, -0.045],
         [ 0.664, -0.072,  0.038],
         [-1.330, -0.812,  0.087],
         [-1.113,  1.049, -0.251],
         [ 1.227, -1.012,  0.223],
         [ 1.253,  0.812,  0.042]],

        [[-0.620,  0.050, -0.014],
         [ 0.603, -0.043,  0.025],
         [-1.254, -0.800, -0.028],
         [-1.109,  1.022, -0.362],
         [ 1.122, -1.068,  0.214],
         [ 1.445,  0.778,  0.043]],

        ...,

        [[-0.663,  0.078, -0.014],
         [ 0.677, -0.068,  0.036],
         [-1.306, -0.848, -0.067],
         [-1.212,  0.950, -0.393],
         [ 1.062, -1.026,  0.174],
         [ 1.307,  0.793, -0.005]],

        [[-0.661,  0.064, -0.070],
         [ 0.642, -0.051,  0.070],
         [-1.233, -0.805,  0.138],
         [-1.218,  1.025, -0.177],
         [ 1.199, -1.082,  0.111],
         [ 1.468,  0.723, -0.083]],

        [[-0.653,  0.052, -0.009],
         [ 0.671, -0.063,  0.042],
         [-1.524, -0.691, -0.098],
         [-1.163,  0.924, -0.422],
         [ 1.144, -1.035,  0.132],
         [ 1.373,  0.911, -0.027]]]), 'species': tensor([[6, 6, 1, 1, 1, 1],
        [6, 6, 1, 1, 1, 1],
        [6, 6, 1, 1, 1, 1],
        ...,
        [6, 6, 1, 1, 1, 1],
        [6, 6, 1, 1, 1, 1],
        [6, 6, 1, 1, 1, 1]])}
{'energies': tensor([-79.764, -79.783, -79.655,  ..., -79.781, -79.745, -79.745],
       dtype=torch.float64), 'coordinates': tensor([[[     0.759,     -0.034,      0.022],
         [    -0.740,      0.007,     -0.027],
         [     1.183,      0.953,     -0.303],
         ...,
         [    -1.167,      0.760,      0.679],
         [    -1.126,     -0.958,      0.512],
         [    -1.129,      0.260,     -0.968]],

        [[     0.760,     -0.009,     -0.005],
         [    -0.757,      0.005,      0.003],
         [     1.226,      0.934,     -0.248],
         ...,
         [    -1.145,      0.696,      0.668],
         [    -1.174,     -1.024,      0.384],
         [    -1.148,      0.345,     -1.018]],

        [[     0.749,     -0.019,     -0.013],
         [    -0.770,      0.001,     -0.003],
         [     1.387,      1.111,     -0.434],
         ...,
         [    -1.185,      0.910,      0.923],
         [    -1.098,     -0.926,      0.125],
         [    -1.177,      0.091,     -0.983]],

        ...,

        [[     0.771,     -0.007,     -0.021],
         [    -0.777,     -0.001,      0.005],
         [     1.191,      0.959,     -0.270],
         ...,
         [    -1.123,      0.736,      0.779],
         [    -1.168,     -0.916,      0.247],
         [    -1.126,      0.235,     -0.988]],

        [[     0.774,      0.000,     -0.017],
         [    -0.773,      0.002,      0.008],
         [     1.184,      0.918,     -0.116],
         ...,
         [    -1.129,      0.625,      0.631],
         [    -1.219,     -1.021,      0.420],
         [    -1.159,      0.411,     -1.041]],

        [[     0.782,      0.002,      0.027],
         [    -0.792,     -0.004,     -0.014],
         [     1.118,      0.943,     -0.228],
         ...,
         [    -1.246,      0.707,      0.636],
         [    -1.167,     -1.070,      0.509],
         [    -1.014,      0.365,     -1.068]]]), 'species': tensor([[6, 6, 1,  ..., 1, 1, 1],
        [6, 6, 1,  ..., 1, 1, 1],
        [6, 6, 1,  ..., 1, 1, 1],
        ...,
        [6, 6, 1,  ..., 1, 1, 1],
        [6, 6, 1,  ..., 1, 1, 1],
        [6, 6, 1,  ..., 1, 1, 1]])}
{'energies': tensor([-114.457, -114.452, -114.453,  ..., -114.452, -114.452, -114.450],
       dtype=torch.float64), 'coordinates': tensor([[[    -0.515,      0.011,      0.019],
         [     0.666,     -0.003,     -0.005],
         [    -1.117,      0.960,     -0.079],
         [    -1.121,     -1.050,     -0.079]],

        [[    -0.510,      0.011,      0.020],
         [     0.648,     -0.009,     -0.005],
         [    -1.089,      0.981,     -0.083],
         [    -0.908,     -0.973,     -0.083]],

        [[    -0.530,     -0.025,     -0.019],
         [     0.671,      0.010,      0.004],
         [    -1.006,      1.103,      0.077],
         [    -1.132,     -0.970,      0.077]],

        ...,

        [[    -0.515,     -0.012,      0.028],
         [     0.669,      0.011,     -0.006],
         [    -1.024,      0.976,     -0.113],
         [    -1.263,     -1.005,     -0.113]],

        [[    -0.540,      0.018,     -0.005],
         [     0.693,     -0.014,      0.001],
         [    -1.336,      0.973,      0.020],
         [    -1.040,     -0.958,      0.020]],

        [[    -0.548,     -0.006,     -0.036],
         [     0.689,      0.008,      0.008],
         [    -1.008,      0.969,      0.147],
         [    -1.204,     -1.027,      0.147]]]), 'species': tensor([[6, 8, 1, 1],
        [6, 8, 1, 1],
        [6, 8, 1, 1],
        ...,
        [6, 8, 1, 1],
        [6, 8, 1, 1],
        [6, 8, 1, 1]])}
{'energies': tensor([-115.604, -115.672, -115.631,  ..., -115.627, -115.663, -115.598],
       dtype=torch.float64), 'coordinates': tensor([[[ 0.631, -0.033,  0.018],
         [-0.718,  0.126, -0.050],
         [ 1.036,  1.003,  0.231],
         [ 0.961, -0.630, -0.938],
         [ 1.143, -0.393,  0.733],
         [-1.214, -0.767,  0.579]],

        [[ 0.679, -0.010, -0.006],
         [-0.756,  0.114,  0.016],
         [ 0.974,  0.960, -0.100],
         [ 1.119, -0.522, -0.861],
         [ 1.011, -0.534,  0.941],
         [-1.150, -0.737, -0.156]],

        [[ 0.601,  0.003, -0.027],
         [-0.706,  0.113, -0.022],
         [ 0.977,  0.957,  0.172],
         [ 1.227, -0.743, -0.826],
         [ 0.892, -0.460,  0.829],
         [-1.017, -0.754,  0.498]],

        ...,

        [[ 0.691, -0.044, -0.005],
         [-0.777,  0.133, -0.010],
         [ 1.264,  1.143,  0.013],
         [ 1.214, -0.658, -0.912],
         [ 0.929, -0.461,  0.853],
         [-1.278, -0.772,  0.264]],

        [[ 0.678, -0.015, -0.030],
         [-0.768,  0.118, -0.009],
         [ 1.065,  0.936,  0.192],
         [ 1.207, -0.530, -0.839],
         [ 0.936, -0.475,  0.841],
         [-1.075, -0.781,  0.297]],

        [[ 0.623, -0.042, -0.027],
         [-0.689,  0.139, -0.025],
         [ 0.987,  0.969,  0.247],
         [ 1.040, -0.608, -0.971],
         [ 0.715, -0.399,  0.911],
         [-1.192, -0.829,  0.545]]]), 'species': tensor([[6, 8, 1, 1, 1, 1],
        [6, 8, 1, 1, 1, 1],
        [6, 8, 1, 1, 1, 1],
        ...,
        [6, 8, 1, 1, 1, 1],
        [6, 8, 1, 1, 1, 1],
        [6, 8, 1, 1, 1, 1]])}
{'energies': tensor([-95.805, -95.715, -95.634,  ..., -95.776, -95.779, -95.716],
       dtype=torch.float64), 'coordinates': tensor([[[     0.712,      0.008,      0.032],
         [    -0.749,     -0.008,     -0.127],
         [     1.198,     -0.838,     -0.455],
         ...,
         [     0.904,     -0.100,      1.049],
         [    -1.221,     -0.758,      0.196],
         [    -1.020,      0.826,      0.414]],

        [[     0.723,     -0.003,      0.012],
         [    -0.766,     -0.023,     -0.146],
         [     1.298,     -0.812,     -0.347],
         ...,
         [     1.169,     -0.146,      1.182],
         [    -1.361,     -0.687,      0.231],
         [    -1.048,      1.000,      0.730]],

        [[     0.687,     -0.021,     -0.008],
         [    -0.715,     -0.022,     -0.127],
         [     1.009,     -0.765,     -0.273],
         ...,
         [     1.199,     -0.069,      1.138],
         [    -1.243,     -0.613,      0.164],
         [    -1.099,      0.925,      0.678]],

        ...,

        [[     0.673,     -0.008,      0.009],
         [    -0.717,     -0.005,     -0.119],
         [     0.886,     -0.858,     -0.439],
         ...,
         [     1.118,      0.014,      1.105],
         [    -0.950,     -0.786,      0.277],
         [    -1.171,      0.859,      0.430]],

        [[     0.659,     -0.003,      0.000],
         [    -0.712,     -0.000,     -0.133],
         [     1.179,     -0.876,     -0.454],
         ...,
         [     1.279,     -0.106,      1.142],
         [    -1.176,     -0.749,      0.355],
         [    -1.202,      0.862,      0.510]],

        [[     0.702,     -0.009,      0.026],
         [    -0.768,      0.020,     -0.131],
         [     1.199,     -0.946,     -0.712],
         ...,
         [     1.169,      0.166,      1.057],
         [    -1.105,     -0.946,      0.663],
         [    -1.181,      0.741,      0.092]]]), 'species': tensor([[6, 7, 1,  ..., 1, 1, 1],
        [6, 7, 1,  ..., 1, 1, 1],
        [6, 7, 1,  ..., 1, 1, 1],
        ...,
        [6, 7, 1,  ..., 1, 1, 1],
        [6, 7, 1,  ..., 1, 1, 1],
        [6, 7, 1,  ..., 1, 1, 1]])}
{'energies': tensor([-110.583, -110.586, -110.583,  ..., -110.590, -110.588, -110.583],
       dtype=torch.float64), 'coordinates': tensor([[[ 0.593, -0.108,  0.008],
         [-0.600, -0.117, -0.008],
         [ 1.157,  0.750, -0.073],
         [-1.065,  0.738,  0.073]],

        [[ 0.630, -0.101, -0.007],
         [-0.644, -0.135,  0.007],
         [ 1.145,  0.755,  0.064],
         [-0.964,  0.870, -0.064]],

        [[ 0.630, -0.122, -0.012],
         [-0.628, -0.122,  0.012],
         [ 0.833,  0.846,  0.108],
         [-0.868,  0.877, -0.108]],

        ...,

        [[ 0.604, -0.117, -0.008],
         [-0.613, -0.125,  0.008],
         [ 1.123,  0.875,  0.069],
         [-0.998,  0.827, -0.069]],

        [[ 0.619, -0.112,  0.010],
         [-0.629, -0.130, -0.010],
         [ 1.171,  0.841, -0.087],
         [-1.040,  0.868,  0.087]],

        [[ 0.643, -0.133,  0.013],
         [-0.627, -0.109, -0.013],
         [ 0.989,  0.850, -0.110],
         [-1.205,  0.852,  0.110]]]), 'species': tensor([[7, 7, 1, 1],
        [7, 7, 1, 1],
        [7, 7, 1, 1],
        ...,
        [7, 7, 1, 1],
        [7, 7, 1, 1],
        [7, 7, 1, 1]])}
{'energies': tensor([-151.357, -151.426, -151.385,  ..., -151.464, -151.387, -151.416],
       dtype=torch.float64), 'coordinates': tensor([[[ 0.717,  0.125, -0.052],
         [-0.715, -0.139, -0.039],
         [ 0.939, -0.450,  0.285],
         [-0.937,  0.693,  0.285]],

        [[ 0.695,  0.157, -0.078],
         [-0.690, -0.161, -0.071],
         [ 1.175, -0.580,  0.736],
         [-1.230,  0.654,  0.702]],

        [[ 0.684,  0.151, -0.079],
         [-0.685, -0.157, -0.074],
         [ 1.217, -0.674,  0.758],
         [-1.201,  0.779,  0.767]],

        ...,

        [[ 0.711,  0.108, -0.041],
         [-0.712, -0.111, -0.039],
         [ 0.885, -0.675,  0.213],
         [-0.864,  0.739,  0.225]],

        [[ 0.715,  0.158, -0.034],
         [-0.716, -0.138, -0.050],
         [ 0.843, -0.857,  0.245],
         [-0.854,  0.527,  0.241]],

        [[ 0.712,  0.094, -0.037],
         [-0.714, -0.083, -0.048],
         [ 0.884, -0.779,  0.251],
         [-0.869,  0.587,  0.262]]]), 'species': tensor([[8, 8, 1, 1],
        [8, 8, 1, 1],
        [8, 8, 1, 1],
        ...,
        [8, 8, 1, 1],
        [8, 8, 1, 1],
        [8, 8, 1, 1]])}

To get the number of conformer groups we can use len(), or alternatively dataset.num_conformer_groups

num_groups = len(ds)
print(num_groups)

To get the number of conformers we can use num_conformers

num_conformers = ds.num_conformers
print(num_conformers)
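The difference between the two counts can be illustrated with a toy stand-in. This is only a sketch: the dictionary below is not an ANIDataset, and the group names and energy values are placeholders made up for illustration. It assumes that each group holds several conformers, so the conformer count is the sum of the group sizes.

```python
# Toy stand-in for a dataset: group name -> one energy per conformer.
# Group names and values are hypothetical placeholders, not real data.
toy_groups = {
    "group_a": [-1.0, -1.1, -1.2],
    "group_b": [-2.0, -2.1],
}

# Number of conformer groups: one entry per key.
toy_num_groups = len(toy_groups)

# Number of conformers: sum of the sizes of all groups.
toy_num_conformers = sum(len(energies) for energies in toy_groups.values())

print(toy_num_groups, toy_num_conformers)
```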

Conformers

To access individual conformers or subsets of conformers we use the “conformer” methods, get_conformers and iter_conformers.

conformer = ds.get_conformers("file1/CH4", 0)
print(conformer)
conformer = ds.get_conformers("file1/CH4", 1)
print(conformer)

{'energies': tensor(-40.481, dtype=torch.float64), 'coordinates': tensor([[-0.003,  0.010,  0.019],
        [-0.795,  0.577, -0.547],
        [-0.394, -0.980,  0.272],
        [ 0.634,  0.447,  0.936],
        [ 0.596, -0.165, -0.892]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.483, dtype=torch.float64), 'coordinates': tensor([[ 0.003, -0.020,  0.003],
        [-0.783,  0.792, -0.260],
        [-0.454, -1.030,  0.312],
        [ 0.447,  0.636,  0.768],
        [ 0.753, -0.159, -0.853]]), 'species': tensor([6, 1, 1, 1, 1])}

A tensor, list, or array can also be passed as the index to fetch multiple conformers from the same group, which is faster. Since we copy the data for simplicity, this allows all fancy indexing operations (directly indexing using h5py, for example, does not).

conformers = ds.get_conformers("file1/CH4", [0, 1])
print(conformers)

{'energies': tensor([-40.481, -40.483], dtype=torch.float64), 'coordinates': tensor([[[-0.003,  0.010,  0.019],
         [-0.795,  0.577, -0.547],
         [-0.394, -0.980,  0.272],
         [ 0.634,  0.447,  0.936],
         [ 0.596, -0.165, -0.892]],

        [[ 0.003, -0.020,  0.003],
         [-0.783,  0.792, -0.260],
         [-0.454, -1.030,  0.312],
         [ 0.447,  0.636,  0.768],
         [ 0.753, -0.159, -0.853]]]), 'species': tensor([[6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1]])}

We can also access the whole group if we don’t pass an index, the same as with normal indexing.

conformer = ds.get_conformers("file1/CH4")
print(conformer)

{'energies': tensor([-40.481, -40.483, -40.485,  ..., -40.496, -40.456, -40.465],
       dtype=torch.float64), 'coordinates': tensor([[[-0.003,  0.010,  0.019],
         [-0.795,  0.577, -0.547],
         [-0.394, -0.980,  0.272],
         [ 0.634,  0.447,  0.936],
         [ 0.596, -0.165, -0.892]],

        [[ 0.003, -0.020,  0.003],
         [-0.783,  0.792, -0.260],
         [-0.454, -1.030,  0.312],
         [ 0.447,  0.636,  0.768],
         [ 0.753, -0.159, -0.853]],

        [[-0.018, -0.022, -0.011],
         [-0.730,  0.682, -0.308],
         [-0.382, -0.868,  0.382],
         [ 0.569,  0.505,  0.804],
         [ 0.755, -0.059, -0.744]],

        ...,

        [[ 0.009,  0.002, -0.009],
         [-0.855,  0.627, -0.261],
         [-0.471, -0.946,  0.325],
         [ 0.488,  0.509,  0.859],
         [ 0.734, -0.216, -0.817]],

        [[ 0.003,  0.017,  0.030],
         [-0.764,  0.542, -0.671],
         [-0.425, -1.045,  0.293],
         [ 0.657,  0.410,  0.947],
         [ 0.500, -0.106, -0.922]],

        [[ 0.035, -0.002, -0.008],
         [-0.954,  0.572, -0.159],
         [-0.633, -0.924,  0.224],
         [ 0.454,  0.590,  0.751],
         [ 0.711, -0.216, -0.720]]]), 'species': tensor([[6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1],
        ...,
        [6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1]])}

Finally, it is also possible to specify which properties we want using the properties argument.

conformer = ds.get_conformers("file1/CH4", [0, 3], properties=("species", "energies"))
print(conformer)

{'energies': tensor([-40.481, -40.492], dtype=torch.float64), 'species': tensor([[6, 1, 1, 1, 1],
        [6, 1, 1, 1, 1]])}
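The selection rules used in the get_conformers calls above (a single index, a list of indices, no index, and an optional list of properties) can be mimicked on plain Python lists. The helper select_conformers below is hypothetical and is not how torchani implements get_conformers; it is only a sketch of the semantics, using the rounded CH4 energies printed in this example.

```python
def select_conformers(group, idx=None, properties=None):
    """Hypothetical helper (not part of torchani) mimicking the selection rules."""
    # Keep only the requested properties, or all of them if none are given.
    keys = properties if properties is not None else tuple(group)
    out = {}
    for key in keys:
        values = group[key]
        if idx is None:
            # No index: return the whole group.
            out[key] = list(values)
        elif isinstance(idx, int):
            # Integer index: return a single conformer.
            out[key] = values[idx]
        else:
            # Sequence of indices: return several conformers at once.
            out[key] = [values[i] for i in idx]
    return out


# Toy group: every property has one entry per conformer.
toy_group = {
    "energies": [-40.481, -40.483, -40.485, -40.492],
    "species": [[6, 1, 1, 1, 1] for _ in range(4)],
}

single = select_conformers(toy_group, 0)
pair = select_conformers(toy_group, [0, 3], properties=("species", "energies"))
everything = select_conformers(toy_group)
print(pair)
```

Because the toy group is already held in memory as ordinary lists, any index order works here, which is the same reason given above for why copying the data permits fancy indexing.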

If you want, you can also get the conformers as numpy arrays by calling get_numpy_conformers. This has an optional flag, chem_symbols, which if set to True will output the elements as strings ('C', 'H', 'H', etc.) instead of atomic numbers.

conformer = ds.get_numpy_conformers("file1/CH4", [0, 1], chem_symbols=True)
print(conformer)

{'energies': array([-40.48058817, -40.48311923]), 'coordinates': array([[[-0.0034502 ,  0.01017081,  0.01938033],
        [-0.7954868 ,  0.5766599 , -0.5472012 ],
        [-0.39378393, -0.97992676,  0.2722862 ],
        [ 0.6344988 ,  0.4473651 ,  0.93568736],
        [ 0.59581804, -0.16517928, -0.8915708 ]],

       [[ 0.00311385, -0.02007288,  0.00282224],
        [-0.78331304,  0.7921426 , -0.26027855],
        [-0.45410746, -1.0295471 ,  0.31240797],
        [ 0.44713658,  0.63571125,  0.76770777],
        [ 0.7531731 , -0.1592813 , -0.85348135]]], dtype=float32), 'species': array([['C', 'H', 'H', 'H', 'H'],
       ['C', 'H', 'H', 'H', 'H']], dtype='<U1')}
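The effect of chem_symbols=True amounts to replacing each atomic number with its chemical symbol. A minimal sketch of that mapping, restricted to the elements that appear in this example, is shown below; SYMBOLS and to_symbols are illustrative names, and torchani's own conversion may be implemented differently.

```python
# Atomic number -> chemical symbol, only for the elements seen in this example.
SYMBOLS = {1: "H", 6: "C", 7: "N", 8: "O"}


def to_symbols(species):
    """Convert rows of atomic numbers into rows of chemical symbols."""
    return [[SYMBOLS[z] for z in row] for row in species]


# Species of the first two CH4 conformers, as printed above.
species = [[6, 1, 1, 1, 1], [6, 1, 1, 1, 1]]
symbols = to_symbols(species)
print(symbols)
```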

We can iterate over all conformers sequentially by calling iter_conformers. This is faster than doing it manually, since it caches each conformer group before starting the iteration. Here we print the first 100 as a sample.

for c in ds.iter_conformers(limit=100):
    print(c)

{'energies': tensor(-40.481, dtype=torch.float64), 'coordinates': tensor([[-0.003,  0.010,  0.019],
        [-0.795,  0.577, -0.547],
        [-0.394, -0.980,  0.272],
        [ 0.634,  0.447,  0.936],
        [ 0.596, -0.165, -0.892]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.483, dtype=torch.float64), 'coordinates': tensor([[ 0.003, -0.020,  0.003],
        [-0.783,  0.792, -0.260],
        [-0.454, -1.030,  0.312],
        [ 0.447,  0.636,  0.768],
        [ 0.753, -0.159, -0.853]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.485, dtype=torch.float64), 'coordinates': tensor([[-0.018, -0.022, -0.011],
        [-0.730,  0.682, -0.308],
        [-0.382, -0.868,  0.382],
        [ 0.569,  0.505,  0.804],
        [ 0.755, -0.059, -0.744]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.492, dtype=torch.float64), 'coordinates': tensor([[-0.007,  0.013,  0.015],
        [-0.828,  0.624, -0.399],
        [-0.413, -1.013,  0.289],
        [ 0.574,  0.438,  0.783],
        [ 0.754, -0.207, -0.848]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.456, dtype=torch.float64), 'coordinates': tensor([[ 0.007,  0.035,  0.028],
        [-1.012,  0.457, -0.554],
        [-0.519, -0.962,  0.271],
        [ 0.698,  0.285,  0.818],
        [ 0.749, -0.196, -0.868]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.489, dtype=torch.float64), 'coordinates': tensor([[-0.010, -0.012,  0.011],
        [-0.788,  0.640, -0.412],
        [-0.475, -0.980,  0.279],
        [ 0.656,  0.565,  0.809],
        [ 0.726, -0.079, -0.805]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.488, dtype=torch.float64), 'coordinates': tensor([[-0.006, -0.011, -0.013],
        [-0.731,  0.733, -0.327],
        [-0.307, -0.918,  0.387],
        [ 0.459,  0.476,  0.874],
        [ 0.648, -0.160, -0.784]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.497, dtype=torch.float64), 'coordinates': tensor([[-0.003,  0.004, -0.004],
        [-0.828,  0.666, -0.301],
        [-0.377, -0.942,  0.322],
        [ 0.501,  0.455,  0.816],
        [ 0.734, -0.224, -0.786]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.479, dtype=torch.float64), 'coordinates': tensor([[-0.009,  0.009, -0.019],
        [-0.728,  0.646, -0.245],
        [-0.289, -0.919,  0.351],
        [ 0.447,  0.431,  0.867],
        [ 0.673, -0.264, -0.744]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.498, dtype=torch.float64), 'coordinates': tensor([[    -0.007,     -0.000,     -0.002],
        [    -0.817,      0.676,     -0.322],
        [    -0.381,     -0.955,      0.314],
        [     0.537,      0.487,      0.830],
        [     0.742,     -0.205,     -0.800]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.491, dtype=torch.float64), 'coordinates': tensor([[     0.007,      0.000,      0.008],
        [    -0.754,      0.594,     -0.402],
        [    -0.461,     -0.930,      0.376],
        [     0.472,      0.443,      0.852],
        [     0.665,     -0.110,     -0.927]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.465, dtype=torch.float64), 'coordinates': tensor([[ 0.023,  0.010, -0.011],
        [-0.944,  0.498, -0.285],
        [-0.569, -0.812,  0.429],
        [ 0.463,  0.318,  0.791],
        [ 0.772, -0.119, -0.807]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.452, dtype=torch.float64), 'coordinates': tensor([[-0.021,  0.015, -0.014],
        [-0.753,  0.591, -0.193],
        [-0.397, -1.036,  0.242],
        [ 0.623,  0.506,  0.684],
        [ 0.781, -0.235, -0.569]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.481, dtype=torch.float64), 'coordinates': tensor([[-0.011, -0.028, -0.016],
        [-0.733,  0.780, -0.205],
        [-0.393, -0.955,  0.329],
        [ 0.500,  0.659,  0.838],
        [ 0.760, -0.149, -0.774]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.484, dtype=torch.float64), 'coordinates': tensor([[ 0.010,  0.013, -0.012],
        [-0.928,  0.628, -0.352],
        [-0.370, -0.867,  0.397],
        [ 0.474,  0.348,  0.957],
        [ 0.710, -0.258, -0.856]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.496, dtype=torch.float64), 'coordinates': tensor([[ 0.002, -0.011,  0.001],
        [-0.752,  0.659, -0.377],
        [-0.448, -0.946,  0.374],
        [ 0.505,  0.492,  0.817],
        [ 0.667, -0.075, -0.831]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.472, dtype=torch.float64), 'coordinates': tensor([[ 0.033, -0.008, -0.018],
        [-0.818,  0.590, -0.315],
        [-0.534, -0.821,  0.527],
        [ 0.330,  0.378,  0.906],
        [ 0.627, -0.049, -0.904]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.458, dtype=torch.float64), 'coordinates': tensor([[-0.005, -0.008, -0.010],
        [-0.738,  0.838, -0.357],
        [-0.189, -0.964,  0.352],
        [ 0.415,  0.458,  0.849],
        [ 0.574, -0.233, -0.729]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.484, dtype=torch.float64), 'coordinates': tensor([[-0.018, -0.003, -0.004],
        [-0.759,  0.817, -0.357],
        [-0.251, -1.011,  0.409],
        [ 0.478,  0.425,  0.798],
        [ 0.742, -0.194, -0.798]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.452, dtype=torch.float64), 'coordinates': tensor([[-0.012, -0.034,  0.015],
        [-0.687,  0.752, -0.402],
        [-0.472, -0.930,  0.471],
        [ 0.496,  0.477,  0.636],
        [ 0.807,  0.104, -0.886]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.462, dtype=torch.float64), 'coordinates': tensor([[-0.044, -0.007, -0.033],
        [-0.781,  0.766, -0.347],
        [-0.228, -0.925,  0.542],
        [ 0.650,  0.368,  0.949],
        [ 0.879, -0.127, -0.753]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.480, dtype=torch.float64), 'coordinates': tensor([[-0.023, -0.015,  0.004],
        [-0.686,  0.702, -0.419],
        [-0.406, -0.990,  0.449],
        [ 0.581,  0.483,  0.825],
        [ 0.783, -0.018, -0.899]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.489, dtype=torch.float64), 'coordinates': tensor([[-0.003,  0.002,  0.023],
        [-0.745,  0.626, -0.414],
        [-0.379, -0.947,  0.234],
        [ 0.517,  0.477,  0.754],
        [ 0.644, -0.176, -0.841]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.473, dtype=torch.float64), 'coordinates': tensor([[-0.002,  0.024, -0.015],
        [-0.825,  0.526, -0.225],
        [-0.462, -0.978,  0.329],
        [ 0.522,  0.470,  0.928],
        [ 0.788, -0.298, -0.855]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.492, dtype=torch.float64), 'coordinates': tensor([[-0.010, -0.014, -0.004],
        [-0.818,  0.746, -0.280],
        [-0.374, -0.938,  0.312],
        [ 0.526,  0.540,  0.788],
        [ 0.781, -0.181, -0.777]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.489, dtype=torch.float64), 'coordinates': tensor([[-0.002,  0.014,  0.004],
        [-0.821,  0.646, -0.290],
        [-0.415, -1.055,  0.232],
        [ 0.528,  0.543,  0.829],
        [ 0.728, -0.300, -0.819]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.453, dtype=torch.float64), 'coordinates': tensor([[ 0.017, -0.006, -0.034],
        [-0.691,  0.708, -0.166],
        [-0.373, -0.949,  0.426],
        [ 0.294,  0.524,  0.910],
        [ 0.571, -0.210, -0.761]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.468, dtype=torch.float64), 'coordinates': tensor([[ 0.029, -0.014,  0.035],
        [-0.875,  0.561, -0.464],
        [-0.652, -0.928,  0.219],
        [ 0.535,  0.598,  0.819],
        [ 0.642, -0.066, -0.994]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.464, dtype=torch.float64), 'coordinates': tensor([[-0.007, -0.024, -0.010],
        [-0.667,  0.865, -0.327],
        [-0.316, -1.064,  0.382],
        [ 0.440,  0.656,  0.955],
        [ 0.631, -0.166, -0.886]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.479, dtype=torch.float64), 'coordinates': tensor([[ 0.024, -0.009,  0.007],
        [-0.917,  0.537, -0.344],
        [-0.563, -0.782,  0.313],
        [ 0.490,  0.449,  0.812],
        [ 0.704, -0.102, -0.859]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.495, dtype=torch.float64), 'coordinates': tensor([[     0.001,     -0.005,      0.003],
        [    -0.812,      0.604,     -0.376],
        [    -0.462,     -0.917,      0.309],
        [     0.575,      0.483,      0.804],
        [     0.689,     -0.108,     -0.778]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.491, dtype=torch.float64), 'coordinates': tensor([[-0.008,  0.010,  0.013],
        [-0.743,  0.659, -0.416],
        [-0.320, -1.019,  0.248],
        [ 0.537,  0.469,  0.808],
        [ 0.619, -0.232, -0.795]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.460, dtype=torch.float64), 'coordinates': tensor([[ 0.019,  0.001, -0.036],
        [-0.764,  0.663, -0.319],
        [-0.389, -0.903,  0.590],
        [ 0.339,  0.340,  0.998],
        [ 0.585, -0.116, -0.838]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.494, dtype=torch.float64), 'coordinates': tensor([[ 0.011,  0.004,  0.004],
        [-0.788,  0.630, -0.317],
        [-0.463, -0.981,  0.323],
        [ 0.441,  0.508,  0.849],
        [ 0.684, -0.201, -0.901]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.460, dtype=torch.float64), 'coordinates': tensor([[ 0.035, -0.020, -0.004],
        [-0.741,  0.608, -0.192],
        [-0.605, -0.947,  0.237],
        [ 0.367,  0.730,  0.869],
        [ 0.564, -0.155, -0.861]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.494, dtype=torch.float64), 'coordinates': tensor([[     0.000,      0.003,      0.011],
        [    -0.852,      0.660,     -0.378],
        [    -0.395,     -0.968,      0.231],
        [     0.560,      0.499,      0.800],
        [     0.683,     -0.223,     -0.782]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.490, dtype=torch.float64), 'coordinates': tensor([[ 0.001, -0.008,  0.002],
        [-0.792,  0.710, -0.395],
        [-0.409, -1.003,  0.321],
        [ 0.534,  0.554,  0.910],
        [ 0.654, -0.165, -0.864]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.470, dtype=torch.float64), 'coordinates': tensor([[-0.005, -0.021, -0.001],
        [-0.661,  0.607, -0.302],
        [-0.501, -0.897,  0.381],
        [ 0.508,  0.526,  0.719],
        [ 0.708,  0.018, -0.782]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.497, dtype=torch.float64), 'coordinates': tensor([[-0.003,  0.004, -0.003],
        [-0.813,  0.689, -0.321],
        [-0.350, -0.956,  0.330],
        [ 0.486,  0.455,  0.837],
        [ 0.709, -0.232, -0.805]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.498, dtype=torch.float64), 'coordinates': tensor([[-0.001,  0.002, -0.005],
        [-0.828,  0.663, -0.320],
        [-0.398, -0.945,  0.353],
        [ 0.502,  0.455,  0.848],
        [ 0.736, -0.200, -0.824]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.471, dtype=torch.float64), 'coordinates': tensor([[ 0.006, -0.013,  0.006],
        [-0.760,  0.628, -0.404],
        [-0.570, -0.985,  0.447],
        [ 0.529,  0.457,  0.712],
        [ 0.724,  0.053, -0.833]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.494, dtype=torch.float64), 'coordinates': tensor([[     0.003,     -0.000,     -0.014],
        [    -0.856,      0.668,     -0.352],
        [    -0.398,     -0.922,      0.410],
        [     0.516,      0.417,      0.895],
        [     0.698,     -0.160,     -0.787]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.495, dtype=torch.float64), 'coordinates': tensor([[     0.010,     -0.000,     -0.004],
        [    -0.828,      0.591,     -0.349],
        [    -0.500,     -0.946,      0.346],
        [     0.530,      0.499,      0.895],
        [     0.681,     -0.144,     -0.847]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.468, dtype=torch.float64), 'coordinates': tensor([[-0.025,  0.010,  0.018],
        [-0.864,  0.565, -0.491],
        [-0.406, -0.989,  0.193],
        [ 0.806,  0.489,  0.854],
        [ 0.762, -0.186, -0.775]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.459, dtype=torch.float64), 'coordinates': tensor([[ 0.011,  0.030,  0.034],
        [-0.835,  0.530, -0.524],
        [-0.369, -0.970,  0.177],
        [ 0.506,  0.431,  0.989],
        [ 0.562, -0.351, -1.042]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.455, dtype=torch.float64), 'coordinates': tensor([[-0.027, -0.032, -0.021],
        [-0.623,  0.815, -0.305],
        [-0.416, -1.043,  0.573],
        [ 0.524,  0.562,  0.844],
        [ 0.831,  0.043, -0.867]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.486, dtype=torch.float64), 'coordinates': tensor([[-0.019,  0.020, -0.007],
        [-0.796,  0.600, -0.372],
        [-0.292, -0.964,  0.306],
        [ 0.612,  0.393,  0.896],
        [ 0.700, -0.268, -0.741]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.481, dtype=torch.float64), 'coordinates': tensor([[-0.008,  0.021, -0.015],
        [-0.896,  0.692, -0.265],
        [-0.291, -0.970,  0.345],
        [ 0.492,  0.397,  0.909],
        [ 0.794, -0.367, -0.808]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.474, dtype=torch.float64), 'coordinates': tensor([[-0.013,  0.016,  0.006],
        [-0.839,  0.612, -0.504],
        [-0.393, -0.980,  0.464],
        [ 0.620,  0.260,  0.789],
        [ 0.771, -0.080, -0.819]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.485, dtype=torch.float64), 'coordinates': tensor([[ 0.007,  0.032, -0.005],
        [-0.869,  0.537, -0.327],
        [-0.431, -0.981,  0.349],
        [ 0.499,  0.331,  0.823],
        [ 0.713, -0.264, -0.788]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.456, dtype=torch.float64), 'coordinates': tensor([[ 0.007, -0.023,  0.011],
        [-0.745,  0.668, -0.201],
        [-0.606, -1.059,  0.157],
        [ 0.538,  0.775,  0.652],
        [ 0.728, -0.114, -0.744]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.486, dtype=torch.float64), 'coordinates': tensor([[-0.012,  0.016,  0.020],
        [-0.767,  0.588, -0.518],
        [-0.380, -1.012,  0.320],
        [ 0.613,  0.402,  0.874],
        [ 0.675, -0.164, -0.910]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.488, dtype=torch.float64), 'coordinates': tensor([[-0.013, -0.022, -0.005],
        [-0.728,  0.730, -0.387],
        [-0.362, -0.915,  0.403],
        [ 0.548,  0.515,  0.883],
        [ 0.699, -0.066, -0.833]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.474, dtype=torch.float64), 'coordinates': tensor([[ 0.002, -0.003,  0.016],
        [-0.832,  0.586, -0.464],
        [-0.418, -0.865,  0.270],
        [ 0.592,  0.404,  0.745],
        [ 0.638, -0.094, -0.748]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.487, dtype=torch.float64), 'coordinates': tensor([[     0.001,      0.018,      0.013],
        [    -0.845,      0.558,     -0.356],
        [    -0.482,     -1.030,      0.225],
        [     0.584,      0.506,      0.829],
        [     0.731,     -0.254,     -0.855]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.481, dtype=torch.float64), 'coordinates': tensor([[ 0.005, -0.011, -0.022],
        [-0.874,  0.795, -0.202],
        [-0.356, -0.914,  0.410],
        [ 0.392,  0.491,  0.864],
        [ 0.778, -0.240, -0.806]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.458, dtype=torch.float64), 'coordinates': tensor([[ 0.009,  0.033, -0.042],
        [-0.901,  0.694, -0.158],
        [-0.275, -0.965,  0.488],
        [ 0.315,  0.293,  0.965],
        [ 0.758, -0.421, -0.798]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.454, dtype=torch.float64), 'coordinates': tensor([[ 0.041, -0.003,  0.001],
        [-0.949,  0.623, -0.154],
        [-0.637, -1.004,  0.136],
        [ 0.427,  0.700,  0.767],
        [ 0.667, -0.283, -0.763]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.495, dtype=torch.float64), 'coordinates': tensor([[    -0.000,      0.003,     -0.016],
        [    -0.829,      0.638,     -0.250],
        [    -0.437,     -0.958,      0.362],
        [     0.516,      0.473,      0.825],
        [     0.751,     -0.191,     -0.741]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.484, dtype=torch.float64), 'coordinates': tensor([[-0.018, -0.013,  0.019],
        [-0.683,  0.645, -0.442],
        [-0.388, -0.923,  0.312],
        [ 0.574,  0.512,  0.813],
        [ 0.707, -0.074, -0.909]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.481, dtype=torch.float64), 'coordinates': tensor([[     0.014,     -0.012,      0.000],
        [    -0.877,      0.591,     -0.324],
        [    -0.568,     -0.858,      0.373],
        [     0.503,      0.514,      0.889],
        [     0.773,     -0.098,     -0.944]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.487, dtype=torch.float64), 'coordinates': tensor([[ 0.002,  0.007, -0.023],
        [-0.893,  0.640, -0.267],
        [-0.467, -0.990,  0.393],
        [ 0.568,  0.449,  0.839],
        [ 0.771, -0.178, -0.695]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.493, dtype=torch.float64), 'coordinates': tensor([[ 0.006, -0.001,  0.002],
        [-0.871,  0.702, -0.361],
        [-0.389, -0.955,  0.296],
        [ 0.505,  0.499,  0.876],
        [ 0.681, -0.230, -0.832]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.459, dtype=torch.float64), 'coordinates': tensor([[-0.038,  0.005, -0.030],
        [-0.687,  0.647, -0.295],
        [-0.289, -0.922,  0.553],
        [ 0.578,  0.307,  0.827],
        [ 0.856, -0.089, -0.729]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.451, dtype=torch.float64), 'coordinates': tensor([[-0.042,  0.013, -0.008],
        [-0.613,  0.735, -0.329],
        [-0.145, -0.996,  0.432],
        [ 0.489,  0.321,  0.741],
        [ 0.767, -0.210, -0.747]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.480, dtype=torch.float64), 'coordinates': tensor([[ 0.002, -0.018,  0.013],
        [-0.728,  0.784, -0.341],
        [-0.441, -1.051,  0.330],
        [ 0.435,  0.618,  0.794],
        [ 0.708, -0.132, -0.936]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.475, dtype=torch.float64), 'coordinates': tensor([[-0.023, -0.009, -0.015],
        [-0.858,  0.822, -0.278],
        [-0.280, -1.003,  0.311],
        [ 0.601,  0.540,  0.836],
        [ 0.811, -0.252, -0.692]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.464, dtype=torch.float64), 'coordinates': tensor([[ 0.017, -0.003, -0.011],
        [-0.907,  0.507, -0.165],
        [-0.681, -0.884,  0.355],
        [ 0.515,  0.527,  0.768],
        [ 0.874, -0.111, -0.824]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.461, dtype=torch.float64), 'coordinates': tensor([[-0.011, -0.019,  0.032],
        [-0.627,  0.678, -0.520],
        [-0.492, -1.077,  0.330],
        [ 0.602,  0.544,  0.647],
        [ 0.647,  0.076, -0.840]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.460, dtype=torch.float64), 'coordinates': tensor([[     0.035,     -0.008,      0.000],
        [    -0.904,      0.724,     -0.371],
        [    -0.406,     -0.927,      0.239],
        [     0.404,      0.575,      0.984],
        [     0.494,     -0.275,     -0.853]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.499, dtype=torch.float64), 'coordinates': tensor([[     0.001,     -0.002,      0.004],
        [    -0.773,      0.661,     -0.371],
        [    -0.414,     -0.976,      0.328],
        [     0.508,      0.497,      0.839],
        [     0.670,     -0.157,     -0.842]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.486, dtype=torch.float64), 'coordinates': tensor([[ 0.020, -0.013,  0.010],
        [-0.796,  0.650, -0.342],
        [-0.490, -0.903,  0.298],
        [ 0.408,  0.576,  0.900],
        [ 0.638, -0.164, -0.977]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.497, dtype=torch.float64), 'coordinates': tensor([[-0.008, -0.005, -0.008],
        [-0.791,  0.712, -0.333],
        [-0.364, -0.965,  0.367],
        [ 0.525,  0.492,  0.871],
        [ 0.723, -0.176, -0.808]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.489, dtype=torch.float64), 'coordinates': tensor([[    -0.010,     -0.000,      0.005],
        [    -0.792,      0.642,     -0.432],
        [    -0.410,     -0.950,      0.405],
        [     0.578,      0.376,      0.759],
        [     0.740,     -0.067,     -0.795]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.493, dtype=torch.float64), 'coordinates': tensor([[-0.014,  0.006, -0.006],
        [-0.772,  0.667, -0.268],
        [-0.356, -0.978,  0.317],
        [ 0.518,  0.488,  0.819],
        [ 0.772, -0.242, -0.796]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.466, dtype=torch.float64), 'coordinates': tensor([[-0.010, -0.021,  0.011],
        [-0.712,  0.803, -0.432],
        [-0.325, -0.930,  0.468],
        [ 0.422,  0.427,  0.792],
        [ 0.733, -0.055, -0.958]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.494, dtype=torch.float64), 'coordinates': tensor([[-0.003,  0.002,  0.018],
        [-0.819,  0.639, -0.398],
        [-0.396, -0.966,  0.201],
        [ 0.579,  0.512,  0.778],
        [ 0.673, -0.208, -0.790]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.466, dtype=torch.float64), 'coordinates': tensor([[ 0.014,  0.021, -0.031],
        [-0.942,  0.565, -0.053],
        [-0.456, -0.897,  0.275],
        [ 0.421,  0.500,  0.906],
        [ 0.809, -0.417, -0.759]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.491, dtype=torch.float64), 'coordinates': tensor([[    -0.018,      0.011,     -0.001],
        [    -0.753,      0.608,     -0.404],
        [    -0.336,     -0.969,      0.331],
        [     0.609,      0.412,      0.851],
        [     0.693,     -0.178,     -0.767]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.465, dtype=torch.float64), 'coordinates': tensor([[    -0.033,     -0.008,      0.000],
        [    -0.701,      0.769,     -0.538],
        [    -0.260,     -1.031,      0.457],
        [     0.653,      0.432,      0.941],
        [     0.702,     -0.070,     -0.862]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.492, dtype=torch.float64), 'coordinates': tensor([[     0.000,      0.006,     -0.004],
        [    -0.856,      0.675,     -0.374],
        [    -0.371,     -0.941,      0.391],
        [     0.499,      0.408,      0.910],
        [     0.725,     -0.212,     -0.873]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.498, dtype=torch.float64), 'coordinates': tensor([[ 0.002,  0.003, -0.003],
        [-0.792,  0.652, -0.318],
        [-0.415, -0.970,  0.347],
        [ 0.485,  0.470,  0.826],
        [ 0.701, -0.185, -0.814]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.464, dtype=torch.float64), 'coordinates': tensor([[ 0.049,  0.011,  0.006],
        [-0.828,  0.487, -0.376],
        [-0.607, -0.899,  0.390],
        [ 0.318,  0.410,  0.901],
        [ 0.534, -0.132, -0.987]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.498, dtype=torch.float64), 'coordinates': tensor([[ 0.002, -0.009,  0.006],
        [-0.804,  0.697, -0.365],
        [-0.419, -0.960,  0.310],
        [ 0.512,  0.522,  0.817],
        [ 0.690, -0.151, -0.835]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.472, dtype=torch.float64), 'coordinates': tensor([[    -0.028,     -0.000,      0.026],
        [    -0.789,      0.736,     -0.546],
        [    -0.347,     -1.089,      0.307],
        [     0.710,      0.462,      0.759],
        [     0.765,     -0.105,     -0.828]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.477, dtype=torch.float64), 'coordinates': tensor([[ 0.009,  0.002,  0.012],
        [-0.919,  0.622, -0.505],
        [-0.458, -0.935,  0.337],
        [ 0.593,  0.436,  0.952],
        [ 0.672, -0.152, -0.923]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.494, dtype=torch.float64), 'coordinates': tensor([[ 0.014,  0.007, -0.004],
        [-0.845,  0.621, -0.360],
        [-0.411, -0.921,  0.340],
        [ 0.465,  0.424,  0.879],
        [ 0.624, -0.205, -0.813]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.468, dtype=torch.float64), 'coordinates': tensor([[ 0.022, -0.016,  0.014],
        [-0.770,  0.650, -0.235],
        [-0.596, -1.021,  0.187],
        [ 0.431,  0.743,  0.797],
        [ 0.674, -0.183, -0.920]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.487, dtype=torch.float64), 'coordinates': tensor([[     0.015,     -0.001,      0.004],
        [    -0.842,      0.545,     -0.340],
        [    -0.586,     -0.966,      0.285],
        [     0.567,      0.545,      0.822],
        [     0.683,     -0.114,     -0.811]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.477, dtype=torch.float64), 'coordinates': tensor([[ 0.018, -0.005, -0.030],
        [-0.806,  0.703, -0.206],
        [-0.454, -0.985,  0.427],
        [ 0.383,  0.558,  0.987],
        [ 0.663, -0.220, -0.847]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.459, dtype=torch.float64), 'coordinates': tensor([[-0.004, -0.032, -0.004],
        [-0.905,  0.691, -0.244],
        [-0.559, -0.842,  0.363],
        [ 0.595,  0.524,  0.624],
        [ 0.912,  0.009, -0.696]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.480, dtype=torch.float64), 'coordinates': tensor([[-0.030,  0.006, -0.020],
        [-0.740,  0.771, -0.289],
        [-0.246, -1.051,  0.408],
        [ 0.549,  0.469,  0.899],
        [ 0.790, -0.255, -0.774]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.465, dtype=torch.float64), 'coordinates': tensor([[ 0.005, -0.009,  0.026],
        [-0.729,  0.655, -0.554],
        [-0.540, -1.099,  0.360],
        [ 0.594,  0.527,  0.789],
        [ 0.610,  0.024, -0.905]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.471, dtype=torch.float64), 'coordinates': tensor([[-0.022,  0.012, -0.015],
        [-0.758,  0.777, -0.303],
        [-0.270, -1.118,  0.385],
        [ 0.544,  0.425,  0.749],
        [ 0.746, -0.231, -0.649]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.463, dtype=torch.float64), 'coordinates': tensor([[ 0.015, -0.014, -0.018],
        [-0.851,  0.659, -0.343],
        [-0.397, -0.738,  0.520],
        [ 0.381,  0.319,  0.892],
        [ 0.688, -0.075, -0.861]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.491, dtype=torch.float64), 'coordinates': tensor([[ 0.014, -0.003, -0.018],
        [-0.848,  0.576, -0.271],
        [-0.519, -0.885,  0.383],
        [ 0.499,  0.484,  0.893],
        [ 0.700, -0.133, -0.792]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.488, dtype=torch.float64), 'coordinates': tensor([[-0.008,  0.021,  0.011],
        [-0.857,  0.564, -0.374],
        [-0.370, -0.952,  0.216],
        [ 0.595,  0.439,  0.865],
        [ 0.731, -0.306, -0.837]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.468, dtype=torch.float64), 'coordinates': tensor([[     0.000,      0.024,      0.002],
        [    -0.870,      0.541,     -0.504],
        [    -0.367,     -0.960,      0.298],
        [     0.631,      0.396,      1.054],
        [     0.600,     -0.263,     -0.873]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.478, dtype=torch.float64), 'coordinates': tensor([[-0.015, -0.020,  0.001],
        [-0.687,  0.684, -0.421],
        [-0.412, -0.939,  0.422],
        [ 0.585,  0.454,  0.722],
        [ 0.697,  0.045, -0.738]]), 'species': tensor([6, 1, 1, 1, 1])}
{'energies': tensor(-40.491, dtype=torch.float64), 'coordinates': tensor([[ 0.005, -0.002, -0.020],
        [-0.809,  0.667, -0.308],
        [-0.444, -0.966,  0.452],
        [ 0.486,  0.465,  0.925],
        [ 0.711, -0.140, -0.828]]), 'species': tensor([6, 1, 1, 1, 1])}

Finally, to clean up, we delete the two files we copied at the start of this example:

file1_path.unlink()
file2_path.unlink()
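
If the example fails before reaching the cleanup step, the copied files are left behind. One possible alternative, shown here as a minimal sketch rather than as part of torchani, is to copy into a temporary directory that is removed automatically. The helper `copy_into_scratch` and the file `placeholder.h5` are hypothetical names used only for illustration; an empty stand-in file replaces the real `.h5` data so the sketch runs on its own.

```python
import shutil
import tempfile
from pathlib import Path


def copy_into_scratch(sources, scratch_dir):
    """Copy each source file into scratch_dir and return the new paths.

    Hypothetical helper for this sketch; it is not a torchani function.
    """
    scratch_dir = Path(scratch_dir)
    return [Path(shutil.copy(src, scratch_dir / Path(src).name)) for src in sources]


# An empty stand-in file takes the place of the real .h5 data.
with tempfile.TemporaryDirectory() as src_dir, tempfile.TemporaryDirectory() as scratch:
    placeholder = Path(src_dir) / "placeholder.h5"
    placeholder.write_bytes(b"")

    copies = copy_into_scratch([placeholder], scratch)
    copies_existed = all(p.exists() for p in copies)
    # The dataset would be built from `copies` and used here.

# Leaving the `with` block removes both directories and everything in them,
# even if an exception was raised inside the block.
copies_removed = not any(p.exists() for p in copies)
```

With this layout no explicit `unlink()` calls are needed. Any object that still holds the copied files open should be released before the `with` block exits.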
