Skip to content

cif

Convert an OPTIMADE structure, in the format of StructureResource to a CIF file (Crystallographic Information File).

For more information on the CIF file format, see the official documentation.

Note

This conversion function is inspired heavily by the similar conversion function in the ASE library.

See here for the original ASE code.

For more information on the ASE library, see their documentation.

This conversion function relies on the NumPy library.

NUMPY_NOT_FOUND = 'NumPy not found, cannot convert structure to CIF' module-attribute

__all__ = ('get_cif') module-attribute

AdapterPackageNotFound

Bases: OptimadeWarning

The package for an adapter cannot be found.

Source code in optimade/adapters/warnings.py
6
7
class AdapterPackageNotFound(OptimadeWarning):
    """The package for an adapter cannot be found."""

OptimadeStructure

Bases: EntryResource

Representing a structure.

Source code in optimade/models/structures.py
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
class StructureResource(EntryResource):
    """Representing a structure."""

    type: Annotated[
        Literal["structures"],
        StrictField(
            description="""The name of the type of an entry.

- **Type**: string.

- **Requirements/Conventions**:
    - **Support**: MUST be supported by all implementations, MUST NOT be `null`.
    - **Query**: MUST be a queryable property with support for all mandatory filter features.
    - **Response**: REQUIRED in the response.
    - MUST be an existing entry type.
    - The entry of type `<type>` and ID `<id>` MUST be returned in response to a request for `/<type>/<id>` under the versioned base URL.

- **Examples**:
    - `"structures"`""",
            pattern="^structures$",
            support=SupportLevel.MUST,
            queryable=SupportLevel.MUST,
        ),
    ] = "structures"

    attributes: StructureResourceAttributes

OptimadeStructureSpecies

Bases: BaseModel

A list describing the species of the sites of this structure.

Species can represent pure chemical elements, virtual-crystal atoms representing a statistical occupation of a given site by multiple chemical elements, and/or a location to which there are attached atoms, i.e., atoms whose precise location are unknown beyond that they are attached to that position (frequently used to indicate hydrogen atoms attached to another element, e.g., a carbon with three attached hydrogens might represent a methyl group, -CH3).

  • Examples:
    • [ {"name": "Ti", "chemical_symbols": ["Ti"], "concentration": [1.0]} ]: any site with this species is occupied by a Ti atom.
    • [ {"name": "Ti", "chemical_symbols": ["Ti", "vacancy"], "concentration": [0.9, 0.1]} ]: any site with this species is occupied by a Ti atom with 90 % probability, and has a vacancy with 10 % probability.
    • [ {"name": "BaCa", "chemical_symbols": ["vacancy", "Ba", "Ca"], "concentration": [0.05, 0.45, 0.5], "mass": [0.0, 137.327, 40.078]} ]: any site with this species is occupied by a Ba atom with 45 % probability, a Ca atom with 50 % probability, and by a vacancy with 5 % probability. The mass of this site is (on average) 88.5 a.m.u.
    • [ {"name": "C12", "chemical_symbols": ["C"], "concentration": [1.0], "mass": [12.0]} ]: any site with this species is occupied by a carbon isotope with mass 12.
    • [ {"name": "C13", "chemical_symbols": ["C"], "concentration": [1.0], "mass": [13.0]} ]: any site with this species is occupied by a carbon isotope with mass 13.
    • [ {"name": "CH3", "chemical_symbols": ["C"], "concentration": [1.0], "attached": ["H"], "nattached": [3]} ]: any site with this species is occupied by a methyl group, -CH3, which is represented without specifying precise positions of the hydrogen atoms.
Source code in optimade/models/structures.py
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
class Species(BaseModel):
    """A list describing the species of the sites of this structure.

    Species can represent pure chemical elements, virtual-crystal atoms representing a
    statistical occupation of a given site by multiple chemical elements, and/or a
    location to which there are attached atoms, i.e., atoms whose precise location are
    unknown beyond that they are attached to that position (frequently used to indicate
    hydrogen atoms attached to another element, e.g., a carbon with three attached
    hydrogens might represent a methyl group, -CH3).

    - **Examples**:
        - `[ {"name": "Ti", "chemical_symbols": ["Ti"], "concentration": [1.0]} ]`: any site with this species is occupied by a Ti atom.
        - `[ {"name": "Ti", "chemical_symbols": ["Ti", "vacancy"], "concentration": [0.9, 0.1]} ]`: any site with this species is occupied by a Ti atom with 90 % probability, and has a vacancy with 10 % probability.
        - `[ {"name": "BaCa", "chemical_symbols": ["vacancy", "Ba", "Ca"], "concentration": [0.05, 0.45, 0.5], "mass": [0.0, 137.327, 40.078]} ]`: any site with this species is occupied by a Ba atom with 45 % probability, a Ca atom with 50 % probability, and by a vacancy with 5 % probability. The mass of this site is (on average) 88.5 a.m.u.
        - `[ {"name": "C12", "chemical_symbols": ["C"], "concentration": [1.0], "mass": [12.0]} ]`: any site with this species is occupied by a carbon isotope with mass 12.
        - `[ {"name": "C13", "chemical_symbols": ["C"], "concentration": [1.0], "mass": [13.0]} ]`: any site with this species is occupied by a carbon isotope with mass 13.
        - `[ {"name": "CH3", "chemical_symbols": ["C"], "concentration": [1.0], "attached": ["H"], "nattached": [3]} ]`: any site with this species is occupied by a methyl group, -CH3, which is represented without specifying precise positions of the hydrogen atoms.

    """

    name: Annotated[
        str,
        OptimadeField(
            description="""Gives the name of the species; the **name** value MUST be unique in the `species` list.""",
            support=SupportLevel.MUST,
            queryable=SupportLevel.OPTIONAL,
        ),
    ]

    chemical_symbols: Annotated[
        list[ChemicalSymbol],
        OptimadeField(
            description="""MUST be a list of strings of all chemical elements composing this species. Each item of the list MUST be one of the following:

- a valid chemical-element symbol, or
- the special value `"X"` to represent a non-chemical element, or
- the special value `"vacancy"` to represent that this site has a non-zero probability of having a vacancy (the respective probability is indicated in the `concentration` list, see below).

If any one entry in the `species` list has a `chemical_symbols` list that is longer than 1 element, the correct flag MUST be set in the list `structure_features`.""",
            support=SupportLevel.MUST,
            queryable=SupportLevel.OPTIONAL,
        ),
    ]

    concentration: Annotated[
        list[float],
        OptimadeField(
            description="""MUST be a list of floats, with same length as `chemical_symbols`. The numbers represent the relative concentration of the corresponding chemical symbol in this species. The numbers SHOULD sum to one. Cases in which the numbers do not sum to one typically fall only in the following two categories:

- Numerical errors when representing float numbers in fixed precision, e.g. for two chemical symbols with concentrations `1/3` and `2/3`, the concentration might look something like `[0.33333333333, 0.66666666666]`. If the client is aware that the sum is not one because of numerical precision, it can renormalize the values so that the sum is exactly one.
- Experimental errors in the data present in the database. In this case, it is the responsibility of the client to decide how to process the data.

Note that concentrations are uncorrelated between different site (even of the same species).""",
            support=SupportLevel.MUST,
            queryable=SupportLevel.OPTIONAL,
        ),
    ]

    mass: Annotated[
        Optional[list[float]],
        OptimadeField(
            description="""If present MUST be a list of floats expressed in a.m.u.
Elements denoting vacancies MUST have masses equal to 0.""",
            unit="a.m.u.",
            support=SupportLevel.OPTIONAL,
            queryable=SupportLevel.OPTIONAL,
        ),
    ] = None

    original_name: Annotated[
        Optional[str],
        OptimadeField(
            description="""Can be any valid Unicode string, and SHOULD contain (if specified) the name of the species that is used internally in the source database.

Note: With regards to "source database", we refer to the immediate source being queried via the OPTIMADE API implementation.""",
            support=SupportLevel.OPTIONAL,
            queryable=SupportLevel.OPTIONAL,
        ),
    ] = None

    attached: Annotated[
        Optional[list[str]],
        OptimadeField(
            description="""If provided MUST be a list of length 1 or more of strings of chemical symbols for the elements attached to this site, or "X" for a non-chemical element.""",
            support=SupportLevel.OPTIONAL,
            queryable=SupportLevel.OPTIONAL,
        ),
    ] = None

    nattached: Annotated[
        Optional[list[int]],
        OptimadeField(
            description="""If provided MUST be a list of length 1 or more of integers indicating the number of attached atoms of the kind specified in the value of the :field:`attached` key.""",
            support=SupportLevel.OPTIONAL,
            queryable=SupportLevel.OPTIONAL,
        ),
    ] = None

    @field_validator("concentration", "mass", mode="after")
    def validate_concentration_and_mass(
        cls, value: Optional[list[float]], info: "ValidationInfo"
    ) -> Optional[list[float]]:
        if not value:
            return value

        if info.data.get("chemical_symbols"):
            if len(value) != len(info.data["chemical_symbols"]):
                raise ValueError(
                    f"Length of concentration ({len(value)}) MUST equal length of "
                    f"chemical_symbols ({len(info.data['chemical_symbols'])})"
                )
            return value

        raise ValueError(
            f"Could not validate {info.field_name!r} as 'chemical_symbols' is missing/invalid."
        )

    @field_validator("attached", "nattached", mode="after")
    @classmethod
    def validate_minimum_list_length(
        cls, value: Optional[Union[list[str], list[int]]]
    ) -> Optional[Union[list[str], list[int]]]:
        if value is not None and len(value) < 1:
            raise ValueError(
                "The list's length MUST be 1 or more, instead it was found to be "
                f"{len(value)}"
            )
        return value

    @model_validator(mode="after")
    def attached_nattached_mutually_exclusive(self) -> "Species":
        if (self.attached is None and self.nattached is not None) or (
            self.attached is not None and self.nattached is None
        ):
            raise ValueError(
                f"Either both or none of attached ({self.attached}) and nattached "
                f"({self.nattached}) MUST be set."
            )

        if (
            self.attached is not None
            and self.nattached is not None
            and len(self.attached) != len(self.nattached)
        ):
            raise ValueError(
                f"attached ({self.attached}) and nattached ({self.nattached}) MUST be "
                "lists of equal length."
            )

        return self

cell_to_cellpar(cell, radians=False)

Returns the cell parameters [a, b, c, alpha, beta, gamma].

Angles are in degrees unless radian=True is used.

Note

Based on ASE code.

Parameters:

Name Type Description Default
cell tuple[Vector3D, Vector3D, Vector3D]

A Cartesian 3x3 cell. This equates to the lattice_vectors attribute.

required
radians bool

Use radians instead of degrees (default) for angles.

False

Returns:

Type Description
list[float]

The unit cell parameters as a list of float values.

Source code in optimade/adapters/structures/utils.py
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
def cell_to_cellpar(
    cell: tuple[Vector3D, Vector3D, Vector3D], radians: bool = False
) -> list[float]:
    """Returns the cell parameters `[a, b, c, alpha, beta, gamma]`.

    Angles are in degrees unless `radian=True` is used.

    Note:
        Based on [ASE code](https://wiki.fysik.dtu.dk/ase/_modules/ase/geometry/cell.html#cell_to_cellpar).

    Parameters:
        cell: A Cartesian 3x3 cell. This equates to the
            [`lattice_vectors`][optimade.models.structures.StructureResourceAttributes.lattice_vectors] attribute.
        radians: Use radians instead of degrees (default) for angles.

    Returns:
        The unit cell parameters as a `list` of `float` values.

    """
    if globals().get("np", None) is None:
        warn(NUMPY_NOT_FOUND, AdapterPackageNotFound)
        return None  # type: ignore[return-value]

    cell = np.asarray(cell)

    lengths = [np.linalg.norm(vector) for vector in cell]
    angles = []
    for i in range(3):
        j = i - 1
        k = i - 2
        outer_product = lengths[j] * lengths[k]
        if outer_product > 1e-16:
            x_vector = np.dot(cell[j], cell[k]) / outer_product
            angle = 180.0 / np.pi * np.arccos(x_vector)
        else:
            angle = 90.0
        angles.append(angle)
    if radians:
        angles = [angle * np.pi / 180 for angle in angles]
    return np.array(lengths + angles)

fractional_coordinates(cell, cartesian_positions)

Returns fractional coordinates and wraps coordinates to [0,1[.

Note

Based on ASE code.

Parameters:

Name Type Description Default
cell tuple[Vector3D, Vector3D, Vector3D]

A Cartesian 3x3 cell. This equates to the lattice_vectors attribute.

required
cartesian_positions list[Vector3D]

A list of cartesian atomic positions. This equates to the cartesian_site_positions attribute.

required

Returns:

Type Description
list[Vector3D]

A list of fractional coordinates for the atomic positions.

Source code in optimade/adapters/structures/utils.py
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
def fractional_coordinates(
    cell: tuple[Vector3D, Vector3D, Vector3D], cartesian_positions: list[Vector3D]
) -> list[Vector3D]:
    """Returns fractional coordinates and wraps coordinates to `[0,1[`.

    Note:
        Based on [ASE code](https://wiki.fysik.dtu.dk/ase/_modules/ase/atoms.html#Atoms.get_scaled_positions).

    Parameters:
        cell: A Cartesian 3x3 cell. This equates to the
            [`lattice_vectors`][optimade.models.structures.StructureResourceAttributes.lattice_vectors] attribute.
        cartesian_positions: A list of cartesian atomic positions. This equates to the
            [`cartesian_site_positions`][optimade.models.structures.StructureResourceAttributes.cartesian_site_positions]
            attribute.

    Returns:
        A list of fractional coordinates for the atomic positions.

    """
    if globals().get("np", None) is None:
        warn(NUMPY_NOT_FOUND, AdapterPackageNotFound)
        return None  # type: ignore[return-value]

    cell_array = np.asarray(cell)
    cartesian_positions_array = np.asarray(cartesian_positions)

    fractional = np.linalg.solve(cell_array.T, cartesian_positions_array.T).T

    # Expecting a bulk 3D structure here, note, this may change in the future.
    # See `ase.atoms:Atoms.get_scaled_positions()` for ideas on how to handle lower dimensional structures.
    # Furthermore, according to ASE we need to modulo 1.0 twice.
    # This seems to be due to small floats % 1.0 becomes 1.0, hence twice makes it 0.0.
    for i in range(3):
        fractional[:, i] %= 1.0
        fractional[:, i] %= 1.0

    return [tuple(position) for position in fractional]  # type: ignore

get_cif(optimade_structure)

Get CIF file as string from OPTIMADE structure.

Parameters:

Name Type Description Default
optimade_structure StructureResource

OPTIMADE structure.

required

Returns:

Type Description
str

The CIF file as a single Python str object.

Source code in optimade/adapters/structures/cif.py
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
def get_cif(  # pylint: disable=too-many-locals,too-many-branches
    optimade_structure: OptimadeStructure,
) -> str:
    """Get CIF file as string from OPTIMADE structure.

    Parameters:
        optimade_structure: OPTIMADE structure.

    Returns:
        The CIF file as a single Python `str` object.

    """
    # NumPy is needed for calculations
    if globals().get("np", None) is None:
        warn(NUMPY_NOT_FOUND, AdapterPackageNotFound)
        return None  # type: ignore[return-value]

    cif = """#
# Created from an OPTIMADE structure.
#
# See https://www.optimade.org and/or
# https://github.com/Materials-Consortia/OPTIMADE for more information.
#
"""

    cif += f"data_{optimade_structure.id}\n\n"

    attributes = optimade_structure.attributes

    # Do this only if there's three non-zero lattice vectors
    # NOTE: This also negates handling of lattice_vectors with null/None values
    if valid_lattice_vector(attributes.lattice_vectors):  # type:ignore[arg-type]
        a_vector, b_vector, c_vector, alpha, beta, gamma = cell_to_cellpar(
            attributes.lattice_vectors  # type: ignore[arg-type]
        )

        cif += (
            f"_cell_length_a                    {a_vector:g}\n"
            f"_cell_length_b                    {b_vector:g}\n"
            f"_cell_length_c                    {c_vector:g}\n"
            f"_cell_angle_alpha                 {alpha:g}\n"
            f"_cell_angle_beta                  {beta:g}\n"
            f"_cell_angle_gamma                 {gamma:g}\n\n"
        )
        cif += (
            "_symmetry_space_group_name_H-M    'P 1'\n"
            "_symmetry_int_tables_number       1\n\n"
            "loop_\n"
            "  _symmetry_equiv_pos_as_xyz\n"
            "  'x, y, z'\n\n"
        )

        # Since some structure viewers are having issues with cartesian coordinates,
        # we calculate the fractional coordinates if this is a 3D structure and we have all the necessary information.
        if not hasattr(attributes, "fractional_site_positions"):
            attributes.fractional_site_positions = fractional_coordinates(
                cell=attributes.lattice_vectors,  # type:ignore[arg-type]
                cartesian_positions=attributes.cartesian_site_positions,  # type:ignore[arg-type]
            )

    # NOTE: This is otherwise a bit ahead of its time, since this OPTIMADE property is part of an open PR.
    # See https://github.com/Materials-Consortia/OPTIMADE/pull/206
    coord_type = (
        "fract" if hasattr(attributes, "fractional_site_positions") else "Cartn"
    )

    cif += (
        "loop_\n"
        "  _atom_site_type_symbol\n"  # species.chemical_symbols
        "  _atom_site_label\n"  # species.name + unique int
        "  _atom_site_occupancy\n"  # species.concentration
        f"  _atom_site_{coord_type}_x\n"  # cartesian_site_positions
        f"  _atom_site_{coord_type}_y\n"  # cartesian_site_positions
        f"  _atom_site_{coord_type}_z\n"  # cartesian_site_positions
        "  _atom_site_thermal_displace_type\n"  # Set to 'Biso'
        "  _atom_site_B_iso_or_equiv\n"  # Set to 1.0:f
    )

    if coord_type == "fract":
        sites = attributes.fractional_site_positions
    else:
        sites = attributes.cartesian_site_positions

    species: dict[str, OptimadeStructureSpecies] = {
        species.name: species for species in attributes.species  # type: ignore[union-attr]
    }

    symbol_occurences: dict[str, int] = {}
    for site_number in range(attributes.nsites):  # type: ignore[arg-type]
        species_name = attributes.species_at_sites[site_number]  # type: ignore[index]
        site = sites[site_number]

        current_species = species[species_name]

        for index, symbol in enumerate(current_species.chemical_symbols):
            if symbol == "vacancy":
                continue

            if symbol in symbol_occurences:
                symbol_occurences[symbol] += 1
            else:
                symbol_occurences[symbol] = 1
            label = f"{symbol}{symbol_occurences[symbol]}"

            cif += (
                f"  {symbol} {label} {current_species.concentration[index]:6.4f} {site[0]:8.5f}  "
                f"{site[1]:8.5f}  {site[2]:8.5f}  {'Biso':4}  {'1.000':6}\n"
            )

    return cif

valid_lattice_vector(lattice_vec)

Source code in optimade/adapters/structures/utils.py
23
24
25
26
27
28
29
30
31
def valid_lattice_vector(lattice_vec: tuple[Vector3D, Vector3D, Vector3D]):
    if len(lattice_vec) != 3:
        return False
    for vector in lattice_vec:
        if (
            (len(vector) != 3) or (None in vector) or (np.linalg.norm(vector) < 1e-15)
        ):  # Due to rounding errors very small values instead of 0.0 may appear for the lattice vectors. therefore I check here whether the value is not too small. I am however not sure what the smallest value is that I can put here.
            return False
    return True