{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Quick start with aiida_castep - `CastepBaseWorkChain`\n", "\n", "## Introduction\n", "\n", "`aiida-castep` is a plugin to interface CASTEP with [AiiDA](www.aiida.net) (Automated Interactive Infrastructure and Database for Computational Science). \n", "\n", "This example notebook goes through running calculations through `CastepBaseWorkChain`, which wraps around `CastepCalculation` and attempts to correct common errors.\n", "\n", "In AiiDA's terminology a `Calculation` can be thought of as a single *transaction* with the underlying code,\n", "whereas `WorkChain`s are entities that perform workflows for solving specific problems. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Profile" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%load_ext aiida\n", "\n", "from aiida import load_profile, engine, orm, plugins\n", "from aiida.storage.sqlite_temp import SqliteTempBackend\n", "\n", "profile = load_profile(\n", " SqliteTempBackend.create_profile(\n", " 'myprofile',\n", " sandbox_path='_sandbox',\n", " options={\n", " 'warnings.development_version': False,\n", " 'runner.poll.interval': 1\n", " },\n", " debug=False\n", " ),\n", " allow_switch=True\n", ")\n", "profile" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preparing a CASTEP calculation\n", "\n", "Please also take a look at the `CastepCalculation` example before you proceed to learn what the inputs and outputs look like.\n", "\n", "`CastepBaseWorkChain` takes inputs similar to those of a `CastepCalculation`, but it allows certain simplifications:\n", "\n", "The main difference is that what originally goes into `CastepCalculation` now resides under a `PortNamespace` named `calc`. 
For example, to pass the `parameters` to a `CastepCalculation`, one needs to do:\n", "\n", "```python\n", "inputs = {\n", " 'parameters': Dict(dict={\n", " 'PARAM': {\n", " 'task': 'singlepoint',\n", " ....\n", " },\n", " 'CELL': {\n", " 'symmetry_generate': True,\n", " .....\n", " }\n", " })\n", "}\n", "submit(CastepCalculation, **inputs)\n", "```\n", "\n", "and the `ProcessBuilder` interface follows the same scheme.\n", "Using `CastepBaseWorkChain`, this becomes:\n", "\n", "```python\n", "inputs = {\n", " 'calc': {\n", " 'parameters': Dict(dict={\n", " 'task': 'singlepoint',\n", " 'symmetry_generate': True,\n", " .....\n", " })\n", " }\n", "}\n", "```\n", "\n", "where the `Dict` node no longer needs to have sub-fields corresponding to the `cell` and `param` files - all of the keys can be placed at the top level. \n", "\n", "The other differences are:\n", "\n", "- a `kpoints_spacing` input port for defining the density of kpoints. Note that CASTEP itself supports `kpoints_mp_spacing`, but using this would not allow the grid to be recorded in the provenance graph, so using the `kpoints_spacing` input port is the preferred approach.\n", "- if `clean_workdir` is set to `Bool(True)` the remote workdir(s) will be cleaned if the workflow is successful.\n", "- if `ensure_gamma_centering` is set to `Bool(True)` the kpoints mesh will include the necessary offsets to make sure it is Gamma centred.\n", "- instead of setting pseudopotentials explicitly for each element, the *family* can be passed directly under the `pseudos_family` port.\n", "- if `continuation_folder` or `reuse_folder` is set, the flag for continuation/reuse will be included automatically for the underlying calculations.\n", "\n", "\n", "Below is a walk-through of the steps to create a single point calculation using the `CastepBaseWorkChain`." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from pprint import pprint\n", "# Load the AiiDA environment\n", "import aiida.orm as orm\n", "from aiida.plugins import WorkflowFactory" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example - silicon bandstructure\n", "This is taken from the online tutorial." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `cell` file contains the crystal structure and related settings." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "%block lattice_cart\n", "2.6954645 2.6954645 0.0 \n", "2.6954645 0.0 2.6954645\n", "0.0 2.6954645 2.6954645\n", "%endblock lattice_cart\n", "%block positions_frac\n", "Si 0.00 0.00 0.00\n", "Si 0.25 0.25 0.25\n", "%endblock positions_frac\n", "symmetry_generate\n", "%block species_pot\n", "Si Si_00.usp\n", "%endblock species_pot\n", "kpoint_mp_grid 4 4 4\n", "%block bs_kpoint_path \n", "0.5 0.25 0.75 ! W\n", "0.5 0.5 0.5 ! L\n", "0.0 0.0 0.0 ! Gamma\n", "0.5 0.0 0.5 ! X\n", "0.5 0.25 0.75 ! W\n", "0.375 0.375 0.75 ! K\n", "%endblock bs_kpoint_path \n" ] } ], "source": [ "!cat 'bandstructure/silicon/Si2.cell' | grep -v -e \"^!\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `param` file contains a list of key-value pairs " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "task\t\tbandstructure ! The TASK keyword instructs CASTEP what to do\n", "xc_functional LDA ! Which exchange-correlation functional to use.\n", "basis_precision MEDIUM ! Choose high cut-off COARSE/MEDIUM/FINE/PRECISE\n", "fix_occupancy true ! Treat the system as an insulator\n", "opt_strategy speed ! Choose algorithms for best speed at expense of memory.\n", "num_dump_cycles 0 ! 
Don't write unwanted \".wvfn\" files.\n", "write_formatted_density TRUE ! Write out a density file that we can view using (e.g.) Jmol.\n" ] } ], "source": [ "!cat 'bandstructure/silicon/Si2.param' | grep -v -e \"^!\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set up the WorkChain with AiiDA\n", "We set up a similar calculation with Si here. Instead of going for the band structure, we just do a single point run." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define the structure by creating the `StructureData` node" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Define the structure\n", "from aiida.plugins import DataFactory\n", "StructureData = DataFactory('structure')\n", "silicon = StructureData()\n", "r_unit = 2.6954645\n", "silicon.set_cell(np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1]]) * r_unit)\n", "silicon.append_atom(symbols=[\"Si\"], position=[0, 0, 0])\n", "silicon.append_atom(symbols=[\"Si\"], position=[r_unit * 0.5] * 3)\n", "silicon.label = \"Si\"\n", "silicon.description = \"A silicon structure\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Atomic Simulation Environment (ASE) has a rich set of tools for handling structures, centred around the `ase.Atoms` class.\n", "`Atoms` objects can be converted to `StructureData`, which AiiDA understands and saves to the provenance graph." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# You can also use ase.Atoms object to create the StructureData\n", "from ase import Atoms\n", "silicon_atoms = Atoms('Si2', cell=silicon.cell, scaled_positions=((0, 0, 0), (0.25, 0.25, 0.25)))\n", "silicon_from_atoms = StructureData(ase=silicon_atoms)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`StructureData` can be converted back to `Atoms` for complex operations using `ase`." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# You can also convert the StructureData back to ase.Atoms\n", "silicon_atoms_2 = silicon_from_atoms.get_ase()\n", "silicon_atoms_2 == silicon_atoms" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that a similar interface also exists to work with `pymatgen`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Getting a `ProcessBuilder` for setting up the inputs\n", "\n", "The `ProcessBuilder` is very useful - it allows the inputs to be defined interactively." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "CastepBaseWorkChain = WorkflowFactory('castep.base')\n", "builder = CastepBaseWorkChain.get_builder()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "builder.calc.parameters = {\n", " \"task\": \"singlepoint\",\n", " \"basis_precision\": \"medium\",\n", " \"fix_occupancy\": True, \n", " \"opt_strategy\": \"speed\",\n", " \"num_dump_cycles\": 0, \n", " \"write_formatted_density\": True,\n", " \"symmetry_generate\": True,\n", " \"snap_to_symmetry\": True,\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that we have used a plain Python `dict` here - the builder will automatically convert it to a \n", "`Dict`.\n", "\n", "Try to include a typo in the cell above and see what happens - the builder will validate the keys before the `Dict` node is created." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "from aiida_castep.data.otfg import upload_otfg_family\n", "upload_otfg_family(['C19'], 'C19', 'C19 potential library') \n", "\n", "builder.kpoints_spacing = 0.07\n", "builder.pseudos_family = 'C19'\n", "\n", "# Set the input structure; the mock code is defined and attached in the next cell\n", "builder.calc.structure = silicon\n", "# Note that the resources need to go under `calc.metadata.options` instead of `metadata.options`\n", "builder.calc.metadata.options.resources = {'num_machines': 1, 'tot_num_mpiprocs': 2}\n", "builder.calc.metadata.options.max_wallclock_seconds = 600\n", "builder.metadata.description = 'An Example CASTEP calculation for silicon'\n", "builder.metadata.label = 'Si SINGLEPOINT'" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "# Define a mock code on the localhost computer\n", "comp = orm.Computer('localhost', 'localhost', transport_type='core.local', scheduler_type='core.direct')\n", "comp.store()\n", "comp.set_workdir('/tmp/aiida_run/')\n", "comp.configure()\n", "\n", "code_path = !which castep.mock\n", "castep_mock = orm.Code((comp, code_path[0]), input_plugin_name='castep.castep')\n", "builder.calc.code = castep_mock" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "05/26/2022 10:31:56 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [8|CastepBaseWorkChain|validate_inputs]: Direct input of calculations metadata is deprecated - please pass them with `calc_options` input port.\n", "05/26/2022 10:31:56 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [8|CastepBaseWorkChain|validate_inputs]: Using kpoints: Kpoints mesh: 5x5x5 (+0.0,0.0,0.0)\n", "05/26/2022 10:31:57 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] 
[8|CastepBaseWorkChain|run_calculation]: launching CastepCalculation<12> iteration #1\n", "05/26/2022 10:31:59 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [8|CastepBaseWorkChain|inspect_calculation]: CastepCalculation<12> completed successfully\n", "05/26/2022 10:31:59 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [8|CastepBaseWorkChain|results]: workchain completed after 1 iterations\n", "05/26/2022 10:31:59 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [8|CastepBaseWorkChain|on_terminated]: remote folders will not be cleaned\n" ] } ], "source": [ "from aiida.engine import run_get_node\n", "results, work = run_get_node(builder)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check whether the workflow finished OK; an `exit_status` of `0` means success." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "work.exit_status" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To see the list of possible `exit_status` values and their meanings, we can use `verdi plugin list`. This also shows the list of inputs, of which the required ones are in bold." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[31m\u001b[1mDescription:\n", "\u001b[0m\n", "\u001b[22m A basic workchain for generic CASTEP calculations.\u001b[0m\n", "\u001b[22m We try to handle erros such as walltime exceeded or SCF not converged\u001b[0m\n", "\u001b[22m\u001b[0m\n", "\u001b[31m\u001b[1mInputs:\u001b[0m\n", "\u001b[1m calc: required \u001b[0m\n", "\u001b[22m calc_options: optional Dict Options to be passed to calculations's metadata.options\u001b[0m\n", "\u001b[22m clean_workdir: optional Bool Wether to clean the workdir of the calculations or not, the default is not ...\u001b[0m\n", "\u001b[22m continuation_folder: optional RemoteData Use a remote folder as the parent folder. Useful for restarts.\u001b[0m\n", "\u001b[22m ensure_gamma_centering: optional Bool Ensure the kpoint grid is gamma centred.\u001b[0m\n", "\u001b[22m kpoints_spacing: optional Float Kpoint spacing\u001b[0m\n", "\u001b[22m max_iterations: optional Int Maximum number of restarts\u001b[0m\n", "\u001b[22m metadata: optional \u001b[0m\n", "\u001b[22m options: optional Dict Options specific to the workchain.Avaliable options: queue_wallclock_limit, ...\u001b[0m\n", "\u001b[22m pseudos_family: optional Str Pseudopotential family to be used\u001b[0m\n", "\u001b[22m reuse_folder: optional RemoteData Use a remote folder as the parent folder. 
Useful for restarts.\u001b[0m\n", "\u001b[31m\u001b[1mOutputs:\u001b[0m\n", "\u001b[1m output_bands: required BandsData \u001b[0m\n", "\u001b[1m output_parameters: required Dict \u001b[0m\n", "\u001b[1m remote_folder: required RemoteData \u001b[0m\n", "\u001b[22m output_array: optional ArrayData \u001b[0m\n", "\u001b[22m output_structure: optional StructureData \u001b[0m\n", "\u001b[22m output_trajectory: optional ArrayData \u001b[0m\n", "\u001b[31m\u001b[1mExit codes:\u001b[0m\n", "\u001b[22m 1: The process has failed with an unspecified error.\u001b[0m\n", "\u001b[22m 2: The process failed with legacy failure mode.\u001b[0m\n", "\u001b[22m 10: The process returned an invalid output.\u001b[0m\n", "\u001b[22m 11: The process did not register a required output.\u001b[0m\n", "\u001b[22m 200: The maximum number of iterations has been exceeded\u001b[0m\n", "\u001b[22m 201: The maximum length of the wallclocks has been exceeded\u001b[0m\n", "\u001b[22m 301: CASTEP generated error files and is not recoverable\u001b[0m\n", "\u001b[22m 302: Cannot reach SCF convergence despite restart efforts\u001b[0m\n", "\u001b[22m 400: The stop flag has been put in the .param file to request termination of the calculation.\u001b[0m\n", "\u001b[22m 900: Input validate is failed\u001b[0m\n", "\u001b[22m 901: Completed one iteration but found not calculation returned\u001b[0m\n", "\u001b[22m 1000: Error is not known\u001b[0m\n" ] } ], "source": [ "!verdi plugin list aiida.workflows castep.base" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What has this workflow done?\n", "\n", "In this simple example, it only calls a single calculation. In practice, the workflow will act to mitigate certain errors such as running out of walltime and electronic convergence problems." 
] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "from aiida.cmdline.utils.ascii_vis import format_call_graph" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CastepBaseWorkChain<8> Finished [0] [4:results]\n", " └── CastepCalculation<12> Finished [0]\n" ] } ], "source": [ "print(format_call_graph(work))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Access the results\n", "Results of the workflow can be accessed in a similar way to those of a `CastepCalculation`.\n", "However, the workflow node itself does not provide the `.res` interface that `CastepCalculation` has." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'warnings': [],\n", " 'castep_version': '20.11',\n", " 'unit_length': 'A',\n", " 'unit_time': 'ps',\n", " 'unit_energy': 'eV',\n", " 'unit_force': 'eV/A',\n", " 'num_ions': 2,\n", " 'n_kpoints': '10',\n", " 'point_group': '32: Oh, m-3m, 4/m -3 2/m',\n", " 'space_group': '227: Fd-3m, F 4d 2 3 -1d',\n", " 'cell_constraints': '1 1 1 0 0 0',\n", " 'pseudo_pots': {'Si': '3|1.8|5|6|7|30:31:32'},\n", " 'initialisation_time': 4.49,\n", " 'calculation_time': 0.87,\n", " 'finalisation_time': 0.01,\n", " 'total_time': 5.37,\n", " 'parallel_efficiency': 57,\n", " 'geom_unconverged': None,\n", " 'parser_warnings': [],\n", " 'parser_info': 'AiiDA CASTEP basic Parser v1.1.1',\n", " 'total_energy': -337.8473203292,\n", " 'error_messages': []}" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# A Dict node is created containing some of the parsed results\n", "work.outputs.output_parameters.get_dict()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Eigenvalues parsed from `.bands` output" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-4.72022699, -2.20565637, 
1.06932498, 2.83544156],\n", " [-5.63940581, -0.76783726, 1.89878269, 2.54227671],\n", " [-4.29355492, -2.00728782, 0.48127062, 1.49935793],\n", " [-6.80455106, 1.9917044 , 3.05980487, 3.06015916],\n", " [-6.98398755, 1.54800182, 4.00173867, 4.00257351],\n", " [-3.50106388, -2.90938714, 0.39021289, 1.95043126],\n", " [-7.50759964, 4.60236006, 4.60306892, 4.60501208],\n", " [-5.58411173, -1.78292124, 3.44303941, 3.44488053],\n", " [-6.13414661, 0.08379311, 1.49548901, 3.73091709],\n", " [-4.75110484, -1.62044505, 1.82693538, 1.82725022]])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "work.outputs.output_bands.get_array('bands')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Forces on the atoms" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[[ 0., -0., 0.],\n", " [-0., 0., 0.]]])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "work.outputs.output_array.get_array('forces')" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-337.8473203292" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Alternatively, you can access the parsed results using\n", "work.called[0].res.total_energy # hit tab for completion after res" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-337.8473203292" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "work.outputs.output_parameters['total_energy'] " ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "57" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "work.outputs.output_parameters['parallel_efficiency'] " ] } ], "metadata": { "interpreter": { "hash": 
"cf5bfbb275efa34724f83034d97eac1f6cad0e6c7edb82ca36de8357056ef910" }, "kernelspec": { "display_name": "Python [conda env:aiida-2.0-dev]", "language": "python", "name": "conda-env-aiida-2.0-dev-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 4 }