{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 3. Loading and fitting multiple diffraction patterns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A key use for SXRD is monitoring the change in the structure of a material over time. If a number of diffraction patterns are collected over time `xrdfit` can be used to automatically fit the peaks and extract the changing fit parameters over time. The change in the positions or heights of the peaks can then be correlated to material properties. We call the fitting of multiple **spectra** an **Experiment**.\n", "\n", "In the example files folder there is a sequence of 10 diffraction patterns which we will use for demonstration. The workflow used in `xrdfit` assumes that the data are stored in sequentially numbered files, one spectrum per file, all in the same folder.\n", "\n", "## 3.1. Fitting a peak at multiple times\n", "\n", "The idea behind the time fitting in `xrdfit` is that you should be able to set up the peaks to fit for the first time step of the experiment and then run the fits automatically for the rest of the time steps, without manual intervention. The code will follow the changing positions of the peaks over time.\n", "\n", "In order to fit peaks they must exist in the first frame of the fitting experiment. If certain peaks appear part way through an experiment then the spectrum will have to be fitted in parts, the appearing peaks being fitted only after they appear. As long as a peak exists at the start of the experiment, `xrdfit` should be able to cope with a peak disappearing for a time and reappearing later (an example is shown in tutorial notebook 4). If peaks merge together over time the behavior of `xrdfit` is not defined. 
In this case the fitting may need more manual adjustment.\n", "\n", "The first step in setting up a multi-time fit is to load a `FitExperiment` object - this contains some metadata about the experiment and will hold one `FitSpectrum` for each diffraction pattern. The *first_cake_angle*, *cakes_to_fit*, *peak_params* and *merge_cakes* parameters are the same as previously defined. The *spectrum_time* parameter sets the number of seconds between each spectrum in the sequence - this value is used to label the time on the x-axis of plots. The *file_stub* parameter is used to locate the files for analysis. To use all of the sequentially numbered spectra in a folder, provide the stub of the file name followed by a star (wildcard).\n", "\n", "Once the `FitExperiment` object is loaded, the `run_analysis` method runs the fit over all of the specified files." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import xrdfit.spectrum_fitting as spectrum_fitting\n", "from xrdfit.spectrum_fitting import PeakParams\n", "\n", "first_cake_angle = 90\n", "cakes_to_fit = [36, 1, 2]\n", "peak_params = PeakParams((2.75, 2.95), '1')\n", "merge_cakes = True\n", "\n", "spectrum_time = 1\n", "file_stub = \"../example_data/adc_041_7Nb_NDload_700C_15mms_*\"\n", "\n", "experiment = spectrum_fitting.FitExperiment(spectrum_time, file_stub, first_cake_angle,\n", "                                            cakes_to_fit, peak_params, merge_cakes)\n", "\n", "experiment.run_analysis()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The results are hierarchically stored in the `FitExperiment` object and can be accessed directly if you wish. \n", "\n", "However, it is probably easier to use the `FitExperiment` helper methods described below to plot the results rather than accessing the raw fit data directly.\n", "\n", "The `peak_names` method lists the names of the fitted peaks specified in the `PeakParams` objects. 
The `fit_parameters` method gives a list of the names of the fit parameters for a particular peak." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(experiment.peak_names())\n", "print(experiment.fit_parameters('1'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Peak names are given by concatenating the names of the constituent maxima with a space between them. In this case there is only a single maximum in the peak and so the peak name is the same as the maximum name.\n", "\n", "Fit parameter names follow the syntax $x\\_y$, where $x$ is the maximum name and $y$ is the parameter type.\n", "\n", "To plot a parameter for a fit over time, use the `plot_fit_parameter` method." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "experiment.plot_fit_parameter('1', '1_height')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To plot all of the parameters for all of the peaks, use a loop." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for peak_name in experiment.peak_names():\n", "    for parameter in experiment.fit_parameters(peak_name):\n", "        experiment.plot_fit_parameter(peak_name, parameter)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The blue line is the value of the parameter while the light blue shaded area is ±1 standard error on the determination of the parameter from the fit. By default, the plots scale the y-axis to the data, not the error bars. 
However, the y-axis can be scaled to include the error bars with the *scale_by_error* parameter:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for peak_name in experiment.peak_names():\n", "    for parameter in experiment.fit_parameters(peak_name):\n", "        experiment.plot_fit_parameter(peak_name, parameter, scale_by_error=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case the errors on the parameters are mostly small relative to the magnitude of the parameters. They look particularly large here because the fit parameters do not change significantly, since the 10 example files represent only a short time period where little was changing in the material.\n", "\n", "Error bars are plotted by default. To turn them off, use the `show_error` parameter:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "experiment.plot_fit_parameter(\"1\", \"1_height\", show_error=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.2. Fitting multiple peaks at multiple times\n", "We can set up a larger analysis to fit multiple peaks over time. A good workflow is to first determine suitable values for the `PeakParams` on a single spectrum using a `FitSpectrum` object, check that the fits are good, and then use them to run over multiple files using a `FitExperiment`. Here we take the `PeakParams` determined in the first tutorial notebooks. To make the analysis clearer, however, we name the maxima after their crystallographic lattice planes."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "first_cake_angle = 90\n", "cakes_to_fit = [36, 1, 2]\n", "\n", "\n", "peak_params = [PeakParams((2.75, 2.95), '110'),\n", "               PeakParams((3.02, 3.32), ['200', '10-10', '10-11'], [(3.09, 3.12), (3.19, 3.22), (3.24, 3.26)]),\n", "               PeakParams((4.13, 4.30), '210')]\n", "\n", "merge_cakes = True\n", "\n", "spectrum_time = 1\n", "file_stub = \"../example_data/adc_041_7Nb_NDload_700C_15mms_*\"\n", "\n", "experiment = spectrum_fitting.FitExperiment(spectrum_time, file_stub, first_cake_angle,\n", "                                            cakes_to_fit, peak_params, merge_cakes)\n", "\n", "experiment.run_analysis()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's see what parameters we can now plot using the `FitExperiment.peak_names` and `FitExperiment.fit_parameters` methods:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(experiment.peak_names(), \"\\n\")\n", "for peak_name in experiment.peak_names():\n", "    print(experiment.fit_parameters(peak_name), \"\\n\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we can see that we have the three peaks we have fitted. The peak names are given as a compound of the maxima names separated by a space. For convenience, anywhere you need to specify a peak name, you can either specify the compound peak name or just one of the constituent maxima names. In this case `experiment.fit_parameters(\"200 10-10 10-11\")` and `experiment.fit_parameters(\"10-10\")` would give the same result.\n", "\n", "Notice that the second peak has three sets of fit parameters, one set for each maximum.\n", "\n", "We likely want to focus on a subset of the parameters rather than plotting them all. This time we use a Python `if` statement to select only the parameters corresponding to the peak centers."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for peak_name in experiment.peak_names():\n", "    for parameter in experiment.fit_parameters(peak_name):\n", "        if \"center\" in parameter:\n", "            experiment.plot_fit_parameter(peak_name, parameter, scale_by_error=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.3. Plotting fits\n", "\n", "It is good practice to inspect the fits directly (or at least a subset of them) to check that the automated fits are working correctly. To get an idea of how the fits went, you can plot the fits from a time series using the `plot_fits` method. By default the method produces plots for 5 time steps, evenly spaced in time, with one plot for each fitted peak." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "experiment.plot_fits()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This could potentially be a lot of plots, so you can narrow down the plots you want by providing different arguments to the `plot_fits` method. `num_time_steps` sets how many evenly spaced time steps to plot (default: 5). `peak_names` is a list of one or more peak names to plot (default: all fitted peaks). `time_steps` is a list of integer values specifying which time steps to plot. If `time_steps` is provided then `num_time_steps` will be ignored." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "experiment.plot_fits(peak_names=[\"110\", \"210\"], time_steps=[2, 3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.4. Fitting a subset of time steps\n", "\n", "Sometimes we do not want to process all spectra in a series. Perhaps the sampling frequency is too high and we only want to fit every other spectrum or every 10th spectrum. Perhaps the interesting data is at the end so you want to skip fitting the first 100 spectra. 
This can be done by supplying an extra parameter to the `FitExperiment` object.\n", "\n", "The *frames_to_load* parameter is a list of integer values specifying which files to load. The file stub also has to be modified here by adding a Python format string where the numbers need to be substituted in the file name. In this example `:05d` corresponds to a 5 digit wide integer padded with zeros. This means 1 will become 00001, 10 will become 00010 etc. For more on Python string formatting see: https://pyformat.info/#number\n", "\n", "The example below is the same as the one above except that it only loads spectra 1, 3 and 4 from the example folder. Notice how the x-axis on the parameter plots scales correctly - leaving a gap at 2 seconds where there is no data." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "frames_to_load = [1, 3, 4]\n", "file_stub = \"../example_data/adc_041_7Nb_NDload_700C_15mms_{:05d}.dat\"\n", "\n", "experiment = spectrum_fitting.FitExperiment(spectrum_time, file_stub, first_cake_angle,\n", "                                            cakes_to_fit, peak_params, merge_cakes, frames_to_load)\n", "\n", "experiment.run_analysis()\n", "\n", "experiment.plot_fit_parameter(\"110\", \"110_center\", show_points=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To get the raw data you can use the `get_fit_parameter` method. The first column is the time (x-data), the second column is the requested parameter (y-data) and the third column is the standard error on the fit parameter (y-error) estimated from the fitting covariance matrix." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "experiment.get_fit_parameter(\"110\", \"110_center\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.5. 
Saving and loading fits\n", "Once you have run an analysis, you may want to save the fits so that you can refer back to them later.\n", "\n", "This can be done using the `save` method of the `FitExperiment` object." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "experiment.save(\"experiment.dump\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This file is a compressed binary file and so is not human readable. Note that although the file is compressed, the output may well be large - typically on the order of the size of the input data since the input data is embedded in the object.\n", "\n", "To read in a previously saved `FitExperiment` object, use the `spectrum_fitting.load_dump` function. This returns a new `FitExperiment` with the saved fits which you can operate on just as before." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "old_experiment = spectrum_fitting.load_dump(\"experiment.dump\")\n", "old_experiment.plot_fit_parameter(\"110\", \"110_center\", show_points=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.6. Using previous fits as a starting point for the next fit\n", "\n", "The result of a fit from a previous time step can be used as the starting parameters for the next fit. You can do this by using the `reuse_fits` parameter of the `run_analysis` method. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "experiment.run_analysis(reuse_fits=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Reusing the fits can provide a significant speedup in fitting, especially if the fits change little between time steps, because the previous fit should then be a good starting point for the next one. \n", "\n", "If the fits are quite different between time steps it is likely better to not reuse the fits. 
In this case the code will make an educated guess about the parameters at each time step instead.\n", "\n", "We have previously found that while reusing fits often improves performance, on occasion reusing the fits can cause poor fitting performance, taking many iterations to complete each fit. Try with and without reusing fits and see which works best for your dataset." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" }, "pycharm": { "stem_cell": { "cell_type": "raw", "metadata": { "collapsed": false }, "source": [] } } }, "nbformat": 4, "nbformat_minor": 2 }