Data Fitting Exercises

1. Linear Regression

A common part of data analysis is fitting a mathematical model to some experimental data. This might be to determine how well a particular model describes your experimental data, or to extract a numerical parameter; for example, fitting rate laws to reactant concentrations over time to obtain rate constants. Another example is the semester 1 exercise, where you fitted CO$_2$ vapour pressures to the Clausius-Clapeyron equation to obtain phase transition enthalpies.

A third example, where fitting simple experimental data gives chemical information about a particular reaction, involves the Van't Hoff equation. This relates the change in the equilibrium constant, $K$, for a reaction to a change in temperature, provided the enthalpy change for the reaction, $\Delta H$, is constant:

\begin{equation} \frac{\mathrm{d}\,\ln{K}}{\mathrm{d}\,T} = \frac{\Delta H}{RT^2}.\tag{1} \end{equation}

The derivation of this equation comes from combining two standard equations for the Gibbs free energy change, $\Delta G$, for a reaction:

\begin{equation} \Delta G = \Delta H - T \Delta S\tag{2} \end{equation}

\begin{equation} \Delta G = -RT \ln{K}\tag{3} \end{equation}

Combining these two equations, we get

\begin{equation} \ln{K} = -\frac{\Delta H}{RT} + \frac{\Delta S}{R}.\tag{4} \end{equation}

This is the linear form of the Van't Hoff equation. Differentiating with respect to $T$ (keeping $\Delta H$ and $\Delta S$ constant) gives equation 1, because $\frac{\mathrm{d}}{\mathrm{d}T}\left(\frac{1}{T}\right) = -\frac{1}{T^2}$.

Because the linear form of the Van't Hoff equation has the form of a straight line (hence “linear”), a plot of $\ln{K}$ against $\frac{1}{T}$ should give a straight line, as long as the enthalpy and entropy of reaction are approximately constant over the temperature range of interest. A straight line fitted to these data will have a slope of $-\frac{\Delta H}{R}$ and an intercept of $\frac{\Delta S}{R}$. By performing a linear fit, it is therefore possible to extract values for $\Delta H$ and $\Delta S$.

In the semester 1 lab, you learned how to use the linregress() function (from scipy.stats) to perform a linear regression (fit a straight line to a set of data). This exercise works through model fitting in a bit more detail, and is divided into three parts.

  1. In the first part, you will have the opportunity to revise and practice working with data, and using scipy.stats.linregress to fit a straight line, to obtain the experimental reaction enthalpy and entropy for an equilibrium.

  2. The second part will go into more detail about what we mean when we talk about “fitting” a straight line, and will show you a different, more general approach to solving the same problem, which involves writing your own functions.

  3. In the third part, you will apply what you have learned to fit a non-linear model to some experimental data, to analyse a flash-photolysis experiment.

Review the basics of using matplotlib to plot graphs, and using scipy.stats.linregress for linear regression, using your notebooks from last semester.
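
As a quick reminder of the call itself, the sketch below fits a straight line to a few made-up points that lie exactly on $y = 2x + 1$. The arrays x and y are illustrative only and are not the exercise data.

```python
import numpy as np
from scipy.stats import linregress

# Made-up points lying exactly on y = 2x + 1 (not the exercise data).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# linregress returns a result object; the slope and intercept are attributes.
result = linregress(x, y)
print(result.slope, result.intercept)
```

The result object also carries other attributes, such as rvalue and stderr, which can be useful when judging the quality of a fit.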

Exercise

The file data/equilibrium_constant.dat contains a set of equilibrium constants measured at different temperatures for the reaction

\begin{equation} 2\mathrm{NO}_2 \rightleftharpoons \mathrm{N}_2\mathrm{O}_4. \end{equation}

By performing a linear regression with these data, calculate the enthalpy and entropy changes for this reaction.

Instructions

The first thing you will need to do is set up your notebook for working with data, by importing numpy and the plotting functions from matplotlib.

import numpy as np

import matplotlib.pyplot as plt

%matplotlib inline
In [1]:
 

Process

The data from this experiment are stored in a text file in a shared folder on ScienceData, data/equilibrium_constant.dat, which looks like


# equilibrium constant data for 2 NO2 => N2O4
# columns are: temperature (degrees Celsius), K

9   34.3
20  12
25  8.79
33  4.4
40  2.8
52  1.4
60  0.751
70  0.4

To be able to work with the temperature and equilibrium constant data, you want to load them into numpy arrays. You can read the data from the file using the np.loadtxt() function, e.g.

data = np.loadtxt('https://sciencedata.dk/public/6e3ed434c0fa43df906ce2b6d1ba9fc6/chem_data_analysis_jupyter/data/equilibrium_constant.dat')
Load the dataset from the file into a numpy array called data. Check that this is a 2D array containing both the temperatures and equilibrium constants.
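
If you want to try np.loadtxt() without a network connection, the following sketch reads the same numbers from an in-memory copy of the file contents listed earlier. The io.StringIO wrapper and the name file_text are assumptions made for this illustration; in the exercise itself you should load the data from the URL.

```python
import io
import numpy as np

# In-memory copy of the file contents (an assumed stand-in for the real file).
file_text = """# equilibrium constant data for 2 NO2 => N2O4
# columns are: temperature (degrees Celsius), K

9   34.3
20  12
25  8.79
33  4.4
40  2.8
52  1.4
60  0.751
70  0.4
"""

# By default np.loadtxt ignores lines starting with '#' and splits on whitespace.
data = np.loadtxt(io.StringIO(file_text))
print(data.shape)  # (number of rows, number of columns)
```

Printing data.shape (or data.ndim) is one way to check that the result is a 2D array.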
In [ ]:
 
To make it easier to work with the temperature and equilibrium constant data separately, copy the columns from your data array into a pair of 1D arrays, named temperature and k_eq. Remember that you can refer to a particular column or row in a 2D array using **array slicing**. Convert your temperature array from Celsius into Kelvin.
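
One possible way to do this is sketched below. To keep the example self-contained, the data array is typed in directly from the file listing rather than loaded; the offset of 273.15 is the standard Celsius to Kelvin conversion.

```python
import numpy as np

# Values from the data file, typed in directly so this sketch stands alone.
data = np.array([[9, 34.3], [20, 12], [25, 8.79], [33, 4.4],
                 [40, 2.8], [52, 1.4], [60, 0.751], [70, 0.4]])

# data[:, 0] selects every row of column 0; data[:, 1] every row of column 1.
temperature = data[:, 0] + 273.15  # convert Celsius to Kelvin
k_eq = data[:, 1]

print(temperature)
print(k_eq)
```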
In [ ]:
 
In [ ]:
 
Plot $\ln{K}$ against $1/T$, and check that this gives a straight line relationship. Label your axes.
In [ ]:
 
Using scipy.stats.linregress, fit a straight line to this dataset, and print the slope and intercept from your fit.
In [ ]:
 
Generate another $\ln{K}$ versus $1/T$ plot, showing the experimental data as points, and your “line of best fit” as a line.

A simple way to generate points that lie along a straight line is to define a line function:

def line(m, c, x):
    # Straight line with slope m and intercept c, evaluated at x.
    y = m * x + c
    return y

This function takes three arguments, $m$, $c$, and $x$, and returns the corresponding $y$ values. By passing in your fitted slope as $m$, your fitted intercept as $c$, and $1/T$ as $x$, you can calculate the $y$ coordinates to plot on your figure.
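
The sketch below shows how such a function behaves when $x$ is a numpy array. The values m_example, c_example, and x_example are placeholders chosen for illustration, not results from this exercise.

```python
import numpy as np
import matplotlib.pyplot as plt

def line(m, c, x):
    y = m * x + c
    return y

# Placeholder values for illustration only (not results from the exercise).
m_example = 2.0
c_example = 1.0
x_example = np.array([0.0, 0.5, 1.0])

# Because x_example is a numpy array, y is calculated for every element at once.
y_example = line(m_example, c_example, x_example)
print(y_example)

# Two calls to plt.plot(): one draws points, the other draws a line.
plt.plot(x_example, y_example, 'o')
plt.plot(x_example, y_example, '-')
plt.xlabel('x')
plt.ylabel('y')
```

In your own figure the points would be the experimental data and the line would come from your fitted slope and intercept.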

In [ ]:
# Define your function here
In [ ]:
# Plot your figure.

# You will need to use `plt.plot()` twice to plot the original data, and your fitting line.
Use your fitted values for the slope and intercept to calculate the enthalpy and entropy of this reaction. Remember that you can import the gas constant, $R$, using

from scipy.constants import R

Add more code cells as necessary to organise your calculation.
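
The conversion follows from equation 4, where the slope is $-\frac{\Delta H}{R}$ and the intercept is $\frac{\Delta S}{R}$. The sketch below uses placeholder numbers, slope_example and intercept_example, which are assumptions for illustration and not the answer to this exercise.

```python
from scipy.constants import R  # gas constant in J K^-1 mol^-1

# Placeholder fit results for illustration only (not the exercise answer).
slope_example = 1000.0     # units of K
intercept_example = -5.0   # dimensionless

# From equation 4: slope = -(Delta H)/R and intercept = (Delta S)/R.
delta_H = -slope_example * R      # J mol^-1
delta_S = intercept_example * R   # J K^-1 mol^-1

print(delta_H, delta_S)
```

Keeping track of units in comments makes it easier to convert $\Delta H$ to kJ mol$^{-1}$ when you report it.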
In [ ]:
 
Create a markdown cell below, and use it to comment on the signs of $\Delta H$ and $\Delta S$ for this reaction.