Home > Guides, TKJ Electronics > A practical approach to Kalman filter and how to implement it

A practical approach to Kalman filter and how to implement it

I have for a long time been interrested in Kalman filers and how they work, I also used a Kalman filter for my Balancing robot, but I never explained how it actually was implemented. Actually I had never taken the time to sit down with a pen and a piece of paper and try to do the math by myself, so I actually did not know how it was implemented.
It turned out to be a good thing, as I actually discovered a mistake in the original code, but I will get back to that later.

I actually wrote about the Kalman filter as my master assignment in high school back in December 2011. But I only used the Kalman filter to calculate the true voltage of a DC signal modulated by known Gaussian white noise. My assignment can be found in the following zip file: http://www.tkjelectronics.dk/uploads/Kalman_SRP.zip. It is in danish, but you can properly use google translate to translate some of it. If you got any specific questions regarding the assignment, then ask in the comments below.

Okay, but back to the subject. As I sad I had never taken the time to sit down and do the math regarding the Kalman filter based on an accelerometer and a gyroscope. It was not as hard as I expected, but I must confess that I still have not studied the deeper theory behind, on why it actually works. But for me, and most people out there, I am more interrested in implementing the filter, than in the deeper theory behind and why the equations works.

Before we begin you must have some basic knowledge about matrices like multiplication of matrices and transposing of matrices. If not then please take a look at the following websites:

For those of you who do not know what a Kalman filter is, it is an algorithm which uses a series of measurements observed over time, in this context an accelerometer and a gyroscope. These measurements will contain noise that will contribute to the error of the measurement. The Kalman filter will then try to estimate the state of the system, based on the current and previous states, that tend to be more precise that than the measurements alone.

In this context the problem is that the accelerometer is in general very noise when it is used to measure the gravitational acceleration since the robot is moving back and forth. The problem with the gyro is that it drifts over time – just like a spinning wheel-gyro will start to fall down when it is losing speed.
In short you can say that you can only trust the gyroscope on a short term while you can only trust the accelerometer on a long term.

There is actually a very easy way to deal with this by using a complimentary filter, which basicly just consist of a digital low-pass filter on the accelerometer and digital high-pass filter on the gyroscope readings. But it is not as accurate as the Kalman filter, but other people have succesfully build balancing robots using a fine-tuned complimentary filter.

More information about gyroscopes, accelerometer and complimentary filters can be found in this pdf. A comparison between a complimentary filter and a Kalman filter can be found in the following blog post.

The Kalman filter operates by producing a statistically optimal estimate of the system state based upon the measurement(s). To do this it will need to know the noise of the input to the filter called the measurement noise, but also the noise of the system itself called the process noise. To do this the noise has to be Gaussian distributed and have a mean of zero, luckily for us most random noise have this characteristic.
For more information about the theory behind the filter take a look at the following pages:

The system state \boldsymbol{x}_k
The next of this article might seem very confusing for some, but I promise you if you grab a pen and a piece of paper and try to follow along it is not that hard if you are reasonable at math.

If you, like me, do not have a calculator or computer program that can work with matrices, then I recommend the free online calculator Wolfram Alpha. I used it for all the calculations in this article.

I will use the same notation as the wikipedia article, but I will like to note that when the matrixes are constants and does not depend on the current time you do not have to write the k after them. So for instance F_k can be simplified to F.

Also I would like to write a small explanation of the other aspects of the notations.
First I will make a note about whats called the previous state:

\boldsymbol{\hat{x}}_{k-1 | k-1}

Which is the previous estimated state based on the previous state and the estimates of the states before it.

The next is the a priori state:

\boldsymbol{\hat{x}}_{k | k-1}

A priori means the estimate of the state matrix at the current time k based on the previous state of the system and the estimates of the states before it.

The last one is called a posteriori state:

\boldsymbol{\hat{x}}_{k | k}

Is the estimated of the state at time k given observations up to and including at time k.

The problem is that the system state itself is hidden and can only be observed through observation z_k. This is also called a Hidden Markov model.

This means that the state will be based upon the state at time k and all the previous states. That also means that you can not trust the estimate of the state before the Kalman filter has stabilized – take a look at the graph at the front page of my assignment.

The hat over the \hat{x} means that is the estimate of the state. Unlike just a single x which means the true state – the one we are trying to estimate.
So the notation for the state at time k is:


The state of the system at time k if given by:

\boldsymbol{x}_k = \boldsymbol{F}x_{k-1} + \boldsymbol{B}u_k + w_k

Where x_k is the state matrix which is given by:

\boldsymbol{x}_k = \begin{bmatrix} \theta \\ \dot{\theta}_b \end{bmatrix}_k

As you can see the output of the filter will be the angle \theta but also the bias \dot{\theta}_b based upon the measurements from the accelerometer and gyroscope. The bias is the amount the gyro has drifted. This means that one can get the true rate by subtracting the bias from the gyro measurement.

The next is the F matrix, which is the state transition model which is applied to the prevouis state x_{k-1}.

In this case F is defined as:

\boldsymbol{F} = \begin{bmatrix} 1 & -\Delta t \\ 0 & 1 \end{bmatrix}

I know that the -\Delta t might seem confusing, but it will make sense later (take a look at my comment).

The next is the control input u_k, in this case it is the gyroscope measurement in degrees per second (°/s) at time k, this is also called the rate \dot{\theta}. We will actually rewrite the state equation as:

\boldsymbol{x}_k = \boldsymbol{F}x_{k-1} + \boldsymbol{B}{\dot{\theta}_k} + w_k

The next thing is the B matrix. Which is called the control-input model, which is defined as:

\boldsymbol{B} = \begin{bmatrix} \Delta t \\ 0 \end{bmatrix}

This makes perfectly sense as you will get the angle \theta when you multiply the rate \dot{\theta} by the delta time \Delta t and since we can not calculate the bias directly based on the rate we will set the bottom of the matrix to 0.

w_k is process noise which is Gaussian distributed with a zero mean and with covariance Q to the time k:

\boldsymbol{w}_k \sim N\left ( 0, \boldsymbol{Q}_k \right )

Q_k is the process noise covariance matrix and in this case the covariance matrix of the state estimate of the accelerometer and bias. In this case we will consider the estimate of the bias and the accelerometer to be independent, so it’s actually just equal to the variance of the estimate of the accelerometer and bias.
The final matrix is defined as so:

\boldsymbol{Q}_k = \begin{bmatrix} Q_\theta & 0 \\ 0 & Q_{\dot{\theta}_b} \end{bmatrix}\Delta t

As you can see the Q_k covariance matrix depends on the current time k, so the accelerometer variance Q_\theta and the variance of the bias Q_{\dot{\theta}_b} is multiplied by the delta time \Delta t.
This makes sense as the process noise will be larger as longer time it is since the last update of the state. For instance the gyro could have drifted.
You will have to know these constants for the Kalman filter to work.
Note if you set a larger value, the more noise in the estimation of the state. So for instance if the estimated angle starts to drift you have to increase the value of Q_{\dot{\theta}_b}. Otherwise if the estimate tends to be slow you are trusting the estimate of the angle too much and should try to decrease the value of Q_\theta to make it more responsive.

The measurement \boldsymbol{z}_k
Now we will take a look at the observation or measurement z_k of the true state x_k. The observation z_k is given by:

\boldsymbol{z}_k = \boldsymbol{H}x_{k} + v_k

As you can see the measurement z_k is given by the current state x_k multiplied by the H matrix plus the measurement noise v_k.

H is called the observation model and is used to map the true state space into the observed space. The true state can not be observed. Since the measurement is just the measurement from the accelerometer, H is given by:

\boldsymbol{H} = \begin{bmatrix} 1 & 0 \end{bmatrix}

The noise of the measurement have to be Gaussian distributed as well with a zero mean and R as the covariance:

\boldsymbol{v}_k \sim N\left ( 0, \boldsymbol{R} \right )

But as R is not a matrix the measurement noise is just equal to the variance of the measurement, since the covariance of the same variable is equal to the variance. See this page for more information.
Now we can define R as so:

\boldsymbol{R} = E \begin{bmatrix} v_k & {v_k}^T \end{bmatrix} = var(v_k)

More information about covariance can be found on Wikipedia and in my assignment.

We will assume that the measurement noise is the same and does not depend on the time k:

var(v_k) = var(v)

Note that if you set the measurement noise variance var(v) too high the filter will respond really slowly as it is trusting new measurements less, but if it is too small the value might overshoot and be noisy since we trust the accelerometer measurements too much.

So to round up you have to find the the process noise variances Q_\theta and Q_{\dot{\theta}_b} and the measurement variance of the measurement noise var(v). There are multiple ways to find them, but it is out of the aspect of this article.

The Kalman filter equations
Okay now to the equations we will use to estimate the true state of the system at time k \hat{x}_k. Some clever guys came up with equations found below to estimate the state of the system.
The equations can be written more compact, but I prefer to have them stretched out, so it is easier to implement and understand the different steps.

In the first two equations we will try to predict the current state and the error covariance matrix at time k. First the filter will try to estimate the current state based on all the previous states and the gyro measurement:

\boldsymbol{\hat{x}}_{k | k-1} = \boldsymbol{F}\hat{x}_{k-1 | k-1} + \boldsymbol{B}{\dot{\theta}_k}

That is also why it is called a control input, since we use it as an extra input to estimate the state at the current time k called the a priori state \hat{x}_{k | k-1} as described in the beginning of the article.

The next thing is that we will try to estimate the a priori error covariance matrix P_{k | k-1} based on the previous error covariance matrix P_{k-1 | k-1}, which is defined as:

\boldsymbol{P}_{k | k-1} = \boldsymbol{F}\boldsymbol{P}_{k-1 | k-1}\boldsymbol{F}^T + \boldsymbol{Q}_k

This matrix is used to estimate how much we trust the current values of the estimated state. The smaller the more we trust the current estimated state. The principle of the equation above is actually pretty easy to understand, as it is pretty obvious that the error covariance will increase since we last updated the estimate of the state, therefore we multiplied the error covariance matrix by the state transition model F and the transpose of that F^T and add the current process noise Q_k at time k.

The error covariance matrix P in our case is a 2×2 matrix:

\boldsymbol{P} = \begin{bmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{bmatrix}

The fist thing we will do is to compute the difference between the measurement z_k and the a priori state x_{k | k-1}, this is also called the innovation:

\boldsymbol{\tilde{y}}_k = \boldsymbol{z}_k - \boldsymbol{H}\hat{x}_{k | k-1}

The observation model H is used to map the a priori state x_{k | k-1} into the observed space which is the measurement from the accelerometer, therefore the innovation is not a matrix

\boldsymbol{\tilde{y}}_k = \begin{bmatrix} \boldsymbol{\tilde{y}} \end{bmatrix}_k

The next thing we will do is calculate whats called the innovation covariance:

\boldsymbol{S}_k = \boldsymbol{H} \boldsymbol{P}_{k | k-1} \boldsymbol{H}^T + \boldsymbol{R}

What it does is that it tries to predict how much we should trust the measurement based on the a priori error covariance matrix P_{k | k-1} and the measurement covariance matrix R. The observation model H is used to map the a priori error covariance matrix P_{k | k-1} into observed space.

The bigger the value of the measurement noise the larger the value of S, this means that we do not trust the incoming measurement that much.
In this case S is not a matrix and is just written as:

\boldsymbol{S}_k = \begin{bmatrix} \boldsymbol{S} \end{bmatrix}_k

The next step is to calculate the Kalman gain. The Kalman gain is used to to indicate how much we trust the innovation and is defined as:

\boldsymbol{K}_k = \boldsymbol{P}_{k | k-1} \boldsymbol{H}^T \boldsymbol{S}_k^{-1}

You can see that if we do not trust the innovation that much the innovation covariance S will be high and if we trust the estimate of the state then the error covariance matrix P will be small the Kalman gain will therefore be small and oppesite if we trust the innovation but does not trust the estimation of the current state.

If you take a deeper look you can see that the transpose of the observation model H is used to map the state of the error covariance matrix P into observed space. We then compare the error covariance matrix by multiplying with the inverse of the innovation covariance S.

This make sense as we will use the observation model H to extract data from the state error covariance and compare that with the current estimate of the innovation covariance.

Note that if you do not know the state at startup you can set the error covariance matrix like so:

\boldsymbol{P} = \begin{bmatrix} L & 0 \\ 0 & L \end{bmatrix}

Where L represent a large number.

For my balancing robot I know the starting angle and I find the bias of the gyro at startup by calibrating, so I assume that the state will be known at startup, so I initialize the error covariance matrix like so:

\boldsymbol{P} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}

Take a look at my calibration routine for more information.

In this case the Kalman gain is a 2×1 matrix:

\boldsymbol{K} = \begin{bmatrix} K_0 \\ K_1 \end{bmatrix}

Now we can update the a posteriori estimate of the current state:

\boldsymbol{\hat{x}}_{k | k} = \boldsymbol{\hat{x}}_{k | k-1} + \boldsymbol{K}_k \; \boldsymbol{\tilde{y}}_k

This is done by adding the a priori state \hat{x}_{k | k-1} with the Kalman gain multiplied by the innovation \tilde{y}_k.
Remember that the innovation \tilde{y}_k is the difference between the measurement z_k and the estimated priori state \hat{x}_{k | k-1}, so the innovation can both be positive and negative.

A little simplified the equation can be understood as we simply correct the estimate of the a priori state \hat{x}_{k | k-1}, that was calculated using the previous state and the gyro measurement, with the measurement – in this case the accelerometer.

The last thing we will do is update the a posteriori error covariance matrix:

\boldsymbol{P}_{k | k} = (\boldsymbol{I} - \boldsymbol{K}_k \boldsymbol{H}) \boldsymbol{P}_{k | k-1}

Where I is called the identity matrix and is defined as:

\boldsymbol{I} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}

What the filter is doing is that it is basically self-correcting the error covariance matrix based on how much we corrected the estimate. This make sense as we corrected the state based the a priori error covariance matrix P_{k | k-1}, but also the innovation covariance S_k.

Implementing the filter
In this section I will use the equation from above to implement the filter into a simple c++ code that can be used for balancing robots, quadcopters and other applications where you need to compute the angle, bias or rate.

In case you want the code next to you, it can be found at github: https://github.com/TKJElectronics/KalmanFilter.

I will simply write the equations at the top of each step and then simplify them after that I will write how it is can be done i C and finally I will link to calculations done in Wolfram Alpha in the bottom of each step, as I used them to do the calculation.

Step 1:

\boldsymbol{\hat{x}}_{k | k-1} = \boldsymbol{F}\hat{x}_{k-1 | k-1} + \boldsymbol{B}{\dot{\theta}_k}

\begin{bmatrix} \theta \\ \dot{\theta}_b \end{bmatrix}_{k | k-1} = \begin{bmatrix} 1 & -\Delta t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \theta \\ \dot{\theta}_b \end{bmatrix}_{k-1 | k-1} + \begin{bmatrix} \Delta t \\ 0 \end{bmatrix} \dot{\theta}_k

 = \begin{bmatrix} \theta - \dot{\theta}_b \Delta t \\ \dot{\theta}_b \end{bmatrix}_{k-1 | k-1} + \begin{bmatrix} \Delta t \\ 0 \end{bmatrix} \dot{\theta}_k

 = \begin{bmatrix} \theta - \dot{\theta}_b \Delta t + \dot{\theta} \Delta t \\ \dot{\theta}_b \end{bmatrix}

 = \begin{bmatrix} \theta + \Delta t (\dot{\theta} - \dot{\theta}_b) \\ \dot{\theta}_b \end{bmatrix}

As you can see the a priori estimate of the angle is \hat{\theta}_{k | k-1} is equal to the estimate of the previous state \hat{\theta}_{k-1 | k-1} plus the unbiased rate times the delta time \Delta t.
Since we can not directly measure the bias the estimate of the a priori bias is just equal to the previous one.

This can be written in C like so:

rate = newRate - bias;
angle += dt * rate;

Note that I calculate the unbiased rate, so it can be be used by the user as well.

Wolfram Alpha links:

Step 2:

\boldsymbol{P}_{k | k-1} = \boldsymbol{F}\boldsymbol{P}_{k-1 | k-1}\boldsymbol{F}^T + \boldsymbol{Q}_k

\begin{bmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{bmatrix}_{k | k-1} = \begin{bmatrix} 1 & -\Delta t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{bmatrix}_{k-1 | k-1} \begin{bmatrix} 1 & 0 \\ -\Delta t & 1 \end{bmatrix} + \begin{bmatrix} Q_\theta & 0 \\ 0 & Q_{\dot{\theta}_b} \end{bmatrix}\Delta t

= \begin{bmatrix} P_{00} - \Delta t P_{10} & P_{01} - \Delta t P_{11} \\ P_{10} & P_{11} \end{bmatrix}_{k-1 | k-1} \begin{bmatrix} 1 & 0 \\ -\Delta t & 1 \end{bmatrix} + \begin{bmatrix} Q_\theta & 0 \\ 0 & Q_{\dot{\theta}_b} \end{bmatrix}\Delta t

= \begin{bmatrix} P_{00} -\Delta t P_{10} -\Delta t (P_{01} - \Delta t P_{11}) & P_{01} -\Delta t P_{11} \\ P_{10} -\Delta t P_{11} & P_{11} \end{bmatrix}_{k-1 | k-1} + \begin{bmatrix} Q_\theta & 0 \\ 0 & Q_{\dot{\theta}_b} \end{bmatrix}\Delta t

= \begin{bmatrix} P_{00} -\Delta t P_{10} -\Delta t (P_{01} - \Delta t P_{11}) + Q_\theta \Delta t & P_{01} -\Delta t P_{11} \\ P_{10} -\Delta t P_{11} & P_{11} + Q_{\dot{\theta}_b} \Delta t \end{bmatrix}

= \begin{bmatrix} P_{00} + \Delta t (\Delta t P_{11} - P_{01} - P_{10} + Q_\theta) & P_{01} -\Delta t P_{11} \\ P_{10} -\Delta t P_{11} & P_{11} + Q_{\dot{\theta}_b} \Delta t \end{bmatrix}

The equations above can be written in C like so:

P[0][0] += dt * (dt*P[1][1] - P[0][1] - P[1][0] + Q_angle);
P[0][1] -= dt * P[1][1];
P[1][0] -= dt * P[1][1];
P[1][1] += Q_gyroBias * dt;

Note that this is the part of the code that there was an error in in the original code that I used.

Wolfram Alpha links:

Step 3:

\boldsymbol{\tilde{y}}_k = \boldsymbol{z}_k - \boldsymbol{H}\hat{x}_{k | k-1}

= \boldsymbol{z}_k - \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} \theta \\ \dot{\theta}_b \end{bmatrix}_{k | k-1}

= \boldsymbol{z}_k - \theta_{k | k-1}

The innovation can be calculated in C like so:

y = newAngle - angle;

Wolfram Alpha links:

Step 4:

\boldsymbol{S}_k = \boldsymbol{H} \boldsymbol{P}_{k | k-1} \boldsymbol{H}^T + \boldsymbol{R}

= \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{bmatrix}_{k | k-1} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \boldsymbol{R}

= {P_{00}}_{k | k-1} + \boldsymbol{R}

= {P_{00}}_{k | k-1} + var(v)

Again the C code is pretty simple:

S = P[0][0] + R_measure;

Wolfram Alpha links:

Step 5:

\boldsymbol{K}_k = \boldsymbol{P}_{k | k-1} \boldsymbol{H}^T \boldsymbol{S}_k^{-1}

\begin{bmatrix} K_0 \\ K_1 \end{bmatrix}_k = \begin{bmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{bmatrix}_{k | k-1} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \boldsymbol{S}_k^{-1}

= \begin{bmatrix} P_{00} \\ P_{10} \end{bmatrix}_{k | k-1} \boldsymbol{S}_k^{-1}

= \dfrac{\begin{bmatrix} P_{00} \\ P_{10}\end{bmatrix}_{k | k-1}}{\boldsymbol{S}_k}

Note that in other cases S can be a matrix and you can not just simply divide P by S. Instead you have to calculate the inverse of the matrix. See the following page for more information on how to do so.

The C implementation looks like this:

K[0] = P[0][0] / S;
K[1] = P[1][0] / S;

Wolfram Alpha links:

Step 6:

\boldsymbol{\hat{x}}_{k | k} = \boldsymbol{\hat{x}}_{k | k-1} + \boldsymbol{K}_k \; \boldsymbol{\tilde{y}}_k

\begin{bmatrix} \theta \\ \dot{\theta}_b \end{bmatrix}_{k | k} = \begin{bmatrix} \theta \\ \dot{\theta}_b \end{bmatrix}_{k | k-1} + \begin{bmatrix} K_0 \\ K_1 \end{bmatrix}_k \boldsymbol{\tilde{y}}_k

= \begin{bmatrix} \theta \\ \dot{\theta}_b \end{bmatrix}_{k | k-1} + \begin{bmatrix} K_0 \; \boldsymbol{\tilde{y}} \\ K_1 \; \boldsymbol{\tilde{y}} \end{bmatrix}_k

Yet again the equation end up pretty short, and can be written as so in C:

angle += K[0] * y;
bias += K[1] * y;

Step 7:

\boldsymbol{P}_{k | k} = (\boldsymbol{I} - \boldsymbol{K}_k \boldsymbol{H}) \boldsymbol{P}_{k | k-1}

\begin{bmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{bmatrix}_{k | k} = \left(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} K_0 \\ K_1 \end{bmatrix}_k \begin{bmatrix} 1 & 0 \end{bmatrix}\right) \begin{bmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{bmatrix}_{k | k-1}

= \left(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} K_0 & 0 \\ K_1 & 0 \end{bmatrix}_k \right) \begin{bmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{bmatrix}_{k | k-1}

= \begin{bmatrix} P_{00} & P_{01} \\ P_{10} & P_{11} \end{bmatrix}_{k | k-1} - \begin{bmatrix} K_0 \; P_{00} & K_0 \; P_{01} \\ K_1 \; P_{00} & K_1 \; P_{01} \end{bmatrix}

Remember that we decrease the error covariance matrix again, since the error of the estimate of the state has been decreased.
The C code looks like this:

float P00_temp = P[0][0];
float P01_temp = P[0][1];

P[0][0] -= K[0] * P00_temp;
P[0][1] -= K[0] * P01_temp;
P[1][0] -= K[1] * P00_temp;
P[1][1] -= K[1] * P01_temp;

Wolfram Alpha links:

Note that I have found that the following variances works perfectly for most IMUs:

float Q_angle = 0.001;
float Q_gyroBias = 0.003;
float R_measure = 0.03;

Remember that it’s very important to set the target angle at startup if you need to use the output at startup. For more information, see the calibration routine for my balancing robot.

In case you missed it here is the library I wrote a library that can be used by any microcontroller that supports floating math. The source code can be found at github: https://github.com/TKJElectronics/KalmanFilter.

If you prefer a video explanation about the Kalman filter, I recommend the following video series: http://www.youtube.com/watch?v=FkCT_LV9Syk.

Note that you can not use the library if you need to represent something in a full 3D orientations, as euler angles suffer from what is called Gimbal lock you will need to use Quaternions to do that, but that is a whole nother story. For now take a look at the following page.

This is all for know, I hope that you will find i helpfull, if you do or have any questions fell free to post a comment below – it supports LaTeX syntax as well, if you need to write equations.
If you spot any errors please let me know as well.

Categories: Guides, TKJ Electronics Tags:
  1. September 10th, 2012 at 23:06 | #1

    Thanks for the great article and for the reference to filter.pdf (my document from ages ago…). This is definitely one of the better explanations I’ve seen about making a Kalman filter for angle estimation based on accelerometer/gyro data.

    What do you think about the idea of pre-computing the P, S, and K values? They should converge independent of the states and measurements. In the steady-state, it seems like you would be back to a complementary filter with K[0]’s steady-state value and a simple bias estimator with K[1]’s steady-state. Of course, having the computation done in real-time opens up more possibilities (covariance reset, EKF, etc.).

  2. September 10th, 2012 at 23:40 | #2

    Thank you very much. I must say that your pdf and the project details about your balancing robot were a big inspiration too, when I started to look into building a balancing robot.

    I see you point. They should end up being constants after some time, as they don’t depend on any incoming data from either gyroscope or accelerometer.
    It might be interesting to run the numbers in Matlab and see what P, S, and K end up with. That might be material for a future blog post.

  3. September 11th, 2012 at 02:21 | #3

    I look forward to it! I’ve been trying to compare the two filters for a long time and this has helped a lot. I just never had a simple enough presentation of the Kalman filter approach to try.

    What is the delta-T in your implementation?

  4. September 11th, 2012 at 08:44 | #4

    Okay. I will try to hopefully start working with Matlab in the coming weeks and then I will let you know what conclusion I come up with.

    Delta t is the time since the calculation was last performed. See these lines at the example code: https://github.com/TKJElectronics/KalmanFilter/blob/master/examples/KalmanExample/KalmanExample.ino#L45-46.

  5. Conrado
    September 11th, 2012 at 15:08 | #5

    Great explanation! Many people complain about Kalman Filters, but this surely will help them.

    About @Shane’s comment, many industry applications use constant values for the Kalman gain (K), as it converges if the process is stationary. I tried it myself and works very nicely, specially if the processor is not very powerful.

    Also, being a Control Engineer, I’d like to add a minor detail: the gimbal lock occurs because of a mechanical characteristic of the system, independently of the mathematical representation. It reflects as a loss of degree of freedom in the Euler angles representations and a singularity in the Jacobian. But the same problem happens with quaternions. Although quaternions don’t suffer from the singularity, and therefore are able to keep tracking the angles, the mechanical gimbal lock still occurs. Think about a robotic arm and you’ll see that sometimes the one of the wrists has to move very (almost infinitely) fast. That occurs because of the singularity in the Euler angles used to build the robot (which can be reflected using Euler angles in the formulation), but will happen even if we use quaternions for tracking (which will be able to track correctly, but the speed will be infinite). Unfortunately, we can’t build robot arms based on quaternions. The problem is in the inverse dynamics.

    Finally, from the point of view of filtering, the gimbal lock can only be observed through Euler angles, but from the point of view of control, it can happen because of the mechanics of the robot.

  6. September 11th, 2012 at 17:56 | #6

    This is the greatest article on this topic I’ve ever seen!

    But could you please make a .pdf version too? Or, did I missed the link?

  7. September 11th, 2012 at 19:24 | #7

    I have now generated a pdf version of the article. It can be found in the bottom of the article.

  8. September 11th, 2012 at 19:37 | #8

    Thank you for your clarification! The true is that I have not looked into the theory of quaternions as I have never needed to use them, so it’s nice to get some feedback from people like you!

  9. Marco
    September 12th, 2012 at 09:11 | #9

    Thank thee very much!!! I spend two days to read the pager….still a little hard for me, but closer to kalman filter!

    Thank you!

  10. bimmer
    September 12th, 2012 at 15:13 | #10

    Awesome article, really enjoyed it.
    Could you please explain about the ‘physical model’ behind the equations? How did you get the matrices F and B.

    Thank you!

  11. Claudio Donaté
    September 12th, 2012 at 16:24 | #11

    Very nice article, Lauszus. Truly amazing! Best one I’ve seen so far. Congrats!

    You said that you found an error in your original code, can you specify what and where was it?

    Thank you very much for this!

  12. September 13th, 2012 at 22:33 | #12

    Okay. I will try with a example.
    Remember that the state is given by:
    $latex \boldsymbol{x}_k = \boldsymbol{F}x_{k-1} + \boldsymbol{B}{\dot{\theta}_k} + w_k$

    We will just assume that after some samples the state is equal to:
    $latex \boldsymbol{x}_{k-1} = \begin{bmatrix} \theta \\ \dot{\theta}_b \end{bmatrix}_{k-1} = \begin{bmatrix} 180deg \\ 10deg/s \end{bmatrix}_{k-1}$
    Which means that the angle is currently 180° and the amount the gyroscope has drifted is equal to 10°/s.

    Okay now assume that the output of the gyroscope is measured as 15°/s:
    $latex \dot{\theta}_k = 15deg/s$

    It’s easy to understand that it’s not 15°/s since the bias is 10°/s, so the real rate is 5°/s -- this what is we are trying to achieve.

    Now we will define the $latex F$ and $latex B$ matrices as we did in the blog post:
    $latex \boldsymbol{F} = \begin{bmatrix} 1 & -\Delta t \\ 0 & 1 \end{bmatrix}$

    $latex \boldsymbol{B} = \begin{bmatrix} \Delta t \\ 0 \end{bmatrix}$

    Let’s put it into the equation:
    $latex \boldsymbol{x}_k = \begin{bmatrix} 1 & -\Delta t \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 180deg \\ 10deg/s \end{bmatrix} + \begin{bmatrix} \Delta t \\ 0 \end{bmatrix}{15deg/s} + w_k$

    So now let’s multiply those matrices:
    $latex \boldsymbol{x}_k = \begin{bmatrix} 180deg-\Delta t \times 10deg/s \\ 10deg/s \end{bmatrix} + \begin{bmatrix} \Delta t \times 15deg/s \\ 0 \end{bmatrix} + w_k$

    Now we will add them together:
    $latex \boldsymbol{x}_k = \begin{bmatrix} 180deg-\Delta t \times 10deg/s+\Delta t \times 15deg/s \\ 10deg/s \end{bmatrix} + w_k$

    The end up result will be:
    $latex \boldsymbol{x}_k = \begin{bmatrix} 180deg+\Delta t \times 5deg/s \\ 10deg/s \end{bmatrix} + w_k$

    Also check out step 1 for more help.

    @Claudio Donaté
    See step 2.
    I mention it just after the C code.

  13. Grant Pitzer
    September 17th, 2012 at 03:52 | #13


    I hope your high school teacher gave you an A++! Extremely impressive!!! You’ll go far, I’m sure we’ll be reading about you one day soon.

  14. September 18th, 2012 at 00:12 | #14

    @Grant Pitzer
    Thank you very much. I actually got the highest grade for the assignment (we use a different grading system in Denmark) 🙂

  15. Grant Pitzer
    September 20th, 2012 at 19:53 | #15


    I just received an email about this TRAX AHRS System. They advertise their use of the Kalman Filter.


    Keep up the good work!

  16. Vic
    October 17th, 2012 at 23:28 | #16

    Ok, I think I am missing something, but how do you obtain z (newAngle) in step 3?

    Do you have to integrate the unfiltered signal or do you need to have a some other source?


  17. October 17th, 2012 at 23:30 | #17

    It’s just the angle calculated using the accelerometers. See this example sketch on how to use it: https://github.com/TKJElectronics/Example-Sketch-for-IMU-including-Kalman-filter/blob/master/IMU6DOF/MPU6050/MPU6050.ino#L70

  18. Vic
    October 18th, 2012 at 00:23 | #18


    Thanks for the quick answer.

    But I am not sure how this would work in a helicopter for example. Wouldn’t this only work if the accelerometer was standing still (or moving with a constant speed)?

    accYangle = (atan2(accX,accZ)+PI)*RAD_TO_DEG;
    accXangle = (atan2(accY,accZ)+PI)*RAD_TO_DEG;

    If it was standing still it would register -\vec{g} but if it was on-board a vehicle, like a helicopter acelerating, it would register -\vec{g} + \vec{a}.

    Finally if I was design a kalman filter for accelerometer should I also use an external source like a gps or ultrasonic range sensor for z? How about if I don’t have any other sensor, just the IMU?

    Thanks again!

  19. October 18th, 2012 at 00:43 | #19

    No it should work in a helicopter as long as you don’t start to do loops and so on.
    That’s the hole point of this Kalman filter -- it combines the angle obtained from the accelerometer and the rate obtained from the gyroscope to calculate the pitch and roll.

    Remember that if you wan’t to use other inputs to the filter you’ll have to redefine the matrices.

    When you write IMU do you mean only the accelerometer or also the gyroscope?

  20. Vic
    October 18th, 2012 at 00:51 | #20

    I mean the accelerometer + gyroscope?


  21. October 18th, 2012 at 00:53 | #21

    The hole point of this post was to create the code to use with an accelerometer and gyroscope. It can be found at github: https://github.com/TKJElectronics/KalmanFilter as mentioned in the post.

    Here is a basic example on how to use it: https://github.com/TKJElectronics/Example-Sketch-for-IMU-including-Kalman-filter/blob/master/IMU6DOF/MPU6050/MPU6050.ino
    And here it is used for my balancing robot: https://github.com/TKJElectronics/BalancingRobotArduino/blob/master/BalancingRobotArduino.ino

  22. December 4th, 2012 at 16:42 | #22

    I admire the vital in rank you place forward in your articles. I will bookmark your blog and be inflicted with my family ranking try out up at this calculate evenly. I am practically guaranteed they will gather bags of new material at this calculate than any nature moreover!

  23. mich
    January 28th, 2013 at 17:57 | #23

    Can you explain, why one of the measurement from your system sensors goes into the “measurement” vector z, but the other measurement from your system sensors goes into the “input” vector u. Neither of these is a real control input into your system, or in the case of your balancing robot, is it ?

  24. January 31st, 2013 at 15:08 | #24

    I can’t explain exactly why I used the gyro as the control input and the accelerometer as the measurement.
    I just saw other people use that, so I decided to use it as well.

  25. dheeraj
    February 18th, 2013 at 18:17 | #25

    i want to know weather we can use MPU-6050-Triple Axis Accelerometer Gyro
    if yes how to connect it to arduino

  26. alejandroamez
    February 19th, 2013 at 07:39 | #27

    I’ll have to say this is probably one of the most easy to read kalman filter application article. Good job and keep it going! I’m just wondering whether this kalman filter will also apply to two-axis angles. So I’ll have two pairs of gyro-accelerometer fusion. Do I simply double the size of every matrix since they each pair still shares the same mechanism? Thanks in advance.

  27. February 20th, 2013 at 14:32 | #28

    Thank you for your kind words.

    So you want to use it to calculate both pitch and roll? It’s really simple just create two instances: https://github.com/TKJElectronics/Example-Sketch-for-IMU-including-Kalman-filter/blob/master/IMU6DOF/MPU6050/MPU6050.ino#L4-L5.

    And then you simply use them as they are two sperate libraries: https://github.com/TKJElectronics/Example-Sketch-for-IMU-including-Kalman-filter/blob/master/IMU6DOF/MPU6050/MPU6050.ino#L70-L71.

  28. Rafael
    February 25th, 2013 at 01:19 | #29

    Thank you for the explain.

    Why the scketch is only for two-axis angle?

    For the Z axis is the same as X and Y?

  29. February 25th, 2013 at 01:25 | #30

    The problem is that you can’t measure yaw using an accelerometer, so you can’t use the combination of an accelerometer and a gyroscope to calculate the yaw you will need something like a magnetometer to do that.

  30. Thamanoon
    February 26th, 2013 at 00:58 | #31

    Thank you for the explanation.
    I would like to ask you about kalmanExample how and why to set kX.setAngle(270);. I don’t understand some kind of calibration. I am new for gyro and acc.
    Thank agian.

  31. February 26th, 2013 at 17:31 | #32

    The reason why I set the angle at startup is because the filter takes time to stabilize, so if the angle started at 0 it would take some time before it it ended up at 270. This means that the output would actually be at 180 at some point, so my balancing robot would think that it’s actually standing up causing the motors to spin when it’s actually laying on the ground -- which is obviously not a good thing.

    Take a look at the front page of my assignment: http://www.tkjelectronics.dk/uploads/Kalman_SRP.zip (It’s the pdf named “Kristian Lauszus SRP 2011.pdf”). The blue line is the measured voltage while the red one is the estimated value using the Kalman filter, as you can see in that case it takes about 40 samples before the filter has stabilized.
    In a real situation you could then set the value to the first measurement, in this case around 300, which will reduce the time it takes the filter to stabilize.

  32. Thamanoon
    February 27th, 2013 at 05:54 | #33

    Thank you very much for your explanation. I’m more understand, will try to test and build my own balancing robot.

  33. Rafael
    February 27th, 2013 at 18:44 | #34

    Lauszus :@Rafael The problem is that you canâ??t measure yaw using an accelerometer, so you canâ??t use the combination of an accelerometer and a gyroscope to calculate the yaw you will need something like a magnetometer to do that.

    Ohhh ok, I understand ;D

    Another question.

    Why when I modify the scketch like this:



    Serial.print(“compY:”);Serial.print(compAngleY); Serial.print(“\t”);


    I only introduce a Tittles, but data vary widely!!!!


  34. February 27th, 2013 at 18:50 | #35

    I guess you are using this example: https://github.com/TKJElectronics/Example-Sketch-for-IMU-including-Kalman-filter/blob/master/IMU6DOF/MPU6050/MPU6050.ino, right? You are using a MPU-6050, right?
    Note that you should compare accXangle, gyroXangle, compAngleX etc.

  35. Rafael
    February 27th, 2013 at 22:53 | #36

    Lauszus :@Rafael I guess you are using this example: https://github.com/TKJElectronics/Example-Sketch-for-IMU-including-Kalman-filter/blob/master/IMU6DOF/MPU6050/MPU6050.ino, right? You are using a MPU-6050, right?Note that you should compare accXangle, gyroXangle, compAngleX etc.

    Sorry, yes I’m using MPU-6050 with this scketch.

    the data of kalAngleX or Y vary more than if I don’t Introduce a Tittle…why?

    Thaks for you attention

  36. February 28th, 2013 at 17:02 | #37

    Are you 100% sure that this is the case, and you didn’t change anything else?

  37. Rafael
    March 2nd, 2013 at 01:55 | #38

    Hello Lauszus, all it’s OK, sorry. I’m the worst!!!

  38. Afar
    March 2nd, 2013 at 02:06 | #39

    Hello, I’m in a project for measure angles with MPU-6050.
    When I move the GY-521 a little angle in X axis …OK, but when I move a long angle (> 45º), the Y angle vary… Why?

  39. March 2nd, 2013 at 16:04 | #40

    How are you calculating the angle?

  40. Rob
    March 4th, 2013 at 23:13 | #41

    Hi Lauszus,

    I’m trying to implement this code for roll control on an RC airplane. I have it running, but it is slow to calculate theta angles for quick disturbances. Anything simple I’m missing/ way to speed up the response time? Thanks!

  41. March 5th, 2013 at 14:09 | #42

    You could try to “trust” the accelerometer more by lowering the the Q_angle and R_measure variables.
    What IMU are you using? Please report back if you find some better values for your specific IMU, so I can share it with others 🙂

  42. Afar
    March 5th, 2013 at 22:16 | #43


    Exemple 1: I put the MPU-6050 horizontally (kalAngleX = 178 º, kalAngleY â??â??= 181 º), rotating in X axis, when I put in vertical (kalAngleX = 275 º, kalAngleY â??â??= 348 º).

    Exemple 2: I put the MPU-6050 horizontally (kalAngleX = 177 º, kalAngleY â??â??= 178 º), rotating in Y axis, when I put in vertical (kalAngleX = 272 º, kalAngleY â??â??= 38 º).

    Don’t understand why the oposite axis values â??â??vary! They vary in the last angles… I don’t undersand something…


  43. Rob
    March 10th, 2013 at 00:06 | #44

    Thanks for the quick response Lauszus! We found that using only simple low pass filter with only accelerometer data was good enough for roll control. I still want to try to speed the Kalman filter up for a future controller, just haven’t had enough time recently. I’ll be sure to post back if I figure it out.

  44. March 10th, 2013 at 00:19 | #45

    Okay. Glad you solved your problem 🙂

  45. kelvin
    March 12th, 2013 at 02:42 | #46

    Hi,May i noe hw kalman filter can be implement into Matlab coding?

  46. March 12th, 2013 at 03:04 | #47

    Check out this video series: http://youtu.be/FkCT_LV9Syk where he implements a Kalman filter in Matlab.

Comment pages
1 2 3 7 2868
  1. September 10th, 2012 at 19:47 | #1
  2. September 11th, 2012 at 06:08 | #2
  3. September 12th, 2012 at 01:10 | #3