(Replying to PARENT post)

I've always wanted to deeply, fundamentally understand a kalman filter, but I usually get lost in the math notation that comes up without the plain language explanation of what that math means.
๐Ÿ‘คbfrog๐Ÿ•‘4y๐Ÿ”ผ0๐Ÿ—จ๏ธ0

(Replying to PARENT post)

For an intuitive understanding, try to build one in one dimension. Imagine you have a position that's drifting at a roughly constant, but noisy, rate, modeled by a normal distribution. You also have an occasional 1D GPS reading about that position. As building blocks, check out conjugate priors, specifically normal-normal [1] and the algebra of random variables, specifically the sum of normally distributed variables [2].

Briefly:

If you have a normal distribution as a prior and a normal distribution as new evidence, then applying Bayes Theorem on them results in a new normal distribution (times a constant we'll ignore). The calculation can actually be pretty simple - the mean is a weighted average, weighted by precision (precision is 1/variance), and the precision is the sum of the two precisions.

If you have two normal distributions modeling unknown variables a and b, and you know x=a+b, then you can model your knowledge of x as another normal distribution. The calculation for this one is also simple - its mean is the sum of the two means (intuitively has to be) and its variance is the sum of the two variances (I've seen several math books say it's surprising. It's very convenient though).

So you have all the pieces you need to do (new state guess = old state knowledge + noise from passage of time) and (new state knowledge = new state guess | new observation), where all the variables are normal distributions, and the new knowledge ends up being a normal distribution too, so you can use it as the old knowledge next time you get information.

[1] https://en.wikipedia.org/wiki/Conjugate_prior#When_likelihoo...

[2] https://en.wikipedia.org/wiki/Sum_of_normally_distributed_ra...

๐Ÿ‘คjtolmar๐Ÿ•‘4y๐Ÿ”ผ0๐Ÿ—จ๏ธ0

(Replying to PARENT post)

At the risk of sounding pompous and self-serving, try my book, I tried really hard to give the plain language explanation, and to build knowledge through intuition and thought experiments.

https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Pyt...

๐Ÿ‘คRogerL๐Ÿ•‘4y๐Ÿ”ผ0๐Ÿ—จ๏ธ0

(Replying to PARENT post)

The only way is to derive it.

The KF maximizes the likelihood of the observations..

You can easily form the optimization problem, "what state x maximizes the likelihood of my observations z". I had a writeup to this effect in my class notes from Prof Roumeliotis' class. I'll see if I can post it, but you could probably derive it following the KF wiki page.

๐Ÿ‘คjvanderbot๐Ÿ•‘4y๐Ÿ”ผ0๐Ÿ—จ๏ธ0

(Replying to PARENT post)

Ask me a question or two about particular math notations and I will try to answer it.

I'll just go ahead and explain indices and summation in the meantime already because these are so super-common that you basically cannot read math things without knowing them. I will denote things LaTeX-like, that is subscript with _ and superscript with ^, if multiple things go in there, they will be enclosed in {}.

---

First: indices, usually written as sub-script, are super-common. They might be separated by a comma if there are more than one (but they do not have to be. In that case, you need to check which variables make sense as indices.) Usually variables are just single letters and not superWordyDescriptionsOfThingsBecauseCodeCompletionIsDifficultWithPenAndPaper.

Examples:

a_i is like a[i]

m_{ij} is element at position i, j in matrix M, i.e. ~ M[i][j]

---

Summation is also very common, denoted via symbol ฮฃ . That's the upper-case greek letter "sigma". This one will have a few things accompanying it: below it is a variable name and most likely an assignment of a value to it. That's the first value that the variable will take. Above, the last value the variable should have. The expression behind the symbol is evaluated for each of these values and all integers in between and summed up. (It might also appear with just the varaible name below it, in which case it means: sum over all values of the variable).

Example: Summing up the squares of all integers from 3 to 6

ฮฃ_{i=3}^6 i^2

(Properly rendered here: https://wikimedia.org/api/rest_v1/media/math/render/svg/a60f... )

Pseudocode for this is:

  for(i = 3; i <= 6; i++)
    sum += i^2   
---

Multiplication: there is also a similiar one for multiplication that I will include because it is so easily explained when having discussed summation already. It's denoted by the greek upper-case letter ฮ  (Pi)

ฮ  works like ฮฃ but the results of the expressions are multiplied instead.

Example: ฮ _{i=3}^6 i^2 = 3^2 * 4^2 * 5^2 * 6^2

Pseudocode:

  for(i = 3; i <= 6; i++)
    product *= i^2
This is only meant to help you translate a few formulas into forms you might easier work with. Following a university math course for a single semester or maybe two might get you a long way as well.
๐Ÿ‘ค_Microft๐Ÿ•‘4y๐Ÿ”ผ0๐Ÿ—จ๏ธ0