Notation
"Words offer the means to meaning, and for those who will listen, the enunciation of truth" ~ V for Vendetta
Notation is the vocabulary of mathematics, and one I confess I have never mastered. Given my ambitions to get at least somewhat cracked at ML/AI stuff, I need to get better at this first.
Below is the notation table I extracted from the *Mathematics for Machine Learning* book by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong.
I will be using this as a reference to get better at understanding the notation used in ML/AI.
Table of Symbols
| Symbol | Typical meaning |
|---|---|
| $a, b, c, \alpha, \beta, \gamma$ | Scalars are lowercase |
| $\boldsymbol{x}, \boldsymbol{y}, \boldsymbol{z}$ | Vectors are bold lowercase |
| $\boldsymbol{A}, \boldsymbol{B}, \boldsymbol{C}$ | Matrices are bold uppercase |
| $\boldsymbol{x}^\top, \boldsymbol{A}^\top$ | Transpose of a vector or matrix |
| $\boldsymbol{A}^{-1}$ | Inverse of a matrix |
| $\langle \boldsymbol{x}, \boldsymbol{y} \rangle$ | Inner product of $\boldsymbol{x}$ and $\boldsymbol{y}$ |
| $\boldsymbol{x}^\top \boldsymbol{y}$ | Dot product of $\boldsymbol{x}$ and $\boldsymbol{y}$ |
| $B = (\boldsymbol{b}_1, \boldsymbol{b}_2, \boldsymbol{b}_3)$ | (Ordered) tuple |
| $\boldsymbol{B} = [\boldsymbol{b}_1, \boldsymbol{b}_2, \boldsymbol{b}_3]$ | Matrix of column vectors stacked horizontally |
| $\mathcal{B} = \{\boldsymbol{b}_1, \boldsymbol{b}_2, \boldsymbol{b}_3\}$ | Set of vectors (unordered) |
| $\mathbb{Z}, \mathbb{N}$ | Integers and natural numbers, respectively |
| $\mathbb{R}, \mathbb{C}$ | Real and complex numbers, respectively |
| $\mathbb{R}^n$ | $n$-dimensional vector space of real numbers |
| $\forall x$ | Universal quantifier: for all $x$ |
| $\exists x$ | Existential quantifier: there exists $x$ |
| $a := b$ | $a$ is defined as $b$ |
| $a =: b$ | $b$ is defined as $a$ |
| $a \propto b$ | $a$ is proportional to $b$, i.e., $a = \text{constant} \cdot b$ |
| $g \circ f$ | Function composition: "$g$ after $f$" |
| $\Leftrightarrow$ | If and only if |
| $\Rightarrow$ | Implies |
| $\mathcal{A}, \mathcal{C}$ | Sets |
| $a \in \mathcal{A}$ | $a$ is an element of set $\mathcal{A}$ |
| $\emptyset$ | Empty set |
| $\mathcal{A} \setminus \mathcal{B}$ | $\mathcal{A}$ without $\mathcal{B}$: the set of elements in $\mathcal{A}$ but not in $\mathcal{B}$ |
| $D$ | Number of dimensions; indexed by $d = 1, \dots, D$ |
| $N$ | Number of data points; indexed by $n = 1, \dots, N$ |
| $\boldsymbol{I}_m$ | Identity matrix of size $m \times m$ |
| $\boldsymbol{0}_{m,n}$ | Matrix of zeros of size $m \times n$ |
| $\boldsymbol{1}_{m,n}$ | Matrix of ones of size $m \times n$ |
| $\boldsymbol{e}_i$ | Standard/canonical vector (where $i$ is the component that is $1$) |
| $\dim$ | Dimensionality of vector space |
| $\operatorname{rk}(\boldsymbol{A})$ | Rank of matrix $\boldsymbol{A}$ |
| $\operatorname{Im}(\Phi)$ | Image of linear mapping $\Phi$ |
| $\ker(\Phi)$ | Kernel (null space) of a linear mapping $\Phi$ |
| $\operatorname{span}[\boldsymbol{b}_1]$ | Span (generating set) of $\boldsymbol{b}_1$ |
| $\operatorname{tr}(\boldsymbol{A})$ | Trace of $\boldsymbol{A}$ |
| $\det(\boldsymbol{A})$ | Determinant of $\boldsymbol{A}$ |
| $\lvert \cdot \rvert$ | Absolute value or determinant (depending on context) |
| $\lVert \cdot \rVert$ | Norm; Euclidean, unless specified |
| $\lambda$ | Eigenvalue or Lagrange multiplier |
| $E_\lambda$ | Eigenspace corresponding to eigenvalue $\lambda$ |
| $\boldsymbol{x} \perp \boldsymbol{y}$ | Vectors $\boldsymbol{x}$ and $\boldsymbol{y}$ are orthogonal |
| $V$ | Vector space |
| $V^\perp$ | Orthogonal complement of vector space $V$ |
| $\sum_{n=1}^{N} x_n$ | Sum of the $x_n$: $x_1 + \dots + x_N$ |
| $\prod_{n=1}^{N} x_n$ | Product of the $x_n$: $x_1 \cdots x_N$ |
| $\boldsymbol{\theta}$ | Parameter vector |
| $\frac{\partial f}{\partial x}$ | Partial derivative of $f$ with respect to $x$ |
| $\frac{\mathrm{d}f}{\mathrm{d}x}$ | Total derivative of $f$ with respect to $x$ |
| $\nabla$ | Gradient |
| $f_* = \min_x f(x)$ | The smallest function value of $f$ |
| $x_* \in \arg\min_x f(x)$ | The value $x_*$ that minimizes $f$ (note: $\arg\min$ returns a set of values) |
| $\mathfrak{L}$ | Lagrangian |
| $\mathcal{L}$ | Negative log-likelihood |
| $\binom{n}{k}$ | Binomial coefficient, $n$ choose $k$ |
| $\mathbb{V}_X[\boldsymbol{x}]$ | Variance of $\boldsymbol{x}$ with respect to the random variable $X$ |
| $\mathbb{E}_X[\boldsymbol{x}]$ | Expectation of $\boldsymbol{x}$ with respect to the random variable $X$ |
| $\operatorname{Cov}_{X,Y}[\boldsymbol{x}, \boldsymbol{y}]$ | Covariance between $\boldsymbol{x}$ and $\boldsymbol{y}$ |
| $X \perp\!\!\!\perp Y \mid Z$ | $X$ is conditionally independent of $Y$ given $Z$ |
| $X \sim p$ | Random variable $X$ is distributed according to $p$ |
| $\mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ | Gaussian distribution with mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$ |
| $\operatorname{Ber}(\mu)$ | Bernoulli distribution with parameter $\mu$ |
| $\operatorname{Bin}(N, \mu)$ | Binomial distribution with parameters $N, \mu$ |
| $\operatorname{Beta}(\alpha, \beta)$ | Beta distribution with parameters $\alpha, \beta$ |
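
To help the table stick, here is a small NumPy sketch (my own, not from the book) showing how a handful of these symbols map to code; the arrays `x`, `y`, and `A` are arbitrary examples.

```python
import numpy as np

# Arbitrary example data: vectors x, y and an invertible matrix A.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 4.0]])

print(x @ y)                     # x^T y: dot product of x and y
print(np.linalg.norm(x))         # ||x||: Euclidean norm of x
print(A.T)                       # A^T: transpose of A
print(np.linalg.inv(A))          # A^{-1}: inverse of A
print(np.eye(3))                 # I_3: identity matrix of size 3 x 3
print(np.zeros((2, 3)))          # 0_{2,3}: matrix of zeros of size 2 x 3
print(np.linalg.matrix_rank(A))  # rk(A): rank of matrix A
print(np.trace(A))               # tr(A): trace of A
print(np.linalg.det(A))          # det(A): determinant of A
print(np.linalg.eigvals(A))      # the eigenvalues (lambdas) of A
print(np.sum(x), np.prod(x))     # sum and product of the x_n
```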
Table of Abbreviations and Acronyms
| Acronym | Meaning |
|---|---|
| e.g. | Exempli gratia (Latin: for example) |
| GMM | Gaussian mixture model |
| i.e. | Id est (Latin: this means) |
| i.i.d. | Independent, identically distributed |
| MAP | Maximum a posteriori |
| MLE | Maximum likelihood estimation/estimator |
| ONB | Orthonormal basis |
| PCA | Principal component analysis |
| PPCA | Probabilistic principal component analysis |
| REF | Row-echelon form |
| SPD | Symmetric, positive definite |
| SVM | Support vector machine |