ELEC/VUB doctoral school lectures on “Low-rank approximation and its applications”


Established data modeling approaches are often derived in a stochastic setting. An alternative deterministic approximation approach, known in the systems and control literature as the behavioral approach, has been developed since the 80's by Jan C. Willems and co-workers. The behavioral approach differentiates between the abstract notion of a model and the concrete notion of a model representation. This distinction proves to be important for developing a coherent theory and effective algorithms for system identification, analysis, and control. The course presents a behavioral approach to system identification.

The highlight of the course is the low-rank approximation problem, which is a practical tool for modeling in the behavioral setting. A matrix constructed from the data being rank deficient implies that there is an exact low complexity linear model for that data. Moreover, the rank of the data matrix corresponds to the complexity of the model. In the generic case when an exact low-complexity model does not exist, the aim is to find a model that fits the data approximately. The corresponding computational problem is low-rank approximation. In the case of linear time-invariant dynamical models, the data matrix is, in addition, Hankel structured and the approximation should have the same structure.

Once the approximate system identification problem is formulated as a low-rank approximation problem, it is solved by generic methods. Except for a few special cases, however, low-rank approximation problems are nonconvex and a global solution is expensive to compute. In the course, we present methods based on local optimization, which lead to fast and effective algorithms. The cost function evaluation has the system theoretic interpretation of Kalman smoothing.

In addition to the theory and algorithms for exact and approximate system identification, the course presents examples from system theory (model reduction and distance to uncontrollability), computer algebra (approximate common divisor computation), and machine learning (recommender systems). Software implementation of the developed methods makes the theory applicable in practice. An essential part of the course are exercises, which give hands-on experience with the presented theory and methods.

Keywords: linear algebra, system theory, system identification, numerical algorithms


  1. Classical vs behavioral paradigms for data modeling
    The classical paradigm for data modeling invariably assumes that an input\/output partitioning of the data is a priori given. For linear models, this paradigm leads to computational problems of solving approximately overdetermined systems of linear equations. Examples of most simple data fitting problems, however, suggest that the a priori fixed input\/output partitioning of the data may be inadequate: 1) the fitting criteria often depend implicitly on the choice of the input and output variables, which may be arbitrary, and 2) the resulting computational problems may be ill-conditioned. An alternative paradigm for data modeling, sometimes refered to as the behavioral paradigm, does not assume a priori fixed input\/output partitioning of the data. The corresponding computational problems involve approximation of a matrix constructed from the data by another matrix of lower rank. We will review applications in systems and control, signal processing, computer algebra, chemometrics, psychometrics, machine learning, and computer vision that lead to low-rank approximation problems. Finally, generic solution methods for solving low-rank approximation problems are outlined.

  2. Exact identification
    Three linear model representations---kernel, image, and input\/output---are defined and the connections among them are shown. The problem of obtaining a state space representation of the system is known as system realization. The transition from impulse response to state space representation is a special exact identification problem and is at the heart of the subspace identification methods. Then the problem of deriving a representation of the system from a general exact trajectory of that system is considered. A key condition for existence of a unique solution is persistency of excitation of an input component of the data. The result leads to algorithms for exact identification. First, we present algorithms for passing from data to an impulse response representation. Then, we show how a state sequence of the model can be obtained directly from the data.

  3. Approximate identification
    Approximate data modeling is a trade-off between model complexity and model accuracy. One particular scalarization of this biobjective problem is the low-rank approximation problem. Fundamentally different ways of measuring the model accuracy are the misfit and latency. Misfit leads to errors-in-variables problems and latency leads to the prediction error problems. Independent of the particular problem formulation, there are three conceptually different approaches for solving the problem: convex relaxations, local optimization, and global optimization. We present a method based on local optimization. A bilinear structure of the problem is effectively used to eliminate part of the optimization variables. The computation of the cost function has system theoretic interpretation of Kalman smoothing. The remaining problem is a nonlinear least squares optimization over the model parameters.