General Lecture Notes
On this page, we collect Lecture Notes that we have created in the context of LAMARR (formely ML2R) and other teaching activities.
ML2R Coding Nuggets
-
Solving Linear Programming Problems by Pascal Welke and Christian Bauckhage.
This note discusses how to solve linear programming problems with SciPy. As a practical use case, we consider the task of computing the Chebyshev center of a bounded convex polytope.
-
Linear Programming for Robust Regression by Pascal Welke and Christian Bauckhage.
Having previously discussed how scipy allows us to solve linear programs, we can study further applications of linear programming. Here, we consider least absolute deviation regression and solve a simple parameter estimation problem deliberately chosen to expose potential pitfalls in using scipy's optimization functions.
-
Sorting as Linear Programming by Christian Bauckhage and Pascal Welke.
Linear programming is a surprisingly versatile tool. That is, many problems we would not usually think of in terms of a linear programming problem can actually be expressed as such. In this note, we show that sorting is such a problem and discuss how to solve linear programs for sorting using SciPy.
-
Sorting as Quadratic Unconstrained Binary Optimization Problem by Christian Bauckhage and Pascal Welke.
Having previously considered sorting as a linear programming problem, we now cast it as a quadratic unconstrained binary optimization problem (QUBO). Deriving this formulation is a bit cumbersome but it allows for implementing neural networks or even quantum computing algorithms that sort. Here, however, we consider a simple greedy QUBO solver and implement it using Numpy.
-
Numerically Solving the Schrödinger Equation (Part 1) by Christian Bauckhage.
Most quantum mechanical systems cannot be solved analytically and therefore require numerical solution strategies. In this note, we consider a simple such strategy and discretize the Schrödinger equation that governs the behavior of a one-dimensional quantum harmonic oscillator. This leads to an eigenvalue / eigenvector problem over finite matrices and vectors which we then implement and solve using standard NumPy functions.
-
Numerically Solving the Schrödinger Equation (Part 2) by Christian Bauckhage.
We revisit the problem of numerically solving the Schrödinger equation for a one-dimensional quantum harmonic oscillator. We reconsider our previous finite difference scheme and discuss how higher order finite differences can lead to more accurate solutions. In particular, we will consider a five point stencil to approximate second order derivatives and implement the approach using SciPy functions for sparse matrices.
-
Solving the Single Unit Oja Flow by Christian Bauckhage, Sebastian Müller and Fabrice Beaumont.
Ojas’ rule for neural principal component learning has a continuous analog called the Oja flow. This is a gradient flow on the unit sphere whose equilibrium points indicate the principal eigenspace of the training data. We briefly discuss characteristics of this flow and show how to solve its differential equation using SciPy.
-
Solving Least Squares Gradient Flows by Christian Bauckhage and Pascal Welke.
We approach least squares optimization from the point of view of gradient flows. As a practical example, we consider a simple linear regression problem, set up the corresponding differential equation, and show how to solve it using SciPy.
-
Reproducible Machine Learning Experiments by Lukas Pfahler, Alina Timmermann, and Katharina Morik.
The scientific areas of artificial intelligence and machine learning are rapidly evolving and their scientific discoveries are drivers of scientific progress in areas ranging from physics or chemistry to life sciences and humanities. But machine learning is facing a reproducibility crisis that is clashing with the core principles of the scientific method: With the growing complexity of methods, it is becoming increasingly difficult to independently reproduce and verify published results and fairly compare methods. One possible remedy is maximal transparency with regard to the design and execution of experiments. For this purpose, best practices for handling machine learning experiments are summarized in this Coding Nugget. In addition, a convenient and simple library for tracking of experimental results, meticulous-ml, is being introduced in the final hands-on section.
-
AdaBoost with Pre-Trained Hypotheses by Christian Bauckhage.
In preparation for things to come, we discuss the general ideas behind AdaBoost (for binary classifier training) and present efficient NumPy code for boosting pre-trained weak hypotheses.
-
Intersection String Kernels for Language Processing by Christian Bauckhage.
This is the first in a miniseries of notes on kernel methods for language processing. We discuss the idea of measuring n-gram similarities of words by computing intersection string kernels and demonstrate that the Python standard library allows for compact implementations of this idea.
-
Kernel PCA for Word Embeddings by Christian Bauckhage.
We address the general problem of computing word embeddings and discuss a simple yet powerful solution involving intersection string kernels and kernel principal component analysis. We discuss the theory behind kernel PCA for word embeddings and present corresponding Python / NumPy code. Overall, we demonstrate that the whole framework is very easy to implement.
-
SVM Training Using 16 Lines of Plain Vanilla NumPy Code by Christian Bauckhage.
We consider L2 support vector machines for binary classification. These are as robust as other kinds of SVMs but can be trained almost effortlessly. Indeed, having previously derived the corresponding dual training problem, we now show how to solve it using the Frank-Wolfe algorithm. In short, we show that it requires only a few lines of plain vanilla NumPy code to train an SVM.
-
Greedy Set Cover with Native Python Data Types by Christian Bauckhage.
In preparation for things to come, we discuss a plain vanilla Python implementation of “the” greedy approximation algorithm for the set cover problem.
-
Greedy Set Cover with Binary NumPy Arrays by Christian Bauckhage.
We revisit the minimum set cover problem and formulate it as an integer linear program over binary indicator vectors. Next, we simply adapt our earlier code for greedy set covering to indicator vector representations.
-
Faster QUBO Brute-Force Solving by Sascha Mücke.
This article describes an improved brute-force solving strategy for Quadratic Unconstrained Binary Optimization (QUBO) problems that is faster than naive approaches and easily parallelizable. The implementation in Python is discussed in detail, and an additional C implementation is provided.