Before we start: almost everything we learn in school is about how to solve a problem. But many of the finest ideas actually come from the process of defining what the problem is.

Turning an unknown problem into a known one is the essence of learning, and of research too. The famous philosopher Wittgenstein once said,

“Whereof one cannot speak, thereof one must be silent” - Ludwig Wittgenstein

But the real problem is that we cannot define or represent an idea precisely for other people using natural language alone. Natural language is itself ambiguous and subjective, so it is a good medium for communicating within a group, but not a unified representation for communication.

Mathematics, however, exists only in the form of logic. Everything is done under systematic rules derived from a few axioms, which may be inferred from our own common sense (though in most cases they are not). For this reason, mathematics can define and represent an idea or problem so that it means the same thing to people who speak different languages, within the domain that mathematics covers.

flowchart TD
  A[The real world problem] -->|Mathematical Modeling| B[Mathematical problem]
  B -->|Try hard to solve something| C[Solution]
  C ---->|applying the solution to real world| A

(Reference: Linear Programming, lecture by Prof. Wen Shen)

Therefore, it is important to transform a real-world problem, which may be written in natural language or only felt intuitively by many people, into a mathematical setting.

Although this procedure gives, at best, the most logical solution(s) to the real-world problem, two questions remain:

  1. How do we represent a real-world problem as a mathematical problem?
  2. How do we solve the mathematical problem automatically?

Machine Learning focuses on the second question, and partly on the first, in order to solve real-world problems automatically; we call such an automatic procedure an algorithm. With an algorithm we can try to find an answer, but we still do not know whether an answer exists.

For this reason, we usually use optimisation techniques, which update a candidate solution iteratively using massive amounts of computation. This computation can run on the fastest consumer GPU, which can do 8 x floating-point operations in a single second.

In this setting we can find approximate solutions, and in most cases they work pretty well. Examples range from weather forecasting to Large Language Models such as ChatGPT, problems for which perfect analytical solutions are hard to find.
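To make the idea of an iteratively updated, approximate solution concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration: the toy data, the one-parameter model `y = w * x`, the function names, the learning rate, and the number of steps. The problem is deliberately small enough that an exact answer exists, so the approximation can be checked against it.

```python
# Hypothetical toy data (made-up values, for illustration only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]


def mean_squared_error(w):
    """Error of the model y = w * x, written in mathematical form."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def solve_iteratively(w=0.0, learning_rate=0.01, steps=500):
    """Repeatedly nudge w in the direction that reduces the error."""
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


# This toy problem has an exact (analytical) answer, so we can compare.
w_exact = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
w_approx = solve_iteratively()

print(f"exact:       {w_exact:.6f}")
print(f"approximate: {w_approx:.6f}")
```

On a problem this small the two values agree closely. The iterative route becomes interesting when no exact formula is available and all we can do is measure the error and keep reducing it.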

This long and wordy introduction is meant to introduce what we will learn in this course:

  • How to solve a problem whose error can be measured in mathematical form.
  • How to set up a problem so that we can find appropriate solutions to real-world problems.

I hope you enjoyed reading! We will discuss gradient-based optimisation next time.