- MATLAB for Machine Learning
- Giuseppe Ciaburro
- 331字
- 2025-04-04 18:32:34
Supervised learning
Supervised learning is a machine learning technique that aims to program a computer system so that it can resolve the relevant tasks automatically. To do this, the input data is included in a set I, (typically vectors). Then the set of output data is fixed as set O, and finally it defines a function f that associates each input with the correct answer. Such information is called a training set. This workflow is presented in the following figure:

All supervised learning algorithms are based on the following thesis: if an algorithm provides an adequate number of examples, it will be able to create a derived function B that will approximate the desired function A.
If the approximation of the desired function is adequate, when the input data is offered to the derived function, this function should be able to provide output responses similar to those provided by the desired function and then acceptable.These algorithms are based on the following concept: similar inputs correspond to similar outputs.
Generally, in the real-world, this assumption is not valid; however, some situations exist in which it is acceptable. Clearly, the proper functioning of such algorithms depends significantly on the input data. If there are only a few training inputs, the algorithm might not have enough experience to provide a correct output. Conversely, many inputs may make it excessively slow since the derivative function generated by a large number of inputs could be very complicated.
Moreover, experience shows that this type of algorithm is very sensitive to noise; even a few pieces of incorrect data can make the entire system unreliable and lead to wrong decisions.
In supervised learning, it's possible to split problems based on the nature of the data. If the output value is categorical, such as membership/non-membership to a certain class, it is a classification problem. If the output is a continuous real value in a certain range, then it is a regression problem.