How it works...

In the preceding example, the GrabCut algorithm was able to extract the foreground object once we had simply specified a rectangle containing the object of interest (the castle). Alternatively, one could assign the values cv::GC_BGD and cv::GC_FGD to specific pixels of the input image; these labels are provided through a mask image passed as the second argument of the cv::grabCut function. You would then specify cv::GC_INIT_WITH_MASK as the input mode flag. These input labels could be obtained, for example, by asking a user to interactively mark a few regions of the image. It is also possible to combine these two input modes.
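To make the combination of the two input modes concrete, here is a minimal sketch of how such a label mask could be prepared. It is written in plain C++ without OpenCV so that it stands on its own: the Label values mirror cv::GC_BGD, cv::GC_FGD, cv::GC_PR_BGD, and cv::GC_PR_FGD, while Box, Point, and makeInitialMask are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Values mirror OpenCV's cv::GC_BGD, cv::GC_FGD, cv::GC_PR_BGD, cv::GC_PR_FGD.
enum Label : std::uint8_t { BGD = 0, FGD = 1, PR_BGD = 2, PR_FGD = 3 };

// Hypothetical helper types for this sketch (not part of OpenCV).
struct Box { int x, y, width, height; };
using Point = std::pair<int, int>;  // (x, y)

// Builds a row-major label mask of size width x height:
//  - pixels outside the box are definite background,
//  - pixels inside the box are "probably foreground",
//  - user marks override both with definite labels.
std::vector<std::uint8_t> makeInitialMask(int width, int height, const Box& box,
                                          const std::vector<Point>& bgMarks,
                                          const std::vector<Point>& fgMarks)
{
    assert(width > 0 && height > 0);
    std::vector<std::uint8_t> mask(
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height), BGD);

    auto inside = [&](int x, int y) {
        return x >= 0 && x < width && y >= 0 && y < height;
    };
    auto at = [&](int x, int y) -> std::uint8_t& {
        return mask[static_cast<std::size_t>(y) * static_cast<std::size_t>(width)
                    + static_cast<std::size_t>(x)];
    };

    // Rectangle input: everything inside is tentatively foreground.
    for (int y = box.y; y < box.y + box.height; ++y)
        for (int x = box.x; x < box.x + box.width; ++x)
            if (inside(x, y)) at(x, y) = PR_FGD;

    // Mask input: definite labels supplied by the user (marks off the image are ignored).
    for (const Point& p : bgMarks)
        if (inside(p.first, p.second)) at(p.first, p.second) = BGD;
    for (const Point& p : fgMarks)
        if (inside(p.first, p.second)) at(p.first, p.second) = FGD;

    return mask;
}
```

With OpenCV, the same labels would be written into an 8-bit single-channel cv::Mat and passed as the second argument of cv::grabCut together with the cv::GC_INIT_WITH_MASK flag.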

Using this input information, the GrabCut algorithm creates the background/foreground segmentation by proceeding as follows. Initially, a foreground label (cv::GC_PR_FGD) is tentatively assigned to all the unmarked pixels. Based on the current classification, the algorithm groups the pixels into clusters of similar colors (that is, K clusters for the background and K clusters for the foreground). The next step is to determine a background/foreground segmentation by introducing boundaries between the foreground and background pixels. This is done through an optimization process that favors giving the same label to neighboring pixels of similar color, and that imposes a penalty for placing a boundary in regions of relatively uniform intensity. This optimization problem can be solved efficiently using the Graph Cuts algorithm, a method that finds the optimal solution by representing the problem as a connected graph and then cutting this graph into two parts at minimal cost. The resulting segmentation produces new labels for the pixels.
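The boundary penalty can be illustrated with a small sketch. It assumes the exponential contrast term commonly used in GrabCut formulations, where the cost of separating two neighboring pixels is gamma * exp(-beta * d^2) and d is their color distance. The names Color, squaredColorDistance, and boundaryPenalty are hypothetical, and the values of beta and gamma are illustrative assumptions rather than values taken from this recipe.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Color = std::array<double, 3>;  // one pixel, three channels

// Squared Euclidean distance between two colors.
double squaredColorDistance(const Color& a, const Color& b)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Cost of placing a foreground/background boundary between two neighbors.
// Similar colors (uniform region) -> cost close to gamma (expensive cut).
// Very different colors (strong edge) -> cost close to zero (cheap cut).
double boundaryPenalty(const Color& a, const Color& b,
                       double beta, double gamma = 50.0)
{
    assert(beta >= 0.0 && gamma >= 0.0);
    return gamma * std::exp(-beta * squaredColorDistance(a, b));
}
```

Because the graph cut looks for the cheapest set of links to sever, a penalty of this shape steers the boundary toward strong color edges and away from uniform areas.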

The clustering process can then be repeated and a new optimal segmentation found, and so on. The GrabCut algorithm is therefore an iterative procedure that gradually improves the segmentation result. Depending on the complexity of the scene, a good solution may require more or fewer iterations (in easy cases, one iteration is enough).

This explains the function argument that lets the user specify the number of iterations to apply. The two internal models maintained by the algorithm are passed as arguments to the function and are updated on return. It is therefore possible to call the function again with the models from the last run if one wishes to improve the segmentation result by performing additional iterations.
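The iterate-and-resume pattern described above can be sketched with a deliberately simplified one-dimensional analogue. This is not the GrabCut algorithm itself: each class is modeled by a single mean intensity instead of K color clusters, and the graph-cut step is replaced by a plain nearest-mean decision. Models, learnModels, relabel, and iterate are hypothetical names; only the label values mirror OpenCV's cv::GC_* constants.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Values mirror OpenCV's cv::GC_BGD, cv::GC_FGD, cv::GC_PR_BGD, cv::GC_PR_FGD.
enum Label : std::uint8_t { BGD = 0, FGD = 1, PR_BGD = 2, PR_FGD = 3 };

// Simplified stand-in for the two internal models: one mean per class.
struct Models { double bgMean = 0.0; double fgMean = 0.0; };

bool isForeground(std::uint8_t label) { return label == FGD || label == PR_FGD; }

// Step 1: re-estimate both models from the current labeling.
void learnModels(const std::vector<double>& px,
                 const std::vector<std::uint8_t>& labels, Models& m)
{
    double bgSum = 0.0, fgSum = 0.0;
    int bgCount = 0, fgCount = 0;
    for (std::size_t i = 0; i < px.size(); ++i) {
        if (isForeground(labels[i])) { fgSum += px[i]; ++fgCount; }
        else                         { bgSum += px[i]; ++bgCount; }
    }
    if (bgCount > 0) m.bgMean = bgSum / bgCount;
    if (fgCount > 0) m.fgMean = fgSum / fgCount;
}

// Step 2: relabel the "probable" pixels; definite labels are never touched.
// Returns the number of pixels whose label changed.
int relabel(const std::vector<double>& px,
            std::vector<std::uint8_t>& labels, const Models& m)
{
    int changed = 0;
    for (std::size_t i = 0; i < px.size(); ++i) {
        if (labels[i] == BGD || labels[i] == FGD) continue;
        const bool fg = std::fabs(px[i] - m.fgMean) < std::fabs(px[i] - m.bgMean);
        const std::uint8_t updated = fg ? PR_FGD : PR_BGD;
        if (updated != labels[i]) { labels[i] = updated; ++changed; }
    }
    return changed;
}

// Runs iterCount rounds; labels and models are both input and output, so a
// later call continues from where the previous one stopped.
// Returns the number of labels changed in the last round.
int iterate(const std::vector<double>& px, std::vector<std::uint8_t>& labels,
            Models& models, int iterCount)
{
    assert(px.size() == labels.size());
    int changed = 0;
    for (int it = 0; it < iterCount; ++it) {
        learnModels(px, labels, models);
        changed = relabel(px, labels, models);
    }
    return changed;
}
```

Calling iterate a second time with the same labels and models simply continues the refinement, and a return value of zero indicates that the labeling has stabilized. With cv::grabCut, a follow-up call of this kind would reuse the mask and the two model matrices from the first call and would normally pass the cv::GC_EVAL mode flag.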