- Multimedia Programming Using Max-MSP and TouchDesigner
- Patrik Lechner
- 2234字
- 2021-08-05 17:21:21
When to use Max
If you think of our previous different representations, you might notice that the last version might be the one that could be created in the fastest fashion. It's simply faster to type the following than it is to put objects in a Max patch, hopefully in a tidy way, and connect them:
out1 = (in1+1)*0.5-0.2
If we know exactly what we want to achieve and how to achieve it, meaning we have a picture in our minds of all operations needed to accomplish a calculation, we will typically be faster in a text-programming language than in a graphical one. However, as soon as there is some doubt about how we want to do things, or what our objective really is, a graphical programming language that also doesn't need to compile each time we want to test the result will be more inspiring and faster. We can just try out things a lot quicker, which might inspire us. If you think about experimental music for example, the word suggests it's all about trying things out, doing experiments. With Max, we get our results really fast.
A word of caution should be said though. If we are working in Max, the target is often an aesthetic one, be it music, video art, or dancing robots. If we do so, there is often a fair amount of technical interest or necessity that drives us; otherwise, we could have done the job in a higher-level application. A very common danger is to lose the target of creating beautiful things while programming the tools for them night and day. In this regard, Max is also more dangerous than a Digital Audio Workstation (DAW) but a lot less than traditional programming languages.
It's hard to find general rules when the Max/MSP/Jitter package is the best tool for a problem, mainly because it is highly individualistic. If you just started Max and are a Java professional, it does not make sense to recommend using Max for everything it can do. Mostly, we use Max when it's the most efficient solution, meaning we can get the task at hand done in the fastest way and with the most satisfying overall result within Max. However, there are problems that have a structure that is very close to a Max way of thinking and others that don't. Real-time signal-processing certainly is one of the strengths of Max since it generally follows a block diagram form and is optimized for real-time processing. Don't forget that Max is designed to do signal-processing particularly well. Also, the ease with which we can often design an appealing GUI is an advantage. Problems that are easily solved in other programming languages (but partly can be done in Max) include recursive algorithms, problems that ask for object-oriented programming, database management, optimization problems, large-scale logical systems, and web-related problems. Some of these can of course be solved with a different language and can be embedded within Max.
Max – the message domain
Max is only one part of what you get when you purchase it. The full package actually consists of Max, MSP, Jitter, and Gen (optional). Max was the first building block and will usually serve as our infrastructure and control surface. It will handle data flying through our patches and sequence events, control the others (MSP, Jitter, and Gen), handle user input, handle the GUI, and so to speak the desk on which everything else is lying around. We'll now go through a simple Max patch, and later, a similar MSP and Jitter one, and we'll see what these are good at. For everything else in general, it's a good idea to use plain Max within the Max/MS/Jitter universe. Let's consider our first Max-only patch:
You could say that this patch is made up of the following three elements:
- Input (a button we can click on)
- Processing
- Output (the float number box,
[flonum]
, at the bottom that lets us see the result)
So [counter] counts how many times we clicked on the button; then, we do some calculations on that number (you might see that it's the same calculation we looked at before) and output the result to the screen.
Note
Notice that this patch isn't doing anything unless we click the button. This is an important concept and a major difference between the Max realm and the MSP and Gen~ domains.
We call the message that comes out of the button a bang. It is nothing more than an event. You could say it is how one Max object (one of the boxes) shouts to another: now! and the receiving object knows what to do if we patched everything together correctly. In the example given along with the counter, the bang message that comes from the button
object simply tells the counter to increase the number output by one. You will meet this event-based concept everywhere in Max and it will be worthwhile to understand it thoroughly and have it in the back of your head while programming, but we will come to this again later in Chapter 2, Max Setup and Basics and in Chapter 3, Advanced Programming Techniques in Max.
Max Signal Processing
Without further investigation, we'll dive into analogies for the example we just looked at in all the other worlds; first, let's look at Max Signal Processing (MSP):
As you can see, we are still doing the same small calculation. MSP is the audio part of Max, and so all the operations that we are applying are suited for audio signals but can be used for all sorts of things, such as doing this calculation. You can recognize an MSP object by its little tilde (~) at the end of its name. The tilde isn't used a lot in many languages, which results in partly strange locations on computer keyboards. Please refer to the English Wikipedia entry on the tilde to find a key combination to create a tilde on your keyboard.
This time, our input is an [adc~ 1]
object. The adc
term stands for analog to digital conversion; it's just our audio input. Beware, this is a tricky patcher; I've hidden many things for didactical reasons, but if you wish, go ahead and look at the patcher itself (by double-clicking on p counter). Instead of monitoring whether a button has been pushed, we are checking whether the audio input level is high. Essentially, you can clap your hands instead of pushing a button (this is a very primitive clap detector though). Again, we count how often we clapped, process that count, and output a number.
The important difference to the Max version is that we are always dealing with streams of numbers in MSP, namely we are getting sample rate number of values every second, even if nothing is happening, for example, adding two constants. Refer to the following screenshot:
Here, we are adding 0 and 0 and as a result, we get 0 of course. Seems like an inexpensive process regarding CPU right? Well it is, but bear in mind that we are calculating 0 + 0 = 0 44,100 times per second in my case of a 44.1 kHz sample rate. So, the takeaway message here really is that in MSP, if we have a static input, it doesn't mean that we are not processing, so it doesn't mean we are not crunching numbers all the time. We will learn how to deal with this problem using the [poly~]
object in a later chapter.
This difference in the Max world is one reason why this simple patch became complicated (the things I have hidden in the screenshot). To really stick to the analogy instead of outputting a number, we can have an output sound, for example, a sine wave with the resulting number as audible sound, but let's keep things simple here for now.
To sum it up, we can say that we use MSP when we want to process audio (there are exceptions, though). However, there are other situations in which we would want to use MSP due to the way in which it treats data. MSP processes floating-point numbers with a 64-bit resolution called double precision, whereas Max represents floats with 32-bits (the MSP [buffer~]
object also stores its values only with 32-bits). So we essentially have more precision and can represent both bigger and smaller numbers in MSP than in Max. Also, and this is maybe even more important, we not only have a bigger resolution of our values but in time as well. The default scheduler interval of Max runs at 1 ms, so at 1000 Hz it is able to represent signals with a maximum frequency of 500 Hz, in theory. Without going into that theory too deep (Sampling theory and the Nyquist rate), you can imagine that if we process 1,000 values per second, we can't work with signals at higher frequencies. However, that's the job of MSP, and we tend to use MSP for all time-critical processes such as drum sequencers and everything where timing should be really tight. We'll use MSP simply for all high frequency (≥ 100 Hz might be a good border here) data that we manage to create in or get into our software.
Jitter, Matrix, and video processing
Let's take a look at the Jitter version:
Don't be afraid! I know it looks complicated and we won't go over every detail since it's needless to understand everything at this point. The difficulty here is only that if we still follow our analogy, we have to analyze incoming video as the input of our system. It actually still is our senseless small calculation, but since Jitter is made for matrix calculation and especially for video material, we do everything in matrices here. Our clap detector becomes a flash light detector, and doing the actual counting in Jitter is also not a task you should start with when learning Jitter. However, if you look closely, you can see that at the very bottom, there are three [jit.op]
objects that are doing the actual processing. Everything else is just to get a trigger signal out of our video input and to also count these triggers within this video context. This is a highly complicated way to achieve our initial goal, but it should show you that we can also calculate anything with matrices. Many things are hidden in there, which you can take a look at later.
Jitter processes are somewhat similar to Max processes. It runs at the scheduler rate (see how things work in MSP: a stream of data. The difference in MSP is that we are in more direct control of the rate; we have to drive the system with a [metro]
object that is sending out bangs at a given rate. In the Jitter context, we will typically use a [qmetro]
object. It's the same as the [metro]
object with the difference that it waits for other processes (like drawing the last frame) to get completed. Refer to Chapter 2, Max Setup and Basics for the scheduler and priority. In this case, it's our [qmetro]
object that is sending out a bang (resulting in the computation of the next frame), each of which is 30 milliseconds; therefore, we are running our computations at ~33 fps (1/interval in seconds = fps or also Hz).
Jitter data format
Jitter matrices are divided into planes. Planes are similar to what one often calls a channel (for example, an alpha channel, a red channel, and so on) in video technique. Let's consider the format of a standard video signal in Jitter; we have an alpha plane, a red plane, a green, and a blue one for video. Each of these planes is a two-dimensional matrix in itself; it has, for example, 1,080 columns and 720 rows, so one cell per pixel. Or you could think of it this way; each cell or pixel, in the case of a video signal, needs to store four values; therefore, we represent each pixel with 4 cells each on an individual plane. This leads us to an imagination that is depicted later; take it with a grain of salt.
The plane count is something different than the dimensions of a matrix. As soon as we have a plane count greater than 1, we can think of it as adding one dimension, independent of what the plane count might be. For a detailed explanation on Jitter matrices, refer to http://cycling74.com/docs/max5/tutorials/jit-tut/jitterwhatisamatrix.html.
The following diagram shows a two-dimensional Jitter matrix with 4 planes illustrated in three dimensions. If our plane count is greater than one, then we can imagine our matrix to have one additional dimension:
Now, since you have an idea of how Jitter handles data, you can imagine that as soon as we are confronted with multidimensional data, it's a good idea to do processing with Jitter (or even with a shader). We have tools to process arrays or lists in Max and MSP, but as soon as data gets two or more dimensions, we'll tend to use Jitter. We can have up to 32 planes and use up to 32 dimensions. Obviously, this data can grow quite quickly; therefore, we have the advantage of having great control over the bit depth, but we'll take a look at this in more detail in Chapter 7, Video in Max/Jitter of this book.