What ARE quaternions? I mean, you hear them talked about – what actually are they, and what are they used for?
Here’s a brief description and some use cases. Hopefully this will help those struggling to understand what they are and how they are used.
Now, a quaternion, at root, is a direction (or, in math parlance, a vector – which is the same thing) and a rotation around that direction. What does that mean?
What is a vector?
It means this. Point your finger at something. Anything. Just point your finger. The direction your finger is pointed at is a vector. It’s a direction. In 3 dimensional terms, it’s a three value combination of up/down, left/right, and forward/back.
Imagine it this way. Imagine your hand is at position 0,0,0. It’s the ‘origin’. Now point your finger. The direction your finger is pointing at is expressed, in math, as a Y position (ie up/down), between -1 and 1. -1 would mean you are pointing exactly down. A 1 value means you are pointing directly up.
It’s the same for X (left and right) – it’s a value between -1 and 1, where -1 means you are pointing fully left, and 1 means you are pointing fully right. As an aside, not every coordinate system agrees on which way is positive. Coordinate systems come in two flavours, left handed and right handed, and switching between them flips the sign of one axis, so what one system calls -1 the other calls 1. Neither is more correct; it’s purely a convention, and both are in common use. It’s just one of those things you need to know up front: is the system you are using left handed or right handed?
And similarly, for Z (depth or forward and backward). A -1 here means it’s coming out of the screen, towards you, wherever the camera is. A 1 here means it’s heading exactly away from you.
Note – it’s worth pointing out that it’s entirely possible for a vector to have a zero in one or two of the axis positions. So a vector of (1, 0, 0) is totally legal, and is saying that the direction is right. There is no up / down, and no back / forward value at all; we are just moving right.
A note on Normalizing.
Now, you’ll note that most of the examples I’ve given here are between -1 and 1 for each axis. Why is that? A vector coordinate does not actually need to be between -1 and 1; in fact, it can be any set of numbers. A legal vector is x=100, y=200, z=50. So why the insistence on keeping numbers between -1 and 1?
This is called normalizing. Well, it’s a little more complicated than that. A Normalized Vector is a vector whose length is exactly 1: if you drew a line from 0,0,0 to the vector position, the length of that line would be exactly 1. Or, to put it another way, if you square each of the X, Y and Z values and add the results up, you get 1. (It is not the X, Y and Z values themselves that add up to 1.) That is also why each individual axis value ends up between -1 and 1.
Now, interestingly, almost all vectors (ie all possible values of a vector, even those above 1) can be ‘normalized’. The one exception is 0,0,0, which has no direction at all. For example, a vector of 0.1, 0.8, 0.1 points in exactly the same direction as 100, 800, 100. One is just a ‘smaller’ version of the other. Normalizing either one gives the same result, roughly 0.123, 0.985, 0.123. To normalize a vector, it’s pretty simple. Take the length of the line from 0,0,0 to the vector position (and you can use Pythagoras’ theorem for that – length = sqrt of ((x * x) + (y * y) + (z * z))) and then divide each of the axis values by that length.
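To make the normalizing recipe above concrete, here is a minimal Python sketch. The function names length and normalize are just names I’ve picked for illustration, not from any particular library, and vectors are plain (x, y, z) tuples.

```python
import math

def length(v):
    """Length of the line from (0, 0, 0) to v, using Pythagoras."""
    x, y, z = v
    return math.sqrt(x * x + y * y + z * z)

def normalize(v):
    """Return a vector pointing the same way as v, but with a length of 1."""
    n = length(v)
    if n == 0.0:
        # (0, 0, 0) has no direction, so there is nothing to normalize.
        raise ValueError("cannot normalize the zero vector")
    return (v[0] / n, v[1] / n, v[2] / n)

big = normalize((100.0, 800.0, 100.0))
small = normalize((0.1, 0.8, 0.1))
print(big)    # same direction, length 1
print(small)  # matches big, up to floating point rounding
```

Both inputs come out as the same unit-length direction, which is the whole point: normalizing throws away the length and keeps only where the vector is pointing.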
Now why bother? Why do we do this? Well, it’s because certain other things you can do with vectors – like dot product and cross product (math functions that I’m not going to go into now, but that are really useful down the line for trigonometry functions) – rely on the vector being normalized to give directly useful results. Normalized values are useful since all they represent in those conditions is a direction – not a position. They are, by definition, an offset from 0,0,0. You can then multiply a normalized vector by any number to get a line of that length, which is really useful for projecting out a position into the world.
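As a rough sketch of those two uses, assuming plain (x, y, z) tuples and helper names (dot, project) that I’ve made up for this example: the dot product of two normalized vectors is the cosine of the angle between them, and scaling a normalized direction by a distance projects a point out into the world.

```python
import math

def dot(a, b):
    """Dot product; for two length-1 vectors this is the cosine of the angle between them."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def project(origin, direction, distance):
    """Point reached by moving `distance` units from `origin` along a length-1 `direction`."""
    return tuple(o + d * distance for o, d in zip(origin, direction))

up = (0.0, 1.0, 0.0)
right = (1.0, 0.0, 0.0)

# Perpendicular directions have a dot product of 0, which is an angle of 90 degrees.
angle = math.degrees(math.acos(dot(up, right)))
print(angle)

# Start at a made-up position and travel 5 units straight up.
point = project((2.0, 3.0, 4.0), up, 5.0)
print(point)
```

If the directions were not normalized first, the dot product would be scaled by both lengths and you would have to divide them back out before taking the angle.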
Back to Quaternions.
So ok, we’ve defined what the vector, or direction part of a quaternion is – it’s a 3d point from 0,0,0, which is normalized – IE it’s a vector length of 1. What’s the rotation value and what does it mean? Let’s go back to the finger pointing exercise. Point your finger in a direction. Any direction. Now rotate your wrist. What happens is that your finger still points in the direction, but your wrist is rotated. So the rotation value of the quaternion is describing the rotation around the axis the vector is pointing at.
Why would we even care? The direction is still the direction – that hasn’t changed. Well, it has, actually, but only in how it’s expressed. It’s still pointing in the same direction, but how that’s described at a math level has changed. Try this. Point your finger in a direction, and now extend your middle finger left, and your thumb up. These represent the X and Y directions of the vector. The middle finger is pointing along the X axis, and the thumb represents the Y axis. Now rotate your wrist again. You’ll notice that your thumb and middle finger are now pointing in different directions. This means that the rotation has done some weird things to what the quaternion now thinks is ‘up/down’ and ‘left/right’. If you rotate your wrist enough, X becomes Y, because now it’s pointing up, and Y becomes X, because now it’s pointing left/right. You can see how the rotation messes things up.
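Here is a small Python sketch of the wrist experiment, under a few assumptions of mine: the finger points along Z, the quaternion is stored as (w, x, y, z), and the helper names are invented for the example. It builds the quaternion from an axis and an angle the standard way (cosine of half the angle, then the axis scaled by the sine of half the angle) and rotates the thumb and middle finger directions with it.

```python
import math

def quat_from_axis_angle(axis, degrees):
    """Unit quaternion (w, x, y, z) for a turn of `degrees` around a length-1 `axis`."""
    half = math.radians(degrees) / 2.0
    s = math.sin(half)
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)

def quat_mul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def rotate(q, v):
    """Rotate vector v by unit quaternion q, computed as q * v * conjugate(q)."""
    w, x, y, z = q
    r = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), (w, -x, -y, -z))
    return (r[1], r[2], r[3])

finger = (0.0, 0.0, 1.0)   # the direction being pointed at
middle = (1.0, 0.0, 0.0)   # the X axis
thumb = (0.0, 1.0, 0.0)    # the Y axis

twist = quat_from_axis_angle(finger, 90.0)  # roll the wrist a quarter turn
print(rotate(twist, finger))  # unchanged: still pointing the same way
print(rotate(twist, middle))  # X has swung round to where Y was
print(rotate(twist, thumb))   # Y now lies along the (negative) X axis
```

The pointing direction survives untouched, but anything measured against the old X and Y now gets a different answer, which is exactly the mess described above.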
This becomes important when you start putting quaternions on top of quaternions. Because you’ve now altered the origin of rotation and vector direction, from the parent. I’ll go into that more in the next bit.
What can I use Quaternions for?
The most common use of quaternions is in animation systems. A quaternion can represent a bone’s orientation in a hierarchical model of bones in a skeleton. By hierarchical, I mean that bones have a parent / child relationship. A wrist bone is parented by the forearm, and that in turn is parented by the upper arm / shoulder bone. You end up with a tree of bones, each having children and parents. The reason for this is that bone positions / orientations are additive. This means that each bone inherits both rotation and raw position from its parent. If you rotate a shoulder, then all the bones underneath it move along with it, because they are attached to their parents. Then each bone underneath rotates itself, and adds that to the rotation the parent already has.
So for each bone, you don’t have a position – since you don’t need that. Your original skeleton definition already has position offsets for each bone from its parent, and those offsets don’t change, frame to frame. The length of bones doesn’t change – it’s a set thing for all animations. What you do have, per bone, is a quaternion, which describes the angle and rotation of each limb, relative to its parent, based on where the end of the parent bone ends up being in the world. So, to put it in more real world terms – I know that my upper arm is of length 10 units, because that’s in the root skeleton definition. I also know that the default direction of the shoulder bone is straight out in X (this is because root skeletons define the arms as being flat out, stretched out to the side. This is known as the T Pose, and it’s simply the convention most skeletons are built in). Now, when I have a quaternion, the rotation in the bone is an offset from the raw skeleton position. So in order for the bone to point down, as it would do for a ‘normal’ stance for the skeleton, we would need to rotate the bone down by 90 degrees. In the direction-plus-twist picture we’ve been using, the result looks like this – a direction of (0, -1, 0) and a twist of 0, since we aren’t rolling the bone around its own length at all – just giving it a new ‘direction’. This is saying “Point the bone down”. (Under the hood, the four numbers actually store that as a 90 degree swing around the Z axis, but the effect is the same.)
What then happens is that, knowing the bone length is 10, you’d take the parent model position of the shoulder (which, again, we’d know from the root skeleton definition), get the quaternion vector position, scale that by 10, and then add that to the shoulder position, and that gives us the new position of where the upper arm ends – ie where the forearm begins.
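Here is a hedged sketch of that calculation for a two bone arm in Python. Several things are my own assumptions rather than anything from a real engine: the (w, x, y, z) storage order, the helper names, the shoulder position, the forearm length of 8, and the choice to combine rotations as parent times child. In standard quaternion math the stored vector is the axis you rotate around, so swinging the arm from the T Pose (+X) to straight down is a -90 degree turn about Z.

```python
import math

def quat_from_axis_angle(axis, degrees):
    """Unit quaternion (w, x, y, z) for a turn of `degrees` around a length-1 `axis`."""
    half = math.radians(degrees) / 2.0
    s = math.sin(half)
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)

def quat_mul(a, b):
    """Hamilton product; quat_mul(parent, child) applies child first, then parent."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def rotate(q, v):
    """Rotate vector v by unit quaternion q, computed as q * v * conjugate(q)."""
    w, x, y, z = q
    r = quat_mul(quat_mul(q, (0.0, v[0], v[1], v[2])), (w, -x, -y, -z))
    return (r[1], r[2], r[3])

REST_DIRECTION = (1.0, 0.0, 0.0)  # T Pose: arm bones point straight out along X
Z_AXIS = (0.0, 0.0, 1.0)

def bone_end(start, world_rotation, bone_length):
    """Where a bone ends: its start plus its rotated rest direction scaled by its length."""
    direction = rotate(world_rotation, REST_DIRECTION)
    return tuple(s + d * bone_length for s, d in zip(start, direction))

shoulder_pos = (2.0, 15.0, 0.0)                         # made-up position from the skeleton
upper_arm_local = quat_from_axis_angle(Z_AXIS, -90.0)   # swing the arm down
forearm_local = quat_from_axis_angle(Z_AXIS, 90.0)      # bend the elbow, relative to the upper arm

upper_arm_world = upper_arm_local                       # the shoulder's parent is assumed unrotated
forearm_world = quat_mul(upper_arm_world, forearm_local)

elbow = bone_end(shoulder_pos, upper_arm_world, 10.0)
wrist = bone_end(elbow, forearm_world, 8.0)
print(elbow)  # 10 units straight below the shoulder
print(wrist)  # 8 units out along X from the elbow
```

Notice the forearm never stores a world position or a world rotation of its own. It only stores how it differs from its parent, and everything else falls out of walking down the tree.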
This is better than storing a real transformation matrix per bone (which can do the same thing, but in different ways) for two reasons. One is that it’s smaller. A full matrix is 4×4 floats, which is 16 floats, although the last row is constant for this kind of transform, so it can be stored as 3×4 - 12 floats in total. Per bone. If you have fifty bones in a model (and that’s a conservative estimate for a biped, for example, once you start including fingers), that’s 50×12 floats (or 50*12*4 = 2400 bytes) per frame of animation. A quaternion is only four floats per bone, so a single frame of animation is considerably smaller – 50*4 floats (or 50*4*4 = 800 bytes per frame). That’s a saving of 1600 bytes per frame, which if you have thousands of frames (and most modern games do), is a significant saving.
But the other reason is even more significant. Matrices cannot safely be interpolated value by value, and quaternions can.
What does that actually mean though? I mean, it sounds good, but it’s really gobbledygook, isn’t it? Let’s go through it.
In our animation example, imagine we have three frames of animation. The animation frequency is 12 Hz. That means we have a different frame of animation every 5 display frames, assuming we are running at 60 fps. So for 5 display frames, we show animation frame 1, then on the 6th display frame, we start showing animation frame 2, and on the 11th, we show animation frame 3 etc.
But that’s not how animation systems actually work. What they do is actually interpolate between frames, based on how close you actually are to each frame.
So in our example, we have five frames of display, but not enough animation frames for each frame of display. So what animation systems do is take a percentage of frame 1 and frame 2, dependent on how close the rendered frame is to either, and then add those together. That’s not really any clearer, is it?
Ok, so in our example, we are rendering 5 display frames for each frame of animation. For display frame 1, we render 100 percent of animation frame 1. For display frame 2, because we are moving on in time towards animation frame 2, we take 4/5 (or 80%) of what animation frame 1 represents, and then 1/5 (or 20%) of what is in animation frame 2, and add them together, to generate a merged frame somewhere between the two. We are, in effect, generating a new frame of animation from two others. This is called linear interpolation, or in game dev parlance, lerping.
The actual effect is basically saying “Take the rotation and the vector of frame 1, scale the vector and rotation by 0.8, then do the same for frame 2, only scale it by 0.2, then add those two together, re-normalize the vector, and that’s your interpolated frame”.
Then, for display frame 3, the amounts you scale by change, so now it’s 0.6 for animation frame 1 and 0.4 for animation frame 2, because we are now getting further away from frame 1 and closer to frame 2. And so on.
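The scale, add and re-normalize recipe can be sketched in a few lines of Python. This is only an illustration under my own assumptions: quaternions are (w, x, y, z) tuples, the function name nlerp and the two keyframes are made up, and I’ve added the common extra step of flipping one quaternion when the two point in opposite directions in four dimensional space, so the blend takes the short way round.

```python
import math

def nlerp(q1, q2, t):
    """Take (1 - t) of q1 plus t of q2, then re-normalize back to a length of 1."""
    # q and -q describe the same rotation; flip q2 if needed so we blend the short way.
    if sum(a * b for a, b in zip(q1, q2)) < 0.0:
        q2 = tuple(-c for c in q2)
    blended = tuple((1.0 - t) * a + t * b for a, b in zip(q1, q2))
    n = math.sqrt(sum(c * c for c in blended))
    return tuple(c / n for c in blended)

DISPLAY_FPS = 60
ANIMATION_HZ = 12
STEPS = DISPLAY_FPS // ANIMATION_HZ  # 5 display frames per animation frame

# Two made-up keyframes: no rotation, then a quarter turn about Z.
frame1 = (1.0, 0.0, 0.0, 0.0)
frame2 = (math.cos(math.radians(45.0)), 0.0, 0.0, math.sin(math.radians(45.0)))

for i in range(STEPS):
    t = i / STEPS  # 0.0, 0.2, 0.4, 0.6, 0.8
    q = nlerp(frame1, frame2, t)
    angle = 2.0 * math.degrees(math.atan2(q[3], q[0]))
    print(f"display frame {i + 1}: {1.0 - t:.1f} of frame 1 + {t:.1f} of frame 2 -> {angle:.1f} degrees")
```

This version does not move at a perfectly constant speed between the two keyframes. When that matters, engines often use spherical interpolation (usually called slerp) instead, but the idea of blending by how close you are to each frame is the same.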
The thing is, you can do this interpolation for quaternions. You cannot do this for matrices, because you end up potentially flattening the matrix (you don’t need to know what this means, just that it’s bad) and so this is one important way that quaternions score over ‘real’ matrices.
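If you do want to see the flattening happen, here is a small Python sketch. The helper names are mine, and I’ve picked the worst case on purpose: blending a matrix for no rotation with one for a half turn about Z, value by value, exactly halfway.

```python
import math

def rot_z(degrees):
    """3x3 rotation matrix for a turn of `degrees` about the Z axis."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def lerp_matrix(a, b, t):
    """Blend two matrices value by value: (1 - t) of a plus t of b."""
    return [[(1.0 - t) * a[r][c] + t * b[r][c] for c in range(3)] for r in range(3)]

def determinant(m):
    """How much a 3x3 matrix scales volume: 1 for a pure rotation, 0 when flattened."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def apply(m, v):
    """Multiply a 3x3 matrix by a 3 component vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))

start = rot_z(0.0)
end = rot_z(180.0)
halfway = lerp_matrix(start, end, 0.5)

print(determinant(start), determinant(end))  # both 1, up to rounding
print(determinant(halfway))                  # 0, up to rounding: the matrix has collapsed
print(apply(halfway, (1.0, 0.0, 0.0)))       # the X axis has shrunk to (almost) nothing
```

The halfway matrix is no longer a rotation at all; it squashes everything onto the Z axis. You can repair a blended matrix afterwards, but that is extra work that the quaternion blend simply doesn’t need.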
Drawbacks of using Quaternions
1) They don’t have position built in. They are purely a direction and rotation. Root position needs to be held elsewhere. A ‘real’ matrix has position built in (which is one reason why it’s larger). But this is by design, since Quaternions are designed to be used in hierarchical situations, where the result of the parent would dictate where its starting position in the world actually is.
2) They don’t have scale built in. A ‘real’ matrix has scale built in, for each axis, so you can scale a model by each axis individually. So you can say “I want this model to be fatter on the X Axis, but not on the Y or Z axis”, and a matrix can handle that - a quaternion cannot. Incidentally, you may ask why you’d want to do that. Well, it’s a way of being able to scale a rendered model to your view port aspect ratio. Models are built assuming that the window they are being displayed on is 1:1, so it’s a square. The moment that is no longer true, you need to cope with that in code. One way to do it is to affect the X and Y scale values of a matrix, to ‘stretch’ out a model, so it fits in the display correctly. Most games do NOT do this, and that’s ok too. But a matrix approach enables you to do this.
Now, there ARE things that a quaternion can do to represent scale. We talked about the quaternion vector component being normalized – ie a vector length of 1. What if that is not true? What if we have a vector that is not a length of 1? What if it’s 2? Or 20? Well, the practical effect is that this can be used as a way of storing scale, provided the code reading the quaternion agrees to treat it that way (standard rotation math expects a length of 1, so this is a convention you layer on top, not something you get for free). The scale is simply the length of the vector. So if the vector length of the vector stored in the quaternion is 10, then the scale is 10. Now, this is different from how a matrix stores scale, because a matrix stores scale per axis – ie it has different scale values for x, y and z. A quaternion scale affects x, y and z at the same time. It’s a scale of the length of the vector along the vector. In our animation example, it would make a limb longer, not just fatter along one axis.
Ok, so that’s a basic primer of what a quaternion is, what it can do, and what some of the advantages and disadvantages are. Hope that helps.
A response to the awesome Fireproof blog post on Polygon
First, you need to read this. Fireproof: “free-to-play binds the hands of devs who want to help”
It’s ok, because it’s very well written, very cognizant of the situation in mobile development today and very well argued. It’s a very good read and a refreshing perspective from someone who’s actually gone out there and done it.
Read it? Good. Because you’ll need to in order for what I have to say next to make sense.
I have to agree with quite a lot of it. In fact, all of it. I would far rather be making larger, single experience games than have to constantly be designing stuff with IAP as an integrated part of the experience, since that’s where I come from too.
But while I agree with everything said in that post, it’s also worth pointing out that within the context of the environment, some of the arguments are less persuasive.
Now, I don’t mean that to come out as ‘no, you are wrong’, because that’s not the intention at all. I LOVE The Room (finished it) and I have Room 2 on my iPad mini, but I haven’t started because I know that when I do, it’ll own me, and I just have too much to do in life right this second.
My point here is that with context, everything said I agree with. But the context is larger than the article lets on. There’s risk, for example. While the mobile market is far broader than AAA console and PC development, it’s a lot shallower. What this means is that there aren’t as many 12-16 year old CoD fanatics for iPad games as there are for the actual CoD games on Xbox and PlayStation. A console is a dedicated games machine, mobile is not.
The fact is that deep investment in mobile is a scary proposition simply because there isn’t that rabid fanbase already there. Those that tried, as has been mentioned, treated it as a AAA platform and it…just isn’t.
Mobile gaming is very different from console gaming. When you console game, you sit down in front of your TV with a very express aim in mind – to play a game, usually for some period of time. Mobile is not like that. People pull their phone out at odd times, like in a waiting room or on a bus, and play for a very short period of time.
You can’t generate deep Skyrim-like gameplay in that situation, because it relies on so much prior knowledge of the game and current situation that it takes 10 minutes to remember what you were doing and everything that pertains to that, and by the time you’ve done that, the Doctor is ready to see you now. Since game play periods are unpredictable both in when they happen and in duration, game play tends to be much lighter and more what we at Midway (back in the 90s) called Dip Games. You can dip in at any time, play for as long as you want, leave, and when you come back, there is no repercussion to the game you are playing. Multi-player Quake is a perfect example of a dip game, as is Mortal Kombat.
Incidentally, it’s worth pointing out that The Room is inadvertently a Dip Game – something that works very specifically for random play times. The way it’s constructed – this puzzle leads into the next one – leads you into a nice place to put away the device. “I figured that out! Awesome! Let’s put the tablet away and go to the grocery store now.” Etc.
Sure, this situation limits what you can do – you end up with a ton of asynchronous card games and turn based games – but that’s the reality of what the average gamer has time for. It’s not about “what they want”, it’s more about the realities of how they play and the attention span they have at the time of playing. There is some validity to the claim, although as has already been pointed out, that doesn’t invalidate the Polished Single Purchase game approach.
Another thing that plays into this is history. The fact is, smart phones were an evolution of existing phones, which were not designed to play games, but did anyway – witness Snake on the old Nokia phones. Phone users have been conditioned to accept that games will look a bit 1980s, mainly because phones weren’t built with games in mind. They are now, for sure, or you wouldn’t be able to put games like The Room on them, but that conditioning has not gone away. People just don’t expect the same kind of graphical glory that they’d get on their console. If they did, they wouldn’t need the console in the first place, and while that is the inevitable conclusion (I’m with Ben Cousins on this), right now they need to justify the purchase of an Xbox One AND an iPad, so they accept less graphically intensive games on the iPad and then expect real time photo realism on the console.
That’s not to say that mobile games currently don’t have to look good, just that most developers aren’t spending their time on that because, as the article points out, there is an implicit “good enough” thinking in most mobile development.
Which brings us to the current generation of money men. Publishers, investors etc. The fact is that most of these people are not creative, and it’s foolish to expect them to understand what a creative needs to do to ensure polish. They are there to get the max return they can get for as little outlay as they can get away with. If the game looks a little shitter than it really needs to, whatever, we are still making money – expectations aren’t that high to begin with. It’s the race to the lowest common denominator that still makes them money. They are still in that mindset of “mobile games tend to look shitty” even more than the players are, to be honest, because they are seeing the cost of development of quality (of which, more in a bit).
As an example, I had a client for a year who was big into hunting games. When I looked at the codebase and what his games looked like, I gasped. It was horrible. No animation blending, no animation timing, lighting all wrong, GL settings all incorrect etc. It was just awful. I looked at other hunting games and most of them were no better. I spent a year telling him that one month’s effort could upgrade his engine to the point where he would be head and shoulders better than the other games, and that he could own this genre with his offerings. But he just wasn’t interested – one month of me working on something that he couldn’t track as adding to his bottom line was unacceptable. You could get an entire game done in that time. Good enough paid the bills and reduced his outgoings, and that was that. (I actually added an animation blender to the codebase on my own, simply because I just couldn’t stand looking at it.)
There is a tendency to short term thinking in mobile that is a bit scary, but is also understandable because development times are so short in comparison. You get a game done in a month on mobile, so every day counts and if you need to do something that’s going to bust that deadline, well, let’s put that off till the next game. Like I said, understandable, but coming from AAA, where polish is everything, it’s a hard pill to swallow.
Then there’s The Current Generation experience. While we are lamenting the lack of the One Sale situation that AAA (mostly) enjoys, it’s worth pointing out that the iPhone is now seven years old. That’s seven years of a generation that is used to F2P and how IAP works. For the more casual gamers, this is a way of life. When they come to Xbox and find the average game costs $60, it’s a shock. It’s certainly that way with my kids. They are now conditioned to expect this dribs and drabs kind of game play approach.
Now, that’s not an argument that F2P is “better” than a polished single purchase experience at all, just that it’s here now, and it’s not going away, and objecting to it is more than a little pointless. Sure, Polished Single Purchase is valid too, but it’s harder to be successful at it since there is an entire fan base that doesn’t want to pay $20 to play your game, but wants to try it for free and then very slowly pay out over time (usually without realising it). That’s a reality. That’s not a “You shouldn’t do it” argument, just a recognition that high quality (and usually high development cost, which often goes hand in hand – we’ll talk about that next) is a harder sell. It just is.
As an example, let’s look at another extremely high quality game on iPad – Republique. I know for a fact they’d have to sell multiple hundreds of thousands of copies to even break even on development costs. There’s little chance they will, but they do have episodic content to fall back on, since each episode won’t cost as much to make as the initial engine build. But the point is, it’s unlikely they will make their money back, at least initially.
I know of only three games with a large, single player, AAA-like polished experience that have made serious bank on mobile using the single purchase model. One is The Room, another is the Tiger Style set of games (most notably, Waking Mars) and the other is Infinity Blade (which, it should be pointed out, was financed as a mobile advert for Unreal on mobile, and has over $500k worth of assets in it – not that they didn’t make their money back, but it’s just not something your average mobile developer could afford in the first place, nor take that kind of risk on. Not when there are other, cheaper, methods of making a buck).
Which does neatly lead me into the last point I wanted to make, regarding development costs.
Barry mentions that The Room was made with a budget of $70k. That’s pretty damn impressive. I wouldn’t have imagined that was possible. I suspect it’s because they were all ex AAA developers, had good development habits, and knew exactly what they wanted to make up front. And that’s the rub.
Most mobile developers do not come from that background. The Tiger Style crew did, and it shows. What they produce is very AAA quality. The Republique crew did not, and their development costs were... well, I don’t want to give away numbers that I got second hand anyway, but let’s just say it wasn’t $70k. Way too few zeros there.
Most mobile developers, for better or worse, want to do the least they can and just get it done. Android is a nightmare since there are so many platforms and capabilities to handle and test, and iOS has its own issues when it comes to phones.
My point is that saying “We did it for $70k and had no marketing budget” is not the norm. Most developers could not have done it for that. And it strikes me that relying on that as a business practice is probably a bit foolhardy. The number one problem with the App Store is visibility, and going in blind is an invitation to be handed your ass. In this way, The Room IS an outlier, and protesting about it doesn’t change that. It got success by word of mouth and people like Nathan Fillion and Zachary Levi (“Chuck”) talking about it on Twitter (it’s certainly where I heard about it). Now quality is an influencing factor there, no doubt. They would not have been talking about it if it hadn’t hit a quality bar, but there are other high production value games that I don’t see them talking about – Republique etc. You can’t rely on this as a marketing solution. Getting lucky once with celebrities endorsing your product doesn’t make it repeatable. Although, to be fair, now FireProof is in the position it is, anything it announces is news now anyway, much like Epic. Anyone else announcing the Unreal Tournament new development would be buried on the back pages, but because it’s Epic, it’s front and center. And the back pages is what it’s like for most of the other people.
Something like The Room would take most developers at least $200k to build, and that’s assuming there were no design direction changes along the way.
Now, that’s not to say it’s not possible. What I’m saying is that I think their experience is not necessarily the norm – there are some modifying circumstances – BUT that other developers perhaps should be trying to replicate it. The problem with lack of experience is that you often don’t know what you don’t know. The issue is that of risk – banking success on a Polished Single Purchase is a higher risk than going the IAP route – there’s just reality to that statement. Both are fraught with failure, but since there are more IAP success stories, what do you expect most mobile indies who have NOT come from AAA to do?
Now is that a self fulfilling prophecy? Possibly, but while it’s an interesting discussion, it’s also an academic exercise. The landscape is what it is by now. The more interesting question is, can this be changed? Which is (at least to me) the thrust of what the Fireproof blog post is about. It reads, to me, as “We did this, why can’t you?” and while there are some very specific conditions in how they did it, the point is well made. Why can’t we?
The answer is, yes, of course the landscape can be changed. Is it likely? Well, according to my magic 8 ball, the answer is “Ask again later”, which I’d say is probably right on.
But it won’t change unless more experienced developers do take that same risk that the fellas at FireProof did. So here’s to hoping they do.