Friday, March 23, 2007

Interlacing Sucks, Part 3

Another challenge in video is how to display interlaced material on a progressive screen or device. First, let's start with three different types of content you could be dealing with:
  1. Interlaced data where the fields are different moments in time (e.g. NTSC television, with 30 frames and 60 fields per second)
  2. Progressive data where each moment in time is spread over two fields (e.g. time lapse footage converted to NTSC)
  3. Progressive data with pulldown inserted (e.g. film shot at 24 fps which has been telecined to NTSC)
Only the second case will display without artifacts if you show both fields at once. If you're on a CRT or other device where flicker is an issue, each reconstructed frame needs to be shown twice too, making 60 progressive frames.

So how do you handle the first case? If you resize each field to the height of a full frame, resolution will suffer and the image will look soft. If you're clever, you'll try to compensate for motion that occurred between the fields. This is tricky and expensive to do. Another downside is that I don't know of a way to distinguish between the first two types of content. That means time lapse and stop motion footage may get degraded unnecessarily.
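Here's a minimal sketch of the simple "resize each field" approach, assuming an 8-bit single-channel field stored row by row. The name LineDoubleField and its parameters are made up for illustration. It copies each field line and fills the line in between with the average of its neighbors, which is exactly why the result looks soft.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: stretch one field (fieldHeight rows of width pixels,
// 8-bit grayscale, row-major) to a frame of 2 * fieldHeight rows.
// Output rows 0, 2, 4... are copies of the field's rows. Rows 1, 3, 5... are
// the rounded average of the field rows above and below. The last output row
// has no field row below it, so it repeats the last field row.
std::vector<std::uint8_t> LineDoubleField(const std::vector<std::uint8_t>& field,
                                          std::size_t width,
                                          std::size_t fieldHeight) {
    assert(field.size() == width * fieldHeight);
    std::vector<std::uint8_t> frame(width * fieldHeight * 2);
    for (std::size_t y = 0; y < fieldHeight; ++y) {
        const std::uint8_t* cur = field.data() + y * width;
        const std::uint8_t* next =
            (y + 1 < fieldHeight) ? field.data() + (y + 1) * width : cur;
        std::uint8_t* copied = frame.data() + (2 * y) * width;
        std::uint8_t* blended = frame.data() + (2 * y + 1) * width;
        for (std::size_t x = 0; x < width; ++x) {
            copied[x] = cur[x];
            blended[x] = static_cast<std::uint8_t>((cur[x] + next[x] + 1) / 2);
        }
    }
    return frame;
}
```

This sketch ignores the one-line vertical offset between the two fields and does no motion compensation at all, so treat it as the baseline the clever approaches try to beat.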

In the film case, things are more complicated. Four film frames, let's call them A through D, are pulled down by the telecine process into 10 interlaced fields. Typically the cadence for broadcast is 2-3-2-3, meaning that parts of frames B and D are repeated. If A1 is the even lines of A, and A2 the odd lines, the five interlaced frames would look like A1A2 B1B2 B1C2 C1D2 D1D2. It's up to the progressive device to reconstruct the original 24 frames from this pattern.
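The field pattern above can be generated mechanically. This is a small sketch, with ApplyPulldown being a hypothetical name and film frames represented as single letters, that expands a cadence into the same A1A2 B1B2... notation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: expand film frames into interlaced video frames.
// cadence[i] is the number of fields emitted for film frame i (the cadence
// repeats if there are more film frames than entries). Field parity
// alternates 1, 2, 1, 2... across the whole output.
std::vector<std::string> ApplyPulldown(const std::string& filmFrames,
                                       const std::vector<int>& cadence) {
    assert(!cadence.empty());
    std::vector<std::string> fields;
    int parity = 1;
    for (std::size_t i = 0; i < filmFrames.size(); ++i) {
        const int count = cadence[i % cadence.size()];
        for (int n = 0; n < count; ++n) {
            fields.push_back(std::string(1, filmFrames[i]) + std::to_string(parity));
            parity = (parity == 1) ? 2 : 1;
        }
    }
    // Pair consecutive fields into interlaced frames. A trailing unpaired
    // field, if any, is dropped.
    std::vector<std::string> videoFrames;
    for (std::size_t i = 0; i + 1 < fields.size(); i += 2) {
        videoFrames.push_back(fields[i] + fields[i + 1]);
    }
    return videoFrames;
}
```

Calling ApplyPulldown("ABCD", {2, 3, 2, 3}) yields A1A2 B1B2 B1C2 C1D2 D1D2, the five frames listed above.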

In theory this shouldn't be too hard, because DVDs have flags to mark the cadence. In reality, some DVDs are authored badly, and those flags can't be trusted. Therefore, good players will detect the cadence by performing memory comparisons of the fields. Of course, if the image is black, or no motion takes place, this kind of analysis will fail. Things get more complicated for material which switches back and forth between film and video sources, like a special feature which shows the making of the movie.
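To make the field comparison concrete, here's a rough sketch. It assumes each field is available as a flat byte buffer and that duplicated fields are bit-exact, which is unlikely to hold after lossy compression, so a real detector would presumably compare against a threshold instead. FindRepeatedFields is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Field = std::vector<std::uint8_t>;

// Hypothetical helper: return the indices of fields that are byte-identical
// to the field two positions earlier, i.e. the previous field of the same
// parity. In 2-3-2-3 material these repeats show up every five fields.
std::vector<std::size_t> FindRepeatedFields(const std::vector<Field>& fields) {
    std::vector<std::size_t> repeats;
    for (std::size_t i = 2; i < fields.size(); ++i) {
        if (fields[i] == fields[i - 2]) {
            repeats.push_back(i);
        }
    }
    return repeats;
}
```

It also shows the failure mode: if the image is black or static, every field matches the one two back, and the repeats no longer reveal where the cadence starts.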

There are two more details about pulldown. First, it wastes 20% of the video bandwidth on duplicate fields. Smarter cameras and well authored DVDs will instead use MPEG flags to indicate that certain fields should be duplicated on playback, without wasting bits on them. Second, there is another cadence called advanced pulldown which runs 2-3-3-2. While this creates uneven motion, it's not meant to be viewed. The advantage is that this pulldown can be removed without decompressing the data, by throwing away the middle frame. This is useful in a video editor, since you don't incur a generation loss converting interlaced media back to progressive. Unfortunately, advanced pulldown is patented, which has prevented it from being universally adopted.
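Removing advanced pulldown really is just a matter of dropping frames. A minimal sketch, assuming the sequence starts on a cadence boundary and treating each interlaced frame as an opaque label (RemoveAdvancedPulldown is a hypothetical name):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: undo 2-3-3-2 pulldown by discarding the third frame
// of every group of five. The frames themselves are never inspected, which
// mirrors the point that nothing has to be decompressed.
std::vector<std::string> RemoveAdvancedPulldown(
        const std::vector<std::string>& videoFrames) {
    std::vector<std::string> filmFrames;
    for (std::size_t i = 0; i < videoFrames.size(); ++i) {
        if (i % 5 != 2) {
            filmFrames.push_back(videoFrames[i]);
        }
    }
    return filmFrames;
}
```

With 2-3-3-2 the five frames are A1A2 B1B2 B1C2 C1C2 D1D2, so dropping the middle one leaves A1A2 B1B2 C1C2 D1D2, four clean progressive frames.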

The result is that material which was shot as interlaced won't look very good on a progressive screen. By comparison, film-originated material which is transported as interlaced can be reconstructed perfectly, but you may see glitches depending on your device.

Wednesday, March 14, 2007

Interlacing Sucks, Part 2

If you thought interlacing was bad from a visual artifacts perspective, imagine what supporting it does to your code. For example, how do you describe a buffer of uncompressed video data? A few ways come to mind:
  1. Two pointers, one to each interlaced field
  2. One pointer, where the data for one field is immediately followed by data from the other
  3. One pointer, where the fields have been merged into a quasi-progressive image
  4. One pointer to a true progressive frame
  5. One pointer to a progressive frame which has been split into two back-to-back fields
In practice you may encounter all of these, and need to convert between them. Possibilities three and five are particularly ugly (merging fields from different points in time as one cohesive image, and treating a progressive frame as interlaced, respectively).
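One way to keep the single-pointer cases straight is to tag the buffer with both its layout and its timing. The sketch below is only an illustration: VideoBuffer, FieldLayout, FieldTiming and WeaveFields are all hypothetical names, and the two-pointer case is left out. It converts a field-sequential buffer (cases two and five) into an interleaved one (cases three and four).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class FieldLayout { FieldSequential, Interleaved };
enum class FieldTiming { SameMoment, DifferentMoments };

// Hypothetical descriptor for one uncompressed frame's worth of data.
struct VideoBuffer {
    std::vector<std::uint8_t> pixels;
    std::size_t rowBytes;   // bytes per line
    std::size_t height;     // full frame height in lines
    FieldLayout layout;
    FieldTiming timing;
};

// Weave a field-sequential buffer into an interleaved one. The first field
// in the buffer supplies output lines 0, 2, 4... and the second field
// supplies lines 1, 3, 5... The timing tag is carried over unchanged.
VideoBuffer WeaveFields(const VideoBuffer& src) {
    assert(src.layout == FieldLayout::FieldSequential);
    assert(src.height % 2 == 0);
    assert(src.pixels.size() == src.rowBytes * src.height);
    const std::size_t fieldHeight = src.height / 2;
    VideoBuffer out = src;
    out.layout = FieldLayout::Interleaved;
    for (std::size_t y = 0; y < src.height; ++y) {
        const std::size_t srcRow = (y % 2) * fieldHeight + y / 2;
        std::copy_n(src.pixels.data() + srcRow * src.rowBytes, src.rowBytes,
                    out.pixels.data() + y * src.rowBytes);
    }
    return out;
}
```

Keeping the timing tag separate from the layout is the design choice here: weaving changes where the lines live, but it can't change whether the two fields came from the same moment.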

Of course there's lots more to deal with than that:

  • MPEG media can store frames or fields, and in the latter case, the fields don't have to be of the same IPB type. As if it weren't already confusing enough that the disk order of MPEG frames doesn't match the temporal order they're shown in.
  • How do you describe the edit rate of your project or output? In NTSC there are 30 frames per second but 60 distinct moments in time. However, if you edit on a half-frame boundary, you'll break the cadence. Still, sometimes you'll need to think in 30 and sometimes in 60. Better hope your variables are well named to indicate which you're dealing with at the moment.
  • How do you deal with effects? If you process fields by themselves, you have to account for the aspect ratio being half as tall as the final output. For example, a 45 degree line in a field will not come out at 45 degrees on screen. And how do you handle a filter which needs to examine surrounding pixels on adjacent lines, when the content in those lines may have moved?
  • And of course, different TV standards differ about whether the even or odd field should come first, so your code has to keep track of that too.
Interlacing is really a disaster. It's unpleasant for viewers and a burden on implementers.
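For the 30 versus 60 problem, well-named variables only go so far. A possible alternative, sketched here with hypothetical types, is to give frame counts and field counts distinct types so the compiler rejects an accidental mix-up. It assumes two fields per frame and non-negative positions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper types: a FrameCount cannot be passed where a
// FieldCount is expected without an explicit conversion.
struct FrameCount { std::int64_t value; };
struct FieldCount { std::int64_t value; };

// Every frame holds two fields.
inline FieldCount ToFields(FrameCount frames) {
    return FieldCount{frames.value * 2};
}

// True when a field position lies on a whole-frame boundary, so an edit
// there does not split a frame's two fields.
inline bool IsFrameAligned(FieldCount fields) {
    return fields.value % 2 == 0;
}

// Round a non-negative field position down to the frame that contains it.
inline FrameCount ToFramesFloor(FieldCount fields) {
    assert(fields.value >= 0);
    return FrameCount{fields.value / 2};
}
```

For example, ToFields(FrameCount{30}) gives 60 fields, and a cut at field 61 fails IsFrameAligned.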

Sunday, January 21, 2007

Interlacing Sucks

One of the worst things carried over from standard definition TV into the new high def era is interlacing. That's where all the even lines of video are shown at time zero, and all the odd lines are shown 1/60th of a second later. By comparison, progressive display, which is the standard in every modern computer, updates every line of the display at every frame.

The difference is worse than it seems. Interlacing doesn't just apply to the display - it goes all the way back to capture. An interlaced camera captures a field (half of a frame) at time zero and the next field at time one. When shown, the even and odd lines of the image aren't from the same moment in time!

This causes more problems than you'd expect:

  • Maybe the worst is called combing, and is well known. This is when fast moving objects seem to get sliced apart. Horizontal motion looks bad, but vertical motion is worse - you can often see through the object on every other line!
  • Whenever you pause interlaced material, the image resolution gets cut in half. That's because repeating the last two fields would cause something called field twitter, where objects in motion rock forwards and backwards 1/60th of a second. To prevent this, a single field is shown for the even and odd lines (or at best rescaled to twice its height), which results in a blocky or soft image.
  • Rolling credits often look terrible when interlaced. Imagine some text which scrolls at one scan line per field, and covers four lines. The problem is you're only going to draw the same two lines of the characters to the screen each time. That means content creators have to time the speed of their credits to the refresh rate of the displays. But what do you do when creating content for Europe too, where the refresh rate is 50 fields per second?
Unfortunately, just like in digital cameras, people often prefer pixel resolution over quality. That's why 1080i (1920x1080 interlaced) seems to be winning over 720p (1280x720 progressive). There was an easy solution to the limited bandwidth which prevented 1080p at 60 frames per second - transmit 1080p at 30 frames, and have the display show each frame twice to prevent flicker. I've never read why this wasn't adopted, although I know that many broadcasters and Sony in particular campaigned to keep interlacing when moving to HD.

It's really a shame that we're propagating this workaround from 70 years ago into a whole new generation of equipment. At least they got rid of non-square pixels. Just don't get me started on fractional frame rates (e.g. 29.97 instead of 30.0).

Friday, July 21, 2006

Subtle Bug

Here's an interesting bug I found the other day. It would probably make for a good interview question too:
void CallSomeCode(int value, auto_ptr<Blah> blah, bool flag);

void DoSomething() {
    auto_ptr<Blah> blah(new Blah(5));
    CallSomeCode(blah->GetValue(), blah, false);
}
This is nice and sneaky. Apart from the oddity of passing both a value from the object and the object itself, it looks OK.

Of course it can actually crash. C++ does not specify the order in which function arguments are evaluated, and many compilers evaluate them from right to left. In that case the auto_ptr is copied first, which modifies blah, resetting its pointee to NULL. By the time you get to evaluating GetValue(), you're dead.
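One way to sidestep the problem is to read what you need from the object before handing over ownership. The sketch below swaps in std::unique_ptr for auto_ptr (auto_ptr was deprecated in C++11 and removed in C++17), and the same hazard applies to it when passed by value. Blah's definition and the int return values are assumptions added only to make the example self-contained.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for the Blah class in the snippet above.
class Blah {
public:
    explicit Blah(int value) : value_(value) {}
    int GetValue() const { return value_; }
private:
    int value_;
};

// Takes ownership of blah, like the original CallSomeCode. The return value
// exists only so the result can be observed.
int CallSomeCode(int value, std::unique_ptr<Blah> blah, bool flag) {
    return flag ? 0 : value + blah->GetValue();
}

int DoSomething() {
    std::unique_ptr<Blah> blah(new Blah(5));
    // Read from the object first, in its own statement, so the result no
    // longer depends on the order in which the arguments are evaluated.
    const int value = blah->GetValue();
    return CallSomeCode(value, std::move(blah), false);
}
```

Here DoSomething() returns 10 regardless of how the compiler orders the arguments.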

Thursday, May 25, 2006

Distributed Builds

Xcode has a nice feature to distribute compiling to other machines on your network to speed up the process. It is also smart enough to spawn a thread for each CPU you have, which CodeWarrior did not.

In my testing with four dual CPU PowerPC machines (including mine), I can build a large app a bit faster than CodeWarrior 8.3. They're all connected by gigabit ethernet, and have at least 1.5 gig of RAM. Not a strong showing, considering that's 8x the processing power.

The real improvement comes on the Intel machines. With a mix of MacBook Pros and iMacs, four dual core machines can build the same app more than twice as fast as CodeWarrior. I don't know why, but gcc seems to run disproportionately fast on Intel processors.

There are two subtle consequences to all this sharing though. The first is that PowerPCs can only distribute to other PowerPCs, and likewise for Intel. This seems strange to me, since either machine can generate a binary for the other. With the same SDK and compiler version, the output should be identical.

The second catch, which is a bigger deal, is that you must be using the same OS and compiler version to distribute. That means everyone on your team has to upgrade together, or your build times go way up. Of course Software Update decides to run for everyone at different times, and even then it may not always be a good time to upgrade and reboot. The bottom line is Xcode tends to result in a lot of poking your head into other cubes and nagging people to jump to the latest.

Wednesday, May 17, 2006

Bit Shifting

Consider this test code:
#include <stdio.h>

int main(int argc, char** argv) {
    unsigned int a = 0xfeedface;
    unsigned int b = a;
    unsigned int c = a;
    a >>= 32; // shift by 32 using a constant
    b >>= 16; // shift by 32 in two steps
    b >>= 16;
    c >>= (argc * 32); // shift by 32 using a runtime value (argc is 1 when run with no arguments)
    printf("a is %x b is %x c is %x\n", a, b, c);
    return 0;
}
Here are the results:

PowerPC: a is feedface b is 0 c is 0
Intel: a is feedface b is 0 c is feedface

These results surprised me on two counts. I thought that test a would always return zero on any platform, seeing as test b does. Thankfully gcc issues a warning if you shift by too large a constant. Although this is counterintuitive to me, it's not actually a useful operation. You might do it by accident through a macro or constant, but probably not on purpose.

The important behavior is test c though. Intel masks a 32-bit shift count to its low five bits, so a shift by 32 is a no-op (and a shift by 33 acts like a shift by 1), whereas PowerPC acts as if an infinite supply of zeros exists on either side of the variable, at least for counts up to 63. The C standard leaves a shift by the width of the type or more undefined, so both results are allowed. I think the PowerPC semantics make more sense. For example, you might shift by a variable amount up to and including sizeof(type) * 8 bits, and depend on this last value to clear a register. I encountered this exact algorithm recently, and had to add an if clause to test this case.
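The if clause I mentioned can be wrapped up in a helper. This is just a sketch with a made-up name, ShiftRightOrZero, that returns zero once the count reaches the width of the type, so the result is the same on both architectures.

```cpp
#include <climits>

// Hypothetical helper: logical right shift that yields 0 when the count is
// the width of the type or larger, instead of relying on whatever the
// hardware does with an oversized shift count.
inline unsigned int ShiftRightOrZero(unsigned int value, unsigned int count) {
    const unsigned int bits = sizeof(value) * CHAR_BIT;
    return (count < bits) ? (value >> count) : 0u;
}
```

With this, ShiftRightOrZero(0xfeedface, 32) is 0 everywhere, matching what test b produces in two steps.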

The problem is finding places where your code does test c. Just like divide by zero, there's no good way to search for this. If you do anything tricky with shifting, this may bite you.

Sunday, May 14, 2006

Divide By Zero

Moving from PowerPC to Intel is not just a matter of endianness. There are other surprises waiting long after everything compiles and links.

The first is that integer divide by zero is fatal on Intel, but isn't on PowerPC. It's debatable whether that's a good idea, but let's put that issue to the side. The problem it poses is that there's no good way to audit your code and find out where this is happening. Even if you search for every forward slash and percent sign (remember that mod is a divide), how can you be sure the divisor is never zero? Code that used to run fine 100% of the time is now going to crash sometimes, under certain conditions.
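This doesn't solve the audit problem, but for the divisions you already suspect, a guarded helper at least makes the behavior explicit. A minimal sketch, with DivideOr being a hypothetical name and the fallback value being whatever makes sense for the caller:

```cpp
#include <climits>

// Hypothetical helper: integer division that returns a caller-supplied
// fallback instead of trapping.
inline int DivideOr(int numerator, int divisor, int fallback) {
    if (divisor == 0) {
        return fallback;
    }
    // INT_MIN / -1 overflows, and on Intel it raises the same hardware
    // exception as a divide by zero.
    if (numerator == INT_MIN && divisor == -1) {
        return fallback;
    }
    return numerator / divisor;
}
```

The same guard applies to the % operator, since it goes through the same divide instruction.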

That means the only thing you can do is test like crazy. Whether you can achieve complete coverage depends on the size of your app. If your code is already so big that just getting every function to be used takes weeks, it will be impossible to recreate every possible environment in which they could be called.

I think the net result is that Mac/Intel apps are going to be less stable than their PowerPC counterparts for a while, until the important bugs get found. Not only will regular users essentially become beta testers, but many of these bugs will be hard to reproduce. The more your app depends on hardware, drivers, and a variable environment, the more painful this is going to be.
