2:3
In the United States and other countries where television uses a 59.94 Hz vertical scanning frequency, video is broadcast at 29.97 frame/s. For the film's motion to be rendered accurately on the video signal, a telecine must use a technique called 2:3 pulldown (or a variant called 3:2 pulldown) to convert from 24 to 29.97 frame/s.
The term "pulldown" comes from the mechanical process of "pulling" (physically moving) the film downward within the film portion of the transport mechanism to advance it from one frame to the next at a repetitive rate (nominally 24 frame/s). The conversion from 24 to 29.97 frame/s is accomplished in two steps.
The first step is to slow down the film motion by 1/1000 to 23.976 frames/s (or 24 frames every 1.001 seconds). This difference in speed is imperceptible to the viewer. For a two-hour film, play time is extended by 7.2 seconds.
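The arithmetic of this first step can be checked with a short sketch. The function names are illustrative only, and exact fractions are used so that rounding does not obscure the result.

```python
from fractions import Fraction

FILM_RATE = Fraction(24)          # nominal film rate, frame/s
SLOWDOWN = Fraction(1000, 1001)   # 24 frames every 1.001 seconds

def slowed_rate(rate=FILM_RATE):
    """Film rate after the slowdown: 24000/1001, about 23.976 frame/s."""
    return rate * SLOWDOWN

def added_seconds(runtime_seconds):
    """Extra play time: every original second now lasts 1.001 seconds."""
    return Fraction(runtime_seconds) * (1 / SLOWDOWN - 1)

print(float(slowed_rate()))             # about 23.976
print(float(added_seconds(2 * 3600)))   # 7.2 seconds for a two-hour film
```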
The second step is distributing the film frames into video fields. At 23.976 frame/s, there are four frames of film for every five frames of 29.97 frame/s video.
These four frames need to be "stretched" into five frames by exploiting the interlaced nature of video. Since an interlaced video frame is made up of two incomplete fields (one for the odd-numbered lines of the image, and one for the even-numbered lines), conceptually four frames need to be used in ten fields (to produce five frames).
The term "2:3" comes from the pattern for producing fields in the new video frames. The pattern of 2-3 is an abbreviation of the actual pattern of 2-3-2-3, which indicates that the first film frame is used in 2 fields, the second film frame is used in 3 fields, the third film frame is used in 2 fields, and the fourth film frame is used in 3 fields, producing a total of 10 fields, or 5 video frames. If the four film frames are called A, B, C and D, the five video frames produced are A1-A2, B1-B2, B2-C1, C2-D1 and D1-D2. That is, frame A is used 2 times (in both fields of the first video frame); frame B is used 3 times (in both fields of the second video frame and in one of the fields of the third video frame); frame C is used 2 times (in the other field of the third video frame, and in one of the fields of the fourth video frame); and frame D is used 3 times (in the other field of the fourth video frame, and in both fields of the fifth video frame). The 2-3-2-3 cycle repeats itself completely after four film frames have been exposed.
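The distribution of film frames into fields described above can be sketched as follows. This is a minimal illustration, not a telecine implementation: the function names are hypothetical, and each field is labelled only by the film frame it came from, so the field numbers (the 1 and 2 suffixes) are omitted.

```python
def pulldown_fields(film_frames, cadence=(2, 3)):
    """Repeat each film frame for the number of fields the cadence gives it.

    The cadence is cycled, so (2, 3) means 2-3-2-3-...
    """
    fields = []
    for i, frame in enumerate(film_frames):
        fields.extend([frame] * cadence[i % len(cadence)])
    return fields

def pair_into_video_frames(fields):
    """Group consecutive fields in twos; each pair is one interlaced video frame."""
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]

fields = pulldown_fields("ABCD")
frames = pair_into_video_frames(fields)
print("".join(fields))                      # AABBBCCDDD
print(" ".join(a + b for a, b in frames))   # AA BB BC CD DD
```

Four film frames yield ten fields and therefore five video frames, of which the third and fourth combine fields from two different film frames.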
3:2
The alternative "3:2" pattern is similar to the one shown above, except that it is shifted by one frame. For instance, a cycle that starts with film frame B yields a 3:2 pattern: B1-B2, B2-C1, C2-D1, D1-D2, A1-A2, or 3-2-3-2, or simply 3-2. In other words, the 2-3 and 3-2 patterns differ only in where the cycle is taken to start. In fact, the "3-2" notation is misleading because, according to SMPTE standards, the first frame of every four-frame film sequence is scanned twice, not three times.[1]
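That the two notations describe the same cycle can be illustrated with a small sketch. The helper name is hypothetical; it simply records how many fields each film frame occupies.

```python
def fields_per_frame(film_frames, cadence):
    """Pair each film frame with the number of fields it occupies."""
    return [(f, cadence[i % len(cadence)]) for i, f in enumerate(film_frames)]

two_three = fields_per_frame("ABCD", (2, 3))   # cycle counted from frame A
three_two = fields_per_frame("BCDA", (3, 2))   # cycle counted from frame B

# Rotating the 2:3 cycle by one film frame gives the 3:2 cycle.
rotated = two_three[1:] + two_three[:1]
print(rotated == three_two)   # True
```

Each film frame occupies the same number of fields in both cases; only the frame at which counting begins changes.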
Modern alternatives
The above method is the "classic" 2:3, which was used before frame buffers allowed for holding more than one frame. It has the disadvantage of creating two dirty frames (each a mix of two different film frames) and three clean frames (each matching an unmodified film frame) in every five video frames.
The preferred method for doing a 2:3 creates only one dirty frame in every five (i.e. 3:3:2:2 or 2:3:3:2 or 2:2:3:3). The 3-3-2-2 pattern produces A1-A2 A2-B1 B1-B2 C1-C2 D1-D2, where only the second frame is dirty. While this method has slightly more judder, it allows for easier upconversion (the dirty frame can be dropped without losing information) and better overall compression when encoding. Note that interlaced displays such as CRTs show only fields, not frames, so no dirty frames appear on them. Dirty frames may appear with other methods of displaying the interlaced video.
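The dirty-frame counts for the classic and alternative cadences can be compared with a short sketch. The function names are illustrative, and fields are again labelled only by their source film frame.

```python
def video_frames(film_frames, cadence):
    """Expand film frames into fields per the cadence, then pair fields into frames."""
    fields = []
    for frame, count in zip(film_frames, cadence):
        fields.extend([frame] * count)
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields), 2)]

def dirty_count(frames):
    """Count video frames whose two fields come from different film frames."""
    return sum(1 for a, b in frames if a != b)

for cadence in [(2, 3, 2, 3), (3, 3, 2, 2), (2, 3, 3, 2), (2, 2, 3, 3)]:
    frames = video_frames("ABCD", cadence)
    print(cadence, " ".join(a + b for a, b in frames), dirty_count(frames))
```

Under these assumptions the classic 2-3-2-3 cadence gives two dirty frames per cycle, while each of the three alternatives gives one.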