VIDEO

INTRODUCTION
The recording and editing of sound has long been in the domain of the PC. Doing the same with motion video has gained acceptance only recently, because of the enormous file sizes video requires. For example, one second of 24-bit, 640 x 480 video and its associated audio requires about 30 MB of space, so a 20 minute clip fills roughly 36 GB of disk space and has to be processed at 30 MB/s. The only solution was to compress the data, but compression hardware was very expensive in the early days of video editing. As a result video was played in very small windows of 160 x 120 pixels, occupying only 1/16th of the total screen. It was only after the advent of the Pentium-II processor, coupled with the falling cost of video compression hardware, that full screen digital video finally became a reality.

Moving Pictures
In motion video the illusion of movement is created by displaying a sequence of still images rapidly one after another. If they are displayed fast enough our eyes cannot distinguish the individual frames; because of persistence of vision the individual frames merge into one another, creating an effect of movement. Each individual image is called a frame, and the speed with which the images are displayed one after another is called the frame rate. The frame rate should lie between 20 and 30 frames per second for motion to appear smooth and realistic. Audio is added and synchronized with the apparent movement of the images. Motion picture is recorded on film, whereas in motion video the output is an electrical signal. Film is played back at 24 fps, while video ranges from 25 to 30 fps. Visual and audio data, when digitized and combined into a file, give rise to digital video. Video represents a sequence of real-world images taken by a movie camera, so it depicts an event that physically took place. Animation works on the same principle of displaying a sequence of images at a specific speed to create the illusion of motion, but here the images are drawn by artists, by hand or with software, so they do not depict a real sequence of events in the physical world.

ANALOG VIDEO
In analog video systems video is stored and processed in the form of analog electrical signals; the most familiar example is television broadcasting. In contrast, in digital video the video is represented by a string of bits, and all forms of video handled inside a PC are digital video.

Video Camera
Analog video cameras record a succession of still images and convert the brightness and color information of the images into electrical signals. These signals are transmitted from one place to another over cables or by wireless means, and the television set at the receiving end converts the signals back into images. The tube type analog video camera is generally used in professional studios and uses electron beams to scan in a raster pattern, while the CCD video camera, built around a light-sensitive electronic device called the CCD, is used for home and office purposes where portability is important.

Tube type Camera
The visual image in front of the video camera is presented to it by an optical lens. This lens focuses the scene on the photosensitive surface of a tube in the same way that the lens of a camera focuses the image on the film surface. The photo-sensitive surface, called the target, is a form of semi-conductor; it is almost an insulator in the absence of light.
With the absorption of energy caused by light striking the target, electrons acquire sufficient energy to take part in current flow. The electrons migrate towards a positive potential applied to the lens side of the target. This positive potential is applied to a thin layer of conductive but transparent material. The vacant energy states left behind by the liberated electrons, called holes, migrate towards the inner surface of the target. Thus a charge pattern appears on the inner surface of the target that is most positive where the brightness or luminosity of the scene is greatest. The charge pattern is sampled point-by-point by a moving beam of electrons originating in an electron gun in the tube. The beam scans the charge pattern in the same way a raster is produced in a monitor, but approaches the target at a very low velocity. The beam deposits just enough carriers to neutralize the charge pattern formed by the holes; excess electrons are turned back towards the source. The exact number of electrons needed to neutralize the charge pattern constitutes a flow of current in a series circuit, and it is this current, flowing across a load resistance, that forms the output signal voltage of the tube.

CCD Camera
Instead of being focused on photographic film, light passing through the lens of the camera is focused on a chip called a CCD. The surface of the CCD is covered with an array of transistors that create electrical current in proportion to the intensity of the light striking them; these transistors make up the pixels of the image. The transistors generate a continuous analog electrical signal that goes to an analog-to-digital converter (ADC), which translates the varying signal into a digital stream of 1s and 0s. The ADC sends the digital information to a digital signal processor (DSP) that has been programmed specifically to manipulate photographic images. The DSP adjusts the contrast and brightness of the image and compresses the data before sending it to the camera's storage medium. The image is temporarily stored on a hard drive, RAM, floppy or tape built into the camera's body before being transferred to the PC's permanent storage.

Television Systems

Color Signals
Video cameras produce three output signals, which would require three parallel cables for transmission. Because of the complexities involved in transmitting three signals in exact synchronism, TV systems do not usually handle RGB signals directly. Instead the signals are encoded in a composite format according to the luma-chroma principle, which is based on human color perception, and are distributed over a single cable or channel.

Human Color Perception
All objects that we observe are focused sharply by the lens system of the eye on the retina. The retina, located at the back of the eye, has light-sensitive cells which register the visual sensations. The retina is connected to the optic nerve, which conducts the light stimuli sensed by these cells to the optical centre of the brain.
According to the theory formulated by Helmholtz, the light-sensitive cells are of two types: rods and cones. The rods provide the brightness sensation and thus perceive objects in various shades of grey from black to white. The cones, which are sensitive to color, are broadly divided into three groups: one set of cones detects the presence of blue, the second perceives red and the third is sensitive to green. The combined relative luminosity curve, which shows the relative sensation of brightness produced by the individual spectral colors, indicates that the sensitivity of the human eye is greatest in the green-yellow range, decreasing towards both the red and blue ends of the spectrum. Any color other than red, green and blue excites different sets of cones to generate a cumulative sensation of that color. White is perceived by the additive mixing of the sensations from all three sets of cones.

Based on the spectral response curve and extensive tests with a large number of observers, the relative intensities of the primary colors for color transmission, e.g. for color television, have been standardized. The reference white for color television transmission has been chosen to be a mixture of 30% red, 59% green and 11% blue. These percentages are based on the sensitivity of the eye to the different colors. Thus one lumen (lm) of white light = 0.3 lm of red + 0.59 lm of green + 0.11 lm of blue = 0.89 lm of yellow + 0.11 lm of blue = 0.7 lm of cyan + 0.3 lm of red = 0.41 lm of magenta + 0.59 lm of green.

Luma-Chroma Principle
The principle states that any video signal can be broken into two components:
The luma component, which describes the variation of brightness in different portions of the image without regard to any color information. It is denoted by Y and can be expressed as a linear combination of RGB: Y = 0.3R + 0.59G + 0.11B
The chroma component, which describes the variation of color information in different parts of the image without regard to any brightness information. It is denoted by C and can further be subdivided into two components, U and V.
Thus the RGB output signals from a video camera are transformed to YC format using electronic circuitry before being transmitted. At the receiving end, a B/W TV discards the C component and uses only the Y component to display a B/W image. In a color TV, the YC components are converted back to RGB signals, which drive the electron guns of the CRT.

Color Television Camera
A color TV camera essentially consists of three camera tubes, each of which receives a selectively filtered primary color. Each camera tube develops a signal voltage proportional to the respective color intensity received by it. Light from the scene is processed by the objective lens system. The image formed by the lens is split into three images by glass prisms designed as dichroic mirrors. A dichroic mirror passes one wavelength and rejects other wavelengths; thus red, green and blue images are formed. These pass through color filters which provide highly precise primary color images, which are converted into video signals by the camera tubes. This generates the three color signals R, G and B.
To generate the monochrome or brightness signal that represents the luminance of the scene, the three camera outputs are added through a resistance matrix in the proportion 0.3, 0.59 and 0.11 for R, G and B respectively:
Y = 0.3R + 0.59G + 0.11B
The Y signal is transmitted as in a monochrome television system. However, instead of transmitting all three color signals separately, the red and blue camera outputs are combined with the Y signal to obtain what are known as the color difference signals. Color difference voltages are derived by subtracting the luminance voltage from the color voltages; only (R-Y) and (B-Y) are produced. It is only necessary to transmit two of the three color difference signals, since the third may be derived from the other two. The color difference signals equal zero when white or grey shades are being transmitted, as the calculation below shows. For any grey shade (including white) let R = G = B = v volts. Then
Y = 0.3v + 0.59v + 0.11v = v
Thus (R-Y) = v - v = 0 volt, and (B-Y) = v - v = 0 volt.
When televising color scenes, even when the voltages R, G and B are not equal, the Y signal still represents the monochrome equivalent of the color, as the following example illustrates. For simplicity, assume that the camera output corresponding to maximum (100%) intensity of white light is an arbitrary value of 1 volt. Consider an unsaturated magenta, for which the luminance and color difference signal voltages are required. Since the hue is magenta it implies a mixture of red and blue; the word unsaturated indicates that some white light is also present. The white content will develop all three voltages R, G and B, their magnitudes depending on the extent of unsaturation. Thus R and B must dominate and both must be of greater amplitude than G. Let R = 0.7 volt, G = 0.2 volt, B = 0.6 volt represent the unsaturated magenta. The white content must be represented by equal quantities of the three primaries, and its amount is indicated by the smallest voltage, i.e. G = 0.2 volt. The remainder, R = (0.7 - 0.2) = 0.5 volt and B = (0.6 - 0.2) = 0.4 volt, is responsible for the magenta hue.
The luminance signal Y = 0.3R + 0.59G + 0.11B = 0.3(0.7) + 0.59(0.2) + 0.11(0.6) = 0.394 volt.
The color difference signals are:
(R-Y) = 0.7 - 0.394 = 0.306 volt
(B-Y) = 0.6 - 0.394 = 0.206 volt
The other component (G-Y) can be derived as shown below:
Y = 0.3R + 0.59G + 0.11B
Thus (0.3 + 0.59 + 0.11)Y = 0.3R + 0.59G + 0.11B
Rearranging the terms, 0.59(G-Y) = -0.3(R-Y) - 0.11(B-Y)
i.e. (G-Y) = -0.51(R-Y) - 0.186(B-Y)
Since the value of the luminance is Y = 0.394 volt and peak white corresponds to 1 volt, the magenta will show up as a fairly dull grey on a monochrome television set.
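The worked example above can be reproduced with a short Python sketch (illustrative only: the function names are ours, and the coefficients are the ones quoted above).

# Sketch reproducing the unsaturated-magenta example above.
# R, G, B are camera voltages on a 0-1 scale (1 volt = peak white).

def luma(r, g, b):
    """Luminance per Y = 0.3R + 0.59G + 0.11B."""
    return 0.3 * r + 0.59 * g + 0.11 * b

def colour_difference(r, g, b):
    """Return (R-Y), (B-Y) and the derived (G-Y) component."""
    y = luma(r, g, b)
    r_y, b_y = r - y, b - y
    g_y = -0.51 * r_y - 0.186 * b_y   # from 0.59(G-Y) = -0.3(R-Y) - 0.11(B-Y)
    return r_y, b_y, g_y

r, g, b = 0.7, 0.2, 0.6               # unsaturated magenta
print(luma(r, g, b))                  # approx. 0.394 volt
print(colour_difference(r, g, b))     # approx. (0.306, 0.206, -0.194)

The derived (G-Y) of about -0.194 volt agrees with computing G - Y = 0.2 - 0.394 directly, which is the check on the rearranged equation.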
Chroma Sub-sampling
Conversion of RGB signals into YC format has another important advantage: it allows bandwidth to be saved through chroma subsampling. Experiments have shown that the human eye is more sensitive to brightness information than to color information. This limitation can be exploited to transmit reduced color information compared to brightness information, a process called chroma subsampling, and so save on bandwidth. On account of this we get code words like "4:2:2" and "4:1:1" to describe how the subsampling is done. Roughly, the numbers refer to the ratios of the luma sampling frequency to the sampling frequencies of the two chroma channels (typically Cb and Cr in digital video); "roughly" because the formula does not quite make sense for schemes like "4:2:0".
4:4:4 --> No chroma subsampling; each pixel has Y, Cr and Cb values.
4:2:2 --> Chroma is sampled at half the horizontal frequency of luma, but at the same vertical frequency. The chroma samples are horizontally aligned with luma samples.
4:1:1 --> Chroma is sampled at one-fourth the horizontal frequency of luma, but at full vertical frequency. The chroma samples are horizontally aligned with luma samples.
4:2:0 --> Chroma is sampled at half the horizontal frequency of luma, and also at half the vertical frequency. Theoretically, the chroma pixel is positioned between the rows and columns.

Bandwidth and Frequencies
Each TV channel is allocated 6 MHz of bandwidth. Of this, the 0 to 4 MHz part of the signal is devoted to the Y component, the next 1.5 MHz is taken up by the C component, and the last 0.5 MHz by the audio signal.

Video Signal Formats
Component Video
Our color television system starts out with three channels of information: Red, Green and Blue (RGB). In the process of translating these channels to a single composite video signal they are often first converted to Y, R-Y and B-Y. Both three-channel systems, RGB and Y, R-Y, B-Y, are component video signals. They are the components that eventually make up the composite video signal. Much higher program production quality is possible if the elements are assembled in the component domain.
Composite Video
A video signal format where both the luminance and chroma components are transmitted along a single wire or channel. Usually used in ordinary video equipment like VCRs as well as in TV transmissions. NTSC, PAL and SECAM are all examples of composite video systems.
S-Video
Short for Super-Video. A video signal format where the luminance and color components are transmitted separately using multiple cables or channels. Picture quality is better than that of composite video, but it is more expensive. Usually used in high-end VCRs and capture cards.

Television Broadcasting Standards
NTSC
National Television Systems Committee. Broadcast standard used in the USA and Japan. Uses 525 horizontal lines at 30 (29.97) frames/sec. Uses a composite video format where luma is denoted by Y and the chroma components by I and Q. While Y utilizes 4 MHz of a television channel's bandwidth, I uses 1.5 MHz and Q only 0.5 MHz. I and Q can be expressed in terms of the color difference signals as shown below:
I = 0.74(R-Y) - 0.27(B-Y)
Q = 0.48(R-Y) + 0.41(B-Y)
PAL
Phase Alternating Lines. Broadcast standard used in Europe, Australia, South Africa and India. Uses 625 horizontal lines at 25 frames/sec. Uses a composite video format where luma is denoted by Y and the chroma components by U and V. While Y utilizes 4 MHz of a television channel's bandwidth, U and V use 1.3 MHz each. U and V can be expressed in terms of the color difference signals as shown below:
U = 0.493(B-Y)
V = 0.877(R-Y)
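As an illustration, the PAL-style encoding and its inverse can be sketched in Python as follows (the function names are ours; the weighting factors are the ones given above, and real transmission systems add further scaling and filtering not shown here).

# Sketch of the encoding given above:
# Y = 0.3R + 0.59G + 0.11B, U = 0.493(B - Y), V = 0.877(R - Y).

def rgb_to_yuv(r, g, b):
    y = 0.3 * r + 0.59 * g + 0.11 * b
    u = 0.493 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

def yuv_to_rgb(y, u, v):
    # Invert the encoding at the receiver to recover the RGB drive signals.
    b = y + u / 0.493
    r = y + v / 0.877
    g = (y - 0.3 * r - 0.11 * b) / 0.59
    return r, g, b

print(rgb_to_yuv(1.0, 1.0, 1.0))   # white: Y = 1, U = V = 0 (up to rounding)

For white or grey (R = G = B), U and V come out as zero, consistent with the color difference signals vanishing for grey shades as shown earlier.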
SECAM
Sequential Color and Memory. Used in France and Russia. The fundamental difference between SECAM on the one hand and NTSC and PAL on the other is that the latter transmit and receive two color signals simultaneously, while SECAM transmits only one of the two color difference signals at a time. It also uses 625 horizontal lines at 25 frames/sec. Here the color difference signals are denoted by DR and DB and both occupy 1.5 MHz each. They are given by the relations:
DR = -1.9(R-Y)
DB = 1.5(B-Y)

Other Television Systems
Enhanced Definition Television Systems (EDTV)
These are conventional systems modified to offer improved vertical and horizontal resolution. One of the systems emerging in the US and Europe is known as Improved Definition Television (IDTV). IDTV attempts to improve the NTSC image by using digital memory to double the scanning lines from 525 to 1050. The pictures are only slightly more detailed than NTSC images because the signal does not contain any new information. By separating the chrominance and luminance parts of the video signal, IDTV prevents cross-interference between the two.

High Definition Television (HDTV)
The next generation of television is known as High Definition TV (HDTV). The HDTV image has approximately twice as many horizontal and vertical pixels as conventional systems. The increased luminance detail in the image is achieved by employing a video bandwidth approximately five times that used in conventional systems. Additional bandwidth is used to transmit the color values separately. The aspect ratio of the HDTV screen is 16:9. Digital coding is essential in the design and implementation of HDTV. There are two possible types of digital coding: composite coding and component coding. Composite coding of the whole video signal is in principle easier than digitizing the separate signal components (luma and chroma), but it also has serious problems, such as disturbing cross-talk between the luma and chroma information and a higher bandwidth requirement, because chroma subsampling would not be possible. Hence component coding is preferable. The luminance signal is sampled at 13.5 MHz as it is the more crucial component. The chrominance signals (R-Y, B-Y) are sampled at 6.75 MHz (4:2:2). The digitized luminance and chrominance signals are then quantized with 8 bits each. For the US, a total of 720,000 pixels are assumed per frame. If the quantization is 24 bits/pixel and the frame rate is approximately 60 frames/second, the data rate for HDTV is 1036.8 Mbits/second. Using compression, a reduction of the data rate to 24 Mbits/second is possible without noticeable loss of quality. In the case of European HDTV, the data rate is approximately 1152 Mbits/second.

DIGITAL VIDEO

Video Capture
Source and Capture Devices
Video capture involves two main components: the source and source device, and the capture device. During capture the visual component and the audio component are captured separately and automatically synchronized. Source devices must use PAL or NTSC playback and must have composite video or S-video output ports. The source and source device can be the following:
• Camcorder with pre-recorded video tape
• VCP with pre-recorded video cassette
• Video camera with live footage
• Video CD with Video CD player

Video Capture Card
A full motion video capture card is a circuit board in the computer that consists of the following components:
• Video INPUT port to accept input video signals from NTSC/PAL/SECAM broadcast signals, a video camera or a VCR. The input port may conform to the composite video or S-video standards.
• Video compression-decompression hardware for video data.
• Audio compression-decompression hardware for audio data.
• A/D converter to convert the analog input video signals to digital form.
• Video OUTPUT port to feed output video signals to a camera or VCR.
• D/A converter to convert the digital video data to analog signals for feeding to output analog devices.
• Audio INPUT/OUTPUT ports for audio input and output functions.
Rendering support for the various television signal formats, e.g. NTSC, PAL and SECAM, imposes a level of complexity in the design of video capture boards.

Video Capture Software
The following capabilities might be provided by video capture software, often bundled with a capture card:
AVI Capture: Allows capture and digitization of the input analog video signals from external devices and conversion to an AVI file on the disk of the computer. No compression is applied to the video data, so this is suitable only for small clips. Playback of the video is done through the Windows Media Player. Before capturing, parameters like frame rate, brightness, contrast, hue and saturation, as well as audio sampling rate and audio bit size, may be specified.
AVI to MPEG Converter: This utility allows the user to convert a captured AVI file to MPEG format. The MPEG compression algorithm is applied to an AVI file and a separate MPG file is created on the disk. Before compression, parameters like quality, amount of compression, frame dimensions and frame rate may be specified by the user. Playback of the MPEG file is done through the Windows Media Player.
MPEG Capture: Certain cards allow the user to capture video directly in MPEG format. Here analog video data is captured, digitized and compressed at the same time before being written to the disk. This is suitable for capturing large volumes of video data. Parameters like brightness, contrast and saturation may be specified by the user before starting the capture.
DAT to MPEG Converter: This utility converts the DAT format of a Video-CD into MPEG. Conversion to MPEG is usually done for editing purposes. DAT and MPG are similar formats, so the file size changes very little after conversion. The user has to specify the source DAT file and the location of the target MPG file.
MPEG Editor: Some capture software provides the facility of editing an MPEG file. The MPG movie file is opened in a timeline structure and functions are provided for splitting the file into small parts by specifying the start and end of each portion. Multiple portions may also be joined together. Functions for adding effects like transitions or sub-titling may also be present, and the audio track may be separately edited or manipulated.
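As a rough illustration of the volumes involved in uncompressed capture, the following sketch estimates the file size from the capture parameters mentioned above (frame dimensions, color depth, frame rate, audio settings); the function name and the default CD-quality audio figures are our own illustrative assumptions.

# Rough sizing sketch for an uncompressed AVI-style capture.

def capture_size_mb(width, height, bytes_per_pixel, fps, seconds,
                    audio_rate=44100, audio_bytes=2, audio_channels=2):
    video = width * height * bytes_per_pixel * fps * seconds
    audio = audio_rate * audio_bytes * audio_channels * seconds
    return (video + audio) / (1024 * 1024)

# One minute of 24-bit 640 x 480 video at 30 fps with CD-quality stereo audio:
print(round(capture_size_mb(640, 480, 3, 30, 60)))   # roughly 1590 MB

Figures of this order are why the compression techniques discussed next are essential.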
Video Compression

Types of Compression
Video compression is a process whereby the size of the digital video on disk is reduced using mathematical algorithms. Compression is applied when storing the data; for playback the compressed data must be decompressed again. The software (or hardware) used for the compression/decompression process is called a CODEC. During compression the algorithm analyses the source video and tries to find redundant and irrelevant portions; the greater the amount of such material in the source data, the better the scope for compressing it. Video compression may be categorised using different criteria.
Lossless compression occurs when the original video data is not changed permanently in any way during the compression process, which means that the original data can be recovered exactly after decompression. Though this preserves the video quality, the amount of compression achieved is usually limited. It is used where quality matters more than storage space, e.g. medical image processing.
Lossy compression occurs where part of the original data is discarded during compression in order to reduce the file size. This data is lost forever and cannot be recovered after decompression, so quality is degraded. The amount of compression, and hence the degradation in quality, is usually selectable by the user: the more the compression, the greater the degradation in quality, and vice versa. It is used where storage space matters more than quality, e.g. corporate presentations.
Since video is essentially a sequence of still images, compression can also be differentiated by the kind of redundancy that is exploited. Intraframe compression exploits redundancies within each frame or still image (spatial redundancy); this is the same as an image compression process. A video CODEC can also exploit the redundancies between adjacent frames of a video sequence (temporal redundancy); this is called interframe compression.
Compression can further be categorized by the time taken to compress and decompress. Symmetrical compression algorithms take almost the same time for compression as for decompression; this is what live video transmission requires. Asymmetrical compression algorithms take considerably longer to compress than to decompress; this is typical of applications like CD-ROM presentations.
Since video is essentially a sequence of still images, the initial stage of video compression is the same as for image compression. This is the intraframe compression process and can be either lossless or lossy. The second stage, after each frame has been individually compressed, is the interframe compression process, where redundancies between adjacent frames are exploited to achieve further compression.

Lossy Coding Techniques
Lossy coding techniques are also known as source coding. A popular method is discussed below.
Discrete Cosine Transform (DCT)
Which portion of the data is treated as irrelevant and discarded, and which is retained, depends on the algorithm. One method of separating relevant from irrelevant information is transform coding, which transforms the data into a different mathematical representation better suited to this separation. One of the best known transform codings is the Discrete Cosine Transform (DCT). For every transform coding an inverse function must exist, so that the decoder can reconstruct the relevant information. An image is subdivided into blocks of 8 x 8 pixels, and each block is represented as a combination of DCT basis functions. 64 appropriately chosen coefficients represent the horizontal and vertical frequency content of the varying pixel intensities. The human eye is very sensitive at low frequencies, but its sensitivity decreases at high frequencies, so reducing the number of high-frequency DCT components affects image quality only weakly. After the DCT, a process called quantization is used to extract the relevant information by driving the high-frequency components to zero.
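The following Python sketch illustrates the idea on a single 8 x 8 block; the quantization table used here is an arbitrary illustration that grows with frequency, not the table of any particular standard.

import numpy as np

# Minimal 8x8 DCT + quantization sketch (the intraframe step described above).

N = 8

def dct2(block):
    """2-D DCT-II of an 8x8 block."""
    n = np.arange(N)
    # Basis matrix C[k, m] = a(k) * cos((2m + 1) * k * pi / (2N))
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    C[0, :] = np.sqrt(1.0 / N)
    return C @ block @ C.T

block = np.random.randint(0, 256, (N, N)).astype(float) - 128   # level-shifted pixels
coeffs = dct2(block)

# Coarser quantization for higher frequencies drives many coefficients to zero.
q_table = 16 + 8 * (np.arange(N)[:, None] + np.arange(N)[None, :])
quantized = np.round(coeffs / q_table)

print(int((quantized == 0).sum()), "of 64 coefficients are zero after quantization")

In an actual CODEC the quantized coefficients would then be entropy coded, and the decoder would apply the inverse transform to reconstruct an approximation of the block.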
Video Compression Techniques
After the intraframe (image) compression step, a video CODEC uses interframe algorithms to exploit temporal redundancy, as discussed below.
Motion Compensation
With motion compensated prediction, temporal redundancies between two frames of a video sequence can be exploited. Temporal redundancies arise, for example, from the movement of objects in front of a stationary background. The basic concept is to look for an area (block) in a previous or subsequent frame that matches very closely an area of the same size in the current frame. If the search is successful, the differences between the block intensity values are calculated. In addition, the motion vector, which represents the translation of the corresponding block in the x- and y-directions, is determined. Together the difference signal and the motion vector represent the deviation between the reference block and the predicted block.
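A minimal sketch of this block-matching search is given below; the block size, search range and the sum-of-absolute-differences (SAD) matching criterion are illustrative choices, not a prescription from any particular standard.

import numpy as np

# Minimal full-search block matching for motion-compensated prediction.

def best_match(reference, current, top, left, block=8, search=7):
    """Return the motion vector (dy, dx) minimising the SAD between a block of the
    current frame and candidate blocks in the reference frame, together with the
    residual (difference signal) that would actually be coded."""
    target = current[top:top + block, left:left + block].astype(float)
    h, w = reference.shape
    best_vec, best_sad = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= h - block and 0 <= x <= w - block:
                candidate = reference[y:y + block, x:x + block].astype(float)
                sad = np.abs(target - candidate).sum()
                if sad < best_sad:
                    best_sad, best_vec = sad, (dy, dx)
    dy, dx = best_vec
    residual = target - reference[top + dy:top + dy + block,
                                  left + dx:left + dx + block].astype(float)
    return best_vec, residual

In a real encoder the residual block would then go through the DCT and quantization steps described earlier, while the motion vector is transmitted alongside it.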
Some Popular CODECs
JPEG
Stands for Joint Photographic Experts Group, a joint effort by ITU and ISO. Achieves compression by first applying the DCT, then quantization, and finally entropy coding of the resulting DCT coefficients. Corresponding to the 64 DCT coefficients, a 64-element quantization table is used: each DCT coefficient is divided by the corresponding quantization table entry and the result is rounded off. For entropy coding the Huffman method is used.
MPEG-1
Stands for Moving Pictures Expert Group. MPEG-1 belongs to a family of ISO standards. It provides motion compensation and uses both intraframe and interframe compression. It uses three different types of frames: I-frames, P-frames and B-frames.
I-frames (intracoded): These are coded without any reference to other images. MPEG makes use of JPEG for I-frames. They can be used as a reference for other frames.
P-frames (predictive): These require information from the previous I and/or P frame for encoding and decoding. By exploiting temporal redundancies, the achievable compression ratio is higher than that of I-frames. P-frames can be decoded only after the referenced I or P frame has been decoded.
B-frames (bidirectional predictive): These require information from the previous and following I and/or P frames for encoding and decoding. The highest compression ratio is attainable with these frames. B-frames are never used as a reference for other frames.
Reference frames must be transmitted first, so transmission order and display order may differ. The first I-frame must be transmitted first, followed by the next P-frame and then by the B-frames; thereafter the second I-frame is transmitted.
An important data structure is the Group of Pictures (GOP). A GOP contains a fixed number of consecutive frames and guarantees that the first picture is an I-frame. The GOP tells an MPEG encoder which pictures should be encoded as I, P or B frames and which frames should serve as references. The first frame in a GOP is always an I-frame, which is encoded like an intraframe image, i.e. with DCT, quantization and entropy coding. The motion estimation step is activated when B or P frames appear in the GOP. Entropy coding is done using the Huffman coding technique.
Cinepak
Cinepak was originally developed to play small movies on '386 systems from a single-speed CD-ROM drive. Its greatest strength is its extremely low CPU requirement. Cinepak's quality-to-datarate ratio was amazing when it was first released, but it does not compare well with newer CODECs available today; there are higher-quality (and lower-datarate) solutions for almost any application. However, if you need your movies to play back on the widest range of machines, you may not be able to use many of the newer CODECs, and Cinepak is still a solid choice. After sitting idle for many years, Cinepak is finally being dusted off for an upgrade: Cinepak Pro from CTI (www.cinepak.com) is now in pre-release, offering an incremental improvement in quality as well as a number of bug fixes. Supported by QuickTime and Video for Windows.
Sorenson
One of the major advances of QuickTime 3 is the new Sorenson Video CODEC, which is included as a standard component of the installation. It produces the highest quality low-data-rate QuickTime movies. The Sorenson Video CODEC produces excellent Web video suitable for playback on any Pentium or PowerMac. It also delivers outstanding quality CD-ROM video at a fraction of traditional data rates, which plays well on 100 MHz systems. Compared with Cinepak, Sorenson Video generally achieves higher image quality at a fraction of the data rate. This allows for higher quality, and either faster viewing (on the WWW) or more movies on a CD-ROM (often four times as much material on a disc as Cinepak). It supports variable bitrate encoding. (When movies are compressed, each frame of the video must be encoded into a certain number of bytes. There are several techniques for allocating the bytes for each frame. Fixed bitrate is used by certain CODECs, like Cinepak, which attempt to allocate approximately the same number of bytes per frame. Variable bitrate (VBR) is supported by other CODECs, such as MPEG-2 and Sorenson, and attempts to give each frame the optimum number of bytes while still meeting set constraints, such as the overall data rate of the movie and the maximum peak data rate.) Supported by QuickTime. Manufacturer is Sorenson Vision Inc (www.sorensonvideo.com).
RealVideo
RealMedia currently has only two video CODECs: RealVideo (Standard) and RealVideo (Fractal). RealVideo (Standard) is usually best for data rates below 3 Kbps. It works better with relatively static material than with higher-action content, and it usually encodes faster. RealVideo (Standard) is significantly more CPU intensive than the RealVideo (Fractal) CODEC and usually requires a very fast PowerMac or Pentium for optimal playback. It is supported by the RealMedia player. Manufacturer is Progressive Networks (www.real.com).
H.261
H.261 is a standard video-conferencing CODEC. As such, it is optimized for low data rates and relatively low motion. It is generally not as good in quality as H.263. H.261 is CPU intensive, so data rates higher than 50 Kbps may slow down most machines, and it may not play well on lower-end machines. H.261 has a strong temporal compression component and works best on movies in which there is little change between frames. Supported by NetShow and Video for Windows.
H.263
H.263 is an advancement of the H.261 standard; it was used as a starting point for the development of MPEG (which is optimized for higher data rates). Supported by QuickTime, NetShow and Video for Windows.
Indeo Video Interactive (IVI)
Indeo Video Interactive (IVI) is a very high-quality, wavelet-based CODEC. It provides excellent image quality but requires a high-end Pentium for playback. There are currently two main versions of IVI: Version 4 is included in QuickTime 3 for Windows, while Version 5 is for DirectShow only. Neither version currently runs on the Macintosh, so any files encoded with IVI will not work cross-platform.
Version 5 is very similar to Version 4, but uses an improved wavelet algorithm for better compression. Architectures supported are QuickTime for Windows, Video for Windows and DirectShow. Manufacturer is Intel (www.intel.com).
VDOLive
VDOLive is an architecture for web video delivery created by VDOnet Corporation (www.vdo.net). VDOLive is a server-based, "true streaming" architecture that actually adjusts to viewers' connections as they watch movies. Thus, true streaming movies play in real time with no delays for downloading. For example, if you clicked on a 30-second movie, it would start playing and 30 seconds later it would be over, regardless of your connection, with no substantial delays. VDOLive's true streaming approach differs from QuickTime's "progressive download" approach. Progressive download allows you to watch (or hear) as much of the movie as has downloaded at any time, but movies may periodically pause if the movie has a higher data rate than the user's connection, or if there are problems with the connection or server, such as very high traffic. In contrast to progressive download, the VDOLive server talks to the VDOPlayer (the client) with each frame to determine how much bandwidth a connection can support. The server then sends only that much information, so movies always play in real time. In order to support this real-time adjustment of the data stream, you must use special server software to place VDOLive files on your site. The real-time adjustment to the viewer's connection works like this: VDOLive files are encoded in a "pyramidal" fashion. The top level of the pyramid contains the smallest amount of the most critical image data. If your user has a slow connection, they are only sent this top portion. The file's next level has more data and will be sent if the viewer's connection can handle it, and so forth. Users with very fast connections (T1 or better) are sent the whole file. Thus, users are only sent what they can receive in real time, but the data has been pre-sorted so that the information they get is the best image for their bandwidth.
MPEG-2
MPEG-2 is a standard for broadcast-quality digitally encoded video. It offers outstanding image quality and resolution. MPEG-2 is the primary video standard for DVD-Video. Playback of MPEG-2 video currently requires special hardware, which is built into all DVD-Video players and most (but not all) DVD-ROM kits. MPEG-2 was based on MPEG-1 but optimized for higher data rates. This allows for excellent quality at DVD data rates (300-1000 KB/sec), but tends to produce results inferior to MPEG-1 at lower rates. MPEG-2 is definitely not appropriate for use over network connections (except in very special, ultra-high-performance cases).
MPEG-4
MPEG-4 is a standard currently under development for the delivery of interactive multimedia across networks. As such, it is more than a single CODEC, and will include specifications for audio, video and interactivity. The video component of MPEG-4 is very similar to H.263 and is optimized for delivery of video at Internet data rates. One implementation of MPEG-4 video is included in Microsoft's NetShow. The rest of the MPEG-4 standard is still being designed; it was recently announced that QuickTime's file format will be used as a starting point.

Playback Architectures
QuickTime
QuickTime is Apple's multi-platform, industry-standard multimedia software architecture.
It is used by software developers, hardware manufacturers and content creators to author and publish synchronized graphics, sound, video, text, music, VR and 3D media. The latest free downloads and more information are available at Apple's QuickTime site (http://www.apple.com/quicktime). QuickTime offers support for a wide range of delivery media, from the WWW to DVD-ROM. It was recently announced that the MPEG-4 standard (now in design) will be based upon the QuickTime file format. QuickTime is also widely used in digital video editing for output back to videotape. QuickTime is the dominant architecture for CD-ROM video. It enjoys an impressive market share due to its cross-platform support, wide range of features and free licensing, and it is used on the vast majority of CD-ROM titles for these reasons. QuickTime is also a good choice for kiosks, as it integrates well with Macromedia Director, MPEG and a range of other technologies.
RealMedia
The RealMedia architecture was developed by Progressive Networks, makers of RealAudio. It was designed specifically to support live and on-demand video and audio across the WWW. The first version of RealMedia is focused on video and audio and is referred to as RealVideo. Later releases of RealMedia will incorporate other formats including MIDI, text, images, vector graphics, animations and presentations. RealMedia content can be placed on your site either with or without special server software. There are performance advantages with the server, but you don't have to buy one to get started; however, high-volume sites will definitely want a server to get substantially improved file delivery performance. Users can view RealMedia sites with the RealPlayer, a free "client" application available from Progressive; a Netscape plug-in is also available. The main downside to RealMedia is that it currently requires a PowerMac or Pentium computer to view, so RealMedia movies aren't available to the full range of potential users. The latest free downloads, as well as more information, are available at www.real.com.
NetShow
Microsoft's NetShow architecture is aimed at providing the best multimedia delivery over networks, from 14.4 kbps modems to high-speed LANs. There is an impressive range of audio and video CODECs built into NetShow 3.0. Combined with a powerful media server, this is a powerful solution for networked media. Technically, the term "NetShow" refers to the client installation and the server software. NetShow clients are built on top of the DirectShow architecture. Because of this, NetShow has access to its own CODECs, and also to those for DirectShow, Video for Windows and QuickTime. NetShow media on WWW pages may be viewed via ActiveX components (for Internet Explorer), plug-ins (for Netscape Navigator) or standalone viewers. NetShow servers support "true streaming" (in their case called "intelligent streaming"): the ability to guarantee continuous delivery of media even if the network's performance degenerates. If this happens, NetShow will automatically use less video data (thus reducing the quality). If the amount of available bandwidth decreases further, NetShow degrades video quality further, until only the audio is left. Microsoft says that their implementation provides the most graceful handling of this situation. The latest free downloads, as well as more information, are available at Microsoft's NetShow site (www.microsoft.com/netshow).
DirectShow
DirectShow (formerly known as ActiveMovie) is the successor to Microsoft's Video for Windows architecture. It is built on top of the DirectX architecture (including DirectDraw, DirectSound and Direct3D) for optimum access to audio and video hardware on Windows-based computers. Supported playback media include the WWW, CD-ROM, DVD-ROM and DVD-Video (with hardware); DV camera support will be added in an upcoming release. DirectShow has its own player (the Microsoft Media Player, implemented as an ActiveX control), which may be used independently or within Internet Explorer. There is also a plug-in for use with Netscape Navigator, and playback may also be provided by other applications using the OCX component. As DirectShow is the playback architecture for NetShow, these playback options support either delivery approach. Media types supported are audio, video, closed captioning (SAMI), MIDI, MPEG and animation (2D or 3D). The latest free downloads, as well as more information, are available at Microsoft's DirectX site (www.microsoft.com/directx/pavilion/dshow/default.asp).
Video for Windows
Video for Windows is similar to QuickTime. Its main advantage is that it is built into Windows 95. However, it is limited in many ways: it runs on Windows only, doesn't handle audio/video synchronization as well as QuickTime, and doesn't support variable-length frames. Video for Windows is no longer supported by Microsoft and is being replaced by DirectShow/ActiveMovie (one of the DirectX technologies). Video for Windows is often referred to as "AVI" after the .AVI extension specified by its file format.
[Some of the details discussed above are available at: http://www.etsimo.uniovi.es/hypgraph/video/codecs/Default.htm]

Some Concepts of Video Editing
Time Base and Frame Rates
In the natural world we experience time as a continuous flow of events. Working with video, however, requires precise synchronization, so it is necessary to measure time with precise numbers. Familiar time increments like hours, minutes and seconds are not precise enough for editing, since many frames are displayed within each second. When editing video, several source clips may need to be imported to create the output clip. The source frame rates of these clips determine how many frames are displayed per second within them. Source frame rates differ for different types of material:
Motion picture film – 24 fps
PAL and SECAM video – 25 fps
NTSC video – 29.97 fps
Web applications – 15 fps
CD-ROM applications – 30 fps
In a video editing project file there is a single, common timeline on which all the imported clips are placed. A parameter called the timebase determines how time is measured and displayed within the editing software. For example, a timebase of 30 means that each second is divided into 30 units, and the exact time at which an edit occurs depends on the timebase specified for the particular project. Since there has to be a common timebase for the editor's timeline, source clips whose frame rates do not match the specified timebase need adjustment. If the frame rate of a source clip is 30 fps and the timebase of the project is also 30, then all frames are displayed as expected. However, if the source clip was recorded at 24 fps and is placed on a timeline with a timebase of 30, then to preserve the proper playback speed some of the original frames have to be repeated: over half a second, frames 1, 5 and 9 are repeated. If the final edited video then needs to be exported at 15 fps, every alternate frame of the timeline is discarded. On the other hand, if the timebase was set at 24 and the final video needs to be exported at 15 fps, then selected frames have to be discarded: over half a second, frames 3, 6, 8 and 11 are discarded.
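The frame repetition and dropping described above can be sketched as follows (the function name and rounding choices are ours; frame indices are counted from 0, so source frame 0 here corresponds to frame 1 in the text).

# Sketch of conforming source material to a project timebase.

def conform(num_source_frames, source_fps, timebase):
    """For each slot of the timebase, return the index of the source frame displayed."""
    num_slots = round(num_source_frames * timebase / source_fps)
    return [int(slot * source_fps / timebase) for slot in range(num_slots)]

print(conform(12, 24, 30))   # 24 fps clip on a 30 fps timebase: frames 0, 4 and 8 repeat
print(conform(15, 30, 15))   # 30 fps timeline exported at 15 fps: alternate frames kept
print(conform(12, 24, 15))   # 24 fps timebase exported at 15 fps: frames 2, 5, 7 and 10 dropped

The repeated and discarded frame indices produced by this sketch match the half-second examples given above.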
SMPTE Timecode
Timecode defines how frames in a movie are counted and affects the way you view and edit a clip. For example, you count frames differently when editing video for television than when editing for motion-picture film. A standard way of representing timecode has been developed by a global body, the Society of Motion Picture and Television Engineers (SMPTE), which represents timecode by a set of numbers denoting hours, minutes, seconds and frames, added to the video to enable precise editing, e.g. 00:03:51:03.
When NTSC color systems were developed, the frame rate was changed by a tiny amount to eliminate the possibility of crosstalk between the audio and color information; the actual frame rate used is approximately 29.97 frames per second. This poses a problem, because the small difference causes SMPTE time and real time (what your clock reads) to drift apart over long periods. Because of this, two methods are used to generate SMPTE timecode in the video world: drop and non-drop.
In SMPTE non-drop timecode, the frame numbers are always incremented by one in exact synchronization with the frames of the video. However, since the video actually plays at only 29.97 frames per second (rather than 30), SMPTE time increments at a slower rate than real-world time. This leads to a discrepancy between SMPTE time and real time: after a while, the clock on the wall will be farther ahead than the SMPTE time displayed in the application. The difference of 0.03 frames per second translates to (0.03 × 60 × 60) or 108 frames per hour.
SMPTE drop timecode (which also runs at 29.97 frames per second) compensates for the discrepancy between real-world time and SMPTE time by "dropping" frame numbers from the SMPTE sequence in order to catch up with real-world time. This means that occasionally the SMPTE time jumps forward by more than one frame. The count is advanced by two frame numbers at every minute boundary, which would increase the numbering by 120 frames every hour; to achieve a total compensation of 108 frames, the jump is omitted at the minute boundaries 00, 10, 20, 30, 40 and 50. Thus when SMPTE drop time increments from 00:01:59:29, the next value is 00:02:00:02 in SMPTE drop rather than 00:02:00:00 in SMPTE non-drop. In SMPTE drop, certain codes therefore no longer exist: there is no such time as 00:02:00:00, the timecode being 00:02:00:02 instead. No frames are lost, because drop-frame timecode does not actually drop frames, only frame numbers. To distinguish it from the non-drop type, the numbers are separated by semicolons instead of colons, i.e. 00;02;00;02.
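A sketch of how a frame count can be turned into drop-frame timecode is shown below; it follows the rule described above (skip two frame numbers every minute except minutes 00, 10, 20, 30, 40 and 50), and the function name is ours.

# Drop-frame timecode sketch at 29.97 fps.

def dropframe_timecode(frame):
    """Convert a frame count (starting from 0) to SMPTE drop-frame timecode."""
    frames_per_min = 30 * 60 - 2        # 1798 codes in a minute that skips two numbers
    frames_per_10min = 10 * 1800 - 18   # 17982 actual frames in every ten-minute block
    d, m = divmod(frame, frames_per_10min)
    if m > 2:
        frame += 18 * d + 2 * ((m - 2) // frames_per_min)
    else:
        frame += 18 * d
    hh, mm, ss, ff = frame // 108000, (frame // 1800) % 60, (frame // 30) % 60, frame % 30
    return f"{hh:02d};{mm:02d};{ss:02d};{ff:02d}"

print(dropframe_timecode(3597))   # 00;01;59;29
print(dropframe_timecode(3598))   # 00;02;00;02 -- codes ;00 and ;01 do not exist here

As in the text, the code following 00;01;59;29 comes out as 00;02;00;02, even though no actual frame has been skipped.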
Online Editing and Offline Editing
There are three phases of video production:
• Pre-production: writing scripts, visualizing scenes, storyboarding etc.
• Production: shooting the actual scenes.
• Post-production: editing the scenes and correcting or enhancing them wherever necessary.
Editing starts with a draft or rough cut called the offline edit, which gives a general idea of the editing possibilities. The offline edit is usually done on a low-end system using a low-resolution copy of the original video. This makes the process economically feasible, because a low-resolution copy is sufficient for deciding on the edit points. An edit decision list (EDL) is created which contains a list of the edit changes to be carried out. The EDL can be refined through successive iterations until the edit points and changes are finalized. Since this iterative process may take a long time (typically several days), tying up a high-end system for it is not considered desirable. Once the EDL is finalized, the final editing work is done on the actual high-resolution copy of the video using a powerful system. This operation is called the online edit. It requires far less time than the offline edit because the operations are performed only once, based on the finalized EDL, so the higher cost of the high-end system need be borne only for a short duration (typically a few hours).

Edit Decision List (EDL)
An EDL is used in offline editing for recording the edit points. It contains the names of the original clips, the In and Out points, and other editing information. In Premiere, editing decisions made in the Timeline are recorded in text format and can then be exported in one of the EDL formats. A standard EDL contains the following columns:
(a) Header – contains the title and the type of timecode (drop-frame or non-drop-frame)
(b) Source Reel ID – identifies the name or number of the videotape containing the source clips
(c) Edit Mode – indicates whether edits take place on the video track, the audio track or both
(d) Transition Type – describes the type of transition, e.g. wipe, cut etc.
(e) Source In and Source Out – lists the timecodes of the first and last frames of the clips
On a high-end system the EDL is fed to an edit controller, which applies the editing changes to the high-quality clips.

FireWire (IEEE-1394)
Although digital video in an external device or camera is already in binary computer code, you still need to capture it to a file on a hard disk. Capturing digital video is a simple file transfer if the computer has a FireWire (IEEE-1394) card and a digital video CODEC is available. The IEEE-1394 interface standard is also known as "FireWire" by Apple Computer, Inc. and as "iLink" or "iLink 1394" by Sony Corp. Developed by the Institute of Electrical and Electronics Engineers, it is a serial data bus that allows high-speed data transfers. Three data rates are supported: 100, 200 and 400 Mbps; the bus speed is governed by the slowest active node. The cable consists of two separately shielded pairs of wires for signaling, two power conductors and an outer shield. Up to 63 devices can be connected in a daisy chain. The standard also supports hot plugging, which means that devices can be connected or disconnected without switching off power in the cable. IEEE 1394 is a non-proprietary standard and many organizations and companies have endorsed it: the Digital VCR Conference selected IEEE 1394 as its standard digital interface, an EIA committee selected it as the point-to-point interface for digital TV, and the Video Electronics Standards Association (VESA) adopted IEEE 1394 for home networking.
Microsoft first supported IEEE 1394 in the Windows 98 operating system, and it is supported in newer operating systems as well.

THE VISUAL DISPLAY SYSTEM
The visual display system consists of two important components – the monitor, and the adapter card and cable.

THE MONITOR
The monitor works on the principle of a sealed glass tube called the Cathode Ray Tube (CRT).

Monochrome CRT
The CRT is a vacuum-sealed glass tube with two electrical terminals inside, the negative electrode or cathode (K) and the positive electrode or anode (A). Across these terminals a high potential, of the order of 18 kV, is maintained. This produces a beam of electrons, known as cathode rays, travelling from the cathode towards the anode. The front face of the CRT is coated with a layer of a material called phosphor, arranged in the form of a rectangular grid of a large number of dots. Phosphor has the property of emitting a glow of light when it is hit by charged particles such as electrons. The beam of electrons is controlled by three other positive terminals: the control grid (G1) helps to draw out the electrons in a uniform beam, the accelerating grid (G2) accelerates the electrons in the forward direction, and the focusing grid (G3) focuses the beam to a single point on the screen ahead, so that the diameter of the beam equals the diameter of a single phosphor dot. This dot is called a pixel, short for picture element. As the beam hits the phosphor dot, a single glowing pixel is created at the center of the screen. On the neck of the CRT are two other electrical coils called deflection coils. When current flows through these coils, the magnetic field produced interacts with the electron beam, deflecting it from its original path. One coil, the horizontal deflection coil, moves the beam horizontally across the screen; the other, the vertical deflection coil, moves the beam vertically along the height of the screen. When both coils are energized the electron beam can be moved in any direction, generating a single spot of light at any point on the CRT screen.

Raster Scanning
Drawing an image on the CRT screen involves the process of raster scanning, by which the electron beam sequentially moves over all the pixels on the screen. The beam starts from the upper-left corner of the screen and moves over the first row of pixels until it reaches the right-hand margin of the screen. The beam is then switched off and retraces horizontally back to the beginning of the second row of pixels; this is called horizontal retrace. It is then turned on again and moves over the second row of pixels. This process continues until the beam reaches the bottom-right corner of the screen, after which it retraces back to the starting point; this is called vertical retrace. The entire pattern is called a raster and each scan line is called a raster line.

Frames and Refresh Rate
The electron beam is said to produce a complete frame of the picture when, starting from the top-left corner, it moves over all the pixels and returns to the starting point. The human brain has the capability of holding on to the image of an object for a fraction of a second even after the object has been removed from before our eyes; this phenomenon is called persistence of vision. As the beam moves over each pixel, the glow of the pixel dies down, although its image persists in our eyes for some time after that.
So if the beam can come back to a pixel before its glow has completely disappeared, the pixel appears to us to be glowing continuously. It has been observed that we see a steady image on the screen only if 60 frames are generated per second, i.e. the electron beam should return to its starting point within 1/60th of a second; the monitor is then said to have a refresh rate of 60 Hz. A monitor with a refresh rate of less than 50 Hz produces a perceptible flicker on the screen and should be avoided.

Color CRT
The working principle of a color CRT is the same as that of a monochrome CRT, except that each pixel consists of three colored dots instead of one and is called a triad. These colors are red, green and blue (RGB) and are called the primary colors. Corresponding to the three dots there are also three electron beams from the electrode (also called the electron gun), each of which falls on the corresponding dot. It has been experimentally observed that the three primary colored lights can combine in various proportions to produce all other colors. As the three beams hit their corresponding dots with various intensities, they produce different proportions of the three primary colored lights, which together create the sensation of a specific color in our eyes. Our eyes cannot distinguish the individual dots but see their net effect as a whole. A perforated screen called a shadow mask prevents the beams from falling in the gaps between the dots. Secondary colors are created by mixing equal quantities of primary colors: red and green create yellow, green and blue create cyan, blue and red create magenta, while all three colors in equal proportion produce white.

Interlacing
Interlacing is a process by which monitors with lower refresh rates can produce images comparable in quality to those produced by a monitor with a higher refresh rate. Each frame is split into two parts consisting of the odd and even lines of the complete image, called the odd field and the even field. The first field is displayed for half the frame duration and then the second field is displayed so that its lines fit between the lines of the first field. This succeeds in lowering the frame rate without increasing the flicker correspondingly, although the picture quality is still not the same as that of a non-interlaced monitor. One of the most popular applications of interlacing is TV broadcasting.

Monitor Specifications
(a) Refresh Rate: The number of frames displayed by a monitor in one second. Thus a monitor with a refresh rate of 60 Hz refreshes the image on the screen 60 times per second.
(b) Horizontal Scan Rate: The number of horizontal lines displayed by the monitor in one second. For a monitor with a refresh rate of 60 Hz and 600 horizontal lines on the screen, the horizontal scan rate is 36 kHz.
(c) Dot Pitch: The shortest distance between two neighbouring pixels or triads on the screen, usually of the order of 0.4 mm to 0.25 mm.
(d) Pixel Addressability: The total number of pixels that can be addressed on the screen, measured as the product of the number of pixels horizontally and vertically. Modern monitors usually have 640 x 480 or 800 x 600 pixels on the screen.
(e) Aspect Ratio: The ratio of the width of the screen to its height. For computer monitors and TV screens it is 4:3, whereas for movie theatres it is 16:9.
(f) Size: The longest diagonal length of the monitor. Standard computer monitors are usually between 15" and 20" in size.
Problem-1

A 15" monitor with an aspect ratio of 4:3 has a pixel addressability of 800 x 600. Calculate its resolution.

Let the width of the monitor be 4x and its height be 3x. Since the width, height and diagonal form a right-angled triangle, we know
(4x)^2 + (3x)^2 = 15^2, i.e. 16x^2 + 9x^2 = 225, i.e. 25x^2 = 225, so x = 3.
The width of the monitor is therefore 12" and the height is 9".
So the resolution is (800/12) = (600/9) = 66.67 dpi.

Problem-2

A monitor can display 4 shades of red, 8 shades of blue and 16 shades of green. Find its color depth.

Each pixel can take up a total of (4 x 8 x 16) or 512 colors. Since 2^9 = 512, the monitor has a color depth of 9 bits.

THE VIDEO ADAPTER CARD AND CABLE

The video adapter is an expansion card which usually sits in a slot on the motherboard. It acts as an interface between the processor of the computer and the monitor. The digital data required for creating an image on the screen is generated by the central processor of the computer and consists of RGB values for each pixel on the screen; these are called pixel attributes. For an 8-bit image, each pixel is digitally represented by an 8-bit binary number. The adapter interprets these attributes and translates them into one of 256 voltage levels (since 2^8 = 256) to drive the electron gun of the monitor. These intensity signals, along with two synchronization signals for positioning the electron beam at the location of the pixel, are fed to the monitor from the adapter through the video cable.

The VGA

The Video Graphics Array (VGA) adapter was a standard introduced by IBM which was capable of displaying text and graphics in 16 colors at 640 x 480 or 256 colors at 320 x 200. A VGA card had no real processing power, meaning that the CPU had to do most of the image manipulation tasks. The VGA adapter was connected to a VGA-compatible monitor using a video cable with a 15-pin connector. The pins on the connector carried various signals from the card to the monitor, including the color intensity signals and the synchronization signals. The sync signals were generated by the adapter to control the movement of the electron guns of the CRT monitor. They consisted of the horizontal sync pulses, which controlled the left-to-right movement of the electron beam as well as the horizontal retrace, and the vertical sync pulses, which controlled the up-and-down movement of the beam as well as the vertical retrace. Nowadays the VGA has become obsolete, having been replaced by SVGA adapters.

The SVGA

The industry extended the VGA standard to include improved capabilities such as 800 x 600 with 16-bit color and, later, 1024 x 768 with 24-bit color. These standards were collectively called Super VGA or SVGA. The Video Electronics Standards Association (VESA) defined a standard interface for SVGA adapters called the VESA BIOS Extensions. Along with these improved standards came accelerated video cards, which included a special graphics processor on the adapter itself and relieved the main CPU of most of the tasks of image manipulation.
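The translation described above, of an 8-bit pixel attribute into one of 256 voltage levels, can be pictured with a small sketch. The 0 to 0.7 V output swing used below is an assumed figure (commonly associated with analog VGA signalling) rather than something stated in this text.

    # Minimal sketch: mapping an 8-bit pixel attribute to one of 256 analog
    # levels, roughly as a video DAC does. The 0-0.7 V full-scale swing is
    # an assumption for illustration only.
    MAX_LEVEL = 2 ** 8 - 1     # 255, the largest 8-bit attribute value
    FULL_SCALE_VOLTS = 0.7     # assumed full-scale analog output

    def attribute_to_volts(attribute: int) -> float:
        if not 0 <= attribute <= MAX_LEVEL:
            raise ValueError("attribute must fit in 8 bits")
        return FULL_SCALE_VOLTS * attribute / MAX_LEVEL

    for value in (0, 64, 128, 255):
        print(f"attribute {value:3d} -> {attribute_to_volts(value):.3f} V")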
Components of an Adapter

The main components of the video adapter card are the following.

Display Memory

A bank of memory within the adapter card used for storing pixel attributes. It is initially used to store the image data coming from the CPU, and is later read by the adapter to generate the RGB signals for the monitor. The amount of memory must be sufficient to hold the attributes of all the pixels on the screen, and therefore depends on the pixel addressability as well as the color depth. Thus for an 8-bit image displayed in 640 x 480 mode, the display memory required is 640 x 480 bytes, i.e. about 0.3 MB, or roughly 1 MB when rounded up to the next whole megabyte (as in Problem-3 below).

Graphics Controller

A chip within the adapter card responsible for coordinating the activities of all the other components of the card. In earlier generations of video cards the controller simply passed on the data from the processor to the monitor after conversion. In modern accelerated video cards the controller can also manipulate the image data independently of the central processor.

Digital-to-Analog Converter

The DAC converts the digital data stored in the display memory into the analog voltage levels that drive the electron beams of the CRT.

Problem-3

A monitor has a pixel addressability of 800 x 600 and a color depth of 24 bits. Calculate the minimum amount of display memory required in its adapter card to display an image on the screen.

A total of 24 bits is allocated to each pixel. So for 800 x 600 pixels, the total number of bits required is (800 x 600 x 24). To store these bits, the amount of display memory required is (800 x 600 x 24)/(8 x 1024 x 1024) MB, which is about 1.37 MB; rounded up to the next whole megabyte, this becomes 2 MB.

Accelerated Graphics Port (AGP)

To combat the eventual saturation of the PCI bus with video information, a new interface designed specifically for the video subsystem has been pioneered by Intel (http://developer.intel.com/technology/agp). AGP was developed in response to the trend towards ever greater performance requirements for video. As software evolves and computers move into previously unexplored areas such as 3D acceleration and full-motion video playback, both the processor and the video adapter need to process more and more information. Another issue has been the increasing demand for video memory: much larger amounts of memory are required on video cards, not just for the screen image but also for 3D calculations, which in turn makes the video card more expensive.

AGP gets around these problems in two ways. It provides a separate AGP slot on the motherboard connected to an AGP bus providing 530 MB/s. It also utilizes a portion of the main memory, known as the texture cache, for storing pixel attributes, thereby going beyond the limits of the display memory on the adapter card. AGP is ideal for transferring the huge amounts of data required for displaying 3D graphics and animation. AGP is considered a port and not a bus, as it involves only two devices, the processor and the video card, and is not expandable. AGP has helped remove bandwidth overheads from the PCI bus. The slot itself is physically similar to the PCI slot but is offset further from the edge of the motherboard.
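The display-memory calculation in Problem-3 generalises directly to other display modes. The helper below is a small sketch; the modes listed are illustrative examples only.

    import math

    # Minimal display-memory calculator: megabytes needed to hold one
    # attribute per pixel for a given pixel addressability and color depth.
    def display_memory_mb(width: int, height: int, bits_per_pixel: int) -> float:
        total_bits = width * height * bits_per_pixel
        return total_bits / (8 * 1024 * 1024)

    # Illustrative modes (not an exhaustive list).
    for w, h, depth in [(640, 480, 8), (800, 600, 24), (1024, 768, 24)]:
        exact = display_memory_mb(w, h, depth)
        rounded = math.ceil(exact)   # rounded up to the next whole megabyte
        print(f"{w} x {h} at {depth}-bit: {exact:.2f} MB (about {rounded} MB)")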
The Liquid Crystal Display

Principle of Operation

Liquid crystals were first discovered in the late 19th century by the Austrian botanist Friedrich Reinitzer, and the term liquid crystal was coined by the German physicist Otto Lehmann. Liquid crystals are transparent organic substances consisting of long rod-like molecules which, in their natural state, arrange themselves with their axes roughly parallel to each other. By flowing the liquid crystal over a finely grooved surface it is possible to control the alignment of the molecules, as they follow the alignment of the grooves.

The first principle of an LCD is to sandwich a layer of liquid crystal between two finely grooved surfaces whose grooves are perpendicular to each other. The molecules at the two surfaces are thus aligned perpendicular to each other, and those in the intermediate layers are twisted by intermediate angles. Light, following the orientation of the molecules, is twisted by 90 degrees as it passes through the liquid crystal.

The second principle of an LCD depends on polarizing filters. Natural light waves are oriented at random angles. A polarizing filter acts like a grid of fine parallel lines, blocking all light except the waves that are parallel to those lines. A second polarizer perpendicular to the first would therefore block all of the already polarized light.

An LCD consists of two polarizing filters perpendicular to each other with a layer of twisted liquid crystal between them. Light, after passing through the first polarizer, is twisted through 90 degrees by the liquid crystal and passes out completely through the second polarizer. This gives us a lighted pixel. On applying an electric charge across the liquid crystal, its molecular alignment is disturbed. In this case the light is not twisted by 90 degrees by the liquid crystal and is therefore blocked by the second polarizer. This gives us a dark pixel. Images are drawn on the screen using arrangements of these lighted and dark pixels.
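The on/off behaviour of a single pixel described above can be summarised in a toy model. Real panels control brightness continuously by varying the applied voltage; the sketch below captures only the two extreme states.

    # Toy model of a twisted-nematic LCD pixel between crossed polarizers.
    # Only the two extreme states described in the text are modelled.
    def pixel_is_lit(voltage_applied: bool) -> bool:
        # No voltage: the liquid crystal twists the light by 90 degrees, so it
        # passes through the second (crossed) polarizer -> lighted pixel.
        # Voltage applied: the twist is disturbed, the light is not rotated,
        # and the crossed polarizer blocks it -> dark pixel.
        return not voltage_applied

    print(pixel_is_lit(voltage_applied=False))  # True  -> lighted pixel
    print(pixel_is_lit(voltage_applied=True))   # False -> dark pixel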