5 Audio
SWE 423 - MULTIMEDIA SYSTEMS
Dr. Abdallah Al-Sukairi - KFUPM

Introduction
- Sound is a relatively new capability for PCs
- Hardware requirements
- Software can provide the functions of a recording studio, including multi-track recording, mixing, and effects, on a desktop computer
- Audio is an important element in most multimedia applications

Digital Audio
- Sampled sound
  - every nth fraction of a second, a sample of sound is taken and stored digitally (bits and bytes)
  - it was proven mathematically in 1948 that a band-limited analog signal can be represented accurately by digital samples taken at a rate of at least twice the maximum frequency contained in the source
- Analog-to-digital converters (ADC)
  - appeared in the early 1980s in the telephone industry
- Waveform (continuous) vs. digital form (discrete)
- Sampling rate (frequency)
  - sampling rates of 11.025, 22.05, and 44.1 kHz (thousands of samples per second) are standard in the audio industry
  - the higher the sampling rate, the better the fidelity

Digital Audio (continued)
- Sampling time
  - the speed at which the ADC converts the amplitude to a numeric sample value
- Sample resolution
  - number of bits used to represent the amplitude value
  - 8-bit sampling provides only 256 levels
  - 16-bit sampling provides 65,536 levels
- Pulse Code Modulation (PCM)
  - sample values are stored sequentially in a file
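
The sampling and sample-resolution ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_and_quantize`, the 440 Hz test tone, and the unsigned integer coding are choices made here, not part of the course material.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bits, duration_s):
    """Sample a unit-amplitude sine wave and quantize each sample (PCM-style)."""
    levels = 2 ** bits                       # 8 bits -> 256 levels, 16 bits -> 65,536
    n_samples = round(sample_rate_hz * duration_s)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz               # time of the nth sample
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # value in [-1, 1]
        # Map [-1, 1] onto the integer codes 0 .. levels-1 (rounding = quantization)
        codes.append(round((amplitude + 1) / 2 * (levels - 1)))
    return codes

# Hypothetical example: 10 ms of a 440 Hz tone, 8-bit samples at 11.025 kHz
pcm_8bit = sample_and_quantize(440, 11025, 8, 0.01)
pcm_16bit = sample_and_quantize(440, 11025, 16, 0.01)
```

Storing the returned codes one after another is, in essence, what PCM does; the 16-bit version simply has far more levels available for the same waveform.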
Digital Audio (continued)
- Rounding-off (quantization)
  - produces unwanted background hissing noise
- Clipping
  - occurs for amplitudes greater than the intervals available
  - produces distortion
- Digital-to-analog converters (DAC)
  - the output of a DAC is a stepped "staircase" wave
  - filters are used to smooth the wave
- Optimum combination of sampling rate and resolution
- Digital recording and playback
  - 8-bit ADC and DAC at 11.025 kHz: telephone-like quality
  - 8-bit at 22.05 kHz: AM radio quality
  - 16-bit at 44.1 kHz: CD-audio quality

Digital Audio (continued)
- Mono audio
- Stereo audio
  - two channels
  - double the space requirements
- Quality of digital audio
- Storage requirements (mono)
  - bytes per second = (sample rate * bits per sample) / 8
  - for stereo, multiply by 2
- Size of a 1-minute recording

  Sampling rate   Resolution   Channels   Size
  44.1 kHz        16-bit       Stereo     10.58 MB
  22.05 kHz       16-bit       Stereo     5.29 MB
  22.05 kHz       8-bit        Mono       1.32 MB

Sound Boards
- Sample size
  - 8-bit, 16-bit, 24-bit
- Sample rate
  - 8, 11.025, 22.05, 44.1, 48, 96 kHz
- Amplifiers (watts per channel)
  - 2, 4
- Digital signal processor (DSP)
- On-board connectors
- Hardware full-duplex support
  - enables simultaneous record and playback

Sound Boards (continued)
- MIDI synthesis
  - FM synthesis: sounds are generated from mathematical formulas
  - Wavetable synthesis: a stored bank of sampled notes recorded from actual instruments
- Effects engine
- Environmental 3D positional audio
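
The storage formula given under Digital Audio can be checked with a short calculation. The sketch below assumes decimal megabytes (1 MB = 1,000,000 bytes), which is what reproduces the 1-minute sizes in the table; the function names are illustrative.

```python
def bytes_per_second(sample_rate_hz, bits_per_sample, channels=1):
    """bytes per second = (sample rate * bits per sample) / 8, times the channel count."""
    return sample_rate_hz * bits_per_sample * channels // 8

def recording_size_mb(sample_rate_hz, bits_per_sample, channels=1, seconds=60):
    """Uncompressed PCM size in decimal megabytes (assumption: 1 MB = 1,000,000 bytes)."""
    return bytes_per_second(sample_rate_hz, bits_per_sample, channels) * seconds / 1_000_000

cd_stereo = recording_size_mb(44100, 16, channels=2)     # 44.1 kHz, 16-bit, stereo
half_stereo = recording_size_mb(22050, 16, channels=2)   # 22.05 kHz, 16-bit, stereo
half_mono = recording_size_mb(22050, 8, channels=1)      # 22.05 kHz, 8-bit, mono
```

Rounded to two decimals, the three values match the table rows: 10.58, 5.29, and 1.32 MB.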
Synthesized Audio
- Synthesized music
  - creating sounds that resemble those of conventional musical instruments
- Musical Instrument Digital Interface (MIDI)
  - a standard that allows music synthesizers from different manufacturers to communicate with each other
  - the MIDI interface was first adopted in 1983
  - MIDI is both a hardware and a software specification
  - what is performed on one instrument can be played by any other instrument
  - standard set of 128 sounds (patches)
- MIDI files are significantly smaller than wave files

MIDI
- MIDI port
- MIDI messages
  - status bytes or data bytes
- MIDI channels
- MIDI kit
- MIDI mapper
- MIDI sequencer

MIDI (continued)
- Playing a MIDI file
- MIDI synthesis
- FM synthesis
  - sounds are generated from mathematical formulas (algorithms)
  - Yamaha OPL-III chipset
- Wavetable synthesis
  - a stored bank of sampled notes recorded from actual instruments

Digital Signal Processor (DSP)
- First appeared in Turtle Beach boards
- Off-loads the CPU
- Programmable
- Hardware compression and decompression
- Sound effects
- Multi-function boards

Speakers
- Built-in amplifier that delivers 10-20 watts per channel
- Two cones: tweeter and woofer
- Headphone jacks
- Separate bass and treble controls
- Magnetically shielded
- Three-piece speakers
  - a stand-alone subwoofer gives depth to low-frequency sounds
- Dolby Digital 5.1 surround sound

Microphones
- Operating principle
  - Dynamic
  - Condenser
- Directionality
  - Omnidirectional
  - Unidirectional
  - Bidirectional
- Specifications
  - Sensitivity
  - Overload characteristics
  - Linearity, or distortion
  - Frequency response
  - Noise
- Microphone placement
- Studio techniques
- Mixer
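
The MIDI slides above note that messages are made of status bytes and data bytes. As a rough illustration, the sketch below builds a Note On message: in the MIDI 1.0 protocol a status byte has its high bit set (Note On is 0x90 plus the channel number), while data bytes stay in the range 0 to 127. The helper names are hypothetical.

```python
def is_status_byte(value):
    """Status bytes have the most significant bit set; data bytes do not."""
    return value & 0x80 != 0

def note_on(channel, note, velocity):
    """Build a 3-byte Note On message: one status byte followed by two data bytes."""
    if not 0 <= channel <= 15:
        raise ValueError("MIDI channel must be 0-15")
    if not (0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("data bytes must be 0-127")
    status = 0x90 | channel          # 0x9n = Note On for channel n
    return bytes([status, note, velocity])

# Middle C (note 60) on the first channel at velocity 100
message = note_on(0, 60, 100)
```

Three bytes per note event, compared with tens of thousands of bytes per second of sampled audio, is the reason MIDI files are so much smaller than wave files.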
Oversampling
- Improves the apparent quality (fidelity) of sound
- The DAC runs at a faster rate than the original ADC sampling rate
- Technique
  - 4x, 8x, and 16x oversampling
  - in the case of 4x oversampling, three 0-value samples are inserted between each pair of original samples
  - an interpolation filter calculates appropriate values and replaces the 0s

DirectX
- A set of APIs that allow applications to access multimedia hardware
- Components
  - DirectDraw, DirectSound, Direct3D, DirectPlay, DirectInput, DirectAnimation, DirectShow
- DirectSound
  - provides mixing of audio streams, hardware acceleration, and direct access to the sound device
  - enables application developers to take advantage of extended services offered by sound cards and their associated drivers

The Red Book Standard
- CD-audio music market
- IEC 60908 (the related ISO/IEC 10149 covers CD-ROM data discs)
- 16-bit samples at 44.1 kHz allow accurate reproduction of the sounds that humans can hear

File Format
- Each of the three major platforms has its own sound file format: AIFF for MacOS, WAV for Windows, and AU for Unix
- WAVE (.WAV) files
  - Microsoft standard
  - a header with information about the sampling process used to create the file, followed by a stream of digital sound data
  - in a stereo 16-bit wave file, alternating pairs of bytes contain the data for each channel
- Several other formats
  - .voc, .au, .aif, .snd, ...
- Many utilities convert other formats to .WAV
  - WAV ripper
  - Streambox Ripper
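
The 4x oversampling technique described above (insert zeros, then let an interpolation filter replace them) can be sketched as follows. This is a simplified illustration, not a production filter: it assumes a triangular kernel, which amounts to linear interpolation, whereas real DACs use longer low-pass filters. For simplicity it also appends three zeros after the last sample.

```python
def oversample_4x(samples):
    """Zero-stuff by a factor of 4, then smooth with a triangular interpolation kernel."""
    factor = 4
    stuffed = []
    for s in samples:
        stuffed.append(s)
        stuffed.extend([0.0] * (factor - 1))     # three 0-value samples after each original

    # Triangular kernel: weights 0.25, 0.5, 0.75, 1, 0.75, 0.5, 0.25
    half = factor - 1
    kernel = [1 - abs(d) / factor for d in range(-half, half + 1)]

    out = []
    for i in range(len(stuffed)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            k = i - (j - half)                   # neighbouring input index
            if 0 <= k < len(stuffed):
                acc += stuffed[k] * weight
        out.append(acc)
    return out

smooth = oversample_4x([0.0, 4.0])
```

Between the two original samples the inserted zeros become 1, 2, and 3, a straight ramp from 0 to 4. The values after the final sample taper toward zero because the sketch treats the signal as silent beyond its end.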
The Structure of a Wave File
- WAV files are RIFF files
- Resource Interchange File Format (RIFF)
  - used by Microsoft to store many types of multimedia resource files
  - tagged file format
  - chunk
    - chunk ID (4 bytes)
    - size (4 bytes)
    - data

Sample Wave File

  Position (dec)   Size (bytes)   Content   Comment
  0                4              'RIFF'
  4                4              27796     file size - 8
  8                4              'WAVE'
  12               4              'fmt '    next chunk ID
  16               4              16        format chunk size
  20               2              1         PCM format
  22               2              1         number of channels
  24               4              22050     sampling rate
  28               4              22050     bytes per second
  32               2              1         bytes per sample
  34               2              8         bits per sample
  36               4              'data'    next chunk ID
  40               4              27760     size of wave data
  44               x              x         digitized audio data

Waveform Audio Recording Techniques
- Audio noise
  - mute all unused inputs
  - move the audio card to the last slot in the bus, away from the video card
- Use a good analog audio mixer
- Use the best microphone you can afford
- The acoustic environment
- Microphone placement
- Capture with high quality
- Analog recording
  - Cassette
  - DAT

Digital Audio Editing
- Mixing sound from more than one source
- Adding effects such as echo, reverberation, and chorus
- Looping to extend duration
- Commercial waveform editing tools
  - Sound Forge
  - Wave Studio
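
The sample wave file header tabulated above can be reproduced and read back with Python's standard `struct` module. This is a minimal sketch that assumes the canonical 44-byte layout shown in the table (real files may contain extra chunks); the dictionary keys are names chosen here for illustration.

```python
import struct

# Little-endian layout of the 44-byte header from the table:
# 'RIFF', size, 'WAVE', 'fmt ', fmt size, format, channels,
# sample rate, bytes/sec, bytes/sample, bits/sample, 'data', data size
HEADER_FORMAT = "<4sI4s4sIHHIIHH4sI"

def parse_wave_header(header):
    """Unpack a canonical 44-byte PCM WAVE header into a dictionary."""
    fields = struct.unpack(HEADER_FORMAT, header[:44])
    keys = ("riff_id", "riff_size", "wave_id", "fmt_id", "fmt_size",
            "format_tag", "channels", "sample_rate", "bytes_per_second",
            "bytes_per_sample", "bits_per_sample", "data_id", "data_size")
    return dict(zip(keys, fields))

# Rebuild the header from the values in the table, then parse it again
sample_header = struct.pack(HEADER_FORMAT, b"RIFF", 27796, b"WAVE", b"fmt ", 16,
                            1, 1, 22050, 22050, 1, 8, b"data", 27760)
info = parse_wave_header(sample_header)
```

The numbers are self-consistent: 27760 bytes of audio plus the 44-byte header gives a 27804-byte file, and 27804 - 8 = 27796, the value stored at position 4.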
Digital Audio Editing (continued)
- Manipulate the audio in your media files
- Applying processes and effects
  - audio filters are used to remove noise and unwanted frequency components
  - effects such as reverb and envelope shaping are used to alter the quality of sounds
  - digital technology permits new kinds of alteration, including time stretching and pitch alteration

Text-to-Speech
- Table-based
  - a dictionary storing text and audio (PCM) for every word
- Rule-based
  - no recording
  - rules convert text to a set of "sound descriptors"
  - sound descriptors are converted to digital audio signals
- Exceptions
- AT&T Labs' Natural Voices Text-to-Speech Engine
  - AT&T Natural Voices - the WAV File edition

Voice Recognition
- Talking to the Web
- Problems
  - High hardware requirements
  - Accuracy
  - Disruptive nature of speech
- Major players
  - Dragon Systems: NaturallySpeaking
  - IBM: ViaVoice
  - Lernout & Hauspie (L&H): Voice Xpress

5 Audio
5.2 Audio Compression

Audio Compression
- No standard technique for compressing and decompressing waveform audio files
- Hardware implementation vs. software
- Encoders and decoders (codecs)
- Compression ratio
  - Bitrate: the average number of bits that one second of audio data will consume
    - for a digital audio signal from a CD, the bitrate is 1411.2 kbps
    - with MPEG-2 AAC, CD-like sound quality is achieved at 96 kbps
- Lossless vs. lossy
- Proprietary vs. open codecs

Audio Compression (continued)
- Differential Pulse Code Modulation (Delta Modulation)
  - if the sampling rate is fast enough, the difference between two successive values can be described by a single bit (1 for an increase, 0 for a decrease)
- Adaptive Differential Pulse Code Modulation (ADPCM)
  - an extension of Delta Modulation
  - uses more than one bit to describe the difference
  - 4 or 8 bits
  - 4-bit ADPCM can provide the equivalent of about 12-bit PCM
  - 8-bit ADPCM rivals 16-bit PCM

MPEG
- MPEG (pronounced M-peg) stands for Moving Picture Experts Group
- MPEG is a working group in a subcommittee of ISO/IEC in charge of developing international standards for compression, decompression, processing, and coded representation of moving pictures, audio, and their combination
- ISO-MPEG Audio Layer-3 (IS 11172-3 and IS 13818-3): MP3
- Audio coding project carried out by Fraunhofer IIS-A, starting in 1987
- Using MPEG audio, one may achieve a typical data reduction of
  - 1:4 by Layer 1 (384 kbps for a stereo signal)
  - 1:6 to 1:8 by Layer 2 (256 to 192 kbps for a stereo signal)
  - 1:10 to 1:12 by Layer 3 (128 to 112 kbps for a stereo signal)
  - while still maintaining the original CD sound quality

MPEG Audio Layers Sound Quality

  Sound quality           Bandwidth   Mode     Bitrate          Reduction ratio
  telephone sound         2.5 kHz     mono     8 kbps           96:1
  better than shortwave   4.5 kHz     mono     16 kbps          48:1
  better than AM radio    7.5 kHz     mono     32 kbps          24:1
  similar to FM radio     11 kHz      stereo   56 to 64 kbps    26:1 to 24:1
  near-CD                 15 kHz      stereo   96 kbps          16:1
  CD                      >15 kHz     stereo   112 to 128 kbps  14:1 to 12:1
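
The one-bit difference coding described under Audio Compression above can be sketched as a pair of functions. This is a toy illustration under stated assumptions: a fixed step size of 1, integer samples, and a starting estimate of 0. ADPCM improves on it by using several bits per difference and adapting the step.

```python
def delta_encode(samples, step=1):
    """Emit 1 if the signal is above the running estimate, else 0."""
    bits = []
    estimate = 0                       # assumed starting value, shared with the decoder
    for s in samples:
        bit = 1 if s > estimate else 0
        estimate += step if bit else -step
        bits.append(bit)
    return bits

def delta_decode(bits, step=1):
    """Rebuild the staircase approximation from the one-bit differences."""
    out = []
    estimate = 0
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out

signal = [1, 2, 3, 2, 1, 0]
encoded = delta_encode(signal)
decoded = delta_decode(encoded)
```

This signal changes by exactly one step per sample, so it is reconstructed perfectly from six bits. A signal that changes faster than one step per sample would be tracked with a lag, which is why the technique depends on a sufficiently high sampling rate.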
Other MPEG Standards
- MPEG-2 AAC (ISO/IEC 13818-7) provides
  - a very high-quality audio coding standard for 1 to 48 channels at sampling rates of 8 to 96 kHz, with multichannel, multilingual, and multiprogram capabilities
  - AAC works at bitrates from 8 kbit/s for a monophonic speech signal up to in excess of 160 kbit/s per channel for very-high-quality coding that permits multiple encode/decode cycles
- MPEG-4 (ISO/IEC 14496-3) provides
  - coding and composition of natural and synthetic audio objects
  - scalability of the bitrate of an audio bitstream
  - scalability of encoder or decoder complexity
  - Structured Audio: a universal language for score-driven sound synthesis
  - TTSI: an interface for text-to-speech conversion systems
- MPEG-7 (ISO/IEC 15938) will provide
  - standardized descriptions and description schemes of audio structures and sound content
  - a language to specify such descriptions and description schemes

5 Audio
5.3 Audio on the Web
Technologies & Tools
- Create
- Distribute
- Manage
- Playback
- Limited bandwidth
- Codecs to compress or encode audio for real-time or local playback over the Internet and corporate intranets
- Audio streaming
  - streamed data is transmitted by a server application and received and played in real time by client applications
  - these applications can start playing back audio as soon as enough data has been received and stored in the receiving station's buffer
  - a streamed file is simultaneously downloaded and played, but leaves behind no physical file on the viewer's machine
  - Unicast
  - Broadcast
  - Multicast

Technologies & Tools (continued)
- How IP multicasting works
  - MBone has been in place since 1992 and has grown to more than 2000 subnets
  - the user instructs the computer's network card to listen to a particular IP address for the multicast
  - the computer originating the multicast does not need to know who has decided to receive it
  - the bulk of the work needed to enable multicasting is performed by the network's routers and the protocols they run
  - to signal that they want to receive a multicast, clients join the group to which the multicast is directed (groups are dynamic)
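
The "clients join the group" step above maps onto a standard socket option. The sketch below uses Python's `socket` module and the `IP_ADD_MEMBERSHIP` option; the group address 239.1.2.3 and port 5004 are hypothetical values picked for illustration, and the helper names are not from the slides. Defining the functions touches no network; only calling `join_multicast_group` does.

```python
import ipaddress
import socket

def build_membership_request(group_ip, interface_ip="0.0.0.0"):
    """Pack the 8-byte request: group address followed by local interface address."""
    if not ipaddress.ip_address(group_ip).is_multicast:
        raise ValueError(f"{group_ip} is not a multicast address")
    return socket.inet_aton(group_ip) + socket.inet_aton(interface_ip)

def join_multicast_group(group_ip, port):
    """Open a UDP socket and ask the network stack to join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining is a purely local request; the sender is never contacted.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group_ip))
    return sock

# Hypothetical usage (requires a multicast-capable network interface):
#   sock = join_multicast_group("239.1.2.3", 5004)
#   data, sender = sock.recvfrom(2048)
request = build_membership_request("239.1.2.3")
```

Note that the receiver only talks to its own network stack, which in turn signals the local router. This matches the point above that the originating computer does not need to know who is listening.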
Technologies & Tools (continued)
- Major players (proprietary solutions)
  - RealNetworks
    - Helix Universal Server
    - RealOne player
  - Apple
    - QuickTime Streaming Server, QuickTime Broadcaster
    - QuickTime Player
  - Microsoft
    - Windows Media Services
      - supports and delivers live broadcasts and streaming of stored multimedia content
      - bit rates from 28 kbps to 10 Mbps
      - Intelligent Streaming: aims to deliver the highest quality that the user's connection speed and current network congestion allow
    - Windows Media Rights Manager (DRM)
    - Windows Media Audio 8 (near-CD quality at just 48 kbps)
    - Windows Media Audio 9
    - Windows Media Encoder
    - Windows Media Player
    - File extensions
      - .WMA for files that include audio compressed with the Windows Media Audio codec
      - content compressed with other codecs should be stored in a file that uses the .ASF extension

Technologies & Tools (continued)
- Moving Picture Experts Group (MPEG)
  - MP3 (MPEG-1, Layer 3)
  - open audio compression codec
  - MP3 players
- Windows Media-based content can be streamed over a network in two ways
  - Using a Windows Media server
    - the ideal way to stream content is from Windows 2000 Server running Microsoft Windows Media Services
    - provides features such as live broadcasting and intelligent streaming, which automatically adjusts the bit rate of each client stream according to the currently available bandwidth
    - you can stream using the Microsoft Media Server (MMS) protocol or the Hypertext Transfer Protocol (HTTP)
  - Using a Web server
    - may be the best option if you plan to offer only a few audio clips

Embedding Audio in Web Pages
- Helper applications
- Browser plug-ins
  - LiveAudio
- Downloading and playing audio

Reading List
- Text: Chapter 3
- Check: ftp://ics-sukairi.pc.ccse.kfupm.edu.sa/swe423/5-Audio/