T-111.5350 Multimedia Programming Pablo Cesar
What is Multimedia? Multimedia APIs
Pablo Cesar [email protected] http://www.tml.hut.fi/~pcesar
Outline
• Definitions of Multimedia
• Multimedia Elements:
  – Multimedia Objects: audio, video, graphics, text
  – Visual Style
  – Layout of those objects
    • Temporal dimension (animation, synchronization)
    • Graphical layout
  – Application Logic: state of the application (e.g., games)
  – User Interaction: from passive to authoring (visualization, navigation, WIMP concepts)
• Taxonomy of Authoring Content Formats
  – Expressive power, ease of use, safety of distribution, interoperability
  – Compiled Languages (C, C++)
  – Virtual Machine Languages (Java)
  – XML Based Languages (SMIL, XForms)
Multimedia Heller
• Media Type: text, sound, graphics, and motion
• Media Expression: describes the level of abstraction of the media (listed by increasing abstraction)
  – Elaboration: unedited information
  – Representation: edited or stylized
  – Abstraction: for example, icons (most abstract)
• Context: includes properties such as
  – Audience
  – Discipline
  – Interactivity
  – Quality
  – Usefulness
  – Aesthetics
[Figure: cube with the axes Media Type, Media Expression, and Context]
Multimedia Purchase
• Nature of the sign:
  – Concrete iconic (photorealistic image)
  – Abstract iconic (map)
  – Symbolic (written word)
• Syntax / Arrangement:
  – Individual
  – Augmented
  – Temporal
  – Linear
  – …
• Modality:
  – Aural: audio
  – Visual: graphics
[Figure: cube with the axes Sign (concrete iconic, abstract iconic, symbolic), Syntax (individual, augmented, temporal, linear, schematic, network), and Modality (aural, visual)]
Multimedia Bulterman and Hardman
• Media Assets
  – How to reference the multimedia objects of the presentation
• Synchronization Composition
  – Hard timing relationships, relative structural ordering
  – Constraints
• Spatial Layout
  – Implicit (video), explicit, and dynamic
• Asynchronous Events
  – Content-based (timing) and user interaction (navigation)
• Adjunct/Replacement Content
  – Alternative content / adaptation content
• Performance Analysis
  – Performance optimization for various delivery scenarios
Multimedia Vuorimaa
• Multiple media
  – Types: text, graphics, animation, image, audio, and video
  – Source: natural (e.g., video) vs. artificial (e.g., 3D graphics)
• Interaction
  – Stand-alone vs. networked applications
  – Level of interaction (user interface, application, and service)
  – Amount of interaction
    • E-mail, video-on-demand, video conference, video
    • Game, and virtual reality
• Timing
  – External synchronization of different media (e.g., video and slides)
  – Internal timing within a single medium (e.g., video)
  – Applications usually have a time dimension (e.g., story line)
Multimedia Summary (1/2)
• Multimedia Objects
  – Audio, video, graphics, text
• Visual Style
• Layout of those objects
  – Temporal dimension (animation, synchronization)
  – Spatial layout
• User Interaction
  – From passive to authoring (visualization, navigation, WIMP concepts)
• Application Logic
  – State of the application (e.g., games)
Multimedia Summary (2/2)
”Computer mediated applications that integrate and present different media objects, which are arranged spatially and temporally. Moreover, user interaction can control the behavior of the application.”
[Figure: the elements of a multimedia application: Multimedia Objects, Visual Style, Temporal Dimension, Spatial Layout, Application Logic, User Interaction]
Multimedia Elements Objects and Visual Style
• Discrete Media
  – Icons: semantic images (e.g., stop symbol); they require previous knowledge from the user
  – Graphics: computer generated; 2D or 3D, depending on the goal
  – Images: natural source (e.g., photograph)
  – Text: size, color
• Continuous Media
  – Motion pictures (audio + video)
Multimedia Elements Spatial Layout (Pihkala 2003, Boll 2001)
• Absolute
  – Coordinates relative to the origin
• Directional relations (e.g., North, East)
  – Define order in space
• Topological relations
  – Disjoint, touch, equals, inside of, covered by, contains, cover, and overlap
• Text Flow
  – One-dimensional flow shown in a two-dimensional area
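The topological and directional relations listed on this slide can be sketched for the simplest case, axis-aligned rectangles. The class `SpatialRelations`, the `Box` type, and the method names are hypothetical and introduced only for illustration. The sketch assumes closed rectangles with positive area and screen coordinates in which y grows downward. It is not taken from the cited papers.

```java
/** Illustrative only: relations between two axis-aligned boxes. */
final class SpatialRelations {

    /** The eight topological relations named on the slide. */
    enum Topology { DISJOINT, TOUCH, EQUALS, INSIDE, COVERED_BY, CONTAINS, COVERS, OVERLAP }

    /** Box with x1 < x2 and y1 < y2 (assumption: y grows downward, as on a screen). */
    static final class Box {
        final int x1, y1, x2, y2;

        Box(int x1, int y1, int x2, int y2) {
            if (x1 >= x2 || y1 >= y2) {
                throw new IllegalArgumentException("box must have positive area");
            }
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
        }
    }

    /** Returns the relation of a with respect to b. */
    static Topology classify(Box a, Box b) {
        // No shared point at all, not even on the boundary.
        boolean closuresMeet = a.x1 <= b.x2 && b.x1 <= a.x2 && a.y1 <= b.y2 && b.y1 <= a.y2;
        if (!closuresMeet) return Topology.DISJOINT;
        // Shared boundary points only.
        boolean interiorsMeet = a.x1 < b.x2 && b.x1 < a.x2 && a.y1 < b.y2 && b.y1 < a.y2;
        if (!interiorsMeet) return Topology.TOUCH;
        if (a.x1 == b.x1 && a.y1 == b.y1 && a.x2 == b.x2 && a.y2 == b.y2) return Topology.EQUALS;
        if (within(a, b)) return strictlyWithin(a, b) ? Topology.INSIDE : Topology.COVERED_BY;
        if (within(b, a)) return strictlyWithin(b, a) ? Topology.CONTAINS : Topology.COVERS;
        return Topology.OVERLAP;
    }

    /** True if inner lies within outer, boundary contact allowed. */
    private static boolean within(Box inner, Box outer) {
        return outer.x1 <= inner.x1 && inner.x2 <= outer.x2
            && outer.y1 <= inner.y1 && inner.y2 <= outer.y2;
    }

    /** True if inner lies within outer without touching its boundary. */
    private static boolean strictlyWithin(Box inner, Box outer) {
        return outer.x1 < inner.x1 && inner.x2 < outer.x2
            && outer.y1 < inner.y1 && inner.y2 < outer.y2;
    }

    /** Directional relation: a lies entirely above b. */
    static boolean isNorthOf(Box a, Box b) { return a.y2 <= b.y1; }

    /** Directional relation: a lies entirely to the right of b. */
    static boolean isEastOf(Box a, Box b) { return a.x1 >= b.x2; }
}
```

For example, a 2x2 box placed in the middle of a 10x10 box classifies as INSIDE, while the same box pushed into a corner of the larger one classifies as COVERED_BY.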
Multimedia Elements Temporal Dimension (Pihkala 2003, Boll 2001)
• Temporal Models:
  – Definite: for example, 6 seconds
  – Indefinite: for example, when the user clicks
  – Parallel and sequential relations (e.g., start these two videos at this moment, or start this video after this other one)
• Animation:
  – Mixture of temporal dimension and spatial layout (i.e., the position of an object changes in time)
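A minimal sketch of these ideas, assuming a tree of timed elements: a sequential container lasts as long as the sum of its children, a parallel container as long as its longest child, and an indefinite duration is modelled as positive infinity. The class `TemporalModel` and everything inside it are hypothetical names chosen for illustration. The last method illustrates animation as a position that depends on time, here by plain linear interpolation.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: definite/indefinite durations, seq/par composition, animation. */
final class TemporalModel {

    /** Assumption: an indefinite duration (e.g., "until the user clicks") is infinity. */
    static final double INDEFINITE = Double.POSITIVE_INFINITY;

    interface TimeNode {
        /** Duration in seconds. */
        double duration();
    }

    /** A single media object with a definite or indefinite duration. */
    static final class Media implements TimeNode {
        final String name;
        final double seconds;

        Media(String name, double seconds) {
            this.name = name;
            this.seconds = seconds;
        }

        @Override
        public double duration() { return seconds; }
    }

    /** Children play one after another. */
    static final class Seq implements TimeNode {
        final List<TimeNode> children;

        Seq(TimeNode... children) { this.children = Arrays.asList(children); }

        @Override
        public double duration() {
            double total = 0.0;
            for (TimeNode child : children) total += child.duration();
            return total;
        }
    }

    /** Children start at the same moment. */
    static final class Par implements TimeNode {
        final List<TimeNode> children;

        Par(TimeNode... children) { this.children = Arrays.asList(children); }

        @Override
        public double duration() {
            double longest = 0.0;
            for (TimeNode child : children) longest = Math.max(longest, child.duration());
            return longest;
        }
    }

    /** Animation: x position at time t when moving from fromX to toX over duration seconds. */
    static double animateX(double fromX, double toX, double duration, double t) {
        double progress = Math.max(0.0, Math.min(1.0, t / duration));
        return fromX + (toX - fromX) * progress;
    }
}
```

A 6-second intro followed by a 10-second video shown in parallel with 8 seconds of slides would be written as `new TemporalModel.Seq(intro, new TemporalModel.Par(video, slides))`.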
Multimedia Elements User Interaction
• Different Levels of Interaction (Aleem):
  – Passive: only visualization
  – Reactive: limited interaction (e.g., Scroll Pane functionality)
  – Proactive: choose a path or make selections (e.g., Button)
  – Reciprocal: corresponds to user authoring of information
• Interaction Models (Boll):
  – Navigational: the user decides where to go next
  – Design: the user can modify the visual style of the presentation (e.g., colors)
  – Movie: the user can control the global time (e.g., VCR capabilities)
Multimedia Elements Application Logic
• Traditionally, multimedia presentations did not have much logic:
  – Virtual visit to a museum, DVD menus...
• Real-time interactive systems:
  – Virtual Reality worlds, games
• Application Logic needs a programming language (if, case, goto...)
  – Compiled Languages: C, C++
  – Virtual Machine: Java
  – World Wide Web, MPEG-4, Director: scripting
Taxonomy of Authoring Content Formats Requirements
• Supported Media Types: audio, video, text, graphics, and animation
• Arrangement of the signs: spatial and temporal
• Interaction: passive, reactive, proactive, and reciprocal
• Difficulty of use (threshold)
• Expressional power (i.e., ceiling)
• Safety of Distribution
• Interoperability

                       Threshold   Ceiling   Interoperability   Safety of Distribution
Compiled Languages     +++         +++       +                  +
VM Languages           ++          ++        ++                 ++
XML based Languages    +           +         +++                +++
Compiled Languages
Normally used for system software (e.g., operating systems) and resource-demanding services: C, C++
Pro
• Efficient approach
• Expressive power (closer to the computer hardware)
Con
• Interoperability (each service has to be compiled for the target device)
• Less safe to distribute (it can include harmful code)
Compiled Languages System Software
• ”User Interface Software Tools” (Myers, 1995) defines a layered model
• Applications: implemented using higher-level tools
• Toolkit: a library of widgets used by applications
• Windowing System: helps the user to monitor and control different contexts (input and output functionality)
Compiled Languages Windowing System (1/3)
[Figure: X Window System software stack.
 KDE Desktop Environment: KDE Libraries, Qt Toolkit, Xlib base layer; window manager KWin.
 GNOME Desktop Environment: GNOME Libraries, GIMP Toolkit (GTK), GDK, GLib, Xlib.
 Window managers (one per session): KWin, Enlight., Fluxbox, Sawfish.
 Both stacks reach the X Server through the X Network Protocol.]
Compiled Languages Windowing System (2/3)
• X-Window
  – X.org: font management, graphics card support, composite functionality
  – Desktop environments: KDE, GNOME (toolkits + applications)
  – Window managers: Fluxbox, Sawfish…
• DirectFB
  – XDirectFB: X-Window support on DirectFB
  – DirectFBGL
• Microsoft Windows
  – DirectX
• Mac
  – Video: QuickTime
  – 3D: OpenGL
  – 2D: QuickDraw
Screenshots – X.org
Screenshots – DirectFB
Compiled Languages Windowing System: DirectFB
[Figure: DirectFB architecture. Layers: User Space, Kernel Space, Hardware. Components: DirectFB Application, DirectFB, Chipset Driver, Framebuffer Driver, Framebuffer, Timing and Mode, Accelerator.]
Compiled Languages Windowing System: Direct-X
[Figure: DirectX graphics stack. Components, top to bottom: Win32 Application (two instances), Direct3D API and GDI, HAL Device, Device Driver Interface (DDI), Graphics Hardware.]
Compiled Languages Toolkits
• Toolkits provide
  – Interaction: handling of user input
  – Canvas operations: the rendering region (canvas) and graphics primitives
  – Set of widgets: predefined user interface elements (e.g., Button)
  – Graphical layout: control over the location of the widgets
• Examples: Qt, GTK
• Virtual Toolkit
  – Device-independent toolkit
  – Mapped to the actual toolkit in the device
  – Example: AWT
Compiled Languages Media Providers
• Audio/Video: Xine, MPlayer
• Television: linuxtv
• Games: SDL
• Other Languages: for example, libflash
• 3D graphics:
  – OpenGL
  – OpenGL ES
• Home media platforms: LIMMBO, MythTV
VM Languages
A Virtual Machine is an abstraction of the computing environment. JVM + APIs
Pro
• Platform independence
• Safer to distribute (restricts potential security attacks)
• Expressive power (programming language)
• Well documented APIs
Con
• Heavy applications (because of the VM concept)
• Difficult to use (programming language)
• Less powerful than compiled languages
VM Languages Java Overview
• Nowadays, Java tries to target all kinds of computing devices
• Editions:
  – Java 2 Enterprise Edition (J2EE): for servers and enterprise computers
  – Java 2 Standard Edition (J2SE): for servers and personal computers
  – Java 2 Micro Edition (J2ME): for embedded devices, PDAs, mobile phones, and digital television set-top boxes
  – Java Card: for smart cards
• Profile
  – Requirements for a specific vertical market of devices (set of APIs)
• Configuration
  – Minimum platform for a horizontal grouping of devices (VM + core APIs)
[Figure: example stack with MIDP (profile) on top of CLDC (configuration) on top of the KVM]
[Figure: Java 2 platform editions and their target devices]
Target devices                 Edition and stack                                                      Virtual machine
Servers                        J2EE + Optional Packages                                               Java Virtual Machine
Personal computers             J2SE + Optional Packages                                               Java Virtual Machine
TV STBs, high-end PDAs         J2ME: Personal Profile, Foundation Profile, CDC + Optional Packages    Java Virtual Machine
Mobile phones, low-end PDAs    J2ME: MIDP, CLDC + Optional Packages                                   KVM
Smart cards                    Java Card                                                              Card VM
VM Languages Multimedia
• User interface development (AWT/Swing)
  – Layout: Grid, North-South-East-West, Flow
  – Set of widgets: Button, TextArea
  – User interaction: java.awt.event.* (mouse, keyboard…)
• Video/Audio and Synchronization (JMF)
  – Manager, Player, Data Source, and Controller
• 3D Graphics
  – Java3D
  – Java wrappers for OpenGL
• Different Devices
  – Television: MHP/OCAP/ACAP/ARIB -> GEM
  – Handheld: MIDP
VM Languages User Interface Development
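As a rough illustration of the three toolkit ingredients (layout, widgets, user interaction), the sketch below builds a small Swing panel with a North-South-East-West layout (`BorderLayout`), a label and a button, and an action listener from `java.awt.event`. The class name `UiSketch`, the method `buildPanel`, and the label texts are invented for this example. The panel is deliberately not placed in a window so that the sketch also runs without a display.

```java
import java.awt.BorderLayout;
import javax.swing.JButton;
import javax.swing.JLabel;
import javax.swing.JPanel;

/** Illustrative only: layout + widgets + event handling in Swing. */
final class UiSketch {

    /** Builds a panel with a status label at the top and a button at the bottom. */
    static JPanel buildPanel() {
        // Layout: BorderLayout is the North-South-East-West layout manager.
        JPanel panel = new JPanel(new BorderLayout());

        // Widgets: predefined user interface elements.
        JLabel status = new JLabel("Stopped");
        JButton play = new JButton("Play");

        // User interaction: the listener is called when the button is activated.
        play.addActionListener(event -> status.setText("Playing"));

        panel.add(status, BorderLayout.NORTH);
        panel.add(play, BorderLayout.SOUTH);
        return panel;
    }
}
```

In a desktop application the returned panel would typically be added to a `JFrame` and shown; pressing the button then changes the label from "Stopped" to "Playing".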
VM Languages JMF (1/2)
[Figure: JMF components, annotated with their roles: retrieves the actual media data; decodes and plays the media data; implements the state machine]
VM Languages JMF (2/2)
• Unrealised: when it does not have all the information to acquire the needed resources
• Realised: when it has all the information to acquire the needed resources
• Prefetched: when it has all the needed resources, and has already prefetched enough media data to start playing immediately
• Started: when it is actually playing the media
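The four states above can be modelled as a small state machine. The sketch below is not the JMF API: it does not use `javax.media`, and the class `PlayerStateSketch` and its methods are hypothetical. It also makes two simplifying assumptions: each transition must be requested explicitly, and stopping returns the player to the prefetched state.

```java
/** Illustrative only: the four player states from the slide as a state machine. */
final class PlayerStateSketch {

    enum State { UNREALISED, REALISED, PREFETCHED, STARTED }

    private State state = State.UNREALISED;

    State state() { return state; }

    /** Gathers the information needed to acquire resources. */
    void realise() { move(State.UNREALISED, State.REALISED); }

    /** Acquires resources and buffers enough media data to start immediately. */
    void prefetch() { move(State.REALISED, State.PREFETCHED); }

    /** Starts playing the media. */
    void start() { move(State.PREFETCHED, State.STARTED); }

    /** Stops playing; assumption: the prefetched resources are kept. */
    void stop() { move(State.STARTED, State.PREFETCHED); }

    /** Performs a transition, rejecting it if the current state is not the expected one. */
    private void move(State expected, State next) {
        if (state != expected) {
            throw new IllegalStateException("cannot go to " + next + " from " + state);
        }
        state = next;
    }
}
```

Calling `realise()`, `prefetch()`, and `start()` in that order leaves the sketch in `STARTED`; calling `start()` on a fresh instance throws an `IllegalStateException`.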
VM Languages 3D Graphics
• Java3D
  – Completely new API for stand-alone 3D graphics applications
  – Can use any underlying architecture (Direct-X, OpenGL...)
  – It might not be the most efficient approach
  – Developers have to learn a new API
• Java wrappers of OpenGL
  – Functionality from OpenGL
  – Developers know the API already
  – Only wrappers: they use the Java Native Interface (JNI)
  – Much intercommunication between layers (Java -> C)
  – API is not standardised yet (Java Specification Requests)
    • JSR 231: OpenGL
    • JSR 239: OpenGL ES
VM Languages J2ME
• Defines two Configurations:
  – CDC: high-end consumer devices (TV STBs, high-end PDAs)
    • RAM Java memory: around 2 MB
    • ROM Java memory: around 2.5 MB
  – CLDC: low-end consumer devices (mobile phones, low-end PDAs)
    • Processor: 16 bit/16 MHz or higher
    • Java total memory: 160-512 KB
• CDC (Connected Device Configuration)
  – Personal Profile
    • Adds support for lightweight AWT
  – Foundation Profile
    • Basic application APIs (no GUI)
• CLDC (Connected Limited Device Configuration)
  – Mobile Information Device Profile (MIDP)
    • Application APIs + GUI APIs
[Figure: two J2ME stacks. CDC stack: Optional Packages, Personal Profile, Foundation Profile, CDC, JVM. CLDC stack: Optional Packages, MIDP, CLDC, KVM.]
VM Languages Handheld
VM Languages Television
[Figure: layered architecture of a Java-based television receiver. Components shown: Interoperable Applications and the Application Manager; Sun Java APIs, HAVi APIs, DAVIC APIs, and DVB Specific APIs; Transport Protocol(s) and Data; the Java Virtual Machine; System Software (operating system, drivers, firmware).]
VM Languages Summary
Requirement                 Aspect            Java technology
Supported Media Types       Text, Graphics    AWT
                            Video, Audio      JMF
Arrangement of the signs    Spatial           AWT
                            Temporal          Java Threads
Interaction                                   AWT Events
Different Devices           Handheld          MIDP
                            Television        GEM
XML Based Languages
Declarative programming languages (they state only what has to be done, not how). The major contributor is the W3C
Pro
• Ease of use (you can even use a text editor)
• Interoperability (only needs a compatible browser)
• Safest to distribute
Con
• Expressive power (quite limited, not a programming language!)
• Use of scripting for application logic (or not?)
• Needs a service underneath it (a browser)
XML Based Languages Overview
[Figure: several documents, each written in an XML based language]
• HTML & XHTML
• Multimedia
  – SMIL, Timesheets
• User Interface
  – XForms, XIML
• Vector Graphics
  – SVG
• Voice
  – VoiceXML
XML Based Languages HTML & XHTML
HTML
• HTML 4.01 (24 Dec. 1999): W3C Recommendation
• Lingua franca for publishing hypertext on the WWW
• Non-proprietary
• Can be created by a wide range of tools:
  – Text editors
  – Authoring tools
• All kinds of features (mixed together):
  – UI components
  – Fonts
  – Lists
XHTML
• XHTML 1.0 (26 Jan. 2000, revised 1 Aug. 2002): W3C Recommendation
• XHTML 2.0 (22 July 2004): W3C Working Draft
• Reformulation of HTML 4 in XML
• Intention
  – To describe only the structure of the document (formatting is left to CSS)
• XHTML 1.0
  – Well-formed documents
  – Proper nesting
  – ...
• XHTML 2.0
  – Not backwards compatible
  – Reduces scripting
  – Includes XForms and XML Events
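Because XHTML is XML, the "well-formed documents" and "proper nesting" requirements can be checked with any XML parser. The sketch below uses the JAXP parser that ships with the JDK; the class name `WellFormedCheck` and the method `isWellFormed` are hypothetical, and the check covers well-formedness only, not validity against an XHTML DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Illustrative only: checks whether a markup fragment is well-formed XML. */
final class WellFormedCheck {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Report parse problems as exceptions instead of printing them to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) { /* warnings are ignored */ }

                @Override
                public void error(SAXParseException e) throws SAXException { throw e; }

                @Override
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input: not well-formed.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML parser is not usable", e);
        }
    }
}
```

With this sketch, `<p>Some <b>bold</b> text</p>` is accepted, while the HTML-style fragments `<p>Some <b>bold</p></b>` (improper nesting) and `<p>line<br></p>` (unclosed element) are rejected.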
XML Based Languages XHTML Modularization and XHTML 1.1
• Core Modules: Structure, Text, Hypertext, List
• Other XHTML Modules: Applet, Presentation, Edit, Bi-directional Text, Forms, Tables, Basic Forms, Basic Tables, Image, Object, Client-side Image Map, Server-side Image Map, Intrinsic Events, Frames, Target, IFrame, Name Identification, Legacy, Metainformation, Scripting, Stylesheet, Style Attribute, Link, Base
• Other W3C Modules: Ruby Annotation
• Private Modules
XML Based Languages Multimedia SMIL • SMIL 2.0 (07 Aug. 2001) W3C Recommendation • Easy to write, like HTML • Doesn’t define media formats, only integrates them • ,