24th September 2004 by Derek Kite

This Week...

Kpdf supports a table of contents. Krita adds scaling. Plastik is now the default style.
The multimedia architecture of Kde is about to change. For version 4, aRts will be replaced with something else. With what? The architecture would need to satisfy some basic requirements:
  • Play system sounds at roughly the same time as the event.
  • Play music, videos and DVDs reliably and smoothly.
That covers 90% of the multimedia usage. Any of the numerous sound architectures would satisfy these needs. Considering that the chosen architecture will be with us at least for the life of Kde 4, is that enough? Other needs include:
  • Network transparent sound. LTSP or NX, even plain X terminals need this capability.
  • Quick response times. Imagine scrolling through a menu with a screen reader. Quick start and stop of the sound stream is required.
  • High end sound capabilities. Applications exist that transform a well endowed desktop computer into a sound engineering workstation.
We would all be using (insert proprietary platform here) if technology were the only issue. Other important requirements include:
  • The project needs to be actively maintained. aRts fails for this reason.
  • Open development process. Bug lists, development lists, access to developers, openness to contributions.
  • Appropriate licensing.
  • A stable api. Stable binary interface for the life of Kde 4 at least. That means the project would have to be willing to live with the Kde release patterns.
The good news is that all these requirements have been met. Between them, actively maintained projects cover high end use, accessibility, network transparency, stable interfaces, appropriate licensing and open development processes. The technology is there. And it works. The bad news is that these aspects are spread between maybe half a dozen projects.

For that reason the Kde multimedia developers decided to write an interface that would connect to the various backend multimedia engines. Users or distributions could install and configure the engine that fits their needs. Kde would use it. A stable api could be used by Kde applications, and the various backends could continue to improve, jostle for position or integrate into the one multimedia engine for us all. Note that none of this is ready. Many details are yet to be decided, let alone implemented.
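To make the idea concrete, here is a minimal Python sketch of such an interface layer. It is not the Kde design, which as noted has not been decided yet. The names MediaBackend, register_backend and create_backend, and both toy backends, are invented purely for illustration.

```python
from abc import ABC, abstractmethod

# Hypothetical registry mapping a backend name to its class.
_REGISTRY = {}


def register_backend(cls):
    """Class decorator: make a backend selectable by its name."""
    _REGISTRY[cls.name] = cls
    return cls


class MediaBackend(ABC):
    """Hypothetical stable interface that applications code against."""

    name = "abstract"

    @abstractmethod
    def play(self, url):
        """Start playback and return a status line."""


@register_backend
class NullBackend(MediaBackend):
    """Fallback that accepts every request and plays nothing."""

    name = "null"

    def play(self, url):
        return f"[null] discarded {url}"


@register_backend
class LoggingBackend(MediaBackend):
    """Stand-in for a real engine; it only records what was requested."""

    name = "logging"

    def __init__(self):
        self.played = []

    def play(self, url):
        self.played.append(url)
        return f"[logging] playing {url}"


def create_backend(preferred, fallback="null"):
    """Return the configured backend, or the fallback if it is not installed."""
    cls = _REGISTRY.get(preferred) or _REGISTRY[fallback]
    return cls()
```

An application would only ever call create_backend() and play(). Which engine sits behind them is a configuration choice made by the user or the distribution.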

Here are links to some of the discussions on kde-multimedia: summary of the aKademy meetings, KDE4 MM, Proposal: libunixmm.so (about the Comparison: MAS, GStreamer, NMM), Vote for a MM system (Was: Re: summary of the aKademy meetings), kdemm backends & Helix. The lists are from newest to oldest, so start reading at the last item.
During the aKademy Developer Conference three multimedia engines were presented. In the coming weeks I will cover the highlights from these presentations. This week we learn about MAS, or the Media Application Server.

Leon Shiman presented MAS in Kde (Audio, Video).

The presentation started with two laptops connected by cable, and a window showing many small boxes interconnected with lines. Leon played a Berlioz mp3, reading the file from one laptop and playing the sound on the other. The data was sourced on one machine, then decoded and processed on the other. Networked sound is difficult because listeners can detect a single dropped frame, whereas video can drop one frame in 30 without noticeable effect. How does MAS, or the Media Application Server, work?

MAS consists of devices and interconnections controlled by a scheduler and a clock. What devices are available? Here is a list. There is an OSS device that talks to the sound card, and a mixer device that takes any number of input streams and outputs to one destination. Codecs are implemented as devices. Network protocols and control are also handled by devices, for example compression, defragmenting data packets into larger audio data buffers, and endian switching. As the presentation showed, these devices can be strung together on the local machine or over a network. The scheduler controls the data packets at every point, synchronizing the pathways.
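The idea of small devices strung into a chain can be illustrated with a few lines of Python. This is only a sketch of the concept and has nothing to do with the real MAS API. The function names reassemble, byteswap16, mix and run_chain are made up for the example, and samples are assumed to be signed 16 bit.

```python
def reassemble(packets, buffer_size):
    """Collect small network packets into larger fixed-size audio buffers."""
    pending = b""
    for packet in packets:
        pending += packet
        while len(pending) >= buffer_size:
            yield pending[:buffer_size]
            pending = pending[buffer_size:]
    if pending:
        # Flush whatever is left at the end of the stream.
        yield pending


def byteswap16(buffer):
    """Switch the byte order of every 16-bit sample in a buffer."""
    if len(buffer) % 2:
        raise ValueError("buffer must hold whole 16-bit samples")
    swapped = bytearray(buffer)
    swapped[0::2], swapped[1::2] = buffer[1::2], buffer[0::2]
    return bytes(swapped)


def mix(*streams):
    """Mix any number of sample streams into one, clipping to 16-bit range."""
    mixed = []
    for samples in zip(*streams):
        total = sum(samples)
        mixed.append(max(-32768, min(32767, total)))
    return mixed


def run_chain(packets, buffer_size=4):
    """String two devices together: reassembly first, then endian switching."""
    return [byteswap16(buf) for buf in reassemble(packets, buffer_size)]
```

Each function stands in for one device. In this sketch a chain is just function composition, so moving a stage to another machine would mean replacing one call with a network hop while the neighbouring stages stay unchanged.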

Leon demonstrated starting a media player, which asks for source and destination IP addresses. The system is truly network transparent.

MAS can be used as a conference server. There is local control on input levels and noise detection.

An accessibility proof of concept was demonstrated. As a menu was selected, the screen reader produced a reading. As the menu was scrolled, the sound was clipped and the next menu item read. It was very fast and responsive, showing the 'exquisite control' over the audio data throughout its pathway. I have a question about this. Is it using the Gnome AT framework, bonobo and the rest of the Gnome IPC framework? Or a more direct path? The reason I ask is that the Gnome accessibility demonstration had issues with screen reader latency. The source isn't available for the accessibility demo, so I can't check. More on this below.
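Since the demo source is not available, the following Python fragment is only a guess at the behaviour seen on screen, not at how MAS implements it. The class InterruptiblePlayer and its methods are hypothetical. A new utterance cuts off whatever is still playing instead of queueing behind it.

```python
class InterruptiblePlayer:
    """Toy model of a speech stream that can be stopped and restarted at once."""

    def __init__(self):
        self.current = None  # [text, chunks still to play]
        self.log = []

    def say(self, text, chunks):
        """Start a new utterance, clipping the one in progress if there is one."""
        if self.current and self.current[1] > 0:
            self.log.append(f"clipped: {self.current[0]}")
        self.current = [text, chunks]

    def tick(self):
        """Play one chunk of audio for the current utterance."""
        if self.current and self.current[1] > 0:
            self.current[1] -= 1
            if self.current[1] == 0:
                self.log.append(f"finished: {self.current[0]}")
```

Scrolling from one menu item to the next would map to two say() calls in quick succession. The first reading is dropped after a single chunk and the second is heard in full.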

MAS is multiplatform: it works on Solaris, AIX, BSD, Windows, Linux and ReflectiveX, in 32 or 64 bit. MAS can control and transfer audio data between all these systems. MAS also has an RTP protocol implementation that is clean and Open Group verified. It is released under the MIT license. The framework is designed to be scalable, and works on embedded hardware. It is possible to use a server for the computationally intensive parts, and send the data to small hardware for playback.

Where did MAS come from? X.org funded development of the specification in 1999, and for the last 2.5 years code has been available. There was a perceived need for an audio server independent of the windowing system. The goal is to release an implementation, then try to establish a standard. The architecture permits synchronizing the transport layer of X with MAS through an X extension, which would synchronize audio and video.

MAS is peer to peer and manages timing dependencies. The processes share a timing device called the master clock. There is one clock in one place, and MAS doesn't require a sound server on the platform to run. This allows computation on one machine, sourcing and playback on others.
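A rough Python sketch of what scheduling against one shared clock might look like follows. It illustrates the principle only. MasterClock and Scheduler are names invented here, time is counted in abstract ticks, and none of it is taken from the MAS sources.

```python
import heapq


class MasterClock:
    """The single time source that every stream is measured against."""

    def __init__(self):
        self.now = 0

    def advance(self, ticks):
        self.now += ticks
        return self.now


class Scheduler:
    """Holds packets from any number of streams until the clock says they are due."""

    def __init__(self, clock):
        self.clock = clock
        self._queue = []
        self._seq = 0  # tie-breaker so equal due times keep submission order

    def submit(self, due, stream, payload):
        heapq.heappush(self._queue, (due, self._seq, stream, payload))
        self._seq += 1

    def release_due(self):
        """Return every packet whose due time has been reached, earliest first."""
        released = []
        while self._queue and self._queue[0][0] <= self.clock.now:
            due, _, stream, payload = heapq.heappop(self._queue)
            released.append((due, stream, payload))
        return released
```

Because audio and video packets are compared against the same clock value, packets stamped with the same tick are released together. That is the property that would keep the two streams in step.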

The devices are dynamically loaded, and can be proprietary. The API is simple and, unlike X, doesn't require a toolkit layer. Assembly of the device chain is manual or done by the controlling application.

From watching the demonstration, this looks to be an impressive accomplishment. So what are the advantages of MAS?
  • MIT license.
  • Fully network transparent.
  • Low latency.
And the disadvantages?
  • No ALSA modules, only OSS.
  • Missing some common codecs, e.g. ogg.
  • Although free software, the development is somewhat closed. There didn't seem to be a cvs repository. There is a developer's list and bugs list available.


Commits: 2491 by 196 developers, 345069 lines modified, 1500 new files
Open Bugs: 7452
Open Wishes: 6965
Bugs Opened: 328 in the last 7 days
Bugs Closed: 212 in the last 7 days

Commit Summary

Module Commits
Lines Developer Commits
Thierry Vignaud
Pedro Morais
Andrew Coles
Nicolas Goutte
Stephan Binner
Gilles Caulier
Stefan Asserhäll
Mark Kretschmann
David Faure
Richard Dale

Internationalization (i18n) Status

Language Percentage Complete
British English (en_GB)
Swedish (sv)
Portuguese (pt)
Dutch (nl)
Danish (da)
Estonian (et)
Spanish (es)
Italian (it)
Tamil (ta)
Brazilian Portuguese (pt_BR)

Bug Killers

Person Bugs Closed
Waldo Bastian
Tom Albers
Maks Orlovich
George Staikos
Luboš Luňák
Stephan Binner
David Faure
Aaron J. Seigo
Max Howell
Matt Rogers
