audio_low_latency_output_win.h revision 5821806d5e7f356e8fa4b058a389a808ea183019
// Copyright (c) 2012 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

// Implementation of AudioOutputStream for Windows using Windows Core Audio
// WASAPI for low latency rendering.
//
// Overview of operation and performance:
//
// - An object of WASAPIAudioOutputStream is created by the AudioManager
//   factory.
// - Next some thread will call Open(); at that point the underlying
//   Core Audio APIs are utilized to create two WASAPI interfaces called
//   IAudioClient and IAudioRenderClient.
// - Then some thread will call Start(source).
//   A thread called "wasapi_render_thread" is started and this thread listens
//   on an event signal which is set periodically by the audio engine to signal
//   render events. As a result, OnMoreData() will be called and the registered
//   client is then expected to provide data samples to be played out.
// - At some point, a thread will call Stop(), which stops and joins the
//   render thread and at the same time stops audio streaming.
// - The same thread that called Stop() will call Close(), where we clean up
//   and notify the audio manager, which likely will destroy this object.
// - Initial tests on Windows 7 show that this implementation results in a
//   latency of approximately 35 ms if the selected packet size is less than
//   or equal to 20 ms. Using a packet size of 10 ms does not result in a
//   lower latency but only affects the size of the data buffer in each
//   OnMoreData() callback.
// - A total typical delay of 35 ms contains three parts:
//   o Audio endpoint device period (~10 ms).
//   o Stream latency between the buffer and endpoint device (~5 ms).
//   o Endpoint buffer (~20 ms to ensure glitch-free rendering).
// - Note that if the user selects a packet size of e.g. 100 ms, the total
//   delay will be approximately 115 ms (10 + 5 + 100).
// - Supports device events using the IMMNotificationClient interface. If
//   streaming has started, a so-called stream switch will take place in the
//   following situations:
//   o The user enables or disables an audio endpoint device from Device
//     Manager or from the Windows multimedia control panel, Mmsys.cpl.
//   o The user adds an audio adapter to the system or removes an audio
//     adapter from the system.
//   o The user plugs an audio endpoint device into an audio jack with
//     jack-presence detection, or removes an audio endpoint device from
//     such a jack.
//   o The user changes the device role that is assigned to a device.
//   o The value of a property of a device changes.
//   Practical/typical example: a user has two audio devices, A and B, where
//   A is a built-in device configured as Default Communication and B is a
//   USB device set as Default device. Audio rendering starts and audio is
//   played through device B since the eConsole role is used by the audio
//   manager in Chrome today. If the user now removes the USB device (B), it
//   will be detected and device A will instead be defined as the new default
//   device. Rendering will automatically stop, all resources will be released
//   and a new session will be initialized and started using device A instead.
//   The net effect for the user is that audio will automatically switch from
//   device B to device A. The same thing will happen if the user re-inserts
//   the USB device again.
//
// Implementation notes:
//
// - The minimum supported client is Windows Vista.
// - This implementation is single-threaded, hence:
//   o Construction and destruction must take place from the same thread.
//   o All APIs must be called from the creating thread as well.
// - It is recommended to first acquire the native sample rate of the default
//   input device and then use the same rate when creating this object.
//   Use WASAPIAudioOutputStream::HardwareSampleRate() to retrieve the sample
//   rate.
// - Calling Close() also leads to self destruction.
// - Stream switching is not supported if the user shifts the audio device
//   after Open() is called but before Start() has been called.
// - Stream switching can fail if streaming starts on one device with a
//   supported format (X) and the new default device - to which we would like
//   to switch - uses another format (Y), which is not supported given the
//   configured audio parameters.
// - The audio device must be opened with the same number of channels as it
//   supports natively (see HardwareChannelCount()), otherwise Open() will
//   fail.
// - Support for 8-bit audio has not yet been verified and tested.
//
// Core Audio API details:
//
// - The public API methods (Open(), Start(), Stop() and Close()) must be
//   called on the constructing thread. The reason is that we want to ensure
//   that the COM environment is the same for all API implementations.
// - Utilized MMDevice interfaces:
//   o IMMDeviceEnumerator
//   o IMMDevice
// - Utilized WASAPI interfaces:
//   o IAudioClient
//   o IAudioRenderClient
// - The stream is initialized in shared mode and the processing of the
//   audio buffer is event driven.
// - The Multimedia Class Scheduler service (MMCSS) is utilized to boost
//   the priority of the render thread.
// - Audio-rendering endpoint devices can have three roles:
//   Console (eConsole), Communications (eCommunications), and Multimedia
//   (eMultimedia). Search for "Device Roles" on MSDN for more details.
// - The actual stream switch is executed on the audio-render thread but it
//   is triggered by an internal MMDevice thread using callback methods
//   in the IMMNotificationClient interface.
//
// Threading details:
//
// - It is assumed that this class is created on the audio thread owned
//   by the AudioManager.
// - It is a requirement to call the following methods on the same audio
//   thread: Open(), Start(), Stop(), and Close().
// - Audio rendering is performed on the audio render thread, owned by this
//   class, and the AudioSourceCallback::OnMoreData() method will be called
//   from this thread. Stream switching also takes place on the audio-render
//   thread.
// - All callback methods from the IMMNotificationClient interface will be
//   called on a Windows-internal MMDevice thread.
//
// Experimental exclusive mode:
//
// - It is possible to open up a stream in exclusive mode by using the
//   --enable-exclusive-audio command line flag.
// - The internal buffering scheme is less flexible for exclusive streams.
//   Hence, some manual tuning will be required before deciding what frame
//   size to use. See the WinAudioOutputTest unit test for more details.
// - If an application opens a stream in exclusive mode, the application has
//   exclusive use of the audio endpoint device that plays the stream.
// - Exclusive mode should only be utilized when the lowest possible latency
//   is important.
// - In exclusive mode, the client can choose to open the stream in any audio
//   format that the endpoint device supports, i.e. not limited to the
//   device's current (default) configuration.
// - Initial measurements on Windows 7 (HP Z600 workstation) have shown that
//   the lowest possible latencies we can achieve on this machine are:
//   o ~3.3333 ms @ 48 kHz <=> 160 audio frames per buffer.
//   o ~3.6281 ms @ 44.1 kHz <=> 160 audio frames per buffer.
// - See http://msdn.microsoft.com/en-us/library/windows/desktop/dd370844(v=vs.85).aspx
//   for more details.
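// The render thread described under "Threading details" waits on three
// events (stop, stream switch, samples ready). The fragment below is a
// platform-neutral sketch of that dispatch logic only; the enum, function
// name and priority ordering are illustrative assumptions, not the real
// Windows implementation, which waits on event HANDLEs.

```cpp
// Sketch of the render thread's event dispatch: a signaled stop event is
// assumed to take priority over a pending stream switch, which in turn
// takes priority over a "buffer ready" render event. Illustrative only.
#include <string>

// Returns the action the render thread would take for the highest-priority
// signaled event (hypothetical helper for illustration).
std::string DispatchRenderEvent(bool stop_signaled,
                                bool switch_signaled,
                                bool samples_signaled) {
  if (stop_signaled)
    return "exit render loop";
  if (switch_signaled)
    return "restart rendering using new default device";
  if (samples_signaled)
    return "call OnMoreData() and fill endpoint buffer";
  return "keep waiting";
}
```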

#ifndef MEDIA_AUDIO_WIN_AUDIO_LOW_LATENCY_OUTPUT_WIN_H_
#define MEDIA_AUDIO_WIN_AUDIO_LOW_LATENCY_OUTPUT_WIN_H_

#include <Audioclient.h>
#include <audiopolicy.h>
#include <MMDeviceAPI.h>

#include <string>

#include "base/compiler_specific.h"
#include "base/memory/scoped_ptr.h"
#include "base/threading/platform_thread.h"
#include "base/threading/simple_thread.h"
#include "base/win/scoped_co_mem.h"
#include "base/win/scoped_com_initializer.h"
#include "base/win/scoped_comptr.h"
#include "base/win/scoped_handle.h"
#include "media/audio/audio_io.h"
#include "media/audio/audio_parameters.h"
#include "media/base/media_export.h"

namespace media {

class AudioManagerWin;

// AudioOutputStream implementation using Windows Core Audio APIs.
// TODO(henrika): Remove IMMNotificationClient implementation now that we have
// AudioDeviceListenerWin; currently just disabled since extraction is
// extremely advanced.
class MEDIA_EXPORT WASAPIAudioOutputStream
    : public IMMNotificationClient,
      public AudioOutputStream,
      public base::DelegateSimpleThread::Delegate {
 public:
  // The ctor takes all the usual parameters, plus |manager| which is the
  // audio manager that is creating this object.
  WASAPIAudioOutputStream(AudioManagerWin* manager,
                          const AudioParameters& params,
                          ERole device_role);
  // The dtor is typically called by the AudioManager only and it is usually
  // triggered by calling AudioOutputStream::Close().
  virtual ~WASAPIAudioOutputStream();

  // Implementation of AudioOutputStream.
  virtual bool Open() OVERRIDE;
  virtual void Start(AudioSourceCallback* callback) OVERRIDE;
  virtual void Stop() OVERRIDE;
  virtual void Close() OVERRIDE;
  virtual void SetVolume(double volume) OVERRIDE;
  virtual void GetVolume(double* volume) OVERRIDE;

  // Retrieves the number of channels the audio engine uses for its internal
  // processing/mixing of shared-mode streams for the default endpoint device.
  static int HardwareChannelCount();

  // Retrieves the channel layout the audio engine uses for its internal
  // processing/mixing of shared-mode streams for the default endpoint device.
  // Note that we convert an internal channel layout mask (see ChannelMask())
  // into a Chrome-specific channel layout enumerator in this method, hence
  // the match might not be perfect.
  static ChannelLayout HardwareChannelLayout();

  // Retrieves the sample rate the audio engine uses for its internal
  // processing/mixing of shared-mode streams for the default endpoint device.
  static int HardwareSampleRate(ERole device_role);

  // Returns AUDCLNT_SHAREMODE_EXCLUSIVE if --enable-exclusive-audio is used
  // as a command-line flag and AUDCLNT_SHAREMODE_SHARED otherwise (default).
  static AUDCLNT_SHAREMODE GetShareMode();

  bool started() const { return render_thread_.get() != NULL; }

  // Returns the number of channels the audio engine uses for its internal
  // processing/mixing of shared-mode streams for the default endpoint device.
  int GetEndpointChannelCountForTesting() { return format_.Format.nChannels; }

 private:
  // Implementation of IUnknown (trivial in this case). See
  // msdn.microsoft.com/en-us/library/windows/desktop/dd371403(v=vs.85).aspx
  // for details regarding why proper implementations of AddRef(), Release()
  // and QueryInterface() are not needed here.
  STDMETHOD_(ULONG, AddRef)();
  STDMETHOD_(ULONG, Release)();
  STDMETHOD(QueryInterface)(REFIID iid, void** object);

  // Implementation of the abstract interface IMMNotificationClient.
  // Provides notifications when an audio endpoint device is added or removed,
  // when the state or properties of a device change, or when there is a
  // change in the default role assigned to a device. See
  // msdn.microsoft.com/en-us/library/windows/desktop/dd371417(v=vs.85).aspx
  // for more details about the IMMNotificationClient interface.

  // Indicates that the state of an audio endpoint device has changed.
  // This method is only used for diagnostic purposes.
  STDMETHOD(OnDeviceStateChanged)(LPCWSTR device_id, DWORD new_state);

  // The default audio endpoint device for a particular role has changed.
  STDMETHOD(OnDefaultDeviceChanged)(EDataFlow flow, ERole role,
                                    LPCWSTR new_default_device_id);

  // These IMMNotificationClient methods are currently not utilized.
  STDMETHOD(OnDeviceAdded)(LPCWSTR device_id) { return S_OK; }
  STDMETHOD(OnDeviceRemoved)(LPCWSTR device_id) { return S_OK; }
  STDMETHOD(OnPropertyValueChanged)(LPCWSTR device_id,
                                    const PROPERTYKEY key) {
    return S_OK;
  }

  // DelegateSimpleThread::Delegate implementation.
  virtual void Run() OVERRIDE;

  // Issues the OnError() callback to the |source_|.
  void HandleError(HRESULT err);

  // The Open() method is divided into these sub methods.
  HRESULT SetRenderDevice();
  HRESULT ActivateRenderDevice();
  bool DesiredFormatIsSupported();
  HRESULT InitializeAudioEngine();

  // Called when the device will be opened in shared mode and use the
  // internal audio engine's mix format.
  HRESULT SharedModeInitialization();

  // Called when the device will be opened in exclusive mode and use the
  // application-specified format.
  HRESULT ExclusiveModeInitialization();

  // Converts a unique endpoint ID to a user-friendly device name.
  std::string GetDeviceName(LPCWSTR device_id) const;

  // Called on the audio render thread when the current audio stream must
  // be re-initialized because the default audio device has changed. This
  // method: stops the current renderer, releases and re-creates all WASAPI
  // interfaces, creates a new IMMDevice and re-starts rendering using the
  // new default audio device.
  bool RestartRenderingUsingNewDefaultDevice();

  // Contains the thread ID of the creating thread.
  base::PlatformThreadId creating_thread_id_;

  // Our creator; the audio manager needs to be notified when we close.
  AudioManagerWin* manager_;

  // Rendering is driven by this thread (which has no message loop).
  // All OnMoreData() callbacks will be called from this thread.
  scoped_ptr<base::DelegateSimpleThread> render_thread_;

  // Contains the desired audio format which is set up at construction.
  // Extended PCM waveform format structure based on WAVEFORMATEXTENSIBLE.
  // Use this for multiple channel and hi-resolution PCM data.
  WAVEFORMATPCMEX format_;

  // Copy of the audio format which we know the audio engine supports.
  // It is recommended to ensure that the sample rate in |format_| is identical
  // to the sample rate in |audio_engine_mix_format_|.
  base::win::ScopedCoMem<WAVEFORMATPCMEX> audio_engine_mix_format_;

  bool opened_;

  // Set to true as soon as a new default device is detected, and cleared when
  // the streaming has switched from using the old device to the new device.
  // All additional device detections during an active state are ignored to
  // ensure that the ongoing switch can finalize without disruptions.
  bool restart_rendering_mode_;

  // Volume level from 0 to 1.
  float volume_;

  // Size in bytes of each audio frame (4 bytes for 16-bit stereo PCM).
  size_t frame_size_;

  // Size in audio frames of each audio packet, where an audio packet
  // is defined as the block of data which the source is expected to deliver
  // in each OnMoreData() callback.
  size_t packet_size_frames_;

  // Size in bytes of each audio packet.
  size_t packet_size_bytes_;

  // Size in milliseconds of each audio packet.
  float packet_size_ms_;

  // Length of the audio endpoint buffer in frames.
  size_t endpoint_buffer_size_frames_;

  // Defines the role that the system has assigned to an audio endpoint
  // device.
  ERole device_role_;

  // The sharing mode for the connection.
  // Valid values are AUDCLNT_SHAREMODE_SHARED and AUDCLNT_SHAREMODE_EXCLUSIVE
  // where AUDCLNT_SHAREMODE_SHARED is the default.
  AUDCLNT_SHAREMODE share_mode_;

  // The channel count set by the client in |params| which is provided to the
  // constructor. The client must feed the AudioSourceCallback::OnMoreData()
  // callback with PCM data that contains this number of channels.
  int client_channel_count_;

  // Counts the number of audio frames written to the endpoint buffer.
  UINT64 num_written_frames_;

  // Pointer to the client that will deliver audio samples to be played out.
  AudioSourceCallback* source_;

  // An IMMDeviceEnumerator interface which represents a device enumerator.
  base::win::ScopedComPtr<IMMDeviceEnumerator> device_enumerator_;

  // An IMMDevice interface which represents an audio endpoint device.
  base::win::ScopedComPtr<IMMDevice> endpoint_device_;

  // An IAudioClient interface which enables a client to create and initialize
  // an audio stream between an audio application and the audio engine.
  base::win::ScopedComPtr<IAudioClient> audio_client_;

  // The IAudioRenderClient interface enables a client to write output
  // data to a rendering endpoint buffer.
  base::win::ScopedComPtr<IAudioRenderClient> audio_render_client_;

  // The audio engine will signal this event each time a buffer becomes
  // ready to be filled by the client.
  base::win::ScopedHandle audio_samples_render_event_;

  // This event will be signaled when rendering shall stop.
  base::win::ScopedHandle stop_render_event_;

  // This event will be signaled when stream switching shall take place.
  base::win::ScopedHandle stream_switch_event_;

  // Container for retrieving data from AudioSourceCallback::OnMoreData().
  scoped_ptr<AudioBus> audio_bus_;

  DISALLOW_COPY_AND_ASSIGN(WASAPIAudioOutputStream);
};

}  // namespace media

#endif  // MEDIA_AUDIO_WIN_AUDIO_LOW_LATENCY_OUTPUT_WIN_H_
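// The size and delay bookkeeping documented in this header (|frame_size_|,
// |packet_size_frames_|, |packet_size_bytes_|, |packet_size_ms_|, and the
// latency budget from the top-level comment) reduces to simple arithmetic.
// The standalone sketch below is illustrative only; none of these helper
// functions exist in the class.

```cpp
// Illustrative helpers mirroring the size/delay relationships described
// in the header comments. Not part of WASAPIAudioOutputStream.
#include <cstddef>

// Bytes per audio frame: one sample per channel (cf. |frame_size_|;
// 4 bytes for 16-bit stereo PCM).
size_t FrameSizeBytes(int channels, int bits_per_sample) {
  return static_cast<size_t>(channels) * (bits_per_sample / 8);
}

// Bytes per packet (cf. |packet_size_bytes_|).
size_t PacketSizeBytes(size_t packet_size_frames, size_t frame_size) {
  return packet_size_frames * frame_size;
}

// Packet duration in milliseconds (cf. |packet_size_ms_|).
float PacketSizeMs(size_t packet_size_frames, int sample_rate) {
  return 1000.0f * packet_size_frames / sample_rate;
}

// Approximate total shared-mode delay per the overview comment: device
// period (~10 ms) + stream latency (~5 ms) + endpoint buffer, where the
// endpoint buffer is ~20 ms for packet sizes of 20 ms or less, and the
// packet size itself for larger packets (35 ms total at 10 ms packets,
// ~115 ms at 100 ms packets).
int ExpectedTotalDelayMs(int packet_size_ms) {
  const int kDevicePeriodMs = 10;
  const int kStreamLatencyMs = 5;
  int endpoint_buffer_ms = packet_size_ms <= 20 ? 20 : packet_size_ms;
  return kDevicePeriodMs + kStreamLatencyMs + endpoint_buffer_ms;
}
```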