psychopy.sound - play various forms of sound

Sound

PsychoPy currently supports a choice of sound engines: PTB, pyo, sounddevice or pygame. You can select which will be used via the audioLib preference. sound.Sound() will then refer to one of SoundPTB, SoundDeviceSound, SoundPyo or SoundPygame. This preference can be set on a per-experiment basis by importing preferences and setting the audioLib option before the sound module is imported.

• The PTB library has by far the lowest latencies and is strongly recommended (requires 64-bit Python 3).

• The pyo library is, in theory, the highest performer, but in practice it has often had issues (at least on macOS) with crashes and freezing of experiments, or causing them not to finish properly. If those issues aren’t affecting your studies then this could be the one for you.

• The sounddevice library looks like the way of the future. The performance appears to be good (although this might be less so in cases where you have complex rendering being done as well, because it operates from the same CPU core as the main experiment code). It’s newer than pyo and so more prone to bugs, and we haven’t yet added microphone support to record your participants.

• The pygame backend is the oldest and should work without errors, but has the poorest performance. Use it if latencies for your audio don’t matter.

Sounds are actually generated by a variety of classes, depending on which “backend” you use (like pyo or sounddevice) and these different backends can have slightly different attributes, as below.

The user should typically do:

from psychopy.sound import Sound


but the class that gets imported will then be an alias of one of the Sound Classes described below.
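As a minimal sketch of the per-experiment selection described above, the audioLib preference can be set before the sound module is imported. The helper name `prefer_audio_libs` is hypothetical and only wraps the assignment; the list form of `prefs.hardware['audioLib']` (libraries tried in order) is how the preference is usually written, but check it against your installed version.

```python
def prefer_audio_libs(prefs, libs):
    """Store an ordered list of audio libraries on a PsychoPy-style prefs object.

    `prefer_audio_libs` is a hypothetical helper, not part of PsychoPy.
    """
    prefs.hardware["audioLib"] = list(libs)
    return prefs.hardware["audioLib"]


try:
    from psychopy import prefs  # import prefs first ...
except ImportError:
    prefs = None  # PsychoPy is not installed; nothing to configure

if prefs is not None:
    prefer_audio_libs(prefs, ["PTB", "sounddevice"])
    # ... and only afterwards run: from psychopy import sound
```

The order of the two imports matters here because the backend is chosen when the sound module is first imported.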

PTB audio latency

PTB brings a number of advantages in terms of latency.

The first is that it has been designed specifically with low-latency playback in mind (rather than, say, on-the-fly mixing and filtering capabilities). Mario Kleiner has worked very hard to get the best out of the drivers available on each operating system and, as a result, with the most aggressive low-latency settings you can get a sound to play in “immediate” mode with a lag typically in the region of 5 ms and a precision of maybe 1 ms. That’s pretty good compared to the other options, which have a lag of 20 ms upwards and several ms of variability.

BUT, on top of that, PTB allows you to preschedule your sound to occur at a particular point in time (e.g. when the trigger is due to be sent or when the screen is due to flip) and the PTB engine will then prepare all the buffers ready to go and will also account for the known latencies in the card. With this method the PTB engine is capable of sub-ms precision and even sub-ms lag!

Of course, capable doesn’t mean it’s happening in your case. It can depend on many things about the local operating system and hardware. You should test it yourself for your kit, but here is an example of a standard Win10 box using built-in audio (not a fancy audio card):

Fig. 18 Sub-ms audio timing with standard audio on Win10. Yellow trace is a 440 Hz tone played at 48 kHz with PTB engine. Cyan trace is the trigger (from a Labjack output). Gridlines are set to 1 ms.

The most precise way to use the PTB audio backend is to preschedule the playing of a sound. By doing this PTB can take into account both the time taken to load the sound (it will be preloaded, ready to play) and the time taken by the hardware to start playing it.

To do this you can call play() with an argument called when. The when argument needs to be in the PsychToolBox clock timebase, which can be accessed using psychtoolbox.GetSecs() if you want to play the sound at an arbitrary time (not in sync with a window flip).

For instance:

import psychtoolbox as ptb
from psychopy import sound

mySound = sound.Sound('A')
now = ptb.GetSecs()
mySound.play(when=now+0.5)  # play in EXACTLY 0.5s


or using Window.getFutureFlipTime(clock='ptb') if you want a synchronized time:

import psychtoolbox as ptb
from psychopy import sound, visual

mySound = sound.Sound('A')

win = visual.Window()
win.flip()
nextFlip = win.getFutureFlipTime(clock='ptb')

mySound.play(when=nextFlip)  # sync with screen refresh


The precision of that timing still depends on the PTB Audio Latency Modes, and it cannot work if the delay before the requested time is too short for the requested mode. For example, if you request that the sound starts on the next refresh but set the latency mode to 0 (which has a lag of around 300 ms), then the timing will be very poor.
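The constraint just described is simple arithmetic: the gap between now and the requested onset must be at least as long as the latency of the chosen mode. A minimal sketch, where `has_enough_lead` is a hypothetical helper and the latency figures are the rough values quoted on this page (about 300 ms for mode 0, about 5 ms for the most aggressive settings), not measurements:

```python
def has_enough_lead(now, when, expected_latency):
    """Return True if the requested onset leaves room for the engine's latency.

    All arguments are in seconds and must come from the same clock.
    """
    return (when - now) >= expected_latency


frame = 1.0 / 60.0      # one refresh on a 60 Hz display, about 16.7 ms
now = 100.0             # stand-in for a clock reading such as ptb.GetSecs()
next_flip = now + frame

print(has_enough_lead(now, next_flip, 0.300))  # mode 0 (~300 ms): False
print(has_enough_lead(now, next_flip, 0.005))  # aggressive mode (~5 ms): True
```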

PTB Audio Latency Modes

When using the PTB backend you get the option to choose the Latency Mode, referred to in PsychToolBox as the reqlatencyclass.

PsychoPy uses Mode 3 as a default, assuming that you want low latency and you don’t care if other applications can’t play sound at the same time (don’t listen to iTunes while running your study!).

The modes are as follows:

0 : Latency not important

For when it really doesn’t matter. Latency can easily be in the region of 300ms!

1 : Share low-latency access

Tries to use a low-latency setup in combination with other applications. Latency usually isn’t very good, and in MS Windows the sound you play must have the same sample rate as any other application that is using the sound system (which means you usually get restricted to exactly 48000 Hz instead of 44100 Hz).

2 : Exclusive mode low-latency

Takes control of the audio device you’re using and dominates it. That can cause some problems for other apps if they’re trying to play sounds at the same time.

3 : Aggressive exclusive mode

As Mode 2 but with more aggressive settings to prioritise our use of the card over all others. This is the recommended mode for most studies.

4 : Critical mode

As Mode 3 except that, if we fail to be totally dominant, then raise an error rather than just accepting our slightly less dominant status.
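A minimal sketch of selecting one of these modes in code. It assumes the mode is exposed as the `audioLatencyMode` hardware preference; `LATENCY_MODES` and `set_latency_mode` are hypothetical names used only for illustration.

```python
# Mode numbers and names as listed above.
LATENCY_MODES = {
    0: "Latency not important",
    1: "Share low-latency access",
    2: "Exclusive mode low-latency",
    3: "Aggressive exclusive mode",
    4: "Critical mode",
}


def set_latency_mode(prefs, mode=3):
    """Validate `mode` and store it on a PsychoPy-style prefs object."""
    if mode not in LATENCY_MODES:
        raise ValueError(f"latency mode must be one of {sorted(LATENCY_MODES)}")
    # Assumption: the preference is named 'audioLatencyMode'.
    prefs.hardware["audioLatencyMode"] = mode
    return LATENCY_MODES[mode]


try:
    from psychopy import prefs
except ImportError:
    prefs = None  # PsychoPy is not installed

if prefs is not None:
    set_latency_mode(prefs, 3)  # do this before importing psychopy.sound
```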

Sound Classes

PTB Sound

class psychopy.sound.backend_ptb.SoundPTB(value='C', secs=0.5, octave=4, stereo=-1, volume=1.0, loops=0, sampleRate=None, blockSize=128, preBuffer=-1, hamming=True, startTime=0, stopTime=-1, name='', autoLog=True, syncToWin=None)

Play a variety of sounds using the new PsychPortAudio library

Parameters
• value – note name (“C”, “Bfl”), filename or frequency (Hz)

• secs – duration (for synthesised tones)

• octave – which octave to use for note names (4 is middle)

• stereo – -1 (auto), True or False to force sounds to stereo or mono

• volume – float 0-1

• loops – number of times to repeat the sound after it has played once (-1 = forever, 0 = play once with no repeat)

• sampleRate – sample rate for synthesized tones

• blockSize – the size of the buffer on the sound card (small for low latency, large for stability)

• preBuffer – integer to control streaming/buffering: -1 means store all; 0 (no buffer) means stream from disk; potentially we could buffer a few secs(!?)

• hamming – boolean (default True) to indicate if the sound should be apodized (i.e., the onset and offset smoothly ramped up from and down to zero). The function apodize uses a Hanning window, but arguments named ‘hamming’ are preserved so that existing code is not broken by the change from Hamming to Hanning internally. Not applied to sounds from files.

• startTime – for sound files this controls the start of snippet

• stopTime – for sound files this controls the end of snippet

• name – string for logging purposes

• autoLog – whether to automatically log every change

• syncToWin – provide this if you want start/stop to sync with window flips
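To illustrate what the hamming option does to a synthesised tone, here is a minimal pure-Python sketch of a raised-cosine (Hanning) onset and offset ramp. It is not PsychoPy's apodize function; the 5 ms ramp length, the 48 kHz sample rate and the names `make_tone` and `apply_ramps` are assumptions for illustration.

```python
import math


def make_tone(freq, secs, sample_rate=48000):
    """Return a sine tone as a list of floats in the range -1..1."""
    n = int(round(secs * sample_rate))
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]


def apply_ramps(samples, sample_rate=48000, ramp_secs=0.005):
    """Fade the start in and the end out with half Hanning windows."""
    n_ramp = min(int(ramp_secs * sample_rate), len(samples) // 2)
    out = list(samples)
    for i in range(n_ramp):
        gain = 0.5 * (1 - math.cos(math.pi * i / n_ramp))  # rises from 0 towards 1
        out[i] *= gain        # onset ramp
        out[-1 - i] *= gain   # offset ramp (mirror image)
    return out


tone = make_tone(440, 0.5)
smooth = apply_ramps(tone)
```

Without the ramps the tone would start and stop abruptly, which is what produces an audible click; samples outside the two ramps are left untouched.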

_EOS(reset=True, log=True)

Function called on End Of Stream

_channelCheck(array)

Checks whether the stream has fewer channels than the data. If so, raises ValueError.

_getDefaultSampleRate()

Check what streams are open and use one of these

pause()

Stops the sound, but a later play() will continue from this point.

play(loops=None, when=None, log=True)

Start the sound playing

setSound(value, secs=0.5, octave=4, hamming=None, log=True)

Set the sound to be played.

Often this is not needed by the user - it is called implicitly during initialisation.

Parameters
value: can be a number, string or an array:
• If it’s a number between 37 and 32767 then a tone will be generated at that frequency in Hz.

• It could be a string for a note (‘A’, ‘Bfl’, ‘B’, ‘C’, ‘Csh’, …). Then you may want to specify which octave.

• Or a string could represent a filename in the current location, or mediaLocation, or a full path combo

• Or by giving an Nx2 numpy array of floats (-1:1) you can specify the sound yourself as a waveform

secs: duration (only relevant if the value is a note name or a frequency value)

octave: is only relevant if the value is a note name. Middle octave of a piano is 4. Most computers won’t output sounds in the bottom octave (1) and the top octave (8) is generally painful.
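As a rough illustration of how a note name and octave relate to a frequency, here is a minimal sketch assuming twelve-tone equal temperament with A in octave 4 at 440 Hz. The function `note_to_freq` and its lookup table are hypothetical and only follow the naming pattern shown above ('sh' for sharp, 'fl' for flat); the backend's own conversion may differ.

```python
# Semitone offsets from A within one octave (C is 9 semitones below A).
STEPS_FROM_A = {
    "C": -9, "Csh": -8, "Dfl": -8, "D": -7, "Dsh": -6, "Efl": -6,
    "E": -5, "F": -4, "Fsh": -3, "Gfl": -3, "G": -2, "Gsh": -1,
    "Afl": -1, "A": 0, "Ash": 1, "Bfl": 1, "B": 2,
}


def note_to_freq(note, octave=4):
    """Convert a note name such as 'A' or 'Bfl' to a frequency in Hz."""
    semitones = STEPS_FROM_A[note] + 12 * (octave - 4)
    return 440.0 * 2.0 ** (semitones / 12.0)


print(round(note_to_freq("A", 4), 2))   # 440.0
print(round(note_to_freq("C", 4), 2))   # 261.63
print(round(note_to_freq("A", 5), 2))   # 880.0
```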

property status

status gives a simple value from psychopy.constants to indicate NOT_STARTED, STARTED, FINISHED, PAUSED

Psychtoolbox sounds also have a statusDetailed property with further info

stop(reset=True, log=True)

property stream

Read-only property; returns the stream on which the sound will be played

property track

The track on the master stream to which we belong

SoundDevice Sound

class psychopy.sound.backend_sounddevice.SoundDeviceSound(value='C', secs=0.5, octave=4, stereo=-1, volume=1.0, loops=0, sampleRate=None, blockSize=128, preBuffer=-1, hamming=True, startTime=0, stopTime=-1, name='', autoLog=True)

Play a variety of sounds using the new SoundDevice library

Parameters
• value – note name (“C”, “Bfl”), filename or frequency (Hz)

• secs – duration (for synthesised tones)

• octave – which octave to use for note names (4 is middle)

• stereo – -1 (auto), True or False to force sounds to stereo or mono

• volume – float 0-1

• loops – number of times to repeat the sound after it has played once (-1 = forever, 0 = play once with no repeat)

• sampleRate – sample rate (for synthesized tones)

• blockSize – the size of the buffer on the sound card (small for low latency, large for stability)

• preBuffer – integer to control streaming/buffering: -1 means store all; 0 (no buffer) means stream from disk; potentially we could buffer a few secs(!?)

• hamming – boolean (default True) to indicate if the sound should be apodized (i.e., the onset and offset smoothly ramped up from and down to zero). The function apodize uses a Hanning window, but arguments named ‘hamming’ are preserved so that existing code is not broken by the change from Hamming to Hanning internally. Not applied to sounds from files.

• startTime – for sound files this controls the start of snippet

• stopTime – for sound files this controls the end of snippet

• name – string for logging purposes

• autoLog – whether to automatically log every change

_EOS(reset=True)

Function called on End Of Stream

_channelCheck(array)

Checks whether the stream has fewer channels than the data. If so, raises ValueError.

pause()

Stops the sound, but a later play() will continue from this point.

play(loops=None, when=None)

Start the sound playing

Parameters

when (not used) – Included for compatibility purposes

setSound(value, secs=0.5, octave=4, hamming=None, log=True)

Set the sound to be played.

Often this is not needed by the user - it is called implicitly during initialisation.

Parameters
value: can be a number, string or an array:
• If it’s a number between 37 and 32767 then a tone will be generated at that frequency in Hz.

• It could be a string for a note (‘A’, ‘Bfl’, ‘B’, ‘C’, ‘Csh’, …). Then you may want to specify which octave.

• Or a string could represent a filename in the current location, or mediaLocation, or a full path combo

• Or by giving an Nx2 numpy array of floats (-1:1) you can specify the sound yourself as a waveform

secs: duration (only relevant if the value is a note name or a frequency value)

octave: is only relevant if the value is a note name. Middle octave of a piano is 4. Most computers won’t output sounds in the bottom octave (1) and the top octave (8) is generally painful.
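The array option for value can be sketched without any audio hardware. The example below builds N rows of [left, right] samples in plain Python; wrapping the result with numpy.asarray would give the Nx2 float array described above. The helper name `stereo_waveform`, the 48 kHz rate and the 0.5 amplitude are assumptions for illustration.

```python
import math


def stereo_waveform(freq_left, freq_right, secs, sample_rate=48000, amp=0.5):
    """Return N rows of [left, right] floats, each within -1..1."""
    n = int(round(secs * sample_rate))
    rows = []
    for i in range(n):
        t = i / sample_rate  # time of this sample in seconds
        left = amp * math.sin(2 * math.pi * freq_left * t)
        right = amp * math.sin(2 * math.pi * freq_right * t)
        rows.append([left, right])
    return rows


rows = stereo_waveform(440, 660, secs=0.1)
print(len(rows), len(rows[0]))  # 4800 2
```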

stop(reset=True)

property stream

Read-only property; returns the stream on which the sound will be played

Pyo Sound

class psychopy.sound.backend_pyo.SoundPyo(value='C', secs=0.5, octave=4, stereo=True, volume=1.0, loops=0, sampleRate=44100, bits=16, hamming=True, start=0, stop=-1, name='', autoLog=True)

Create a sound object in one of many ways.

value: can be a number, string or an array:
• If it’s a number between 37 and 32767 then a tone will be generated at that frequency in Hz.

• It could be a string for a note (‘A’, ‘Bfl’, ‘B’, ‘C’, ‘Csh’, …). Then you may want to specify which octave as well

• Or a string could represent a filename in the current location, or mediaLocation, or a full path combo

• Or by giving an Nx2 numpy array of floats (-1:1) you can specify the sound yourself as a waveform

By default, a Hanning window (5ms duration) will be applied to a generated tone, so that onset and offset are smoother (to avoid clicking). To disable the Hanning window, set hamming=False.

secs:

Duration of a tone. Not used for sounds from a file.

start: float

Where to start playing a sound file; default = 0s (start of the file).

stop: float

Where to stop playing a sound file; default = end of file.

octave: is only relevant if the value is a note name.

Middle octave of a piano is 4. Most computers won’t output sounds in the bottom octave (1) and the top octave (8) is generally painful

stereo: True (= default, two channels left and right), False (one channel)

volume: loudness to play the sound, from 0.0 (silent) to 1.0 (max).

Adjustments are not possible during playback, only before.

loops: int

How many times to repeat the sound after it plays once. If loops == -1, the sound will repeat indefinitely until stopped.

sampleRate (= 44100): if the psychopy.sound.init() function has been called or if another sound has already been created then this argument will be ignored and the previous setting will be used

bits: has no effect for the pyo backend

hamming: boolean (default True) to indicate if the sound should be apodized (i.e., the onset and offset smoothly ramped up from and down to zero). The function apodize uses a Hanning window, but arguments named ‘hamming’ are preserved so that existing code is not broken by the change from Hamming to Hanning internally. Not applied to sounds from files.

play(loops=None, autoStop=True, log=True, when=None)

Starts playing the sound on an available channel.

loops: int

How many times to repeat the sound after it plays once. If loops == -1, the sound will repeat indefinitely until stopped.

when: not used but included for compatibility purposes

For playing a sound file, you cannot specify the start and stop times when playing the sound, only when creating the sound initially.

Playing a sound runs in a separate thread, i.e. your code won’t wait for the sound to finish before continuing. To pause while the sound plays, call psychopy.core.wait(mySound.getDuration()). If you call play() while something is already playing, the sounds will be played over each other.
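Because play() returns immediately, a script that must block until a looped sound has finished needs the total playing time. A minimal sketch, assuming loops counts repeats after the first play as described above; `total_play_time` is a hypothetical helper whose result would be passed to psychopy.core.wait().

```python
def total_play_time(duration, loops=0):
    """Seconds needed to play a sound once plus `loops` repeats.

    Returns None for loops == -1, since the sound then repeats until stopped.
    """
    if loops == -1:
        return None
    if loops < -1:
        raise ValueError("loops must be -1 (forever) or a non-negative integer")
    return duration * (loops + 1)


print(total_play_time(0.5))             # 0.5
print(total_play_time(0.5, loops=2))    # 1.5
print(total_play_time(0.5, loops=-1))   # None
```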

stop(log=True)

Stops the sound immediately

pygame Sound

class psychopy.sound.backend_pygame.SoundPygame(value='C', secs=0.5, octave=4, sampleRate=44100, bits=16, name='', autoLog=True, loops=0, stereo=True, hamming=False)

Create a sound object in one of many ways.

Parameters
value (int, str or array_like): can be a number, string or an array:
• If it’s a number between 37 and 32767 then a tone will be generated at that frequency in Hz.

• It could be a string for a note (‘A’, ‘Bfl’, ‘B’, ‘C’, ‘Csh’, …). Then you may want to specify which octave as well.

• Or a string could represent a filename in the current location, or mediaLocation, or a full path combo.

• Or by giving an Nx2 numpy array of floats (-1:1) you can specify the sound yourself as a waveform.

secs (float): duration of the sound in seconds (only relevant if the value is a note name or a frequency value).

octave: is only relevant if the value is a note name. Middle octave of a piano is 4. Most computers won’t output sounds in the bottom octave (1) and the top octave (8) is generally painful.

sampleRate(=44100): If a sound has already been created or if the

bits(=16): Pygame uses the same bit depth for all sounds once initialised.

fadeOut(mSecs)

Fades out the sound (when playing) over mSecs. Don’t know why you would do this in psychophysics but it’s easy and fun to include as a possibility :)

getDuration()

Gets the duration of the current sound in secs

getVolume()

Returns the current volume of the sound (0.0:1.0)

play(fromStart=True, log=True, loops=None, when=None)

Starts playing the sound on an available channel.

Parameters
fromStart: bool

Not yet implemented.

log: bool

Whether or not to log the playback event.

loops: int

How many times to repeat the sound after it plays once. If loops == -1, the sound will repeat indefinitely until stopped.

when: not used but included for compatibility purposes

Notes

If no sound channels are available, it will not play and will return None. This runs off a separate thread, i.e. your code won’t wait for the sound to finish before continuing. You need to use a psychopy.core.wait() command if you want things to pause. If you call play() while something is already playing, the sounds will be played over each other.

setVolume(newVol, log=True)

Sets the current volume of the sound (0.0:1.0)

stop(log=True)

Stops the sound immediately