This kind of task would be ideal for DirectRT. If you need to do it in MediaLab, though, the onset parameter applies (I believe) only to question wording, not to media. You would need to add 500 ms of silence to the start of your sound file using a sound editor such as Audacity (or whatever you prefer). Alternatively, you could create a small HTML page that plays the sound file after a 500 ms delay and add that HTML file as a background. I can elaborate on any of this if it would be useful.
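
For the silence-padding route, if you would rather script it than open a sound editor, something like the following rough Python sketch might do it. This is my own assumption of one way to pad the file, not anything built into MediaLab or DirectRT. It uses only Python's standard `wave` module and assumes the sound is an uncompressed PCM WAV file; the function name `prepend_silence` and the file names in the usage comment are hypothetical.

```python
import wave


def prepend_silence(src_path, dst_path, silence_ms=500):
    """Write a copy of a PCM WAV file with silence added at the start."""
    # Read the format details and all audio frames from the original file.
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    # Number of frames needed to cover the requested duration.
    n_silent_frames = int(params.framerate * silence_ms / 1000)
    bytes_per_frame = params.sampwidth * params.nchannels

    # 8-bit WAV samples are unsigned, so silence is 0x80;
    # wider samples are signed, so silence is 0x00.
    silent_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    silence = silent_byte * (n_silent_frames * bytes_per_frame)

    # Write the silence followed by the original audio, same format.
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(silence + frames)


# Example call (hypothetical file names):
# prepend_silence("stimulus.wav", "stimulus_delayed.wav", silence_ms=500)
```

You would then point MediaLab at the padded copy instead of the original. Other formats such as MP3 would need a different tool, so in that case the sound editor is probably the simpler option.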