Have you tried an "rt:aud,clear1000" kind of approach, i.e., asking to have the screen cleared 1000 ms into the audio RT? The "clear" function may not be applicable to audio RTs yet, but it'd be worth a try. Otherwise, the only workaround I can think of would be to convert the image to a video/movie/animated GIF file that lasts the length you are looking for. Show the movie and THEN take the voice RT, with a TIME value of 0 ms so there is no delay between the two. That way the image will simply clear on its own without any help needed from DirectRT; it will simply be "done and gone" if the RT has not yet occurred by the time it ends.
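
If you go the movie/GIF route, the main thing to get right is the clip length. Here is a rough Python sketch for working out how many frames a clip needs to last a given display time. The function names and the 30 fps figure are just my assumptions for illustration, nothing DirectRT-specific. The one format fact it relies on is that GIF stores frame delays in hundredths of a second, so a GIF duration can only be set in 10 ms steps.

```python
import math


def frames_for_duration(duration_ms, fps):
    """Return (frame_count, actual_ms) for a clip lasting at least duration_ms.

    Rounds the frame count up so the image never disappears early.
    """
    frames = max(1, math.ceil(duration_ms * fps / 1000))
    actual_ms = frames * 1000 / fps
    return frames, actual_ms


def gif_delay_cs(duration_ms):
    """Convert milliseconds to GIF delay units (hundredths of a second),
    rounding to the nearest unit."""
    return (duration_ms + 5) // 10


if __name__ == "__main__":
    # Assumed example: a 1000 ms display at 30 frames per second.
    frames, actual = frames_for_duration(1000, 30)
    print(f"video: {frames} frames, lasting {actual:.1f} ms")
    print(f"GIF delay: {gif_delay_cs(1000)} hundredths of a second")
```

Note that a duration that is not a whole number of frames gets rounded up, so a 750 ms target at 30 fps ends up as 23 frames, a bit longer than requested. Pick a frame rate that divides your target evenly if the exact offset matters.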