We may still be far from what we see on television in shows such as CSI, as the enhancement of audio recordings can still require hours of intense analysis. With a combination of cutting-edge audio enhancement technology and an examiner with extensive training and experience in forensic audio enhancement, however, we can now achieve far greater results than were possible just a couple of decades ago.
In his book, “The Acoustics of Crime: The New Science of Forensic Phonetics,” which covers a wide range of subjects relating to the analysis of audio recordings for evidentiary purposes, Mr. Harry Hollien, a noted expert in the field of forensic phonetics, describes a number of audio enhancement techniques that were developed in the 1990s. He stated at that time, “… none of these techniques currently appear to be fully operational (i.e., valid and inexpensive); most are extremely complex, time consuming and costly. At least, they are too costly to apply to the routine material produced for law enforcement, business, security or military purposes.” Two of the techniques detailed by Hollien are cross-channel correlation and adaptive filtering. With the advances in computing over the last twenty years, these techniques have become common and invaluable tools for the forensic audio examiner.
In covert recordings, audible changes in the environment may occur unpredictably and interfere with the source of interest. For example, the recorder may have been placed inside or in close proximity to an air conditioning unit, making the conversation unintelligible. The noise coming from the air conditioner will be largely constant in frequency and amplitude throughout the recording, making it highly predictable; this is where adaptive filtering becomes very useful. This filter uses a signal predictor to identify these time-correlated (predictable) sounds within recordings. Once the time-correlated sounds have been detected, the predictor learns the pattern of the noise and estimates what will occur next. The predicted signal can then be subtracted from the original recording, attenuating the noise from the air conditioning unit while leaving the speech largely unaffected. Adaptive filters tend to work better with low-frequency noises, such as electrical and vehicle motors, but can also be effective with the reverberation and echo that often occur in police interview recordings.
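The predict-and-subtract idea described above can be illustrated with a short sketch. This is not the algorithm of any particular forensic product; it assumes a normalized least-mean-squares (NLMS) linear predictor, and the function name, tap count, step size and synthetic test signal are all hypothetical choices made for illustration. White noise stands in for speech here because, like speech over short time spans, it cannot be predicted from its own past the way a steady tone can.

```python
import math
import random


def predictive_noise_filter(signal, taps=16, delay=1, mu=0.1):
    """Remove the predictable part of `signal` with an NLMS linear predictor.

    At each sample the filter predicts the current value from `taps` past
    samples, subtracts the prediction, and uses the remaining error to
    update its weights. The returned error sequence is the enhanced signal.
    """
    weights = [0.0] * taps
    enhanced = []
    for n in range(len(signal)):
        # Past samples only: index n - delay - k, zero before the start.
        past = [signal[n - delay - k] if n - delay - k >= 0 else 0.0
                for k in range(taps)]
        predicted = sum(w * x for w, x in zip(weights, past))
        error = signal[n] - predicted
        # Normalized LMS update: the step is scaled by the input energy.
        energy = sum(x * x for x in past) + 1e-9
        step = mu * error / energy
        weights = [w + step * x for w, x in zip(weights, past)]
        enhanced.append(error)
    return enhanced


def mean_power(samples):
    """Average squared amplitude of a list of samples."""
    return sum(s * s for s in samples) / len(samples)


# Synthetic demo (assumed values): a steady 200 Hz tone plays the role of
# the air conditioner at a 4 kHz sample rate, and quiet white noise stands
# in for the unpredictable speech.
random.seed(0)
rate = 4000
hum = [math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
speech = [random.gauss(0.0, 0.1) for _ in range(rate)]
noisy = [h + s for h, s in zip(hum, speech)]
enhanced = predictive_noise_filter(noisy)

# Compare power over the second half, after the filter has adapted.
half = rate // 2
print(mean_power(noisy[half:]), mean_power(enhanced[half:]))
```

In practice the `delay` parameter would be set larger so that the predictor cannot latch onto the short-term correlation in real speech; the value of 1 used here is only adequate for this synthetic signal.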
Loud music, television or radio is often present in covert recordings, intended to conceal a potentially incriminating conversation. These types of interference can be a major issue, especially if they are as loud as or louder than the source of interest and, unlike time-correlated sounds, they are unpredictable over time. Using cross-channel correlation, such interference can be significantly attenuated either during the recording or during the enhancement process. For cross-channel correlation to be successful, a reference source is required; therefore, in a live situation two microphones would be needed. One microphone would be used to record the conversation and a second microphone to record the signal coming from the television or the radio, for example. Both signals are then time aligned, and the filter determines which elements of the two recordings share similar patterns and attenuates those matching components, increasing the intelligibility of the source of interest. The interfering noise will often lie within the same frequency band as the conversation; therefore, cross-correlation filters allow the examiner to control the amount of attenuation applied, to avoid affecting the source of interest and to avoid the unnatural-sounding artifacts that over-processing produces. Cross-correlation filters can also be used in non-real-time situations by time aligning a music track from a CD with the unintelligible recording. This process is very time consuming, as it requires precise alignment of the two recordings by eye and ear, as well as filtering of the reference signal to make it sound similar to the music obscuring the conversation. Cross-correlation filters have a built-in delay feature to help with the alignment of both recordings, but it only allows for adjustments of a few milliseconds.
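The two-channel approach described above can be sketched as an adaptive reference canceller. This is a minimal illustration under stated assumptions rather than the filter found in any commercial tool: it uses an NLMS update, the `amount` parameter is a hypothetical stand-in for the attenuation control mentioned above, and the "room path" coefficients are invented for the demonstration. White noise stands in for both the music and the speech.

```python
import random


def reference_canceller(primary, reference, taps=8, mu=0.1, amount=1.0):
    """Attenuate the part of `primary` that can be modelled from `reference`.

    primary   -- the covert recording (conversation plus interference)
    reference -- a time-aligned copy of the interference alone
    amount    -- hypothetical 0..1 control for how much of the estimated
                 interference is subtracted from the output
    """
    weights = [0.0] * taps
    output = []
    for n in range(len(primary)):
        # Most recent reference samples, zero before the start.
        recent = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        estimate = sum(w * x for w, x in zip(weights, recent))
        # The filter always adapts on the full error, whatever `amount` is.
        error = primary[n] - estimate
        energy = sum(x * x for x in recent) + 1e-9
        step = mu * error / energy
        weights = [w + step * x for w, x in zip(weights, recent)]
        output.append(primary[n] - amount * estimate)
    return output


def mean_power(samples):
    """Average squared amplitude of a list of samples."""
    return sum(s * s for s in samples) / len(samples)


# Synthetic demo (assumed values throughout).
random.seed(1)
length = 4000
music = [random.gauss(0.0, 1.0) for _ in range(length)]
speech = [random.gauss(0.0, 0.2) for _ in range(length)]
# The music reaches the hidden microphone through a short, made-up room path.
interference = [0.8 * music[n]
                + (0.4 * music[n - 2] if n >= 2 else 0.0)
                - (0.2 * music[n - 5] if n >= 5 else 0.0)
                for n in range(length)]
primary = [s + i for s, i in zip(speech, interference)]
cleaned = reference_canceller(primary, music)

# Interference power before, and what is left of it after adaptation.
half = length // 2
leftover = [c - s for c, s in zip(cleaned[half:], speech[half:])]
print(mean_power(interference[half:]), mean_power(leftover))
```

Setting `amount` below 1.0 subtracts only part of the estimated interference, which trades some residual music for a lower risk of processing artifacts.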
Due to these delay issues, cross-correlation filters will often produce better results with digital recordings than with analogue recordings. The inconsistent speed of the motors inside an analogue recorder creates a greater and less stable delay between the reference signal and the covert recording. A study titled “Music & Noise Fingerprinting and Reference Cancellation Applied to Forensic Audio Enhancement” by Anil Alexander, Oscar Forth and Donald Tunstall was recently presented at the 46th AES Forensic Audio Conference, demonstrating an automated method for cross-correlation filtering which yielded very promising results.
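The time alignment that reference cancellation depends on can be approximated automatically by searching for the delay at which the two recordings correlate most strongly. The sketch below is an illustration only, not the method of the study cited above; the function names and the synthetic 37-sample delay are hypothetical. It finds a single constant offset, so it would not by itself correct the drifting delay caused by an analogue recorder's varying motor speed.

```python
import random


def best_lag(recording, reference, max_lag):
    """Return the delay (in samples) of `reference` inside `recording`.

    Tries every lag from -max_lag to +max_lag and keeps the one with the
    largest cross-correlation. A positive result means the recording lags
    behind the reference.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n in range(len(reference)):
            m = n + lag
            if 0 <= m < len(recording):
                score += recording[m] * reference[n]
        if score > best_score:
            best, best_score = lag, score
    return best


def shift(samples, lag):
    """Delay `samples` by `lag` samples (advance if negative), zero-padded."""
    count = len(samples)
    return [samples[i - lag] if 0 <= i - lag < count else 0.0
            for i in range(count)]


# Synthetic demo (assumed values): the reference track appears in the covert
# recording 37 samples late, quieter, and mixed with a noise-like "speech".
random.seed(2)
length = 2000
reference = [random.gauss(0.0, 1.0) for _ in range(length)]
speech = [random.gauss(0.0, 0.3) for _ in range(length)]
recording = [0.6 * d + s for d, s in zip(shift(reference, 37), speech)]

lag = best_lag(recording, reference, max_lag=100)
aligned_reference = shift(reference, lag)
print(lag)
```

Once the bulk offset has been removed this way, the remaining few milliseconds of misalignment fall within the range that a canceller's own delay adjustment is described as handling.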
Even though many environmental aspects of a covert recording scenario are outside of the operator’s control, many of the issues that degrade the intelligibility of a recording can easily be avoided during the recording process. The quality of the recording will only be as good as the quality of the recorder and microphone being used. Many small and inexpensive recorders cannot accurately record the entire vocal bandwidth, and the operator will often sacrifice recording quality for recording time. Recording levels also matter. Every component of the recorder introduces noise into the recording; this noise is called the noise floor. For the recording to be intelligible, the recorded source must be at a level significantly higher than the noise floor, so levels that are set too low result in a poor signal-to-noise ratio. If the levels are set too high, the recorded speech will sound distorted and crucial information will be lost.
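The trade-off between levels that are too low and levels that are too high can be made concrete with a small numerical sketch. The signal-to-noise ratio in decibels is ten times the base-10 logarithm of the ratio of signal power to noise power. Everything else below is assumed for illustration: a sine wave stands in for the source, the noise floor value and the three gain settings are arbitrary, and the helper names are hypothetical.

```python
import math
import random


def mean_power(samples):
    """Average squared amplitude of a list of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels from two sample lists."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))


def apply_gain(samples, gain, full_scale=1.0):
    """Scale the samples and hard-clip anything beyond full scale."""
    return [max(-full_scale, min(full_scale, gain * s)) for s in samples]


def clipped_fraction(samples, full_scale=1.0):
    """Fraction of samples sitting at or beyond full scale."""
    return sum(1 for s in samples if abs(s) >= full_scale) / len(samples)


# Assumed values: a sine wave of amplitude 0.5 stands in for the source and
# the recorder's noise floor is modelled as Gaussian noise.
random.seed(3)
source = [0.5 * math.sin(2 * math.pi * n / 50) for n in range(1000)]
noise = [random.gauss(0.0, 0.01) for _ in range(1000)]

for gain in (0.02, 1.0, 4.0):
    recorded = apply_gain(source, gain)
    print(gain, round(snr_db(recorded, noise), 1), clipped_fraction(recorded))
```

With the lowest gain the source sits close to the noise floor, while the highest gain drives a large share of the samples into clipping; the middle setting avoids both problems.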
Computer technology has provided the forensic audio examiner with an arsenal of invaluable tools such as adaptive and cross-correlation filters. By combining an optimum recording procedure with the audio enhancement process, a significant improvement in a recording’s intelligibility can be achieved, and enhancement should certainly be considered. Providing the jury with an intelligible recording may greatly improve the impact of the evidence and potentially play a greater role in the outcome of a case.
About the Author: Sean Coetzee is the owner of Prism Forensics LLC, which provides forensic audio enhancement and authentication. He is a graduate of Brighton University in the UK, a listed expert with the Los Angeles Superior Court and a Certified Forensic Consultant with the American College of Forensic Examiners Institute.