[Solved] How can I detect a specific sound from a live application?

Accepted Answer

If the device you are using has a mic, you can use it to record whatever sound is coming out of your computer. Then you can compare the audio frames you are recording to a sound file that contains the sound you are looking for.

Of course, this leaves the comparison very vulnerable to background noise, so you will have to filter that out somehow.
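
One possible way to make the comparison tolerate some noise is to score how similar two buffers are instead of testing them for equality. The sketch below is only an illustration of that idea, not part of the original answer: the function name `similarity` is made up, it assumes both buffers hold 16-bit samples in the machine's native byte order, and it assumes the two buffers are already aligned in time.

```python
import array
import math


def similarity(recorded: bytes, reference: bytes) -> float:
    """Return the normalized correlation (-1.0 to 1.0) of two 16-bit PCM buffers."""
    # Drop a dangling odd byte so the data splits cleanly into 2-byte samples
    a = array.array("h")
    a.frombytes(recorded[: len(recorded) // 2 * 2])
    b = array.array("h")
    b.frombytes(reference[: len(reference) // 2 * 2])

    # Compare only the part that both buffers have in common
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    a, b = a[:n], b[:n]

    # Remove the mean so a constant offset does not affect the score
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    dot = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a flat (silent) buffer has no shape to compare
    return dot / (norm_a * norm_b)
```

You would then treat a chunk as a match when the score is above some threshold that you pick by experiment, for example `similarity(data, reference_chunk) > 0.8`. The score drops quickly if the recording is shifted in time relative to the reference, so a real detector would probably also have to try several offsets.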

Here is an example using the PyAudio and wave libraries:

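import pyaudio
import wave

chunk = 1024  # Record in chunks of 1024 samples
sample_format = pyaudio.paInt16  # 16 bits per sample
channels = 2
fs = 44100  # Record at 44100 samples per second
seconds = 3

# Split the reference sound into chunks of the same size as the recorded ones.
# The file has to be in the same format as the stream (16-bit, 2 channels,
# 44100 Hz), otherwise the bytes can never line up.
wf = wave.open("websitSound.wav", "rb")
bytes_per_chunk = chunk * wf.getnchannels() * wf.getsampwidth()
sframes = []
while True:
    block = wf.readframes(chunk)
    if len(block) < bytes_per_chunk:
        break  # end of file; a trailing partial chunk is dropped
    sframes.append(block)
wf.close()
if not sframes:
    raise SystemExit("The reference sound is shorter than one chunk")

currentSoundFrame = 0  # index of the next reference chunk we expect to hear
frames = []  # every recorded chunk, in case you want to save the recording

p = pyaudio.PyAudio()  # Create an interface to PortAudio

stream = p.open(format=sample_format,
                channels=channels,
                rate=fs,
                frames_per_buffer=chunk,
                input=True)

# Record in chunks for 3 seconds and match them against the reference
for i in range(int(fs / chunk * seconds)):
    data = stream.read(chunk)
    frames.append(data)
    # NOTE: exact equality only works for a bit-perfect copy of the file.
    # For real mic input, replace it with a comparison that tolerates noise.
    if data == sframes[currentSoundFrame]:
        currentSoundFrame += 1
        if currentSoundFrame == len(sframes):  # the whole sound was played
            print("Sound was played!")
            break
    else:
        # Mismatch: start over, but let this chunk count as a new first chunk
        currentSoundFrame = 1 if data == sframes[0] else 0

# Stop and close the stream
stream.stop_stream()
stream.close()
# Terminate the PortAudio interface
p.terminate()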
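
Comparing the recording to the file only makes sense if both use the same number of channels, sample width and sample rate. A small check like the one below could be run before recording. It is just a sketch using the standard `wave` module; the helper name `wav_matches_stream` is made up for this example.

```python
import wave


def wav_matches_stream(path, channels, sample_width, rate):
    """Return True if the WAV file at `path` uses the given recording format."""
    with wave.open(path, "rb") as wf:
        return (
            wf.getnchannels() == channels          # e.g. 2 for stereo
            and wf.getsampwidth() == sample_width  # bytes per sample, 2 for 16-bit
            and wf.getframerate() == rate          # e.g. 44100
        )
```

With the settings from the example above you would call `wav_matches_stream("websitSound.wav", 2, 2, 44100)`, since `pyaudio.paInt16` means 2 bytes per sample. If it returns False, convert the file first or change the stream settings to match it.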