The interplay between independent claims and dependent claims can have a substantial effect on claim interpretation. The most discussed issue is claim differentiation, where, for example, a more specific feature in a dependent claim is used to justify an interpretation of the independent claim that is in some way broader than the dependent claim’s specific feature. But that is not the only way in which dependent and independent claims can affect one another. This post reviews a patent application from Soundhound related to speech processing. In this case, the dependent claims are used to justify the broadest reasonable interpretation of a claim term, and that interpretation determines the outcome of the appeal.
The case is Appeal 2023-002655, Application 16/558,096.
Claim 1 on appeal is reproduced below.
1. A vehicle-mounted apparatus for processing speech, the apparatus comprising:
an audio interface for receiving audio data from an audio capture device;
an image interface for receiving image data from an image capture device, wherein the image data includes mouth area image data of a person speaking;
a speech processing module for parsing an utterance of the person speaking based on the audio data and the image data; and
a speaker preprocessing module for receiving the image data and generating a speaker feature vector for the person speaking in order to predict phoneme data, wherein a lip feature vector is derived from the mouth area image data and the speech processing module uses the speaker feature vector and the lip feature vector for parsing the utterance.
At issue was the phrase “parsing an utterance.” In particular, Soundhound argued that even under the Broadest Reasonable Interpretation (BRI), the phrase required more than just speech transcription. Specifically, Soundhound argued that parsing an utterance, in light of the specification, means determining the intent of a command (and/or command data) associated with the captured speech or utterance, not merely transcribing the uttered speech. This was relevant because the cited art did the latter, not the former.
The Board noted not only that the specification failed to provide such a specific definition (it referred to determining a command, not an intent), but also that the use of the phrase in other claims confirmed the proper interpretation (internal citations omitted):
Further, other claims of the application can be valuable sources of enlightenment as to the meaning of language of a rejected claim. We note that claim 28, which depends directly from independent claim 23, recites, “wherein parsing the utterance includes: providing the phoneme data to a language model of the speech processing module; predicting a transcript of the utterance using the language model; and determining a control command for the vehicle using the transcript.” Because claim language generally is used consistently throughout an application, the usage of the phrase “parsing the utterance,” in one claim may illuminate the meaning of the same phrase in other claims. Thus, consistent with the Specification and the language used across the claims, we agree with the Examiner that the phrase “parsing the utterance” does not require determining the speaker’s intent, beyond determining a command.
So, remember that the proper interpretation under BRI is made in light of the specification, which includes other claims that use the terms under evaluation.