On January 18th, France Telecom filed an IPR disclosure against Opus citing a single patent under non-royalty-free terms. This raises a key question: what impact does this have on Opus? A close evaluation indicates that it has no impact on the Opus specification in any way.
A careful reading of the FT patent reveals that:
- The FT patent does not cover the Opus reference implementation because critical limitations of the claim are absent;
- The patent is directed to encoders, so it cannot affect the Opus specification, which only includes conformance tests for the decoder; and
- With a simple change, we can make non-infringement even more obvious.
Let’s expand on those points a bit. If you don’t want to hear about patent claims, you should stop reading this article now.
IETF IPR disclosures are a safe course of action for patent holders: they prevent unclean hands arguments or implied license grants. However, because the IETF requires specific patent numbers in these disclosures, we can analyze the claims. The patent in question is EP0743634B1, together with the corresponding U.S. and other related foreign patents, titled “Method of adapting the noise masking level in an analysis-by-synthesis speech coder employing a short-term perceptual weighting filter”. It has a single independent claim, Claim 1. All of the other claims are “dependent claims” built on top of Claim 1. If Opus does not infringe Claim 1, it cannot infringe any other claim.
The FT patent doesn’t cover Opus
To establish infringement, all of the elements of a claim must be present in an implementation. Key elements of Claim 1 are not present in the Opus reference implementation, including, among others:
- The way the bandwidth expansion coefficients are used. In Claim 1, two parameters γ1 and γ2 are used to shape the quantization noise added by the lossy compression by “minimizing the energy of an error signal resulting from the filtering of the difference between the speech signal and the synthetic signal.” Opus doesn’t do this. Instead, the Opus encoder uses a single parameter BWExp2 to shape the noise, and uses a different parameter BWExp1 to shape the input signal, and also applies an additional gain to the filtered input to match the volume of the original.
- The optimization criterion. Opus doesn’t compute the “difference between the speech signal and the synthetic signal”. We want to code a signal that differs from the original speech, so we don’t compare what we code to the original speech. This is actually one of the main innovations in Opus: it’s the reason the SILK layer doesn’t need a post-filter like many other codecs do.
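For readers who haven’t met bandwidth expansion before, here is a minimal sketch of the general technique, not code from Opus or from the patent. A bandwidth expansion coefficient is conventionally applied by scaling each linear-prediction coefficient by an increasing power of that coefficient, which turns a filter A(z) into A(z/γ). The function name `bw_expand` and its signature are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper (not part of Opus): apply bandwidth expansion to a
 * set of linear-prediction coefficients. a[k] is the coefficient of
 * z^-(k+1); it is scaled by gamma^(k+1), so A(z) becomes A(z/gamma). */
void bw_expand(const float *a, float *out, size_t order, float gamma)
{
    float g = gamma;                /* holds gamma^(k+1) */
    for (size_t k = 0; k < order; k++) {
        out[k] = a[k] * g;
        g *= gamma;
    }
}
```

Calling `bw_expand` twice with two different coefficients gives two differently smoothed versions of the same spectral envelope. What the two results are then used for is exactly where, as described above, the Opus encoder and Claim 1 part ways.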
Thus Opus doesn’t perform the steps of the claim and cannot infringe the FT patent by definition. Of course this is not a legal opinion, but it doesn’t take a lawyer to figure this out. While we don’t know why FT disclosed this patent, we welcome the opportunity to evaluate such disclosures and remove any real or perceived encumbrances. This is one of the benefits of the IETF process.
The FT patent cannot threaten the specification
The FT patent covers perceptual noise weighting, which is specific to an encoder. The claim is about the “difference between the speech signal and the synthetic signal”, whereas a decoder, by definition, doesn’t have access to the input speech signal.
The Opus specification only demands specific behavior from decoders, leaving the encoder largely unspecified. Even if France Telecom were to continue to assert its patent against Opus, there’s no limit to what we could change in the encoder to avoid whatever theory they have. No deployed systems break. There’s no threat to the Opus standard. We can safely say that the FT patent doesn’t encumber Opus for this reason alone.
We can always make things even safer if needed
While we don’t believe that the Opus encoder ever infringed on this patent, we quickly realized there is a simple way to make non-infringement obvious even without analyzing complex DSP filters.
This can be done with a simple change (patch file) to the code in silk/float/noise_shape_analysis_FLP.c (an equivalent change can be made to the fixed-point version).
Original code:

    strength = FIND_PITCH_WHITE_NOISE_FRACTION * psEncCtrl->predGain;
    BWExp1 = BWExp2 = BANDWIDTH_EXPANSION / ( 1.0f + strength * strength );
    delta = LOW_RATE_BANDWIDTH_EXPANSION_DELTA * ( 1.0f - 0.75f * psEncCtrl->coding_quality );
    BWExp1 -= delta;
    BWExp2 += delta;

Modified code:

    BWExp1 = BWExp2 = BANDWIDTH_EXPANSION;
    delta = LOW_RATE_BANDWIDTH_EXPANSION_DELTA * ( 1.0f - 0.75f * psEncCtrl->coding_quality );
    BWExp1 -= delta;
    BWExp2 += delta;
Yup, that’s all of two lines changed. This makes the filter parameters depend only on the encoder’s bit-rate, which is clearly not “spectral parameters obtained in the linear prediction analysis step,” as required by Claim 1. Below is the quality comparison between the original encoder and the modified encoder (using PESQ). As you can see, the difference is so small that it’s not worth worrying about.
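Separately from the quality comparison, the effect of the change on the parameters themselves can be sketched as two small standalone functions. This is only an illustration of the two snippets: the struct `bwexp_pair` and the function names are hypothetical, and the three constant values are placeholders assumed for the example, so check the Opus source for the real ones.

```c
/* Placeholder values, assumed for illustration only. */
#define BANDWIDTH_EXPANSION                 0.95f
#define LOW_RATE_BANDWIDTH_EXPANSION_DELTA  0.01f
#define FIND_PITCH_WHITE_NOISE_FRACTION     1e-3f

/* Hypothetical container for the two bandwidth expansion parameters. */
typedef struct {
    float BWExp1;
    float BWExp2;
} bwexp_pair;

/* Mirrors the original snippet: the shared base value shrinks as the
 * prediction gain from the linear prediction analysis grows. */
bwexp_pair bwexp_original(float predGain, float coding_quality)
{
    float strength = FIND_PITCH_WHITE_NOISE_FRACTION * predGain;
    float base = BANDWIDTH_EXPANSION / (1.0f + strength * strength);
    float delta = LOW_RATE_BANDWIDTH_EXPANSION_DELTA
                  * (1.0f - 0.75f * coding_quality);
    bwexp_pair p = { base - delta, base + delta };
    return p;
}

/* Mirrors the modified snippet: predGain is not an input at all, so the
 * result cannot depend on the linear prediction analysis. */
bwexp_pair bwexp_modified(float coding_quality)
{
    float delta = LOW_RATE_BANDWIDTH_EXPANSION_DELTA
                  * (1.0f - 0.75f * coding_quality);
    bwexp_pair p = { BANDWIDTH_EXPANSION - delta,
                     BANDWIDTH_EXPANSION + delta };
    return p;
}
```

In this sketch the two versions agree when `predGain` is zero and drift apart as it grows. The point of the change is visible in the signatures alone: the modified version has no input derived from the signal’s spectrum.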
View full post on Mozilla Hacks – the Web developer blog