What C2PA does and does not solve

Along with questions from an initial dive into the 2.0 spec.


We’re just a few months into 2024 and there is no sign of a slowdown in generative AI advancements, with the quality of synthetic media increasing every day across all modalities. As election seasons progress throughout the world, we can expect an uptick in synthetic deepfakes making the rounds online, and with it increased concern about how to distinguish between physical and AI generated realities.

In the past few months, C2PA, touted as a technical standard for establishing the provenance of digital media, has received increased attention as a potential solution, with a number of tech companies including Meta, Google and OpenAI shipping integrations or announcing plans to do so. Adobe has been leading the charge in promoting Content Credentials, built using C2PA, as “nutrition labels” that consumers can use to understand the provenance of media and, most importantly, to distinguish between real media and AI generated deepfakes. At the same time, the approach has been criticized as too easily bypassed by bad actors and insufficient for proving provenance.

As the standard is evaluated and adopted, it is worthwhile asking: what are the exact problems that the C2PA standard actually solves for and what does it not solve for?

What is C2PA?

From the C2PA website:

The Coalition for Content Provenance and Authenticity (C2PA) addresses the prevalence of misleading information online through the development of technical standards for certifying the source and history (or provenance) of media content.

From the C2PA spec:

This specification describes the technical aspects of the C2PA architecture; a model for storing and accessing cryptographically verifiable information whose trustworthiness can be assessed based on a defined trust model. Included in this document is information about how to create and process a C2PA Manifest and its components, including the use of digital signature technology for enabling tamper-evidence as well as establishing trust.

My simplified explanation of C2PA in a single sentence:

C2PA is a standard for digitally signing provenance claims about a media asset (e.g. image, video, audio) such that any application implementing the standard can verify that a specific identity certified the claim.

A claim consists of a list of assertions about an asset which can include:

  • The creator’s identity
  • The creation tool (e.g. a camera, Adobe Photoshop, OpenAI’s DALL-E 3)
  • The time of creation
  • The “ingredient” assets used and actions performed on them during creation

A key property of C2PA is that a claim is directly tied to an asset and can travel with the asset regardless of where the asset is used, whether on a social media platform (e.g. Facebook, Instagram) or a chat application (e.g. WhatsApp).

A claim is digitally signed by the hardware or software used to create the asset. The signature and claim are then packaged into a manifest file that can be embedded into the asset and/or stored separately for later retrieval. An application can then fetch the manifest for an asset, verify the signer and evaluate the signer’s trustworthiness based on the requirements of the application. For example, Adobe’s Content Credentials Verify tool checks uploaded assets for a manifest and, if the tool trusts the signer, displays a Content Credentials badge in the UI.
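To make these moving pieces concrete, here is a hypothetical, highly simplified sketch of the manifest data model in Go. The real spec serializes manifests as JUMBF containers with COSE claim signatures and a much richer assertion schema; all type and field names below are illustrative assumptions, not the spec’s actual layout.

```go
package c2pasketch

import "time"

// Assertion is a single statement about the asset, e.g. the creation
// tool used or the actions performed on ingredient assets.
type Assertion struct {
	Label string // e.g. "c2pa.actions" (a label from the spec's examples)
	Data  []byte // assertion payload (CBOR/JSON in the real spec)
}

// Claim bundles the assertions being made about a specific asset.
type Claim struct {
	Title          string      // human-readable asset title
	Format         string      // e.g. "image/jpeg"
	AssetHash      []byte      // hard binding of the claim to the asset bytes
	Assertions     []Assertion // what is being claimed about the asset
	ClaimGenerator string      // the hardware/software making the claim
	CreatedAt      time.Time
}

// Manifest packages a claim with its signature so that any conforming
// application can verify which identity certified the claim.
type Manifest struct {
	Claim     Claim
	Signature []byte   // claim signature (COSE in the real spec)
	CertChain [][]byte // signer's certificate chain, used during validation
}
```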

But, how would an application know if the assertions in a claim are actually correct and that an asset was actually created using, for example, a camera and not a generative AI model? The short answer is that an application wouldn’t – the C2PA manifest for an asset alone is insufficient to guarantee that the claim’s assertions are correct. So, what’s the point of a signed provenance claim if we can’t guarantee that its assertions are true?

Note: A C2PA workflow can provide stronger guarantees around the correctness of certain assertions, such as the creation tool, if a claim is signed by hardware with a secure enclave. In 2023, Leica released a camera with C2PA signing support which would allow the camera itself to sign a claim asserting that the specific camera model was used to take a photo. However, since hardware with C2PA signing support is not widely deployed as of early 2024, the rest of this post will focus instead on scenarios where claims are signed by software.

What does C2PA not solve for?

The trust model defined in the C2PA spec explicitly states that a consumer (both the human and an agent acting on behalf of the human) is able to verify the identity of the signer for a claim, but must decide on their own whether the claim’s assertions are correct using the signer’s identity and any other available information as evidence. Given this, the criticism that the use of C2PA alone would be unable to actually establish irrefutable provenance of an asset is fair – you cannot guarantee that a provenance claim for an asset accurately represents the asset’s provenance just based on signature verification alone.

What does C2PA solve for?

The problem that C2PA actually solves for is establishing non-repudiable attribution for provenance claims about a media asset in an interoperable manner. The claim signature links the claim to a specific identity which establishes attribution and non-repudiation – anyone can verify that the identity made the claim and not someone else. The standardized data model for the manifest that packages up a claim and its signature allows any application supporting the standard to process a claim and attribute it to an identity which establishes interoperability.

Is all of this still useful if the claim may not represent the asset’s true provenance? I would argue yes.

Today, as an asset is shared and re-shared across different platforms, the authentic context for the asset is lost.

Consider the following scenario: a photo taken in country A could be published by a news organization on social media and then copied and re-shared across accounts and platforms such that a consumer is misled by a third party into believing that the re-shared photo shows an event happening in the conflict zone of country B.

When the news organization publishes the photo, it is making a claim that the photo was taken at a specific time and place. The claim is a part of the authentic context defined by the news organization. The problem is that this context is lost as the photo travels around the Internet, making it more difficult for a consumer to determine who originally published the photo and allowing others to make claims about the photo that sow confusion.

A signed claim would not irrefutably prove that the photo was taken at a certain time and place, but it would preserve the authentic context for the photo – it would preserve the claim that the news organization made about the photo’s provenance. Consumers would still have to judge for themselves whether they trust the news organization’s claim, but they would know for certain who originally published the photo and that the claim was made by the news organization and no one else, which keeps the news organization accountable for its claims.

Issues to address

A few issues come to mind that are worth addressing:

Many platforms strip metadata, which would include C2PA manifests, from uploaded media assets

Potential solutions to explore:

  • Platforms can retain manifests for uploaded assets
  • Publishers can store manifests in a public registry such that they can be looked up for assets even if platforms strip the manifest for uploaded assets

Bad actors can strip the C2PA manifest before uploading an asset

Potential solutions to explore:

  • Platforms can start flagging assets without manifests and/or prefer assets with manifests in algorithmic feeds

The challenge here is that most assets will not have manifests initially, similar to how most websites did not have SSL certificates during the early days of the HTTP -> HTTPS transition. Before this becomes a viable approach, it will need to be much easier for publishing tools to create manifests. The more aggressive warnings shown by browsers for websites without SSL certificates were enabled in large part by services like Let’s Encrypt dramatically reducing the friction of obtaining an SSL certificate.

Bad actors can create their own C2PA manifest with incorrect assertions for an asset

Potential solutions to explore:

  • Platforms can use trust lists and scoring to evaluate whether the signer should be trusted in a specific context

Questions

A few questions I am thinking about:

How will the C2PA trust list that initial validator implementations will use by default be decided?

The current version of the C2PA spec uses a PKI based on X.509 certificates. A claim signer must have a signing certificate and a validator will check that the signing certificate fulfills one of the following conditions:

  • The signing certificate can be linked back through a chain of signatures to a root Certificate Authority (CA) certificate that is on the default C2PA list of trusted CAs
  • The signing certificate can be linked back through a chain of signatures to a root CA certificate that is on the validator’s custom list of trusted CAs
  • The signing certificate is on the validator’s custom list of trusted signing certificates

The members of the default C2PA trust list will have a lot of power because claims signed by signing certificates that they issue will be trusted by default by any C2PA software.
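The validator-side check maps naturally onto standard X.509 chain verification. Below is a minimal Go sketch using the standard crypto/x509 package to express the trust list conditions listed above; a real C2PA validator performs additional checks (certificate profile, EKUs, the COSE claim signature itself, time stamps) beyond plain chain building.

```go
package trustsketch

import (
	"crypto/x509"
	"fmt"
)

// verifySigner checks the trust conditions described above: either the
// signing certificate is directly on the validator's trusted list, or
// it chains back to a trusted root CA (from the default C2PA list or
// the validator's custom list). This sketch covers only chain-of-trust.
func verifySigner(signingCert *x509.Certificate, intermediates, trustedRoots, trustedSigners []*x509.Certificate) error {
	// Condition 3: the signing certificate itself is explicitly trusted.
	for _, t := range trustedSigners {
		if signingCert.Equal(t) {
			return nil
		}
	}

	// Conditions 1 and 2: the signing certificate chains back to a
	// trusted root CA through a chain of signatures.
	roots := x509.NewCertPool()
	for _, c := range trustedRoots {
		roots.AddCert(c)
	}
	inters := x509.NewCertPool()
	for _, c := range intermediates {
		inters.AddCert(c)
	}
	if _, err := signingCert.Verify(x509.VerifyOptions{Roots: roots, Intermediates: inters}); err != nil {
		return fmt.Errorf("signing certificate not anchored in trust list: %w", err)
	}
	return nil
}
```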

How will major content distribution platforms (e.g. Facebook, Instagram) with C2PA validator integrations decide on the trust anchors to add to their trust lists?

The platforms will be able to customize their trust lists and specify additional CAs (and possibly ignore CAs on the C2PA trust list?) that they trust. But, in practice, will they just use the default C2PA trust list?

How will creator tools get signing certificates that are accepted by validators?

Will developers have to go through a manual approval process? Will this go through a small number of CAs or will there be many CAs that issue different types of certificates reflecting the type of validation performed, similar to Domain Validated vs. Organization Validated SSL certificates? Will automated services be possible, similar to what Let’s Encrypt did for SSL certificate issuance?

The claim signer is expected to be the hardware or software used to create an asset and not the creator themselves. In certain cases, could it make sense for the claim signer to be the creator themselves?

The Creator Assertions Working Group (CAWG) is working on standardizing how a creator can produce a signed assertion documenting themselves as the asset creator in a C2PA claim. But this results in an additional signature separate from the claim signature. The intent seems to be for the hardware or software based claim signer to verify this identity assertion before including it in the claim. This workflow makes sense in scenarios where the hardware or software is not completely controlled by the creator (e.g. a camera with a secure enclave, or a backend service for a web application). However, it makes less sense for non-hardware, locally run creator tools (which includes many open source tools) without a backend that can secure a private key for signing. In this scenario, since the creator is in complete control of the tool, it seems like it would make more sense for the claim signer to just be the creator rather than having a separate claim signature and identity assertion – who else would produce the claim signature besides the creator?

Can identity assertions preserve privacy?

The CAWG seems to be exploring the use of privacy preserving techniques for future versions of the spec that are compatible with W3C Verifiable Credentials. There is also interesting work happening in the wild in other communities that seems worth learning from. The 0xPARC community has established its own Proof Carrying Data (PCD) framework that uses ZKPs to create and compose privacy preserving claims about an identity – perhaps there is an opportunity to leverage some of that work here.

How will content optimization pipelines that alter the hash of data without altering visual content (e.g. image rendering, video transcoding) be affected by C2PA adoption?

The C2PA spec allows service providers to create their own signed claims with assertions about the computation performed on an input asset, which is referenced in the manifest. However, this would require service providers to implement C2PA support, which would likely only happen if they are pressured by platforms to do so. A platform might just show users an optimized version of an asset with a hash that does not match the one in a C2PA manifest. The argument in defense of this practice would be that the user is trusting the platform anyway and the platform knows that the optimized asset it is showing users visually corresponds to the one referenced by the hash in the manifest.

How should we think about C2PA vs. NFT standards like ERC-721?

While C2PA and NFT standards both have a design goal of establishing attribution for provenance claims in an interoperable manner, they also differ in a number of ways:

  • The C2PA spec is focused on standardizing how to make assertions about the lifecycle of a media asset while most NFT standards are more generic with their treatment of asset metadata.
  • The C2PA spec optionally allows manifests to be timestamped by centralized Time Stamp Authorities while NFT standards assume that metadata is timestamped by a blockchain.
  • A C2PA manifest records a provenance claim while an NFT records a provenance claim and links it with a transferrable/tradeable asset.

At this point, C2PA and NFT standards appear to be more likely to be complements as opposed to substitutes.

Can C2PA adoption benefit from the consumer crypto wallet infrastructure that has been deployed in the past few years?

In order for creators to create identity assertions for C2PA claims, they will need consumer friendly software for signing. While there are still a number of UX/UI improvements that need to be made, the consumer wallet software that the crypto industry has funded in the past few years arguably represents the most sophisticated deployment of consumer digital signing software out in the wild.

Subjective Service Protocol Agreements

Protocols encode the rules of engagement that coordinate the exchange of a service between a global supplier and a global consumer.

Chris Burniske (Placeholder Capital)

One question that comes to my mind given the above definition: are the rules of engagement globally defined or locally defined per service relationship?

In many blockchain based protocols, the rules of engagement are defined at a global level (i.e. all nodes follow the same consensus rules). While this protocol architecture might be ideal in many cases, is it the only viable architecture? Are there alternative ways to define rules of engagement that might be more appropriate for certain classes of protocols?

Service Agreements

Jorge from Aragon presents one such alternative:

In this approach, the rules of engagement are locally defined because liabilities, which essentially represent service agreements for a capability offered by a provider, can be issued on a per-service-relationship basis. Some implications of this include:

  • A single provider can own many liabilities
  • A single capability offered on the network can be associated with different liabilities

Instead of the protocol supporting a single capability associated with a single liability, this protocol structure can support multiple capabilities and multiple liabilities. There are examples of service protocols exploring this design path, and reasons why this might be desirable include:

  • Some consumers might require stricter service agreement terms than others (e.g. a heavier penalty for a violation of the agreement)
  • Some consumers might prefer to use different dispute resolution mechanisms in service agreements depending on their trust/security requirements

While a liability might need to be anchored on-chain in order to enforce a dispute resolution mechanism in the event of a violation of the service agreement, a capability can be defined off-chain as long as there is a way to determine that a consumer and provider mutually agreed to associate the capability with a particular liability for a transaction. One way to accomplish this would be for both parties to cryptographically sign a data payload that references the off-chain capability, the on-chain liability and any transaction data that might be needed in the event of a dispute. The signatures of both parties along with this data payload function as authenticated evidence of an established mutual agreement between the parties for a transaction, which can be presented to parties external to the transaction if needed. An additional benefit of binding a capability to a liability on a per-service-relationship basis, the duration of which might only be a single transaction, is increased flexibility to experiment with and upgrade to different agreement structures that could incentivize different types of behavior in providers. A sketch of the signing step follows.
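Here is a minimal Go sketch of that mutual-agreement mechanism under the assumptions above: a payload referencing the capability, liability and transaction is hashed and signed by each party. All type and field names are hypothetical.

```go
package agreementsketch

import (
	"crypto/ecdsa"
	"crypto/rand"
	"crypto/sha256"
	"encoding/json"
)

// AgreementPayload binds an off-chain capability to an on-chain liability
// for a single transaction. Field names are hypothetical illustrations.
type AgreementPayload struct {
	CapabilityURI    string // reference to the off-chain capability definition
	LiabilityAddress string // on-chain contract enforcing the agreement
	TxData           []byte // transaction data needed if a dispute arises
}

// signAgreement hashes the payload and signs it. Each party runs this with
// its own key; the two signatures plus the payload form the authenticated
// evidence of mutual agreement described above.
func signAgreement(key *ecdsa.PrivateKey, p AgreementPayload) ([]byte, error) {
	encoded, err := json.Marshal(p)
	if err != nil {
		return nil, err
	}
	digest := sha256.Sum256(encoded)
	return ecdsa.SignASN1(rand.Reader, key, digest[:])
}
```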

A service protocol that supports off-chain capabilities and on-chain liabilities allows consumers to exchange with providers based not only on the capabilities offered, but also on the liabilities owned (and which ones providers choose to use to secure their capabilities), with the end result being a network with a diverse set of rules of engagement depending on the preferences and requirements of consumers and providers. But this brings up a follow-up question: in a network with a variety of liabilities and capabilities, how do parties determine whether service was delivered according to different agreement terms, especially if it is not clear that there are objective, algorithmic and deterministic verification methods for certain capabilities?

Subjective Dispute Resolution

This follow up discussion to Jorge’s tweet points at one possible answer:

Aragon Court is a subjective dispute resolution system (Kleros is another) that incentivizes participants to determine whether a human readable agreement was violated and then report the decision on-chain. Introducing a subjective dispute resolution system into a service protocol may incite a negative initial reaction. Doesn’t this run counter to the goal of eliminating human judgement, which can be manipulated, misguided, etc., from the types of permissionless, unstoppable digital marketplaces that service protocols seek to enable? We want objective, algorithmic and deterministic methods for securing service protocols! We want to trust code, not humans! This sentiment is understandable.

However, it is worth considering whether there are cases where human powered subjective dispute resolution might actually be more practical than or preferable to the objective verification methods that are common in blockchain based protocols today. Consider an image/video transformation service. The goal of the service is to apply a set of permissible transformations to an image/video such that the end result still faithfully represents the content of the original image/video. There are cryptographic techniques for verifying that this service is delivered correctly, but they can be quite computationally expensive, which may or may not make sense depending on the requirements of a consumer. The next best solution is to use statistical techniques – for example, a machine learning model can be constructed to classify authentic pairs of images/videos. But statistical techniques have a non-negligible error rate, meaning that they will never be 100% accurate. Interestingly, while a machine might mistakenly classify a transformed image/video as authentic or inauthentic, it is actually quite easy and fast for a human to evaluate whether a transformed image/video contains the same content as another image/video.

In the cryptography field, one of the desired properties of verifiable computation systems is succinctness, meaning that verification is substantially faster/cheaper than running the computation itself. In this case, the system could actually achieve succinctness by using a human verifier! Clearly, it would not be practical for a human to continuously verify service delivery, but the observation that a human can be a much more accurate/efficient verifier than a machine for certain types of services might be an indication that there is potential in designs that introduce human verification only when absolutely needed. Perhaps that is where using subjective dispute resolution systems such as Aragon Court to report the outcomes of human verification on-chain could be useful.

U2F/True2F Cryptographic Protocols Under the Hood

Nowadays, two-factor authentication is standard security best practice for user authentication in applications. Requiring a second factor for authentication besides a password diminishes an attacker’s ability to compromise a user’s account via techniques such as phishing. A common second factor is a one-time password (OTP) that is generated by a mobile application such as Google Authenticator or Duo.

A Universal 2nd Factor (U2F) hardware token is another second factor that is growing in popularity. Users complete an authentication workflow by pressing a button on an inserted USB-based hardware token that communicates with the user’s browser. Check out Yubico’s blog post for a comparison of U2F vs. OTP based two-factor authentication.

In this post, we will explore the cryptographic protocols used by U2F hardware tokens under the hood. Additionally, we will review some interesting extensions for improving hardware token resiliency against supply chain attacks as described in a paper by Dauterman et al.

U2F Registration

U2F uses a challenge-response protocol in which a token responds to challenges sent by the server, with the user’s browser serving as an intermediary that forwards messages between the token and the server. Prior to using the token for authentication, a user must first register the token with the server.

The registration process consists of the following steps [1]:

  1. The server sends an application ID.
  2. The token generates an identity ECDSA key pair for the given application ID. Tokens use a unique ECDSA key pair during authentication for each unique identity. The token also generates a key handle.
  3. The token sends the identity public key and the key handle to the server.

In order to reduce the storage used on the token (which has very limited storage to begin with), in step 2, the token does not store the identity secret key and instead derives it using a key handle and a keyed one way function. Yubico generates identity secret keys using the application ID and a nonce as parameters in HMAC-SHA256 keyed by a global device secret. The nonce and a MAC created using the application ID and the device secret compose the key handle [2]. The server stores the key handle alongside the identity public key.

U2F Authentication

The authentication process consists of the following steps [3]:

  1. The server sends an application ID, challenge and key handle.
  2. The browser also includes an origin and TLS channel ID when forwarding the server’s message to the token.
  3. The token uses the key handle to derive the identity secret key for the application ID.
  4. The token increments a local counter that tracks the number of authentications performed using the token.
  5. The token sends the identity secret key’s signature over the application ID, challenge, origin, TLS channel ID and counter value to the server.
  6. If the signature is valid for the registered identity public key, the origin and TLS channel ID are correct and the counter value is valid, the server accepts the token’s response.

In step 2, since the key handle is scoped to the application ID, a token can determine if a received key handle is valid for a particular application ID. For example, using Yubico’s key generation scheme, a token would use the MAC included in the key handle to verify that the nonce was originally generated by the token for the given application ID. Then, the token would pass the nonce and application ID into HMAC-SHA256 keyed by the device secret to derive the identity secret key [2].
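The derivation and the key handle check can be sketched in Go as follows. The exact MAC inputs, ordering and domain separation in Yubico’s implementation may differ from this sketch; treat them as illustrative assumptions based on the description above.

```go
package u2fsketch

import (
	"crypto/hmac"
	"crypto/sha256"
)

// deriveIdentitySecret derives the identity secret key from the application
// ID and nonce using HMAC-SHA256 keyed by the global device secret, as
// described above. (Reducing the output to an ECDSA scalar is omitted.)
func deriveIdentitySecret(deviceSecret, appID, nonce []byte) []byte {
	h := hmac.New(sha256.New, deviceSecret)
	h.Write(appID)
	h.Write(nonce)
	return h.Sum(nil)
}

// makeKeyHandle builds a key handle from the nonce plus a MAC binding the
// nonce to the application ID. The 0x01 domain-separation byte is an
// illustrative assumption, not Yubico's actual layout.
func makeKeyHandle(deviceSecret, appID, nonce []byte) []byte {
	m := hmac.New(sha256.New, deviceSecret)
	m.Write([]byte{0x01})
	m.Write(appID)
	m.Write(nonce)
	return append(append([]byte{}, nonce...), m.Sum(nil)...)
}

// checkKeyHandle lets the token verify that a received key handle was
// generated by this device for this application ID (step 3 of
// authentication) before re-deriving the identity secret key.
func checkKeyHandle(deviceSecret, appID, keyHandle []byte) bool {
	const nonceLen = 32
	if len(keyHandle) < nonceLen {
		return false
	}
	nonce, mac := keyHandle[:nonceLen], keyHandle[nonceLen:]
	expected := makeKeyHandle(deviceSecret, appID, nonce)[nonceLen:]
	return hmac.Equal(mac, expected)
}
```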

The token uses a local counter to defend against device cloning attacks. If an attacker is able to physically clone a token, the counter value would be the same on both copies, as the counter is only incremented during authentication. As a result, if a server sees a particular counter value more than once, it can infer that a device has been cloned and then block further authentication attempts [3].

The browser’s inclusion of the origin and TLS channel ID in step 2 followed by the server’s check of these fields in step 6 create phishing and man-in-the-middle (MITM) attack protection. The server can check that the same browser origin is included in the token’s signed response to confirm that a token did not sign data provided by a phishing site that differs from the origin that the server first communicated with. The server can also check that the same TLS channel ID is included in the token’s signed response to confirm that the same TLS channel session from the server’s first communication with the browser is used.

U2F also supports device attestation which is not discussed in this post [3].

U2F Areas for Improvement

While U2F includes features to protect against common phishing and MITM attacks, the protocols used in production still have some areas that could be improved, which are highlighted by Dauterman et al. [4]:

  • Many U2F tokens use a global authentication counter across all services. This global counter value could potentially be used to track a user authenticating at various services if service providers colluded.
  • If a token is faulty or compromised during a supply chain attack, then the user loses out on U2F security guarantees.

True2F

Dauterman et al. propose True2F, a two-factor authentication system that provides user protection even in the presence of compromised tokens, as an improvement over U2F. Instead of solely relying on the token while executing the challenge-response protocol with the server, True2F has the token and browser collaborate in order to respond to a server. As long as the user’s browser is not compromised, the user can still securely authenticate with services even if the token is compromised. Then, once the token is discovered to be compromised, it can be discarded.

True2F leverages a few additional cryptographic primitives not used in U2F, and we will touch on the following:

  • Verifiable random functions (VRFs) [5]
  • Collaborative key generation
  • Firewalled ECDSA signatures

Note: The use of these primitives does not cover the entirety of the paper’s contributions. The paper also describes a log data structure for minimizing storage requirements while still mitigating the privacy risk of using a global counter, among other topics.

True2F Key Generation

Instead of defining a master key pair as a single ECDSA key pair, True2F defines a master key pair as:

  • A master public key composed of an ECDSA public key and a VRF public key
  • A master secret key composed of an ECDSA secret key and a VRF secret key

When initializing a token, the token and the browser will execute the following collaborative key generation protocol [4]:

  1. The browser randomly samples a value v from the elliptic curve group. The browser sends a commitment to v.
  2. The token randomly samples its own value v' from the elliptic curve group. The token sends g^{v'} to the browser where g is the curve generator.
  3. The browser opens its commitment and reveals v. Then, the browser generates the public key as g^{v} \times g^{v'} or g^{v + v'}.
  4. The token verifies that v corresponds to the browser’s commitment. If the browser’s commitment opening is valid, the token accepts the public key as g^{v + v'} and uses v + v' as the secret key.

The collaborative key generation protocol can be used to generate the full master key pair such that upon completion, the token has the master secret key and the browser has the master public key.
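A minimal sketch of this commit-and-reveal protocol over P-256 follows, with both roles simulated in one function for brevity. A plain SHA-256 hash of v serves as the commitment here purely for illustration; the paper’s exact commitment scheme and parameters may differ.

```go
package true2fsketch

import (
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"math/big"
)

var curve = elliptic.P256()

// collaborativeKeyGen sketches the protocol above: the browser contributes
// v, the token contributes v', and the resulting key pair is
// (v + v' mod n, g^{v + v'}). In the real protocol only the token ends up
// holding the secret key; both parties are simulated here for brevity.
func collaborativeKeyGen() (secret, pubX, pubY *big.Int, ok bool) {
	n := curve.Params().N

	// Step 1: browser samples v and commits to it.
	v, _ := rand.Int(rand.Reader, n)
	commitment := sha256.Sum256(v.Bytes())

	// Step 2: token samples v' and sends g^{v'} to the browser.
	vPrime, _ := rand.Int(rand.Reader, n)
	tX, tY := curve.ScalarBaseMult(vPrime.Bytes())

	// Step 3: browser opens the commitment and computes the public key
	// as g^{v} * g^{v'} = g^{v + v'}.
	bX, bY := curve.ScalarBaseMult(v.Bytes())
	pubX, pubY = curve.Add(bX, bY, tX, tY)

	// Step 4: token verifies the opening against the commitment, then
	// uses v + v' mod n as the secret key.
	if sha256.Sum256(v.Bytes()) != commitment {
		return nil, nil, nil, false
	}
	secret = new(big.Int).Add(v, vPrime)
	secret.Mod(secret, n)
	return secret, pubX, pubY, true
}
```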

True2F Registration

True2F introduces verifiable identity families (VIFs), a method for deriving multiple ECDSA key pairs from a single master key pair [4]. A token and browser can use this method such that the browser can verify that the token correctly generated the ECDSA key pair. The browser is not able to audit the token’s operations in this manner in U2F.

The VIFs described in the True2F paper use VRFs, which is why the master key pair contains a VRF key pair. During registration, the token, the browser and the server execute the following steps [4]:

  1. The server sends an application ID.
  2. The browser includes a random key handle when forwarding the server’s message to the token.
  3. The token uses its VRF secret key and the key handle to generate a pseudorandom output y and a proof \pi. Recall that \pi can be used in VRF verification along with the VRF public key to check that the output y was correctly produced using the VRF secret key.
  4. The token computes the identity public key by raising the master public key to y and sends it to the browser along with y and \pi.
  5. The browser executes VRF verification using the VRF public key, the key handle, y and \pi. If VRF verification passes, the browser checks that the identity public key is equal to the master public key raised to y.
  6. If the checks in step 5 pass, the browser forwards the key handle and the identity public key to the server.
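Treating the VRF output y as given, the key derivation in steps 3 through 5 reduces to scalar arithmetic, sketched below. The browser’s audit in step 5 is simply recomputing the same scalar multiplication from the master public key; the VRF evaluation and proof verification are omitted.

```go
package true2fsketch

import (
	"crypto/elliptic"
	"math/big"
)

// deriveIdentityKeys sketches the VIF derivation: the identity public key
// is the master public key "raised" to the VRF output y (which is scalar
// multiplication in elliptic curve notation), and the identity secret key
// is the master secret key multiplied by y mod n.
func deriveIdentityKeys(curve elliptic.Curve, masterSecret, y, masterPubX, masterPubY *big.Int) (idSecret, idPubX, idPubY *big.Int) {
	n := curve.Params().N
	idSecret = new(big.Int).Mod(new(big.Int).Mul(masterSecret, y), n)
	idPubX, idPubY = curve.ScalarMult(masterPubX, masterPubY, y.Bytes())
	return
}
```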

True2F Authentication

During authentication, the token, the browser and the server execute the following steps [4]:

  1. The server sends an application ID, challenge and key handle.
  2. The browser also includes an origin and TLS channel ID when forwarding the server’s message to the token.
  3. The token uses the key handle and its VRF secret key to generate the pseudorandom output y. The token can derive the identity secret key by multiplying the master secret key by y.
  4. The token increments a local counter that tracks the number of authentications performed using the token.
  5. The token and the browser execute a firewalled ECDSA signature protocol in order to produce the identity secret key’s signature over the application ID, challenge, origin, TLS channel ID and counter value which is sent to the server.
  6. If the signature is valid for the registered identity public key, the origin and TLS channel ID are correct and the counter value is valid, the server accepts the token’s response.

The firewalled ECDSA signature protocol described in step 5 uses a cryptographic reverse firewall, which is defined as a machine that modifies the messages of a user’s machine as it interacts with the outside world in an effort to secure the user against a compromised machine while still preserving the functionality of the underlying protocol [7]. In True2F, the browser serves as the firewall that prevents a compromised token from using bad randomness when producing signatures or from leaking information through signatures. The protocol consists of the following steps [4]:

  1. The token and the browser execute the collaborative key generation protocol previously described such that upon completion, the token has some value k and the browser has a value g^{k}.
  2. The token uses k as the random ECDSA nonce [6] when producing a signature.
  3. The browser checks that the correct nonce was used to produce the signature.
  4. The browser uses the token’s signature to produce another valid signature and sends it to the server.

As mentioned in a previous post, ECDSA signature generation involves the calculation of a point P = kG where k is a nonce and r is the x coordinate of P. In step 3, the browser checks that the correct value for k was used by comparing r against the x coordinate of the g^{k} it obtained during collaborative key generation. In step 4, the browser makes use of ECDSA signature malleability, where both (r, s) and (r, -s \mod n) are valid signatures, to produce another valid signature to send to the server.
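Here is a sketch of the browser-side firewall checks in steps 3 and 4, assuming the browser holds the x coordinate of R = g^{k} from the collaborative nonce generation; the exact re-randomization in the paper may differ from this simplification.

```go
package true2fsketch

import (
	"crypto/elliptic"
	"math/big"
)

// firewallSignature sketches the browser-side firewall: verify that the
// token used the agreed-upon nonce k (by comparing r with the x coordinate
// of R = g^k), then re-randomize the signature via ECDSA malleability by
// flipping s to n - s, so the token cannot choose the signature's form as
// a covert channel.
func firewallSignature(curve elliptic.Curve, rX, r, s *big.Int) (*big.Int, *big.Int, bool) {
	n := curve.Params().N
	// Step 3: r must equal the x coordinate of R (mod n).
	if new(big.Int).Mod(rX, n).Cmp(r) != 0 {
		return nil, nil, false
	}
	// Step 4: (r, n - s) is an equally valid signature to forward.
	sFlipped := new(big.Int).Sub(n, s)
	return r, sFlipped, true
}
```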

Conclusion

U2F utilizes the basic cryptographic building blocks of digital signatures and keyed one way functions to enable the use of hardware tokens for two-factor authentication systems resistant to phishing and MITM attacks. True2F attempts to improve U2F by introducing a few new building blocks that allow a user’s browser to participate in protocols thereby protecting the user from compromised tokens. The paper is pretty interesting and worth a read!

References

[1] https://fidoalliance.org/specs/fido-u2f-v1.2-ps-20170411/FIDO-U2F-COMPLETE-v1.2-ps-20170411.pdf

[2] https://developers.yubico.com/U2F/Protocol_details/Key_generation.html

[3] https://developers.yubico.com/U2F/Protocol_details/Overview.html

[4] Dauterman et al. “True2F: Backdoor-Resistant Authentication Tokens”. https://arxiv.org/pdf/1810.04660.pdf

[5] https://en.wikipedia.org/wiki/Verifiable_random_function

[6] https://andrea.corbellini.name/2015/05/30/elliptic-curve-cryptography-ecdh-and-ecdsa/

[7] Mironov et al. “Cryptographic Reverse Firewalls”. https://eprint.iacr.org/2014/758.pdf

How Not to Use ECDSA

For many developers, digital signatures (and most cryptographic primitives for that matter) are black boxes that “just work”. Developers can usually expect to be able to leverage open source implementations for signature generation and verification, either in a programming language’s standard library or in a third party library. Oftentimes, all it takes is a quick skim through the API documentation and within minutes a developer can drop signature generation and verification into their system.

While easier-to-use digital signatures can be a blessing, it is not always the case that digital signatures are easy to use securely. Furthermore, not all documentation is created equal, and in some cases documentation might lack the guidance needed to ensure developers can use digital signatures securely. As a result, understanding how digital signatures can be used incorrectly is valuable knowledge for developers.

In this post, we will explore incorrect usage of the Elliptic Curve Digital Signature Algorithm (ECDSA). If you are unfamiliar with ECDSA, Andrea Corbellini has written a great introductory primer.

Pitfall #1: Signature Malleability

Recall that an ECDSA signature (r, s) is valid for a public key H and hash z if r is the x coordinate of the following point calculated by the verifier:

P = s^{-1}zG + s^{-1}rH

Notice that signature verification will pass as long as r is correct – s is used in the calculation of point P, but as long as the x coordinate of P is equal to r, the actual value of s does not matter. Thus, if there is more than one value of s that can result in a point P with an x coordinate equal to r, then there can be more than one valid signature for a given public key and hash. In fact, since P is a point on an elliptic curve, there is also a point -P on the curve with the same x coordinate as P, but a negated y coordinate [1]. Since s^{-1} appears in both terms of the calculation of P, a negated s results in -P, which has the same x coordinate as P [7]. As a result, both (r, s) and (r, -s \mod n) are valid signatures for the public key H and hash z. We refer to (r, s) as a “malleable” signature because a third party can use it to compute another valid signature for a public key and hash without access to the signing private key.

This malleability property of ECDSA signatures can introduce security vulnerabilities into systems if:

  • Replay defense mechanisms use the signatures themselves as unique identifiers (i.e. a verifier should only allow a signature to be used once).
  • Software relies on signatures as identifiers to query for additional information. See Bitcoin transaction malleability [2].

Avoiding Pitfall #1

One solution to defend against signature malleability based attacks is to enforce a single canonical signature for a given public key and hash, which is the approach taken by Bitcoin. More specifically, the Bitcoin Core client software will only accept “low s-value” ECDSA signatures, where a signature has a low s-value if s is less than or equal to half the curve order [3]. The secp256k1 curve library used by the client will always generate signatures in low s-value form and the verifier expects provided signatures to also be in low s-value form [4].

Another solution is to avoid using signatures in identifiers, or at the very least to make sure to use unique values in the identifier creation process (e.g. a nonce in the signed message). Unless there is a single canonical signature for a given public key and hash, signatures cannot be relied upon as unique identifiers.
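To make both the malleability and the low s-value defense concrete, here is a minimal standalone Go sketch (separate from the linked example program) using only the standard library; the message and names are illustrative.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
	"math/big"
)

// verifyCanonical enforces the low s-value rule described above before
// running standard ECDSA verification, so that only one of (r, s) and
// (r, n - s) is ever accepted.
func verifyCanonical(pub *ecdsa.PublicKey, hash []byte, r, s *big.Int) bool {
	n := pub.Params().N
	halfN := new(big.Int).Rsh(n, 1)
	if s.Cmp(halfN) > 0 {
		return false // reject the high s-value form
	}
	return ecdsa.Verify(pub, hash, r, s)
}

func main() {
	key, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	hash := sha256.Sum256([]byte("hello world"))
	r, s, _ := ecdsa.Sign(rand.Reader, key, hash[:])

	// The malleated signature (r, n - s) passes plain verification...
	n := key.Params().N
	sFlipped := new(big.Int).Sub(n, s)
	fmt.Println("plain verify (r, s):      ", ecdsa.Verify(&key.PublicKey, hash[:], r, s))
	fmt.Println("plain verify (r, n-s):    ", ecdsa.Verify(&key.PublicKey, hash[:], r, sFlipped))

	// ...but exactly one of the two forms passes the canonical check.
	fmt.Println("canonical verify (r, s):  ", verifyCanonical(&key.PublicKey, hash[:], r, s))
	fmt.Println("canonical verify (r, n-s):", verifyCanonical(&key.PublicKey, hash[:], r, sFlipped))
}
```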

Example Code for Pitfall #1

Example code for creating malleable signatures can be found here.

The program first generates the original signature and then TrickSig1() is used to create another valid signature. In the first attempt, the signature generated by TrickSig1() passes verification. However, in the second attempt, the ECDSA verification implementation from go-ethereum is used, which enforces the low s-value requirement, causing verification to fail.

```
original sig: (0xe742ff452d41413616a5bf43fe15dd88294e983d3d36206c2712f39083d638bd, 0xe0a0fc89be718fbc1033e1d30d78be1c68081562ed2e97af876f286f3453231d)
original sig verification with message hash 0xb94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9: SUCCESS

trick sig 1: (0xe742ff452d41413616a5bf43fe15dd88294e983d3d36206c2712f39083d638bd, 0x1f5f0376418e7043efcc1e2cf28741e252a6c783c21a088c3863361d9be31e24)
NO MALLEABILITY CHECK
trick sig 1 verification with message hash 0xb94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9: SUCCESS

WITH MALLEABILITY CHECK
HALF CURVE ORDER = 7fffffffffffffffffffffffffffffff5d576e7357a4501ddfe92f46681b20a0
trick sig 1 s-value <= half curve order!
normalizing sig by negating s-value
trick sig 1 verification with message hash 0xb94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9: FAILED
```

Pitfall #2: Verification Without Hash Pre-Image

Many ECDSA verification implementations expect the hash of the signed message as input, but the verifier should always be the one to hash the message and should never accept a hash without knowledge of its pre-image [5]. Otherwise, an attacker is able to pick the hash that will be used for verification. If the attacker does not need to know the pre-image of the hash, they can pick a hash for which they can always produce valid signatures [6].

The point P calculated during verification can be expressed as P = aG + bH where a = s^{-1}z and b = s^{-1}r.

We can express s in terms of b and P:

s = b^{-1}r = b^{-1}P_{x}

We can express z in terms of a, b and s:

z = as = ab^{-1}r = ab^{-1}P_{x}

We can express a valid signature (r, s) as:

(r, s) = (P_{x}, P_{x}b^{-1})

Thus, if an attacker has the freedom to choose the value of hash z, they can set a and b to arbitrary non-zero values and then derive values for hash z and signature (r, s) that will always pass verification for a public key H.
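The construction above can be sketched as a standalone Go program (again separate from the linked example code). Note that the attacker never touches the victim’s private key:

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"fmt"
	"math/big"
)

func main() {
	curve := elliptic.P256()
	n := curve.Params().N

	// The victim's public key H; the attacker only sees the public half.
	victim, _ := ecdsa.GenerateKey(curve, rand.Reader)
	H := victim.PublicKey

	// Attacker picks arbitrary non-zero a and b in [1, n-1].
	nMinusOne := new(big.Int).Sub(n, big.NewInt(1))
	a, _ := rand.Int(rand.Reader, nMinusOne)
	a.Add(a, big.NewInt(1))
	b, _ := rand.Int(rand.Reader, nMinusOne)
	b.Add(b, big.NewInt(1))

	// P = aG + bH
	aGx, aGy := curve.ScalarBaseMult(a.Bytes())
	bHx, bHy := curve.ScalarMult(H.X, H.Y, b.Bytes())
	Px, _ := curve.Add(aGx, aGy, bHx, bHy)

	// r = P_x, s = r * b^{-1}, z = a * s (all mod n), per the derivation above.
	r := new(big.Int).Mod(Px, n)
	bInv := new(big.Int).ModInverse(b, n)
	s := new(big.Int).Mod(new(big.Int).Mul(r, bInv), n)
	z := new(big.Int).Mod(new(big.Int).Mul(a, s), n)

	// A verifier that accepts the hash z without knowing its pre-image
	// will accept this forged signature.
	fmt.Println("forged sig verifies:", ecdsa.Verify(&H, z.Bytes(), r, s))
}
```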

Avoiding Pitfall #2

The solution to defend against this type of attack is to implement verifiers so that they always hash the received message (that another party claims to have signed) themselves and use the result for ECDSA verification.

Example Code for Pitfall #2

Example code for exploiting ECDSA verifiers that do not know the pre-image of a hash can be found here.

The program uses TrickSig2() to create a valid signature and hash pair given a public key. The verifier just accepts the hash so it has no knowledge of the pre-image (in contrast to if the verifier instead hashed the input before running the verification algorithm).

```
trick sig 2: (0x41b5201d06acaafb67785a8e8aa89626e79c2117acce468196c1c5074ec9c274, 0x4f28c5a75abedce19cacba086282632f175abb635bd633465f6763a250eea629)
trick sig 2 verification with message hash 0x8bcbdc44c5ba50680f5fa229ec8befecba16cc0a1be660241d32939ec472fd8c: SUCCESS
```

Conclusion

Both of the described pitfalls are examples of instances where a developer with access to an easy to use cryptographic primitive library (both code examples import widely used open source implementations, either from a standard or third party library) could still use the library incorrectly, thereby losing some of the security properties of the primitive. Especially when creating systems that require cryptographic guarantees, developers stand to benefit a lot not only from a larger breadth of tools, but also from a better understanding of what correct (or incorrect) usage of those tools looks like.

References

[1] https://crypto.stackexchange.com/questions/54416/how-to-compute-negative-point-in-ec-dsa

[2] https://eklitzke.org/bitcoin-transaction-malleability

[3] https://github.com/bitcoin/bips/blob/master/bip-0062.mediawiki

[4] https://github.com/bitcoin-core/secp256k1/blob/master/include/secp256k1.h

[5] https://twitter.com/pwuille/status/1063582706288586752

[6] https://medium.com/@jimmysong/faketoshis-nonsense-signature-8700a44536b5

[7] https://andrea.corbellini.name/2015/05/30/elliptic-curve-cryptography-ecdh-and-ecdsa/