Systematic Use of Random Self-Reducibility against Physical Attacks

Ferhat Erata1, TingHung Chiu2, Anthony Etim1, Srilalith Nampally2, Tejas Raju2, Rajashree Ramu2,
Ruzica Piskac1, Timos Antonopoulos1, Wenjie Xiong2, Jakub Szefer1
1Yale University, United States, 2Virginia Tech, United States
Abstract

This work presents a novel, black-box, software-based countermeasure against physical attacks, including power side-channel and fault-injection attacks. The approach uses random self-reducibility and self-correctness to add randomness and redundancy to the execution. It operates at the operation level and is not algorithm-specific, so it can protect a wide range of algorithms. The countermeasure is empirically evaluated against attacks on modular exponentiation, modular multiplication, polynomial multiplication, and number theoretic transforms. An end-to-end implementation of the countermeasure is demonstrated for the RSA-CRT signature algorithm and for Kyber key generation. The countermeasure reduced the power side-channel leakage by two orders of magnitude, to an acceptably secure level in TVLA analysis. For fault injection, the countermeasure reduces the number of faults to 95.4% on average.

Index Terms:
Random Self-Reducibility, Fault Injection Attacks, Power Side-Channel Attacks, Countermeasure, NTT, PQC, RSA-CRT, Randomly Testable Functions

1 Introduction

Smart devices and IoT devices with sensors, processing capability, and actuators are becoming ubiquitous today in consumer electronics, healthcare, manufacturing, etc. These devices often collect sensitive or security-critical information and need to be protected. However, when deployed in the field, such devices are vulnerable to physical attackers who can have direct access to the devices.

Physical attacks can be categorized as passive attacks or active attacks. In passive attacks, such as Side-Channel Attacks (SCA), the attackers do not tamper with the execution, but can collect power traces, electromagnetic (EM) field traces, or traces of acoustic signals, and analyze the signals to learn information that is processed on the device. In active attacks, such as Fault Injection (FI) attacks, the attackers inject faults through a voltage glitch, clock glitch, EM field, or laser to cause a malfunction in the processing unit or memory, tampering with the execution to obtain desired results. Both types of physical attacks have been shown to break cryptographic implementations and leak secret keys, for example [21, 36, 80, 82].

Even though the assumptions on the attacker's capability are similar for SCA and FI, the existing mitigation techniques treat the two types of attacks separately. For side-channel attacks, the mitigation techniques usually use randomness or noise to decouple the signal observable by the attacker from the data value [47, 89]. For fault injection attacks, there are typically two solutions: attack detection, and redundancy in the execution for error correction. Detection identifies abnormal behavior in the execution and handles it as an exception. Error correction uses redundancy in the execution to correct execution errors when they occur [65]. However, when we consider both SCA and FI attacks in the same system, separate mitigations for the two do not protect against both attacks efficiently. For example, existing work [28] showed that instruction duplication as a fault tolerance mechanism amplifies the information leakage through side channels. Detection methods such as full, partial, or encrypt-decrypt duplication and comparison of a cipher [52] produce repetitions of intermediate values that are exploitable by the side-channel adversary.

In this work, we propose a joint solution for both SCA and FI attacks. With a combination of random obfuscation using the Random Self-Reducibility (RSR) property and redundancy for error correction, our proposed countermeasure is particularly effective against FI, outperforming traditional redundancy-based methods. The randomness disrupts the attacker's observation of the statistics in fault attacks, thereby nullifying the effectiveness of statistical analysis as a tool for security compromise. This aspect is crucial in the face of increasingly sophisticated FI analysis techniques. In addition to its effectiveness against FI, the countermeasure also resists SCA by rendering power consumption variations less useful to attackers. The countermeasure significantly enhances system security, particularly in environments where physical attacks are prevalent.

Another drawback of current mitigation techniques is that most existing work focuses on a particular implementation of a cryptographic algorithm; porting the protection from one implementation to another requires redoing both the security analysis and the implementation.

The proposed countermeasure offers significant benefits as a black-box, operation-level solution to both SCA and FI attacks, and it is independent of the target algorithm being protected. This means there is no need for detailed knowledge of the implementation. The basis for the solution is to implement protection at the level of low-level operations such as modular exponentiation, modular multiplication, polynomial multiplication, and number theoretic transforms. Also, we assume a generic fault model, so no special fault profiling of a targeted device is necessary. Therefore, the proposed protection techniques can be applied directly in software without extensive system-specific adjustments. In our evaluation, we showcase how the proposed protection techniques can be adopted to protect two different cryptosystems.

Our protection requires a small number of steps to implement. It can be implemented in C or another high-level language and is independent of the compiler or underlying architecture; assuming the compiler. First, the target software is identified. Second, we locate low-level operations such as modular exponentiation, modular multiplication, polynomial multiplication, or number theoretic transforms. These operations can be protected with the idea of Random Self-Reducibility (RSR). Each instance of the low-level operation is replaced with an equivalent RSR operation. Each RSR operation requires querying a randomness source and then executing the low-level operation multiple times with the original input values modified by the random values. Typically, multiple RSR operations are instantiated and majority voting is performed on their outputs. Because the protection works at the level of these low-level operations, it is independent of the higher-level algorithm or application. Since it does not rely on any hardware tricks, it is independent of the architecture and agnostic to the underlying compiler.

Our protection can be applied to any program or algorithm that uses modular exponentiation, modular multiplication, polynomial multiplication, or number theoretic transforms to process secret or sensitive information. This encompasses major cryptographic algorithms from ElGamal [35] and RSA [77] to post-quantum cryptography such as Kyber [7] and Dilithium [33]. In our evaluation, we show how our protection can be applied to RSA-CRT and Kyber's Key Generation algorithms. Our contributions are summarized as follows:

  • We propose a new software-based, combined countermeasure against power side-channel (Section 5.1) and fault injection (Section 5.2) attacks, by randomizing the intermediate values of the computation using the notion of random self-reducibility (Section 5).

  • We formalize the security of the countermeasure in relation to an attacker's fault injection capability, parameterize it, and quantify its effectiveness against fault-injection attacks, as detailed in Section 5.3.

  • We provide an end-to-end implementation of the countermeasure for the RSA-CRT and Kyber's Key Generation public key cryptosystems (Section 7).

  • We empirically evaluate the countermeasure against power side-channel and fault-injection attacks over modular exponentiation, modular multiplication, polynomial multiplication, number theoretic transform operations, RSA-CRT, and Kyber's Key Generation (Section 8).

2 Background

In this section, we provide a brief overview of side channels and fault injection as physical attacks.

2.1 Power Side Channels

It is a well-known fact that the power consumption during certain stages of a cryptographic algorithm exhibits a strong correlation with the Hamming weight of its underlying variables, i.e., Hamming weight leakage model [49, 22, 69]. This phenomenon has been widely exploited in the cryptographic literature in various attacks targeting a broad range of schemes, particularly post-quantum cryptographic implementations [45, 91, 71, 85, 4, 70, 5, 86, 39, 14, 43]. Therefore, we use the Hamming weight leakage model in the evaluation of the robustness of the countermeasure.

The Hamming weight leakage model assumes that the Hamming weight of the operands is strongly correlated with the power consumption. Each bit flip requires one or more voltage transitions from 0 to high (or vice versa). Different data values typically entail differing numbers of bit flips and therefore produce distinct power traces [23]. Therefore, any circuit not explicitly designed to be resistant to power attacks has data-dependent power consumption. However, in a complex circuit, the differences can be so slight that they are difficult to distinguish from a single trace, particularly if an attacker's sampling rate is limited [49, 94]. Therefore, it is necessary to use statistical techniques across multiple power traces [49].

Test Vector Leakage Assessment (TVLA). TVLA [40] identifies whether two sets of side-channel measurements are distinguishable by computing Welch's t-test for the two sets of measurements. It is used in the literature to confirm the presence or absence of side-channel leakage in power traces, and has become the de facto standard in the evaluation of side-channel measurements [69, 81, 56, 90, 76]. In side-channel analysis, the recommended thresholds for t-values are specifically tailored to detect potential information leakage in cryptographic systems. A t-value threshold of $\pm 4.5$ or $\pm 5$ is often used. This threshold corresponds to a very high confidence level, rejecting the null hypothesis with a confidence greater than 99.999% for a sufficiently large number of measurements. The null hypothesis is typically that all samples are drawn from the same distribution, so a t-value outside this range indicates that the two sets have distinguishable distributions and thus that side-channel leakage exists [88]. The choice of these thresholds is influenced by the need to balance the risk of false positives (incorrectly identifying information leakage when there is none) against the risk of false negatives (failing to detect actual information leakage).
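To make the TVLA decision rule concrete, the following minimal sketch computes Welch's t-statistic per sample point and flags leakage when $|t|$ exceeds 4.5. The function names (`welch_t`, `tvla_leaks`, `synthetic_traces`) and the synthetic Gaussian traces are illustrative assumptions, not part of the evaluation setup described in this paper.

```python
import math
import random

def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((v - mean_a) ** 2 for v in a) / (na - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (nb - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / na + var_b / nb)

def tvla_leaks(set_a, set_b, threshold=4.5):
    """True if |t| exceeds the threshold at any sample point of the traces."""
    columns_a, columns_b = zip(*set_a), zip(*set_b)
    return any(abs(welch_t(ca, cb)) > threshold
               for ca, cb in zip(columns_a, columns_b))

def synthetic_traces(rng, count, leak):
    """Toy 3-sample traces; only the middle sample depends on `leak`."""
    return [[rng.gauss(0, 1), rng.gauss(leak, 1), rng.gauss(0, 1)]
            for _ in range(count)]

rng = random.Random(0)  # seeded so the toy experiment is repeatable
fixed_input = synthetic_traces(rng, 2000, leak=1.0)
random_input = synthetic_traces(rng, 2000, leak=0.0)
print(tvla_leaks(fixed_input, random_input))
```

In this toy setting the middle sample point carries a mean shift between the two sets, so the test reports leakage; a countermeasure aims to keep every sample point inside the threshold.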

The Sum of Squared pairwise T-differences (SOST) [24] is a technique for identifying Points of Interest (PoIs) in side-channel analysis. It is particularly useful when there are many data points (such as the samples in power traces of a cryptographic system) and the goal is to identify the specific points in these traces that vary significantly across different conditions or inputs.
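As a rough illustration of how SOST ranks sample points, the sketch below uses the commonly cited form of the metric: for every pair of classes, the squared difference of the class means divided by the sum of the per-class variances over the class sizes, summed over all pairs. The grouping by class label and the toy traces are assumptions made for the example.

```python
from itertools import combinations
from statistics import mean, variance

def sost(groups):
    """Per-sample-point SOST score.

    groups maps a class label (e.g. a Hamming weight) to a list of traces,
    where each trace is a list of samples of equal length.
    """
    n_points = len(next(iter(groups.values()))[0])
    scores = []
    for t in range(n_points):
        # Column t of every class: the values observed at sample point t.
        columns = [[trace[t] for trace in traces] for traces in groups.values()]
        score = 0.0
        for a, b in combinations(columns, 2):
            denom = variance(a) / len(a) + variance(b) / len(b)
            score += (mean(a) - mean(b)) ** 2 / denom
        scores.append(score)
    return scores

# Toy data: sample point 1 separates the two classes, sample point 0 does not.
groups = {
    0: [[1.0, 0.0], [1.2, 0.1], [0.8, -0.1]],
    1: [[1.0, 1.0], [1.1, 1.1], [0.9, 0.9]],
}
scores = sost(groups)
print(scores.index(max(scores)))
```

The sample points with the highest scores are the candidates for Points of Interest.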

2.2 Fault Injection Attacks

In the real world, devices may malfunction or be damaged and produce erroneous outputs, which are often simply ignored. However, if an attacker intentionally induces a fault during a device operation, e.g., a cryptographic calculation, he or she can recover the secret by analyzing the correct and faulty outputs. Most classical cryptographic algorithms can be attacked by fault injection. For instance, the first fault attack research [18] targeted the RSA implementation using the Chinese Remainder Theorem (CRT), which is the most common implementation of RSA used to secure communication. In this case, the attacker can recover the secret with only one faulty RSA-CRT signature. Moreover, in [62], Mus et al. present a fault attack that targets ElGamal or elliptic-curve (ECC) based signatures, such as Schnorr signatures and ECDSA, via Rowhammer (a software technique used to induce faults in memory). Not only public-key cryptosystems but also symmetric-key cryptosystems are vulnerable to fault injection attacks. In [66], Piret et al. develop a fault attack against cryptographic algorithms with substitution-permutation network (SPN) structures, such as AES or KHAZAD. Even post-quantum cryptographic algorithms [73], which are designed to resist quantum computing, can be vulnerable to fault attacks. Therefore, it is necessary to have efficient FI attack protections that can be easily deployed.
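The single-faulty-signature attack on RSA-CRT mentioned above can be sketched with toy parameters. The primes, the message, and the way the fault is modeled (adding 1 to the half-signature modulo $p$) are illustrative assumptions; the factor is recovered with the well-known relation $\gcd(s'^{e} - m, N)$ for a faulty signature $s'$.

```python
import math

p, q = 61, 53                 # toy primes, far too small for real use
N, e, d = p * q, 17, 2753     # e * d = 1 mod (p - 1)(q - 1)
dp, dq = d % (p - 1), d % (q - 1)
q_inv = pow(q, p - 2, p)      # inverse of q modulo the prime p (Fermat)

def sign_crt(m, fault=False):
    """RSA-CRT signature; `fault` corrupts the half computed modulo p."""
    sp = pow(m, dp, p)
    sq = pow(m, dq, q)
    if fault:
        sp = (sp + 1) % p     # models one corrupted intermediate value
    h = (q_inv * (sp - sq)) % p
    return sq + h * q         # Garner recombination

m = 65
good = sign_crt(m)
bad = sign_crt(m, fault=True)

# The faulty signature is still correct modulo q but wrong modulo p,
# so bad^e - m is a multiple of q only.
recovered_q = math.gcd((pow(bad, e, N) - m) % N, N)
print(recovered_q, N // recovered_q)
```

Here the attacker needs only the public key, the message, and one faulty signature to factor $N$, which is why a countermeasure must prevent faulty outputs from being released.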

The injected fault can, in principle, have an impact on any stage of the fetch-decode-execute cycle performed for each instruction [48, 99]. Additionally, any optimizations implemented by the CPU, such as pipelining [98], add to the complexity of executing a single instruction. Therefore, it is typically unknown what exactly goes wrong within the CPU when its behavior is changed due to fault injection, whereas the modified behavior itself is easier to measure. We consider a generic fault model, likely applicable to a wide range of targets, where a variable number of bits in the instruction are flipped as a result of fault injection. Two types of behavior are possible using this fault model: (1) Instruction corruption: the original instruction is modified into an instruction that has an impact on the behavior of the device. In practice, it may modify the instruction to any other instruction supported by the architecture. (2) Instruction skipping: effectively a subset of instruction corruption. The original instruction is corrupted into an instruction that does not have an impact on the behavior of the device. The resulting instruction does not change the execution flow or any state that is used later on.

Invocation of specific behavior is not a trivial task, as the low level control required to do this is often limited. However, it is possible to identify the more probable results while assuming that bit flips affecting single or all bits are more likely than complex patterns of bit flips [92].

On embedded processors, a fault model in which an attacker can skip an assembly instruction or equivalently replace it by a nop has been observed on several architectures and for several fault injection means [61]. Moro et al. in [60] assume that the effect of the injected fault on a 32-bit microcontroller leads to an instruction skip. Moro et al. [61] and Barenghi et al. [9] have proposed implementations of the Instruction Redundancy technique as a countermeasure against this fault model. Instruction skips correspond to specific cases of instruction replacements: replacing an instruction with another one that does not affect any useful register has the same effect as a nop replacement and so is equivalent to an instruction skip.

3 Threat Model

In our threat model, we consider an attacker with physical access to a device, capable of injecting faults such as voltage glitches during the computation of a critical function like the number theoretic transform (see Section 6). These faults can corrupt or skip instructions (see Section 2.2) and can occur anywhere and multiple times, but do not crash the device. Furthermore, the model permits the attacker to perform basic power side-channel analysis, collecting power trace samples. By correlating data-dependent power consumption with the Hamming weight leakage model, the attacker can expose vulnerabilities in cryptographic computations. This underscores the crucial need for robust defenses against both fault injection and side-channel attacks.

4 Preliminaries

We use the notion of random self-reducibility [16, 79] to develop a new software-based countermeasure against fault-injection attacks and simple power side-channel attacks. Therefore, in this section, we provide the necessary background on random self-reducibility. Since we apply our countermeasure to number-theoretic operations, we also provide the necessary background on number theoretic transforms.

4.1 Notation

The $\tilde{x}$ notation is used to represent a specific realization of a random variable (i.e., a specific value that the random variable takes on). Let $\Pr_{x\in X}[\cdot]$ denote the probability of the event in the enclosed expression when $x$ is uniformly chosen from $X$. We assume the domain and range of the function are the same set, usually named $\mathbb{D}$, but the formalization can be expanded to accommodate multivariate functions and heterogeneous domains and ranges.

Let $q$ be a prime number, and let the field of integers modulo $q$ be denoted as $\mathbb{Z}_q$. Schemes such as Kyber and Dilithium operate over polynomials in polynomial rings. The polynomial ring $\mathbb{Z}_q[x]/\phi(x)$ is denoted as $R_q$, where $\phi(x)=x^{n}+1$ is a cyclotomic polynomial with $n$ being a power of 2. Multiplication of polynomials $a,b\in R_q$ is denoted as $c=a\cdot b\in R_q$. Pointwise (coefficientwise) multiplication of two polynomials $a,b\in R_q$ is denoted as $c=a\circ b\in R_q$, which means that each coefficient of $a$ is multiplied by the coefficient of $b$ with the same index. The NTT representation of a polynomial $a\in R_q$ is denoted as $\hat{a}\in R_q$.
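The two products defined above can be illustrated with a schoolbook sketch, assuming a toy ring size of $n=4$ and Kyber's modulus $q=3329$; the helper names are hypothetical. Reduction modulo $x^{n}+1$ means that $x^{n}$ wraps around to $-1$. The last lines also check that $a\cdot b$ is additive in $a$, which is the kind of property the countermeasure relies on.

```python
Q = 3329  # Kyber's modulus; the ring size below is a toy value (Kyber uses n = 256)

def poly_mul(a, b, q=Q):
    """Schoolbook product c = a . b in Z_q[x]/(x^n + 1)."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                c[k] = (c[k] + ai * bj) % q
            else:  # x^n = -1, so the term wraps around with a sign flip
                c[k - n] = (c[k - n] - ai * bj) % q
    return c

def pointwise_mul(a, b, q=Q):
    """Coefficientwise product c = a o b."""
    return [(ai * bi) % q for ai, bi in zip(a, b)]

def poly_add(a, b, q=Q):
    return [(ai + bi) % q for ai, bi in zip(a, b)]

a, b = [1, 2, 3, 4], [5, 6, 7, 8]
print(poly_mul(a, b), pointwise_mul(a, b))

# Additivity in the first argument: split a into a1 + a2 and recombine.
a1 = [100, 200, 300, 400]
a2 = [(ai - si) % Q for ai, si in zip(a, a1)]
print(poly_mul(a, b) == poly_add(poly_mul(a1, b), poly_mul(a2, b)))
```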

4.2 Random Self-Reducibility

Informally, a function $f$ is random-self-reducible if the evaluation of $f$ at any given instance $x$ can be reduced in polynomial time to the evaluation of $f$ at one or more random instances.

Definition 1 (Random Self-Reducibility (RSR) [16, 79]).

Let $x\in\mathbb{D}$ and $c>1$ be an integer. We say that $f$ is $c$-random self-reducible if $f$ can be computed at any particular input $x$ via:

$$F\left[f(x), f(a_1), \ldots, f(a_c), a_1, \ldots, a_c\right] = 0 \qquad (1)$$

where $F$ can be computed asymptotically faster than $f$ and the $a_i$'s are uniformly distributed, although not necessarily independent; e.g., given the value of $a_1$ it is not necessary that $a_2$ be randomly distributed in $\mathbb{D}$. This notion of random self-reducibility is somewhat different from other definitions given by [1, 37, 17], where the requirement on $F$ is that it be computable in polynomial time.

Another similar definition was made by Lipton [50]. Suppose that we wish to compute the trivial identity function $f(x)=x$, and let $P$ be a program that computes $f(x)$. We can construct from $P$ another program $P'$ with the property that it can compute $f(x)$ correctly at an arbitrary point $x$ provided that one can compute it at a number of random points. Consider the following program: $P'(x) \stackrel{\Delta}{=} \tilde{r} := \text{random}();\ \text{return } P(x+\tilde{r})-P(\tilde{r})$. $P'$ can compute $P(x)$ with inputs $x+\tilde{r}$ and $\tilde{r}$.
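Lipton's construction can be written out directly. The sketch below assumes the domain is the integers modulo a hypothetical constant `M`, so that both $x+\tilde{r}$ and the final subtraction stay inside the domain; `P` is the identity program from the text.

```python
import secrets

M = 2**16  # hypothetical domain size: values live in Z_M

def P(x):
    """The program under consideration: the identity function on Z_M."""
    return x % M

def P_prime(x):
    """Evaluate P at x using only the points x + r and r."""
    r = secrets.randbelow(M)            # fresh randomness on every call
    return (P((x + r) % M) - P(r)) % M  # P never sees x itself

print(all(P_prime(x) == P(x) for x in (0, 1, 12345, M - 1)))
```

Each individual query to `P` is uniformly distributed over the domain, so an observer of a single query learns nothing about `x`.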

It is shown by Blum et al. [16] that self-correctors exist for any function that is random self-reducible. A self-corrector for $f$ takes a program $P$ that is correct on most inputs and turns it into a program that is correct on every input with high probability.

4.3 Arithmetic Secret Sharing

Privacy-preserving computing allows multiple parties to evaluate a function while keeping the inputs private and revealing only the output of the function and nothing else. One popular approach to outsourcing sensitive workloads to untrusted workers is to use arithmetic secret sharing [31, 58]. It splits a secret into multiple shares, distributing them across various workers. Each worker processes their respective share locally. Assuming the workers will not collude, it is information-theoretically impossible for each worker to recover the secret from its share [96].

In standard arithmetic secret sharing, the client aims to compute $f(x,y)=ax+by=z$, with the property that $f(x,y)=f(x_1,y_1)+f(x_2,y_2)$, where $x_1$ and $y_1$ are randomly chosen. Let $\mathbb{Z}(2^{w_e})$ denote the integer ring of size $2^{w_e}$. The shares are constructed such that the sum of all shares is equal to the original secret value $x\in\mathbb{Z}(2^{w_e})$. The client then delegates the computation to workers (untrusted entities). These workers independently calculate $f(x_1,y_1)$ and $f(x_2,y_2)$, then relay their results back to the client. The client derives $f(x,y)$ using these partial results from the untrusted workers. The randomness in the shares is crucial for our power side-channel countermeasure. While arithmetic secret sharing is based on the linearity of addition and multiplication over integers, our approach utilizes Random Self-Reducible properties, some of which may not necessarily be linear. Moreover, to counteract fault injection attacks, our algorithm must produce accurate results despite faults. We achieve this in our countermeasure by repeating the computation $n$ times and choosing the majority of the responses.
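The two-share computation described above can be sketched as follows. The word size `W`, the coefficients `A` and `B`, and the input values are hypothetical choices made only for this example.

```python
import secrets

W = 32                      # hypothetical word size w_e
MASK = (1 << W) - 1         # reduction modulo 2^W
A, B = 3, 5                 # hypothetical public coefficients a and b

def f(x, y):
    """f(x, y) = a*x + b*y in the ring Z(2^W)."""
    return (A * x + B * y) & MASK

def share(v):
    """Split v into two shares with v = v1 + v2 (mod 2^W); v1 is uniform."""
    v1 = secrets.randbelow(1 << W)
    return v1, (v - v1) & MASK

x, y = 0xDEADBEEF, 0x12345678
x1, x2 = share(x)
y1, y2 = share(y)

# Each "worker" evaluates f on one pair of shares; the client adds the results.
z = (f(x1, y1) + f(x2, y2)) & MASK
print(z == f(x, y))
```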

5 Overview of Our Countermeasure

The foundational works of Blum et al. [16] and Lipton [50] on testing have significantly influenced our approach to developing countermeasures. We have incorporated the concept of self-correctness to safeguard against fault-injection attacks, and the principles of random self-reducibility and randomly-testable functions to defend against power side-channel attacks. These notions are investigated and applied as a countermeasure against physical attacks in the literature.

At the heart of this method is the generic, randomized Algorithm 2, which is founded on the principle described in Definition 1. Additionally, Algorithm 3 boosts the effectiveness of the randomized Algorithm 2 through majority voting and probability amplification [87].

We observed that instance hiding can also be used against physical attacks, such as power side-channel and fault-injection attacks, by randomizing the intermediate values of the computation. In this way, attackers will not be able to correlate the side-channel leakage with the intermediate values of the computation (see Figure 1). For example, secrets in ElGamal Decryption [35] (see Algorithm 1) can be protected end-to-end using instance hiding, but instead of using arithmetic secret sharing, we use random self-reducible properties.

Figure 1: Motivation: In standard arithmetic secret sharing, the client aims to compute $f(x,y)=ax+by=z$, with the property that $f(x,y)=f(x_1,y_1)+f(x_2,y_2)$, where $x_1$ and $x_2$ are randomly chosen (annotated by $c_1,c_2$). The client delegates the computation to workers (untrusted entities). These workers independently calculate $f(x_1,y_1)$ and $f(x_2,y_2)$, then relay their results back to the client. The client derives $f(x,y)$ using these partial results from the untrusted workers. The randomness in the shares is crucial for our power side-channel countermeasure. While arithmetic secret sharing is based on the linearity of addition and multiplication over integers (i.e., $f(x+y)=f(x)+f(y)$), our approach utilizes Random Self-Reducible properties, some of which may not necessarily be linear. Moreover, to counteract fault injection attacks, our algorithm must produce accurate results despite faults. We achieve this in our countermeasure by repeating the computation $n$ times and choosing the majority of the responses.
Input: Ciphertexts: $c_1$, $c_2$; Secret Key: $x$
Output: Decrypted Message: $m$
1 Calculate $s := c_1^{x} \bmod R$
2 Calculate $l := s^{-1} \bmod R$
3 Calculate $m := c_2 \cdot l \bmod R$
return $m$
Algorithm 1 ElGamal Decryption

In this algorithm, $c_1$ and $c_2$ are the ciphertexts, $x$ is the secret key, and $m$ is the decrypted message. The operation $c_1^{x}$ represents raising $c_1$ to the power of $x$, and $s^{-1}$ represents the modular multiplicative inverse of $s$ (i.e., $s^{-1} \bmod R$, where $R$ is the prime modulus used in the ElGamal encryption scheme). The result of the decryption, $m$, is obtained by multiplying $c_2$ with the modular inverse of $s$, denoted as $l$ in this algorithm.
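The three steps of Algorithm 1 translate almost line by line into code. The sketch below is unprotected and uses toy parameters; the generator `g`, the ephemeral value `k`, and the key generation lines are assumptions added only so that a ciphertext exists to decrypt. The inverse is computed with Fermat's method, $s^{-1} = s^{R-2} \bmod R$ for prime $R$.

```python
def elgamal_decrypt(c1, c2, x, R):
    """Unprotected ElGamal decryption following the three steps of Algorithm 1."""
    s = pow(c1, x, R)        # step 1: modular exponentiation
    l = pow(s, R - 2, R)     # step 2: modular inverse via Fermat (R prime)
    return (c2 * l) % R      # step 3: modular multiplication

# Toy parameters, far too small for real use.
R = 467                      # prime modulus
g = 2                        # hypothetical base used to build a ciphertext
x = 127                      # secret key
h = pow(g, x, R)             # matching public key

k, m = 213, 100              # hypothetical ephemeral value and message
c1, c2 = pow(g, k, R), (m * pow(h, k, R)) % R

print(elgamal_decrypt(c1, c2, x, R))
```

The secret key `x` enters only through the modular exponentiation, which is why that operation and the modular multiplication are the natural places to apply the countermeasure.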

For instance, we can protect the modular exponentiation function, $f(a,x,R)=a^{x}\bmod R$, in ElGamal decryption using $P(a,x,R)=P(a,\tilde{x}_1,R)\cdot_R P(a,\tilde{x}_2,R)$, and modular multiplication, $f(x,y,R)=x\cdot_R y$, using $P(x,y,R)=P(\tilde{x}_1,\tilde{y}_1,R)+_R P(\tilde{x}_2,\tilde{y}_1,R)+_R P(\tilde{x}_1,\tilde{y}_2,R)+_R P(\tilde{x}_2,\tilde{y}_2,R)$. In these equalities, the shares should be selected so that $x=\tilde{x}_1+_R\tilde{x}_2$ and $y=\tilde{y}_1+_R\tilde{y}_2$.
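Both identities can be checked numerically. In the sketch below the modulus is a Mersenne prime chosen only for illustration, and the function names are hypothetical. One detail is an assumption of this example rather than a statement from the text: the exponent shares are reduced modulo $R-1$, which is valid by Fermat's little theorem when $R$ is prime and $a$ is not a multiple of $R$.

```python
import secrets

R = 2**61 - 1  # a Mersenne prime, chosen only for illustration

def mod_exp(a, x, R):
    return pow(a, x, R)

def mod_mul(x, y, R):
    return (x * y) % R

def rsr_mod_exp(a, x, R):
    """a^x mod R from two random exponent shares.

    Assumption of this sketch: R is prime and gcd(a, R) = 1, so the
    exponent shares may be taken modulo R - 1.
    """
    x1 = secrets.randbelow(R - 1)
    x2 = (x - x1) % (R - 1)
    return mod_mul(mod_exp(a, x1, R), mod_exp(a, x2, R), R)

def rsr_mod_mul(x, y, R):
    """x*y mod R from four products of random shares."""
    x1 = secrets.randbelow(R)
    x2 = (x - x1) % R
    y1 = secrets.randbelow(R)
    y2 = (y - y1) % R
    return (mod_mul(x1, y1, R) + mod_mul(x2, y1, R)
            + mod_mul(x1, y2, R) + mod_mul(x2, y2, R)) % R

a, x, y = 3, 123456789, 987654321
print(rsr_mod_exp(a, x, R) == mod_exp(a, x, R))
print(rsr_mod_mul(x, y, R) == mod_mul(x, y, R))
```

The unprotected routines are only ever called on random-looking shares, while the recombination uses cheap modular additions or one extra multiplication.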

TABLE I: Operations used in ElGamal Decryption: Deshpande et al. [32] use two different ways to implement modular exponentiation ($\mathrm{D_1}$ and $\mathrm{D_2}$) in ElGamal decryption.

| Operation | Method-$\mathbf{D_1}$ | Method-$\mathbf{D_2}$ |
|---|---|---|
| $s := c_1^{x}$ | Modular exponentiation | Modular exponentiation |
| $l := s^{-1}$ | Fermat's method [95] | Fast GCD algorithm [13] |
| $m := c_2 \cdot l$ | Modular multiplication | Modular multiplication |

It is also hard to implement countermeasures for different implementations of the same mathematical function. Table I presents two distinct methodologies for implementing the ElGamal decryption algorithm. Method $D_1$ employs Fermat's method (Algorithm 18 in Appendix) for the modular inverse calculation [95], while $D_2$ utilizes a sophisticated, constant-time modular inverse implementation recently developed by Bernstein and Yang [13] (Algorithm 17 in Appendix). Importantly, our countermeasure technique, applicable to both modular exponentiation and modular multiplication (refer to Section 6), is compatible with either method regardless of the complexity of their respective implementations.

5.1 RSR against Power Side Channels

Consider a correct program $P$ that has an associated random self-reducible property, which takes the form of a functional equation $p$. This property is deemed satisfied if, in the equation $p$, we can substitute $P$ for the function $f$ and the equation remains true.

Input: Program: $P$, Sensitive input: $x$, Security: $c$
Output: $P(x)$
1 Randomly split $a_1,\ldots,a_c$ based on $x$.
2 for $i=1,\ldots,c$ do
3     $\alpha_i \leftarrow P(a_i)$
return $F[x,a_1,\ldots,a_c,\alpha_1,\ldots,\alpha_c]$
Algorithm 2 $c$-secure-countermeasure PSCA $(P,x,c)$.

The generic $c$-secure-countermeasure PSCA $(P,x,c)$ defined in Algorithm 2 takes a program $P$, a sensitive input $x$, and a security parameter $c$. The algorithm randomly splits $x$ into $c$ shares $a_1,\ldots,a_c$ such that $x=a_1+\cdots+a_c$, and calls $P$ on each share $a_i$ to obtain $\alpha_i=P(a_i)$. Finally, the algorithm returns the result of the function $F$ on $x,a_1,\ldots,a_c,\alpha_1,\ldots,\alpha_c$. The function $F$ is defined based on the random self-reducible property of the function $f$ that $P$ implements (cf. Definition 1).
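A minimal sketch of Algorithm 2 is given below, assuming additive shares modulo a given modulus. The names `random_split` and `psca`, the example program `P` (multiplication by a fixed value `y` modulo `R`), and the recombination function `F` are all hypothetical and stand in for whatever random self-reducible property the protected operation has.

```python
import secrets

def random_split(x, c, modulus):
    """Split x into c shares that sum to x modulo `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(c - 1)]
    shares.append((x - sum(shares)) % modulus)
    return shares

def psca(P, x, c, modulus, F):
    """Sketch of Algorithm 2: split x, call P on every share, recombine with F."""
    shares = random_split(x, c, modulus)
    alphas = [P(a) for a in shares]
    return F(x, shares, alphas)

# Example instantiation (an assumption of this sketch): P multiplies by a
# fixed value y modulo R, which is additive in its input.
R = 3329
y = 1234

def P(a):
    return (a * y) % R

def F(x, shares, alphas):
    # For an additive property, recombination is the modular sum of the answers.
    return sum(alphas) % R

print(psca(P, 2024, 3, R, F) == P(2024))
```

Raising `c` increases the number of calls to `P`, each on a share that looks uniformly random when viewed on its own.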

To ensure minimum security, splitting the secret input into two shares suffices. For enhanced security, the secret input can be divided into additional shares. It is important to view the security parameter $c$ as the number of invocations of $P$, especially in the context of bivariate functions, rather than merely the number of shares.

Masking with Random Self-Reducibility. If a cryptographic operation has a random self-reducible property, then it is possible to protect it against power side-channel attacks by masking with arithmetic secret sharing.

5.2 Self-Correctness against Fault Injections

Fault injection attacks rely on obtaining a faulty output or correlating the faulty output with the input or secret-dependent intermediate values. By introducing redundancy and majority voting, we can obtain correct results even if some results are incorrect due to injected faults.

In Algorithm 3, we show how to apply the fault injection countermeasure approach on top of the power side-channel countermeasure. To protect a program $P$ that implements a function $f$ having a random self-reducible property, the algorithm calls $P$'s $c$-secure-countermeasure $n$ times and returns the majority of the answers. The function $c$-secure-countermeasure takes a program $P$, a sensitive input $x$, and a security parameter $c$.

Input: Program: $P$, Sensitive input: $x$, Security: $n, c$
Output: $P(x)$
1 for $m=1,\ldots,n$ do
2     $\text{answer}_m \leftarrow$ call $c$-secure-countermeasure($P,x,c$)
return the majority in $\{\text{answer}_m : m=1,\ldots,n\}$
Algorithm 3 $n$-secure countermeasure FIA $(P,x,n,c)$.

Note that $c$ and $n$ are independent security parameters. The security parameter $c$ represents the number of calls to the unprotected program used in the PSCA countermeasure, whereas $n$ signifies the number of iterations in the FIA countermeasure. The security parameter $n$ is associated with the attacker's capability to inject effective faults. Owing to redundancy, an increase in the security parameter $c$ results in a decreased likelihood of the attacker successfully injecting a fault.
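The majority-voting wrapper of Algorithm 3 can be sketched in a few lines. The victim below is a hypothetical stand-in for a protected operation: it doubles its input, and a fault is modeled deterministically by corrupting the result of selected calls.

```python
from collections import Counter

def fia(countermeasure, x, n):
    """Sketch of Algorithm 3: run the countermeasure n times, return the majority."""
    answers = [countermeasure(x) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

class GlitchyDoubler:
    """Hypothetical victim: returns 2*x, except on the calls listed as faulty."""

    def __init__(self, faulty_calls):
        self.calls = 0
        self.faulty_calls = set(faulty_calls)

    def __call__(self, x):
        self.calls += 1
        if self.calls in self.faulty_calls:
            return 2 * x + self.calls  # corrupted result
        return 2 * x

victim = GlitchyDoubler(faulty_calls={2, 4})
print(fia(victim, 21, n=5))
```

With two of the five runs corrupted, the three correct runs still agree, so the voted output is the correct value.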

Input: Program $P$, sensitive input $x$, security parameters $n, c$
Output: $P(x)$
1 for $m = 1, \ldots, n$ do
2     $x_1, x_2, \ldots, x_c \xleftarrow{\$}$ Random-Split($R2^n, x$)
3     $\text{answer}_m \leftarrow P(x_1, R) +_R P(x_2, R) +_R \cdots +_R P(x_c, R)$
4 return the majority in $\{\text{answer}_m : m = 1, \ldots, n\}$
Algorithm 4: ($c$, $n$)-secure mod operation $(P, R, x, c, n)$.

Algorithm 4 presents an example of a combined and configurable countermeasure, effective against both PSCA and FIA, applied to the mod operation. In Line 2, the algorithm divides the input $x$ into $c$ shares $x_1, x_2, \ldots, x_c$ satisfying $x = x_1 + x_2 + \cdots + x_c$. Both the random splitter and the majority function, which essentially returns the most common answer, are described in Section 6.
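The structure of Algorithm 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `mod_program` stands in for the unprotected program $P$, the width parameter `k` is a hypothetical name for the exponent in the split modulus $R2^n$ (renamed to avoid clashing with the iteration count $n$), and `collections.Counter` replaces the majority function of Section 6.

```python
import secrets
from collections import Counter


def mod_program(x, R):
    """Unprotected program P for f(x, R) = x mod R."""
    return x % R


def random_split(m, x, c):
    """Split x into c shares whose sum is congruent to x modulo m."""
    shares = [secrets.randbelow(m) for _ in range(c - 1)]
    shares.append((x - sum(shares)) % m)
    return shares


def secure_mod(P, R, x, c=2, n=3, k=8):
    """Sketch of a (c, n)-secure mod operation (k is an assumed width parameter)."""
    m = R << k  # split modulus R * 2^k, a multiple of R
    answers = []
    for _ in range(n):
        # PSCA layer: P only ever sees fresh random shares of x.
        shares = random_split(m, x, c)
        answers.append(sum(P(s, R) for s in shares) % R)
    # FIA layer: return the most common of the n answers.
    return Counter(answers).most_common(1)[0][0]
```

For instance, `secure_mod(mod_program, 97, 123456789, c=3, n=5)` returns the same value as `123456789 % 97`, while each individual call to `mod_program` operates on a value that is uniformly distributed or determined only jointly with the other shares.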

Self-Correctness with Majority Voting. Fault injection attacks rely on faulty output. By majority voting, we can obtain correct results even if some results are incorrect.

5.3 $n$ and attacker’s probability of success

Fault injection occurs at the hardware level and is both challenging and unpredictable to control. When a successful fault is induced, it transforms a previously correct victim program into an incorrect one. Consequently, the essence of a fault injection attack is its probabilistic nature. This concept is abstracted in terms of the attacker’s probability of success in our work.

Definition 2 ($\varepsilon$-fault tolerance).

Let $\varepsilon$ be the upper bound on the attacker’s probability of successfully injecting a fault into an unprotected program $P$ that correctly implements a function $f$. We say that the program $P$ is $\varepsilon$-fault tolerant for the function $f$ provided that, for any input $x$, $P(x) = f(x)$ with probability at least $1 - \varepsilon$. We assume each fault injection is independent of the others: $\operatorname{Pr}_{fault}[P(x) \neq f(x)] < \varepsilon.$

Algorithm 2 is a randomized algorithm, and Algorithm 3 is a randomized algorithm that repeats the computation $n$ times by calling Algorithm 2 and uses majority voting to pick the correct answer. Therefore, we can use Chernoff bounds [87] to show that the probability of getting the correct answer is at least $1 - \delta$.

A simple and common use of Chernoff bounds is for "boosting" of randomized algorithms. If one has an algorithm that outputs a guess that is the desired answer with probability $p > 1/2$, then one can get a higher success rate by running the algorithm $n = \log(1/\delta) \cdot 2p/(p - 1/2)^2$ times and outputting a guess that is output by more than $n/2$ runs of the algorithm. Assuming that these runs are independent, the probability that more than $n/2$ of the guesses are correct is equal to the probability that the sum $X$ of independent Bernoulli random variables $X_k$ that are 1 with probability $p$ is more than $n/2$. This can be shown to be at least $1 - \delta$ via the multiplicative Chernoff bound with $\mu = np$ [29]: $\Pr[X > n/2] \geq 1 - e^{-n(p - 1/2)^2/(2p)} \geq 1 - \delta.$
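As a rough numerical illustration of this boosting bound, the sketch below computes the number of repetitions for a given per-run success probability and confidence parameter. The function names are hypothetical, and the logarithm is assumed to be the natural logarithm so that it matches the exponential in the Chernoff bound.

```python
import math


def runs_needed(p, delta):
    """Smallest integer n with n >= ln(1/delta) * 2p / (p - 1/2)^2."""
    if not 0.5 < p <= 1.0:
        raise ValueError("p must lie in (1/2, 1]")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.ceil(math.log(1.0 / delta) * 2.0 * p / (p - 0.5) ** 2)


def majority_success_lower_bound(n, p):
    """Chernoff lower bound on Pr[X > n/2] for n independent runs."""
    return 1.0 - math.exp(-n * (p - 0.5) ** 2 / (2.0 * p))
```

With a per-run success probability of `p = 0.98` and `delta = 1e-6`, `runs_needed` evaluates to 118, and `majority_success_lower_bound(118, 0.98)` is then at least `1 - 1e-6`. The bound is conservative, so fewer repetitions may be adequate in practice.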

Theorem 1 (Derived from Theorem 3.1 in [50]).

Suppose that $f$ is randomly self-reducible and that $P$ is $\varepsilon$-fault tolerant for the function $f$. Consider a $c$-secure countermeasure $\widetilde{C}(x)$ (Line 4 in Algorithm 2):

$\textbf{return } F[x, a_1, \ldots, a_c, P(a_1), \ldots, P(a_c)]$

Then, for any $x$, $\widetilde{C}(x)$ is equal to $f(x)$ with probability at least $1 - \varepsilon c$.

Proof.

Fix an input $x$. The probability that $\widetilde{C}(x)$ is correct is at least the probability that $P(a_i) = f(a_i)$ for each $i$. This holds because $f$ is random self-reducible, so $\widetilde{C}(x)$ depends on $P$ only through the $c$ calls $P(a_1), \ldots, P(a_c)$. Each call is faulty with probability less than $\varepsilon$, so by the union bound at least one of them is faulty with probability less than $\varepsilon c$. It therefore follows that $\widetilde{C}$ returns correct results at least $1 - \varepsilon c$ of the time. ∎

In the next sections, we will present a number of examples of $c$-secure countermeasures whose security parameter is mostly $c = 2$. Thus, for these functions, Theorem 1 says that, for $\varepsilon$ equal to $1/100$, the probability that $\widetilde{C}$ returns correct results is at least 0.98. We can amplify the probability of success by repeating the computation $n$ times and using majority voting. In addition, we can select a bigger $n$ by adjusting $\delta$ as the confidence parameter:

Lower bound for $n$. The attacker’s probability of success is $\varepsilon$, and for a $c$-secure countermeasure, substituting the per-run success probability $p = 1 - \varepsilon c$ from Theorem 1 into the boosting bound gives the lower bound for $n$: $n = \log(1/\delta) \cdot 2(1 - \varepsilon c)/(1/2 - \varepsilon c)^2$, where $\delta$ is the confidence parameter.

Algorithm 2 makes calls to a program $P$ that implements a function $f$ having a random self-reducible property. However, we do not need to know the implementation of $f$; we only need its mathematical definition to configure Algorithms 2 and 3. Therefore, one further advantage of our countermeasure is that it follows a “black-box” approach. Fault injection attacks are hardware attacks, and the attacker does not have access to the software implementation of the function. Therefore, the attacker can only observe the input and output of the function. By using the black-box approach, we make the countermeasure robust at the hardware level.

Black-box. If we replace the function $f$ with a program $P$ that computes $f$, then our countermeasure $\widetilde{C}$ accesses $P$ as a black box and computes $f$ using the random self-reducible properties of $f$.

6 Implementation of Countermeasures

Table II lists several finite field operations and their corresponding random self-reducible properties. In this section, we exemplify each function that we used in end-to-end experiments and show how to apply the countermeasure approach to protect the function.

TABLE II: Random Self-Reducible Properties. $x$ and $y$ are integers and $p$ and $q$ are polynomials.

| Program $P$ | Function $f$ | Random Self-Reducible Property |
|---|---|---|
| Mod Operation | $f(x, R) = x \bmod R$ | $P(x, R) \leftarrow P(\tilde{x}_1, R) +_R P(\tilde{x}_2, R)$ |
| Modular Multiplication | $f(x, y, R) = x \cdot_R y$ | $P(x, y, R) \leftarrow P(\tilde{x}_1, \tilde{y}_1, R) +_R P(\tilde{x}_2, \tilde{y}_1, R) +_R P(\tilde{x}_1, \tilde{y}_2, R) +_R P(\tilde{x}_2, \tilde{y}_2, R)$ |
| Modular Exponentiation | $f(a, x, R) = a^x \bmod R$ | $P(a, x, R) \leftarrow P(a, \tilde{x}_1, R) \cdot_R P(a, \tilde{x}_2, R)$ |
| Modular Inverse | $f(x, R) \cdot_R x = 1$ | $P(x, R) \leftarrow \tilde{w} \cdot_R P(x \cdot_R \tilde{w})$, where $P(\tilde{w}, R) \cdot_R \tilde{w} = 1$ and $P(x, R) \cdot_R x = 1$ |
| Polynomial Multiplication | $f(p_x, q_x) = p_x \cdot q_x$ | $P(p, q) \leftarrow P(\tilde{p}_1, \tilde{q}_1) + P(\tilde{p}_2, \tilde{q}_1) + P(\tilde{p}_1, \tilde{q}_2) + P(\tilde{p}_2, \tilde{q}_2)$ |
| Number Theoretic Transform | $f(x_1, \ldots, x_n) = \cdots$ | $P(x_1, \ldots, x_n) \leftarrow P(x_1 + \tilde{r}_1, \ldots, x_n + \tilde{r}_n) - P(\tilde{r}_1, \ldots, \tilde{r}_n)$ |
| Integer Multiplication | $f(x, y) = x \cdot y$ | $P(x, y) \leftarrow P(\tilde{x}_1, \tilde{y}_1) + P(\tilde{x}_1, \tilde{y}_2) + P(\tilde{x}_2, \tilde{y}_1) + P(\tilde{x}_2, \tilde{y}_2)$ |
| Integer Multiplication | $f(x, y) = x \cdot y$ | $P(x, y) \leftarrow P(x + \tilde{r}, y + \tilde{s}) - P(\tilde{r}, y + \tilde{s}) - P(x + \tilde{t}, \tilde{s}) + P(\tilde{t}, \tilde{s})$ |
| Integer Division | $f(x, R) = x \div R$ | $P(x, R) \leftarrow P(x_1, R) + P(x_2, R) + P(P_{\text{mod}}(x_1, R) + P_{\text{mod}}(x_2, R), R)$ |
| Matrix Multiplication | $f(A, B) = A \times B$ | $P(A, B) \leftarrow P(\tilde{A}_1, \tilde{B}_1) + P(\tilde{A}_2, \tilde{B}_1) + P(\tilde{A}_1, \tilde{B}_2) + P(\tilde{A}_2, \tilde{B}_2)$ |
| Matrix Inverse | $f(A) = A^{-1}$ | $P(A) \leftarrow \tilde{R} \times P(A \times \tilde{R})$, where $A$ and $\tilde{R}$ are invertible $n$-by-$n$ matrices |
| Matrix Determinant | $f(A) = \det A$ | $P(A) \leftarrow P(A \times \tilde{R}) / P(\tilde{R})$, where $\tilde{R}$ is invertible |

Random Split Function.

A random splitter is used to divide the input $x$ into $c$ shares $a_1, \ldots, a_c$ such that $x = a_1 + \cdots + a_c$. Algorithm 5 provides a possible implementation of the random splitter, accommodating an additional input that specifies the total number of shares.

Input: modulus $m$, input value $x$, number of shares $c$
Output: an array of shares $a_1, a_2, \ldots, a_c$
1 Initialize an array $s[1 \ldots c]$ and initialize $sum \leftarrow 0$
2 $i \leftarrow 1$
3 for $i$ to $c - 1$ do
4     $s[i] \xleftarrow{\$}$ random integer in $\mathbb{Z}_m$
5     $sum \leftarrow sum + s[i]$
6 $s[c] \leftarrow x - sum \pmod{m}$
7 return $s$
Algorithm 5: Random-Split$(m, x, c)$.

This algorithm ensures that the sum of all shares $a_1, a_2, \ldots, a_c$ is congruent to the original input $x$ modulo $m$. This congruence condition is vital for maintaining the integrity of the split and ensuring that the original input can be accurately reconstructed from the shares.
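A direct Python transcription of Algorithm 5 might look like the sketch below. The use of the standard-library `secrets` module as the randomness source and the helper `reconstruct` are assumptions made for illustration; an embedded implementation would draw from the platform's random number generator instead.

```python
import secrets


def random_split(m, x, c):
    """Split x into c shares that sum to x modulo m (Algorithm 5)."""
    if m <= 0 or c < 1:
        raise ValueError("need a positive modulus and at least one share")
    shares = []
    total = 0
    for _ in range(c - 1):
        s = secrets.randbelow(m)  # uniform in Z_m
        shares.append(s)
        total += s
    shares.append((x - total) % m)  # last share fixes the sum
    return shares


def reconstruct(m, shares):
    """Recombine shares; hypothetical helper used only to check the split."""
    return sum(shares) % m
```

For example, `reconstruct(3329, random_split(3329, 1234, 4))` yields 1234 again, whatever random shares were drawn.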

Majority Vote Function.

The function majority() selects the value that occurs most frequently; in case of a tie it selects the first such value. In practice, if majority() does not receive identical values, it should at least “log” that an error has been detected. Here, it is unlikely that the majority of the values are wrong.

Input: A list of elements $a_1, a_2, \dots, a_n$
Output: The majority element of the list
1 Initialize an element $m$ and a counter $i$ with $i = 0$
2 for $j \leftarrow 1$ to $n$ do
3     if $i = 0$ then $m \leftarrow a_j$ and $i \leftarrow 1$
4     else if $m = a_j$ then $i \leftarrow i + 1$
5     else $i \leftarrow i - 1$
6 return $m$
Algorithm 6: Majority Vote Algorithm

The software implementations of all ($c, n$)-secure programs use the Boyer-Moore majority vote algorithm [20] (https://www.cs.utexas.edu/~moore/best-ideas/mjrty) as the majority function. Algorithm 6 maintains in its local variables a sequence element $m$ and a counter $i$, with the counter initially zero. It then processes the elements of the sequence, one at a time. When processing an element $a_j$, if the counter is zero, the algorithm stores $a_j$ as its remembered sequence element and sets the counter to one. Otherwise, it compares $a_j$ to the stored element and either increments the counter (if they are equal) or decrements the counter (otherwise). At the end of this process, if the sequence has a majority, it will be the element stored by the algorithm.

Algorithm 7 is a self-correcting version of the majority vote algorithm. It calls the majority function $n$ times and, at each iteration, shuffles the input list so that the processing order is randomized against simple power side-channel attacks. The algorithm returns the majority element of the list, if it exists.
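The counter-based procedure of Algorithm 6 can be sketched in Python as below. The function `strict_majority`, which adds a second verification pass, is not part of the paper's pseudocode; it is included as an assumption about how one might detect the case where no majority exists.

```python
def boyer_moore_candidate(votes):
    """Single pass of the Boyer-Moore majority vote (Algorithm 6)."""
    candidate = None
    count = 0
    for v in votes:
        if count == 0:
            candidate, count = v, 1   # adopt a new remembered element
        elif v == candidate:
            count += 1
        else:
            count -= 1
    return candidate


def strict_majority(votes):
    """Return the element held by more than half the votes, else None."""
    votes = list(votes)
    candidate = boyer_moore_candidate(votes)
    occurrences = sum(1 for v in votes if v == candidate)
    return candidate if votes and 2 * occurrences > len(votes) else None
```

The first function runs in linear time with constant extra memory, which suits constrained targets. When no strict majority exists, its output is an arbitrary element, which is why the optional second pass can be useful for logging detected errors.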

Input: Votes $\ell = a_1, a_2, \dots, a_n$, function $\operatorname{majority}$, and $n$
Output: The majority element of the list, if it exists
1 $m \leftarrow 1$
2 for $m$ to $n$ do
3     $\ell_1 \xleftarrow{\$} \operatorname{shuffle}(\ell)$
4     $\text{answer}_m \leftarrow \operatorname{majority}(\ell_1)$
5 if $m \neq n$ then output "FAIL" and halt    ▷ verify loop completion
6 return the majority in $\{\text{answer}_m : m = 1, \ldots, n\}$
Algorithm 7: Protected Majority Vote $(\ell, \operatorname{majority}, n)$

The Fisher-Yates shuffle algorithm is used to shuffle the input list at each iteration [38, 34]. It iterates through a sequence from the end to the beginning (or the other way) and, for each location $i$, swaps the value at $i$ with the value at a random target location $j$ at or before $i$.

Input: A list of elements $a_1, a_2, \dots, a_n$
Output: A random permutation of the elements in the input list
1 for $i \leftarrow n - 1$ down to $1$ do
2     Choose a random integer $j$ such that $0 \leq j \leq i$
3     Swap $a_i$ and $a_j$
4 return the shuffled list
Algorithm 8: Fisher-Yates Shuffle

Line 5 of Algorithm 7 verifies that the loop has run to completion and that the algorithm has not been interrupted. This is a classical countermeasure against instruction-skip fault injection attacks [92]. The remaining countermeasures do not need this check because the self-correctness property is already guaranteed by the majority function.
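Algorithms 7 and 8 can be combined into a short Python sketch. The names `fisher_yates_shuffle`, `simple_majority`, and `protected_majority` are hypothetical, and `secrets.randbelow` is an assumed randomness source. In ordinary software execution the loop-completion check never fires; it only becomes meaningful when a fault causes instructions to be skipped.

```python
import secrets
from collections import Counter


def fisher_yates_shuffle(items):
    """Return a uniformly random permutation of items (Algorithm 8)."""
    a = list(items)
    for i in range(len(a) - 1, 0, -1):
        j = secrets.randbelow(i + 1)  # 0 <= j <= i
        a[i], a[j] = a[j], a[i]
    return a


def simple_majority(votes):
    """Stand-in majority function: the most common value."""
    return Counter(votes).most_common(1)[0][0]


def protected_majority(votes, majority, n):
    """Sketch of Algorithm 7: vote n times on freshly shuffled inputs."""
    answers = []
    m = 0
    while m < n:
        answers.append(majority(fisher_yates_shuffle(votes)))
        m += 1
    if m != n:  # loop-completion check against instruction skips
        raise RuntimeError("FAIL: loop did not complete")
    return Counter(answers).most_common(1)[0][0]
```

Calling `protected_majority([5, 5, 5, 2, 5, 9, 5], simple_majority, 5)` returns 5, since 5 holds a strict majority regardless of the order in which the votes are processed.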

Mod Function Countermeasure

We consider computing an integer $\bmod\, R$ for a positive number $R$. In this case, $f(x, R) = x \bmod R$. Algorithm 9 shows pseudocode for the protected mod function with a security parameter of 2.

1 $x_1, x_2 \xleftarrow{\$}$ Random-Split($R2^n, x$)
2 return $P(x_1, R) +_R P(x_2, R)$
Algorithm 9: 2-secure protected mod operation $(P, R, x)$

Algorithm 10 shows a version with security parameter 3, obtained by increasing the number of shares to three. All random self-reducible properties in Table II extend to $n$ shares in the same way.

1 $x_1, x_2, x_3 \xleftarrow{\$}$ Random-Split($R2^n, x$)
2 return $P(x_1, R) +_R P(x_2, R) +_R P(x_3, R)$
Algorithm 10: 3-secure protected mod operation $(P, R, x)$.

Modular Multiplication Countermeasure

We now consider multiplication of integers $\bmod\ R$ for a positive number $R$. In this case, $f(x, y, R) = x \cdot_R y$. Suppose that both $x$ and $y$ are in the range $\mathbb{Z}_{R2^n}$ for some positive integer $n$. Algorithm 11 shows a possible implementation of the protected modular multiplication with the security parameter $c$ set to 2.

1 $x_1, x_2 \xleftarrow{\$}$ Random-Split($R \times 2^n, x$)
2 $y_1, y_2 \xleftarrow{\$}$ Random-Split($R \times 2^n, y$)
3 return $P(x_1, y_1, R) +_R P(x_2, y_1, R) +_R P(x_1, y_2, R) +_R P(x_2, y_2, R)$
Algorithm 11: 2-secure mod. multiplication $(P, R, x, y)$
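Algorithm 11 can be sketched in Python as follows, assuming a plain `(x * y) % R` routine as the unprotected program $P$ and a hypothetical width parameter `k` for the split modulus $R \times 2^k$. The four cross terms recombine to $x \cdot_R y$ because the shares of each operand sum to that operand modulo a multiple of $R$.

```python
import secrets


def modmul_program(x, y, R):
    """Unprotected program P for f(x, y, R) = x * y mod R."""
    return (x * y) % R


def split2(m, v):
    """Split v into two shares that sum to v modulo m."""
    v1 = secrets.randbelow(m)
    return v1, (v - v1) % m


def secure_modmul(P, R, x, y, k=8):
    """Sketch of the 2-secure modular multiplication (k is an assumption)."""
    m = R << k  # R * 2^k, a multiple of R
    x1, x2 = split2(m, x)
    y1, y2 = split2(m, y)
    return (P(x1, y1, R) + P(x2, y1, R) + P(x1, y2, R) + P(x2, y2, R)) % R
```

For example, `secure_modmul(modmul_program, 3329, 1234, 2345)` agrees with `(1234 * 2345) % 3329`, while `modmul_program` itself never receives the pair `(x, y)` directly.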

Modular Exponentiation Countermeasure

We now consider exponentiation of integers $\bmod\ R$ for a positive number $R$. In this case, $f(a, x, R) = a^x \bmod R$. We restrict attention to the case when $\gcd(a, R) = 1$ and when we know the factorization of $R$, so that we can easily compute $\phi(R)$, where $\phi$ is Euler’s function. Suppose that $x$ is in the range $\mathbb{Z}_{\phi(R)2^n}$.

1 $x_1, x_2 \xleftarrow{\$}$ Random-Split($\phi(R)2^n, x$)
2 return $P(a, x_1, R) \cdot_R P(a, x_2, R)$    ▷ calls Algo. 11
Algorithm 12: 2-secure mod. exponentiation $(P, R, a, x)$

The modular exponentiation self-correcting program is very simple to code. The hardest operation to perform is the modular multiplication $P(a, x_1, R) \cdot_R P(a, x_2, R)$. The self-correcting program can compute this multiplication directly without using the random self-reducible property; however, for extra protection, the 2-secure modular multiplication can be used (cf. Algorithm 11).
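A minimal Python sketch of Algorithm 12 is given below. It assumes that $\phi(R)$ is supplied by the caller (here as `phi_R`), uses Python's built-in three-argument `pow` as the unprotected program $P$, and performs the final multiplication directly rather than through the protected multiplication. The width parameter `k` is again a hypothetical name.

```python
import math
import secrets


def modexp_program(a, x, R):
    """Unprotected program P for f(a, x, R) = a^x mod R."""
    return pow(a, x, R)


def secure_modexp(P, R, phi_R, a, x, k=8):
    """Sketch of the 2-secure modular exponentiation (k is an assumption)."""
    if math.gcd(a, R) != 1:
        raise ValueError("the exponent split requires gcd(a, R) = 1")
    m = phi_R << k                 # phi(R) * 2^k, a multiple of phi(R)
    x1 = secrets.randbelow(m)
    x2 = (x - x1) % m              # x1 + x2 = x modulo phi(R) * 2^k
    # a^(x1 + x2) = a^x mod R by Euler's theorem, since gcd(a, R) = 1.
    return (P(a, x1, R) * P(a, x2, R)) % R
```

With the toy modulus $R = 61 \cdot 53 = 3233$ and $\phi(R) = 3120$, `secure_modexp(modexp_program, 3233, 3120, 65, 2753)` matches `pow(65, 2753, 3233)`.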

Polynomial Multiplication Countermeasure

We consider the multiplication of polynomials over a ring. Let $R^d[x]$ denote the set of polynomials of degree $d$ with coefficients from some ring $R$, and let $\mathbb{U}_{R^d[x] \times R^d[x]}$ be the uniform distribution on $R^d[x] \times R^d[x]$. In this case, $f(p(x), q(x)) = p(x) \cdot q(x)$, where $p, q \in R^d[x]$.

1 Choose $p_1 \in_{\mathbb{U}} R^d[x]$    ▷ random polynomial
2 Choose $q_1 \in_{\mathbb{U}} R^d[x]$    ▷ random polynomial
3 $p_2 \leftarrow p - p_1$
4 $q_2 \leftarrow q - q_1$
5 return $P(p_1, q_1) + P(p_2, q_1) + P(p_1, q_2) + P(p_2, q_2)$
Algorithm 13: 2-secure polynomial multiplication $(P, p, q)$
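Algorithm 13 can be illustrated with polynomials represented as coefficient lists over $\mathbb{Z}_Q$. In the sketch below the modulus `Q = 3329` and the schoolbook multiplier `poly_mul` standing in for $P$ are assumptions chosen for illustration, and no reduction modulo a ring polynomial is performed.

```python
import secrets

Q = 3329  # assumed example coefficient modulus


def poly_add(p, q):
    """Coefficient-wise sum of two equal-length polynomials modulo Q."""
    return [(a + b) % Q for a, b in zip(p, q)]


def poly_sub(p, q):
    """Coefficient-wise difference of two equal-length polynomials modulo Q."""
    return [(a - b) % Q for a, b in zip(p, q)]


def poly_mul(p, q):
    """Unprotected program P: schoolbook polynomial multiplication."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) % Q
    return out


def secure_poly_mul(P, p, q):
    """Sketch of the 2-secure polynomial multiplication (Algorithm 13)."""
    p1 = [secrets.randbelow(Q) for _ in p]  # random polynomial
    q1 = [secrets.randbelow(Q) for _ in q]  # random polynomial
    p2 = poly_sub(p, p1)
    q2 = poly_sub(q, q1)
    result = P(p1, q1)
    for part in (P(p2, q1), P(p1, q2), P(p2, q2)):
        result = poly_add(result, part)
    return result
```

For example, `secure_poly_mul(poly_mul, [1, 2, 3], [4, 5, 6])` returns the same coefficients as `poly_mul([1, 2, 3], [4, 5, 6])`, because the four partial products expand to $(p_1 + p_2)(q_1 + q_2)$.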

Number Theoretic Transforms

Transforms used in signal processing, such as the Fast Fourier Transform (FFT) or the Number Theoretic Transform (NTT) and their inverses, can be protected with our countermeasure. The NTT over an $n$-point sequence is performed using the well-known butterfly network, which operates over several layers/stages. The atomic operation within the NTT computation is the butterfly operation. A butterfly operation takes as inputs $(a, b) \in \mathbb{Z}_q^2$ and a twiddle constant $w$, and produces outputs $(c, d) \in \mathbb{Z}_q^2$. An NTT/INTT of size $n = 2^k$ typically consists of $k$ stages, with each stage containing $n/2$ butterfly operations. Figure 2 shows the data-flow graph of a butterfly-based NTT for an input sequence of length $n = 8$. All operations are linear, and thus the NTT/INTT can be viewed as a linear function.

Figure 2: Data flow graphs of a butterfly-based NTT for size $n = 8$ [75].
Lemma 1.

Let $G$ be an abstract finite group under the operation $\circ$, and let $x$ be an arbitrary value from the group. If $\tilde{r}$ is a uniform random value, then so is $x \circ \tilde{r}$.

Proof.

Consider the function $f(z) = x \circ z$. Since $G$ is a group, this is a one-to-one onto function. Thus, if $\tilde{r}$ is selected uniformly at random, then so is $f(\tilde{r})$. ∎

Consider such a transformation $T(x_1, \ldots, x_n)$ where the values $x_i$ are fixed-point numbers, i.e., two’s-complement arithmetic of some fixed size. Since the transformation is linear, $T(x_1 + \tilde{r}_1, \ldots, x_n + \tilde{r}_n) = T(x_1, \ldots, x_n) + T(\tilde{r}_1, \ldots, \tilde{r}_n)$. Thus, $T(x_1, \ldots, x_n) = T(x_1 + \tilde{r}_1, \ldots, x_n + \tilde{r}_n) - T(\tilde{r}_1, \ldots, \tilde{r}_n)$.

1 Choose $\tilde{r}_1, \ldots, \tilde{r}_n \in_{\mathbb{U}} \mathbb{Z}_q^2$
2 return $\text{NTT}(x_1 + \tilde{r}_1, \ldots, x_n + \tilde{r}_n) - \text{NTT}(\tilde{r}_1, \ldots, \tilde{r}_n)$
Algorithm 14: 2-secure NTT ($P, x_1, \ldots, x_n \in \mathbb{Z}_q^2$).

The key point here is that since fixed-point values are a group under addition, the value $x_i + \tilde{r}_i$ is a uniform random value by Lemma 1. Note that the function basis consists of just addition [50]. The countermeasure for NTT is given in Algorithm 14.
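The linearity argument behind Algorithm 14 can be checked with a toy transform. The sketch below uses a naive quadratic-time NTT rather than a butterfly network, with illustrative parameters $q = 17$, $n = 8$, and $\omega = 2$ (a primitive 8th root of unity modulo 17). These parameters are assumptions for the example and are unrelated to Kyber's.

```python
import secrets

Q = 17      # toy prime modulus (assumption)
OMEGA = 2   # primitive 8th root of unity modulo 17
N = 8       # transform size


def ntt(xs):
    """Unprotected program P: naive O(n^2) number theoretic transform."""
    return [
        sum(x * pow(OMEGA, i * j, Q) for j, x in enumerate(xs)) % Q
        for i in range(len(xs))
    ]


def secure_ntt(P, xs):
    """Sketch of the 2-secure NTT: NTT(x + r) - NTT(r) for a random mask r."""
    r = [secrets.randbelow(Q) for _ in xs]
    masked = [(x + ri) % Q for x, ri in zip(xs, r)]
    return [(a - b) % Q for a, b in zip(P(masked), P(r))]
```

For any length-8 input over $\mathbb{Z}_{17}$, `secure_ntt(ntt, xs)` equals `ntt(xs)`, while each call to `ntt` sees only a uniformly random vector.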

7 End-to-End Implementations

In this section, we introduce implementations of the RSA-CRT signature algorithm and Kyber’s key generation algorithm, detailing their existing vulnerabilities and how our methods protect against them.

Securing RSA-CRT Algorithm.

RSA is a cryptographic algorithm commonly used in digital signatures and SSL certificates. Because the security of RSA relies on the difficulty of factoring the product of two large prime numbers, which requires large moduli, RSA computations are relatively slow. Therefore, it is seldom used to encrypt the data directly.

For efficiency, many popular cryptographic libraries (e.g., OpenSSL) use RSA based on the Chinese remainder theorem (CRT) for decryption or signing messages. Algorithm 15 is the RSA-CRT signature generation algorithm. With the private key, we pre-calculate the values $d_p = d \bmod (p-1)$, $d_q = d \bmod (q-1)$, and $u = q^{-1} \bmod p$, then generate the intermediate values $s_p = m^{d_p} \bmod p$ and $s_q = m^{d_q} \bmod q$. Finally, we combine the two intermediate values $s_p$ and $s_q$ with Garner’s algorithm: $S = s_q + (((s_p - s_q) \cdot u) \bmod p) \cdot q$. RSA based on CRT is about four times faster than classical RSA.

Input: A message $M$ to sign, the private key $(p, q, d)$ with $p > q$, pre-calculated values $d_p = d \bmod (p-1)$, $d_q = d \bmod (q-1)$, and $u = q^{-1} \bmod p$.
Output: A valid signature $S$ for the message $M$.
1 $m \leftarrow$ Encode the message $M$ in $m \in \mathbb{Z}_N$
2 $s_p \leftarrow m^{d_p} \bmod p$    ▷ Protection with Algorithm 12
3 $s_q \leftarrow m^{d_q} \bmod q$    ▷ Protection with Algorithm 12
4 $t \leftarrow s_p - s_q$
5 if $t < 0$ then
6     $t \leftarrow t + p$
7 $S \leftarrow s_q + ((t \cdot u) \bmod p) \cdot q$
8 return $S$ as a signature for the message $M$
Algorithm 15: RSA-CRT Signature Generation Algorithm

However, using CRT to improve RSA efficiency makes RSA vulnerable. For instance, Aumüller et al. [6] describe a fault-based cryptanalysis of RSA-CRT in which the attacker intentionally induces a fault during the computation that changes $s_p$ to a faulty $\hat{s}_p$, obtains the faulty output $\hat{s}$, and factorizes $N$ using the equation $q = \gcd((\hat{s}^e - m) \bmod N, N)$ to recover the secret key. Sung-Ming et al. provide another equation that can factorize $N$ with a faulty signature in [97]. There are thus two scenarios in which the attacker can break RSA-CRT. If the attacker knows the message and the faulty output, they can factorize $N$ with the previous equation. If instead the attacker knows the correct and faulty signatures $s$ and $\hat{s}$, they can factorize $N$ with the equation $q = \gcd((\hat{s} - s) \bmod N, N)$.
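Both factoring equations can be reproduced on a toy key. The sketch below uses the small textbook parameters $p = 61$, $q = 53$, $e = 17$, $d = 2753$ as an illustrative assumption, and models the fault simply by replacing $s_p$ with a wrong value; it is not a model of any particular glitch. The modular inverse via `pow(q, -1, p)` requires Python 3.8 or later.

```python
import math


def rsa_crt_sign(m, p, q, d, fault_in_sp=False):
    """RSA-CRT signing as in Algorithm 15, optionally with a corrupted s_p."""
    dp, dq = d % (p - 1), d % (q - 1)
    u = pow(q, -1, p)              # q^{-1} mod p
    sp = pow(m, dp, p)
    sq = pow(m, dq, q)
    if fault_in_sp:
        sp = (sp + 1) % p          # hypothetical fault model: any wrong s_p
    t = (sp - sq) % p
    return sq + ((t * u) % p) * q  # Garner recombination


def factor_from_message(s_faulty, m, e, N):
    """Recover q from a faulty signature and the known message."""
    return math.gcd((pow(s_faulty, e, N) - m) % N, N)


def factor_from_pair(s_good, s_faulty, N):
    """Recover q from a correct and a faulty signature of the same message."""
    return math.gcd((s_faulty - s_good) % N, N)
```

With `m = 65`, a single faulty signature is enough for `factor_from_message` to return 53, which is why protecting the two exponentiations that produce $s_p$ and $s_q$ matters.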

We protect Line 2 and Line 3 of Algorithm 15 using the proposed countermeasure against the attack introduced in [6].

Securing Kyber Key Generation Algorithm.

The NIST standardization process for post-quantum cryptography [63] has finished its third round, and provided a list of new public key schemes for new standardization [3]. While implementation performance and theoretical security guarantees served as the main criteria in the initial rounds, resistance against side-channel attacks (SCA) and fault injection attacks (FIA) emerged as an important criterion in the final round, as also clearly stated by NIST at several instances [74].

1 $seed_A \leftarrow \text{Sample}_U()$
2 $seed_B \leftarrow \text{Sample}_U()$
3 $\hat{A} \leftarrow \text{NTT}(A)$
4 $s \leftarrow \text{Sample}_B(seed_B, coins_s)$
5 $e \leftarrow \text{Sample}_B(seed_B, coins_e)$
6 $\hat{s} \leftarrow \text{NTT}(s)$    ▷ Protection with Algorithm 14
7 $\hat{e} \leftarrow \text{NTT}(e)$
8 $\hat{t} \leftarrow \hat{A} \odot \hat{s} + \hat{e}$
9 return $pk = (seed_A, \hat{t}), sk = (\hat{s})$
Algorithm 16: CPA Secure Kyber PKE (CPA.KeyGen)

Lattice-based schemes such as Kyber typically operate over polynomials in polynomial rings, and notably, polynomial multiplication is one of the most computationally intensive operations in practical implementations of these schemes. Among the several known techniques for polynomial multiplication such as the schoolbook multiplier, Toom-Cook [93] and Karatsuba [46], the Number Theoretic Transform (NTT) based polynomial multiplication [30] is one of the most widely adopted techniques, owing to its superior run-time complexity. Over the years, there has been a sustained effort by the cryptographic community to improve the performance of NTT for lattice-based schemes on a wide range of hardware and software platforms [78, 67, 19, 2, 25]. As a result, the use of NTT for polynomial multiplication yields the fastest implementation for several lattice-based schemes. In particular, the NTT serves as a critical computational kernel used in Kyber [8] and Dilithium [54], which were selected as the first candidates for PQC standardization [75].

Figure 3: In the Kyber Key Generation algorithm, polynomial multiplication is done using the Number Theoretic Transform (NTT). The NTT is protected using the proposed countermeasure against the attack introduced in [75].

Figure 3 illustrates a recent fault injection attack [75] that exposes a significant vulnerability in NTT-based polynomial multiplication, allowing the zeroization of all twiddle constants through a single targeted fault. This vulnerability enables practical key/message recovery attacks on Kyber KEM and forgery attacks on Dilithium. Moreover, the proposed attacks are also shown to bypass most known fault countermeasures for lattice-based KEMs and signature schemes.

To safeguard polynomial multiplication, we can employ Algorithm 13 or protect individual NTT operations using Algorithm 14. In this paper, we focus on securing the NTT operation targeted by Ravi et al. [75] using Algorithm 14. Consequently, we reinforce Line 6 of Algorithm 16 with our proposed countermeasure against the attack delineated in [75].

8 Evaluation

We conducted three experimental sets to assess our countermeasure’s effectiveness against fault injection and power side-channel attacks. Initially, we evaluated protected operations individually, including modular multiplication, modular exponentiation, and NTT. Subsequently, we assessed our countermeasure’s robustness within RSA-CRT and Kyber key generation algorithms. Finally, we examined the latency overhead introduced by our countermeasure.

To capture power traces, we use an ATSAM4S-based target board. The SAM4S is a microcontroller based on the 32-bit ARM Cortex-M4 processor core, which is commonly used in embedded systems such as IoT devices. The specific target board comes with the ChipWhisperer Husky [64], which is the equipment that we used for power trace collection.

The voltage fault injection test bed is created using Riscure’s VC Glitcher product (https://www.riscure.com/products/vc-glitcher/), which generates an arbitrary voltage signal with a pulse resolution of 2 nanoseconds. We use a General Purpose Input Output (GPIO) signal to time the attack, which allows us to inject a glitch at the moment the target is executing the targeted code. The target’s reset signal is used to reset the target prior to each experiment to avoid data cross-contamination. All fault injection experiments are performed targeting an off-the-shelf development platform built around an STM32F407 MCU, which includes an ARM Cortex-M4 core running at 168 MHz. This Cortex-M4 based MCU has an instruction cache, a data cache, and a prefetch buffer.

Power Side-Channel Attack Evaluation.

In the power side-channel evaluation, we use the Hamming Weight leakage model and the Test Vector Leakage Assessment (TVLA) [40] to evaluate the effectiveness of our countermeasure. The instantaneous power consumption measurement corresponding to a single execution of the target algorithm is referred to as a power trace. Each power trace is therefore a vector of power samples, and the t-test has to be applied sample-wise. The resulting vector is referred to as a t-trace.
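
The sample-wise t-test described above can be sketched as follows, assuming Welch's t-statistic, which is the usual choice in TVLA. The synthetic traces, the helper names, and the |t| > 4.5 decision threshold are assumptions for illustration; this section does not state the threshold used in the experiments.

```python
import math
import random
import statistics


def t_trace(traces_a, traces_b):
    """Sample-wise Welch's t-statistic between two sets of power traces."""
    n_a, n_b = len(traces_a), len(traces_b)
    result = []
    # zip(*traces) iterates over sample positions (columns) across all traces.
    for col_a, col_b in zip(zip(*traces_a), zip(*traces_b)):
        mean_a, mean_b = statistics.fmean(col_a), statistics.fmean(col_b)
        var_a, var_b = statistics.variance(col_a), statistics.variance(col_b)
        result.append((mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b))
    return result


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible


def synthetic_traces(count, leak):
    """Hypothetical 3-sample traces; only sample 1 depends on `leak`."""
    return [[rng.gauss(0, 1), rng.gauss(leak, 1), rng.gauss(0, 1)]
            for _ in range(count)]


set_a = synthetic_traces(1000, leak=1.2)
set_b = synthetic_traces(1000, leak=0.4)
t = t_trace(set_a, set_b)
leaky_samples = [i for i, value in enumerate(t) if abs(value) > 4.5]
print(leaky_samples)
```

Only the sample position whose mean differs between the two sets should exceed the threshold; the other positions carry noise alone.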

To detect Points-of-Interest, we employ the Sum of Squared pairwise T-differences (SOST) [24] method, setting the threshold at 20% of the maximum. The t-test window size is uniformly set to $\pm 8$ for all operations. We define the power side-channel security parameter as $c=2$ in the $c$-secure countermeasure in Algorithm 2, applicable to all operations. In the mod operation and modular multiplication, the entire operation is targeted, while in modular exponentiation and NTT, attacks are focused on the constant-time Montgomery ladder [59, 51] modular exponentiation function. For TVLA analysis, two sets of test vectors were created: one with random numbers of Hamming weight 12 and another with a Hamming weight of 4, using 1000 random numbers for each. These vectors were used for evaluating both protected and unprotected cryptographic operations.
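
As a rough illustration of the Points-of-Interest selection, the sketch below computes a SOST score per sample position as the sum, over all pairs of classes, of the squared t-difference, and keeps positions scoring at least 20% of the maximum. The tiny hand-made traces and the function names are assumptions, and the $\pm 8$ window around each selected point is omitted for brevity.

```python
import itertools
import math
import statistics


def sost(groups):
    """SOST score per sample; `groups` is a list of trace sets, one per class."""
    n_samples = len(groups[0][0])
    scores = [0.0] * n_samples
    for g_i, g_j in itertools.combinations(groups, 2):
        for t in range(n_samples):
            col_i = [trace[t] for trace in g_i]
            col_j = [trace[t] for trace in g_j]
            denom = math.sqrt(statistics.variance(col_i) / len(col_i)
                              + statistics.variance(col_j) / len(col_j))
            diff = statistics.fmean(col_i) - statistics.fmean(col_j)
            scores[t] += (diff / denom) ** 2
    return scores


def points_of_interest(scores, fraction=0.2):
    """Indices whose score reaches `fraction` of the maximum score."""
    threshold = fraction * max(scores)
    return [t for t, s in enumerate(scores) if s >= threshold]


# Hypothetical traces for two classes; only sample 1 separates them.
group_a = [[1.0, 5.0, 2.0], [1.2, 5.2, 2.1], [0.8, 4.8, 1.9]]
group_b = [[1.1, 3.0, 2.0], [0.9, 3.2, 2.2], [1.0, 2.8, 1.8]]
poi = points_of_interest(sost([group_a, group_b]))
print(poi)
```

With only two classes, as in the Hamming weight 12 versus Hamming weight 4 setup, the SOST score reduces to the squared t-statistic at each sample.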

In our study, we also evaluated the distinguishability of total power consumption in modular operations and modular multiplication. For modular multiplication, we maintained one operand’s value constant while varying the other operand among numbers with different Hamming Weights. This approach enables a comparative analysis of power consumption patterns in modular operations, particularly between unprotected and protected versions, offering insights into how variations in Hamming Weight influence power consumption in these protected cryptographic operations.

Our evaluation indicates that the RSR countermeasure significantly reduced t-test results, bringing them into acceptable regions. For example, in the mod operation, the maximum t-test result decreased from 415.7 to 4.12, and for NTT, it dropped from 417.7 to 7.69. These results, which are detailed in Table 4, demonstrate an average reduction of two orders of magnitude, highlighting the effectiveness of the RSR countermeasure in enhancing the security of cryptographic operations against side-channel attacks.

Figure 4: Power Side-Channel Attack Evaluation t-tests. Panels: (a) Unprotected Mod Operation, (b) Protected Mod Operation, (c) Unprotected Mod. Mult., (d) Protected Mod. Mult., (e) Unprotected Mod. Exp., (f) Protected Mod. Exp., (g) Unprotected NTT, (h) Protected NTT.

Fault Injection Attack Evaluation.

In the fault injection attack evaluation, we inject faults intended to change the output and compare the expected output with the output obtained under fault. We set the fault injection security parameter to $n=10$ for the $n$-secure countermeasure (Algorithm 3) for all operations.
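
To make the role of the parameter $n$ concrete, the following sketch runs $n = 10$ independent randomized evaluations of modular multiplication and returns the majority result. Algorithm 3 is not reproduced in this section, so the additive-share reduction, the majority rule, and every name below are assumptions that illustrate the general idea rather than the authors' exact procedure.

```python
import secrets
from collections import Counter


def mod_mul(a, b, m):
    """Stand-in for the unprotected operation P."""
    return (a * b) % m


def mod_mul_rsr(a, b, m):
    """One randomized evaluation using random additive shares of a and b."""
    a1 = secrets.randbelow(m)
    a2 = (a - a1) % m          # a = a1 + a2 (mod m)
    b1 = secrets.randbelow(m)
    b2 = (b - b1) % m          # b = b1 + b2 (mod m)
    # (a1 + a2)(b1 + b2) expands into four calls on randomized operands.
    return (mod_mul(a1, b1, m) + mod_mul(a1, b2, m)
            + mod_mul(a2, b1, m) + mod_mul(a2, b2, m)) % m


def mod_mul_protected(a, b, m, n=10):
    """Repeat the randomized evaluation n times and take the majority."""
    votes = Counter(mod_mul_rsr(a, b, m) for _ in range(n))
    value, count = votes.most_common(1)[0]
    if count <= n // 2:
        raise RuntimeError("no majority among results: possible fault")
    return value


assert mod_mul_protected(1234, 5678, 3329) == (1234 * 5678) % 3329
```

A fault that corrupts a minority of the $n$ evaluations is outvoted, so raising $n$ trades latency for tolerance to more injected faults.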

Figure 5: Fault Injection Attack Evaluation Heatmaps. Panels: (a) Unprotected Mod. Mult., (b) Protected Mod. Mult., (c) Unprotected Mod. Exp., (d) Protected Mod. Exp., (e) Unprotected Poly. Mult., (f) Protected Poly. Mult., (g) Unprotected NTT, (h) Protected NTT, (i) Unprotected RSA-CRT, (j) Protected RSA-CRT, (k) Unprotected Kyber Key Gen., (l) Protected Kyber Key Gen.
TABLE III: Reduction in Faults for Different Operations

| Operation | Unprotected | Protected | Reduction |
|---|---|---|---|
| Mod. exponentiation | 165 | 9 | 94.55% |
| Mod. multiplication | 168 | 1 | 99.4% |
| NTT | 63 | 5 | 92.06% |
| Poly. multiplication | 196 | 14 | 92.86% |
| RSA-CRT | 168 | 7 | 95.83% |
| Kyber Key Gen. | 172 | 4 | 97.67% |
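
The reduction column and the average quoted in the text can be recomputed from the fault counts in the table. The short script below does so, assuming that reduction is defined as (unprotected - protected) / unprotected and that the average is the unweighted mean over the six rows.

```python
# Fault counts (unprotected, protected) taken from the table above.
faults = {
    "Mod. exponentiation": (165, 9),
    "Mod. multiplication": (168, 1),
    "NTT": (63, 5),
    "Poly. multiplication": (196, 14),
    "RSA-CRT": (168, 7),
    "Kyber Key Gen.": (172, 4),
}

# Relative reduction in percent for each operation.
reductions = {op: 100 * (unprot - prot) / unprot
              for op, (unprot, prot) in faults.items()}

# Unweighted mean across operations.
average = sum(reductions.values()) / len(reductions)

for op, value in reductions.items():
    print(f"{op:22s} {value:6.2f}%")
print(f"{'Average':22s} {average:6.1f}%")
```

Under these assumptions the per-row values match the table and the mean rounds to 95.4%.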

Figure 5 presents the results of our fault attack experiment. We employed voltage glitches for the fault injection attacks. The Glitch Offset is the time between when the trigger is observed and when the glitch is injected. The Glitch Length is the time for which the Glitch Voltage is set. Glitch Offset and Glitch Length are the two parameters that we varied to inject faults; they correspond to the start time and duration of the glitch, respectively. For each combination of start time and duration, we executed each target function five times, resulting in a total of 1280 test data points for each function. We used heatmaps to illustrate the ratio of faulty to correct outputs. This experiment yielded three types of outputs: faulty, correct, and board reset. In the heatmaps, colors closer to red indicate a higher likelihood of voltage glitches causing faulty outputs (red = 100%), whereas blue signifies a lower likelihood (blue = 0%). Green indicates instances where all test outputs resulted in the board being reset. We treated any output that is not the correct output as a fault. This is very conservative, as some of these outputs may not be effective faults. In these heatmaps, the unprotected functions exhibit a significantly higher number of red dots, indicating more faults. Furthermore, Table III shows the reduction of faults in the target functions, with our protection method eliminating approximately 95.4% of faulty outputs on average, and up to 99.4% in modular multiplication. Collectively, these results affirm the effectiveness of our protection method in safeguarding the functions.

We observed that fault injection sometimes breaks memory allocations (malloc) without causing the target to crash. This is because the target is not designed to handle such faults. We believe that this is a potential avenue for future work, as it may lead to new types of attacks. We simply reset the target in such cases, as we are not interested in the results of these attacks. However, in some cases, the fault silently propagates to the next operation without causing a crash, and the target continues to operate. We registered these cases as successful attacks in the heatmaps.

We additionally protected the fault-injection countermeasure method (Algorithm 3) using classical techniques. After exiting the loop, the code verifies that the loop completed successfully. If not, the code resets the target. This is a simple and effective way to protect the countermeasure itself from fault injection attacks. This led to a reduction in faults of 4.56% on average.
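
The post-loop check can be pictured as below. This is only a schematic under stated assumptions: the real countermeasure runs as firmware on the target and resets the board, whereas here a hypothetical `FaultDetected` exception stands in for the reset, and `run_all_trials` and `trial` are illustrative names.

```python
class FaultDetected(Exception):
    """Stand-in for resetting the target when the loop was disturbed."""


def run_all_trials(trial, n):
    """Run `trial` exactly n times and verify afterwards that the loop finished."""
    results = []
    i = 0
    while i < n:
        results.append(trial())
        i += 1
    # Post-loop check: a glitch that exits the loop early or skips iterations
    # would leave the counter or the number of collected results short of n.
    if i != n or len(results) != n:
        raise FaultDetected("loop did not complete all iterations")
    return results


outputs = run_all_trials(lambda: 7, 10)
assert outputs == [7] * 10
```

In fault-free Python execution the check never fires; its value lies in hardware settings where a glitch can corrupt the loop counter or skip the branch that continues the loop.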

9 Limitations

Our study presents a novel software-based countermeasure against physical attacks such as power side-channel and fault-injection attacks, utilizing the concept of random self-reducibility and instance hiding for number theoretic operations. While our approach offers significant advantages over traditional methods, there are several inherent limitations. Firstly, the countermeasure’s effectiveness is intrinsically linked to the random self-reducibility of the function being protected. This dependency means that our approach may not be universally applicable to all cryptographic operations. Secondly, redundancy and randomness inevitably introduce computational overhead. Nevertheless, each call to the original function $P$ can be easily parallelized in hardware or vectorized software implementations. This parallelization can potentially increase the noise, and we identify this as an avenue for future work. Finally, our approach is not tailored to defend against attacks targeting the random number generator itself. Nevertheless, there are also simple duplication-based techniques to protect random number generators from physical attacks. For instance, one such technique involves comparing two successive random numbers to determine whether they are identical, as discussed in the work of Ravi et al. [72].

10 Related Work

To the best of our knowledge, this work is the first to apply random self-reducibility to protect against physical attacks. However, random self-reducibility has been used in several other areas of computer science, including cryptography protocols [41, 17], average-case complexity [37], instance hiding schemes [1, 11, 12], result checkers [16, 53] and interactive proof systems [15, 42, 53, 83].

Fault Injection and Side-Channel Analysis attacks are a risk for microcontrollers operating in a hostile environment where attackers have physical access to the target. These attacks can break cryptographic algorithms and recover secrets, e.g., by changing the control flow of the program (FI) or by monitoring the device’s power consumption (SCA), leaving little or no evidence [27].

Multiple countermeasures such as random delays [26], masking [68], infection [44], data redundancy checks [55, 57] and instruction redundancy [10] have been proposed to tackle these threats, yet their impact, effectiveness and potential interactions remain open for investigation. Therefore, our work aims to provide a new countermeasure to mitigate these attacks by combining a power side-channel countermeasure with a fault injection countermeasure [27].

As noted in the introduction, in a fault injection attack the attacker intentionally induces a fault, obtains the faulty output, and analyzes it together with the correct output to recover the secret. The attack therefore depends on access to the faulty output, and reducing the probability that the attacker obtains it mitigates the attack. There are two intuitive ways to do so: adding a check operation at the software level, or adding a protection device at the hardware level. The best-known software example is Shamir’s countermeasure [84], which adds a check operation before outputting the signature to prevent fault injection attacks on RSA-CRT. Although it has been shown that an attacker can bypass the check operation in Shamir’s countermeasure and still obtain faulty outputs, it remains a useful concept for mitigating fault injection attacks. At the hardware level, protection devices can block specific fault sources such as EM pulses, voltage glitches, or lasers. For example, a surge protector reduces the impact of an EM pulse, and beam stops can prevent laser attacks. Note that such a physical device mitigates only one particular fault injection method and cannot fully protect the device against all types of fault injection attacks.

11 Conclusion

In this work, we show that if a cryptographic operation has a random self-reducible property, then it is possible to protect it against physical attacks such as power side-channel and fault-injection attacks with configurable security. We have demonstrated the effectiveness of our method through empirical evaluation across critical cryptographic operations including modular exponentiation, modular multiplication, polynomial multiplication, and number theoretic transforms (NTT). Moreover, we have showcased end-to-end implementations of our method within two public key cryptosystems, the RSA-CRT signature algorithm and Kyber Key Generation, to show the practicality and effectiveness of our approach. The countermeasure reduced the power side-channel leakage by two orders of magnitude, to an acceptably secure level in TVLA analysis. For fault injection, the countermeasure reduces the number of faults by 95.4% on average. Although the countermeasures were introduced as software-based, they can be more efficiently implemented in hardware, particularly on FPGAs. Each call to $P$ can be parallelized in hardware, potentially increasing the noise. We identify this as an avenue for future work.

References

  • [1] Martin Abadi, Joan Feigenbaum, and Joe Kilian. On hiding information from an oracle. In Proceedings of the nineteenth annual ACM symposium on Theory of computing, pages 195–203, 1987.
  • [2] Amin Abdulrahman, Jiun-Peng Chen, Yu-Jia Chen, Vincent Hwang, Matthias J Kannwischer, and Bo-Yin Yang. Multi-moduli ntts for saber on cortex-m3 and cortex-m4. Cryptology ePrint Archive, 2021.
  • [3] Gorjan Alagic, Daniel Apon, David Cooper, Quynh Dang, Thinh Dang, John Kelsey, Jacob Lichtinger, Carl Miller, Dustin Moody, Rene Peralta, et al. Status report on the third round of the nist post-quantum cryptography standardization process. US Department of Commerce, NIST, 2022.
  • [4] Dorian Amiet, Andreas Curiger, Lukas Leuenberger, and Paul Zbinden. Defeating newhope with a single trace. Cryptology ePrint Archive, Report 2020/368, 2020. https://ia.cr/2020/368.
  • [5] Amund Askeland and Sondre Rønjom. A side-channel assisted attack on ntru. Cryptology ePrint Archive, Report 2021/790, 2021. https://ia.cr/2021/790.
  • [6] Christian Aumüller, Peter Bier, Wieland Fischer, Peter Hofreiter, and J-P Seifert. Fault attacks on rsa with crt: Concrete results and practical countermeasures. In Cryptographic Hardware and Embedded Systems (CHES), pages 260–275. Springer, 2003.
  • [7] Roberto Avanzi, Joppe Bos, Léo Ducas, Eike Kiltz, Tancrède Lepoint, Vadim Lyubashevsky, John M Schanck, Peter Schwabe, Gregor Seiler, and Damien Stehlé. Crystals-kyber algorithm specifications and supporting documentation. Technical report, NIST PQC Round, 2017.
  • [8] Roberto Avanzi, Joppe W. Bos, Leo Ducas, Eike Kiltz, Tancrede Lepoint, Vadim Lyubashevsky, John Schanck, Peter Schwabe, Gregor Seiler, and Damien Stehlé. CRYSTALS-Kyber (version 3.0): Algorithm specifications and supporting documentation. Technical report, Submission to the NIST post-quantum project, October 1, 2020.
  • [9] Alessandro Barenghi, Luca Breveglieri, Israel Koren, and David Naccache. Fault injection attacks on cryptographic devices: Theory, practice, and countermeasures. Proceedings of the IEEE, 100(11):3056–3076, 2012.
  • [10] Alessandro Barenghi, Luca Breveglieri, Israel Koren, Gerardo Pelosi, and Francesco Regazzoni. Countermeasures against fault attacks on software implemented aes: effectiveness and cost. In Proceedings of the 5th Workshop on Embedded Systems Security, pages 1–10, 2010.
  • [11] Donald Beaver and Joan Feigenbaum. Hiding instances in multioracle queries. In Annual Symposium on Theoretical Aspects of Computer Science, pages 37–48. Springer, 1990.
  • [12] Donald Beaver, Joan Feigenbaum, Joe Kilian, and Phillip Rogaway. Security with low communication overhead. In Advances in Cryptology-CRYPTO’90: Proceedings 10, pages 62–76. Springer, 1991.
  • [13] Daniel J Bernstein and Bo-Yin Yang. Fast constant-time gcd computation and modular inversion. IACR Transactions on Cryptographic Hardware and Embedded Systems, pages 340–398, 2019.
  • [14] Shivam Bhasin, Jan-Pieter D’Anvers, Daniel Heinz, Thomas Pöppelmann, and Michiel Van Beirendonck. Attacking and defending masked polynomial comparison for lattice-based cryptography. Cryptology ePrint Archive, Report 2021/104, 2021. https://ia.cr/2021/104.
  • [15] Manuel Blum and Sampath Kannan. Designing programs that check their work. Journal of the ACM (JACM), 42(1):269–291, 1995.
  • [16] Manuel Blum, Michael Luby, and Ronitt Rubinfeld. Self-testing/correcting with applications to numerical problems. In Proceedings of the twenty-second annual ACM symposium on Theory of computing, pages 73–83, 1990.
  • [17] Manuel Blum and Silvio Micali. How to generate cryptographically strong sequences of pseudo random bits. In Providing Sound Foundations for Cryptography: On the Work of Shafi Goldwasser and Silvio Micali, pages 227–240. 2019.
  • [18] Dan Boneh, Richard A DeMillo, and Richard J Lipton. On the importance of checking cryptographic protocols for faults. In International conference on the theory and applications of cryptographic techniques, pages 37–51. Springer, 1997.
  • [19] Leon Botros, Matthias J Kannwischer, and Peter Schwabe. Memory-efficient high-speed implementation of kyber on cortex-m4. In International Conference on Cryptology in Africa, pages 209–228. Springer, 2019.
  • [20] Robert S Boyer and J Strother Moore. Mjrty: A fast majority vote algorithm. Automated reasoning: essays in honor of Woody Bledsoe, 1:105–117, 1991.
  • [21] Claudio Bozzato, Riccardo Focardi, and Francesco Palmarini. Shaping the glitch: optimizing voltage fault injection attacks. IACR transactions on cryptographic hardware and embedded systems, pages 199–224, 2019.
  • [22] Eric Brier, Christophe Clavier, and Francis Olivier. Correlation power analysis with a leakage model. In International workshop on cryptographic hardware and embedded systems, pages 16–29. Springer, 2004.
  • [23] Anantha P Chandrakasan and Robert W Brodersen. Minimizing power consumption in digital cmos circuits. Proceedings of the IEEE, 83(4):498–523, 1995.
  • [24] Suresh Chari, Josyula R Rao, and Pankaj Rohatgi. Template attacks. In International Workshop on Cryptographic Hardware and Embedded Systems, pages 13–28. Springer, 2002.
  • [25] Chi-Ming Marvin Chung, Vincent Hwang, Matthias J Kannwischer, Gregor Seiler, Cheng-Jhih Shih, and Bo-Yin Yang. Ntt multiplication for ntt-unfriendly rings: New speed records for saber and ntru on cortex-m4 and avx2. IACR Transactions on Cryptographic Hardware and Embedded Systems, pages 159–188, 2021.
  • [26] Christophe Clavier, Jean-Sébastien Coron, and Nora Dabbous. Differential power analysis in the presence of hardware countermeasures. In Cryptographic Hardware and Embedded Systems–CHES 2000: Second International Workshop Worcester, MA, USA, August 17–18, 2000 Proceedings 2, pages 252–263. Springer, 2000.
  • [27] Lucian Cojocar, Kostas Papagiannopoulos, and Niek Timmers. Instruction duplication: Leaky and not too fault-tolerant! Cryptology ePrint Archive, Paper 2017/1082, 2017. https://eprint.iacr.org/2017/1082.
  • [28] Lucian Cojocar, Kostas Papagiannopoulos, and Niek Timmers. Instruction duplication: Leaky and not too fault-tolerant! In Smart Card Research and Advanced Applications: 16th International Conference, CARDIS 2017, Lugano, Switzerland, November 13–15, 2017, Revised Selected Papers, pages 160–179. Springer, 2018.
  • [29] Wikipedia contributors. Chernoff bound – Wikipedia, the free encyclopedia, 2023. [Online; accessed 7-September-2023].
  • [30] James W Cooley and John W Tukey. An algorithm for the machine calculation of complex fourier series. Mathematics of computation, 19(90):297–301, 1965.
  • [31] Ivan Damgård, Valerio Pastro, Nigel Smart, and Sarah Zakarias. Multiparty computation from somewhat homomorphic encryption. In Annual Cryptology Conference, pages 643–662. Springer, 2012.
  • [32] Sanjay Deshpande, Santos Merino Del Pozo, Victor Mateu, Marc Manzano, Najwa Aaraj, and Jakub Szefer. Modular inverse for integers using fast constant time gcd algorithm and its applications. In Field-Programmable Logic and Applications (FPL), pages 122–129, 2021.
  • [33] Léo Ducas, Eike Kiltz, Tancrede Lepoint, Vadim Lyubashevsky, Peter Schwabe, Gregor Seiler, and Damien Stehlé. Crystals-dilithium: A lattice-based digital signature scheme. IACR Transactions on Cryptographic Hardware and Embedded Systems, pages 238–268, 2018.
  • [34] Richard Durstenfeld. Algorithm 235: random permutation. Communications of the ACM, 7(7):420, 1964.
  • [35] Taher ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. IEEE transactions on information theory, 31(4):469–472, 1985.
  • [36] Sho Endo, Takeshi Sugawara, Naofumi Homma, Takafumi Aoki, and Akashi Satoh. An on-chip glitchy-clock generator for testing fault injection attacks. Journal of Cryptographic Engineering, 1:265–270, 2011.
  • [37] Joan Feigenbaum and Lance Fortnow. Random-self-reducibility of complete sets. SIAM Journal on Computing, 22(5):994–1005, 1993.
  • [38] Ronald Aylmer Fisher, Frank Yates, et al. Statistical tables for biological, agricultural and medical research, edited by ra fisher and f. yates. Edinburgh: Oliver and Boyd, 1963.
  • [39] Aymeric Genêt, Natacha Linard de Guertechin, and Novak Kaluđerović. Full key recovery side-channel attack against ephemeral sike on the cortex-m4. Cryptology ePrint Archive, Report 2021/858, 2021. https://ia.cr/2021/858.
  • [40] Gilbert Goodwill, Benjamin Jun, Josh Jaffe, Pankaj Rohatgi, et al. A testing methodology for side-channel resistance validation. In NIST non-invasive attack testing workshop, volume 7, pages 115–136, 2011.
  • [41] Shafi Goldwasser and Silvio Micali. Probabilistic encryption & how to play mental poker keeping secret all partial information. In Providing sound foundations for cryptography: on the work of Shafi Goldwasser and Silvio Micali, pages 173–201. 2019.
  • [42] Shafi Goldwasser, Silvio Micali, and Charles Rackoff. The knowledge complexity of interactive proof-systems. In Providing sound foundations for cryptography: On the work of shafi goldwasser and silvio micali, pages 203–225. 2019.
  • [43] Daniel Heinz and Thomas Pöppelmann. Combined fault and dpa protection for lattice-based cryptography. Cryptology ePrint Archive, Report 2021/101, 2021. https://ia.cr/2021/101.
  • [44] Marc Joye, Pascal Manet, and Jean-Baptiste Rigaud. Strengthening hardware aes implementations against fault attacks. IET Inf. Secur., 1(3):106–110, 2007.
  • [45] Emre Karabulut, Erdem Alkim, and Aydin Aysu. Single-trace side-channel attacks on $\omega$-small polynomial sampling: With applications to ntru, ntru prime, and crystals-dilithium. In 2021 IEEE International Symposium on Hardware Oriented Security and Trust (HOST), pages 35–45. IEEE, 2021. https://ia.cr/2022/494.
  • [46] Anatolii Karatsuba. Multiplication of multidigit numbers on automata. In Soviet physics doklady, volume 7, pages 595–596, 1963.
  • [47] Paul Kocher, Joshua Jaffe, Benjamin Jun, and Pankaj Rohatgi. Introduction to differential power analysis. Journal of Cryptographic Engineering, 1:5–27, 2011.
  • [48] Thomas Korak and Michael Hoefler. On the effects of clock and power supply tampering on two microcontroller platforms. In 2014 Workshop on Fault Diagnosis and Tolerance in Cryptography, pages 8–17. IEEE, 2014.
  • [49] Moritz Lipp, Andreas Kogler, David Oswald, Michael Schwarz, Catherine Easdon, Claudio Canella, and Daniel Gruss. Platypus: Software-based power side-channel attacks on x86. In IEEE Symposium on Security and Privacy (SP), 2021.
  • [50] Richard Lipton. New directions in testing. Distributed computing and cryptography, 2:191–202, 1991.
  • [51] Zhe Liu, Johann Großschädl, and Ilya Kizhvatov. Efficient and side-channel resistant rsa implementation for 8-bit avr microcontrollers. In Workshop on the Security of the Internet of Things-SOCIOT, volume 10, 2010.
  • [52] Victor Lomné, Thomas Roche, and Adrian Thillard. On the need of randomness in fault attack countermeasures-application to aes. In 2012 Workshop on Fault Diagnosis and Tolerance in Cryptography, pages 85–94. IEEE, 2012.
  • [53] Carsten Lund, Lance Fortnow, Howard Karloff, and Noam Nisan. Algebraic methods for interactive proof systems. Journal of the ACM (JACM), 39(4):859–868, 1992.
  • [54] Vadim Lyubashevsky, Léo Ducas, Eike Kiltz, Tancrède Lepoint, Peter Schwabe, Gregor Seiler, Damien Stehlé, and Shi Bai. Crystals-dilithium. Submission to the NIST Post-Quantum Cryptography Standardization, 2017.
  • [55] Paolo Maistri and Régis Leveugle. Double-data-rate computation as a countermeasure against fault analysis. IEEE Transactions on Computers, 57(11):1528–1539, 2008.
  • [56] Luke Mather, Elisabeth Oswald, Joe Bandenburg, and Marcin Wojcik. Does my device leak information? an a priori statistical power analysis of leakage detection tests. Cryptology ePrint Archive, Report 2013/298, 2013. https://ia.cr/2013/298.
  • [57] Marcel Medwed and Jörn-Marc Schmidt. A generic fault countermeasure providing data and program flow integrity. In 2008 5th Workshop on Fault Diagnosis and Tolerance in Cryptography, pages 68–73. IEEE, 2008.
  • [58] Payman Mohassel and Peter Rindal. Aby3: A mixed protocol framework for machine learning. In Proceedings of the 2018 ACM SIGSAC conference on computer and communications security, pages 35–52, 2018.
  • [59] Peter L Montgomery. Modular multiplication without trial division. Mathematics of computation, 44(170):519–521, 1985.
  • [60] Nicolas Moro, Karine Heydemann, Amine Dehbaoui, Bruno Robisson, and Emmanuelle Encrenaz. Experimental evaluation of two software countermeasures against fault attacks. In Hardware-Oriented Security and Trust (HOST), pages 112–117, 2014.
  • [61] Nicolas Moro, Karine Heydemann, Emmanuelle Encrenaz, and Bruno Robisson. Formal verification of a software countermeasure against instruction skip attacks. Journal of Cryptographic Engineering, 4:145–156, 2014.
  • [62] Koksal Mus, Yarkın Doröz, M Caner Tol, Kristi Rahman, and Berk Sunar. Jolt: Recovering tls signing keys via rowhammer faults. In 2023 IEEE Symposium on Security and Privacy (SP), pages 1719–1736. IEEE, 2023.
  • [63] NIST. Submission requirements and evaluation criteria for the post-quantum cryptography standardization process, 2016.
  • [64] Colin O’Flynn and Zhizhang (David) Chen. Chipwhisperer: An open-source platform for hardware embedded security research. Cryptology ePrint Archive, Report 2014/204, 2014. https://ia.cr/2014/204.
  • [65] Conor Patrick, Bilgiday Yuce, Nahid Farhady Ghalaty, and Patrick Schaumont. Lightweight fault attack resistance in software using intra-instruction redundancy. In Selected Areas in Cryptography–SAC 2016: 23rd International Conference, pages 231–244. Springer, 2017.
  • [66] Gilles Piret and Jean-Jacques Quisquater. A differential fault attack technique against spn structures, with application to the aes and khazad. In Cryptographic Hardware and Embedded Systems-CHES 2003: 5th International Workshop, Cologne, Germany, September 8–10, 2003. Proceedings 5, pages 77–88. Springer, 2003.
  • [67] Thomas Pöppelmann, Tobias Oder, and Tim Güneysu. High-performance ideal lattice-based cryptography on 8-bit atxmega microcontrollers. In International conference on cryptology and information security in Latin America, pages 346–365. Springer, 2015.
  • [68] Emmanuel Prouff and Matthieu Rivain. Masking against side-channel attacks: A formal security proof. In Annual International Conference on the Theory and Applications of Cryptographic Techniques, pages 142–159. Springer, 2013.
  • [69] Mark Randolph and William Diehl. Power side-channel attack analysis: A review of 20 years of study for the layman. Cryptography, 4(2), 2020.
  • [70] Prasanna Ravi, Shivam Bhasin, Sujoy Sinha Roy, and Anupam Chattopadhyay. Drop by drop you break the rock - exploiting generic vulnerabilities in lattice-based pke/kems using em-based physical attacks. Cryptology ePrint Archive, Report 2020/549, 2020. https://ia.cr/2020/549.
  • [71] Prasanna Ravi, Shivam Bhasin, Sujoy Sinha Roy, and Anupam Chattopadhyay. On exploiting message leakage in (few) nist pqc candidates for practical message recovery and key recovery attacks. Cryptology ePrint Archive, Report 2020/1559, 2020. https://ia.cr/2020/1559.
  • [72] Prasanna Ravi, Anupam Chattopadhyay, Jan Pieter D’Anvers, and Anubhab Baksi. Side-channel and fault-injection attacks over lattice-based post-quantum schemes (kyber, dilithium): Survey and new results. ACM Transactions on Embedded Computing Systems, 2022.
  • [73] Prasanna Ravi, Mahabir Prasad Jhanwar, James Howe, Anupam Chattopadhyay, and Shivam Bhasin. Exploiting determinism in lattice-based signatures: practical fault attacks on pqm4 implementations of nist candidates. In Proceedings of the 2019 ACM Asia Conference on Computer and Communications Security, pages 427–440, 2019.
  • [74] Prasanna Ravi and Sujoy Sinha Roy. Side-channel analysis of lattice-based pqc candidates. Round 3 Seminars, NIST Post Quantum Cryptography, 2021.
  • [75] Prasanna Ravi, Bolin Yang, Shivam Bhasin, Fan Zhang, and Anupam Chattopadhyay. Fiddling the twiddle constants-fault injection analysis of the number theoretic transform. IACR Transactions on Cryptographic Hardware and Embedded Systems, pages 447–481, 2023.
  • [76] Oscar Reparaz, Benedikt Gierlichs, and Ingrid Verbauwhede. Fast leakage assessment. Cryptology ePrint Archive, Report 2017/624, 2017. https://ia.cr/2017/624.
  • [77] Ronald L Rivest, Adi Shamir, and Leonard Adleman. A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2):120–126, 1978.
  • [78] Sujoy Sinha Roy, Frederik Vercauteren, Nele Mentens, Donald Donglong Chen, and Ingrid Verbauwhede. Compact ring-lwe cryptoprocessor. In Cryptographic Hardware and Embedded Systems–CHES 2014: 16th International Workshop, Busan, South Korea, September 23-26, 2014. Proceedings 16, pages 371–391. Springer, 2014.
  • [79] Ronitt Rubinfeld. Robust functional equations with applications to self-testing/correcting. Technical report, Cornell University, 1994.
  • [80] Jörn-Marc Schmidt and Michael Hutter. Optical and em fault-attacks on crt-based rsa: Concrete results. na, 2007.
  • [81] Tobias Schneider and Amir Moradi. Leakage assessment methodology - a clear roadmap for side-channel evaluations. Cryptology ePrint Archive, Report 2015/207, 2015. https://ia.cr/2015/207.
  • [82] Bodo Selmke, Johann Heyszl, and Georg Sigl. Attack on a dfa protected aes by simultaneous laser fault injections. In 2016 Workshop on Fault Diagnosis and Tolerance in Cryptography (FDTC), pages 36–46. IEEE, 2016.
  • [83] Adi Shamir. Ip= pspace. Journal of the ACM (JACM), 39(4):869–877, 1992.
  • [84] Adi Shamir. Method and apparatus for protecting public key schemes from timing and fault attacks, November 23 1999. US Patent 5,991,415.
  • [85] Bo-Yeon Sim, Jihoon Kwon, Joohee Lee, Il-Ju Kim, Taeho Lee, Jaeseung Han, Hyojin Yoon, Jihoon Cho, and Dong-Guk Han. Single-trace attacks on the message encoding of lattice-based kems. Cryptology ePrint Archive, Report 2020/992, 2020. https://ia.cr/2020/992.
  • [86] Bo-Yeon Sim, Aesun Park, and Dong-Guk Han. Chosen-ciphertext clustering attack on crystals-kyber using the side-channel leakage of barrett reduction. Cryptology ePrint Archive, Report 2021/874, 2021. https://ia.cr/2021/874.
  • [87] Alistair Sinclair. Class notes for the course "randomness and computation". http://www.cs.berkeley.edu/~sinclair/cs271/n13.pdf, 2011.
  • [88] Petr Socha, Vojtěch Miškovský, and Martin Novotný. A comprehensive survey on the non-invasive passive side-channel analysis. Sensors, 22(21):8096, 2022.
  • [89] Raphael Spreitzer, Veelasha Moonsamy, Thomas Korak, and Stefan Mangard. Systematic classification of side-channel attacks: A case study for mobile devices. IEEE communications surveys & tutorials, 20(1):465–488, 2017.
  • [90] François-Xavier Standaert. How (not) to use welch’s t-test in side-channel security evaluations. Cryptology ePrint Archive, Report 2017/138, 2017. https://ia.cr/2017/138.
  • [91] Hauke Malte Steffen, Lucie Johanna Kogelheide, and Timo Bartkewitz. In-depth analysis of side-channel countermeasures for crystals-kyber message encoding on arm cortex-m4. Cryptology ePrint Archive, Report 2021/1307, 2021. https://ia.cr/2021/1307.
  • [92] Niek Timmers, Albert Spruyt, and Marc Witteman. Controlling pc on arm using fault injection. In 2016 Workshop on Fault Diagnosis and Tolerance in Cryptography (FDTC), pages 25–35. IEEE, 2016.
  • [93] Andrei L Toom. The complexity of a scheme of functional elements simulating the multiplication of integers. In Doklady Akademii Nauk, volume 150, pages 496–498. Russian Academy of Sciences, 1963.
  • [94] Yingchen Wang, Riccardo Paccagnella, Elizabeth Tang He, Hovav Shacham, Christopher W Fletcher, and David Kohlbrenner. Hertzbleed: Turning power side-channel attacks into remote timing attacks on x86. In 31st USENIX Security Symposium (USENIX Security 22), pages 679–697, 2022.
  • [95] André Weil. Basic number theory, volume 144. Springer Science & Business Media, 2013.
  • [96] Wenjie Xiong, Liu Ke, Dimitrije Jankov, Michael Kounavis, Xiaochen Wang, Eric Northup, Jie Amy Yang, Bilge Acun, Carole-Jean Wu, Ping Tak Peter Tang, et al. Secndp: Secure near-data processing with untrusted memory. In 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pages 244–258. IEEE, 2022.
  • [97] Sung-Ming Yen, Sangjae Moon, and Jae-Cheol Ha. Hardware fault attack on rsa with crt revisited. In Information Security and Cryptology–ICISC 2002: 5th International Conference Seoul, Korea, November 28–29, 2002 Revised Papers 5, pages 374–388. Springer, 2003.
  • [98] Bilgiday Yuce, Nahid Farhady Ghalaty, and Patrick Schaumont. Improving fault attacks on embedded software using risc pipeline characterization. In 2015 Workshop on Fault Diagnosis and Tolerance in Cryptography (FDTC), pages 97–108. IEEE, 2015.
  • [99] Loic Zussa, Jean-Max Dutertre, Jessy Clediere, and Assia Tria. Power supply glitch induced faults on fpga: An in-depth analysis of the injection mechanism. In 2013 IEEE 19th International On-Line Testing Symposium (IOLTS), pages 110–115. IEEE, 2013.

Appendix A 2PC Protocol

Function fast_gcd($f, g$):
      $d \leftarrow \max(f.\text{nbits}(),\ g.\text{nbits}())$
      if $d < 46$ then $m \leftarrow \left\lfloor \frac{49d + 80}{17} \right\rfloor$
      else $m \leftarrow \left\lfloor \frac{49d + 57}{17} \right\rfloor$
      $precomp \leftarrow \text{Integers}(f)\left((f + 1)/2\right)^{m-1}$
      $v, r, \delta \leftarrow 0, 1, 1$
      for $n \leftarrow 0$ to $m$ do
            if $\delta > 0$ and $g \mathbin{\&} 1 = 1$ then
                  $\delta, f, g, v, r \leftarrow -\delta, g, -f, r, -v$
            end if
            $g_0 \leftarrow g \mathbin{\&} 1$
            $\delta, g, r \leftarrow 1 + \delta,\ \frac{g + g_0 \cdot f}{2},\ \frac{r + g_0 \cdot v}{2}$
            $g \leftarrow \text{ZZ}(g)$
      end for
      $inverse \leftarrow \text{ZZ}(\text{sign}(f) \cdot \text{ZZ}(v \cdot 2^{m-1}) \cdot precomp)$
      return $inverse$
Algorithm 17 Fast GCD ($f, g$)
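Algorithm 17 is written with Sage-style helpers (`Integers`, `ZZ`, `nbits`). The following is a minimal plain-Python sketch of the same steps, not the authors' implementation. It assumes that $f$ is a positive odd modulus, that the returned value is meant to be the inverse of $g$ modulo $f$, and that the loop runs exactly $m$ times. The use of `fractions.Fraction` for $v$ and $r$ and the final invertibility check are additions made for illustration.

```python
from fractions import Fraction


def fast_gcd(f, g):
    """Sketch of Algorithm 17: inverse of g modulo an odd, positive f."""
    if f <= 0 or f % 2 == 0:
        raise ValueError("f must be a positive odd integer (assumption)")
    modulus = f  # f is overwritten in the loop, so keep the original modulus
    d = max(f.bit_length(), g.bit_length())
    m = (49 * d + 80) // 17 if d < 46 else (49 * d + 57) // 17
    # ((f + 1) / 2)^(m - 1) mod f, i.e. 2^-(m - 1) modulo f
    precomp = pow((modulus + 1) // 2, m - 1, modulus)
    v, r, delta = Fraction(0), Fraction(1), 1
    for _ in range(m):
        if delta > 0 and g & 1:
            delta, f, g, v, r = -delta, g, -f, r, -v
        g0 = g & 1
        # g + g0 * f is always even here, so the integer division is exact
        delta, g, r = 1 + delta, (g + g0 * f) // 2, (r + g0 * v) / 2
    if abs(f) != 1:
        # Added check (not in the pseudocode): gcd(f, g) != 1
        raise ValueError("g is not invertible modulo f")
    sign = 1 if f > 0 else -1
    scaled = v * 2 ** (m - 1)  # the power of two clears the denominator of v
    assert scaled.denominator == 1
    return (sign * int(scaled) * precomp) % modulus
```

On Python 3.8 or later, the result can be cross-checked against the built-in `pow(g, -1, f)`.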
Function modular_exponentiation($x, y, p$):
      $res \leftarrow 1$
      $x \leftarrow x \bmod p$
      if $x = 0$ then
            return $0$
      end if
      while $y > 0$ do
            if $(y \mathbin{\&} 1) = 1$ then
                  $res \leftarrow (res \cdot x) \bmod p$
            end if
            $y \leftarrow y \gg 1$
            $x \leftarrow (x \cdot x) \bmod p$
      end while
      return $res$
Algorithm 18 Modular Exponentiation ($x, y, p$)
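Algorithm 18 is the standard right-to-left square-and-multiply method. A minimal Python sketch of the same steps is given below. It assumes a non-negative exponent $y$ and a positive modulus $p$, and it keeps the algorithm's early return of 0 whenever $x \equiv 0 \pmod p$.

```python
def modular_exponentiation(x, y, p):
    """Sketch of Algorithm 18: compute x**y mod p by square-and-multiply."""
    res = 1
    x %= p
    if x == 0:
        # As in the pseudocode: a base divisible by p yields 0
        return 0
    while y > 0:
        if y & 1:  # current exponent bit is set: multiply into the result
            res = (res * x) % p
        y >>= 1  # move to the next exponent bit
        x = (x * x) % p  # square the base for the next bit
    return res
```

Whenever $x \not\equiv 0 \pmod p$ or $y > 0$, the result agrees with Python's built-in `pow(x, y, p)`. Because the branch on `y & 1` depends on the exponent bits, this baseline form is not constant-time.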