Systematic Use of Random Self-Reducibility against Physical Attacks
Abstract
This work presents a novel, black-box, software-based countermeasure against physical attacks, including power side-channel and fault-injection attacks. The approach uses random self-reducibility and self-correctness to add randomness and redundancy to the execution. It works at the operation level, is not algorithm-specific, and can therefore protect a wide range of algorithms. The countermeasure is empirically evaluated against attacks on operations such as modular exponentiation, modular multiplication, polynomial multiplication, and number theoretic transforms. An end-to-end implementation of the countermeasure is demonstrated for the RSA-CRT signature algorithm and Kyber key generation. The countermeasure reduces the power side-channel leakage by two orders of magnitude, to an acceptably secure level in TVLA analysis. For fault injection, the countermeasure reduces the number of faults to 95.4% on average.
Index Terms:
Random Self-Reducibility, Fault Injection Attacks, Power Side-Channel Attacks, Countermeasure, NTT, PQC, RSA-CRT, Randomly Testable Functions
1 Introduction
Smart devices and IoT devices with sensors, processing capability, and actuators are becoming ubiquitous today in consumer electronics, healthcare, manufacturing, etc. These devices often collect sensitive or security-critical information and need to be protected. However, when deployed in the field, such devices are vulnerable to physical attackers who can have direct access to the devices.
Physical attacks can be categorized as passive attacks or active attacks. In passive attacks, such as Side-Channel Attacks (SCA), the attackers do not tamper with the execution, but can collect power traces, electromagnetic (EM) field traces, or traces of acoustic signals, and analyze the signals to learn information that is processed on the device. In active attacks, such as Fault Injection (FI) attacks, the attackers can inject faults through a voltage glitch, clock glitch, EM field, or laser to cause a malfunction in the processing unit or memory to tamper with the execution to obtain desired results. It has been shown that both types of physical attacks have been able to break cryptography implementations to leak secret keys, for example [21, 36, 80, 82].
Even though the assumptions on the attacker's capability are similar for SCA and FI, existing mitigation techniques treat the two types of attacks separately. For side-channel attacks, mitigation techniques usually use randomness or noise to decouple the signal observable by the attacker from the data value [47, 89]. For fault injection attacks, there are typically two solutions: attack detection and redundancy in the execution for error correction. Detection identifies abnormal behavior in the execution and handles it as an exception. Error correction adds redundancy to the execution and uses it to correct execution errors, if any [65]. However, when we consider both SCA and FI attacks in the same system, separate mitigations do not protect against both attacks efficiently. For example, existing work [28] showed that instruction duplication as a fault tolerance mechanism amplifies the information leakage through side channels. Detection methods such as full, partial, or encrypt-decrypt duplication and comparison of a cipher [52] produce repetitions of intermediate values that are exploitable by a side-channel adversary.
In this work, we propose a joint solution for both SCA and FI attacks. By combining random obfuscation based on the Random Self-Reducibility (RSR) property with redundancy for error correction, our proposed countermeasure is particularly effective against FI, outperforming traditional redundancy-based methods. The randomness disrupts the attacker's observation of the statistics in fault attacks, thereby nullifying the effectiveness of statistical analysis as a tool for security compromise. This aspect is crucial in the face of increasingly sophisticated FI analysis techniques. In addition to its effectiveness against FI, the countermeasure also resists SCA by rendering power consumption variations less useful to attackers. The countermeasure significantly enhances system security, particularly in environments where physical attacks are prevalent.
Another drawback of current mitigation techniques is that most existing work focuses on a particular implementation of a cryptographic algorithm; adapting the protection from one implementation to another requires redoing both the security analysis and the implementation.
The proposed countermeasure offers significant benefits as a black-box, operation-level solution to both SCA and FI attacks, and it is independent of the target algorithm being protected. This means there is no need for detailed knowledge of the implementation. The basis for the solution is to implement protection at the level of low-level operations such as modular exponentiation, modular multiplication, polynomial multiplication, and number theoretic transforms. We also assume a generic fault model, so no special fault profiling of a targeted device is necessary. Therefore, the proposed protection techniques can be applied directly in software without extensive system-specific adjustments. In our evaluation, we showcase how the proposed protection techniques can be adopted to protect two different cryptosystems.
Our protection requires a small number of steps to implement. It can be implemented at C or high-level and is independent of the compiler or underlying architecture; assuming the compiler. First, target software is identified. Second, we locate low-level operations such as modular exponentiation, modular multiplication, polynomial multiplication, or number theoretic transforms. These operations can be protected with the idea of Random Self-Reducibility (RSR). Each instance of the low-level operation is replaced with an equivalent RSR operation. Each RSR operation requires querying a randomness source and then executing the low-level operations multiple times with original input values modified with the random values. Typically, multiple RSR operations are instantiated and majority voting is performed on the output of RSR operations. Because the protection works at the low-level operations such as modular exponentiation, modular multiplication, polynomial multiplication, or number theoretic transforms, it is independent of the higher-level algorithm or application. Since it does not rely on any hardware tricks, it is independent of the architecture and agnostic to the underlying compiler.
Our protection can be applied to any program or algorithm that uses modular exponentiation, modular multiplication, polynomial multiplication, and number theoretic transforms to process secret or sensitive information. This encompasses major cryptographic algorithms from ElGamal [35] and RSA [77] to post-quantum cryptography such as Kyber [7] and Dilithium [33]. In our evaluation, we show how our protection can be applied to RSA-CRT and Kyber's Key Generation algorithms. Our contributions are summarized as follows:
- We formalize the security of the countermeasure in relation to an attacker's fault injection capability, parameterize it, and quantify its effectiveness against fault-injection attacks, as detailed in Section 5.3.
- End-to-end implementation of the countermeasure for RSA-CRT and Kyber's Key Generation public key cryptosystems (Section 7).
- Empirical evaluation of the countermeasure against power side-channel and fault-injection attacks over modular exponentiation, modular multiplication operations, polynomial multiplication, number theoretic transform operations, RSA-CRT, and Kyber's Key Generation (Section 8).
2 Background
In this section, we provide a brief overview of side channels and fault injection as physical attacks.
2.1 Power Side Channels
It is a well-known fact that the power consumption during certain stages of a cryptographic algorithm exhibits a strong correlation with the Hamming weight of its underlying variables, i.e., Hamming weight leakage model [49, 22, 69]. This phenomenon has been widely exploited in the cryptographic literature in various attacks targeting a broad range of schemes, particularly post-quantum cryptographic implementations [45, 91, 71, 85, 4, 70, 5, 86, 39, 14, 43]. Therefore, we use the Hamming weight leakage model in the evaluation of the robustness of the countermeasure.
The Hamming weight leakage model assumes that the Hamming weight of the operands is strongly correlated with the power consumption. Each bit flip requires one or more voltage transitions from 0 to high (or vice versa). Different data values typically entail differing numbers of bit flips and therefore produce distinct power traces [23]. Therefore, any circuit not explicitly designed to be resistant to power attacks has data-dependent power consumption. However, in a complex circuit, the differences can be so slight that they are difficult to distinguish from a single trace, particularly if an attackerβs sampling rate is limited [49, 94]. Therefore, it is necessary to use statistical techniques across multiple power traces [49].
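The leakage model described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names, the Gaussian noise level, and the idea of modeling one power sample as "Hamming weight plus noise" are illustrative choices, not measurements from a real device.

```python
import random

def hamming_weight(value):
    """Number of bits set to 1 in a non-negative integer."""
    return bin(value).count("1")

def simulated_power(value, rng, noise=0.5):
    """Hypothetical power sample: Hamming weight plus Gaussian noise."""
    return hamming_weight(value) + rng.gauss(0.0, noise)

def average_power(value, rng, runs=1000):
    """Average many noisy samples, as an attacker would across traces."""
    return sum(simulated_power(value, rng) for _ in range(runs)) / runs

rng = random.Random(0)
# Averaging suppresses the noise, so the data dependence becomes visible.
low = average_power(0x01, rng)   # operand with Hamming weight 1
high = average_power(0xFF, rng)  # operand with Hamming weight 8
```

A single sample of `simulated_power` may not separate the two operands reliably when the noise is large, which is why the statistical treatment across many traces is needed.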
Test Vector Leakage Assessment (TVLA) [40] identifies whether two sets of side-channel measurements are distinguishable by computing Welch's t-test for the two sets. It is used in the literature to confirm the presence or absence of side-channel leakage in power traces and has become the de facto standard in the evaluation of side-channel measurements [69, 81, 56, 90, 76]. In side-channel analysis, the recommended thresholds for t-values are specifically tailored to detect potential information leakage in cryptographic systems. A t-value threshold of +4.5 or -4.5 is often considered. This threshold corresponds to a very high confidence level, rejecting the null hypothesis with a confidence greater than 99.999% for a sufficiently large number of measurements. The null hypothesis is typically that all samples are drawn from the same distribution, so a t-value outside this range indicates that the two sets have distinguishable distributions and thus that side-channel leakage exists [88]. The choice of these thresholds is influenced by the need to balance the risk of false positives (incorrectly identifying information leakage when there is none) against the risk of false negatives (failing to detect actual information leakage).
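A minimal sketch of the test at a single sample point is given below, assuming the conventional |t| > 4.5 decision rule. The helper names `welch_t` and `leaks` are hypothetical; a real TVLA evaluation applies the same statistic to every sample index of the traces.

```python
import math
from statistics import mean, variance

TVLA_THRESHOLD = 4.5  # conventional bound on |t| (assumed here)

def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    var_a, var_b = variance(a), variance(b)  # unbiased sample variances
    return (mean(a) - mean(b)) / math.sqrt(var_a / len(a) + var_b / len(b))

def leaks(set_a, set_b, threshold=TVLA_THRESHOLD):
    """Report leakage at this sample point if |t| exceeds the threshold."""
    return abs(welch_t(set_a, set_b)) > threshold
```

For example, `leaks(fixed_input_samples, random_input_samples)` would be evaluated once per time index, where the two lists hold the measurements taken at that index for the two input classes.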
The Sum of Squared pairwise T-differences (SOST) [24] is a technique for identifying Points of Interest (PoIs) in side-channel analysis. It is particularly useful when there are many data points (such as the samples of traces from a cryptographic system) and the goal is to identify the specific points in these traces that vary significantly with different conditions or inputs.
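The following sketch shows one way to rank sample indices by SOST. It assumes the commonly used form in which, at each index, the squared Welch-style t-differences between every pair of classes are summed; the function names and the dictionary layout of the traces are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, variance

def sost_at_point(groups):
    """Sum of squared pairwise t-differences between groups at one index."""
    total = 0.0
    for a, b in combinations(groups, 2):
        spread = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
        total += ((mean(a) - mean(b)) / spread) ** 2
    return total

def points_of_interest(traces_by_class, top=1):
    """Return the `top` sample indices with the highest SOST score.

    traces_by_class maps a class label to a list of equal-length traces.
    """
    n_points = len(next(iter(traces_by_class.values()))[0])
    scores = []
    for t in range(n_points):
        groups = [[trace[t] for trace in traces]
                  for traces in traces_by_class.values()]
        scores.append(sost_at_point(groups))
    return sorted(range(n_points), key=scores.__getitem__, reverse=True)[:top]
```

Each class needs at least two traces with non-zero variance at every index, otherwise the denominator is undefined.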
2.2 Fault Injection Attacks
In the real world, devices may malfunction or be damaged and produce erroneous output, which can often be ignored. However, if an attacker intentionally induces a fault during device operation, e.g., during a cryptographic calculation, he or she can recover the secret by analyzing the original and faulty outputs. Most classical cryptographic algorithms can be attacked by fault injection. For instance, the first fault attack research [18] targeted the RSA implementation using the Chinese Remainder Theorem (CRT), which is the most common implementation of RSA used to secure communication. In this case, the attacker can recover the secret with only one faulty RSA-CRT signature. Moreover, in [62], Mus et al. present a fault attack against ElGamal and elliptic-curve (ECC) based signatures, such as Schnorr signatures and ECDSA, via Rowhammer (a software technique used to induce faults in memory). Not only public-key cryptosystems but also symmetric-key cryptosystems are vulnerable to fault injection attacks. In [66], Piret et al. develop a fault attack against cryptographic algorithms with substitution-permutation network (SPN) structures, such as AES or KHAZAD. Even post-quantum cryptographic algorithms [73], which are designed to resist quantum computing, can be vulnerable to fault attacks. Therefore, it is necessary to have efficient FI attack protections that can be easily deployed.
The injected fault can, in principle, have an impact on any stage of the fetch-decode-execute cycle performed for each instruction [48, 99]. Additionally, any optimizations implemented by the CPU, such as pipelining [98], add to the complexity of executing a single instruction. Therefore, it is typically unknown what exactly goes wrong within the CPU when its behavior is changed due to fault injection, whereas the modified behavior itself is easier to measure. We consider a generic fault model, likely applicable to a wide range of targets, where a variable number of bits in the instruction are flipped as a result of fault injection. Two types of behavior are possible under this fault model: 1. Instruction corruption: the original instruction is modified into an instruction that has an impact on the behavior of the device. In practice, the fault may change the instruction into any other instruction supported by the architecture. 2. Instruction skipping: effectively a subset of instruction corruption. The original instruction is corrupted into an instruction that does not have an impact on the behavior of the device. The resulting instruction does not change the execution flow or any state that is used later on.
Invocation of specific behavior is not a trivial task, as the low level control required to do this is often limited. However, it is possible to identify the more probable results while assuming that bit flips affecting single or all bits are more likely than complex patterns of bit flips [92].
On embedded processors, a fault model in which an attacker can skip an assembly instruction or equivalently replace it by a nop has been observed on several architectures and for several fault injection means [61]. Moro et al. in [60] assume that the effect of the injected fault on a 32-bit microcontroller leads to an instruction skip. Moro et al. [61] and Barenghi et al. [9] have proposed implementations of the Instruction Redundancy technique as a countermeasure against this fault model. Instruction skips correspond to specific cases of instruction replacements: replacing an instruction with another one that does not affect any useful register has the same effect as a nop replacement and so is equivalent to an instruction skip.
3 Threat Model
In our threat model, we consider an attacker with physical access to a device, capable of injecting faults such as voltage glitches during the computation of a critical function like the number theoretic transform (see Section 6). These faults can corrupt or skip instructions (see Section 2.2) and can occur anywhere and multiple times, but they do not crash the device. Furthermore, the model permits the attacker to perform basic power side-channel analysis by collecting power trace samples. By correlating data-dependent power consumption with the Hamming weight leakage model, the attacker can expose vulnerabilities in cryptographic computations. This underscores the crucial need for robust defenses against both fault injection and side-channel attacks.
4 Preliminaries
We use the notion of random self-reducibility [16, 79] to develop a new software-based countermeasure against fault-injection attacks and simple power side-channel attacks. Therefore, in this section, we provide the necessary background on random self-reducibility. Since we apply our countermeasure to number-theoretic operations, we also provide the necessary background on number theoretic transforms.
4.1 Notation
The notation is used to represent a specific realization of a random variable (i.e., a specific value that the random variable takes on). Let denote the probability of the event in the enclosed expression when is uniformly chosen from . We assume the domain and range of the function are the same set, usually named as , but the formalization can be expanded to accommodate multivariate functions and heterogeneous domains and ranges.
Let $q$ be a prime number, and let the field of integers modulo $q$ be denoted as $\mathbb{Z}_q$. Schemes such as Kyber and Dilithium operate over polynomials in polynomial rings. The polynomial ring is denoted as $R_q = \mathbb{Z}_q[x]/\phi(x)$, where $\phi(x) = x^n + 1$ is a cyclotomic polynomial with $n$ being a power of 2. Multiplication of polynomials is denoted as $c = a \cdot b$. Pointwise (coefficientwise) multiplication of two polynomials is denoted as $c = a \circ b$, which means that each coefficient of $a$ is multiplied by the coefficient of $b$ with the same index. The NTT representation of a polynomial $a$ is denoted as $\hat{a}$.
4.2 Random Self-Reducibility
Informally, a function $f$ is random-self-reducible if the evaluation of $f$ at any given instance $x$ can be reduced in polynomial time to the evaluation of $f$ at one or more random instances.
Definition 1 (Random Self-Reducibility (RSR) [16, 79]).
Let $x \in D$ and let $c > 1$ be an integer. We say that $f$ is $c$-random self-reducible if $f$ can be computed at any particular input $x$ via:
$f(x) = F[x, a_1, \ldots, a_c, f(a_1), \ldots, f(a_c)]$   (1)
where $F$ can be computed asymptotically faster than $f$ and the $a_i$'s are uniformly distributed, although not necessarily independent; e.g., given the value of $a_1$ it is not necessary that $a_2$ be randomly distributed in $D$. This notion of random self-reducibility is somewhat different from other definitions given by [1, 37, 17], where the requirement on $F$ is that it be computable in polynomial time.
Another similar definition was made by Lipton [50]. Suppose that we wish to compute the trivial identity function , and let be a program that computes . We can construct from another program with the property that it can compute correctly at an arbitrary point provided that one can compute it at a number of random points. Consider the following program . can compute with inputs and .
It is shown by Blum et al. [16] that self-correctors exist for any function that is random self-reducible. A self-corrector for $f$ takes a program $P$ that is correct on most inputs and turns it into a program that is correct on every input with high probability.
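To make the idea of a self-corrector concrete, the sketch below uses the identity function over the integers modulo $N$ and the additive reduction $f(x) = f(x + r) - f(r)$, which holds for any additive function. The modulus, the deliberately buggy program `P`, and the number of trials are assumptions chosen for illustration; they are not taken from the cited works.

```python
import random
from collections import Counter

N = 2 ** 16  # hypothetical modulus; f(x) = x over the integers mod N

def P(x):
    """Hypothetical buggy program for f: wrong on every multiple of 16."""
    return (x + 1) % N if x % 16 == 0 else x % N

def rsr_call(prog, x, rng):
    """Evaluate f(x) through two random instances: f(x + r) - f(r) mod N."""
    r = rng.randrange(N)
    return (prog((x + r) % N) - prog(r)) % N

def self_correct(prog, x, trials=31, rng=None):
    """Majority vote over independent randomized evaluations of prog."""
    rng = rng or random.Random()
    votes = Counter(rsr_call(prog, x, rng) for _ in range(trials))
    return votes.most_common(1)[0][0]
```

Here `P(16)` returns 17, yet `self_correct(P, 16)` returns 16 with high probability, because each randomized evaluation only queries `P` at uniformly distributed points, where it is wrong with probability 1/16.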
4.3 Arithmetic Secret Sharing
Privacy-preserving computing allows multiple parties to evaluate a function while keeping the inputs private and revealing only the output of the function and nothing else. One popular approach to outsourcing sensitive workloads to untrusted workers is to use arithmetic secret sharing [31, 58]. It splits a secret into multiple shares, distributing them across various workers. Each worker processes their respective share locally. Assuming the workers will not collude, it is information-theoretically impossible for each worker to recover the secret from its share [96].
In standard arithmetic secret sharing, the client aims to compute , with the property that , where and are randomly chosen. Let denote the integer ring of size . The shares are constructed such that the sum of all shares is equal to the original secret value . The client then delegates the computation to workers (untrusted entities). These workers independently calculate and , then relay their results back to the client. The client derives using these partial results from the untrusted workers. The randomness in the shares is crucial for our power side-channel countermeasure. While arithmetic secret sharing is based on the linearity of addition and multiplication over integers, our approach utilizes Random Self-Reducible properties, some of which may not necessarily be linear. Moreover, to counteract fault injection attacks, our algorithm must produce accurate results despite faults. We achieve this in our countermeasure by repeating the computation times and choosing the majority of the responses.
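The delegation pattern described above can be sketched for a simple linear function. The ring size, the choice of $f(x) = a \cdot x \bmod N$, and all function names are assumptions made for this example.

```python
import secrets

N = 2 ** 32  # hypothetical ring size

def share(x, n_shares=2):
    """Split x into additive shares whose sum is x modulo N."""
    parts = [secrets.randbelow(N) for _ in range(n_shares - 1)]
    parts.append((x - sum(parts)) % N)
    return parts

def worker(a, share_value):
    """Untrusted worker: applies the public linear map y -> a*y mod N."""
    return (a * share_value) % N

def client_compute(a, x):
    """Delegate f(x) = a*x mod N to two workers and recombine the results."""
    x1, x2 = share(x)
    return (worker(a, x1) + worker(a, x2)) % N
```

Each worker sees only a uniformly random ring element, while linearity guarantees that the sum of the partial results equals $a \cdot x \bmod N$.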
5 Overview of Our Countermeasure
The foundational works of Blum et al. [16] and Lipton [50] on testing have significantly influenced our approach to developing countermeasures. We have incorporated the concept of self-correctness to safeguard against fault-injection attacks, and the principles of random self-reducibility and randomly-testable functions to defend against power side-channel attacks. These notions are investigated and applied as a countermeasure against physical attacks in the literature.
At the heart of this method is the generic, randomized Algorithm 2, which is founded on the principle described in Definition 1. Additionally, Algorithm 3 boosts the effectiveness of the randomized Algorithm 2 through majority voting and probability amplification [87].
We observed that instance hiding can also be used against physical attacks, such as power side-channel and fault-injection attacks, by randomizing the intermediate values of the computation. In this way, attackers will not be able to correlate the side-channel leakage with the intermediate values of the computation (see 1). For example, secrets in ElGamal Decryption [35] (see Algorithm 1) can be protected end-to-end using instance hiding, but instead of using arithmetic secret sharing, we use random self-reducible properties.
In this algorithm, $c_1$ and $c_2$ are the ciphertexts, $x$ is the secret key, and $m$ is the decrypted message. The operation $c_1^x$ represents raising $c_1$ to the power of $x$, and $s^{-1}$ represents the modular multiplicative inverse of $s$ (i.e., $s \cdot s^{-1} \equiv 1 \pmod{p}$, where $p$ is the prime modulus used in the ElGamal encryption scheme). The result of the decryption, $m$, is obtained by multiplying $c_2$ with the modular inverse of $s$, denoted as $s^{-1}$ in this algorithm.
For instance, we can protect the modular exponentiation function, , in ElGamal decryption using , and modular multiplication, , using . In these equalities shares should be selected to make and .
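A minimal sketch of this idea for ElGamal decryption follows. It assumes a prime modulus $p$, additive exponent shares taken modulo $p - 1$ (valid by Fermat's little theorem when the base is not a multiple of $p$), and additive operand shares modulo $p$ for multiplication. These particular share constructions and the function names are illustrative assumptions, and the sketch demonstrates functional equivalence only, not measured leakage behavior.

```python
import secrets

def rsr_modexp(base, exp, p):
    """base^exp mod p computed from two random exponent shares.

    Assumes p is prime and base is not a multiple of p, so that
    exp = e1 + e2 (mod p - 1) implies base^exp = base^e1 * base^e2 (mod p).
    """
    e1 = secrets.randbelow(p - 1)
    e2 = (exp - e1) % (p - 1)
    return (pow(base, e1, p) * pow(base, e2, p)) % p

def rsr_modmul(x, y, p):
    """x*y mod p computed from two random additive shares of x."""
    x1 = secrets.randbelow(p)
    x2 = (x - x1) % p
    return ((x1 * y) % p + (x2 * y) % p) % p

def elgamal_decrypt(c1, c2, x, p):
    """ElGamal decryption built only from the two randomized operations."""
    s = rsr_modexp(c1, x, p)          # s = c1^x, secret exponent is shared
    s_inv = rsr_modexp(s, p - 2, p)   # inverse via Fermat: s^(p-2)
    return rsr_modmul(s_inv, c2, p)   # m = c2 * s^(-1), secret operand shared
```

Every call draws fresh shares, so the operands passed to `pow` and to the multiplications differ from run to run even when the key and ciphertext are fixed.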
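Table 5: Two methods for implementing ElGamal decryption.
Step              First method             Second method
Exponentiation    Modular exponentiation   Modular exponentiation
Modular inverse   Fermat's method [95]     Fast GCD algorithm [13]
Multiplication    Modular multiplication   Modular multiplication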
It is also hard to implement countermeasures for different implementations of the same mathematical function. Table 5 presents two distinct methods for implementing the ElGamal decryption algorithm. The first method employs Fermat's method (Algorithm 18 in the Appendix) for the modular inverse calculation [95], while the second uses a sophisticated, constant-time modular inverse implementation recently developed by Bernstein and Yang [13] (Algorithm 17 in the Appendix). Importantly, our countermeasure technique, applicable to both modular exponentiation and modular multiplication (see Section 6), is compatible with either method regardless of the complexity of the respective implementation.
5.1 RSR against Power Side Channels
Consider a correct program $P$ for a function $f$ that has an associated random self-reducible property, which takes the form of a functional equation. This property is deemed satisfied if we can substitute $P$ for the function $f$ in the equation and the equation remains true.
Generic -secure-countermeasure PSCA defined Algorithm 2 takes a program , a sensitive input , and a security parameter . The algorithm randomly splits into shares such that , and calls on each share to obtain . Finally, the algorithm returns the result of the function on . The function basis is defined based on the random self-reducible property of the function that implements (cf. Definition 1).
To ensure minimum security, splitting the secret input into two shares would suffice. However, for enhanced security, the secret input can be divided into additional shares. It is important to view the security parameter $k$ as the number of invocations of $P$, especially in the context of bivariate functions, rather than merely the number of shares.
Masking with Random Self-Reducibility. If a cryptographic operation has a random self-reducible property, then it is possible to protect it against power side-channel attacks by masking with arithmetic secret sharing.
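The generic structure described in this subsection (split the sensitive input, call the program on each share, recombine) might be sketched as follows. The names `secure_countermeasure`, `split_additive`, and the use of additive shares with a multiplicative recombination for modular exponentiation are assumptions for illustration, not the paper's Algorithm 2 verbatim.

```python
import secrets
from functools import reduce

def split_additive(x, k, modulus):
    """Split x into k random shares that sum to x modulo `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(k - 1)]
    shares.append((x - sum(shares)) % modulus)
    return shares

def secure_countermeasure(prog, x, k, split, combine):
    """Call prog once per share of x and recombine the partial results."""
    partials = [prog(s) for s in split(x, k)]
    return reduce(combine, partials)

# Example instantiation (hypothetical parameters): f(e) = g^e mod p.
p, g = 2 ** 61 - 1, 3

def modexp(e):
    return pow(g, e, p)

result = secure_countermeasure(
    modexp, 123456789, k=3,
    split=lambda x, k: split_additive(x, k, p - 1),  # exponents live mod p-1
    combine=lambda a, b: (a * b) % p,                # g^(e1+e2) = g^e1 * g^e2
)
```

Only `split` and `combine` depend on the random self-reducible property of the protected function; `prog` is used purely through its input/output behavior.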
5.2 Self-Correctness against Fault Injections
Fault injection attacks rely on obtaining a faulty output or correlating the faulty output with the input or secret-dependent intermediate values. By introducing redundancy and majority voting, we can obtain correct results even if some results are incorrect due to injected faults.
In Algorithm 3, we show how to apply the fault injection countermeasure on top of the power side-channel countermeasure. To protect a program $P$ that implements a function $f$ having a random self-reducible property, the algorithm calls $P$'s $k$-secure-countermeasure $n$ times and returns the majority of the answers. The function $k$-secure-countermeasure takes a program $P$, a sensitive input $x$, and a security parameter $k$.
Note that $k$ and $n$ are independent security parameters. The security parameter $k$ represents the number of calls to the unprotected program $P$ used in the PSCA countermeasure, whereas $n$ signifies the number of iterations in the FIA countermeasure. The security parameter $n$ is associated with the attacker's capability to inject effective faults. Owing to redundancy, an increase in the security parameter $n$ results in a decreased likelihood of the attacker successfully injecting a fault.
Algorithm 4 presents an example of a combined and configurable countermeasure, effective against both PSCA and FIA, applied to the modular multiplication operation. In Line 2, the algorithm divides the input into shares , satisfying . The methodology for the random splitter algorithm is detailed in Section 6. Furthermore, the majority function, which essentially returns the most common answer, is described in Section 6.
Self-Correctness with Majority Voting. Fault injection attacks rely on faulty output. By majority voting, we can obtain correct results even if some results are incorrect.
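A sketch of the combined scheme for modular multiplication is shown below: each of the repetitions uses fresh random shares, and the most common answer is returned. The function names, the default parameters, and the additive sharing of the first operand are assumptions for illustration rather than a transcription of Algorithm 4.

```python
import secrets
from collections import Counter

def masked_modmul(x, y, q, k=2):
    """x*y mod q computed from k random additive shares of x."""
    shares = [secrets.randbelow(q) for _ in range(k - 1)]
    shares.append((x - sum(shares)) % q)
    return sum((s * y) % q for s in shares) % q

def majority(values):
    """Return the most common value in the list."""
    return Counter(values).most_common(1)[0][0]

def protected_modmul(x, y, q, k=2, n=5, mul=masked_modmul):
    """Repeat the masked multiplication n times and take a majority vote."""
    return majority([mul(x, y, q, k) for _ in range(n)])
```

With `n=5`, the vote still returns the correct product when up to two of the five repetitions are corrupted, since the three unaffected runs agree with each other.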
5.3 n and attacker's probability of success
Fault injection occurs at the hardware level and is both challenging and unpredictable to control. When a successful fault is induced, it transforms a previously correct victim program into an incorrect one. Consequently, the essence of a fault injection attack is its probabilistic nature. This concept is abstracted in terms of the attacker's probability of success in our work.
Definition 2 ($\epsilon$-fault tolerance).
Let $\epsilon$ be the upper bound on the attacker's probability of injecting a fault successfully at an unprotected program $P$ that correctly implements a function $f$. Say that the program $P$ is $\epsilon$-fault tolerant for the function $f$ provided $P(x) = f(x)$ for at least $1 - \epsilon$ of any input $x$. We assume each fault injection is independent of the others:
Algorithm 2 is a randomized algorithm, and Algorithm 3 is also a randomized algorithm that repeats the computation $n$ times by calling Algorithm 2 and uses majority voting to pick the correct answer. Therefore, we can use Chernoff bounds [87] to show that the probability of getting the correct answer is at least $1 - \delta$.
A simple and common use of Chernoff bounds is for "boosting" of randomized algorithms. If one has an algorithm that outputs a guess that is the desired answer with probability $p > 1/2$, then one can get a higher success rate by running the algorithm $n = \log(1/\delta) \cdot 2p/(p - 1/2)^2$ times and outputting a guess that is output by more than $n/2$ runs of the algorithm. Assuming that these algorithm runs are independent, the probability that more than $n/2$ of the guesses are correct is equal to the probability that the sum of independent Bernoulli random variables that are 1 with probability $p$ is more than $n/2$. This can be shown to be at least $1 - \delta$ via the multiplicative Chernoff bound ($\mu = np$) [29]:
$\Pr[X > n/2] \ge 1 - e^{-n (p - 1/2)^2 / (2p)} \ge 1 - \delta$
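The boosting argument can be turned into a small calculator. The sketch assumes the standard form of the bound, $n = \ln(1/\delta) \cdot 2p/(p - 1/2)^2$ with success probability at least $1 - e^{-n(p - 1/2)^2/(2p)}$, and the function names are hypothetical.

```python
import math

def runs_needed(p, delta):
    """Smallest integer n with n >= ln(1/delta) * 2p / (p - 1/2)^2."""
    if not 0.5 < p <= 1:
        raise ValueError("single-run success probability must exceed 1/2")
    return math.ceil(math.log(1 / delta) * 2 * p / (p - 0.5) ** 2)

def majority_success_lower_bound(n, p):
    """Chernoff lower bound on the probability that the majority is correct."""
    return 1 - math.exp(-n * (p - 0.5) ** 2 / (2 * p))

# Hypothetical setting: each run is correct with probability 0.98 and the
# majority should fail with probability at most one in a million.
n = runs_needed(0.98, 1e-6)
bound = majority_success_lower_bound(n, 0.98)
```

The bound is conservative, so the number of repetitions it suggests is an upper estimate of what is needed under the independence assumption.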
Theorem 1 (Derived from Theorem 3.1 in [50]).
Suppose that $f$ is randomly self-reducible and that $P$ is $\epsilon$-fault tolerant for the function $f$. Consider a $k$-secure countermeasure (Line 4 in Algorithm 2):
Then, for any $x$, the output of the countermeasure is equal to $f(x)$ with probability at least $1 - k\epsilon$.
Proof.
Fix an input $x$. Clearly, the probability that the countermeasure is correct is at least the probability that $P$ is correct at each of the $k$ shares. This follows since $f$ is random self-reducible with respect to the number of calls made to $P$. It therefore follows that the countermeasure returns correct results at least $1 - k\epsilon$ of the time. ∎
In the next sections, we will present a number of examples of $k$-secure countermeasures whose security parameter is mostly $k = 2$. Thus, for these functions, Theorem 1 says that, for $\epsilon$ equal to $1/100$, the probability that the countermeasure returns correct results is at least 0.98. We can amplify the probability of success by repeating the computation $n$ times and using majority voting. In addition, we can select a bigger $n$ by adjusting $\delta$ as the confidence parameter:
Lower bound for $n$. The attacker's probability of success is $\epsilon$, and for a $k$-secure countermeasure, the lower bound for $n$ is defined as $n \ge \log(1/\delta) \cdot 2p/(p - 1/2)^2$ with $p = 1 - k\epsilon$, where $\delta$ is the confidence parameter.
Algorithm 2 makes calls to a program $P$ that implements a function $f$ having a random self-reducible property. However, we do not need to know the implementation of the function $f$; we only need its mathematical definition to configure Algorithms 2 and 3. Therefore, one further advantage of our countermeasure is that it follows a "black-box" approach. Fault injection attacks are hardware attacks, and the attacker does not have access to the software implementation of the function. Therefore, the attacker can only observe the input and output of the function. By using the black-box approach, we make the countermeasure robust at the hardware level.
Black-box. If we replace the function $f$ with a program $P$ that computes $f$, then our countermeasure accesses $P$ as a black box and computes the function $f$ using the random self-reducible properties of $f$.
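The black-box property can be illustrated by wrapping two different implementations of the same function with identical protection code. The wrapper name `protect_modexp`, the textbook `square_and_multiply` routine, and the sharing of the exponent modulo $p - 1$ (which assumes a prime $p$ and a base that is not a multiple of $p$) are assumptions made for this sketch.

```python
import secrets
from collections import Counter

def square_and_multiply(base, exp, p):
    """Textbook left-to-right binary exponentiation."""
    result = 1
    for bit in bin(exp)[2:]:
        result = (result * result) % p
        if bit == "1":
            result = (result * base) % p
    return result

def protect_modexp(modexp, n=3):
    """Wrap any modexp(base, exp, p) callable using only its I/O behavior."""
    def wrapped(base, exp, p):
        votes = []
        for _ in range(n):
            e1 = secrets.randbelow(p - 1)      # fresh random exponent share
            e2 = (exp - e1) % (p - 1)
            votes.append((modexp(base, e1, p) * modexp(base, e2, p)) % p)
        return Counter(votes).most_common(1)[0][0]  # majority answer
    return wrapped

protected_a = protect_modexp(square_and_multiply)  # hand-written routine
protected_b = protect_modexp(pow)                  # Python's built-in pow
```

The wrapper never inspects how `modexp` works internally, so swapping one implementation for another requires no change to the protection logic.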
6 Implementation of Countermeasures
Table 6 lists all functions of some finite field operations and their corresponding random self-reducible properties. In this section, we exemplify each function that we used in end-to-end experiments and show how to apply the countermeasure approach to protect the function.