# On the Effects of Data Distribution on Small-error Approximate Adders

Yizhi Chen\*, Ardalan Najafi<sup>†</sup>, and Alberto Garcia-Ortiz<sup>†</sup>

\* Universität Bremen, 28359 Bremen, Germany Email: yizchen@uni-bremen.de

<sup>†</sup>Institute of Electrodynamics and Microelectronics, Universität Bremen

Otto-Hahn-Allee 1, 28359 Bremen, Germany, Email: {ardalan, agarcia}@item.uni-bremen.de

Abstract—Approximate computing is a technique to trade off accuracy against hardware cost. It increases energy efficiency by leveraging the application-level tolerance to a few errors found in many applications, including image processing, multimedia, machine learning and wireless communication. Truncated adders, the most conventional approximate architectures, compute the addition of only the most significant bits and produce small errors with high probability. In prior art, these adders have been analyzed assuming uniformly distributed input data. However, in digital signal processing, the data has a distribution that can be considered Gaussian, characterized by a mean value and a standard deviation. This paper studies the effects of the input data distribution on small-error approximate adders. We show that the effects of a Gaussian distribution can be modeled for the approximate adder architectures.

*Index Terms*—Approximate computing, adder architecture, Gaussian distribution, error-cost trade-off

## I. INTRODUCTION

The increasing vulnerability of computing systems to errors in the underlying circuits is a growing concern. As variability increases, achieving deterministic behavior becomes increasingly expensive in modern technologies. As a promising technique, approximate computing redesigns a logic circuit to accept a reduced level of accuracy in response to the languishing benefits of technology scaling [1].

Approximate adders, as key components of arithmetic circuits, have attracted considerable attention in the field of approximate computing. A close look at the variety of approximate adders divides them into two design philosophies according to their errors: 1) *small errors* or 2) *unlikely errors* [2]. Examples of the small-error philosophy are the Lower-part OR Adder (LOA) [3], the non-zeroing bit truncated adder proposed in [4], and the Optimized Lower-part Constant-OR Adder (OLOCA) [5]. In the second philosophy, the errors are engineered to appear infrequently, even if they are large when they do appear. Examples of this philosophy are the Almost Correct Adder (ACA) [6], the Generic Accuracy Configurable Adder (GeAr) [7], the Error Tolerant Adder (ETAII) [8] and the Equal Segmentation Adder (ESA) [9].

Prior studies of approximate adders have assumed a uniform input distribution [10]. In digital signal processing, however, the data has a distribution that can be considered Gaussian, characterized by a mean value and a standard deviation.

In this paper, the impact of the input data distribution on the characteristics of approximate adders is studied. Two state-of-the-art small-error approximate adders are selected as an illustration. In [4], the authors propose a configurable approximate architecture based on truncation strategies. This architecture, which we call the Non-Zeroing Bit Truncated adder (NZBT), uses control signals to switch between approximate and exact modes dynamically. In [5], an optimized small-error approximate adder for mean squared error (MSE) has been proposed by the authors.

First, the behavior of the adders under uniform and Gaussian input distributions is compared. Subsequently, to illustrate that the behavior of the approximate architectures under a Gaussian distribution can be modeled, a linear model is presented for the studied architectures with Gaussian distributed input data.

The paper is organized as follows: Section II reviews the approximate adders studied in this paper. The comparison between the effects of uniform and Gaussian distributions on the adders is presented in Section III. The accuracy model for different mean values and standard deviations is proposed in Section IV. Finally, Section V concludes the paper.

## II. APPROXIMATE ADDER

Approximate n-bit addition is generally designed to break the carry chain in order to decrease latency and power consumption. The small-error adders selected in this paper build on the truncated adder. A truncated adder [11] calculates the addition of the (n-k) Most Significant Bits (MSBs), where k is the number of truncated bits. Consequently, depending on k, the truncated adder is faster and more cost-effective than its exact counterpart.

The idea of the NZBT adder, proposed in [4], is to approximate the adder by forcing the truncated output Least Significant Bits (LSBs) to constant 1. To do so, the authors propose to fix the k LSBs of one input to '0's and those of the other input to '1's. Using k control bits as well as extra gates, the adder can be configured to different levels of accuracy. For the comparison in this paper, we consider the adder switched to the approximate mode, so that different accuracies are obtained by changing k.

In [5], an optimized small-error approximate adder for mean squared error, called OLOCA, has been proposed by the authors. OLOCA, similar to NZBT, calculates the addition of the (n-k) MSBs. The difference is that OLOCA forces (k-2) LSBs to constant 1, while the  $(k-1)^{th}$  and  $(k-2)^{th}$  bits are bitwise ORed. This way, the architecture has a very low MSE.

In the next section, the above-mentioned architectures are compared with the conventional truncated adder [11] using different error metrics. The conventional truncated adder is called Trunc in the rest of this paper.
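The three architectures can be illustrated with a minimal behavioral sketch. It is not the gate-level design of [4] or [5]: the function names are hypothetical, the upper (n-k) bits are assumed to be added exactly without any carry from the lower part, Trunc is assumed to output zeros in its k LSBs, and the OLOCA bit positions are counted from zero so that bits k-1 and k-2 are the two topmost bits of the lower part.

```python
def trunc_add(a: int, b: int, k: int) -> int:
    """Trunc: add only the upper bits; the k output LSBs are zero (assumption)."""
    return ((a >> k) + (b >> k)) << k


def nzbt_add(a: int, b: int, k: int) -> int:
    """NZBT in approximate mode: upper bits added, k output LSBs forced to 1."""
    return trunc_add(a, b, k) | ((1 << k) - 1)


def oloca_add(a: int, b: int, k: int) -> int:
    """OLOCA: upper bits added, bits k-1 and k-2 ORed, the k-2 lowest bits forced to 1."""
    if k < 2:
        raise ValueError("this sketch assumes k >= 2")
    or_mask = 0b11 << (k - 2)    # positions of the two ORed bits
    ones = (1 << (k - 2)) - 1    # constant-1 part
    return trunc_add(a, b, k) | ((a | b) & or_mask) | ones


if __name__ == "__main__":
    a, b, k = 0b10110010, 0b01100001, 4   # 178 + 97 = 275
    for adder in (trunc_add, nzbt_add, oloca_add):
        approx = adder(a, b, k)
        print(adder.__name__, approx, "error:", a + b - approx)
```

For this operand pair the sketch gives 272 for Trunc, 287 for NZBT and 275 for OLOCA, i.e. errors of 3, -12 and 0 according to Eq.(1).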

## III. COMPARISON RESULTS OF DIFFERENT DISTRIBUTIONS

In this section, the effects of different input data distributions on the accuracy of the approximate adders are studied. First, we introduce the error metrics used in this paper. Afterwards, the adder architectures are compared for uniformly distributed inputs. Finally, the accuracy of the approximate adders is analyzed for Gaussian distributed inputs.

### A. Error metric

In this paper, the error is defined as the difference between the approximate and the accurate output results of the adder:

$$\varepsilon = s - \hat{s},\tag{1}$$

where  $\hat{s}$  is the approximate (erroneous) output of the adder and s is the accurate result.

The mean squared error (MSE) is a metric that incorporates the variance of the errors. Weighting each possible error value by its probability, the MSE is calculated as follows:

$$MSE = E\left[\varepsilon^2\right] = \sum_j \varepsilon_j^2 Pr[\varepsilon_j] , \tag{2}$$

where  $Pr[\varepsilon_j]$  is the probability of the error  $\varepsilon_j$ . Eq.(2) is the general form. For an n-bit adder, there are  $2^{2n}$  possible additions. For uniformly distributed inputs, all of them are equally likely, and Eq.(2) can be written as:

$$MSE = \frac{1}{2^{2n}} \sum_{j=0}^{2^{2n}-1} \varepsilon_j^{2}. \tag{3}$$

Since for a truncated adder, only the k LSBs are erroneous, the errors repeat for  $2^{2h}$  additions, where h is the number of exact MSBs (i.e. h = n - k). Eq.(3) can be rewritten as below:

$$MSE = \frac{1}{2^{2k}} \sum_{j=0}^{2^{2k}-1} \varepsilon_j^2. \tag{4}$$
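The equivalence of Eq.(3) and Eq.(4) can be checked numerically. The following sketch does so for an NZBT adder, under the assumption of a simple behavioral model in which the upper bits are added exactly and the k output LSBs are forced to 1; the function names are illustrative only.

```python
def nzbt_add(a: int, b: int, k: int) -> int:
    """Behavioral NZBT model (assumption): exact upper part, k LSBs forced to 1."""
    return (((a >> k) + (b >> k)) << k) | ((1 << k) - 1)


def mse_all_pairs(bits: int, k: int) -> float:
    """Average squared error over all pairs of `bits`-bit operands."""
    size = 1 << bits
    total = sum((a + b - nzbt_add(a, b, k)) ** 2
                for a in range(size) for b in range(size))
    return total / size ** 2


n, k = 8, 3
mse_eq3 = mse_all_pairs(n, k)  # Eq.(3): all 2^(2n) additions
mse_eq4 = mse_all_pairs(k, k)  # Eq.(4): only the 2^(2k) combinations of the k LSBs
print(mse_eq3, mse_eq4)
```

Both enumerations give the same value, since the error of this model depends only on the k LSBs of the two operands.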

Having MSE, Peak Signal-to-Noise Ratio (PSNR) can be defined as:

$$PSNR = 20 \times \log_{10}(MAX/\sqrt{MSE}), \tag{5}$$

where MAX is the maximum possible output value while adding two unsigned n-bit operands, i.e.  $2 \times (2^n - 1)$ .
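As a worked example of Eq.(5), the sketch below computes the PSNR of an 8-bit NZBT adder with k=4 for uniformly distributed inputs. The logarithm is taken to base 10, and the error expression assumes a behavioral NZBT model whose k output LSBs are all ones; both the function names and this model are assumptions made for illustration.

```python
import math


def psnr(mse: float, n: int) -> float:
    """Eq.(5) with MAX = 2 * (2^n - 1) for two unsigned n-bit operands."""
    max_out = 2 * ((1 << n) - 1)
    return 20 * math.log10(max_out / math.sqrt(mse))


def nzbt_mse_uniform(k: int) -> float:
    """Eq.(4) for NZBT: the error is the sum of the operand LSBs minus (2^k - 1)."""
    size = 1 << k
    total = sum((a + b - (size - 1)) ** 2
                for a in range(size) for b in range(size))
    return total / size ** 2


mse = nzbt_mse_uniform(4)
print(mse, round(psnr(mse, 8), 2))
```

The result agrees with the corresponding NZBT entry of Table I (n=8, k=4) to within rounding.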

To quantify the quality of the approximate units, the Mean Absolute Error (MAE) is also an important metric which has been frequently considered in the literature. For uniformly distributed inputs, the MAE is defined as below:

$$MAE = \frac{1}{2^{2n}} \sum_{j=0}^{2^{2n}-1} |\varepsilon_j| . \tag{6}$$
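Eq.(6) can be evaluated exhaustively for small bit-widths. The sketch below does so for 8-bit operands and k=4, using hypothetical behavioral models of the three adders (exact addition of the upper bits without carry-in, Trunc with zeroed LSBs); these models are assumptions, not the circuits of [4], [5] or [11].

```python
def trunc_add(a: int, b: int, k: int) -> int:
    """Trunc (assumption): exact upper part, k output LSBs zero."""
    return ((a >> k) + (b >> k)) << k


def nzbt_add(a: int, b: int, k: int) -> int:
    """NZBT (assumption): exact upper part, k output LSBs forced to 1."""
    return trunc_add(a, b, k) | ((1 << k) - 1)


def oloca_add(a: int, b: int, k: int) -> int:
    """OLOCA (assumption): bits k-1 and k-2 ORed, the k-2 lowest bits forced to 1."""
    return (trunc_add(a, b, k)
            | ((a | b) & (0b11 << (k - 2)))
            | ((1 << (k - 2)) - 1))


def mae_uniform(adder, n: int, k: int) -> float:
    """Eq.(6): mean absolute error over all 2^(2n) operand pairs."""
    size = 1 << n
    total = sum(abs(a + b - adder(a, b, k))
                for a in range(size) for b in range(size))
    return total / size ** 2


results = {f.__name__: mae_uniform(f, 8, 4)
           for f in (trunc_add, nzbt_add, oloca_add)}
print(results)
```

Under these assumptions, the exhaustive values (15.0, 5.3125 and about 3.703) are consistent with the sampled MAE values reported for n=8, k=4 in Table I.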

Table I

COMPARISON RESULTS IN TERMS OF PSNR AND MAE

| n  | k  | PSNR (dB), Trunc | PSNR (dB), NZBT | PSNR (dB), OLOCA | MAE, Trunc | MAE, NZBT | MAE, OLOCA |
|----|----|------------------|-----------------|------------------|------------|-----------|------------|
| 32 | 16 | 101.68           | 110.13          | 112.17           | 65530.25   | 21835.74  | 15366.06   |
| 32 | 12 | 125.76           | 134.20          | 136.25           | 4096.98    | 1367.28   | 960.95     |
| 32 | 8  | 149.80           | 158.30          | 160.35           | 254.88     | 85.27     | 59.91      |
| 16 | 8  | 53.54            | 61.96           | 64.01            | 254.94     | 85.37     | 60.00      |
| 16 | 6  | 65.67            | 74.01           | 76.05            | 62.99      | 21.33     | 14.99      |
| 16 | 4  | 78.08            | 86.06           | 88.12            | 15.00      | 5.32      | 3.70       |
| 8  | 4  | 29.88            | 37.87           | 39.93            | 15.00      | 5.31      | 3.70       |
| 8  | 3  | 36.41            | 43.95           | 46.04            | 7.00       | 2.62      | 1.78       |
| 8  | 2  | 43.54            | 50.18           | 52.39            | 3.00       | 1.25      | 0.75       |

### B. Results in uniform distribution

In order to compare the approximate adder architectures for uniformly distributed inputs,  $10^6$  uniform random numbers have been generated for 8, 16, and 32-bit operands. The accuracy of the conventional truncated adder is compared with that of the NZBT and OLOCA adders. The results for PSNR and MAE are tabulated in Table I.

It can be seen from the table that the OLOCA architecture outperforms both NZBT adder and conventional truncated adder for all the bit-widths. For example, a 32-bit OLOCA adder with 12-bit truncation has 30 % and 77 % lower MAE in comparison with NZBT and Trunc adders, respectively.

### C. Results in Gaussian distributions

In this section, the effects of Gaussian distribution on the approximate adders are discussed. Here, the results from uniform distribution are compared with Gaussian distribution results to see if they are still valid in real scenarios such as digital signal processing applications.

To analyze the effects of distribution on an 8-bit NZBT adder, we select two scenarios where the mean values of the Gaussian distributions are  $\mu$ =0 and  $\mu$ =7. The results for PSNR for truncated bits from k=2 to k=6 are depicted in Fig.1.



Figure 1. PSNR results for Gaussian and uniform distributions

As can be seen in Fig.1, for k=4 and  $\mu$=7, the accuracy of the NZBT adder does not follow the expected slope (i.e. a 6 dB decrement of PSNR when k is increased by one), and an abrupt change is observed. This shows that a Gaussian distribution of the input values leads to different results than a uniform distribution.

To study the error behavior of the adders for different mean values and standard deviations, n=8 and k=3 are chosen as an illustration. In this scenario,  $2^8$  different mean values ( $\mu$ ) (the expectation of the distribution) from 0 to 255 and 20 different standard deviations ( $\sigma$ ) from 0 to 20 are analyzed. The results for MSE and MAE are compared for NZBT and OLOCA adders in Fig.2.
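A Monte Carlo experiment of this kind can be sketched as follows. Several details are assumptions that the text does not specify: both operands are drawn independently from the same Gaussian, samples are rounded to the nearest integer and clipped to the unsigned n-bit range, and the NZBT adder is represented by a simple behavioral model. The sample count and seed are arbitrary.

```python
import math
import random


def nzbt_add(a: int, b: int, k: int) -> int:
    """Behavioral NZBT model (assumption): exact upper part, k LSBs forced to 1."""
    return (((a >> k) + (b >> k)) << k) | ((1 << k) - 1)


def gaussian_operand(rng: random.Random, mu: float, sigma: float, n: int) -> int:
    """Assumption: round the Gaussian sample and clip it to [0, 2^n - 1]."""
    value = round(rng.gauss(mu, sigma))
    return min(max(value, 0), (1 << n) - 1)


def psnr_gaussian(adder, n, k, mu, sigma, samples=20000, seed=0):
    """Estimate the PSNR of `adder` for Gaussian distributed inputs."""
    rng = random.Random(seed)
    squared = 0
    for _ in range(samples):
        a = gaussian_operand(rng, mu, sigma, n)
        b = gaussian_operand(rng, mu, sigma, n)
        squared += (a + b - adder(a, b, k)) ** 2
    mse = squared / samples
    if mse == 0:
        return math.inf  # error-free for these inputs
    max_out = 2 * ((1 << n) - 1)
    return 20 * math.log10(max_out / math.sqrt(mse))


if __name__ == "__main__":
    for sigma in (0.0, 1.0, 2.0, 5.0, 20.0):
        print(sigma, round(psnr_gaussian(nzbt_add, 8, 3, 128, sigma), 2))
```

Sweeping `mu` over 0 to 255 and `sigma` over 0 to 20 with such a routine yields surfaces of the kind analyzed in this section; the exact values depend on the sampling assumptions listed above.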



Figure 2. Comparison of MSE and MAE for different mean values and standard deviations: (a) MSE for NZBT adder, (b) MSE for OLOCA adder, (c) MAE for NZBT adder, (d) MAE for OLOCA adder.

As depicted in Fig.2, the MSE and MAE of the OLOCA adder are lower than those of the NZBT adder in most scenarios. However, the results for the Gaussian distribution break the rule observed for the uniform distribution: whereas the OLOCA adder always outperforms the NZBT adder for uniformly distributed inputs, there are scenarios with Gaussian distributed inputs in which the NZBT adder is more accurate than the OLOCA adder. As an illustration, the PSNR versus  $\sigma$  for the scenario  $\mu$=0 is shown in Fig.3. Similarly, the scenario  $\mu$=4 is depicted in Fig.4.

As can be seen in Fig.3 and Fig.4, for small values of  $\sigma$ , considerable deviations in PSNR are observed as the mean value changes. With extremely small values of  $\sigma$ , the difference between the PSNR for Gaussian and uniform distributions is maximized. For instance, this difference is about 6.7 dB for the NZBT adder and 8.05 dB for the OLOCA adder in Fig.4.

On the other hand, for Gaussian distributed inputs with a large standard deviation ( $\sigma$ ), the mean value ( $\mu$ ) plays practically no role in the accuracy. As shown in Fig.3 and Fig.4, when  $\sigma$  is large, the influence of the specific  $\mu$  is weak and the curves flatten, approaching the results for the uniform distribution. In addition, as discussed before, whereas the OLOCA adder always outperforms the NZBT adder for uniformly distributed inputs, this is not always the case for Gaussian distributed inputs. For example, in Fig.4, for small values of  $\sigma$ , the NZBT adder has a better PSNR than the OLOCA adder.

Figure 3. PSNR versus standard deviation for the case  $\mu$=0

Figure 4. PSNR versus standard deviation for the case  $\mu$=4

## IV. MODEL

To analyze the error behavior of an n-bit adder for Gaussian distributed inputs,  $2^n$  different mean values would normally have to be considered. However, for an n-bit truncated adder with k-bit truncation, the error behavior can be modeled by considering only  $2^k$  mean values. In this section, to be consistent with the previous section, we consider 8-bit adders and analyze their error for standard deviations from  $\sigma = 0$  to  $\sigma = 20$ .

From the previous section, we know that, with increasing standard deviation, the error of the adders approaches the error for the uniform distribution. The error of the adders for the uniform distribution has been modeled in [4], [5]. As a result, the endpoints of the curves are known.

To model the start point, consider  $\sigma = 0$ , for which the two inputs of the adder are the same and equal to the mean value  $\mu$ . Consequently, the MSE of the NZBT adder can be calculated as:

$$MSE_{\sigma=0} = (2^k - 1 - 2\mu)^2 . \tag{7}$$

From Eq.(5) and Eq.(7), the PSNR for the start point can be calculated. For example, for an 8-bit NZBT adder with k=3, it can be concluded that there are only 4 unique curves for error versus standard deviation. This repetition of PSNR for different mean values can be seen in Fig.5(a) for the NZBT adder, and in Fig.5(b) for the OLOCA adder. Note that the start point of the OLOCA adder can be calculated in a similar way.

Figure 5. Repetition of PSNR with changing  $\mu$  for (a) NZBT adder, and (b) OLOCA adder.

It is worth mentioning that from Eq.(7), it can be concluded that the minimum MSE of the NZBT adder can be 1 for any k, which results in maximum PSNR equal to 54 dB for the 8-bit adder.
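The start-point behavior can be verified with a short sketch. It assumes that, for mean values beyond the $2^k$ considered ones, $\mu$ in Eq.(7) is reduced to its k LSBs, and it cross-checks the formula against a simple behavioral NZBT model; the function names are hypothetical.

```python
import math


def nzbt_add(a: int, b: int, k: int) -> int:
    """Behavioral NZBT model (assumption): exact upper part, k LSBs forced to 1."""
    return (((a >> k) + (b >> k)) << k) | ((1 << k) - 1)


def mse_start_nzbt(mu: int, k: int) -> int:
    """Eq.(7), with mu reduced to its k LSBs (assumption)."""
    mu_low = mu % (1 << k)
    return ((1 << k) - 1 - 2 * mu_low) ** 2


n, k = 8, 3
max_out = 2 * ((1 << n) - 1)
start_psnr = [20 * math.log10(max_out / math.sqrt(mse_start_nzbt(mu, k)))
              for mu in range(1 << n)]
unique_psnr = sorted({round(p, 6) for p in start_psnr})
print(len(unique_psnr), [round(p, 2) for p in unique_psnr])
```

For n=8 and k=3 the sketch finds four distinct start points, and the largest one, reached when the squared error is 1, is about 54 dB.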

Since the start point and the endpoint of the error curves have been modeled, we propose a linear model to describe the behavior of the errors in between. If we model the value of  $\sigma$  at which the error reaches its value for the uniform distribution, we can find the slope of the linear model, and the model is complete. Based on our observations, we model this point as  $\sigma = 2^{k-2}$ , which results in an acceptable accuracy for the model.

$$PSNR = \begin{cases} \frac{\sigma(PSNR_u - PSNR_0)}{2^{k-2}} + PSNR_0 & \text{if } \sigma < 2^{k-2} \\ PSNR_u & \text{otherwise} \end{cases} \tag{8}$$

where  $PSNR_0$  and  $PSNR_u$  are the PSNRs for the start point and the endpoint, respectively.

Figure 6. Linear NZBT model for different k with  $\mu$=0: (a) model for k=3, (b) model for k=4, (c) model for k=5

Figure 7. Linear NZBT adder model for different k with different  $\mu$: (a) model for k=3,  $\mu$=3, (b) model for k=4,  $\mu$=7, (c) model for k=5,  $\mu$=15

To evaluate the accuracy of the model, the simulation results are compared with the results from the model. The results are depicted for some ascending and descending cases in Fig.6 and Fig.7, respectively. As can be seen in the figures, the model accurately predicts the values of PSNR for different k,  $\sigma$ , and  $\mu$ s.
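The piecewise-linear model of Eq.(8) is straightforward to evaluate. The sketch below is one possible implementation with hypothetical function and parameter names; the example takes the start point from Eq.(7) and Eq.(5) for an 8-bit NZBT adder with k=3 and $\mu$=0, and the endpoint from the corresponding NZBT entry of Table I.

```python
import math


def linear_psnr_model(sigma: float, k: int, psnr_0: float, psnr_u: float) -> float:
    """Eq.(8): linear from psnr_0 at sigma = 0 to psnr_u at sigma = 2^(k-2), then flat."""
    knee = 2 ** (k - 2)
    if sigma < knee:
        return sigma * (psnr_u - psnr_0) / knee + psnr_0
    return psnr_u


n, k, mu = 8, 3, 0
max_out = 2 * ((1 << n) - 1)
# Start point: Eq.(7) inserted into Eq.(5); the square root of the MSE is |2^k - 1 - 2*mu|.
psnr_0 = 20 * math.log10(max_out / abs((1 << k) - 1 - 2 * mu))
psnr_u = 43.95  # uniform-distribution PSNR of NZBT for n=8, k=3 (Table I)

for sigma in (0.0, 0.5, 1.0, 2.0, 10.0):
    print(sigma, round(linear_psnr_model(sigma, k, psnr_0, psnr_u), 2))
```

In this example the start point lies below the endpoint, so the modeled curve ascends until $\sigma = 2$ and stays at the uniform-distribution value afterwards.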

## V. CONCLUSION AND FUTURE WORK

This paper discusses the influence of different input data distributions on small-error approximate adders. In many applications, such as image processing, machine learning and wireless communication, the data has a distribution which can be considered Gaussian. It has been shown in this paper that, for Gaussian distributed inputs, the selection of the approximate architecture might differ from the selection made for uniformly distributed inputs. Consequently, it is necessary to take the effects of the data distribution on the accuracy of the approximate architectures into consideration.

In addition, presenting a linear model to analyze the accuracy of a truncated adder, we showed that the error behavior of the adders can be modeled for Gaussian distributed inputs.

## REFERENCES

- [1] S. Mittal, "A survey of techniques for approximate computing," *ACM Comput. Surv.*, vol. 48, no. 4, pp. 62:1–62:33, Mar. 2016.
- [2] A. Najafi, M. Weißbrich, G. Payá-Vayá, and A. Garcia-Ortiz, "Coherent design of hybrid approximate adders: Unified design framework and metrics," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 8, no. 4, pp. 736–745, Dec 2018.
- [3] H. R. Mahdiani, A. Ahmadi, S. M. Fakhraie, and C. Lucas, "Bio-inspired imprecise computational blocks for efficient VLSI implementation of soft-computing applications," *IEEE Trans. on Circuits and Systems I: Regular Papers*, vol. 57, no. 4, pp. 850–862, April 2010.
- [4] F. Frustaci, S. Perri, P. Corsonello, and M. Alioto, "Energy-quality scalable adders based on nonzeroing bit truncation," *IEEE Trans. on Very Large Scale Integration (VLSI) Systems*, pp. 1–5, 2018.
- [5] A. Dalloo, A. Najafi, and A. Garcia-Ortiz, "Systematic design of an approximate adder: The optimized lower part constant-or adder," *IEEE Trans. on Very Large Scale Integration (VLSI) Systems*, vol. 26, no. 8, pp. 1595–1599, Aug 2018.
- [6] A. B. Kahng and S. Kang, "Accuracy-configurable adder for approximate arithmetic designs," in *Proceedings of the 49th Annual Design Automation Conference*, ser. DAC '12. New York, NY, USA: ACM, 2012, pp. 820–825.
- [7] M. Shafique, W. Ahmad, R. Hafiz, and J. Henkel, "A low latency generic accuracy configurable adder," in 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC), June 2015, pp. 1–6.
- [8] N. Zhu, W. L. Goh, and K. S. Yeo, "An enhanced low-power high-speed adder for error-tolerant application," in *Proceedings of the 2009 12th International Symposium on Integrated Circuits*, Dec 2009, pp. 69–72.
- [9] D. Mohapatra, V. K. Chippa, A. Raghunathan, and K. Roy, "Design of voltage-scalable meta-functions for approximate computing," in 2011 Design, Automation Test in Europe, March 2011, pp. 1–6.
- [10] C. Liu, J. Han, and F. Lombardi, "An analytical framework for evaluating the error characteristics of approximate adders," *IEEE Trans. on Computers*, vol. 64, no. 5, pp. 1268–1281, May 2015.
- [11] J. Park, J. H. Choi, and K. Roy, "Dynamic bit-width adaptation in DCT: An approach to trade off image quality and computation energy," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 18, no. 5, pp. 787–793, May 2010.