mersenneforum.org  

Old 2022-04-30, 03:53   #1
 
sweety439's Avatar
 
"99(4^34019)99 palind"
Nov 2016
(P^81993)SZ base 36

2^5×5×23 Posts
New floating-point arithmetic

Double: 64 bits

bit 63: sign (0 = positive, 1 = negative)

bit 62 to bit 52: exponent (bit 62 = e_10, bit 61 = e_9, bit 60 = e_8, ..., bit 53 = e_1, bit 52 = e_0). The exponent is stored in two's complement, so it ranges from -1024 (e_10,e_9,e_8,...,e_1,e_0 = 10000000000) through +1023 (e_10,e_9,e_8,...,e_1,e_0 = 01111111111).

bit 51 to bit 0: significand precision (bit 51 = s_51, bit 50 = s_50, bit 49 = s_49, ..., bit 1 = s_1, bit 0 = s_0)

The value of the number is

(-1)^sign * (1.s_51 s_50 s_49 ... s_1 s_0)_2 * 2^((e_10 e_9 e_8 ... e_1 e_0)_2)

(In decimal scientific notation the integer part can be 1, 2, 3, ..., 8, 9, but in binary it must be 1, so this leading 1 does not need to be stored.)

Special cases:

* sign = 0, e_10,e_9,e_8,...,e_1,e_0 = 01111111111 (+1023), s_51,s_50,s_49,...,s_1,s_0 are all 1 --> +∞
* sign = 1, e_10,e_9,e_8,...,e_1,e_0 = 01111111111 (+1023), s_51,s_50,s_49,...,s_1,s_0 are all 1 --> -∞
* sign = 0, e_10,e_9,e_8,...,e_1,e_0 = 10000000000 (-1024), s_51,s_50,s_49,...,s_1,s_0 are all 0 --> 0
* sign = 1, e_10,e_9,e_8,...,e_1,e_0 = 10000000000 (-1024), s_51,s_50,s_49,...,s_1,s_0 are all 0 --> NaN

(I think this double format is better than the original one, since it is bijective: every value, including each of +∞, -∞, 0 and NaN, has exactly one bit pattern.)
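A minimal Python sketch of a decoder for the layout described in this post may make the rules easier to check. The function name `decode` and the choice to return exact `Fraction` values (with Python's `math.inf` and `math.nan` standing in for the special cases) are assumptions made for illustration, not part of the proposal.

```python
from fractions import Fraction
import math

EXP_BITS = 11   # exponent field width, two's complement
SIG_BITS = 52   # stored significand bits (leading 1 is implicit)

def decode(bits: int):
    """Decode a 64-bit pattern under the layout proposed in this post."""
    sign = bits >> 63
    exp_field = (bits >> SIG_BITS) & ((1 << EXP_BITS) - 1)
    sig = bits & ((1 << SIG_BITS) - 1)

    # Two's complement: fields with the top bit set are negative exponents.
    if exp_field >= 1 << (EXP_BITS - 1):
        exp = exp_field - (1 << EXP_BITS)
    else:
        exp = exp_field

    # The four special patterns listed in the post.
    if exp == 1023 and sig == (1 << SIG_BITS) - 1:
        return -math.inf if sign else math.inf
    if exp == -1024 and sig == 0:
        return math.nan if sign else Fraction(0)

    # Ordinary case: (-1)^sign * 1.sig * 2^exp, computed exactly.
    magnitude = (1 + Fraction(sig, 1 << SIG_BITS)) * Fraction(2) ** exp
    return -magnitude if sign else magnitude

print(decode(0x0057800000000000))
```

Because the arithmetic is exact, the result can be compared directly against integers and fractions without any rounding concerns.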

Last fiddled with by sweety439 on 2022-05-03 at 03:34
Old 2022-04-30, 03:57   #2
sweety439

Examples:

Code:
0000 0000 0000 0000   = 1
8010 0000 0000 0000   = −2
0057 8000 0000 0000   = 47
8085 5000 0000 0000   = −341
3fff ffff ffff fffe   = 2^1024 − 2^972 (Max double)
4000 0000 0000 0001   = 2^−1024 + 2^−1076 (Min positive double)
7ff0 0000 0000 0000   = 1/2
7fe5 5555 5555 5555   ≈ 1/3
4000 0000 0000 0000   = 0
c000 0000 0000 0000   = NaN
3fff ffff ffff ffff   = ∞
bfff ffff ffff ffff   = −∞
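The hex patterns above can be reproduced going the other way, from value to bits. The following is only a sketch: the name `encode` is hypothetical, it accepts only values that are exactly representable (no rounding is attempted), and it covers zero but not ∞ or NaN.

```python
from fractions import Fraction

SIG_BITS = 52

def encode(value) -> str:
    """Encode an exactly representable rational in the proposed 64-bit layout."""
    x = Fraction(value)
    if x == 0:
        bits = 1 << 62  # sign 0, exponent -1024, significand 0
    else:
        sign = 1 if x < 0 else 0
        x = abs(x)
        # Find exp such that 1 <= x / 2^exp < 2.
        exp = x.numerator.bit_length() - x.denominator.bit_length()
        if Fraction(2) ** exp > x:
            exp -= 1
        # Fractional part of the significand, scaled to an integer field.
        sig = (x / Fraction(2) ** exp - 1) * (1 << SIG_BITS)
        if sig.denominator != 1 or not -1024 <= exp <= 1023:
            raise ValueError("value is not exactly representable")
        sig = int(sig)
        if (exp == 1023 and sig == (1 << SIG_BITS) - 1) or (exp == -1024 and sig == 0):
            raise ValueError("pattern is reserved for a special value")
        # Masking with 0x7FF yields the two's-complement exponent field.
        bits = (sign << 63) | ((exp & 0x7FF) << SIG_BITS) | sig
    text = f"{bits:016x}"
    return " ".join(text[i:i + 4] for i in range(0, 16, 4))

print(encode(47))
```

Values such as 1/3 raise `ValueError` here, since the listed pattern for 1/3 is only an approximation.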

Last fiddled with by sweety439 on 2022-05-03 at 03:22
Old 2022-04-30, 04:08   #3
sweety439

single: 32 bits, including 1 sign bit, 8 exponent bits, 23 significand precision bits
double: 64 bits, including 1 sign bit, 11 exponent bits, 52 significand precision bits
long double: 80 bits, including 1 sign bit, 16 exponent bits, 63 significand precision bits
quadruple: 128 bits, including 1 sign bit, 15 exponent bits, 112 significand precision bits
octuple: 256 bits, including 1 sign bit, 16 exponent bits, 239 significand precision bits. (I think this is better than the current octuple precision, which has 1 sign bit, 19 exponent bits, and 236 significand precision bits. With 239 stored bits the significand has 239+1 = 240 significant bits, and 240 is a highly composite number, i.e. it has many divisors. 240 bits ≈ 72 decimal digits, so it has 72 significant decimal digits, and 72 also has many divisors; 72 is known as the smallest Achilles number.)

super: 65536 bits, including 1 sign bit, 256 exponent bits, 65279 significand precision bits

Using my convention (a two's-complement exponent, and exactly one bit pattern for each of +∞, -∞, 0 and NaN), super-precision has 65279+1 = 65280 significant bits (≈19651 decimal digits). Its maximum number is 2^(2^255)-2^(2^255-65279), and its minimum positive number is 2^(-2^255)+2^(-2^255-65279).
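As a quick sanity check on the field widths listed in this post, the sketch below verifies that sign + exponent + stored significand bits add up to each total, and converts the precision to decimal digits. The table `FORMATS` and the helper `summarize` are illustrative names; for "super" it assumes 65279 stored significand bits, which is what the "65279+1" in the paragraph above implies and what makes the fields sum to 65536.

```python
import math

# name: (total bits, exponent bits, stored significand bits)
FORMATS = {
    "single": (32, 8, 23),
    "double": (64, 11, 52),
    "long double": (80, 16, 63),
    "quadruple": (128, 15, 112),
    "octuple": (256, 16, 239),
    "super": (65536, 256, 65279),
}

def summarize(total: int, exp_bits: int, sig_bits: int):
    """Return (significant bits, equivalent decimal digits) for one format."""
    if 1 + exp_bits + sig_bits != total:
        raise ValueError("fields do not add up to the total width")
    precision = sig_bits + 1  # the implicit leading 1 adds one bit
    return precision, precision * math.log10(2)

for name, fields in FORMATS.items():
    precision, digits = summarize(*fields)
    print(f"{name}: {precision} significant bits, about {digits:.2f} decimal digits")
```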

Last fiddled with by sweety439 on 2022-05-03 at 03:32
Old 2022-04-30, 04:25   #4
Undefined
 
retina's Avatar
 
"The unspeakable one"
Jun 2006
My evil lair

61×109 Posts

The lack of negative zero will be a problem for some algorithms.

And if NaN always, or never, faults then it could be cumbersome. The existing QNaN vs SNaN distinction allows for some nice efficiencies. And the extra bits in the QNaN/SNaN encoding provide good debugging opportunities.

I don't care about denormals, so whatever.

But -0 == NaN? Why? Extra circuitry/code for what gain?
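One common illustration of the negative-zero point (not necessarily the algorithms retina has in mind) is branch-cut handling. In this small sketch, Python floats serve as existing IEEE 754 doubles: +0.0 and -0.0 compare equal, yet `atan2` uses the sign of zero to decide which side of the negative x-axis the point lies on.

```python
import math

# IEEE 754 doubles keep +0.0 and -0.0 distinct even though they compare equal.
pos_zero, neg_zero = 0.0, -0.0

same_value = (pos_zero == neg_zero)
sign_of_neg_zero = math.copysign(1.0, neg_zero)

# The sign of the zero y-coordinate selects the side of the branch cut.
angle_above = math.atan2(pos_zero, -1.0)
angle_below = math.atan2(neg_zero, -1.0)

print(same_value, sign_of_neg_zero, angle_above, angle_below)
```

With a single unsigned zero, the two angles could not be told apart from the inputs alone.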
Old 2022-04-30, 04:33   #5
sweety439

In super-precision:

Numbers are rounded to 65280 significant binary digits using Gaussian rounding (round half to even). In the patterns below, the bracketed digit is the 65280th significant binary digit:

* ...[0]0... --> ...0
* ...[0]1...1... --> ...1
* ...[0]1000... (the 65281st digit is 1 and every digit after it is 0) --> ...0
* ...[1]1... --> ...(+1)0
* ...[1]0...0... --> ...1
* ...[1]0111... (the 65281st digit is 0 and every digit after it is 1) --> ...(+1)0

(remember: 0.999... = 1)
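The tie-breaking cases above can be tried out on small numbers. This is a sketch under stated assumptions: the function name `round_half_even` is hypothetical, it works on finite positive integers only (so the 0.999... = 1 case for infinite expansions is not modelled), and the precision `p` is a parameter rather than the fixed 65280.

```python
def round_half_even(n: int, p: int) -> int:
    """Round a positive integer n to p significant bits, ties to even."""
    shift = n.bit_length() - p
    if shift <= 0:
        return n  # already fits in p significant bits
    kept = n >> shift                  # the leading p bits
    dropped = n & ((1 << shift) - 1)   # everything after them
    half = 1 << (shift - 1)
    # Round up when above half, or exactly half with an odd last kept bit.
    if dropped > half or (dropped == half and kept & 1):
        kept += 1
    return kept << shift

# With p = 3: 0b10010 is a tie with an even last kept bit, so it rounds down.
print(bin(round_half_even(0b10010, 3)))
```

When the round-up carries out of the top bit, as with 0b11110 at p = 3, the result is the next power of two, which corresponds to the "(+1)0" cases in the list.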

These calculations return "+∞":

* the result is >= 2^(2^255). (In fact, >= 2^(2^255)-2^(2^255-65281) is enough, since results are rounded to 65280 significant binary digits with Gaussian rounding. The number 2^(2^255)-2^(2^255-65281) has 65281 consecutive 1's below the 2^(2^255) bit position; its 2^(2^255-65280) digit is 1 and the digits after it are exactly a half, so it is rounded up to 2^(2^255), which becomes +∞.)
* (+∞) + (x) (except the cases x = -∞ and x = NaN)
* (+∞) - (x) (except the cases x = +∞ and x = NaN)
* (x) - (-∞) (except the cases x = -∞ and x = NaN)
* (+∞) * (x) when x > 0 (including x = +∞)
* (-∞) * (x) when x < 0 (including x = -∞)
* (+∞) / (x) when x >= 0 (except x = +∞)
* (-∞) / (x) when x < 0 (except x = -∞)
* (x) / (0) when x > 0 (including x = +∞)
* (+∞) ^ (x) when x > 0 (including x = +∞)
* (x) ^ (+∞) when x > 1 (including x = +∞)
* (x) ^ (-∞) when 0 <= x < 1

These calculations return "0":

* the result is between -2^(-2^255) and 2^(-2^255) inclusive. (In fact, between -2^(-2^255)-2^(-2^255-65280) and 2^(-2^255)+2^(-2^255-65280) inclusive, since results are rounded to 65280 significant binary digits with Gaussian rounding. The number 2^(-2^255)+2^(-2^255-65280) has 65279 consecutive 0's after the 1 in the 2^(-2^255) bit position; its 2^(-2^255-65279) digit is 0 and the digits after it are exactly a half, so it is rounded down to 2^(-2^255), which becomes 0.)
* (x) / (+∞) (except the cases x = +∞, x = -∞ and x = NaN)
* (x) / (-∞) (except the cases x = +∞, x = -∞ and x = NaN)
* (x) ^ (+∞) when 0 <= x < 1
* (x) ^ (-∞) when x > 1 (including x = +∞)

These calculations return "NaN":

* at least one number is NaN
* the result is a non-real complex number, e.g. (-1)^(1/2)
* (+∞) + (-∞)
* (+∞) - (+∞)
* (+∞) * (0)
* (+∞) / (+∞)
* (0) / (0)
* (0) ^ (0)
* (+∞) ^ (0)
* 1 ^ (+∞)
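The division rules from the three lists above can be collected into one function. This is a rough sketch, not a super-precision implementation: Python floats stand in for the values, the name `div` is hypothetical, and the negative cases that the lists do not spell out (for example a negative x divided by 0) are assumed to mirror the positive ones.

```python
import math

def div(x: float, y: float) -> float:
    """Division following the rules listed above, with a single unsigned zero."""
    if math.isnan(x) or math.isnan(y):
        return math.nan                 # at least one operand is NaN
    if math.isinf(x) and math.isinf(y):
        return math.nan                 # infinity / infinity
    if y == 0:
        if x == 0:
            return math.nan             # 0 / 0
        # x / 0 is +inf for x > 0; -inf for x < 0 is an assumed mirror case.
        return math.inf if x > 0 else -math.inf
    if math.isinf(y):
        return 0.0                      # finite x / infinity
    q = x / y
    return 0.0 if q == 0 else q         # collapse -0.0 into the single zero

print(div(1.0, 0.0), div(5.0, math.inf), div(0.0, 0.0))
```

Note that Python's own `/` raises `ZeroDivisionError` for a zero divisor, which is why that case is handled before the division.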

Last fiddled with by sweety439 on 2022-05-03 at 03:34
Old 2022-04-30, 04:33   #6
sweety439

Quote:
Originally Posted by retina View Post
The lack of negative zero will be a problem for some algorithms.

And if NaN always, or never, faults then it could be cumbersome. The existing QNaN vs SNaN distinction allows for some nice efficiencies. And the extra bits in the QNaN/SNaN encoding provide good debugging opportunities.

I don't care about denormals, so whatever.

But -0 == NaN? Why? Extra circuitry/code for what gain?
The value of -0 is the same as +0, thus no need to use a bit combo for -0
Old 2022-04-30, 04:48   #7
retina

Quote:
Originally Posted by sweety439 View Post
The value of -0 is the same as +0, thus no need to use a bit combo for -0
I mean the way it is handled in the machine, not the value.

You need something to manipulate the bits to do the computations. If you make the bit patterns hard to deal with then it is no fun to use. Too many special cases in the code or the circuitry.
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
