|
| satan.sa 3.3 12/19/90
|
| The entry point satan computes the arctangent of an
| input value. satand does the same except the input value is a
| denormalized number.
|
| Input: Double-extended value in memory location pointed to by address
| register a0.
|
| Output: Arctan(X) returned in floating-point register Fp0.
|
| Accuracy and Monotonicity: The returned result is within 2 ulps in
| 64 significant bit, i.e. within 0.5001 ulp to 53 bits if the
| result is subsequently rounded to double precision. The
| result is provably monotonic in double precision.
|
| Speed: The program satan takes approximately 160 cycles for input
| argument X such that 1/16 < |X| < 16. For the other arguments,
| the program will run no worse than 10% slower.
|
| Algorithm:
| Step 1. If |X| >= 16 or |X| < 1/16, go to Step 5.
|
| Step 2. Let X = sgn * 2**k * 1.xxxxxxxx...x. Note that k = -4, -3,..., or 3.
| Define F = sgn * 2**k * 1.xxxx1, i.e. the first 5 significant bits
| of X with a bit-1 attached at the 6-th bit position. Define u
| to be u = (X-F) / (1 + X*F).
|
| Step 3. Approximate arctan(u) by a polynomial poly.
|
| Step 4. Return arctan(F) + poly, arctan(F) is fetched from a table of values
| calculated beforehand. Exit.
|
| Step 5. If |X| >= 16, go to Step 7.
|
| Step 6. Approximate arctan(X) by an odd polynomial in X. Exit.
|
| Step 7. Define X' = -1/X. Approximate arctan(X') by an odd polynomial in X'.
| Arctan(X) = sign(X)*Pi/2 + arctan(X'). Exit.
|
| Copyright (C) Motorola, Inc. 1990
| All Rights Reserved
|
| For details on the license for this file, please see the
| file, README, in this same directory.
|satan idnt 2,1 | Motorola 040 Floating Point Software Package
|--THE MOST LIKELY CASE, |X| IN [1/16, 16). WE USE TABLE TECHNIQUE
|--THE IDEA IS ATAN(X) = ATAN(F) + ATAN( [X-F] / [1+XF] ).
|--SO IF F IS CHOSEN TO BE CLOSE TO X AND ATAN(F) IS STORED IN
|--A TABLE, ALL WE NEED IS TO APPROXIMATE ATAN(U) WHERE
|--U = (X-F)/(1+XF) IS SMALL (REMEMBER F IS CLOSE TO X). IT IS
|--TRUE THAT A DIVIDE IS NOW NEEDED, BUT THE APPROXIMATION FOR
|--ATAN(U) IS A VERY SHORT POLYNOMIAL AND THE INDEXING TO
|--FETCH F AND SAVING OF REGISTERS CAN BE ALL HIDED UNDER THE
|--DIVIDE. IN THE END THIS METHOD IS MUCH FASTER THAN A TRADITIONAL
|--ONE. NOTE ALSO THAT THE TRADITIONAL SCHEME THAT APPROXIMATE
|--ATAN(X) DIRECTLY WILL NEED TO USE A RATIONAL APPROXIMATION
|--(DIVISION NEEDED) ANYWAY BECAUSE A POLYNOMIAL APPROXIMATION
|--WILL INVOLVE A VERY LONG POLYNOMIAL.
|--NOW WE SEE X AS +-2^K * 1.BBBBBBB....B <- 1. + 63 BITS
|--WE CHOSE F TO BE +-2^K * 1.BBBB1
|--THAT IS IT MATCHES THE EXPONENT AND FIRST 5 BITS OF X, THE
|--SIXTH BITS IS SET TO BE 1. SINCE K = -4, -3, ..., 3, THERE
|--ARE ONLY 8 TIMES 16 = 2^7 = 128 |F|'S. SINCE ATAN(-|F|) IS
|-- -ATAN(|F|), WE NEED TO STORE ONLY ATAN(|F|).
ATANMAIN:
movew #0x0000,XDCARE(%a6) | ...CLEAN UP X JUST IN CASE
andil #0xF8000000,XFRAC(%a6) | ...FIRST 5 BITS
oril #0x04000000,XFRAC(%a6) | ...SET 6-TH BIT TO 1
movel #0x00000000,XFRACLO(%a6) | ...LOCATION OF X IS NOW F
fmovex %fp0,%fp1 | ...FP1 IS X
fmulx X(%a6),%fp1 | ...FP1 IS X*F, NOTE THAT X*F > 0
fsubx X(%a6),%fp0 | ...FP0 IS X-F
fadds #0x3F800000,%fp1 | ...FP1 IS 1 + X*F
fdivx %fp1,%fp0 | ...FP0 IS U = (X-F)/(1+X*F)
|--WHILE THE DIVISION IS TAKING ITS TIME, WE FETCH ATAN(|F|)
|--CREATE ATAN(F) AND STORE IT IN ATANF, AND
|--SAVE REGISTERS FP2.
movel %d2,-(%a7) | ...SAVE d2 TEMPORARILY
movel %d0,%d2 | ...THE EXPO AND 16 BITS OF X
andil #0x00007800,%d0 | ...4 VARYING BITS OF F'S FRACTION
andil #0x7FFF0000,%d2 | ...EXPONENT OF F
subil #0x3FFB0000,%d2 | ...K+4
asrl #1,%d2
addl %d2,%d0 | ...THE 7 BITS IDENTIFYING F
asrl #7,%d0 | ...INDEX INTO TBL OF ATAN(|F|)
lea ATANTBL,%a1
addal %d0,%a1 | ...ADDRESS OF ATAN(|F|)
movel (%a1)+,ATANF(%a6)
movel (%a1)+,ATANFHI(%a6)
movel (%a1)+,ATANFLO(%a6) | ...ATANF IS NOW ATAN(|F|)
movel X(%a6),%d0 | ...LOAD SIGN AND EXPO. AGAIN
andil #0x80000000,%d0 | ...SIGN(F)
orl %d0,ATANF(%a6) | ...ATANF IS NOW SIGN(F)*ATAN(|F|)
movel (%a7)+,%d2 | ...RESTORE d2
|--THAT'S ALL I HAVE TO DO FOR NOW,
|--BUT ALAS, THE DIVIDE IS STILL CRANKING!
|--U IN FP0, WE ARE NOW READY TO COMPUTE ATAN(U) AS
|--U + A1*U*V*(A2 + V*(A3 + V)), V = U*U
|--THE POLYNOMIAL MAY LOOK STRANGE, BUT IS NEVERTHELESS CORRECT.
|--THE NATURAL FORM IS U + U*V*(A1 + V*(A2 + V*A3))
|--WHAT WE HAVE HERE IS MERELY A1 = A3, A2 = A1/A3, A3 = A2/A3.
|--THE REASON FOR THIS REARRANGEMENT IS TO MAKE THE INDEPENDENT
|--PARTS A1*U*V AND (A2 + ... STUFF) MORE LOAD-BALANCED
faddx %fp1,%fp0 | ...ATAN(U), FP1 RELEASED
fmovel %d1,%FPCR |restore users exceptions
faddx ATANF(%a6),%fp0 | ...ATAN(X)
bra t_frcinx
ATANBORS:
|--|X| IS IN d0 IN COMPACT FORM. FP1, d0 SAVED.
|--FP0 IS X AND |X| <= 1/16 OR |X| >= 16.
cmpil #0x3FFF8000,%d0
bgt ATANBIG | ...I.E. |X| >= 16
ATANSM:
|--|X| <= 1/16
|--IF |X| < 2^(-40), RETURN X AS ANSWER. OTHERWISE, APPROXIMATE
|--ATAN(X) BY X + X*Y*(B1+Y*(B2+Y*(B3+Y*(B4+Y*(B5+Y*B6)))))
|--WHICH IS X + X*Y*( [B1+Z*(B3+Z*B5)] + [Y*(B2+Z*(B4+Z*B6)] )
|--WHERE Y = X*X, AND Z = Y*Y.
cmpil #0x3FD78000,%d0
blt ATANTINY
|--COMPUTE POLYNOMIAL
fmulx %fp0,%fp0 | ...FP0 IS Y = X*X
movew #0x0000,XDCARE(%a6)
fmovex %fp0,%fp1
fmulx %fp1,%fp1 | ...FP1 IS Z = Y*Y
|--APPROXIMATE ATAN(-1/X) BY
|--X'+X'*Y*(C1+Y*(C2+Y*(C3+Y*(C4+Y*C5)))), X' = -1/X, Y = X'*X'
|--THIS CAN BE RE-WRITTEN AS
|--X'+X'*Y*( [C1+Z*(C3+Z*C5)] + [Y*(C2+Z*C4)] ), Z = Y*Y.
Die Informationen auf dieser Webseite wurden
nach bestem Wissen sorgfältig zusammengestellt. Es wird jedoch weder Vollständigkeit, noch Richtigkeit,
noch Qualität der bereit gestellten Informationen zugesichert.
Bemerkung:
Die farbliche Syntaxdarstellung und die Messung sind noch experimentell.