[1]
Page 212 (second page of the Read State Register description),
4th paragraph from the top is printed as:
"RDFPRS waits for all pending FPops
to complete before reading the FPRS register."
It *should* read:
"RDFPRS waits for all pending FPops
**and loads of floating-point registers** to complete
before reading the FPRS register."
[2]
Page 234 (Tagged Add):
The "op3" column is incorrect in the Opcode table;
the low-order bit should be "0" for all Tagged-Add instructions.
The table should read:
| Opcode   | op3         | Operation |
| TADDcc   | 10 0000 ... |           |
| TADDccTV | 10 0010 ... |           |
[3]
Page 80 (subsection 6.3.6.4, RESTORED description):
In the last line of 6.3.6.4, change:
CLEANWIN < NWINDOWS
to:
(CLEANWIN < (NWINDOWS-1))
[4]
Page 216 (RESTORED):
Third paragraph, last sentence, change
CLEANWIN != NWINDOWS
to:
(CLEANWIN < (NWINDOWS-1))
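The corrected condition in errata [3] and [4] can be illustrated with a minimal sketch. It assumes that RESTORED increments CLEANWIN when the condition holds; the function name and the plain-integer model of the register are hypothetical and serve only to show why the bound must be NWINDOWS-1.

```python
def restored_cleanwin(cleanwin, nwindows):
    """Return the CLEANWIN value after RESTORED (illustrative model only).

    Assumption: RESTORED increments CLEANWIN when the corrected condition
    CLEANWIN < (NWINDOWS-1) holds, and leaves it unchanged otherwise.
    """
    if cleanwin < nwindows - 1:
        return cleanwin + 1
    return cleanwin


# With NWINDOWS = 8 the register saturates at 7 (= NWINDOWS-1).
# Under the originally printed test "CLEANWIN < NWINDOWS", a value of 7
# would have been incremented to 8, outside the range 0..NWINDOWS-1.
print(restored_cleanwin(6, 8), restored_cleanwin(7, 8))
```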
[5]
Page 76, Section 6.3.4.[12] (branches):
A *taken* conditional branch (not just a conditional branch)
should have been referred to in the last sentences of
two subsections.
Change the last sentence in 6.3.4.1,
"Conditional Branches", to:
Note that the annul behavior of a
taken conditional branch is different from that of an
unconditional branch.
And change the last sentence in 6.3.4.2,
"Unconditional Branches" to:
Note that the annul behavior of an
unconditional branch is different from that of a
taken conditional branch.
[6]
Page 290, Section G, Table 43:
In the table entries for "cas", "casl", "casx", and "casxl",
the built-in constant names beginning with "ASI" should
all be preceded by "#" (as they were correctly specified
on p.286).
[7]
Page 242, Write State Register page:
In the Exceptions section:
"WRASR with rs1=16..31"
should read:
"WRASR with rd=16..31".
[8]
Page 57, subsection 5.2.10 (Register-Window State Registers):
A clarification has been added to Section 5, to allow
an implementation with 16 or fewer register windows the
option to implement the CWP, CANSAVE, CANRESTORE, OTHERWIN,
and CLEANWIN registers with fewer than 5 bits each, if
desired. The following text was added:
IMPL. DEP. #126: Privileged registers
CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN contain
values in the range 0..NWINDOWS-1. The effect of writing
a value greater than NWINDOWS-1 to any of these registers
is undefined. Although the width of each of these five
registers is nominally 5 bits, the width is implementation-dependent
and shall be between ceil(log2(NWINDOWS)) and 5 bits,
inclusive. If fewer than 5 bits are implemented, the unimplemented
upper bits shall read as 0 and writes to them shall have
no effect. All five registers shall be the same width.
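As a rough illustration of IMPL. DEP. #126, the sketch below computes the permitted register widths for a given NWINDOWS and models the "unimplemented upper bits read as 0" rule. The helper names are hypothetical, and the model treats a register as a plain integer.

```python
def min_register_width(nwindows):
    """ceil(log2(NWINDOWS)), computed exactly with integer arithmetic."""
    return (nwindows - 1).bit_length()


def allowed_widths(nwindows):
    """Widths permitted by IMPL. DEP. #126: ceil(log2(NWINDOWS)) .. 5 bits."""
    return list(range(min_register_width(nwindows), 6))


def read_register(stored_value, width):
    """Value seen by software: bits at or above `width` read as 0."""
    return stored_value & ((1 << width) - 1)


print(allowed_widths(8))    # 8 windows need at least 3 bits
print(allowed_widths(32))   # 32 windows need all 5 bits
```

All five registers (CWP, CANSAVE, CANRESTORE, OTHERWIN, CLEANWIN) would use the same width chosen from this range.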
[9]
Page 268, Table 32:
As a privileged instruction, "RDPR" should be listed with
a trailing superscript "P".
[10]
Pages 58-59, subsection 5.2.10 (Register-Window
State Registers):
Added note to descriptions of CWP, CANSAVE, CANRESTORE,
OTHERWIN, and CLEANWIN registers that the effect of writing
a value to them greater than NWINDOWS-1 is undefined.
[11]
Page 81:
In section 6.3.9, "FMOVc" was corrected to read "FMOVr".
[12]
Page 81:
In section 6.3.9, a sentence was added stating that FSR.cexc
and FSR.ftt are cleared by FMOVcc and FMOVr whether or
not the move occurs.
[13]
Page 171,
Appendix A: Sentence added specifying that LDFSR
does not affect the upper 32 bits of FSR.
[14]
Page 220(r142)/A.49(r142), third paragraph:
The words "the" and "and" were transposed in the implementation
dependency description. It now reads: "The location of
the SIR_enable control flag and the means of accessing
the SIR_enable control flag..."
[15]
Page 229, paragraph beginning "Store integer...":
"...used for the load..." changed to "...used for the
store...".
[16]
Page 231, Appendix
A: Corrected SWAP deprecation note to recommend
use of "CASA" or "CASXA" (not "CASX") in place of SWAP.
[17]
Page 258, D.3.3., rule (1): The text was clarified,
to read "(1) The execution of Y is conditional on X, and
S(Y) is true."
[18]
Page 312, Appendix
I:
Missing word "not" added
to Compatibility Note: "The coprocessor opcodes were eliminated
because they have not been used in SPARC-V7 and SPARC-V8,
..."
[19a]
Page 195, Appendix
A:
Order of instructions in Suggested Assembly Language Syntax
was rearranged to correspond to order of the instructions
in the Opcode/op3/Operation table above it.
"movre" and "movrz", as the assembly-language
mnemonic and its synonym, were exchanged to correspond
with the instruction name of MOVRZ.
"movrne" and "movrnz", as the assembly-language
mnemonic and its synonym, were exchanged to correspond
with the instruction name of MOVRNZ.
[19b]
Page 228, Appendix
A:
Order of instructions in Suggested
Assembly Language Syntax was rearranged to correspond
to order of the instructions in the Opcode/op3/Operation
table above it.
[20]
Page 241, Appendix
A:
Added footnote to Suggested Assembly Language Syntax table,
noting that the suggested syntax for WRASR with rd=16..31
may vary, citing reference to implementation dependency
#48.
(Suggested Assembly Language Syntax
is just that -- *suggested* -- so isn't part of the architecture
specification anyway, but this change makes it clearer
that if bits are interpreted differently in the instruction,
one should expect its assembly-language syntax to change,
as well)
[21]
Page 40, Table 7:
Changed leftmost column
text as follows:
"Single" to "Single f.p. or 32-bit integer"
"Double" to "Double f.p. or 64-bit integer"
"Quad" to "Quad f.p."
Corrections 22-57 were incorporated into R1.4.5, Dec 1999,
which was to be used for the 2nd printing of the book.
R1.4.5 (revision 1.4.5) can be identified by the text
"SAV09R1429309" inside the front cover of the book.
These corrections also appear in all subsequent revisions.
[22]
p.13, subsection 2.57, definition of "reserved":
Wording:
"...intended to run on future version of"
was corrected to read:
"...intended to run on future versions of".
The sentence beginning "Reserved register
fields" was amended to read: "Reserved register fields should
always be written by software with values of those fields
previously read from that register, or with zeroes; they
should read as zero in hardware."
[23]
p.21(r142), Editor's Notes:
Added Les Kohn's name to the Acknowledgements.
[24]
p.28(r142), Tables 3,4,5:
Made the use of hyphens and dashes consistent and easier
to read.
[25]
p.30(r142), paragraph just above subsection 5.1:
Changed end of sentence to read:
"...should be written with the values
of those bits previously read from that register, or with
zeroes."
[26]
p.40(r142), Table 7:
Added lines for 32-bit and 64-bit signed integers in f.p.
registers, for clarity.
[27]
p.51, Figure 17:
Added bits 11..10 to the figure, so it looks like:
| PID1 | PID0 | CLE | TLE | MM  | RED | PEF | AM | PRIV | IE | AG |
|  11  |  10  |  9  |  8  | 7 6 |  5  |  4  | 3  |  2   | 1  | 0  |
 \___________/  (changed here)
(see also Errata #28, #29, and #53)
[28]
p.52(r142), inserted new subsection 5.2.1.1 before old
one:
"IMPL. DEP. #127: The presence and semantics of PSTATE.PID1
and PSTATE.PID0 are implementation-dependent. Software
intended to run on multiple implementations should only
write these bits to values previously read from PSTATE,
or to zeroes. See also TSTATE bits 19..18."
(see also Errata #27, #29, and #53)
[29]
p.55(r142), Figure 22, (TSTATE register):
Extended the "saved PSTATE" field up through bit 19 of
TSTATE; changed the diagram to look like:
| ... | ASI from TL=x | --- | PSTATE from TL=x | ... |
(see also Errata #27, #28, and #53)
[30]
p.56(r142):
Added a new paragraph to the end of subsection 5.2.6:
"TSTATE bits 19 and 18 are implementation-dependent.
ImplDep#126: If PSTATE bit 11 (10) is implemented, TSTATE
bit 19 (18) shall be implemented and contain the state
of PSTATE bit 11 (10) from the previous trap level.
If PSTATE bit 11 (10) is not implemented, TSTATE bit
19 (18) shall read as zero. Software intended to run
on multiple implementations should only write these
bits to values previously read from PSTATE, or to zeroes."
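The relationship between the PSTATE and TSTATE bits described above can be sketched as follows. The shift of 8 is an assumption inferred from the stated pairing (PSTATE bit 11 with TSTATE bit 19, PSTATE bit 10 with TSTATE bit 18); the function and flag names are hypothetical.

```python
PSTATE_TO_TSTATE_SHIFT = 8  # assumption: inferred from bit 11 -> 19 and 10 -> 18


def saved_pid_bits(pstate, pid1_implemented, pid0_implemented):
    """Return the contribution to TSTATE bits 19..18 when a trap is taken.

    An implemented PSTATE bit is copied to its TSTATE position; an
    unimplemented one leaves the TSTATE bit reading as zero.
    """
    tstate_bits = 0
    if pid1_implemented:
        tstate_bits |= (pstate & (1 << 11)) << PSTATE_TO_TSTATE_SHIFT
    if pid0_implemented:
        tstate_bits |= (pstate & (1 << 10)) << PSTATE_TO_TSTATE_SHIFT
    return tstate_bits


# PSTATE with both PID bits set, on an implementation that only has PID1.
print(hex(saved_pid_bits(0xC00, True, False)))
```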
[31]
p.57(r142), subsection 5.2.10 (Register-Window State Registers):
Added implementation dependency #126:
IMPL. DEP. #126: Privileged registers
CWP, CANSAVE, CANRESTORE, OTHERWIN, and CLEANWIN contain
values in the range 0..NWINDOWS-1. The effect of writing
a value greater than NWINDOWS-1 to any of these registers
is undefined. Although the width of each of these five
registers is nominally 5 bits, the width is implementation-dependent
and shall be between ceil(log2(NWINDOWS)) and 5 bits,
inclusive. If fewer than 5 bits are implemented, the unimplemented
upper bits shall read as 0, and writes to them shall have
no effect. All five registers should have the same width.
(see also Errata #54)
[32]
pp.58-9(r142), subsection 5.2.10 (Register-Window State
Registers):
Added note to descriptions of CWP, CANSAVE, CANRESTORE,
OTHERWIN, and CLEANWIN registers that the effect of writing
a value to them greater than NWINDOWS-1 is undefined.
[33]
p.76, Section 6:
Last sentence in 6.3.4.1, "Conditional Branches" changed
to:
Note that the annul behavior of a taken
conditional branch is different from that of an unconditional
branch.
And the last sentence in 6.3.4.2, "Unconditional
Branches" changed to:
Note that the annul behavior of an unconditional
branch is different from that of a taken conditional
branch.
[34]
p.80(r142), 6.3.6.4(r142), RESTORED:
(duplicate of Erratum #3)
[35]
p.81(r141/r142):
In section 6.3.9, "FMOVc" was corrected to read "FMOVr".
[36]
p.81(r141/r142):
In section 6.3.9, a sentence was added stating the clearing
of FSR.cexc and FSR.ftt during condition moves FMOVcc
and FMOVr:
FMOVcc and FMOVr instructions clear
these FSR fields regardless of the value of the conditional
predicate.
[37]
p.121(r141/r142):
An index entry for "non-faulting loads" was fixed in section
8.3.
[38]
p.151(r142), A.9(r142), Compare and Swap page:
Added mention of CASL and CASXL to the Programming Note:
Compare and Swap Little (CASL) and
Compare and Swap Extended Little (CASXL) synthetic instructions
are available for "little endian" memory accesses.
[39]
p.171, Appendix
A, "Load Floating-Point":
Sentence added:
The upper 32 bits of FSR are unaffected by LDFSR.
[40]
p.181(r141/r142):
Section number "A.31" was fixed so it now increments to
A.32. All following section numbers and odd page headers
in Appendix A have changed.
[41]
p.191(r141/r142):
Misspelling corrected in page heading: "Condition" -->
"Condition"
[42]
p.195(r141/r142), "Move Integer Register on Register Condition
(MOVR)":
Order of instructions in Suggested Assembly Language Syntax
was rearranged to correspond to order of the instructions
in the Opcode/op3/Operation table above it.
"movre" and "movrz", as the assembly-language
mnemonic and its synonym, were exchanged to correspond
with the instruction name of MOVRZ.
"movrne" and "movrnz", as the assembly-language
mnemonic and its synonym, were exchanged to correspond
with the instruction name of MOVRNZ.
[43]
p.212(r14[123]) A.43(r14[12])/A.44(r144):
(duplicate of Erratum #1)
[44]
p.216(r142), A.46(r142), RESTORED page: (duplicate
of Erratum #4)
[45]
(duplicate of Erratum #14)
[46]
p.228(r141/r142):
Order of instructions in Suggested Assembly Language Syntax
was rearranged to correspond to order of the instructions
in the Opcode/op3/Operation table above it.
[47]
(duplicate of erratum #15)
[48]
p.231(r142)/233(r144), AppendixA:
Corrected SWAP deprecation note to recommend use of "CASA"
or "CASXA" (not "CASX") in place of SWAP.
[49]
p.234, A.58(r14[12])/A.59(r144), Tagged Add:
op3 opcodes are wrong. Both should have "0" for low-order
bit (as is correctly specified in Appendix E).
[50]
p.241(r142), A.62(r142), Write State Register page:
(duplicate of Erratum #20)
[51]
p.242(r142), A.62(r142), Write State Register page:
(duplicate of Erratum #7)
[52]
p.253(r142), Appendix
C:
Fixed 6 incorrect index entries.
[53]
p.253(r142), Appendix
C:
Added a new Implementation Dependency:
| #   | Cat | Def/Ref | Description |
| 127 | f   | 52, 56  | The presence and semantics of PSTATE.PID1 and PSTATE.PID0 are implementation-dependent. The presence of TSTATE bits 19 and 18 is implementation-dependent. If PSTATE bit 11 (10) is implemented, TSTATE bit 19 (18) shall be implemented and contain the state of PSTATE bit 11 (10) from the previous trap level. If PSTATE bit 11 (10) is not implemented, TSTATE bit 19 (18) shall read as zero. Software intended to run on multiple implementations should only write these bits to values previously read from PSTATE, or to zeroes. |
(see also Errata #27, #28, and #29)
[54]
p.255(r142), Appendix
C:
Added implementation dependency #126.
(see correction #31 above for the text of implementation
dependency #126)
[55]
p.258(r142), D.3.3., rule (1):
(duplicate of Erratum #17)
[56]
p.268(r142), Table 32:
(duplicate of Erratum #9)
[57]
p.290(r142), Section G, Table 43:
(duplicate of Erratum #6)
[58]
In Figure 3 in Chapter 6 (p.62), the 4th format
description from the bottom of the page (op,rd,op3,rs1,i=0,--,rs2)
contains an error; "i=0" should read "i=1".
[59]
In section 6.3.1, "Memory Access Instructions",
on p.67,
"and CAS accesses words or doublewords. " should be amended
to read: "CASA accesses words, and CASXA accesses doublewords."
[60]
In section 7.7, p. 111, the async_data_error exception
description should be updated to read as follows:
async_data_error [tt = 0x040] (Precise,
Deferred, or Disrupting) -- An implementation-dependent
exception (impl. dep. #31) that indicates that one or
more unrecoverable or uncorrectable but recoverable errors
have been detected in the processor. This may include
errors detected in the architectural registers (general-purpose
registers, floating-point registers, ASRs, or ASI registers)
and other core processor hardware. A single async_data_error
exception may indicate multiple errors and may occur asynchronously
to instruction execution. An async_data_error exception
may cause a precise, deferred, or disrupting trap. When
async_data_error causes a disrupting trap, the TPC and
TNPC stacked by the trap do not necessarily indicate the
instruction or data access that caused the error.
[61]
The following text should be added to the second paragraph
of section A.27 (p.176), to clarify the behavior of a
little-endian doubleword load (LDD):
With respect to little endian memory,
an LDD instruction behaves as if it is composed of two
32-bit loads, each of which is byte swapped independently
before being written into each destination register.
(see also Errata #62, #71, and #72)
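The clarified LDD behavior can be illustrated with a small sketch that contrasts "two independently byte-swapped 32-bit loads" with a single byte-swapped 64-bit load. It assumes that the word at the lower address goes to the even-numbered destination register; the function names are hypothetical.

```python
import struct


def ldd_little_endian(mem8):
    """Model a little-endian LDD: two 32-bit loads, each byte-swapped
    on its own. Returns (even_register, odd_register)."""
    even_reg, odd_reg = struct.unpack("<II", mem8)
    return even_reg, odd_reg


def single_64bit_swap(mem8):
    """For contrast only: byte-swap all 8 bytes as one doubleword, then
    split into high and low words. This is NOT what the erratum describes."""
    (value,) = struct.unpack("<Q", mem8)
    return value >> 32, value & 0xFFFFFFFF


memory = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08])
print([hex(r) for r in ldd_little_endian(memory)])
print([hex(r) for r in single_64bit_swap(memory)])
```

The two models return the same words in opposite registers, which is the distinction the added paragraph makes explicit.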
[62]
The following text should be added to the second paragraph
of section A.28 (p.178), to clarify the behavior of a
little-endian doubleword load from alternate space (LDDA):
With respect to little endian memory,
an LDDA instruction behaves as if it is composed of two
32-bit loads, each of which is byte swapped independently
before being written into each destination register.
(see also Errata #61, #71, and #72)
[63]
In the Index, p.354, the "signal monitor instruction"
index entry should instead read "software initiated reset
(SIR) instruction".
[64]
There is an error in the definition of CLEANWIN (p.59)
and the SAVE instruction that allows the locals of the
"invalid" window to in some cases not be cleaned (zeroed)
when it is allocated by a SAVE instruction.
A software workaround (used in the Solaris
operating system and perhaps others), to keep user registers
clean of kernel data, involves the use of an extra %wstate
value. When the kernel returns to user code, it sets %wstate
to the new value. The new trap table entry for spills
with that %wstate value spills the window as usual but
also backs up a window and performs the missing "clean"
operation. The spill handler then sets %wstate back to
the default value for a user process.
[65]
In Chapter 7, "Traps", it is implied (but not
explicitly stated) that the value PSTATE.TLE is preserved
during traps that cause entry into RED_state and during
XIR, WDR, and SIR resets. However, PSTATE.TLE may be left
in an undefined state by one of those events. The correction,
which applies to sections 7.6.2.1 (p.106), 7.6.2.3 (p.108),
7.6.2.4 (p.109), and 7.6.2.5 (p.110) is to change the
little-endian mode settings from:
PSTATE.CLE <-- PSTATE.TLE (set endian mode for traps)
to:
PSTATE.CLE <-- PSTATE.TLE (set endian mode for traps)
PSTATE.TLE <-- undefined
[66]
In Chapter 5, section 5.1.7.9 (p.48), the last
sentence of the third paragraph is inaccurate. The entire
third paragraph should be replaced with:
Floating-point operations which cause
an overflow or underflow condition may also cause an "inexact"
condition. For overflow and underflow conditions, FSR.cexc
bits are set and trapping occurs as follows:
o If an IEEE 754 overflow condition
occurs:
-- if TEM.OFM=0 and TEM.NXM=0, the
cexc.ofc and cexc.nxc bits are both set to 1, the other
three bits of cexc are set to 0, and
an IEEE_754_exception trap does *not* occur.
-- if TEM.OFM=0 and TEM.NXM=1, the cexc.nxc bit is set
to 1, the other four bits of cexc are set to 0, and
an IEEE_754_exception trap *does* occur.
-- if TEM.OFM=1, the cexc.ofc bit is set to 1, the other
four bits of cexc are set to 0, and an IEEE_754_exception
trap *does* occur.
o If an IEEE 754 underflow condition
occurs:
-- if TEM.UFM=0 and TEM.NXM=0,
the cexc.ufc and cexc.nxc bits are both set to 1, the
other three bits of cexc are set to 0, and an IEEE_754_exception
trap does *not* occur.
-- if TEM.UFM=0 and TEM.NXM=1, the cexc.nxc bit is set
to 1, the other four bits of cexc are set to 0, and an
IEEE_754_exception trap *does* occur.
-- if TEM.UFM=1, the cexc.ufc bit is set to 1, the other
four bits of cexc are set to 0, and an IEEE_754_exception
trap *does* occur.
The above behavior is summarized in
the following table
(x = don't-care):
  Conditions                          Results
  --------------------------------    ---------------------------
  Exception(s)      Trap Enable       fp_exception_   Current
  Detected in       Mask Bits         ieee_754        Exception Bits
  f.p. operation    (in FSR.TEM)      Trap Occurs?    (in FSR.cexc)
  --------------    --------------    ------------    ---------------
  of   uf   nx      OFM  UFM  NXM                     ofc  ufc  nxc     Notes
  ---  ---  ---     ---  ---  ---     -------         ---  ---  ---     -----
  -    -    -       x    x    x       no              0    0    0
  -    -    *       x    x    0       no              0    0    1
  -    *    *       x    0    0       no              0    1    1       (1)
  *    -    *       0    x    0       no              1    0    1       (2)

  -    -    *       x    x    1       yes             0    0    1
  -    *    *       x    0    1       yes             0    0    1
  -    *    -       x    1    x       yes             0    1    0
  -    *    *       x    1    x       yes             0    1    0
  *    -    *       1    x    x       yes             1    0    0       (2)
  *    -    *       0    x    1       yes             0    0    1       (2)
(1) When the underflow trap is disabled
(UFM=0), underflow is always accompanied by inexact.
(2) Overflow is always accompanied
by inexact.
(see also Errata #67, #68, and #69)
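The rules and summary table in erratum [66] can be expressed as a short decision function. This is only a sketch: the function name, argument order, and tuple layout are hypothetical, and it covers only the input combinations that appear in the table (for example, it relies on notes (1) and (2), so an untrapped underflow is assumed to be accompanied by inexact).

```python
def cexc_and_trap(of, uf, nx, ofm, ufm, nxm):
    """Return (trap_occurs, (ofc, ufc, nxc)) for one f.p. operation.

    of, uf, nx     : exceptions detected in the operation
    ofm, ufm, nxm  : trap enable mask bits from FSR.TEM
    """
    if of:
        # Note (2): overflow is always accompanied by inexact.
        if ofm:
            return True, (1, 0, 0)
        if nxm:
            return True, (0, 0, 1)
        return False, (1, 0, 1)
    if uf:
        if ufm:
            return True, (0, 1, 0)
        # Note (1): with UFM=0, underflow is always accompanied by inexact.
        if nxm:
            return True, (0, 0, 1)
        return False, (0, 1, 1)
    if nx:
        return bool(nxm), (0, 0, 1)
    return False, (0, 0, 0)


# Overflow with OFM=0 and NXM=1: traps, and only cexc.nxc is set.
print(cexc_and_trap(True, False, True, 0, 0, 1))
```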
[67]
In Appendix B, section B.3 (p.245), the first
paragraph:
"Underflow occurs if the
exact unrounded result has magnitude
between zero and the smallest normalized number in the
destination format."
should be replaced by the following
two paragraphs:
"On an implementation
that detects tininess before rounding, trapped underflow
occurs when the exact unrounded result has magnitude
between zero and the smallest normalized number in the
destination format.
On an implementation that detects
tininess after rounding, trapped underflow occurs when
the result, if it was rounded to a hypothetical format
having the same precision as the destination but of
unbounded range, would have magnitude between zero and
the smallest normalized number in the actual destination
format."
(see also Errata #66, #68, and #69)
[68]
In Appendix B, section B.4 (p.245), the first
two paragraphs:
The first paragraph:
"Underflow occurs
if the exact unrounded result has magnitude between zero
and the smallest normalized number in the destination format,
*and* the correctly rounded result in the destination
format is inexact."
should be replaced by the following
paragraph:
"On an implementation that
detects tininess before rounding, untrapped underflow
occurs when the exact unrounded result has magnitude between
zero and the smallest normalized number in the destination
format, *and* the correctly-rounded result in the destination
format is inexact."
And the beginning of the second paragraph:
"Table 28 summarizes
what happens when an exact ..."
should be modified to read:
"Table 28 summarizes what happens on an implementation
that detects tininess before rounding, when an exact ..."
(see also Errata #66, #67, and #69)
[69]
In Appendix B, Table 28, "Untrapped Floating-Point
Underflow" (p.245): Table 28 (and its
footnote) should be replaced by the following revised
table and text:
Table 28: Untrapped Floating-Point
Underflow (Tininess Detected Before Rounding)
|         | Underflow trap mask: | UFM=1 | UFM=0 | UFM=0 |
|         | Inexact trap mask:   | NXM=x | NXM=1 | NXM=0 |
| u = r   | r is minimum normal  | none  | none  | none  |
|         | r is subnormal       | UF    | none  | none  |
|         | r is zero            | none  | none  | none  |
| u != r  | r is minimum normal  | UF    | NX    | uf nx |
|         | r is subnormal       | UF    | NX    | uf nx |
|         | r is zero            | UF    | NX    | uf nx |
UF = IEEE_754_exception trap with cexc.ufc=1
NX = IEEE_754_exception trap with cexc.nxc=1
uf = cexc.ufc=1, aexc.ufa=1, no IEEE_754_exception trap
nx = cexc.nxc=1, aexc.nxa=1, no IEEE_754_exception trap
In an implementation that detects tininess
after rounding, Table 28 applies to a narrower range of
values of the exact unrounded result u. The precise bounds
depend on the rounding direction specified in FSR.RD,
as follows:
o Let m denote the smallest normalized
number and e the absolute difference between 1 and the
next larger representable number in the destination
format. Then the bounds on u for which Table 28 applies
are:
  FSR.RD   Rounding Toward   Range of Values of u
  ------   ---------------   ---------------------
  0        nearest           |u| < m(1 - e/4)
  1        0                 |u| < m
  2        +infinity         -m < u <= m(1 - e/2)
  3        -infinity         -m(1 - e/2) <= u < m
o When u lies outside these ranges,
underflow does not occur,
although an inexact exception still occurs when u != r,
the rounded value.
(see also Errata #66, #67, and #68)
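The range test for implementations that detect tininess after rounding can be sketched as below. The function name is hypothetical, exact rational arithmetic is used to avoid rounding in the check itself, and the upper bound for rounding toward +infinity is taken to be m(1 - e/2), by symmetry with the -infinity row.

```python
from fractions import Fraction


def table28_applies(u, rd, m, e):
    """Return True if exact result u falls in the range where Table 28
    applies, for rounding direction rd (the FSR.RD encoding 0..3)."""
    u, m, e = Fraction(u), Fraction(m), Fraction(e)
    if rd == 0:  # round to nearest
        return abs(u) < m * (1 - e / 4)
    if rd == 1:  # round toward 0
        return abs(u) < m
    if rd == 2:  # round toward +infinity
        return -m < u <= m * (1 - e / 2)
    if rd == 3:  # round toward -infinity
        return -m * (1 - e / 2) <= u < m
    raise ValueError("FSR.RD must be 0..3")


# Single-precision values used purely as an example.
m = Fraction(1, 2**126)   # smallest normalized number
e = Fraction(1, 2**23)    # gap between 1 and the next larger number
u = m * (1 - e / 8)       # just below m
print([table28_applies(u, rd, m, e) for rd in range(4)])
```

A positive u this close to m rounds up to m under "nearest" and "+infinity", so it counts as tiny only for the other two directions.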
[70]
In Appendix A, section A.40, "No Operation"
(p.204):
For clarity, in the instruction
format diagram the term "op" should be replaced
by five zeroes.
[71]
In Appendix A, section A.53, "Store Integer"
(p.227):
The following paragraph should be added near
the end of the Description subsection, prior to the Programming
Note, to clarify the behavior of a little-endian doubleword
store (STD):
"With respect to little-endian
memory, a STD instruction behaves as if it is composed
of two 32-bit stores, each of which is byte-swapped
independently before being written into its respective
destination memory word."
(see also Errata #61, #62,
and #72)
[72]
In Appendix A, section A.54, "Store Integer Into
Alternate Space" (p.229):
The following paragraph should be added near
the end of the Description subsection, prior to the Programming
Note, to clarify the behavior of a little-endian doubleword
store to alternate space (STDA):
"With respect to little-endian
memory, a STDA instruction behaves as if it is composed
of two 32-bit stores, each of which is byte-swapped
independently before being written into its respective
destination memory word."
(see also Errata #61, #62, and #71)
[73]
In Chapter 7, pp.101-102: reference is made in two places
to a range of trap priorities, with 0 as the highest priority
and 31 as the lowest.
Architecturally, there are no absolute trap priorities
(only relative trap priorities) and there is no specific
limit to trap priority numbers. Trap priorities are only
used by a processor to choose which exception will cause
a trap at any given time; a trap priority is an ordinal
number which need not be stored anywhere. Therefore, the
following changes should be noted:
Caption above Table 15, p.101:
Change:
"0 = Highest; 31 = Lowest"
to:
"0 = Highest"
Text of first paragraph of section 7.5.3 on p.102:
Change:
"Priority 0 is highest, priority 31 is lowest; that
is, if......."
to:
"A trap priority is an ordinal number, with 0 indicating
the highest priority and greater priority numbers
indicating decreasing priority; that is, if......"
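Because trap priorities are purely ordinal, selecting the exception that causes a trap amounts to taking the minimum priority number among those pending. The sketch below uses made-up exception names and priority numbers; none of them are taken from Table 15.

```python
def select_trap(pending):
    """Pick the pending exception with the smallest priority number.

    pending: dict mapping exception name -> priority (0 is highest;
    larger numbers mean lower priority, with no fixed upper limit).
    """
    return min(pending, key=pending.get)


# Hypothetical names and priorities, for illustration only.
pending = {"exception_a": 5, "exception_b": 2, "exception_c": 40}
print(select_trap(pending))
```

A priority such as 40 is handled the same way as any other, which reflects the removal of "31 = Lowest" from the text.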
[74]
In Chapter 7, page 88, Figure 37 "Processor State
Diagram", the following corrections should be made
in the figure:
-- all references to "Trap" should be changed to "nrt"
-- add to the caption the words:
     ("nrt" = "non-reset trap")
-- "or SIR" should be added to the label on the center topmost arc
   in the diagram, so that it reads "nrt or SIR @ TL = MAXTL"
-- The references to "RED = 1" and "RED = 0" should be changed to
   "PSTATE.red <-- 1" and "PSTATE.red <-- 0", respectively, for clarity.
-- Under the arc from "execute_state" to "RED_state", the label
   currently reads:
     "Trap or SIR @ TL < MAXTL, RED=1".
   The words "Trap or" should be removed, so that it reads:
     "SIR @ TL < MAXTL, RED <-- 1"
A related change should be made on the first page of Chapter 7 (p.87)
to the definition of "trap" in the paragraph beginning
"Thus, an exception is...". The words:
  "...in response to the presence of an exception, interrupt, reset,
  or Tcc instruction"
should be changed to:
  "...in response to the presence of an exception, interrupt, reset,
  or Tcc instruction"
The same change should be made to the definition of "trap" in
section 2.66 on p.13.
[75]
In Chapter 6, p. 76, Table 13, the four rows with "B"
in the leftmost cell should be more clearly labelled with
Branch Always (BA) and Branch Never (BN) abbreviations,
as follows:
| BA |
| BN |
| BA |
| BN |
Correspondingly, at the top of p.75:
  always or never taken, represented in the table by "B"
should be replaced by:
  always or never taken, represented in the table by "BA" and "BN",
  respectively
[76]
In Chapter 6, p.63, Figure 34, the Format(4) diagram for
Tcc (the one including "sw_trap_#", the third one from the bottom)
is incorrect. That diagram should be deleted and replaced
by a copy of the two Format-4 diagrams from the Tcc
instruction page (p. 237).
[77]
The contents of Chapter 6, section 6.3.11, p.82, should
be replaced by the following:
If a conforming SPARC V9 implementation
attempts to execute an instruction that is not specifically defined
in this specification, it behaves as follows:
o If the instruction encodes an implementation-specific extension
  to the instruction set, that extension is executed.
o If the instruction does not encode an extension to the instruction
  set, but would decode as a valid instruction if nonzero bits in
  reserved instruction field(s) were ignored (read as 0):
  -- the recommended behavior is to generate an illegal_instruction
     exception (or, in the FPop opcode space, an fp_exception_other
     exception with FSR.ftt = 3 (unimplemented_FPop))
  -- alternatively, the implementation can ignore the nonzero
     reserved field bits and execute the instruction as
     if those bits had been zero.
o If the instruction does not encode an extension to the instruction
  set and would still not decode as a valid instruction if nonzero
  bits in reserved instruction field(s) were ignored, then the
  instruction is invalid and causes an exception. Specifically,
  attempting to execute an invalid instruction in the FPop opcode
  space causes an fp_exception_other trap (with FSR.ftt =
  unimplemented_FPop); attempting to execute any other invalid
  instruction causes an illegal_instruction trap.
See Appendix E, "Opcode Maps", for an enumeration of
reserved opcodes.
Implementation Note:
  As described above, implementations are strongly encouraged, but
  not strictly required, to trap on nonzero values in reserved
  instruction fields.
Programming Note:
  For software portability, software (such as assemblers, static
  compilers, and dynamic compilers) that generates SPARC instructions
  must always generate zeroes in instruction fields
  marked "reserved" ("--").
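The three cases in the replacement text for section 6.3.11 can be summarized as a decision procedure. The sketch below is only a model: the function name, the boolean inputs, and the returned strings are hypothetical, and the `traps_on_reserved_bits` flag stands for the implementation's choice between the recommended and the alternative behavior.

```python
def undefined_instruction_outcome(is_extension,
                                  valid_if_reserved_ignored,
                                  in_fpop_space,
                                  traps_on_reserved_bits=True):
    """Classify what happens for an instruction not defined in the spec."""
    if is_extension:
        return "execute implementation-specific extension"
    # Trap used both for the recommended behavior and for invalid encodings.
    if in_fpop_space:
        trap = "fp_exception_other (FSR.ftt = unimplemented_FPop)"
    else:
        trap = "illegal_instruction"
    if valid_if_reserved_ignored and not traps_on_reserved_bits:
        return "execute as if reserved bits were zero"
    return trap


# Recommended behavior for a non-FPop instruction with nonzero reserved bits.
print(undefined_instruction_outcome(False, True, False))
```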
[78]
In Appendix A, p.131, numbered bullet point (2), third
sentence:
Currently reads:
  If a conforming SPARC-V9 implementation encounters nonzero values
  in these fields, its behavior is undefined.
Should be corrected to read:
  If a conforming SPARC V9 implementation encounters nonzero values
  in these fields, its behavior is as defined in section 6.3.11
  on page 82.
[79]
In Appendix E, p.267:
The second paragraph currently reads:
  ...an attempt to execute a reserved opcode shall cause a trap,
  unless it is an implementation-specific extension to the
  instruction set.
Should be corrected to read:
  ...an attempt to execute a reserved opcode behaves as defined in
  section 6.3.11 on page 82.
[80]
In Chapter 2, section 2.57 (definition of "reserved"),
three corrections:
a) Where it currently reads:
     ...reserved instruction fields is undefined.
   Should be corrected to read:
     ...reserved instruction fields is as defined in section 6.3.11
     on page 82.
b) Where it currently reads:
     Reserved register fields should ...
   Should be corrected to read:
     ...assume that these fields will read...
c) Where it currently reads:
     ...assume that these field will read...
   Should be corrected to read:
     ...assume that these fields will read...
[81]
In Appendix E, pp.267-8, two corrections:
a) p.267, Table 31, "BPr" column, currently reads:
     | BPr          |
     | See Table 37 |
   To reinforce that bit 28=0 for BPr, this should be corrected to read:
     | BPr (bit 28=0) |
     | See Table 37   |
     | -- (bit 28=1)  |
   Plus, a footnote must be added:
     Although SPARC V9 implementations should cause an
     illegal_instruction exception when bit 28=1, many early
     implementations ignored the value of this bit and executed the
     opcode as a BPr instruction even if bit 28=1.
   This footnote should be referenced in both Appendix E (p.268) and
   on the BPr instruction page (p.136).
b) p.268, Table 32, table cell for Tcc (op3=0x3A), currently reads:
     | Tcc (bit 29=0) |
     | See Table 36   |
     | -- (bit 29=1)  |
[82]
The behavior of an attempt to reference a restricted ASI by a
PREFETCHA while in nonprivileged mode is not clear; the second
paragraph of 6.3.1.3 (p.71) suggests that a privileged_action
exception should occur and the first sentence in A.41 (p.2.3)
suggests that an implementation should treat it as a NOP. Although
such a reference is clearly inappropriate in nonprivileged software,
a case can be made for either response by an implementation and
both may have been implemented.
Therefore, it is implementation-dependent whether this condition
causes a privileged_action exception or executes as a NOP.
In section A.41, p.204, new implementation dependency #103(6)
should be added:
  IMPL.DEP. #103(6): Whether an attempt to reference a restricted
  ASI (< 0x80) by a PREFETCHA instruction while in nonprivileged
  mode (PSTATE.PRIV=0) causes a privileged_action exception
  or executes as a NOP is implementation-dependent.
In 6.3.1.3, second paragraph, append to the end of the second
sentence:
  "(see impl.dep.#103(6))"
At the end of section A.41, page 207:
The following entry should be added to the end of the "Exceptions"
list:
  privileged_action (PREFETCHA with PSTATE.PRIV=0 and
  ASI<0x80 (impl.dep.#103(6)))
In Appendix C, p.252:
A reference to new implementation dependency #103(6) should be added
to the entry for implementation dependency #103.
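The condition covered by implementation dependency #103(6) can be written down in a few lines. This is a sketch only: the function name, the `impl_traps` flag (standing for the implementation's choice), and the returned strings are hypothetical.

```python
def prefetcha_outcome(asi, pstate_priv, impl_traps):
    """Model the outcome of a PREFETCHA for a given ASI and privilege level.

    A restricted ASI is one below 0x80. In nonprivileged mode
    (PSTATE.PRIV=0) the result is implementation-dependent: either a
    privileged_action exception or execution as a NOP.
    """
    if asi < 0x80 and pstate_priv == 0:
        return "privileged_action" if impl_traps else "nop"
    return "prefetch proceeds"


print(prefetcha_outcome(0x04, 0, impl_traps=True))
print(prefetcha_outcome(0x04, 0, impl_traps=False))
```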
[83]
Chapter 5, section 5.2.1.1, "PSTATE_current_little_endian
(CLE)", on p.52 reads:
When PSTATE.CLE = 1, all data
reads and writes using an implicit ASI are performed in
little-endian byte order with an ASI of ASI_PRIMARY_LITTLE.
When PSTATE.CLE = 0, all data reads and writes using an
implicit ASI are performed in big-endian byte order with
an ASI of ASI_PRIMARY. Instruction accesses are always
big-endian.
This description assumes the
processor is executing with TL = 0; to make it accurate
for all conditions, it should be modified to read:
When PSTATE.CLE = 1, all data
accesses using an implicit ASI are performed in little-endian
byte order. When PSTATE.CLE = 0, all data accesses using
an implicit ASI are performed in big-endian byte order.
Instruction accesses are always performed using big-endian
byte order. Specific ASIs used are shown in Table __ on
page 71.
[84]
The first paragraph of Chapter 6, section 6.3.1.3, "Address
Space Identifiers (ASIs)", p.71, should be replaced by:
Alternate-space load, store, and load-store
instructions specify an explicit ASI to use for their
data access; when i = 0, the explicit ASI is provided
in the instruction's imm_asi field and when i = 1, it
is provided in the ASI register. Non-alternate-space
load, store, and load-store instructions use an implicit
ASI value which depends on the current trap level (TL)
and the value of PSTATE.CLE. Instruction fetches use
an implicit ASI which depends only on the current trap
level. The cases are enumerated in Table __.
Table __: ASIs used for Data Access and Instruction Fetches
| Access Type | TL | PSTATE.CLE | ASI Used |
| ----------- | --- | ---------- | -------- |
| Instruction Fetch | = 0 | any | ASI_PRIMARY |
| | > 0 | any | ASI_NUCLEUS* |
| Non-alternate-space Load, Store, or Load-Store | = 0 | 0 | ASI_PRIMARY |
| | = 0 | 1 | ASI_PRIMARY_LITTLE |
| | > 0 | 0 | ASI_NUCLEUS* |
| | > 0 | 1 | ASI_NUCLEUS_LITTLE** |
| Alternate-space Load, Store, or Load-Store | any | any | ASI explicitly specified in the instruction (subject to privilege level restrictions) |
*  On some early SPARC V9 implementations, ASI_PRIMARY
may have been used for this case.
** On some early SPARC V9 implementations, ASI_PRIMARY_LITTLE
may have been used for this case.
Also see section 8.3, Addressing and
Alternate Address Spaces, on page 119.
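
The implicit-ASI cases in the table can be restated as a lookup. This is a minimal sketch: the function name and the string access types `"fetch"` and `"data"` are hypothetical, and it does not model the early-implementation exceptions described in the table footnotes.

```python
def implicit_asi(access, tl, cle):
    """Return the name of the implicit ASI for an access.

    access -- "fetch" for an instruction fetch, "data" for a
              non-alternate-space load, store, or load-store
    tl     -- current trap level (TL)
    cle    -- value of PSTATE.CLE (0 or 1); ignored for fetches
    """
    if access == "fetch":
        # Instruction fetches depend only on the trap level.
        return "ASI_PRIMARY" if tl == 0 else "ASI_NUCLEUS"
    if access == "data":
        if tl == 0:
            return "ASI_PRIMARY_LITTLE" if cle else "ASI_PRIMARY"
        return "ASI_NUCLEUS_LITTLE" if cle else "ASI_NUCLEUS"
    raise ValueError("access must be 'fetch' or 'data'")
```

Alternate-space instructions are deliberately left out, because their ASI is explicit rather than implicit.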
[85]
In the assembly-language sample at end of section 7.2.1.3,
p.91, an instruction is missing that would shift the "TT"
value left by 5 bits to line up with the correct field
in TBA.
| |
Specifically, the current
text: |
| |
|
rdpr |
%tt, %g1 |
| |
|
rdpr |
%tba, %g2 |
| |
|
add |
%g1, %g2, %g2 |
| |
|
|
|
| |
Should be replaced
by: |
| |
|
rdpr |
%tt, %g1 |
| |
|
rdpr |
%tba, %g2 |
| |
|
sllx |
%g1, 5, %g1 |
| |
|
add |
%g1, %g2, %g2 |
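
The arithmetic of the corrected sequence can be checked with a short model. This sketch only mirrors the three-instruction computation (shift TT left by 5, then add the value read from TBA) in 64-bit arithmetic; the function name and the example values are hypothetical.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def tt_plus_tba(tba, tt):
    """Mirror 'sllx %g1, 5, %g1' followed by 'add %g1, %g2, %g2'."""
    shifted = (tt << 5) & MASK64     # sllx %g1, 5, %g1
    return (shifted + tba) & MASK64  # add  %g1, %g2, %g2


# Hypothetical values: a TBA reading of 0x10000 and a TT of 0x24.
example = tt_plus_tba(0x10000, 0x24)
```

Without the shift, the same inputs would give 0x10024, which places TT in the low-order bits instead of the field it is meant to line up with.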
[86]
On page 45, section 5.1.7.6, the first paragraph,
replace the end of the last sentence:
"... the ftt field
encodes the type of the floating-point exception until
an STFSR or an FPop executes."
with:
"... the
ftt field encodes the type of the floating-point
exception until an STFSR, STXFSR, or FPop executes."
[87]
On page 78, at the end of section 6.3.5.1, the line of example
code:
movg %xcc, %g0,1, %i3
should read:
movg %xcc, 1, %i3
[88]
On page 11, replace section 2.41 (a single paragraph) with this
revised definition:
| |
non-faulting
load: |
| |
A load
operation that behaves identically to a normal load
operation, except when supplied an invalid
effective address by software.
In that case, a regular load triggers an exception
while a non-faulting load appears (possibly
with the assistance of system software)
to ignore the exception and loads its
destination register with a value of zero. |
[89]
On page 40, section 5.1.4.2, the first Programming Note
contains an error of omission, since poorly-aligned double-
(or quad-) precision f.p. data _can_ be loaded directly
into the upper half of the f.p. register file using LDDF(A)/LDQF(A)
instructions.
The following should
replace the erroneous Programming Note:
| |
Programming Note: |
| |
|
The upper 16 double-precision
(upper 8 quad-precision) floating-point registers
cannot be directly loaded by 32-bit load instructions.
Therefore, double- or quad-precision data that is
only word-aligned in memory cannot be directly loaded
into the upper registers using LDF(A)
instructions. The following guidelines are recommended:
|
| |
|
|
|
| |
(1) |
Whenever possible,
align floating-point data in memory on proper address
boundaries. If access to a datum is required to be
atomic, the datum _must_ be properly aligned. |
| |
|
|
|
| |
(2) |
When a double- or quad-precision
datum is not properly aligned in memory, is still
aligned on a 4-byte boundary, and access to the datum
in memory is not required to be atomic, software should
attempt to allocate a register for it in the lower
half of the floating-point register file so that the
datum can be loaded using multiple LDF(A) instructions.
|
| |
|
|
|
| |
(3) |
If the only available
registers for such a datum are located in the upper
half of the floating-point register file and access
to the datum in memory is not required to be atomic,
the word-aligned datum can be loaded into them by
one of two methods: |
| |
|
(a) |
load the datum into
an upper register by using multiple LDF(A) instructions
to first load it into a double[quad]-precision register
in the lower half of the floating-point register file,
then copy that register to the desired destination
register in the upper half, or |
| |
|
(b) |
use a LDDF(A)[LDQF(A)]
instruction to perform the load directly into the
upper floating-point register, understanding that
use of these instructions on poorly-aligned data can
cause a trap (LDDF[LDQF]_mem_not_aligned) on some
implementations which may significantly slow down
program execution. |
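
The guidelines above can be summarized as a decision helper. This is a rough sketch under stated assumptions: the function name and the returned tags are hypothetical, "properly aligned" is taken to mean aligned to the datum's own size (8 bytes for double, 16 for quad), and the choice between methods (3a) and (3b) is left to the caller.

```python
def choose_fp_load(addr, size, needs_atomic, lower_half_free):
    """Pick a load strategy for a double- or quad-precision datum.

    addr            -- byte address of the datum
    size            -- 8 (double-precision) or 16 (quad-precision)
    needs_atomic    -- True if access to the datum must be atomic
    lower_half_free -- True if a lower-half f.p. register is available
    """
    if size not in (8, 16):
        raise ValueError("size must be 8 or 16")
    if addr % size == 0:
        return "direct"          # properly aligned: LDDF(A)/LDQF(A)
    if needs_atomic:
        return "realign"         # guideline (1): atomic data must be aligned
    if addr % 4 != 0:
        return "unsupported"     # not word-aligned: outside these guidelines
    if lower_half_free:
        return "ldf_lower"       # guideline (2): multiple LDF(A) instructions
    return "upper_3a_or_3b"      # guideline (3): copy from lower half, or
                                 # LDDF(A)/LDQF(A) with a possible trap
```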
[90]
On page 76, section 6.3.4.3, the second paragraph says:
"The JMPL instruction ... then causes a PC-relative delayed transfer of control..."
It should read:
"The JMPL instruction ... then causes a register-indirect delayed transfer of control..."
[91] On page 76, Table 13 at the top of the page:
The 8th row in the table, the one for Branch (B) that is Taken with Annul bit=1, has an erroneous entry in the "Delayed" column. A Taken unconditional branch with Annul bit=1 is non-delayed, so the entry in the Delayed column should be "No".
[92] On page 184, Table 26, last row (Lookaside barrier row):
The text under the "Description" column in the Lookaside row should be replaced by the following text:
(Deprecated) A store appearing prior to the MEMBAR must complete
before any load following the MEMBAR referencing the same address can
be initiated.
MEMBAR #Lookaside is deprecated and is supported only for legacy code;
it should not be used in new software. A slightly more restrictive
MEMBAR operation (such as MEMBAR #StoreLoad) should be used instead.
Implementation Note: Since #Lookaside is deprecated, implementations
are not expected to perform address matching. Instead, they should
provide #Lookaside functionality using a more restrictive MEMBAR
operation, such as #StoreLoad. (In fact, no SPARC V9 processor has
ever implemented address matching; all have implemented #Lookaside
using a more restrictive operation such as #StoreLoad or #Sync.)
[93] On page 285, last line on the page (#Lookaside line):
The following note should be added to #Lookaside:
Use of #Lookaside is deprecated and only supported for legacy
software. New software should use a slightly more restrictive MEMBAR
operation (such as #StoreLoad) instead.
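
The substitution recommended in [92] and [93] can be sketched as a rewrite of the MEMBAR mask. This is a sketch only: the function name is hypothetical, and the bit values below reflect my understanding of the V9 membar mask layout (mmask in bits 3:0, cmask in bits 6:4), which should be checked against the MEMBAR instruction page before use.

```python
# Assumed membar mask encodings (verify against the manual).
MEMBAR_LOADLOAD = 0x01
MEMBAR_STORELOAD = 0x02
MEMBAR_LOADSTORE = 0x04
MEMBAR_STORESTORE = 0x08
MEMBAR_LOOKASIDE = 0x10
MEMBAR_MEMISSUE = 0x20
MEMBAR_SYNC = 0x40


def replace_lookaside(mask):
    """Replace a deprecated #Lookaside request with #StoreLoad.

    All other bits in the mask are passed through unchanged.
    """
    if mask & MEMBAR_LOOKASIDE:
        mask = (mask & ~MEMBAR_LOOKASIDE) | MEMBAR_STORELOAD
    return mask
```

A tool that scans legacy code could use such a rewrite to flag or update uses of #Lookaside.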
|