Lesson 02: ARM Cortex-M Instruction Set

Instruction Set for Cortex-M4

You can download the following reference materials:

Assembly Instruction Set: Cortex-M3/M4F Instruction Set
Technical Reference Manual Cortex M4, Assembly Instruction Set: CortexM4_TRM_r0p1.pdf
ARM and Thumb-2 Instruction Set Quick Reference Card: QuickReferenceCard.pdf

Memory Access Instructions

General Data Processing Instructions

Multiply and Divide Instructions

Saturating Instructions

Packing and Unpacking Instructions

Bitfield Instructions

Branch and Control Instructions

Miscellaneous Instructions

Floating-point Instructions (Cortex-M4)

The Cortex-M4 processor can include an optional single-precision FPU; the variant with the FPU is commonly called the Cortex-M4F. The FPU provides floating-point computation compliant with ANSI/IEEE Std 754-2008, IEEE Standard for Binary Floating-Point Arithmetic, referred to as the IEEE 754 standard.

The FPU supports all single-precision data-processing instructions: add, subtract, multiply, divide, multiply and accumulate, and square root operations. It also provides conversions between fixed-point and floating-point data formats and floating-point constant instructions.
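The fixed-point conversions mentioned above can be pictured in plain C. The sketch below is only an illustration and assumes a signed Q16.16 format; the names `q16_to_float`, `float_to_q16`, and `Q16_FRACTION_BITS` are hypothetical and do not come from the reference manuals. On a Cortex-M4 with the FPU enabled, a compiler may or may not map such code to the VCVT fixed-point forms.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fractional bits in the assumed Q16.16 format (illustrative). */
#define Q16_FRACTION_BITS 16

/* Convert a signed Q16.16 fixed-point value to single precision. */
float q16_to_float(int32_t fixed)
{
    return (float)fixed / (float)(1 << Q16_FRACTION_BITS);
}

/* Convert single precision to signed Q16.16, truncating toward zero.
   The input must lie strictly inside the Q16.16 range; unlike the
   hardware conversion, this sketch does not saturate. */
int32_t float_to_q16(float value)
{
    assert(value > -32768.0f && value < 32768.0f);
    return (int32_t)(value * (float)(1 << Q16_FRACTION_BITS));
}
```

For example, `q16_to_float(0x00018000)` yields 1.5, and `float_to_q16(1.5f)` yields `0x00018000` again.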

FPU Register Bank

The FPU provides an extension register file containing 32 single-precision registers. These can be viewed as:

Sixteen 64-bit double-word registers, D0 ~ D15
Thirty-two 32-bit single-word registers, S0 ~ S31

The FPU register bank is shown in the following diagram:

The two views overlap: each double-word register Dn maps onto the pair S(2n) and S(2n+1). For example, you can access the least-significant half of the value in D6 through S12 and the most-significant half through S13.
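The overlap between the S and D views can be modeled in software. The following is a minimal sketch, not hardware access code: the register bank is an ordinary array, and the helper names (`d_low_half_index`, `d_high_half_index`, `read_d_view`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define FPU_NUM_S_REGS 32u
#define FPU_NUM_D_REGS 16u

/* Index of the S register that holds the least-significant half of Dn. */
unsigned d_low_half_index(unsigned d)
{
    assert(d < FPU_NUM_D_REGS);
    return 2u * d;
}

/* Index of the S register that holds the most-significant half of Dn. */
unsigned d_high_half_index(unsigned d)
{
    assert(d < FPU_NUM_D_REGS);
    return 2u * d + 1u;
}

/* Build the 64-bit view of Dn from a software model of the S bank. */
uint64_t read_d_view(const uint32_t s_bank[FPU_NUM_S_REGS], unsigned d)
{
    uint64_t low  = s_bank[d_low_half_index(d)];
    uint64_t high = s_bank[d_high_half_index(d)];
    return (high << 32) | low;
}
```

With `s_bank[12] = 0x89ABCDEF` and `s_bank[13] = 0x01234567`, `read_d_view(s_bank, 6)` returns `0x0123456789ABCDEF`, matching the D6 example in the text.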

Floating-Point Instructions

Enabling the FPU

The FPU is disabled from reset. You must enable it before you can use any floating-point instructions. The following code sequence enables the FPU in both privileged and user modes by granting full access to coprocessors CP10 and CP11 in the Coprocessor Access Control Register (CPACR). The processor must be in privileged mode to read from and write to the CPACR.

; CPACR is located at address 0xE000ED88
        LDR.W   R0, =0xE000ED88
; Read CPACR
        LDR     R1, [R0]
; Set bits 20-23 to enable CP10 and CP11 coprocessors
        ORR     R1, R1, #(0xF << 20)
; Write back the modified value to the CPACR
        STR     R1, [R0]
; Wait for store to complete
        DSB
; Reset pipeline now the FPU is enabled
        ISB
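The same read-modify-write can be sketched in C. This is an illustration under stated assumptions rather than a vendor-supplied routine: the function and macro names are hypothetical, the target-only part assumes a GCC- or Clang-style compiler for 32-bit ARM, and the bit manipulation is split into a pure helper so it can be checked on a host machine. Each coprocessor has a 2-bit access field at bits [2n+1:2n] of the CPACR, so CP10 and CP11 together cover bits 20-23.

```c
#include <assert.h>
#include <stdint.h>

#define CPACR_ADDRESS     0xE000ED88u  /* same address as the assembly above */
#define CPACR_ACCESS_FULL 0x3u         /* 0b11: full access */

/* Set the 2-bit access field of one coprocessor in a CPACR value. */
uint32_t cpacr_set_access(uint32_t cpacr, unsigned coprocessor, uint32_t access)
{
    assert(coprocessor == 10u || coprocessor == 11u);
    assert(access <= 0x3u);
    unsigned shift = 2u * coprocessor;
    return (cpacr & ~(0x3u << shift)) | (access << shift);
}

/* CPACR value with CP10 and CP11 (the FPU) set to full access. */
uint32_t cpacr_with_fpu_enabled(uint32_t cpacr)
{
    cpacr = cpacr_set_access(cpacr, 10u, CPACR_ACCESS_FULL);
    return cpacr_set_access(cpacr, 11u, CPACR_ACCESS_FULL);
}

#if defined(__arm__) && defined(__GNUC__)
/* Target-only: must run in privileged mode. */
void fpu_enable(void)
{
    volatile uint32_t *cpacr = (volatile uint32_t *)(uintptr_t)CPACR_ADDRESS;
    *cpacr = cpacr_with_fpu_enabled(*cpacr);
    __asm volatile ("dsb" ::: "memory");  /* wait for the store to complete */
    __asm volatile ("isb" ::: "memory");  /* flush the pipeline */
}
#endif
```

Starting from a CPACR value of 0, the helper returns `0x00F00000`, which is the same mask as `0xF << 20` in the assembly sequence.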

ARM Cortex-M Instruction Groups

Group	Instr bits	Instructions	Cortex M0	Cortex M0+	Cortex M1	Cortex M3	Cortex M4
Thumb-1	16	ADC, ADD, ADR, AND, ASR, B, BIC, BKPT, BLX, BX, CMN, CMP, CPS, EOR, LDM, LDR, LDRB, LDRH, LDRSB, LDRSH, LSL, LSR, MOV, MUL, MVN, NOP, ORR, POP, PUSH, REV, REV16, REVSH, ROR, RSB, SBC, SEV, STM, STMIA, STR, STRB, STRH, SUB, SVC, SXTB, SXTH, TST, UXTB, UXTH, WFE, WFI, YIELD	Yes	Yes	Yes	Yes	Yes
Thumb-1	16	CBNZ, CBZ	No	No	No	Yes	Yes
Thumb-1	16	IT	No	No	No	Yes	Yes
Thumb-2	32	BL, DMB, DSB, ISB, MRS, MSR	Yes	Yes	Yes	Yes	Yes
Thumb-2	32	ADC, ADD, ADR, AND, ASR, B, BFC, BFI, BIC, CDP, CLREX, CLZ, CMN, CMP, DBG, EOR, LDC, LDMIA, LDMDB, LDR, LDRB, LDRBT, LDRD, LDREX, LDREXB, LDREXH, LDRH, LDRHT, LDRSB, LDRSBT, LDRSHT, LDRSH, LDRT, MCR, LSL, LSR, MLS, MCRR, MLA, MOV, MOVT, MRC, MRRC, MUL, MVN, NOP, ORN, ORR, PLD, PLDW, PLI, POP, PUSH, RBIT, REV, REV16, REVSH, ROR, RRX, RSB, SBC, SBFX, SEV, SMLAL, SMULL, SSAT, STC, STMDB, STR, STRB, STRBT, STRD, STREX, STREXB, STREXH, STRH, STRHT, STRT, SUB, SXTB, SXTH, TBB, TBH, TEQ, TST, UBFX, UMLAL, UMULL, USAT, UXTB, UXTH, WFE, WFI, YIELD	No	No	No	Some	Yes
Thumb-2	32	SDIV, UDIV	No	No	No	Yes	Yes
DSP	32	PKH, QADD, QADD16, QADD8, QASX, QDADD, QDSUB, QSAX, QSUB, QSUB16, QSUB8, SADD16, SADD8, SASX, SEL, SHADD16, SHADD8, SHASX, SHSAX, SHSUB16, SHSUB8, SMLABB, SMLABT, SMLATB, SMLATT, SMLAD, SMLALBB, SMLALBT, SMLALTB, SMLALTT, SMLALD, SMLAWB, SMLAWT, SMLSD, SMLSLD, SMMLA, SMMLS, SMMUL, SMUAD, SMULBB, SMULBT, SMULTT, SMULTB, SMULWT, SMULWB, SMUSD, SSAT16, SSAX, SSUB16, SSUB8, SXTAB, SXTAB16, SXTAH, SXTB16, UADD16, UADD8, UASX, UHADD16, UHADD8, UHASX, UHSAX, UHSUB16, UHSUB8, UMAAL, UQADD16, UQADD8, UQASX, UQSAX, UQSUB16, UQSUB8, USAD8, USADA8, USAT16, USAX, USUB16, USUB8, UXTAB, UXTAB16, UXTAH, UXTB16	No	No	No	No	Yes
SP Float	32	VABS, VADD, VCMP, VCMPE, VCVT, VCVTR, VDIV, VLDM, VLDR, VMLA, VMLS, VMOV, VMRS, VMSR, VMUL, VNEG, VNMLA, VNMLS, VNMUL, VPOP, VPUSH, VSQRT, VSTM, VSTR, VSUB	No	No	No	No	Optional SP FPU
DP Float	32	VCVTA, VCVTM, VCVTN, VCVTP, VMAXNM, VMINNM, VRINTA, VRINTM, VRINTN, VRINTP, VRINTR, VRINTX, VRINTZ, VSEL	No	No	No	No	No
TrustZone	16	BLXNS, BXNS	No	No	No	No	No
TrustZone	32	SG, TT, TTT, TTA, TTAT	No	No	No	No	No
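The "Instr bits" column distinguishes 16-bit from 32-bit encodings. In the Thumb instruction stream, the first halfword alone decides the length: if its bits [15:11] are 0b11101, 0b11110, or 0b11111, it starts a 32-bit instruction; otherwise it is a complete 16-bit instruction. A minimal sketch of that rule follows; the function names are hypothetical, and the counter assumes the buffer starts on an instruction boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the length in bits (16 or 32) of the Thumb instruction that
   begins with the given halfword. */
unsigned thumb_instruction_bits(uint16_t first_halfword)
{
    unsigned top5 = (unsigned)first_halfword >> 11;
    return (top5 == 0x1Du || top5 == 0x1Eu || top5 == 0x1Fu) ? 32u : 16u;
}

/* Count the instructions in a buffer of halfwords. A 32-bit instruction
   truncated at the end of the buffer is still counted once. */
size_t count_thumb_instructions(const uint16_t *code, size_t halfwords)
{
    assert(code != NULL);
    size_t count = 0;
    size_t i = 0;
    while (i < halfwords) {
        i += thumb_instruction_bits(code[i]) / 16u;
        count++;
    }
    return count;
}
```

For instance, the halfword `0xBF00` (a 16-bit NOP) is classified as 16 bits, while `0xF3BF` (the first halfword of DSB) is classified as 32 bits.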

Mnemonic	Syntax	Description
VABS		Floating-point Absolute
VADD		Floating-point Add
VCMP		Compare two floating-point registers, or one floating-point register and zero.
VCMPE		Compare two floating-point registers, or one floating-point register and zero with Invalid Operation check.
VCVT	VCVT{R}{cond}.Tm.F32 Sd, Sm	Convert between floating-point and integer
VCVT		Convert between floating-point and fixed-point
VCVTR		Convert between floating-point and integer with rounding
VCVTB		Converts half-precision value to single-precision
VCVTT		Converts single-precision register to half-precision
VDIV		Floating-point Divide
VFMA		Floating-point Fused Multiply Accumulate
VFNMA		Floating-point Fused Negate Multiply Accumulate
VFNMS		Floating-point Fused Negate Multiply Subtract
VLDM		Load Multiple extension registers
VLDR		Loads an extension register from memory
VMLA		Floating-point Multiply Accumulate
VMLS		Floating-point Multiply Subtract
VMOV		Floating-point Move Immediate
VMOV		Floating-point Move Register
VMOV		Copies an ARM core register to a single-precision register
VMOV		Copies two ARM core registers to two single-precision registers
VMOV		Copies an ARM core register to a scalar
VMOV		Copies a scalar to an ARM core register
VMRS		Move to ARM core register from floating-point System Register.
VMSR		Move to floating-point System Register from ARM Core register.
VMUL		Multiply floating-point
VNEG		Floating-point negate
VNMLA		Floating-point negated multiply accumulate
VNMLS		Floating-point negated multiply subtract
VNMUL		Floating-point negated multiply
VPOP		Pop extension registers
VPUSH		Push extension registers
VSQRT		Floating-point square root
VSTM		Store Multiple extension registers
VSTR		Stores an extension register to memory
VSUB		Floating-point Subtract
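The table lists both a chained multiply accumulate (VMLA) and a fused one (VFMA). The practical difference is rounding: the chained form rounds the product to single precision before the addition, while the fused form rounds only once, after the addition. The host-side sketch below illustrates this with hypothetical function names. It does not execute FPU instructions; the single-rounding case is emulated in double precision, which holds the product of two floats exactly but can, in rare cases, round differently from a true fused operation in the final step.

```c
#include <assert.h>
#include <float.h>

/* Two roundings, in the style of VMLA: the product is rounded to single
   precision before it is added to the accumulator. The volatile store
   forces that rounding and keeps the compiler from fusing the operations. */
float mac_two_roundings(float acc, float a, float b)
{
    volatile float product = a * b;
    return acc + product;
}

/* One rounding at the end, in the style of VFMA, emulated with double. */
float mac_one_rounding(float acc, float a, float b)
{
    /* The product of two floats must fit exactly in a double. */
    assert(DBL_MANT_DIG >= 2 * FLT_MANT_DIG);
    double exact_product = (double)a * (double)b;
    return (float)((double)acc + exact_product);
}
```

With `a = b = 0x1.001p0f` (1 + 2^-12) and `acc = -0x1.002p0f` (-(1 + 2^-11)), the exact product is 1 + 2^-11 + 2^-24. Rounding it to single precision discards the 2^-24 term, so `mac_two_roundings` returns 0, while `mac_one_rounding` keeps the term and returns 2^-24.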