http://www.ic.unicamp.br/~celio/mc404-2012/armslides.html
ARM Architecture Overview
Ref:
http://en.wikipedia.org/wiki/ARM_architecture
History
- 1984: Acorn Computers Ltd (ARM: Acorn RISC Machine)
- Designers: Steve Furber and Sophie Wilson
- April 1985: VLSI Technology produced the first chip, ARM1
- 1986: ARM2, used in the first production systems (30,000 transistors)
- 1992: ARM6 produced for Apple (35,000 transistors)
- 2006: ARMv7 core (CPU specification)
- 2011: ARMv8: 64 bit data & addressing - no production yet
- 2011: 15 billion ARM processors sold (market share: 90% mobile & smartphones)
- Licensees: dozens from IBM, Samsung, Microsoft to Nintendo, Nvidia, etc.
- 2006 royalty revenue: US$ 164 million; licensees shipped 2.45 billion units
(about 7 US cents per unit!)
RISC features
- Load/store architecture.
- Uniform 16 x 32-bit register file.
- Fixed instruction width of 32 bits
- Mostly single clock-cycle execution.
- Conditional execution of most instructions.
- Arithmetic instructions alter condition codes only when desired.
- Powerful indexed addressing modes.
- Link register for fast leaf function calls.
- Simple, but fast, 2-priority-level interrupt subsystem with switched register banks.
- Add, subtract, and multiply instructions. Integer divide instructions on ARMv7.
- 3 register operands in arithmetic/logical instructions, with an optional constant shift applied to the last operand:
    ADD R0, R2, R3, LSL #2    @ R0 := R2 + (R3 << 2)
CPU modes
- User mode
- System mode
- Supervisor mode: SWI instruction (system call)
- Interrupt mode (IRQ)
- Fast Interrupt mode (FIQ)
- Hyp mode (ARMv7-A virtualization support)
- Abort mode
Registers
R0-R12 - General purpose
R13 - SP: Stack Pointer
R14 - LR: Link Register
R15 - PC: Program counter
R13-R14: banked in each exception mode (System mode shares the User mode registers); R8-R12 are also banked in fast interrupt mode
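Banking can be pictured as a lookup from (mode, register number) to a physical storage slot. The following is a minimal sketch in C, not a description of any real core: the names `cpu_mode`, `cpu_regs` and `reg_ptr` are invented for illustration, only the modes listed above are modelled, and Hyp mode is left out because its banking follows different rules.

```c
#include <stdint.h>

/* Hypothetical, simplified mode list (Hyp mode deliberately omitted). */
typedef enum {
    MODE_USER, MODE_SYSTEM, MODE_SUPERVISOR,
    MODE_IRQ, MODE_FIQ, MODE_ABORT, MODE_COUNT
} cpu_mode;

typedef struct {
    uint32_t shared[16];                   /* R0-R15 as seen from User/System mode */
    uint32_t fiq_r8_r12[5];                /* private FIQ copies of R8-R12 */
    uint32_t banked_sp_lr[MODE_COUNT][2];  /* private R13/R14 per exception mode */
} cpu_regs;

/* Return the storage that register Rn (n in 0..15) names in the given mode. */
uint32_t *reg_ptr(cpu_regs *cpu, cpu_mode mode, unsigned n)
{
    int exception_mode = (mode != MODE_USER && mode != MODE_SYSTEM);

    if (mode == MODE_FIQ && n >= 8 && n <= 12)
        return &cpu->fiq_r8_r12[n - 8];        /* FIQ has its own R8-R12 */
    if (exception_mode && (n == 13 || n == 14))
        return &cpu->banked_sp_lr[mode][n - 13]; /* private SP and LR */
    return &cpu->shared[n];                    /* everything else is shared */
}
```

With this model, writing R13 in IRQ mode leaves the User mode stack pointer untouched, which is why an interrupt handler can start with a valid stack of its own without saving the interrupted program's SP first.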
Conditional execution
4-bit condition code selector on every instruction
Assembler: suffix to instruction mnemonics (see example below).
Example C code:
    while (i != j) {
        if (i > j)
            i -= j;
        else
            j -= i;
    }
Assembler code:
loop    CMP    Ri, Rj       ; set condition "NE" if (i != j),
                            ;               "GT" if (i > j),
                            ;            or "LT" if (i < j)
        SUBGT  Ri, Ri, Rj   ; if "GT" (greater than), i = i-j;
        SUBLT  Rj, Rj, Ri   ; if "LT" (less than), j = j-i;
        BNE    loop         ; if "NE" (not equal), then loop
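To make the role of the flags explicit, the loop above can be emulated in C. This is only a sketch under stated assumptions: `flags_t`, `cmp`, `cond_ne`, `cond_gt`, `cond_lt` and `gcd_predicated` are names invented here, only the N, Z and V flags are modelled (the conditions used above do not need C), and both inputs are assumed to be positive, as in the original C loop.

```c
#include <stdint.h>

/* Subset of the condition flags needed by NE, GT and LT. */
typedef struct { int n, z, v; } flags_t;

/* CMP a, b: compute a - b, discard the result, keep the flags. */
flags_t cmp(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t r = ua - ub;
    flags_t f;
    f.n = (int)((r >> 31) & 1u);   /* result is negative */
    f.z = (r == 0);                /* result is zero */
    /* signed overflow: operands differ in sign and the result's sign differs from a */
    f.v = (int)((((ua ^ ub) & (ua ^ r)) >> 31) & 1u);
    return f;
}

int cond_ne(flags_t f) { return !f.z; }
int cond_gt(flags_t f) { return !f.z && f.n == f.v; }
int cond_lt(flags_t f) { return f.n != f.v; }

/* One CMP, two predicated subtractions, one conditional branch per pass. */
int32_t gcd_predicated(int32_t i, int32_t j)
{
    flags_t f;
    do {
        f = cmp(i, j);               /* CMP   Ri, Rj     */
        if (cond_gt(f)) i = i - j;   /* SUBGT Ri, Ri, Rj */
        if (cond_lt(f)) j = j - i;   /* SUBLT Rj, Rj, Ri */
    } while (cond_ne(f));            /* BNE   loop       */
    return i;
}
```

Both subtractions test the flags left by the single CMP, because neither SUB carries the S suffix that would update them. That is what lets the assembler version avoid the branches an if/else would otherwise need.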
Other features
- "Flat subroutine calls" via branch and link instruction (bl):
Example:
main:
        bl   mysub       @ stores address of next instruction in link register
        ...              @ will return to instruction here
        ...              @ other instructions
mysub:                   @ subroutine
        ...              @ do something
        mov  pc, lr      @ returns to caller
Note: nested subroutine calls should save lr (return address)
and possibly other registers on the stack via the push {reg list} instruction
- Folds shifts and rotates into a single instruction:
    a += (j << 2);
could be compiled into:
    ADD Ra, Ra, Rj, LSL #2
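The shifted second operand can be modelled in a few lines of C. This is a simplified sketch, not the exact encoding rules: `shift_kind`, `shifted_operand` and `add_shifted` are hypothetical names, the shift amount is assumed to be a constant in the range 0..31, and an amount of 0 is treated here as "no shift" for every shift kind.

```c
#include <stdint.h>

typedef enum { SHIFT_LSL, SHIFT_LSR, SHIFT_ASR, SHIFT_ROR } shift_kind;

/* Apply a constant shift (0..31) to the second operand before the ALU sees it. */
uint32_t shifted_operand(uint32_t value, shift_kind kind, unsigned amount)
{
    if (amount == 0)
        return value;                       /* simplification: 0 means unchanged */

    switch (kind) {
    case SHIFT_LSL:                         /* logical shift left */
        return value << amount;
    case SHIFT_LSR:                         /* logical shift right, zero fill */
        return value >> amount;
    case SHIFT_ASR:                         /* arithmetic shift right, sign fill */
        return (value >> amount) |
               ((value & 0x80000000u) ? ~(0xFFFFFFFFu >> amount) : 0u);
    case SHIFT_ROR:                         /* rotate right */
        return (value >> amount) | (value << (32 - amount));
    }
    return value;
}

/* Models: ADD Rd, Rn, Rm, <shift> #amount */
uint32_t add_shifted(uint32_t rn, uint32_t rm, shift_kind kind, unsigned amount)
{
    return rn + shifted_operand(rm, kind, amount);
}
```

For instance, `add_shifted(a, j, SHIFT_LSL, 2)` computes the same value as `a + (j << 2)`, which is the typical address calculation for indexing an array of 32-bit words.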
Thumb instruction set
- compact 16-bit instruction set, a subset of the normal 32-bit instructions
- improves code density compared with the 32-bit instruction set
- at execution, instructions operate on full 32-bit operands (registers and addresses)
- some instruction operands are implicit
- some opcodes restricted to half of general purpose registers
- only branches can be conditional
- supported by ARM7TDMI and later families
- 2003: Thumb-2, which adds 32-bit instructions to Thumb (introduced with the ARM1156 core,
supported by all ARMv7 chips)
and adds new instructions to both the ARM and Thumb instruction sets
Coprocessors
- Up to 16, numbered 0-15
- addressed via the instructions MCR, MRC, MCRR, MRRC
- peripherals are attached to the processor either through the coprocessor space
or by mapping their physical registers into ARM memory space.
- FPU: provides low-cost single-precision and double-precision floating-point computation
- NEON (Media Processing Engine),
for media and signal processing applications (audio, video, graphics and gaming):
    - 64- and 128-bit SIMD instruction set
    - 8-, 16-, 32- and 64-bit integer and 32-bit floating-point support
    - 128-bit vector processing, up to 16 operations at the same time
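The figure of 16 simultaneous operations follows from splitting a 128-bit vector into 8-bit lanes (128 / 8 = 16). The sketch below emulates one such lane-wise addition with a plain scalar loop; `vec128_u8` and `vadd_u8` are names made up for this illustration, and on real hardware a single SIMD instruction would process all lanes at once (the `vaddq_u8` intrinsic from `arm_neon.h` plays that role in C).

```c
#include <stdint.h>

#define LANES 16   /* 128 bits / 8 bits per lane */

/* A 128-bit vector viewed as sixteen unsigned 8-bit lanes. */
typedef struct { uint8_t lane[LANES]; } vec128_u8;

/* Lane-wise addition; each lane wraps around modulo 256 independently. */
vec128_u8 vadd_u8(vec128_u8 a, vec128_u8 b)
{
    vec128_u8 r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = (uint8_t)(a.lane[i] + b.lane[i]);
    return r;
}
```

No carry propagates from one lane into the next, which is the essential difference between a SIMD add and an ordinary 128-bit integer addition.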
Jazelle DBX (Direct Bytecode eXecution)
- Java bytecode executed by hardware as a third execution state and instruction set,
- entered by BXJ (Branch to Java) instruction,
- extra stage between fetch and decode in the processor instruction pipeline,
- recognised bytecodes are converted into a string of one or more native ARM instructions
- ThumbEE (Jazelle RCT): a fourth instruction set state, supports JIT compilation, also useful for Python, C#, Perl
Hardware Debugging
- JTAG support:
    - device, board and system testing
- EmbeddedICE (In Circuit Emulation) over JTAG:
    - allows debugging of embedded system software at the machine instruction level
- ARM SWD protocol
- ARMv7: breakpoints, watchpoints, Debug Mode instruction execution
Microsoft Operating Systems supporting ARM
Windows CE, Windows 8, Windows RT (the latter for ARM only)
Hello World assembler example:
        .data
        .align 2
Hello_message: .string "Hello World!"
        .text
        .align 2
        .global main
main:
        push {lr}               @ lr contains return address to OS
        ldr  r0, =Hello_message
        bl   puts
        pop  {pc}               @ return to OS