C & CPU
fea<feathertw@gmail.com>
2016.02.16
WHAT?
● A complete system
○ HW: CPU, BUS, Memory, Peripheral
○ SW: simple OS
● Language
○ HW: Verilog
○ SW: C & assembly
WHY?
● Background
○ know a little C
○ learning embedded systems (STM32F4)
● Motivation
○ understand how a computer system runs
○ understand the relationship between the CPU and the OS
Target
● Define software needs -> implement the necessary hardware
● Software
○ run a basic C program
○ basic functions of an OS kernel
● Hardware
○ CPU
○ BUS
○ Memory
○ UART (registers only)
Communicate
[Diagram: CPU connected to xUART (0 and 1); each has RX and TX registers with DATA, FLAG, and ENABLE fields]
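The register fields named on this slide (DATA, FLAG, ENABLE) suggest a simple polled interface. The following is a minimal sketch of how C code might drive one TX channel. The struct layout, the field widths, the name `uart_try_send`, and the rule that software sets FLAG while hardware clears it are all assumptions for illustration, not the actual TiniSOC register map.

```c
#include <stdint.h>

/* Hypothetical register block for one UART channel (RX or TX).
 * The slide only names DATA, FLAG and ENABLE; the order and the
 * 32-bit widths are assumptions. */
typedef struct {
    volatile uint32_t data;   /* byte to send or byte received */
    volatile uint32_t flag;   /* 1 while a byte is pending */
    volatile uint32_t enable; /* 1 when the channel is switched on */
} uart_chan_t;

/* Try to queue one byte on a TX channel.
 * Returns 1 on success, 0 if the channel is disabled or still busy. */
int uart_try_send(uart_chan_t *tx, uint8_t byte)
{
    if (!tx->enable || tx->flag)
        return 0;
    tx->data = byte;
    tx->flag = 1; /* assumed: software sets the flag, hardware clears it */
    return 1;
}
```

The fields are declared `volatile` so the compiler reloads them on every access instead of caching a stale value in a register.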
iVerilog
Demo
Demo program
A program that exercises the basic functions of an OS, including:
○ boot to set up the C environment
○ control kernel mode and user mode
○ set up user tasks
○ context switch
○ communicate between tasks
Demo Environment
● Ubuntu 12.04 LTS 32-bit
● Icarus Verilog 0.9.5
○ sudo apt-get install iverilog
● nds32le-linux-glibc-V1 toolchain 2009 version
● git v1.7.9.5
○ sudo apt-get install git
● Python v2.7.3
● GTKWave v3.3.34
○ sudo apt-get install gtkwave
Setting & Run
● #Download toolchain from https://goo.gl/HPeM0H
● unzip nds32le-linux-glibc-V1-zip
● export PATH=$PATH:$(pwd)/nds32le-linux-glibc-V1/bin
●
● git clone https://github.com/feathertw/TiniSOC.git
● git clone https://github.com/feathertw/TiniOS.git
● git clone https://github.com/feathertw/TiniRUN.git
● cd TiniRUN/
● . run.sh
Hardware
Architecture
[Diagram: the CPU's I cache, D cache, and I/O port each sit behind a master wrapper on the AMBA-AHB bus; IM, DM, and Peripheral each sit behind a slave wrapper]
CPU
References:
1. AndesStar Instruction Set Architecture
2. STM32 (ARM Cortex-M)
3. MIPS 5-stage pipeline
Characteristics
1. 5-stage pipeline
2. write-through cache
3. jump cache & branch prediction
4. kernel mode & user mode
5. interrupt handling
6. systick timer
Opcode
R-type: ADD, SUB, AND, OR, XOR, SEB, SEH, SLT, SLTS, SRL, SLL, SRA, SRLI, SLLI, SRAI, ROTRI, CMOVN, CMOVZ
I-type: ADDI, SUBRI, ANDI, ORI, XORI, SLTI, SLTSI
type3: MOVI, SETHI
Memory: LW, SW, LWI, SWI, LWI.BI, SWI.BI
BRANCH: BEQ, BNE, BEQZ, BGEZ, BGTZ, BLEZ, BLTZ, BNEZ
JUMP: J, JAL, JR, JRAL
MISC: MTSR, MFSR, SYSCALL, IRET
Note: "lmw, smw" are changed to "lwi, swi" at compile time
Jump cache
Example code:
0000  A1 $r0, $r1, $r2
0004  A2 $r0, $r1, $r2
0008  j Hello(0x10)
000c  A3 $r0, $r1, $r2
Hello:
0010  A4 $r0, $r1, $r2
0014  A5 $r0, $r1, $r2
Pipeline on a jump cache miss (stage 1 = fetch, stage 2 = decode):
        1    2    3    4    5
step1   j    A2   A1
step2   A3   j    A2   A1
step3   A4   A3   j    A2   A1
Pipeline on a jump cache hit:
        1    2    3    4    5
step1   j    A2   A1
step2   A4   j    A2   A1
step3   A5   A4   j    A2   A1
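The miss and hit cases above can be modeled in a few lines of C. This is only a sketch of the idea: a small table maps the address of a jump to its target, so the fetch stage can redirect immediately on a hit. The table size, the direct-mapped indexing, and every name below (`jcache_t`, `jcache_lookup`, `jcache_update`, `next_fetch_pc`) are assumptions; the slides do not describe how the real jump cache is organized.

```c
#include <stdint.h>

#define JCACHE_ENTRIES 8 /* assumed size, not taken from the slides */

typedef struct {
    int      valid;
    uint32_t pc;     /* address of the jump instruction */
    uint32_t target; /* address it jumps to */
} jcache_entry_t;

typedef struct {
    jcache_entry_t e[JCACHE_ENTRIES];
} jcache_t;

/* Direct-mapped index; instructions are 4 bytes wide. */
static unsigned jcache_index(uint32_t pc)
{
    return (pc >> 2) % JCACHE_ENTRIES;
}

/* Fetch stage: returns 1 and writes *target on a hit, 0 on a miss. */
int jcache_lookup(const jcache_t *c, uint32_t pc, uint32_t *target)
{
    const jcache_entry_t *e = &c->e[jcache_index(pc)];
    if (e->valid && e->pc == pc) {
        *target = e->target;
        return 1;
    }
    return 0;
}

/* Decode stage: remember a jump once its target is known. */
void jcache_update(jcache_t *c, uint32_t pc, uint32_t target)
{
    jcache_entry_t *e = &c->e[jcache_index(pc)];
    e->valid  = 1;
    e->pc     = pc;
    e->target = target;
}

/* Address the fetch stage uses after fetching the instruction at pc. */
uint32_t next_fetch_pc(const jcache_t *c, uint32_t pc)
{
    uint32_t target;
    return jcache_lookup(c, pc, &target) ? target : pc + 4;
}
```

With an empty cache, `next_fetch_pc` for the jump at 0x08 returns 0x0c (A3 is fetched, as in the miss case); after `jcache_update(&c, 0x08, 0x10)` it returns 0x10 (A4 is fetched, as in the hit case).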
Jump cache test
_start:
    j Reset_Handler
Reset_Handler:
    la $r0, #1000
    la $r15, #1
my_repeat:
    beqz $r0, my_exit
    addi $r0, $r0, #-1
    j my_repeat
my_exit:
    la $r15, #2
my_loop:
    j my_loop
Result:
no jcache     with jcache    improve rate
4005 cycles   3036 cycles    24%
[Waveforms: with jcache, no jcache]
Branch prediction
[State diagram: four states, each with a "Taken" and a "Not taken" transition]
S0: Predict Taken
S1: Predict Taken
S2: Predict Not Taken
S3: Predict Not Taken
Example code:
0000  A1 $r0, $r1, $r2
0004  A2 $r0, $r1, $r2
0008  beq $r0, $r1 Hello(0x10)
000c  A3 $r0, $r1, $r2
Hello:
0010  A4 $r0, $r1, $r2
0014  A5 $r0, $r1, $r2
Miss, branch taken:
        1    2    3    4    5
step1   beq  A2   A1
step2   A3   beq  A2   A1
step3   A4   A3   beq  A2   A1
Hit in S1, branch taken:
        1    2    3    4    5
step1   beq  A2   A1
step2   A4   beq  A2   A1
step3   A5   A4   beq  A2   A1
Hit in S0, branch not taken:
        1    2    3    4    5
step1   beq  A2   A1
step2   A4   beq  A2   A1
step3   A3   A4   beq  A2   A1
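A four-state predictor of this kind can be sketched in C as follows. The slide fixes only the prediction made in each state (S0 and S1 predict taken, S2 and S3 predict not taken). The transitions used here are an assumption: a conventional saturating chain S0 <-> S1 <-> S2 <-> S3, in which a taken branch moves one step toward S0 and a not-taken branch moves one step toward S3. The actual state diagram may be wired differently.

```c
/* Predictor states as named on the slide. */
typedef enum { S0, S1, S2, S3 } bp_state_t;

/* S0 and S1 predict taken; S2 and S3 predict not taken. */
int bp_predict_taken(bp_state_t s)
{
    return s == S0 || s == S1;
}

/* Assumed transitions: a saturating chain S0 <-> S1 <-> S2 <-> S3.
 * Taken moves toward S0, not taken moves toward S3. */
bp_state_t bp_update(bp_state_t s, int taken)
{
    if (taken)
        return (s == S0) ? S0 : (bp_state_t)(s - 1);
    return (s == S3) ? S3 : (bp_state_t)(s + 1);
}
```

Under this assumption a single surprise does not flip the prediction from S0 or S3, which suits loop branches that are taken many times and fall through once.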
Branch prediction test
_start:
    j Reset_Handler
Reset_Handler:
    la $r0, #1000
    la $r15, #1
my_repeat:
    addi $r0, $r0, #-1
    bnez $r0, my_repeat
    nop
my_exit:
    la $r15, #2
my_loop:
    j my_loop
Result:
no bcache     with bcache    improve rate
3001 cycles   2004 cycles    33%
[Waveforms: with branch prediction, no branch prediction]
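The improve rate reported for the jump cache test and the branch prediction test is the fraction of cycles saved relative to the run without the cache. As a small sketch, the figure can be reproduced with integer arithmetic only; the function name `improvement_percent` is made up for this example.

```c
/* Percentage of cycles saved, rounded down.
 * Requires before > 0 and after <= before. */
unsigned improvement_percent(unsigned before, unsigned after)
{
    return (before - after) * 100u / before;
}
```

`improvement_percent(4005, 3036)` gives 24 and `improvement_percent(3001, 2004)` gives 33, matching the two tests. The absolute savings are 969 and 997 cycles over 1000 loop iterations, that is, roughly one cycle per iteration.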
BUS-AHB
[Diagram: the CPU's I cache, D cache, and I/O port connect over the AHB bus to IM, DM, and Peripheral; 2 cycle delay]
Signal
[Waveform: 16-beat incrementing burst, for icache width; 2 cycle delay]
Software
Toolchain
● nds32le-linux-glibc-V1
○ nds32le-linux-gcc
○ nds32le-linux-as
○ nds32le-linux-ld
○ nds32le-linux-objcopy
○ nds32le-linux-objdump
Memory
Memory map
[Diagram: address space divided into IM, DM, Peripheral, and System regions, with addresses marked at 0x0000_0000, 0x0100_0000, 0x0200_0000, 0x0300_0000, 0x0E00_0000, and 0x0F00_0000]
Linker script
MEMORY
{
    FLASH (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00040000    /* IM */
    RAM   (rwx) : ORIGIN = 0x01000000, LENGTH = 0x00040000    /* DM */
}
SECTIONS
{
    .text :
    {
        _stext = . ;
        *(vectors)
        *(.text)
        _etext = . ;
    } > FLASH
    .data : AT(_etext)
    {
        _sdata = . ;
        *(.rodata*)
        *(.data)
        _edata = . ;
    } > RAM
    data_size = _edata - _sdata ;
    .bss :
    {
        _sbss = . ;
        *(.bss)
        _ebss = . ;
    } > RAM
    bss_size = _ebss - _sbss ;
}
Boot
    .section "vectors"
_start:
    j Reset_Handler
    j Syscall_Handler
    j Systick_Handler
Reset_Handler:
init_data:
    la $r0, _etext
    la $r1, _sdata
    la $r2, data_size
    la $r3, #0x000000FF
copy_data:
    beqz $r2, init_bss
    lwi.bi $r4, [$r0], #4
    …
    swi.bi $r4, [$r1], #4
    addi $r2, $r2, #-4
    j copy_data
init_bss:
    la $r0, _sbss
    la $r1, bss_size
zero_bss:
    beqz $r1, init_sp
    movi $r2, #0
    swi.bi $r2, [$r0], #4
    addi $r1, $r1, #-4
    j zero_bss
init_sp:
    la $sp, STACK_ADDR
    j main
Vector table
0x0000_0000  Reset_Handler
0x0000_0004  Syscall_Handler
0x0000_0008  Systick_Handler
0x0000_000C  (first address after the table)
Boot steps
1. vector table
2. copy the data section from flash to RAM
3. zero the bss section
4. set the stack pointer
5. jump to the C main function
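For readers more comfortable with C than with assembly, the copy_data and zero_bss loops in the boot code correspond roughly to the sketch below. The function names and the plain-pointer interface are assumptions made so the logic can be run on a host machine. In the real boot code the addresses come from the linker symbols `_etext`, `_sdata`, and `_sbss`, and the lengths from `data_size` and `bss_size`.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the initialized-data image one word at a time, like copy_data.
 * Like the assembly, this assumes the length is a multiple of 4 bytes. */
void boot_copy_data(const uint32_t *load_addr, uint32_t *run_addr,
                    size_t bytes)
{
    while (bytes != 0) {
        *run_addr++ = *load_addr++; /* lwi.bi then swi.bi */
        bytes -= 4;
    }
}

/* Clear the bss region one word at a time, like zero_bss. */
void boot_zero_bss(uint32_t *start, size_t bytes)
{
    while (bytes != 0) {
        *start++ = 0;
        bytes -= 4;
    }
}
```

This is the reason startup has to be written in assembly first: until these loops have run and the stack pointer is set, initialized globals, zeroed globals, and function calls in C cannot be relied on.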
Context switch
to_user_mode:
    smw.adm $r1, [$sp], $r27, 14
    movi $r1, #1
    mtsr $r1, $PSW
    mtsr $r1, $IPSW
    addi $sp, $r0, #8
    lmw.bim $r4, [$sp], $r27, 0
    iret
to_kernel_mode:
    mfsr $r0, $P_P1
    smw.adm $r3, [$r0], $r27, 0
    addi $r0, $r0, #-4
    lmw.bim $r1, [$sp], $r27, 14
    jr $lp
System registers:
$PSW   SYSTEM_MODE (RW)  0: kernel mode, 1: user mode
$IPSW  IRET_MODE (RW)    0: kernel mode, 1: user mode
$P_P0  KSP (R)           kernel stack pointer
$P_P1  USP (R)           user stack pointer
Need to know
mul_u10:
    slli $r1, $r0, #2
    add $r0, $r1, $r0
    slli $r0, $r0, #1
mod_u10:
    addi $r1, $r0, #0
    jal div_u10
    jal mul_u10
    sub $r0, $r1, $r0
unsigned div_u10(unsigned n){
    unsigned q, r;
    /* approximate n/10 using only shifts and adds */
    q=(n>>1)+(n>>2);
    q=q+(q>>4);
    q=q+(q>>8);
    q=q+(q>>16);
    q=q>>3;
    /* r = n - q*10, then correct q when the estimate is one too small */
    r=n-(((q<<2)+q)<<1);
    return q+((r+6)>>4);
}
● The CPU does not implement multiply or divide instructions
● Compile with -O2 when using arrays
● Use only the "int" type for all variables
● Declare peripheral variables volatile
● ISRs need to be placed in boot.s
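Because the CPU has no multiply or divide instructions, even printing a decimal number has to be built from shifts and adds. The sketch below restates the three helpers from this slide in C and uses them in a small number-to-string routine. `utoa10` and its buffer interface are additions for illustration, and the code assumes a 32-bit `unsigned`.

```c
/* n * 10 as ((n * 4) + n) * 2, matching the mul_u10 assembly. */
unsigned mul_u10(unsigned n)
{
    return ((n << 2) + n) << 1;
}

/* n / 10 without a divide instruction (assumes 32-bit unsigned). */
unsigned div_u10(unsigned n)
{
    unsigned q, r;
    q = (n >> 1) + (n >> 2);
    q = q + (q >> 4);
    q = q + (q >> 8);
    q = q + (q >> 16);
    q = q >> 3;
    r = n - (((q << 2) + q) << 1); /* r = n - q * 10 */
    return q + ((r + 6) >> 4);     /* add 1 when r >= 10 */
}

/* n % 10 as n - (n / 10) * 10, matching the mod_u10 assembly. */
unsigned mod_u10(unsigned n)
{
    return n - mul_u10(div_u10(n));
}

/* Hypothetical helper: write n in decimal into buf (at least 11 bytes)
 * and return the number of digits written. */
int utoa10(unsigned n, char *buf)
{
    char tmp[10];
    int len = 0;
    int i;

    do {
        tmp[len++] = (char)('0' + mod_u10(n));
        n = div_u10(n);
    } while (n != 0);

    for (i = 0; i < len; i++)
        buf[i] = tmp[len - 1 - i]; /* digits were produced in reverse */
    buf[len] = '\0';
    return len;
}
```

Calling `utoa10(2016, buf)` returns 4 and leaves the string "2016" in `buf`.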
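Compile Flow
[Flow diagram: .c and .s sources -> _.s -> .o -> .elf -> .bin -> .prog, using gcc, as, ld, and objcopy together with the scripts opcode_translate.py and binary_analyze.py; the result is loaded by iverilog]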
Conclusion
Next?
● Bugs
● Run on FPGA
● DMA, Write-back cache, ......
Final
● A big screen helps when staring at waveforms
● Bugs everywhere
● Be careful when using -O3
● Knowing more about the CPU helps with C programming
● The definition of HW & SW differs between EE and CS
Thank you.
