Understanding the CPU

Yu Feng (余锋), Core Systems Database Group

http://yufeng.info

@淘宝褚霸

2012-03-17

Outline

• Overview

• Measurement

• Utilization

Chipset

CPU micro-level diagram

Cache hierarchy

Cache (continued)

Instruction cache
Data cache

Xeon 5600 series CPU

Access speeds of the CPU's internal components

The false sharing problem

Cache lines

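False sharing arises when threads on different cores write to distinct variables that happen to sit on the same cache line, so the line bounces between cores even though no data is logically shared. The sketch below is only an illustration of the address arithmetic, not code from the talk: it assumes 64-byte lines and a line-aligned base address, and the helper names (cache_line_index, may_false_share, padded_offsets) are hypothetical.

```python
CACHE_LINE_BYTES = 64  # assumption: 64-byte cache lines, typical for x86


def cache_line_index(address, line_bytes=CACHE_LINE_BYTES):
    """Return the index of the cache line that contains this byte address."""
    return address // line_bytes


def may_false_share(addr_a, addr_b, line_bytes=CACHE_LINE_BYTES):
    """True if two addresses fall on the same cache line."""
    return cache_line_index(addr_a, line_bytes) == cache_line_index(addr_b, line_bytes)


def padded_offsets(count, item_bytes, line_bytes=CACHE_LINE_BYTES):
    """Offsets that give each item its own line (stride rounded up to a line multiple)."""
    stride = -(-item_bytes // line_bytes) * line_bytes  # ceiling division
    return [i * stride for i in range(count)]


# Four 8-byte per-thread counters laid out back to back, offsets from an
# assumed line-aligned base address: all four land on one line.
packed = [i * 8 for i in range(4)]
# The same counters padded so that each one occupies its own line.
padded = padded_offsets(4, 8)

print(may_false_share(packed[0], packed[3]))
print(may_false_share(padded[0], padded[1]))
```

Padding costs memory (one full line per counter here), so it is usually worth it only for data that several threads write frequently.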
Intel Sandy Bridge has arrived

Upgraded features from Nehalem include

•   32 kB data + 32 kB instruction L1 cache (3 clocks) and 256 kB L2 cache (8 clocks) per core

•   Shared L3 cache includes the processor graphics (LGA 1155)

•   64-byte cache line size

•   Two load/store operations per CPU cycle for each memory channel

•   Decoded micro-operation cache and enlarged, optimized branch predictor

•   Improved performance for transcendental mathematics, AES encryption (AES instruction
    set), and SHA-1 hashing

•   256-bit/cycle ring bus interconnect between cores, graphics, cache and System Agent
    Domain

•   Advanced Vector Extensions (AVX) 256-bit instruction set with wider vectors, new
    extensible syntax and rich functionality

•   Intel Quick Sync Video, hardware support for video encoding and decoding

•   Up to 8 physical cores or 16 logical cores through Hyper-threading
lscpu

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
CPU socket(s):         2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Stepping:              2
CPU MHz:               2400.461
BogoMIPS:              4799.93
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23

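The lscpu fields can be cross-checked against each other: sockets x cores per socket x threads per core should equal the CPU(s) count. The following is a minimal sketch of that check, not part of the original slides; parse_lscpu and logical_cpus are hypothetical helper names, and SAMPLE is an excerpt of the output shown on this slide.

```python
def parse_lscpu(text):
    """Parse the `key: value` lines of lscpu output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def logical_cpus(fields):
    """Sockets x cores per socket x threads per core."""
    # The slide's lscpu labels this field "CPU socket(s)"; other versions
    # print "Socket(s)", so both spellings are tried here.
    sockets = int(fields.get("CPU socket(s)", fields.get("Socket(s)", "1")))
    cores = int(fields["Core(s) per socket"])
    threads = int(fields["Thread(s) per core"])
    return sockets * cores * threads


SAMPLE = """\
CPU(s):                24
Thread(s) per core:    2
Core(s) per socket:    6
CPU socket(s):         2
NUMA node0 CPU(s):     0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s):     1,3,5,7,9,11,13,15,17,19,21,23
"""

fields = parse_lscpu(SAMPLE)
total = logical_cpus(fields)
node0 = [int(c) for c in fields["NUMA node0 CPU(s)"].split(",")]
node1 = [int(c) for c in fields["NUMA node1 CPU(s)"].split(",")]
print(total, len(node0), len(node1))
```

On this machine the even-numbered logical CPUs belong to NUMA node 0 and the odd-numbered ones to node 1, which matters when pinning threads to CPUs.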
CPU topology diagram

# ./cpu_topology64.out

hwconfig

Processors:     2 x Xeon E5645 2.40GHz
5860MHz FSB (HT enabled, 12 cores, 24 threads)

cpus bits="64"
cores="12"
cores_active="12"
ht_bios_enable="1"
ht_enable="1"
ht_support="1"
sockets="2"
sockets_populated="2"
threads="24"
threads_active="24"

hwconfig -x

apic_id="0"
bits="64"
core_id="0"
cores="6"
cpuid="0x000206c2"
cpuid_level="11"
family_id="6"
fsb="5860MHz"
l1_cache_size="32768"
l2_cache_size="262144"
l3_cache_size="12582912"
model="Intel(R) Xeon(R) CPU E5645 @ 2.40GHz"
model_id="44"
multi_threading="32"
name="cpu1"
package_id="0"
physical_address_bits="40"
speed="2400461000"
stepping_id="2"
threads="12"
turbo_frequencies="2800000000 2800000000 2666666666 2666666666"
vendor="Intel"
vendor_id="GenuineIntel"
virtual_address_bits="48"

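hwconfig reports cache sizes in bytes, while lscpu reports them in K. As a rough sketch (not from the talk), the snippet below converts the three sizes above to KiB and estimates how many cache lines each level holds. The 64-byte line size is an assumption here, since hwconfig does not print it, and describe is a hypothetical helper.

```python
CACHE_LINE_BYTES = 64  # assumption: hwconfig does not report the line size

# Values copied from the hwconfig -x output above (bytes).
cache_sizes = {
    "l1": 32768,
    "l2": 262144,
    "l3": 12582912,
}


def describe(size_bytes, line_bytes=CACHE_LINE_BYTES):
    """Return (size in KiB, number of cache lines) for one cache level."""
    return size_bytes // 1024, size_bytes // line_bytes


summary = {name: describe(size) for name, size in cache_sizes.items()}
for name, (kib, lines) in summary.items():
    print(f"{name}: {kib} KiB, {lines} lines")
```

The KiB figures agree with the 32K, 256K and 12288K that lscpu prints for the same machine.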
Performance numbers everyone should know

L1 cache reference                                        0.5 ns
Branch mispredict                                           5 ns
L2 cache reference                                          7 ns
Mutex lock/unlock                                          25 ns
Main memory reference                                     100 ns
Compress 1K bytes with Zippy                            3,000 ns
Send 2K bytes over 1 Gbps network                      20,000 ns
Read 1 MB sequentially from memory                    250,000 ns
Round trip within same datacenter                     500,000 ns
Disk seek                                          10,000,000 ns
Read 1 MB sequentially from disk                   20,000,000 ns
Send packet CA->Netherlands->CA                   150,000,000 ns

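These numbers are easiest to remember as ratios. The sketch below (an illustration, not part of the slides) copies a few rows of the table and expresses each as a multiple of one L1 cache reference; relative_to is a hypothetical helper name.

```python
# A few rows copied from the table above, in nanoseconds.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Main memory reference": 100,
    "Read 1 MB sequentially from memory": 250_000,
    "Disk seek": 10_000_000,
}


def relative_to(baseline, table=LATENCY_NS):
    """Express every latency as a multiple of the baseline entry."""
    base = table[baseline]
    return {name: ns / base for name, ns in table.items()}


ratios = relative_to("L1 cache reference")
for name, ratio in ratios.items():
    print(f"{name:<36} {ratio:>14,.0f} x L1")
```

By this table, one main memory reference costs as much as 200 L1 hits, and one disk seek as much as 20 million.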
lmbench micro-measurements

Basic double operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------------
Host      OS             double add  double mul  double div  double bogo
-------------------------------------------------------------------------
Dr4000    Linux 2.6.32-  1.1400      1.9000      8.9500      7.7100


Memory latencies in nanoseconds - smaller is better
-------------------------------------------------------------------------
Host      OS             Mhz   L1 $    L2 $    Main mem  Rand mem  Guesses
-------------------------------------------------------------------------
Dr4000    Linux 2.6.32-  2631  1.1590  5.7170  78.0      110.4

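Latencies in nanoseconds can be converted to clock cycles with cycles = ns x MHz / 1000. The sketch below applies that to the lmbench row above, using the 2631 MHz that lmbench itself reports. That figure is lmbench's own estimate and differs from the nominal 2.40 GHz, so the cycle counts are approximate; ns_to_cycles is a hypothetical helper name.

```python
def ns_to_cycles(latency_ns, clock_mhz):
    """Convert a latency in nanoseconds to clock cycles at the given frequency."""
    return latency_ns * clock_mhz / 1000.0


CLOCK_MHZ = 2631  # the "Mhz" column of the lmbench output above

# Memory latencies from the lmbench output above, in nanoseconds.
measured_ns = {
    "L1 $": 1.1590,
    "L2 $": 5.7170,
    "Main mem": 78.0,
    "Rand mem": 110.4,
}

cycles = {name: ns_to_cycles(ns, CLOCK_MHZ) for name, ns in measured_ns.items()}
for name, value in cycles.items():
    print(f"{name:<9} {value:8.1f} cycles")
```

At that clock, the measured L1 latency works out to roughly 3 cycles, L2 to roughly 15, and main memory to roughly 205.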
Cache-related hardware events

perf list

References

• lscpu, a viewer for CPU architecture information:
  http://blog.yufeng.info/archives/1886
• Investigating CPU topology: http://blog.yufeng.info/archives/666
• Viewing hardware information with hwconfig:
  http://blog.yufeng.info/archives/2086
• LMbench, a practical micro-benchmarking tool:
  http://blog.yufeng.info/archives/tag/lmbench

Questions

Thank you, everyone!
