Copyright © 2016 NTT DATA Corporation
April 26, 2016
NTT DATA Corporation
@ OpenStack Summit 2016 Austin
How to integrate OpenStack Swift to your "legacy" system
Disclaimer
• All product names, service names, software names, and other marks are trademarks or registered trademarks of their respective companies.
• This presentation is intended to share the knowledge gained from our system integration projects using Swift.
• The presenters and NTT DATA Corporation provide this information on an as-is basis and accept no responsibility for any results arising from the use of the information in this presentation.
About us
• Who are we?
– Takashi Kajinami, Masaaki Nakagawa, Masahiro Ikeda
– We are all platform engineers
• OSS professional sector in NTT DATA
• Our main themes
– Cloud platform using OpenStack
– Private cloud with OpenStack
– Cloud storage with Swift or Sheepdog
[Figure labels: Cloud Technologies, Data Processing Technologies]
Agenda
• What is Swift, and why do we use it?
• What is the problem?
• How do we solve the problem?
• Case 1: Use Swift as backup storage
• Case 2: Provide a conventional file server on Swift
• Conclusion
[Photo: https://www.flickr.com/photos/tristanf/7357655824/]
OpenStack Swift: distributed object storage
• The object storage project within OpenStack
• Distributed object storage with features similar to Amazon S3
• REST API (GET/PUT/…) over HTTP
[Figure: OpenStack software overview, highlighting ① Block Storage (Cinder) and ② Object Storage (Swift). The image is quoted from http://www.openstack.org/software/]
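To make the REST interface concrete, here is a minimal sketch of how a client might build GET and PUT requests for a single Swift object. It only constructs the requests and does not send them. The endpoint, account, container, object name, and token are hypothetical values chosen for illustration; the URL layout (`/v1/<account>/<container>/<object>`) and the `X-Auth-Token` header follow the Swift object storage API.

```python
from urllib.parse import quote
from urllib.request import Request


def build_swift_request(method, endpoint, account, container, obj, token, data=None):
    """Build (but do not send) an HTTP request for one Swift object."""
    # Object URLs follow the pattern <endpoint>/v1/<account>/<container>/<object>.
    prefix = "/".join(quote(part, safe="") for part in (account, container))
    # Slashes inside the object name are kept; they only look like directories.
    url = f"{endpoint}/v1/{prefix}/{quote(obj)}"
    # The token issued by the auth service is passed in the X-Auth-Token header.
    return Request(url, data=data, method=method, headers={"X-Auth-Token": token})


# Hypothetical endpoint, account, container, and token.
ENDPOINT = "http://swift.example.com:8080"
put_req = build_swift_request(
    "PUT", ENDPOINT, "AUTH_demo", "backups", "vm/image-001.img",
    "hypothetical-token", data=b"payload",
)
get_req = build_swift_request(
    "GET", ENDPOINT, "AUTH_demo", "backups", "vm/image-001.img",
    "hypothetical-token",
)
print(put_req.get_method(), put_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

Passing either request to `urllib.request.urlopen` would perform the actual upload or download against a real cluster.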
Three key features of Swift
1. Durability
2. Scalability
3. Openness
Durability
• Protects data against failures at various levels: datacenter, rack, node, and disk
Scalability
• Adapts flexibly to the growth of data
[Figure: capacity growth over time; axis labels 1y, 2y, 5y; values 10TB, 100TB, 150TB?, 3PB?]
Openness
• Free from dependence on specific hardware models and hardware maintenance periods
[Figure: hardware from vendors A, B, and C]
Three key features of Swift
• Durability: protects data against various failures (datacenter, software, hardware, disk)
• Scalability: adapts flexibly to the growth of data
• Openness: no vendor lock-in
Swift is great!
Swift is the best solution for building a durable, scalable, and open storage platform.
However, in many projects we ran into difficulty when trying to integrate Swift into an existing system.
What happened to us
One day, a salesperson came to me with a question.
Sales: Hey, our customer is planning to replace their storage and is looking for a way to protect their data from disasters. Do you have any good solutions?
Engineer: OK. Swift is the best solution for that! It has an excellent "global cluster" feature and provides geographically replicated storage!
Sales: Great! How can we use that storage? Does it provide NFS or iSCSI?
Engineer: No. Swift provides a REST API over HTTP.
Sales: Oh... They want to replace only their storage and keep their existing applications, which require a conventional file system interface.
Engineer: OK. Give me some time. I'll find a solution for that.
What is the issue?
Swift is a very good storage solution for achieving:
• disaster recovery (geographic replication)
• massive scalability (more than a petabyte)
• low operational cost (self-healing after disk failures)
• and so on
However, many customers do not change their applications at the same time as their storage, and in many cases these applications do not support a REST API.
How to solve the problem?
Overview
We would like to introduce two use cases in this presentation:
• Case 1: Using Swift as the backup storage of a legacy backup server
• Case 2: Using Swift as the storage for a file server
Case 1: Using Swift as the backup storage of a legacy backup server
○ Overview
• Upgrading the backup system of a public agency
• The system backs up data of various sizes, e.g. user data, user virtual machine images, and VM host images
• The support contract for the system's backup storage is about to expire
• The goal of the project is to replace the backup storage device
○ Requirements
• The backed-up data ranges from hundreds of terabytes to 1 PB
• Data is backed up from multiple datacenters
• These datacenters are located far from each other
• Backup data must be stored redundantly
• Backup storage must be placed in multiple datacenters
○ Notes
• The backup system uses proprietary backup software
• The customer strongly requires that the backup software stay in use; we must not change anything except the storage
Points for design
• The backed-up data ranges from hundreds of terabytes to 1 PB
→ Scalable
• Data comes from multiple datacenters that are located far from each other
→ Multi-region
• Backup data must be stored redundantly, and backup storage must be placed in multiple datacenters
→ Global cluster
Swift is a good fit for this customer.
System design derived from these points
[Figure: a system design that satisfies the points above. Site A (Region A) and Site B (Region B) each contain a user system, a backup server, a Swift proxy, and storage nodes. Each proxy uses read/write affinity, and data is replicated between the two regions.]
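The read affinity shown in the design can be illustrated with a small sketch. In Swift, the proxy server can be given a `read_affinity` setting such as `r1=100, r2=200`, where a lower number means a more preferred region. The code below is a simplified illustration of that idea, not Swift's actual implementation; the node list and IP addresses are hypothetical.

```python
def parse_read_affinity(spec):
    """Parse a string such as 'r1=100, r2=200' into {region_label: priority}."""
    priorities = {}
    for item in spec.split(","):
        region, _, priority = item.strip().partition("=")
        priorities[region] = int(priority)
    return priorities


def sort_nodes_by_affinity(nodes, spec):
    """Order storage nodes so that preferred regions are tried first."""
    priorities = parse_read_affinity(spec)
    # Lower number = higher preference. Regions not listed sort last.
    # sorted() is stable, so nodes with equal priority keep their input order.
    return sorted(
        nodes,
        key=lambda node: priorities.get(f"r{node['region']}", float("inf")),
    )


# Hypothetical nodes: a proxy in Region 1 (Site A) prefers local replicas.
nodes = [
    {"ip": "10.2.0.1", "region": 2},
    {"ip": "10.1.0.1", "region": 1},
    {"ip": "10.1.0.2", "region": 1},
]
ordered = sort_nodes_by_affinity(nodes, "r1=100, r2=200")
print([node["ip"] for node in ordered])
```

With this ordering, the backup server at each site reads from replicas in its own region first and falls back to the remote region only when needed.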
That design was difficult to realize
Because the backup software has low compatibility with Swift, we had several issues to clear before we could use Swift.
[Figure: user data, VMs, and virtualization hosts are backed up by backup servers to block storage and tape devices. The Swift API is not supported by the backup software, and a server-specific virtual tape device on block storage is required. The storage behind the backup servers is the part we want to change to Swift.]
Experimental approach: mounting Swift as a file system
We mounted Swift as a file system, in place of the block storage, by using cloudfuse (※).
※ https://github.com/redbo/cloudfuse
[Figure: backup servers, a cloudfuse mount point (e.g. /mnt/swift), tape devices, and the Swift proxy and storage nodes behind cloudfuse]
Issues of mounting Swift for a legacy backup system
There are two issues that must be solved for the backup process to proceed.
Issue 1: Initializing the virtual tape device fails
• Reason: the tape device is renamed during initialization, and cloudfuse does not support the rename operation
• Resolution plan:
– Improve cloudfuse to support rename
– Select another component similar to cloudfuse
– Workaround: create the device at another location
Issue 2: Swift does not support the append operation
• Reason: the backup server appends backup data to the virtual tape device
• Resolution plan: use the DLO plugin
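The append issue can be sketched as follows, assuming that "DLO" refers to Swift's Dynamic Large Object feature. With a DLO, data is stored as separate segment objects under a common name prefix, and a manifest object (marked with the `X-Object-Manifest: <container>/<prefix>` header) returns the segments concatenated in name order when it is read. Appending then becomes "upload one more segment". The classes below are a hypothetical in-memory stand-in for a container, not the plugin used in the project.

```python
class FakeSwiftContainer:
    """In-memory stand-in for one Swift container (object name -> bytes)."""

    def __init__(self):
        self.objects = {}

    def put(self, name, data):
        self.objects[name] = data

    def list(self, prefix):
        # Swift lists object names in sorted order; mimic that here.
        return sorted(name for name in self.objects if name.startswith(prefix))


class AppendOnlyFile:
    """Emulates append by writing each chunk as a new DLO-style segment."""

    def __init__(self, container, name):
        self.container = container
        self.prefix = f"{name}/segments/"
        self.next_index = 0

    def append(self, data):
        # Zero-padded names keep lexicographic order equal to write order.
        self.container.put(f"{self.prefix}{self.next_index:08d}", data)
        self.next_index += 1

    def read(self):
        # A GET on a real DLO manifest returns the segments concatenated
        # in name order; this loop reproduces that behaviour locally.
        names = self.container.list(self.prefix)
        return b"".join(self.container.objects[name] for name in names)


tape = AppendOnlyFile(FakeSwiftContainer(), "tape0")
tape.append(b"first backup;")
tape.append(b"second backup;")
print(tape.read())
```

The existing segments are never rewritten, which is why this pattern suits a backup server that keeps appending to a virtual tape device.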
Summary of Case 1
【Requirement】
• We want to use Swift as the backend storage of a legacy backup system.
【Issues】
• The backup software does not support the Swift API
• The backup software requires the virtual tape device to be located on a block device
【Attempt】
• Mounting Swift as a file system by using cloudfuse
【Result】
We succeeded in running backups by using:
• A workaround to avoid the failure when initializing the virtual tape device
• The DLO plugin to support the append operation
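For context on the rename issue in Case 1: an object store such as Swift has no rename operation, so a gateway that wants to offer one typically has to copy the object to the new name and then delete the old one. The sketch below shows that idea on a plain dictionary standing in for a container. It is an illustration of the general technique under that assumption, not what cloudfuse or the project's workaround actually does.

```python
def rename_object(store, old_name, new_name):
    """Emulate rename on a flat object store as copy-then-delete."""
    if old_name not in store:
        raise KeyError(old_name)
    if new_name == old_name:
        return
    # Step 1: copy. On a real cluster this would be a server-side copy,
    # so the cost grows with the size of the object.
    store[new_name] = store[old_name]
    # Step 2: delete the original. The two steps are not atomic, so a
    # failure in between leaves both names present.
    del store[old_name]


# Hypothetical object names for a virtual tape device.
container = {"vtape/tmp-0001": b"tape header"}
rename_object(container, "vtape/tmp-0001", "vtape/tape-0001")
print(sorted(container))
```

Because the copy cost depends on object size and the operation is not atomic, creating the device at a location where rename works, as in the workaround above, avoids the problem entirely.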
Case 2: Using Swift as the storage for a file server
Overview
• Renew a file server that is accessed by users in geographically separated areas
Requirements
1. Users store data in local storage without any overhead for replication
2. All users share the same files
3. File-based protocols are supported for writing files of various sizes
Req. 1: Users store data in local storage
• To keep access latency low
• Because users are distributed, the storage also needs to be located in geographically separated areas
Req. 2: All users share the same files
• Bundle the local storage systems into one virtual storage so that the same files are shared
[Figure: local storage systems bundled into one virtual storage]
Req. 2: All users share the same files
• After a file is written to one storage, it is replicated to all other storages
[Figure: ① write, ② replication, ③ read; a file written to one storage is replicated to the others and can be read there]
Req. 3: Support file-based protocols for writing files of various sizes
• To reuse the many existing legacy applications
• To avoid the high cost and risk of developing new applications
• To serve as a file server, low-latency access to small files is required
[Figure: local storage accessed over CIFS and NFS, holding many small files (~MB) and some big files (MB~)]
Options of storage software
Swift has two issues that must be solved to meet the requirements
Issue 1: File system interface
• Users need to access storage through file-based protocols such as CIFS and NFS
• However, Swift does not support any protocol other than its REST API
[Figure: CIFS/NFS access to Swift is not possible (NG: REST API only)]
Issue 2: Small-file optimization
• To serve as a file server, low-latency access to small files is required
• However, Swift is not good at processing many small files, because it is optimized for storing large amounts of data in bigger objects
[Figure: access to many small files (~MB) through the Swift REST API is slow]
Solutions for the two issues
• Issue 1, file system interface → Solution: Cloud Storage Gateway
• Issue 2, small-file optimization → Solutions: data management with a small block size; storage cache
Solution 1: File system interface
• Cloud Storage Gateway (CSG)
– Gateway software that translates seamlessly between cloud storage APIs, such as a REST API, and standard file-based interfaces, such as CIFS and NFS
[Figure: clients use CIFS/NFS to reach the Cloud Storage Gateway, which uses the REST API to reach Swift]
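One piece of the translation a gateway performs is mapping file paths to object names. The sketch below assumes the simplest possible scheme, in which the first path component is treated as the container and the rest as the object name. The function name and the scheme are hypothetical; real gateways, including ones that manage data in small blocks, may store data quite differently.

```python
from pathlib import PurePosixPath


def path_to_object(path):
    """Map '/<share>/<dir>/.../<file>' to a (container, object_name) pair."""
    parts = PurePosixPath(path).parts
    # An absolute POSIX path splits into ('/', share, ...); require at
    # least a share name and one more component below it.
    if len(parts) < 3 or parts[0] != "/":
        raise ValueError(f"expected an absolute path below a share: {path!r}")
    container = parts[1]
    # Object names are flat strings; the slashes only look like directories.
    object_name = "/".join(parts[2:])
    return container, object_name


print(path_to_object("/share/docs/2016/report.txt"))
```

A file read over NFS or CIFS for that path would then become a GET on the resulting container and object name, and a file write would become a PUT.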
Solution 2: Small-file optimization
• Data management with a small block size
– Optimizes the latency of random access to many small files
• Storage cache
– Local storage on the CSG that holds data users have accessed
– If the CSG already has the requested file in its storage cache, it returns the file to the user directly
– Because Swift is not accessed, the latency is very low
[Figure: ① a read request for File A reaches the Cloud Storage Gateway, ② the gateway responds with File A from its cache without accessing Swift]
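The storage cache behaviour described above can be sketched as a read-through cache. The class below keeps recently read files in memory and only calls the backend on a miss. The least-recently-used eviction policy, the capacity, and the dictionary standing in for Swift are assumptions made for illustration; the slides do not say how a particular CSG manages its cache.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Serve repeated reads locally; go to the backend only on a miss."""

    def __init__(self, backend_get, capacity):
        self.backend_get = backend_get   # callable: name -> bytes (e.g. a Swift GET)
        self.capacity = capacity         # maximum number of cached files
        self.entries = OrderedDict()     # oldest entry first
        self.backend_reads = 0

    def read(self, name):
        if name in self.entries:
            # Cache hit: no access to the backend, so latency stays low.
            self.entries.move_to_end(name)
            return self.entries[name]
        # Cache miss: fetch from the backend and remember the result.
        data = self.backend_get(name)
        self.backend_reads += 1
        self.entries[name] = data
        if len(self.entries) > self.capacity:
            # Evict the least recently used file to stay within capacity.
            self.entries.popitem(last=False)
        return data


# A dictionary stands in for Swift in this sketch.
swift = {"File A": b"contents of A", "File B": b"contents of B"}
cache = ReadThroughCache(swift.__getitem__, capacity=1)
cache.read("File A")   # miss: fetched from the backend
cache.read("File A")   # hit: served from the cache
print(cache.backend_reads)
```

How large to make the cache relative to the working set determines how often reads fall through to Swift.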
Do Swift + CSG meet the requirements?
• A CSG solves the two issues, but it is important to confirm whether the CSG supports clustering
[Figure: requirements checklist with items marked "Solved" and one item marked "? (depends on the CSG)"]
We used Fobas CSC
• Fobas CSC (Cloud Storage Cache)
– One of the available CSGs (※1)
– Proprietary software
• Features
1. Data management with a small block size
2. Storage cache
3. Loosely Cluster, which enables sharing the same files across multiple locations
4. ACLs for access control, working with LDAP and Active Directory
5. Support for many protocols: CIFS, NFS, WebDAV, FTP, and iSCSI
6. High security, encrypting data with AES-256
and more …
※1 http://www.fobas.jp/
Loosely Cluster
• Bundles the file systems of multiple Fobas CSC instances into one virtual file system
• Enables sharing the same files across multiple locations at the CSG layer
[Figure: Fobas CSC and Swift are deployed in both Europe and North America. The Fobas CSC instances replicate metadata through Loosely Cluster, and the Swift clusters replicate data through Global Cluster.]
Swift + Fobas CSC can meet every requirement
[Figure: requirements checklist, including the item earlier marked "? (depends on the CSG)"]
Architecture
• Each location has Swift + Fobas CSC
• Because the instances are clustered and replicate data, users can share the same files
[Figure: multiple locations, each with a CSG in front of Swift, sharing the same files]
Performance
• Overview of the performance test
– All data is in the storage cache, which is on SSD
– The benchmark is the fileserver workload in filebench (https://sourceforge.net/projects/filebenchs)
• Performance is good enough for a file server when the protocol is NFS
– Compared with the performance of an ordinary file server
– ※ ◎: better, ○: almost the same, △: worse
Configuration              Average throughput [ops/sec]   Average latency [msec]
Ordinary file server       2000~5000                      5~20
Swift + Fobas CSC (NFS)    ◎                              ○
Swift + Fobas CSC (CIFS)   △                              △
Three limitations
1. The performance over CIFS is not very good
• We think CIFS performance has not been tuned much so far, because CSG developers tend to focus on flexibility, scalability, and availability
• As the CSG market matures, this problem will probably be solved
2. Performance becomes much worse when data is not in the cache
• Latency increases by that of Swift, because the CSG has to fetch the data from Swift
• Size the storage cache carefully: weigh performance against cost efficiency while considering how the data tends to be used
3. Integration has both a good point and a bad point
• The good point is that we gain software-specific features such as Global Cluster
• The bad point is that availability becomes lower, because the number of possible failure points grows with the number of components
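Limitations 2 and 3 can both be expressed as simple arithmetic. The sketch below assumes that components fail independently and are all required (so their availabilities multiply), and that a cache miss costs the cache lookup plus the Swift access. All numbers are hypothetical and are not measurements from the project.

```python
import math


def series_availability(availabilities):
    """Availability of a chain in which every component must be up."""
    # Assuming independent failures, availabilities in series multiply,
    # so each added layer can only lower the total.
    return math.prod(availabilities)


def expected_latency_ms(hit_ratio, cache_ms, swift_ms):
    """Average read latency for a given cache hit ratio."""
    # A hit costs only the cache access; a miss adds the Swift access on top.
    miss_ms = cache_ms + swift_ms
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * miss_ms


# Hypothetical figures: Swift alone versus Swift behind a gateway.
swift_only = series_availability([0.999])
with_gateway = series_availability([0.999, 0.999])
print(f"availability: {swift_only:.6f} -> {with_gateway:.6f}")

# Hypothetical latencies: 10 ms from the cache, 100 ms more from Swift.
for hit_ratio in (0.5, 0.9, 0.99):
    print(hit_ratio, round(expected_latency_ms(hit_ratio, 10, 100), 1))
```

Plugging in a measured hit ratio and measured latencies gives a quick way to compare candidate cache sizes against their cost.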
Swift has great potential to expand its user base
• The CSG market is growing rapidly
– $11M (2010) → $74M (2012) → $860M? (2016) (※2)
– The reasons are:
• Users want to store growing data in inexpensive storage
• Users want to keep using legacy systems
• Demand for the combination of Swift and a CSG will probably increase
– Although proprietary software has many useful features, it is not optimized for Swift
– How about developing a CSG optimized for Swift? (For example, by developing a plugin for NFS-Ganesha or SMB!?)
※2 http://www.technavio.com/report/global-data-center-cloud-storage-gateway-market
Summary of the second use case
• We integrated Swift as the back-end storage of a file server accessed by users in geographically separated areas
• Because Swift does not support file-based protocols and is not good at low-latency processing of small files, we combined Swift with a CSG to meet the requirements. The performance is good enough.
• Because Swift's global cluster feature is superior to the alternatives, Swift and a CSG are a very good combination for building a file server with that feature.
What we learned from real use cases
Cloud gateways sometimes help us
• Case 1: Storage for an existing backup solution
cloudfuse: REST API -> FUSE (Linux file system) -> Backup solution
• Case 2: File sharing service between several geographic locations
FOBAS CSC: REST API -> FOBAS CSC (CIFS/NFS) -> Clients (Windows/Mac)
A cloud gateway is a very good starting point for integrating Swift into your existing system.
However, it often limits the benefits that Swift provides:
• The more layers we add, the more possible failure points appear
• Cloud-native applications get all the benefits of Swift
Future vision
• Define and share successful migration stories
– Starting point: gateway solutions that integrate Swift into existing applications
– Ideal goal: cloud-native solutions
-> In the end, you can bring all the benefits of cloud technologies into your system.
• Improve cloud gateways
– Improve stability and usability
– Scalable and highly available NFS/CIFS storage using nfs-ganesha and CIFS clustering
Why are "multiple endpoints" important?
• If they are not supported:
– Users must access one specific endpoint
– Access latency is very high for many users
[Figure: all users access a single endpoint (NG)]
Why is "asynchronous replication" important?
• If it is not supported (synchronous replication only):
– Users have to wait for replication to finish
– Latency is much higher than with asynchronous replication
[Figure: ① write, ② synchronous replication, ③ ack; the write is slow]
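The difference in write latency can be sketched with a small model. It assumes that a synchronous write is acknowledged only after the slowest remote site has confirmed the replica, with remote sites contacted in parallel, while an asynchronous write is acknowledged right after the local write. The function and the millisecond figures are hypothetical.

```python
def write_latency_ms(local_ms, remote_rtts_ms, synchronous):
    """Latency seen by the user before the write is acknowledged."""
    if synchronous and remote_rtts_ms:
        # The ack waits for the slowest remote site (sites are written in parallel).
        return local_ms + max(remote_rtts_ms)
    # Asynchronous: ack after the local write; replication happens afterwards.
    return local_ms


# Hypothetical round-trip times to two remote sites.
remote_sites = [80, 150]
print("synchronous :", write_latency_ms(5, remote_sites, synchronous=True))
print("asynchronous:", write_latency_ms(5, remote_sites, synchronous=False))
```

The trade-off is that with asynchronous replication, remote sites may briefly serve older data until replication completes.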

More Related Content

What's hot

NMS Projects and POCs completed and ongoing for OSS NAM v 1.5 Linkedin
NMS Projects and POCs completed and ongoing for OSS NAM v 1.5 LinkedinNMS Projects and POCs completed and ongoing for OSS NAM v 1.5 Linkedin
NMS Projects and POCs completed and ongoing for OSS NAM v 1.5 Linkedin
Javier Guillermo, MBA, MSc, PMP
 
IBM System Storage® : la famiglia si allarga…ultimi annunci
IBM System Storage® : la famiglia si allarga…ultimi annunciIBM System Storage® : la famiglia si allarga…ultimi annunci
IBM System Storage® : la famiglia si allarga…ultimi annunci
S.info Srl
 
Presentazione IBM System Storage - Evento Torino 19 novembre 2013
Presentazione IBM System Storage - Evento Torino 19 novembre 2013Presentazione IBM System Storage - Evento Torino 19 novembre 2013
Presentazione IBM System Storage - Evento Torino 19 novembre 2013
PRAGMA PROGETTI
 

What's hot (20)

Enovance nfv solution - Openstack in Action 5, Paris, May 2014
Enovance nfv solution - Openstack in Action 5, Paris, May 2014Enovance nfv solution - Openstack in Action 5, Paris, May 2014
Enovance nfv solution - Openstack in Action 5, Paris, May 2014
 
Security and Virtualization in the Data Center
Security and Virtualization in the Data CenterSecurity and Virtualization in the Data Center
Security and Virtualization in the Data Center
 
Garance 100% dostupnosti dat! Kdo z vás to má?
Garance 100% dostupnosti dat! Kdo z vás to má?Garance 100% dostupnosti dat! Kdo z vás to má?
Garance 100% dostupnosti dat! Kdo z vás to má?
 
Infinidat InfiniGuard
Infinidat InfiniGuardInfinidat InfiniGuard
Infinidat InfiniGuard
 
The Enhanced Cisco Container Platform
The Enhanced Cisco Container PlatformThe Enhanced Cisco Container Platform
The Enhanced Cisco Container Platform
 
eNovance - Seamless build and delivery of OpenStack based
eNovance - Seamless build and delivery of OpenStack basedeNovance - Seamless build and delivery of OpenStack based
eNovance - Seamless build and delivery of OpenStack based
 
TechWiseTV Workshop: Cisco Catalyst 9100 Access Points for Wi-Fi 6
TechWiseTV Workshop: Cisco Catalyst 9100 Access Points for Wi-Fi 6TechWiseTV Workshop: Cisco Catalyst 9100 Access Points for Wi-Fi 6
TechWiseTV Workshop: Cisco Catalyst 9100 Access Points for Wi-Fi 6
 
Apache Pulsar @Splunk
Apache Pulsar @SplunkApache Pulsar @Splunk
Apache Pulsar @Splunk
 
UCS Update: Efficiently Managing your server environment for traditional ente...
UCS Update: Efficiently Managing your server environment for traditional ente...UCS Update: Efficiently Managing your server environment for traditional ente...
UCS Update: Efficiently Managing your server environment for traditional ente...
 
NFV orchestration for cloud and virtual branch services
NFV orchestration for cloud and virtual branch servicesNFV orchestration for cloud and virtual branch services
NFV orchestration for cloud and virtual branch services
 
SDN Scale-out Testing at OpenStack Innovation Center (OSIC)
SDN Scale-out Testing at OpenStack Innovation Center (OSIC)SDN Scale-out Testing at OpenStack Innovation Center (OSIC)
SDN Scale-out Testing at OpenStack Innovation Center (OSIC)
 
04 (IDNOG02) Cloud Infrastructure by Dondy Bappedyanto
04 (IDNOG02) Cloud Infrastructure by Dondy Bappedyanto04 (IDNOG02) Cloud Infrastructure by Dondy Bappedyanto
04 (IDNOG02) Cloud Infrastructure by Dondy Bappedyanto
 
NMS Projects and POCs completed and ongoing for OSS NAM v 1.5 Linkedin
NMS Projects and POCs completed and ongoing for OSS NAM v 1.5 LinkedinNMS Projects and POCs completed and ongoing for OSS NAM v 1.5 Linkedin
NMS Projects and POCs completed and ongoing for OSS NAM v 1.5 Linkedin
 
Cisco Connect Montreal 2017 - Mise à Jour UCS et Hyperflex
Cisco Connect Montreal 2017 - Mise à Jour UCS et HyperflexCisco Connect Montreal 2017 - Mise à Jour UCS et Hyperflex
Cisco Connect Montreal 2017 - Mise à Jour UCS et Hyperflex
 
IBM System Storage® : la famiglia si allarga…ultimi annunci
IBM System Storage® : la famiglia si allarga…ultimi annunciIBM System Storage® : la famiglia si allarga…ultimi annunci
IBM System Storage® : la famiglia si allarga…ultimi annunci
 
IBTA Releases Updated Specification for RoCEv2
IBTA Releases Updated Specification for RoCEv2IBTA Releases Updated Specification for RoCEv2
IBTA Releases Updated Specification for RoCEv2
 
Network Function Virtualization (NFV) using IOS-XR
Network Function Virtualization (NFV) using IOS-XRNetwork Function Virtualization (NFV) using IOS-XR
Network Function Virtualization (NFV) using IOS-XR
 
Expanding your impact with programmability in the data center
Expanding your impact with programmability in the data centerExpanding your impact with programmability in the data center
Expanding your impact with programmability in the data center
 
Presentazione IBM System Storage - Evento Torino 19 novembre 2013
Presentazione IBM System Storage - Evento Torino 19 novembre 2013Presentazione IBM System Storage - Evento Torino 19 novembre 2013
Presentazione IBM System Storage - Evento Torino 19 novembre 2013
 
Starting the DevOps Train
Starting the DevOps TrainStarting the DevOps Train
Starting the DevOps Train
 

Similar to How to integrate OpenStack Swift to your "legacy" system

Similar to How to integrate OpenStack Swift to your "legacy" system (20)

OpenStack Summit Tokyo - Know-how of Challlenging Deploy/Operation NTT DOCOMO...
OpenStack Summit Tokyo - Know-how of Challlenging Deploy/Operation NTT DOCOMO...OpenStack Summit Tokyo - Know-how of Challlenging Deploy/Operation NTT DOCOMO...
OpenStack Summit Tokyo - Know-how of Challlenging Deploy/Operation NTT DOCOMO...
 
Distributed application usecase on docker
Distributed application usecase on dockerDistributed application usecase on docker
Distributed application usecase on docker
 
Data core overview - haluk-final
Data core overview - haluk-finalData core overview - haluk-final
Data core overview - haluk-final
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
 
Using Storlets/Docker For Large Scale Image Processing
Using Storlets/Docker For Large Scale Image ProcessingUsing Storlets/Docker For Large Scale Image Processing
Using Storlets/Docker For Large Scale Image Processing
 
Webinar: Does Object Storage Make Sense for Backups?
Webinar: Does Object Storage Make Sense for Backups?Webinar: Does Object Storage Make Sense for Backups?
Webinar: Does Object Storage Make Sense for Backups?
 
How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...
 
NVMe and Flash – Make Your Storage Great Again!
NVMe and Flash – Make Your Storage Great Again!NVMe and Flash – Make Your Storage Great Again!
NVMe and Flash – Make Your Storage Great Again!
 
Catching the Software Defined Storage Wave
Catching the Software Defined Storage WaveCatching the Software Defined Storage Wave
Catching the Software Defined Storage Wave
 
Webinar: The Four Requirements of a Cloud-Era File System
Webinar: The Four Requirements of a Cloud-Era File SystemWebinar: The Four Requirements of a Cloud-Era File System
Webinar: The Four Requirements of a Cloud-Era File System
 
Storage As A Service (StAAS)
Storage As A Service (StAAS)Storage As A Service (StAAS)
Storage As A Service (StAAS)
 
Webinar: Is Your Storage Ready for Disaster?
Webinar: Is Your Storage Ready for Disaster?Webinar: Is Your Storage Ready for Disaster?
Webinar: Is Your Storage Ready for Disaster?
 
Zero Downtime with Tyrone Unified Storage Solution
Zero Downtime with Tyrone Unified Storage SolutionZero Downtime with Tyrone Unified Storage Solution
Zero Downtime with Tyrone Unified Storage Solution
 
Updates on webSpoon and other innovations from Hitachi R&D
Updates on webSpoon and other innovations from Hitachi R&DUpdates on webSpoon and other innovations from Hitachi R&D
Updates on webSpoon and other innovations from Hitachi R&D
 
Tape and cloud strategies for VM backups
Tape and cloud strategies for VM backupsTape and cloud strategies for VM backups
Tape and cloud strategies for VM backups
 
Building an Apache Hadoop data application
Building an Apache Hadoop data applicationBuilding an Apache Hadoop data application
Building an Apache Hadoop data application
 
Ceph, Open Source, and the Path to Ubiquity in Storage - AACS Meetup 2014
Ceph, Open Source, and the Path to Ubiquity in Storage - AACS Meetup 2014Ceph, Open Source, and the Path to Ubiquity in Storage - AACS Meetup 2014
Ceph, Open Source, and the Path to Ubiquity in Storage - AACS Meetup 2014
 
WekaIO: Making Machine Learning Compute Bound Again
WekaIO: Making Machine Learning Compute Bound AgainWekaIO: Making Machine Learning Compute Bound Again
WekaIO: Making Machine Learning Compute Bound Again
 
Software Defined Storage In Action
Software Defined Storage In ActionSoftware Defined Storage In Action
Software Defined Storage In Action
 
Software-defined Storage in Action
Software-defined Storage in ActionSoftware-defined Storage in Action
Software-defined Storage in Action
 

Recently uploaded

+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 

Recently uploaded (20)

HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 

How to integrate OpenStack Swift to your "legacy" system

  • 1. Copyright © 2016 NTT DATA Corporation April 26, 2016 NTT DATA Corporation @ OpenStack Summit 2016 Austin How to integrate OpenStack Swift to your "legacy" system
  • 2. Disclaimer • Any product name, service name, software name and other marks are trademarks or registered trademarks of their respective companies. • The purpose of this presentation is to share the knowledge gained from our system integration projects using Swift. • The presenter and NTT DATA Corporation provide this information on an as-is basis and take no responsibility for any results you obtain by following the information in this presentation material.
  • 3. About us • Who are we? Takashi Kajinami, Masaaki Nakagawa, Masahiro Ikeda. We are all platform engineers • OSS professional sector in NTT DATA (Cloud Technologies, Data Processing Technologies) • Our main theme: cloud platform using OpenStack; private cloud by OpenStack; cloud storage by Swift or Sheepdog
  • 4. Agenda • What is Swift and why do we use Swift? • What is the problem? • How to solve the problem? • Case 1: Use Swift as a backup storage • Case 2: Provide a conventional file server on Swift • Conclusion
  • 5. (Image: https://www.flickr.com/photos/tristanf/7357655824/)
  • 6. OpenStack Swift ~ Distributed object storage ~ • The storage project within the OpenStack project • Distributed object storage with features similar to Amazon S3 • REST API (GET/PUT/…) over HTTP ① Block Storage (Cinder) ② Object Storage (Swift) The image is quoted from http://www.openstack.org/software/
  • 7. 3 key features of Swift 1. Durability 2. Scalability 3. Openness
  • 8. Durability • Protect data from various failures (Figure: Datacenter, Rack, Node, Disk)
  • 9. Scalability • Flexibly adapt to the growth of data (Figure: capacity over time; axis labels 1y, 2y, 5y and 10TB, 100TB, 150TB?, 3PB?)
  • 10. Openness • Free from limited hardware models and hardware maintenance periods (Figure: vendors A, B, C)
  • 11. 3 key features of Swift • Durability: protect data from various failures (Datacenter, Software, Hardware, Disk) • Scalability: flexibly adapt to the growth of data • Openness: no vendor lock-in
  • 12. Swift is great! Swift is the best solution to realize a durable, scalable and open storage platform. But in many projects, we faced difficulty when trying to integrate Swift into the system.
  • 13. What happens with us. One day, a sales guy comes and asks me. Sales: Hey, our customer is now planning the replacement of their storage, and looking for a way to protect their data from disasters. Do you have any good solutions? Engineer: OK. Swift is the best solution for that! It has an excellent “global cluster” feature, and realizes geographically replicated storage! Sales: Great! How can we use that storage? Does it provide NFS or iSCSI? Engineer: No. Swift provides a REST API over HTTP. Sales: Oh... They want to replace only their storage and keep their existing applications, which require a conventional file system interface. Engineer: OK. Give me some time. I’ll find a solution for that.
  • 14. What is the issue? Swift is a very good storage solution to realize • disaster recovery (geographically replicated) • massive scalability (more than a PB) • low operational cost (self-healing for disk failures) • and so on. However, many customers don’t change their applications at the same time, and in many cases these applications don’t support a REST API.
  • 15. How to solve the problem?
  • 16. Overview We’d like to introduce two use cases in this presentation • Case 1: Using Swift as backup storage of a legacy backup server • Case 2: Using Swift as storage for a file server
  • 17. Case 1: Using Swift as backup storage of a legacy backup server ○ Overview • Upgrading a backup system for a public agency • This system backs up data of various sizes, e.g. user data, user virtual machine images, VM host images • The support contract for this system's backup storage is about to expire • This project’s goal is to replace the backup storage device ○ Requirements • Backed-up data size is from hundreds of TB to 1 PB • Backed-up data comes from multiple DCs • These DCs are located far away from each other • Backup data should be stored redundantly • Backup storage should be placed in multiple datacenters ○ Notes • The backup system uses proprietary backup software • It is strongly required to keep using the backup software. We should not change anything except the storage
  • 18. Points for design • Backed-up data size is from hundreds of TB to 1 PB → Scalable • Data comes from multiple DCs / DCs are located far away from each other → Multi region • Backup data should be stored redundantly / Backup storage should be placed in multiple datacenters → Global cluster. Swift is good for this customer.
  • 19. System design image from points. This figure shows a system design that satisfies the points. (Figure: Site A / Region A and Site B / Region B each contain a user system, a backup server, a proxy and storage nodes; each proxy uses read/write affinity, and data is replicated between the regions.)
  • 20. That design was difficult to realize. Because of the low compatibility of the backup software with Swift, we have some issues to clear before using Swift. (Figure: the backup server backs up user data / VM / virtualization host to a tape device or block storage; the Swift API is not supported; block storage with a server-specific virtual tape device is required; we want to change this part to Swift.)
  • 21. Experimental approach: mount Swift to the file system. Mounted Swift as a FUSE file system by using cloudfuse (※). ※ https://github.com/redbo/cloudfuse (Figure: each backup server keeps its tape devices under a cloudfuse mount point (ex: /mnt/swift), backed by the Swift proxy and storage nodes.)
  • 22. Issues of mounting Swift for the legacy backup system. There are two issues to resolve before the backup process can proceed. Issue 1: Initializing the virtual tape device fails. Reason: • The tape device is renamed during initialization • Cloudfuse does not support the rename operation. Plan: • Improve cloudfuse to support rename • Select another component like cloudfuse • Workaround: create the device at another location. Issue 2: Swift doesn’t support the append operation. Reason: • The backup server appends backup data to the virtual tape device. Plan: • Use the DLO (Dynamic Large Object) plugin
  • 23. Summary of case 1 【Requirement】 • We’d like to use Swift as backend storage of the legacy backup system. 【Issues】 • The backup software doesn’t support the Swift API • The backup software requires its virtual tape device to be located on a block device. 【Attempt】 • Mounting Swift to the file system by using cloudfuse 【Result】 We found one successful path to backup. • Workaround to avoid the failure of initializing the virtual tape device • Using the DLO plugin to support the append operation
  • 24. Case 2: Use Swift as storage for a file server. Overview • Renew a file server that is accessed by users in geographically separated areas. Requirements 1. Users store data in local storage without any overhead for replication 2. All users share the same files 3. Support file-based protocols to write files of various sizes
  • 25. Req1. Users store data in local storage • To keep access latency low • Since users are distributed, storage also needs to be located in geographically separated areas
  • 26. Req2. All users share the same files • Bundle local storages into one virtual storage to share the same files. (Figure: one virtual storage)
  • 27. Req2. All users share the same files • After a file is written to a certain storage, it is replicated to all other storages (Figure: ① write → ② replication → ③ read)
  • 28. Req3. Support file-based protocols to write files of various sizes • To reuse the many existing legacy applications • To avoid the high costs and risks of developing new applications • To use it as a file server, low-latency access to small files is required (Figure: local storage accessed via CIFS/NFS, holding many small files (~MB) and some big files (MB~))
  • 29. Options of storage software
  • 30. Swift has two issues in meeting the requirements
  • 31. Issue1. File system interface • Users need access via file-based protocols such as CIFS/NFS • But Swift doesn’t support any protocol except the REST API (Figure: CIFS/NFS to Swift: NG, only REST API)
  • 32. Issue2. Small file optimization • To use it as a file server, low-latency access to small files is required • But Swift is not good at processing many small files, because Swift is optimized for storing large amounts of data in bigger files (Figure: many small files (~MB) via the REST API: slow)
  • 33. Solutions for the two issues • Issue 1, file system interface → Solution: Cloud Storage Gateway • Issue 2, small file optimization → Solutions: data management with small block size; storage cache
  • 34. Solution1. File system interface • Cloud Storage Gateway: gateway software which seamlessly translates between cloud storage APIs such as a REST API and standard file-based interfaces such as CIFS/NFS (Figure: CIFS/NFS → Cloud Storage Gateway → REST API → Swift)
  • 35. Solution2. Small file optimization • Data management with small block size: optimizes the latency of random access to many small files • Storage cache: the CSG’s local storage, which stores data that users have accessed. If the CSG already has the file a user accesses in its storage cache, it returns the file to the user directly. Since there is no access to Swift, the latency is very low (Figure: ① read request for File A → ② response from the CSG cache; Swift is not accessed)
  • 36. Do Swift + CSG suit the requirements? • Although CSG solves the two issues, it is important to confirm whether the CSG has a clustering feature (this depends on the CSG)
  • 37. We used Fobas CSC • Fobas CSC (Cloud Storage Cache) • One of the CSGs (※1) • Proprietary software • Features: 1. Data management with small block size 2. Storage cache 3. Loosely Cluster, which enables sharing the same files in multiple locations 4. ACL, which enables access control with LDAP and Active Directory 5. Support for many protocols: CIFS, NFS, WebDAV, FTP and iSCSI 6. High security, encrypting data with AES256, and more … ※1 http://www.fobas.jp/
  • 38. Loosely Cluster • Bundles the file systems of Fobas CSC instances into one virtual file system • Enables sharing the same files in multiple locations at the CSG layer (Figure: Fobas CSC instances in Europe and North America replicate metadata via Loosely Cluster; the Swift clusters behind them replicate data via Global Cluster)
  • 39. Swift + Fobas CSC can realize every requirement
  • 40. Architecture • Each location has Swift + Fobas CSC • Since they are clustered and replicate data, users can share the same files.
  • 41. Performance • Overview of the performance test - Data is only on the storage cache, which is SSD - The benchmark is the fileserver workload in filebench (https://sourceforge.net/projects/filebenchs) • The performance is good for a file server when the protocol is NFS - Compared with the performance of an ordinary file server (※ ◎: better, ○: almost the same, △: worse) • Ordinary file server: average throughput 2000~5000 ops/sec, average latency 5~20 msec • Swift + Fobas CSC (NFS): throughput ◎, latency ○ • Swift + Fobas CSC (CIFS): throughput △, latency △
  • 42. Three limitations 1. The performance of the CIFS protocol isn’t so good • I think the performance has not been tuned much so far, since CSG developers tend to focus on flexibility, scalability and availability • As the CSG market matures, this problem should be solved 2. The performance becomes much worse when data is not in the cache • The latency increases by that of Swift, because the CSG gets the data from Swift • Design the size of the storage cache carefully, weighing performance against cost efficiency while considering data usage trends 3. Integration has a good point and a bad point • The good point is that we get each software’s specific features, such as Global Cluster • The bad point is that availability becomes lower, because the number of possible failure points increases as the number of components increases
  • 43. Swift has much potential to expand its user base • The market size of CSG is increasing rapidly: $11M (2010) → $74M (2012) → $860M? (2016) (※2) • The reasons are: want to store increasing data in inexpensive storage; want to continue using legacy systems • The demand for the combination of Swift and CSG will probably increase • Although proprietary software has many useful features, it is not optimized for Swift • How about developing a CSG optimized for Swift? (for example, developing a plugin for NFS-Ganesha or SMB!?) ※2 http://www.technavio.com/report/global-data-center-cloud-storage-gateway-market
  • 44. Summary of second use case • Integrating Swift as back-end storage of a file server accessed by users in geographically separated areas • Because Swift doesn’t support file-based protocols and is not good at low-latency processing of small files, we combined Swift and a CSG to meet the requirements. The performance is good enough. • Since Swift is superior to other software in its global cluster feature, Swift and CSG are a very good combination for realizing a file server with that feature.
  • 45. What we learned from real use cases
  • 46. Cloud gateways sometimes help us • Case 1: Storage for an existing backup solution. cloudfuse: REST API -> FUSE (Linux filesystem) -> Backup solution • Case 2: File sharing service between several geographic locations. FOBAS CSC: REST API -> FOBAS CSC (CIFS/NFS) -> Clients (Windows/Mac). A cloud gateway is a very good starting point for integrating Swift into your existing system. But it often limits the benefits given by Swift • The more layers we add, the more possible failure points appear • Cloud-native applications get all the benefits of Swift
  • 47. Future vision • Define and share some successful migration stories • Starting point: gateway solutions to integrate Swift into the existing applications • Ideal goal: cloud-native solutions -> Finally, you can bring all the benefits of cloud technologies into your system. • Improve cloud gateways • Improve stability and usability • Scalable and highly available NFS/CIFS storage using nfs-ganesha and CIFS clustering
  • 48. Any product name, service name, software name and other marks are trademarks or registered trademarks of their respective companies.
  • 49. Why are “Multi Endpoints” important? • If not supported: • Users must access a specific endpoint • Latency is very high when many people access it (Figure: NG, only one endpoint)
  • 50. Why is “asynchronous replication” important? • If not supported (synchronous replication only): • Users have to wait for the replication • The latency is much higher than with asynchronous replication (Figure: ① write → ② synchronous replication → ③ ack: slow)
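
Slide 22 names DLO as the way around Swift's missing append operation. The following is a minimal sketch of that idea, assuming Swift's Dynamic Large Object convention: a zero-byte manifest object carries an X-Object-Manifest header of the form container/prefix, and a GET on the manifest returns the matching segments concatenated in name order. The account, container and object names and all helper functions are hypothetical; the sketch only builds request descriptions and does not contact a real cluster, and it is not a description of how the presenters or cloudfuse implemented the append.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Description of one HTTP request to a Swift proxy (not sent anywhere)."""
    method: str
    path: str
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def segment_name(obj: str, index: int) -> str:
    # Zero-padded index so that lexicographic order equals append order.
    return f"{obj}/{index:08d}"


def manifest_request(account: str, container: str, obj: str,
                     seg_container: str) -> Request:
    # Zero-byte manifest pointing at every segment under "<obj>/".
    headers = {
        "X-Object-Manifest": f"{seg_container}/{obj}/",
        "Content-Length": "0",
    }
    return Request("PUT", f"/v1/{account}/{container}/{obj}", headers)


def append_request(account: str, seg_container: str, obj: str,
                   index: int, data: bytes) -> Request:
    # "Append" = upload one more segment with the next index.
    path = f"/v1/{account}/{seg_container}/{segment_name(obj, index)}"
    return Request("PUT", path, {"Content-Length": str(len(data))}, data)


def simulate_manifest_get(segments: dict, prefix: str) -> bytes:
    # Stand-in for what a GET on the manifest returns: segments whose
    # names start with the prefix, concatenated in sorted name order.
    return b"".join(segments[name] for name in sorted(segments)
                    if name.startswith(prefix))
```

With this layout the logical object grows without rewriting existing data: each append is an independent PUT of a new segment, and readers keep using the single manifest URL.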

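Slide 35 describes the storage cache: a read that hits the gateway's local cache never reaches Swift, while a miss pays Swift's latency (limitation 2 on slide 42). The following is a minimal read-through sketch of that behaviour. The least-recently-used eviction policy and the byte-size limit are illustrative assumptions, since the slides do not say how Fobas CSC manages its cache; the class name and the `fetch_from_swift` callable are hypothetical.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Serve reads from a local cache; fall back to the backend on a miss."""

    def __init__(self, fetch_from_swift: Callable[[str], bytes],
                 capacity_bytes: int) -> None:
        self._fetch = fetch_from_swift      # hypothetical backend GET
        self._capacity = capacity_bytes
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self._size = 0
        self.hits = 0
        self.misses = 0

    def read(self, name: str) -> bytes:
        if name in self._items:
            # Cache hit: no backend access, mark as most recently used.
            self._items.move_to_end(name)
            self.hits += 1
            return self._items[name]
        # Cache miss: pay the backend latency, then keep a local copy.
        self.misses += 1
        data = self._fetch(name)
        self._store(name, data)
        return data

    def _store(self, name: str, data: bytes) -> None:
        if len(data) > self._capacity:
            return  # larger than the whole cache: serve it uncached
        self._items[name] = data
        self._size += len(data)
        # Evict least recently used entries until we fit again.
        while self._size > self._capacity:
            _, evicted = self._items.popitem(last=False)
            self._size -= len(evicted)
```

The hit and miss counters make the sizing trade-off from slide 42 measurable: replaying a realistic access pattern against different `capacity_bytes` values shows how often reads would fall through to Swift.
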
Editor's Notes

  1. Good afternoon, everyone. Thank you for coming to this session. We are from NTT DATA, a system integrator in Japan. Last autumn, we had the OpenStack Summit in Japan, my home country, and I had a very exciting time there. I’m happy to join the OpenStack Summit again here in Austin. As mentioned in some keynote sessions, like Jonathan’s, OpenStack is now moving to the next stage, supporting the diversity of IT systems running on OpenStack. Today, I’ll share our experience in Swift projects, focusing on the problems we faced. I’ll talk about our real cases and how we tried to integrate Swift into customers’ projects, and I’ll be happy if it helps enhance Swift’s support for diverse IT systems.
  2. Let me start with the disclaimer for this presentation.
  3. Before going into the details, let me briefly introduce ourselves. The three of us are all platform engineers at NTT DATA. I’m Takashi Kajinami. I’ve worked on Swift and OpenStack projects at NTT DATA. … NTT DATA is a system integrator, so we provide IT systems for our customers, and also develop features required by our customers’ use cases. We belong to the OSS professional sector in NTT DATA, working on cloud technologies like OpenStack, Swift, Sheepdog and Docker, and data processing technologies like PostgreSQL, Hadoop, Spark and so on. We are especially responsible for cloud platforms using OpenStack technologies, working to provide private clouds with OpenStack and cloud storage using OpenStack Swift. That is why we’ll talk about Swift.
  4. Here is the agenda for our presentation. First, I will briefly explain what Swift is and why we use Swift for our system integration. Then Masaaki and Masahiro will explain what kinds of problems we faced in our Swift projects and introduce our approach in these projects, with detailed information about two real use cases. Finally, I’ll summarize and share what we learned from these cases.
  5. OK. So, Swift. How many people here know Swift? How many people are using Swift, or have provided a system using Swift?
  6. OpenStack Swift is part of the OpenStack project; it is the storage project. Swift realizes distributed object storage like Amazon S3. Object storage is a new style of storage with a different interface from conventional block storage or filesystems. Swift provides a RESTful API: clients upload data into the storage with a PUT request and download data from the storage with a GET request. This REST API works over HTTP, so Swift is often used as archive storage for web content like photos or videos, or for backup data.
  7. Swift has many good features, but I don’t have much time to explain them today, so I’ll talk about its three key features: durability, scalability and no vendor lock-in.
  8. The first key feature is durability. Swift makes several copies of the data in the storage cluster, for example 3 copies, and distributes the copies over devices, nodes and racks. So even if some parts of the Swift cluster fail, your data is protected and you can continue to access all of it through the remaining copies. In addition, Swift automatically detects disk failures and restores the missing copies on other devices that are working properly. Since the Grizzly/Havana releases, which came out a long time ago, Swift has had “Global Cluster” features, which enable a storage cluster geographically distributed over multiple datacenters, to realize disaster recovery. Disaster recovery is one of the topics of greatest interest, especially to Japanese customers, after our experience of big disasters like earthquakes and tsunamis.
  9. The second feature is scalability. Swift distributes data over multiple devices, and when you add new devices, it rebalances data onto the new nodes. So we can enlarge the capacity and improve the performance of a Swift cluster by adding new Swift nodes. We can extend the storage from a small capacity like 10TB to a huge capacity like dozens of PB. In addition, there are no limitations on the number of devices you can add, so we can extend Swift clusters flexibly. You can add as much capacity as you need, when you need it, and you can adapt your storage to unpredictable market situations cost-effectively.
  10. The last one is “no vendor lock-in.” Swift is open source software written in Python, and you can run it on commodity IA servers and Linux. You don’t need any special devices for Swift, and you can select cost-effective hardware to build huge storage. In addition, you can flexibly mix different types of servers in a single cluster, so you can add the latest hardware to an existing cluster consisting of old servers. You can add the latest servers when you expand and, on the other hand, remove old servers when they break after their maintenance period. So you can keep your Swift cluster for a long time, regardless of the maintenance period of the servers, by replacing old servers with new ones.
  11. So now I’ve explained the three key features of Swift. Swift realizes very durable and scalable storage, and frees us from vendor lock-in.
  12. As I said at the beginning of this session, we work as a system integrator, and unfortunately this mismatch between existing systems and Swift happens very often for us. What can we do about it?
  13. OK, thank you, Takashi. I’m Masaaki Nakagawa, a software platform engineer at NTT DATA. Today, we share two use cases of Swift. I’ll start with case 1, using Swift as backup storage of a legacy backup server.
  14. One day, a customer came to us to talk about a new project. From these requirements, we defined some points for the design.
  15. These are the points. For example, “Backed-up data size is from hundreds of TB to 1 PB” leads to “scalable”. We thought that these points are satisfied by Swift. Next, we discussed a system design with Swift integrated.
  16. This figure shows an overview of a system design that satisfies the design points. For convenience, this figure has two sites. A user system, a backup server, a Swift proxy and Swift storage are deployed at each site. The backup server gets backed-up data from the user system, and PUTs or GETs the backed-up data to/from the proxy deployed at the same site. The Swift proxy has an affinity configuration so that PUT and GET operations go to the Swift storage at the same site. The Swift storage at each site is configured as its own region. We thought that this design was sufficient for the requirements.
  17. But unfortunately, this design was difficult to realize, because of the low compatibility between the backup server and Swift. The backup server doesn’t support the Swift API. And this server requires a server-specific virtual tape device to be set up on block storage. To use Swift as backup storage, we have to mount Swift on the local file system so that it can be used like block storage. As in this case, we often receive RFPs from customers who want to use Swift as backup storage for backup software that is not ready for Swift. The backup storage use case suits Swift, but we have to give up using Swift because of the legacy backup software. To avoid giving up like that, we should prepare some workaround. As a workaround, we experimentally tried to mount Swift as a filesystem by using cloudfuse. I would like to introduce it.
  18. Cloudfuse is OSS which enables Swift to be mounted as a filesystem. If you create/delete/update files, cloudfuse translates these operations into Swift API requests. If you want to know the details of cloudfuse, please refer to this web site. In this case, we set up cloudfuse as in this figure. With this architecture, we succeeded in mounting Swift and making the backup server read the mount point. So we ran the backup process to find the issues with this architecture.
  19. By trying it, we found two issues that block the backup process. The first is that initializing the virtual tape device fails. The virtual tape device initialization process creates a temporary file and renames it. But cloudfuse doesn’t support the rename operation. So to resolve this issue, we need to improve cloudfuse, choose another component similar to cloudfuse, or use a workaround. This time, we chose the workaround: create the virtual tape device at another location and move it. The second is that Swift doesn’t support appending data to an object. During the backup process, the backup server appends backup data to the virtual tape device. But Swift doesn’t support the append operation. To resolve this, we used the DLO plugin, which we can apply to the append operation. By resolving these issues, we found one path to a successful backup. But before using this architecture commercially, we need more detailed analysis.
  20. The second case is to use Swift as storage for a file server. We were asked to renew a file server that is accessed by users in geographically separated areas. There are three requirements. First, users store data in local storage without any replication overhead. Second, all users can share the same files. Third, file-based protocols are supported to write files of various sizes. Let me show the details of the requirements.
  21. The first requirement is that users store data in local storage without any replication overhead, to keep access latency low. Users are located in geographically separated areas and store data in local storage. So the storage needs to be located in geographically separated areas, as in this picture.
  22. But it is required that all users can share the same files. So the local storages need to be bundled into one virtual storage.
  23. If there is one virtual storage, users can share the same files. After a user in the EU writes a file to local storage, it is replicated to all other storages, and a user in North America can read the file.
  24. The third requirement is to support file-based protocols such as CIFS or NFS to write files of various sizes. The customer requested this in order to reuse the many existing legacy applications and to avoid the high costs and risks of developing new applications. To use the storage as the back end of a file server, low-latency access to small files is required.
  25. From these requirements, we compared three storage software products. The first row shows the requirements. The second row shows the features needed to realize the requirements. The third to fifth rows are the software being compared. We chose Ceph, GlusterFS and Swift, which have a global cluster feature. This table shows that Swift fulfills most of the requirements. Additionally, since we have knowledge of Swift, we planned to use Swift.
  26. But Swift has two issues with respect to the requirements.
  27. The first issue is about the file system interface. Although users need access via file-based protocols such as CIFS and NFS, Swift supports only the REST API.
  28. The second issue is about small file optimization. To use it as a file server, low-latency access to small files is required. But Swift is not good at processing many small files, because Swift is optimized for storing large amounts of data in big files.
  29. So, how do we solve the issues? This table shows the issues and solutions. The second row of the table shows our solutions. We solved the first issue by integrating a Cloud Storage Gateway. We solved the second issue by adding two features: data management with small block size, and a storage cache.
  30. The first solution, for the file system interface, is to integrate a Cloud Storage Gateway. This is gateway software which seamlessly translates between cloud storage APIs such as a REST API and standard file-based interfaces such as CIFS and NFS. When users access the Cloud Storage Gateway via CIFS or NFS, the Cloud Storage Gateway transforms the user’s request into a REST API request. After Swift responds to the request, the Cloud Storage Gateway transforms the REST API response into a CIFS or NFS response. Users can access the data in Swift via file-based protocols. This is the solution for the file system interface issue.
  31. The second set of solutions, for small file optimization, is data management with small block size and a storage cache. Most Cloud Storage Gateways have these two features. The first is a feature which manages data with a small block size, so the latency of random access to many small files is optimized. The second is a storage cache, which stores data that users have accessed. If the Cloud Storage Gateway already has the file a user accesses in its storage cache, it returns the file to the user directly. Since there is no access to Swift, the latency is very low.
  32. So, does the combination of Swift and a Cloud Storage Gateway suit the requirements? Although the Cloud Storage Gateway solves the two issues, it is important to confirm whether it has a clustering feature, because this feature depends on the Cloud Storage Gateway. If a Cloud Storage Gateway doesn’t support clustering, it cancels out the benefit of the global cluster, namely that all users share the same files.
  33. We used “Fobas Cloud Storage Cache”, which is one of the Cloud Storage Gateways. It’s proprietary software. It has many features. Of course, it has both data management with small block size and a storage cache. Another notable feature is Loosely Cluster.
  34. Loosely Cluster is the feature that bundles the file systems of “Fobas Cloud Storage Cache” instances into one virtual file system, which enables sharing the same files in multiple locations at the Cloud Storage Gateway layer. The “Fobas Cloud Storage Cache” instances in multiple locations replicate metadata to each other. So a user in North America can read a file which a user in Europe creates.
  35. Since “Fobas Cloud Storage Cache” has a clustering feature, the combination of Swift and “Fobas Cloud Storage Cache” can realize every requirement. So we used it.
  36. This is a simple architecture. Each location has Swift and “Fobas Cloud Storage Cache”. Since they are clustered and replicate data, all users can share the same files.
  37. We did a performance test. The table shows the result compared with the performance of an ordinary file server. I’m sorry that I can’t share specific numbers. The result was that the performance is good enough for a file server when the protocol is NFS. When the protocol is CIFS, the performance is not so good. But we can still use it as a file server where performance is not a high priority.
  38. I want to share three limitations of this solution. The first is that the performance of the CIFS protocol isn’t so good. I think the reason is that the performance has not been tuned much so far, since Cloud Storage Gateway developers tend to focus on flexibility, scalability and availability. As the Cloud Storage Gateway software market matures, this problem should be solved. The second is that the performance becomes much worse when data is not in the cache. Of course, the reason is that the latency increases by that of Swift, because the Cloud Storage Gateway gets the data from Swift. It is important to design the size of the storage cache carefully. The key is to weigh performance against cost efficiency while considering data usage trends. The third is that integration has a good point and a bad point. The good point is that we get each software’s specific features, such as Global Cluster. The bad point is that availability becomes lower, because the number of possible failure points increases as the number of components increases.
  39. Finally, let me suggest an idea. The market size of Cloud Storage Gateways is increasing rapidly. By this year, the market would grow to almost 80 times the level of 6 years ago ($11M in 2010 to an estimated $860M in 2016). So I think the demand for the combination of Swift and a Cloud Storage Gateway is increasing. Although proprietary software has many useful features, it is not optimized for Swift. How about developing a Cloud Storage Gateway optimized for Swift? In my opinion, there is a certain demand.
  40. Summary of the second use case. This case is integrating Swift as back-end storage of a file server accessed by users in geographically separated areas. Because Swift doesn’t support file-based protocols and is not good at low-latency processing of small files, we combined Swift and a Cloud Storage Gateway to meet the requirements. The performance is good enough. Since Swift is superior to other software in its global cluster feature, Swift and a Cloud Storage Gateway are a very good combination for realizing a file server with that feature.