Good afternoon, everyone. Thank you for coming to this session.
We are from NTT DATA, a system integrator in Japan.
Last autumn, we had the OpenStack Summit in Japan, my home country, and I had a very exciting time there.
And I’m happy to join the OpenStack Summit again here in Austin.
As mentioned in some keynote sessions, like Jonathan’s, OpenStack is now moving to the next stage, supporting the diversity of IT systems that run on OpenStack.
Today, I’ll share our experience with Swift projects, focusing on the problems we faced. I’ll talk about our real cases and how we tried to integrate Swift into our customers’ projects, and I’ll be happy if this helps Swift better support the diversity of IT systems.
Let me start with the disclaimer for this presentation.
Before going into the details, let me briefly introduce ourselves.
All three of us are platform engineers at NTT DATA. I’m Takashi Kajinami. I’ve worked on Swift and OpenStack projects at NTT DATA.
…
NTT DATA is a system integrator, so we provide IT systems for our customers, and we also develop features that are required in our customers’ use cases.
We belong to the OSS professional sector in NTT DATA, working on cloud technologies like OpenStack, Swift, Sheepdog, Docker and so on, and data processing technologies like PostgreSQL, Hadoop, Spark and so on.
We are especially responsible for cloud platforms using OpenStack technologies, working to provide private clouds with OpenStack and cloud storage with OpenStack Swift.
That is the reason why we’ll talk about Swift.
Here is the agenda for our presentation.
First, I will briefly explain what Swift is, and the reason why we use Swift for our system integration.
Then, Masaaki and Masahiro will explain what kind of problems we faced in our Swift projects, and introduce our approach in these projects, with some detailed information about two real use cases.
Finally, I’ll summarize and share what we learned from these cases.
OK. So Swift.
How many people here know Swift?
How many people are using Swift, or have provided a system using Swift?
OpenStack Swift is part of the OpenStack project; it is one of its storage projects.
Swift provides distributed object storage like Amazon S3.
Object storage is a new style of storage, with a different interface from conventional block storage or filesystems.
Swift provides a RESTful API: clients upload data into the storage with a PUT request, and download data from the storage with a GET request.
This REST API works over HTTP, so Swift is often used as archive storage for web content like photos or videos, or for backup data.
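To make the PUT and GET idea concrete, here is a minimal sketch in Python. It only builds the two HTTP requests and does not send them. The storage URL, account name, token and object names are hypothetical values chosen for illustration; a real cluster issues its own storage URL and token after authentication.

```python
from urllib.request import Request

# Hypothetical values for illustration only; a real cluster supplies its own
# storage URL and auth token.
STORAGE_URL = "https://swift.example.com/v1/AUTH_demo"
TOKEN = "example-token"


def build_upload(container: str, name: str, data: bytes) -> Request:
    """Build (but do not send) the PUT request that uploads an object."""
    return Request(
        f"{STORAGE_URL}/{container}/{name}",
        data=data,
        method="PUT",
        headers={"X-Auth-Token": TOKEN},
    )


def build_download(container: str, name: str) -> Request:
    """Build (but do not send) the GET request that downloads an object."""
    return Request(
        f"{STORAGE_URL}/{container}/{name}",
        method="GET",
        headers={"X-Auth-Token": TOKEN},
    )


put = build_upload("photos", "cat.jpg", b"...image bytes...")
get = build_download("photos", "cat.jpg")
print(put.get_method(), put.full_url)
print(get.get_method(), get.full_url)
```

Both requests point at the same URL; only the HTTP method differs, which is why any HTTP client can talk to this kind of storage.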
Swift has many good features, but I don’t have much time to explain them today, so I’ll talk about its three key features: durability, scalability and no vendor lock-in.
The first key feature is durability.
Swift makes several copies of data in the storage cluster, for example 3 copies, and distributes the copies over devices, nodes and racks.
So, even if some parts of the Swift cluster fail, your data is protected from the failure and you can continue to access all data using the remaining copies.
In addition, Swift automatically detects disk failures, and heals missing data copies onto other devices that are working properly.
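The idea of spreading copies so that one failure cannot take all of them can be sketched as follows. This is a toy illustration, not Swift's actual ring algorithm: the device and rack names are hypothetical, and the placement uses a simple hash ranking (in the style of rendezvous hashing) with the extra rule of at most one copy per rack.

```python
import hashlib

# Hypothetical cluster: device name -> rack. This is not a real Swift ring.
DEVICES = {
    "d1": "rack1", "d2": "rack1",
    "d3": "rack2", "d4": "rack2",
    "d5": "rack3", "d6": "rack3",
}


def place_replicas(name: str, replicas: int = 3) -> list[str]:
    """Pick `replicas` devices for an object, at most one per rack."""
    # Rank devices by a hash of (object name, device) so that placement is
    # deterministic but differs from object to object.
    ranked = sorted(
        DEVICES,
        key=lambda d: hashlib.md5(f"{name}/{d}".encode()).hexdigest(),
    )
    chosen, used_racks = [], set()
    for dev in ranked:
        if DEVICES[dev] not in used_racks:
            chosen.append(dev)
            used_racks.add(DEVICES[dev])
        if len(chosen) == replicas:
            break
    return chosen


def readable_after_failure(name: str, failed_rack: str) -> bool:
    """True if at least one copy lives outside the failed rack."""
    return any(DEVICES[d] != failed_rack for d in place_replicas(name))


print(place_replicas("photos/cat.jpg"))
print(readable_after_failure("photos/cat.jpg", "rack1"))
```

Because the three copies always land on three different racks in this toy cluster, losing any single rack still leaves two readable copies.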
Since the Grizzly/Havana releases, and I think a very long time has passed since these were released, Swift has had the “Global Cluster” feature, which enables a geographically distributed storage cluster over multiple datacenters to realize disaster recovery.
Disaster recovery is one of the most interesting topics, especially for Japanese customers, after our experience of big disasters like earthquakes, tsunami and so on.
The second feature is scalability.
Swift distributes data over multiple devices, and when you add new devices, it rebalances data to the new devices.
So, we can enlarge the capacity and improve the performance of a Swift cluster by adding new Swift nodes.
We can extend the storage from a small capacity like 10 TB to a huge capacity like dozens of PB.
In addition, there is no limit on the number of devices you can add, so we can extend Swift clusters flexibly.
You can add as much capacity as you need, when you need it, and you can adapt your storage to unpredictable market situations cost-effectively.
The last one is “no vendor lock-in.”
Swift is open source software written in Python, and you can run it on commodity IA servers and Linux.
You don’t need any special devices for Swift, and you can select cost-effective hardware to build huge storage.
In addition, you can flexibly mix different types of servers in a single cluster, so you can add the latest hardware to an existing cluster consisting of old servers.
You can add the latest servers when you extend, and, on the other hand, you can remove old servers when they break after their maintenance period.
So, you can keep your Swift cluster for a long time, regardless of the maintenance period of the servers, by replacing old servers with new ones.
So now, I’ve explained the three key features of Swift.
Swift realizes very durable and scalable storage, and frees us from vendor lock-in.
As I said at the beginning of this session, we work as a system integrator, and unfortunately this mismatch between existing systems and Swift happens very often for us.
What can we do about that?
OK, thank you, Takashi. I’m Masaaki Nakagawa, a software platform engineer at NTT DATA.
Today, we will share two use cases of Swift.
Now, I will start with case 1: using Swift as the backup storage of a legacy backup server.
One day, a customer came to us to talk about a new project.
From these requirements, we defined some points for the design.
These are the points.
“Backed-up data size is from hundreds of TB to 1 PB” leads to scalability.
We thought that Swift satisfied these points.
Next, we discussed the design of the Swift-integrated system.
This figure shows an overview of the system design that satisfies the design points.
For convenience, this figure shows two sites.
A user system, a backup server, a Swift proxy and Swift storage are deployed at each site.
The backup server gets backed-up data from the user system.
The backup server PUTs or GETs backed-up data to or from the proxy deployed at the same site.
The Swift proxy is configured with affinity so that PUT and GET operations go to the Swift storage at the same site.
The Swift storage at each site is configured as its own region.
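The effect of the affinity setting can be sketched as a simple sort. In a real cluster this is done through the proxy server's affinity options (`read_affinity` and `write_affinity` in the proxy configuration), not through code like this. The node names and region numbers below are hypothetical.

```python
# Hypothetical node list: each storage node belongs to a region (one per site).
NODES = [
    {"name": "storage-a1", "region": 1},
    {"name": "storage-b1", "region": 2},
    {"name": "storage-a2", "region": 1},
    {"name": "storage-b2", "region": 2},
]


def sort_by_affinity(nodes, local_region):
    """Order nodes so that those in the proxy's own region are tried first."""
    # sorted() is stable, so nodes within the same region keep their order.
    return sorted(nodes, key=lambda n: 0 if n["region"] == local_region else 1)


site_b_order = [n["name"] for n in sort_by_affinity(NODES, local_region=2)]
print(site_b_order)
```

A proxy at site B would therefore talk to the site B storage nodes first and fall back to site A only when needed.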
We thought that this design was enough for the requirements.
But unfortunately, this design was difficult to realize, because of the low compatibility between the backup server and Swift.
The backup server doesn’t support the Swift API.
And this server requires a server-specific virtual tape device to be set up on block storage.
To use Swift as backup storage, we have to mount Swift on the local filesystem as if it were block storage.
As in this case, we often receive RFPs from customers that ask to use Swift as the backup storage of backup software which is not ready for Swift.
Backup storage is a use case that suits Swift, but we have to give up on using Swift because of the legacy backup software.
To avoid such a painful outcome, we should prepare some workaround.
As a workaround, we experimentally tried to mount Swift on a filesystem using cloudfuse.
I would like to introduce it.
Cloudfuse is OSS which enables Swift to be mounted on a filesystem like block storage.
If you create, delete or update files, cloudfuse translates these operations into Swift API requests.
If you want to know the details of cloudfuse, please refer to this web site.
In this case, we set up cloudfuse as in this figure.
With this architecture, we succeeded in mounting Swift and making the backup server read the mount point.
So we tried running the backup process to find the issues with this architecture.
Through this trial, we found two issues in running the backup process.
The first is that initializing the virtual tape device fails.
The virtual tape device initialization process creates a temporary file and renames it.
But cloudfuse doesn’t support the rename operation.
So to resolve this issue, we need to improve cloudfuse, or choose another component similar to cloudfuse, or apply a workaround.
This time, we chose the workaround: create the virtual tape device at another location and move it.
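For reference, the "improve cloudfuse" option could look roughly like the sketch below: on an object store with no rename, a rename can be emulated as copy-then-delete. This is an assumption for illustration only and not what cloudfuse actually does. The in-memory `ToyObjectStore` class and the file names are hypothetical stand-ins for Swift.

```python
class ToyObjectStore:
    """In-memory stand-in for an object store with no rename operation."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name):
        return self._objects[name]

    def delete(self, name):
        del self._objects[name]

    def names(self):
        return sorted(self._objects)


def emulate_rename(store, old_name, new_name):
    """Emulate rename as copy-then-delete; note that this is not atomic."""
    store.put(new_name, store.get(old_name))  # copy to the new name
    store.delete(old_name)                    # then remove the old name


store = ToyObjectStore()
store.put("tape.tmp", b"header")
emulate_rename(store, "tape.tmp", "tape0001")
print(store.names())
```

The trade-off is that the two steps are separate requests, so a failure between them can leave both names in place, which a filesystem rename would never do.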
The second is that Swift doesn’t support appending data to an object.
During the backup process, the backup server appends backup data to the virtual tape device.
But Swift doesn’t support the append operation.
To resolve this, we used the DLO (Dynamic Large Object) plugin.
We can apply the DLO plugin to the append operation.
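Why DLO helps with appends can be sketched with a toy model. With DLO, a manifest object points at a name prefix, and reading the manifest returns all segments under that prefix concatenated in name order, so "appending" becomes uploading one more segment. The in-memory class and the segment naming scheme below are hypothetical; in real Swift the manifest is an object carrying an `X-Object-Manifest` header.

```python
class ToyDLOStore:
    """In-memory stand-in showing how a DLO-style manifest is read."""

    def __init__(self):
        self._segments = {}   # segment name -> bytes
        self._manifests = {}  # manifest name -> segment name prefix

    def put_segment(self, name, data):
        self._segments[name] = data

    def put_manifest(self, name, prefix):
        self._manifests[name] = prefix

    def get(self, name):
        """Return all segments under the manifest prefix, in name order."""
        prefix = self._manifests[name]
        matching = sorted(s for s in self._segments if s.startswith(prefix))
        return b"".join(self._segments[s] for s in matching)


def append(store, prefix, index, data):
    """'Append' by uploading one more segment; existing data is untouched."""
    # A zero-padded index keeps name order equal to append order.
    store.put_segment(f"{prefix}{index:08d}", data)


store = ToyDLOStore()
store.put_manifest("tape0001", "tape0001/")
append(store, "tape0001/", 0, b"first backup;")
append(store, "tape0001/", 1, b"second backup;")
print(store.get("tape0001"))
```

Each append is an independent upload, so no existing object has to be rewritten.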
By resolving these issues, we got one working path to backup.
But to use this architecture commercially, we need more detailed analysis.
The second case is using Swift as storage for a file server.
We were asked to renew a file server which is accessed by users in geographically separated areas.
There are three requirements.
First, users store data in local storage without any overhead from replication.
Second, all users can share the same files.
Third, file-based protocols are supported to write files of various sizes.
Let me show the details of the requirements.
The first requirement is that users store data in local storage without any replication overhead, to keep access latency low.
Users are located in geographically separated areas, and they store data in local storage.
So, storage needs to be located in geographically separated areas, as in this picture.
But it is required that all users can share the same files. So, the local storage needs to be bundled into one virtual storage.
If there is one virtual storage, users can share the same files.
After a user in the EU writes a file to local storage, it is replicated to all the other storage, and a user in North America can read the file.
The third requirement is to support file-based protocols such as CIFS or NFS to write files of various sizes. The customer requested this in order to reuse the many existing legacy applications and to avoid the high costs and risks of developing new applications. To use the storage as the back end of a file server, low-latency access to small files is required.
From these requirements, we compared three storage software products. The first row shows the requirements. The second row shows the features needed to realize the requirements. The third to fifth rows are the names of the software being compared. We chose Ceph, GlusterFS and Swift, which have a global cluster feature. This table shows that Swift fulfills most of the requirements. Additionally, since we have knowledge of Swift, we planned to use Swift.
But Swift has two issues with respect to the requirements.
The first issue is about the file system interface. Although users need to access data through file-based protocols such as CIFS and NFS, Swift supports only the REST API.
The second issue is about small file optimization. To be used as a file server, low-latency access to small files is required. But Swift is not good at processing many small files, because Swift is optimized for big files, to store a lot of data.
So, how do we solve the issues? This table shows the issues and solutions. The second row of the table shows our solutions.
We solved the first issue by integrating a Cloud Storage Gateway.
We solved the second issue by adding two features: data management with a small block size, and a storage cache.
The first solution, for the file system interface, is to integrate a Cloud Storage Gateway. This is gateway software which translates seamlessly between cloud storage APIs such as the REST API and standard file-based interfaces such as CIFS and NFS.
When users access the Cloud Storage Gateway by CIFS or NFS, the “Cloud Storage Gateway” transforms the user’s request into a REST API request. After Swift responds to the request, the “Cloud Storage Gateway” transforms the REST API response into a CIFS or NFS response.
Users can access the data in Swift through file-based protocols.
This is the solution for the file system interface issue.
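The translation step can be pictured with a very small sketch. The mapping below is a simplifying assumption for illustration: it treats each file share as one container and each file as one object, whereas a real gateway product may split files into blocks and keep its own metadata. The share and path names are hypothetical.

```python
# Hypothetical mapping from file operations to REST methods.
FILE_OP_TO_HTTP = {
    "read": "GET",
    "write": "PUT",
    "delete": "DELETE",
}


def translate(op: str, share: str, path: str) -> tuple[str, str]:
    """Translate a file operation into (HTTP method, object path)."""
    # Assumption: each file share maps to one container, and the file path
    # becomes the object name.
    return FILE_OP_TO_HTTP[op], f"/{share}/{path.lstrip('/')}"


print(translate("write", "projects", "/docs/plan.txt"))
print(translate("read", "projects", "/docs/plan.txt"))
```

The user only ever sees the file path; the method and object path on the right-hand side are what would go to the storage API.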
The second solution, for small file optimization, is data management with a small block size and a storage cache. Most “Cloud Storage Gateway” products have these two features. The first is the feature which manages data with a small block size. With it, the latency of accessing many small files randomly is optimized.
The second is a storage cache, which stores the data that users have accessed.
If the “Cloud Storage Gateway” already has the file a user accesses in its storage cache, it returns the file to the user directly. Since there is no access to Swift, the latency is very low.
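A storage cache of this kind can be sketched as a read-through cache. The least-recently-used eviction policy here is an assumption chosen for illustration; the talk does not say which policy a given gateway product uses. The backend dictionary stands in for Swift, and the file names are hypothetical.

```python
from collections import OrderedDict


class ReadThroughCache:
    """LRU cache in front of a slower backend (a stand-in for Swift)."""

    def __init__(self, backend_get, capacity):
        self._backend_get = backend_get
        self._capacity = capacity
        self._items = OrderedDict()
        self.backend_reads = 0

    def read(self, name):
        if name in self._items:
            self._items.move_to_end(name)  # mark as most recently used
            return self._items[name]
        data = self._backend_get(name)     # cache miss: go to the backend
        self.backend_reads += 1
        self._items[name] = data
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used
        return data


backend = {"a.txt": b"A", "b.txt": b"B", "c.txt": b"C"}
cache = ReadThroughCache(backend.__getitem__, capacity=2)
for name in ["a.txt", "a.txt", "b.txt", "c.txt", "a.txt"]:
    cache.read(name)
print(cache.backend_reads)
```

The second read of `a.txt` is served from the cache, but once the cache is full the oldest entry is evicted, so the last read of `a.txt` has to go back to the backend.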
So, does the combination of Swift and a “Cloud Storage Gateway” suit the requirements?
Although a “Cloud Storage Gateway” solves the two issues, it is important to confirm whether the “Cloud Storage Gateway” has a clustering feature, because this feature depends on the “Cloud Storage Gateway” product.
If a “Cloud Storage Gateway” doesn’t support the clustering feature, it cancels out the benefit of the global cluster, in which all users share the same files.
We used “Fobas Cloud Storage Cache”, which is one of the Cloud Storage Gateway products. It’s proprietary software.
It has many features. Of course, it has the two features, data management with a small block size and a storage cache. Another notable feature is Loosely Cluster.
Loosely Cluster is the feature that bundles the file systems of “Fobas Cloud Storage Cache” into one virtual file system, which enables sharing the same files across multiple locations at the “Cloud Storage Gateway” layer. The “Fobas Cloud Storage Cache” instances in multiple locations replicate metadata to each other. So, a user in North America can read a file which a user in Europe creates.
Since “Fobas Cloud Storage Cache” has a clustering feature, the combination of Swift and “Fobas Cloud Storage Cache” can realize every requirement. So, we used it.
This is a simple architecture. Each location has Swift and “Fobas Cloud Storage Cache”.
Since they are clustered and replicate data, all users can share the same files.
We did a performance test.
The table shows the results compared with the performance of an ordinary file server. I’m sorry that I can’t share the specific numbers.
We found that the performance is good enough for a file server when the protocol is NFS. When the protocol is CIFS, the performance is not so good. But we can use it as a file server where performance is not a high priority.
I want to share three limitations to keep in mind when leveraging this solution.
The first is that the performance of the CIFS protocol isn’t so good. I think the reason is that the performance has not been tuned much so far, since “Cloud Storage Gateway” developers tend to focus on flexibility, scalability and availability. As the “Cloud Storage Gateway” software market matures, this problem should be solved.
The second is that the performance becomes much worse when data is not in the cache. Of course, the reason is that the latency increases by that of Swift, because the “Cloud Storage Gateway” gets the data from Swift. It is important to design the size of the storage cache carefully. The key is to balance performance against cost efficiency while considering the data usage patterns.
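The trade-off in cache sizing can be put into a rough formula: every read pays the cache latency, and a miss additionally pays the Swift latency. The sketch below uses that simple model. The latency figures are hypothetical round numbers, not measurements from our tests.

```python
def expected_latency_ms(hit_ratio, cache_ms, swift_ms):
    """Average read latency: a miss adds the Swift latency to the cache latency."""
    return cache_ms + (1 - hit_ratio) * swift_ms


# Hypothetical figures for illustration only: 1 ms cache, 50 ms Swift.
for hit_ratio in (0.5, 0.9, 0.99):
    latency = expected_latency_ms(hit_ratio, cache_ms=1, swift_ms=50)
    print(f"hit ratio {hit_ratio:.2f}: about {latency:.1f} ms on average")
```

A larger cache raises the hit ratio but costs more, so the right size depends on how much of the data users actually touch repeatedly.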
The third is that integration has a good point and a bad point. The good point is that we can get software-specific features such as Global Cluster. The bad point is that availability becomes lower, because the possible points of failure increase as the number of components increases.
Finally, let me suggest an idea. The market size of “Cloud Storage Gateway” is increasing rapidly. The market size would increase almost 8000% by this year compared to the level of 6 years ago.
So, I think the demand for the combination of Swift and a “Cloud Storage Gateway” is increasing. Although proprietary software has many useful features, it is not optimized for Swift. How about developing a “Cloud Storage Gateway” optimized for Swift? In my opinion, there is certain demand.
Summary of the second use case.
This case is integrating Swift as the back-end storage of a file server which is accessed by users in geographically separated areas.
Because Swift doesn’t support file-based protocols and is not good at processing small files with low latency, we combined Swift and a “Cloud Storage Gateway” to realize the requirements. The performance is good enough.
Since Swift is superior to other software in its global cluster feature, Swift and a “Cloud Storage Gateway” are a very good combination for realizing a file server which has that feature.