Two days ago I was still working at SequenceIQ, as the CTO.
SequenceIQ has been acquired. We started in February and quickly gained traction around June.
We were doing this over and over again. We scripted it, used Ansible, tried everything and all the existing tools.
Architecturally, the most important components:
Under the hood, Docker is built on (see the sketch after this list):
1. cgroup and namespacing capabilities of the Linux kernel
2. Docker image specification - filesystem composed of layers, presented as one cohesive filesystem
(Kernel 3.8+ recommended; works from 2.6.32.)
3. Libcontainer specification - namespacing, filesystem, resources (cgroups)
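A minimal way to see these kernel primitives directly, as a sketch assuming a Linux host with /proc and /sys mounted in the usual places (paths can differ by distribution and cgroup version):

    # Sketch: peek at the kernel primitives Docker builds on (Linux only).
    import os

    # Every process lives in a set of namespaces (pid, net, mnt, uts, ipc, ...).
    for ns in os.listdir("/proc/self/ns"):
        print("namespace:", ns, "->", os.readlink(f"/proc/self/ns/{ns}"))

    # cgroups limit and account resources; this shows which cgroups we belong to.
    with open("/proc/self/cgroup") as f:
        print(f.read())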
Docker simplifies things, but only on one host.
We spin up containers remotely on many hosts. How?
Swarm pulls together many Docker engines and presents them as one virtual Docker Engine.
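A rough sketch of what "one virtual Docker Engine" means in practice: the same Docker Remote API calls are simply pointed at the Swarm manager instead of a single daemon. The host name, port and lack of TLS below are assumptions for illustration:

    # Sketch: talk to a Swarm manager exactly as you would talk to one daemon.
    # "swarm-manager:3376" is an assumed, illustrative endpoint.
    import requests

    SWARM = "http://swarm-manager:3376"

    # /info and /containers/json are standard Docker Remote API endpoints;
    # Swarm answers them for the whole cluster instead of one host.
    info = requests.get(f"{SWARM}/info").json()
    print("engine name:", info.get("Name"), "containers:", info.get("Containers"))

    for c in requests.get(f"{SWARM}/containers/json").json():
        print(c["Id"][:12], c["Image"], c.get("Names"))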
Steps:
We can spin up Docker containers remotely on hosts, taking the following into account (see the sketch after this list):
1. Resource management - the scheduler is aware of the cluster resources (e.g. it can schedule with bin packing, anywhere 1 GB of memory is available) or place containers randomly
2. Constraints using labels (label a node and start the container based on labels)
3. Affinity - containers can be co-scheduled (link, volumes-from, net=container on the same host)
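A hedged sketch of how classic Swarm receives these scheduling hints: constraints and affinities travel as environment variables on container create. The manager address, node label and container name are illustrative assumptions:

    # Sketch: ask classic Swarm to schedule a container (a) only on nodes
    # labelled storage==ssd and (b) next to a container named "frontend".
    import requests

    SWARM = "http://swarm-manager:3376"

    payload = {
        "Image": "redis",
        "Env": [
            "constraint:storage==ssd",       # label-based constraint
            "affinity:container==frontend",  # co-schedule with "frontend"
        ],
    }
    resp = requests.post(f"{SWARM}/containers/create", json=payload)
    container_id = resp.json()["Id"]
    requests.post(f"{SWARM}/containers/{container_id}/start")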
We have a dynamically scaling cluster where nodes are joining and leaving, but also failing.
We register services in Consul, such as the Ambari services.
ZooKeeper, doozerd and etcd are similar to Consul: they require a quorum and offer strong consistency, but they are not datacenter aware.
ZooKeeper: no built-in service discovery, only a primitive K/V store, no DNS, and it does not span datacenters.
ZooKeeper provides ephemeral nodes, but clients still need to keep their connections alive.
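For contrast, a small sketch of the ZooKeeper ephemeral-node pattern using the kazoo client: the node only exists while this client's session (and its keep-alive connection) is up. The connection string and paths are assumptions:

    # Sketch: register a service as a ZooKeeper ephemeral node (kazoo client).
    # The znode disappears when this client's session dies, which is why the
    # client must keep a live, heartbeating connection.
    from kazoo.client import KazooClient

    zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")  # assumed ensemble
    zk.start()  # opens the session and starts the keep-alive heartbeats
    zk.create("/services/ambari/node-1", b"10.0.0.5:8080",
              ephemeral=True, makepath=True)
    # The registration stays visible only while the session is alive; if the
    # session expires, the ephemeral node is removed automatically.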
Agent – a long-running daemon that serves the DNS and HTTP interfaces; it runs on every node.
Client – an agent that forwards all RPCs to a server; it takes part in the LAN gossip.
Server – participates in the Raft quorum, responds to RPCs, and takes part in the WAN gossip.
Datacenter – a low-latency, high-bandwidth private network.
Gossip – TCP and UDP unicast. Broadcast/multicast usually does not work in the cloud.
Strong consistency:
Service catalog stores all the nodes, service instances, health check data, ACLs, and Key/Value information. It is strongly consistent, and replicated using the consensus protocol.
Gossip – eventual consistency; updates to the catalog come through gossip, so the state can lag behind until it is reconciled.
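A minimal sketch of how a node-local Consul agent is used for this, via its standard /v1 HTTP API; the service name, port and key below are assumed examples:

    # Sketch: register a service with the local Consul agent, then read it back
    # from the strongly consistent catalog and store a value in the K/V store.
    # The agent listens on 8500 by default; names and keys are illustrative.
    import requests

    AGENT = "http://127.0.0.1:8500"

    # Register an "ambari-server" service with a simple HTTP health check.
    requests.put(f"{AGENT}/v1/agent/service/register", json={
        "Name": "ambari-server",
        "Port": 8080,
        "Check": {"HTTP": "http://127.0.0.1:8080", "Interval": "10s"},
    })

    # Discover it through the catalog (also resolvable via DNS as
    # ambari-server.service.consul).
    print(requests.get(f"{AGENT}/v1/catalog/service/ambari-server").json())

    # Use the K/V store, e.g. for sharing recipe results between nodes.
    requests.put(f"{AGENT}/v1/kv/cloudbreak/recipes/example", data=b"done")
    print(requests.get(f"{AGENT}/v1/kv/cloudbreak/recipes/example").json())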
Most likely you’ve already seen an Ambari session.
It’s extensible:
Stacks – sets of services, in multiple versions (e.g. HDP 2.1, HDP 2.2, Bigtop)
Services – e.g. HDFS, Kafka, Zeppelin
Views – the ability to add visualization, management and monitoring capabilities for a new “application”
We pre-install the Ambari server and the agents.
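A hedged sketch of the Ambari REST API that this pre-installed setup exposes for automation: a blueprint describes the stack and service layout, and a cluster is created from it. Host, credentials and the heavily truncated blueprint are assumptions:

    # Sketch: register an Ambari blueprint and create a cluster from it.
    # Host, credentials and blueprint content are illustrative only.
    import requests

    AMBARI = "http://ambari-server:8080/api/v1"
    AUTH = ("admin", "admin")                        # assumed default credentials
    HEADERS = {"X-Requested-By": "cloudbreak-demo"}  # Ambari requires this header for writes

    blueprint = {
        "Blueprints": {"blueprint_name": "hdp-small",
                       "stack_name": "HDP", "stack_version": "2.2"},
        "host_groups": [{"name": "master", "cardinality": "1",
                         "components": [{"name": "NAMENODE"},
                                        {"name": "RESOURCEMANAGER"}]}],
    }
    requests.post(f"{AMBARI}/blueprints/hdp-small",
                  json=blueprint, auth=AUTH, headers=HEADERS)

    cluster = {"blueprint": "hdp-small",
               "host_groups": [{"name": "master", "hosts": [{"fqdn": "node-1"}]}]}
    requests.post(f"{AMBARI}/clusters/demo",
                  json=cluster, auth=AUTH, headers=HEADERS)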
Combining all these – welcome Cloudbreak.
A zero-configuration way to provision HDP clusters – anywhere, at the push of a button, from the CLI or the API. One consistent, infrastructure-agnostic API.
Expand on points
No configuration is needed to get a running infrastructure.
Any size - 200 nodes in 8 min.
OAuth2, gateway (Knox will come), TLS
Since it runs on YARN, different services need different instance types: e.g. Spark – high memory; Kafka – high disk throughput, but memory as well to buffer active reads/writes.
Scale based on load
The view from 10,000 meters:
The only thing we need is a Docker daemon. All cloud providers are moving towards Docker.
Kerberos – we take the pain (Dockerized a Kerberos server)
Recipes – built on Consul events; results are read back from the K/V store.
Anybody can push their own plugin: we use plugn – install your plugin and use it from Cloudbreak.
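A sketch of that recipe flow as described above: fire a custom Consul user event across the cluster, then collect the result from the K/V store. The event name, key and node name are illustrative assumptions:

    # Sketch: trigger a recipe via a Consul user event and read back its
    # result from the K/V store. Names and keys are assumed for illustration.
    import base64
    import requests

    AGENT = "http://127.0.0.1:8500"

    # Fire a cluster-wide user event; agents listening for it run the recipe.
    requests.put(f"{AGENT}/v1/event/fire/run-recipe", data=b"install-monitoring")

    # Later, the recipe writes its result under a well-known K/V prefix.
    resp = requests.get(f"{AGENT}/v1/kv/recipes/install-monitoring/node-1")
    if resp.status_code == 200:
        entry = resp.json()[0]
        print(base64.b64decode(entry["Value"]).decode())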
We did different projects and fixed quite a few interesting problems.
Zero config, does not require pre-installation
You can set alarms, and attach SLA scaling policies based on those alarms.
New features in Hadoop 2.6:
Our contributions, plus lots from others: moving applications between queues, admission control – reserving capacity over time.
Most likely Vinod explained all these.
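For reference, a small sketch of the queue-move feature from the CLI side; the application id and queue name are placeholders:

    # Sketch: move a running YARN application to another queue using the
    # standard YARN CLI. Id and queue name are assumed placeholders.
    import subprocess

    app_id = "application_1443990000000_0001"  # assumed application id
    subprocess.run(["yarn", "application", "-movetoqueue", app_id,
                    "-queue", "analytics"], check=True)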