Agenda
- What's up with Datacenters these days?
- Apache Mesos vs. Apache Hadoop/YARN?
- Why would you want/need both?
- Resource Sharing with Apache Myriad
What's running in your datacenter?
- Tier 1 services
- Tier 2 services
- High Priority Batch
- Best Effort, backfill
Requirements
- Programming models based on resources, not machines
- Custom resource types
- Custom scheduling algorithms: fast vs. careful/slow
- Lightweight executors, fast task launch time
- Multi-tenancy, utilization, strong isolation
- Preemption/oversubscription, fault-tolerance
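
To make "resources, not machines" and "fast vs. careful/slow" concrete, here is a minimal sketch. It is not Mesos API code: `Offer`, `Task`, `first_fit`, and `best_fit` are hypothetical names invented for illustration. Resources are a plain name-to-amount mapping, so a custom type such as `gpus` needs no special handling.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    # Hypothetical model of a resource offer: an agent plus named amounts.
    agent: str
    resources: dict  # e.g. {"cpus": 4, "mem_mb": 8192, "gpus": 1}


@dataclass
class Task:
    name: str
    needs: dict  # same shape as Offer.resources


def fits(task, offer):
    """True if the offer covers every resource the task asks for."""
    return all(offer.resources.get(k, 0) >= v for k, v in task.needs.items())


def first_fit(task, offers):
    """Fast policy: accept the first offer that is big enough."""
    for offer in offers:
        if fits(task, offer):
            return offer
    return None


def best_fit(task, offers):
    """Careful policy: scan every offer and pick the tightest fit.

    The leftover score naively adds amounts of different units; a real
    scheduler would normalize or weight them.
    """
    candidates = [o for o in offers if fits(task, o)]
    if not candidates:
        return None

    def leftover(offer):
        return sum(offer.resources.get(k, 0) - v for k, v in task.needs.items())

    return min(candidates, key=leftover)


offers = [
    Offer("agent-a", {"cpus": 8, "mem_mb": 16384, "gpus": 1}),
    Offer("agent-b", {"cpus": 2, "mem_mb": 4096, "gpus": 1}),
    Offer("agent-c", {"cpus": 4, "mem_mb": 8192}),
]
task = Task("train", {"cpus": 2, "mem_mb": 4096, "gpus": 1})
```

The fast policy returns as soon as something fits and may waste a large offer on a small task; the careful policy costs a full scan but packs more tightly.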
Hadoop and More
- Support Hadoop/BigData ecosystem
- Support arbitrary (legacy) processes/containers
- Connect Big Data to non-Hadoop apps; share data and resources
Mesos from 10,000 feet
- Open source Apache project
- Cluster resource manager
- Scalable to 10,000s of nodes
- Fault-tolerant, no SPOF
- Multi-tenancy, resource isolation
- Improved resource utilization
Mesos is more than Yet Another Resource Negotiator
- Long-running services; real-time jobs
- Native Docker; cgroups for years; isolate CPU/mem/disk/net/other
- Distributed systems SDK; ~200 LOC for a new app
- Core written in C++ for performance, apps in any language
Why two resource managers?
Static Partitioning sucks
- Hadoop teams fine with isolated clusters, but Ops team unhappy; slow to provision
- Resource silos, no elasticity
- Want to run Hadoop on the same infrastructure, without interrupting Tier-1 services
- Want multi-tenancy, resource sharing/isolation
Myriad Overview
- Mesos Framework for Apache YARN
- Mesos manages DC, YARN manages Hadoop
- Coarse and fine grained resource sharing
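
A rough way to picture the two sharing modes, as a sketch only: in the coarse-grained case a NodeManager claims a fixed-size profile from Mesos up front, and in the fine-grained case resources are claimed only for the YARN containers actually running. The profile names and sizes below are illustrative assumptions, not Myriad's shipped defaults, and the function names are hypothetical.

```python
# Illustrative profile sizes (assumed values, not taken from Myriad's config).
PROFILES = {
    "small": {"cpus": 2, "mem_mb": 2048},
    "medium": {"cpus": 4, "mem_mb": 4096},
    "large": {"cpus": 8, "mem_mb": 8192},
}


def coarse_grained_claim(profile, instances):
    """Resources claimed from Mesos up front for fixed-size NodeManagers."""
    size = PROFILES[profile]
    return {
        "cpus": size["cpus"] * instances,
        "mem_mb": size["mem_mb"] * instances,
    }


def fine_grained_claim(running_containers):
    """Resources claimed from Mesos only for containers currently running."""
    return {
        "cpus": sum(c["cpus"] for c in running_containers),
        "mem_mb": sum(c["mem_mb"] for c in running_containers),
    }


# Two medium NodeManagers versus three small containers.
coarse = coarse_grained_claim("medium", 2)
fine = fine_grained_claim([{"cpus": 1, "mem_mb": 1024}] * 3)
```

Coarse-grained sharing is simpler to reason about but holds capacity even when YARN is idle; fine-grained sharing tracks actual demand at the cost of more frequent negotiation.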
Myriad improves Mesos
- Tighter integration with Hadoop frameworks like HBase, Hive, Pig
- Borrow resources from Hadoop when traffic spikes for tier-1 services
- Backfill unused resource capacity with best-effort Hadoop jobs
- No Mesos code changes necessary
Myriad improves Hadoop
- Elastic scaling
- Fault-tolerant: maintain NodeManager (NM) capacity
- Share resources with other workloads, improve resource utilization
- High-SLA Hadoop jobs unaffected
- No YARN/Hadoop code changes
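
The borrow/backfill and elastic-scaling ideas on the last two slides can be sketched as a small sizing rule. This is a toy model under stated assumptions, not Myriad's actual policy: `target_nodemanagers` and `reconcile` are hypothetical helpers, capacity is counted in CPUs only, and `min_nms` stands in for the capacity kept back so high-SLA Hadoop jobs are unaffected.

```python
def target_nodemanagers(total_cpus, tier1_cpus, nm_cpus, min_nms=1):
    """How many NodeManagers the cluster can host right now.

    Whatever tier-1 services do not use is backfilled with NodeManagers;
    when tier-1 demand spikes, the target shrinks and capacity is handed
    back. The floor `min_nms` is an assumed reservation for high-SLA jobs.
    """
    spare = max(total_cpus - tier1_cpus, 0)
    return max(spare // nm_cpus, min_nms)


def reconcile(target, running):
    """Positive result: flex up by that many NMs; negative: flex down."""
    return target - running


# Quiet period: tier-1 uses 20 of 100 CPUs, each NM takes 4 CPUs.
quiet = target_nodemanagers(100, 20, 4)

# Traffic spike: tier-1 now needs 90 CPUs.
spike = target_nodemanagers(100, 90, 4)
```

The same `reconcile` step also covers the fault-tolerance bullet: if a NodeManager is lost, `running` drops below `target` and the difference is relaunched.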
Other Features
- RM discovery using Marathon/Mesos-DNS
- Distribution of Hadoop binaries
- User Interface
- Ability to launch Job History Server (in progress)
- Myriad scheduler HA, task reconciliation (in progress)
- Your favorite feature here!
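
On the RM discovery item: Mesos-DNS publishes A records of the form `task.framework.domain`, so a ResourceManager launched through Marathon can be found by name rather than by a fixed host. A minimal sketch, assuming the task is named `rm` and the default `mesos` domain is in use (both are assumptions here, and `mesos_dns_name` is a hypothetical helper):

```python
def mesos_dns_name(task, framework="marathon", domain="mesos"):
    """Build the Mesos-DNS hostname for a task: task.framework.domain."""
    return ".".join([task, framework, domain])


# Hostname NodeManagers could use to reach the ResourceManager,
# assuming the Marathon app is named "rm".
rm_host = mesos_dns_name("rm")
```

Pointing `yarn.resourcemanager.hostname` at such a name means NodeManagers keep working if the ResourceManager is restarted on a different node.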