Differentiate between physical and logical cluster components

Cassandra

  • Distributed metadata store, based on heavily modified Apache Cassandra
  • Stores/manages all metadata in ring-like manner
  • Metadata describes where and how data is stored on a filesystem
  • Lets the system know on which node/disk, and in what form, the data resides
  • Runs on every node in the cluster
  • Accessed via Medusa
  • Replication factor (RF) is used to maintain availability/redundancy
  • Upon a write/update, the row is written to a node in the ring and then replicated to n peers (n depends on cluster size)
  • Majority must agree before commit
  • Each node is responsible for a subset of overall platform metadata
  • Eliminates traditional bottlenecks (served from all nodes vs dual controllers)
  • When the cluster scales, new nodes are inserted into the ring between existing nodes to maintain block awareness and reliability
[Figure: Nutanix Cassandra metadata ring (image credit: https://nutanixbible.com)]
  • Peer replication iterates through the nodes in a clockwise manner
  • With block awareness, peers are distributed among blocks to ensure no two peers land in the same block
  • If a block failure occurs, at least two copies of the data remain available
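
To make the clockwise, block-aware peer selection concrete, here is a minimal Python sketch. It is not Nutanix code: the ring layout, node names, and pick_replica_peers function are assumptions used purely to show how copies can be placed clockwise while skipping blocks that already hold one.

    # Hypothetical sketch: clockwise, block-aware replica selection on a metadata ring.
    from collections import namedtuple

    Node = namedtuple("Node", ["name", "block"])

    # A 2-block, 6-node ring, listed in ring (token) order.
    ring = [
        Node("A1", "block-1"), Node("B1", "block-2"),
        Node("A2", "block-1"), Node("B2", "block-2"),
        Node("A3", "block-1"), Node("B3", "block-2"),
    ]

    def pick_replica_peers(ring, primary_index, rf):
        """Walk the ring clockwise from the primary, preferring peers whose
        block does not already hold a copy of the row."""
        replicas = [ring[primary_index]]
        used_blocks = {ring[primary_index].block}
        all_blocks = {n.block for n in ring}
        i = primary_index
        while len(replicas) < rf:
            i = (i + 1) % len(ring)
            candidate = ring[i]
            if candidate in replicas:
                break  # wrapped around the whole ring
            if candidate.block not in used_blocks or used_blocks == all_blocks:
                replicas.append(candidate)
                used_blocks.add(candidate.block)
        return replicas

    # An RF3 metadata row owned by node A1 lands on A1, B1, and A2.
    print([n.name for n in pick_replica_peers(ring, 0, rf=3)])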

Metadata Awareness Conditions

Fault Tolerance Level 1 (FT1):

Data is Replication Factor 2 (RF2) / Metadata is Replication Factor 3 (RF3)

Fault Tolerance Level 2 (FT2):

Data is Replication Factor 3 (RF3) / Metadata is Replication Factor 5 (RF5)
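
The metadata RF values follow from the majority-commit rule above: to keep a majority through N simultaneous failures, the metadata ring needs 2N + 1 copies. A quick worked check in Python (my own arithmetic, not Nutanix code):

    # Majority-quorum arithmetic behind the FT-level / metadata-RF pairing.
    def metadata_rf(ft_level):
        # Tolerating ft_level failures while keeping a majority requires 2*ft + 1 copies.
        return 2 * ft_level + 1

    def quorum(rf):
        # Smallest majority of rf copies.
        return rf // 2 + 1

    for ft, data_rf in [(1, 2), (2, 3)]:
        m_rf = metadata_rf(ft)
        print(f"FT{ft}: data RF{data_rf}, metadata RF{m_rf}, "
              f"quorum of {quorum(m_rf)}/{m_rf} survives {ft} failure(s)")
    # FT1: data RF2, metadata RF3, quorum of 2/3 survives 1 failure(s)
    # FT2: data RF3, metadata RF5, quorum of 3/5 survives 2 failure(s)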

Zookeeper

  • Cluster configuration manager, based on Apache Zookeeper
  • Configuration including hosts/IPs/state/etc
  • 3 Nodes in cluster; 1 leader
  • Accessed via Zeus
  • Distributed in block-aware manner
  • Ensures availability in event of block failure
  • In the event of a Zookeeper outage, the ZK role is transferred to another node in the cluster
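
As a rough illustration of block-aware placement and role transfer, here is a small Python sketch. The node-to-block mapping and both functions are hypothetical; they only show the idea of keeping each Zookeeper role in a different block and moving a role off a failed node.

    # Hypothetical sketch: block-aware placement of the 3 Zookeeper roles,
    # plus a role transfer when one of those nodes fails.
    node_block = {
        "A1": "block-1", "A2": "block-1",
        "B1": "block-2", "B2": "block-2",
        "C1": "block-3", "C2": "block-3",
    }

    def place_zk_roles(count=3):
        """Spread the ZK roles so no two land in the same block when possible."""
        placement, used_blocks = [], set()
        for node, block in node_block.items():
            if len(placement) < count and block not in used_blocks:
                placement.append(node)
                used_blocks.add(block)
        return placement

    def transfer_role(placement, failed_node):
        """Move the ZK role off a failed node, preferring a block with no surviving role."""
        survivors = [n for n in placement if n != failed_node]
        used_blocks = {node_block[n] for n in survivors}
        spares = [n for n in node_block if n not in placement]
        spares.sort(key=lambda n: node_block[n] in used_blocks)  # unused blocks first
        return survivors + spares[:1]

    zk_nodes = place_zk_roles()
    print(zk_nodes)                       # ['A1', 'B1', 'C1'], one role per block
    print(transfer_role(zk_nodes, "B1"))  # ['A1', 'C1', 'B2'], role leaves the failed node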

Stargate

  • Data I/O Manager
  • Responsible for all data management and I/O
  • Main interface to hypervisors (via iSCSI, NFS, SMB)
  • Runs on every node in the cluster
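
Because a Stargate instance runs on every node, a hypervisor normally sends its NFS/iSCSI/SMB traffic to the Stargate on its local CVM and is only redirected to a peer if the local instance is unavailable. A toy routing sketch (node names and health map are made up):

    # Hypothetical sketch: prefer the local Stargate, fail over to a healthy peer.
    stargates = {"node-1": "up", "node-2": "up", "node-3": "down"}

    def route_io(local_node):
        """Serve I/O from the local CVM's Stargate if it is up, otherwise pick a peer."""
        if stargates.get(local_node) == "up":
            return local_node
        for node, state in stargates.items():
            if state == "up":
                return node
        raise RuntimeError("no Stargate available to serve I/O")

    print(route_io("node-1"))  # served locally
    print(route_io("node-3"))  # local Stargate down, redirected to node-1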

Curator

  • MapReduce-based cluster management/cleanup
  • Manages/distributes tasks throughout the cluster
  • Disk balancing/Proactive Scrubbing
  • Runs on every node in the cluster; 1 leader
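
Curator's MapReduce-style scans can be pictured as a map phase that runs on each node over its slice of metadata and a reduce phase on the Curator leader that turns the results into cleanup or balancing tasks. A minimal sketch of that idea, with made-up disk usage numbers:

    # Illustrative map/reduce pass: flag over-full disks so balancing tasks can be issued.
    from statistics import mean

    # Per-node metadata slices: disk -> used capacity in GB (made-up numbers).
    node_slices = [
        {"disk-1": 900, "disk-2": 300},
        {"disk-3": 850, "disk-4": 200},
        {"disk-5": 400},
    ]

    def map_phase(slice_):
        # Each node emits (disk, usage) pairs from its own portion of the metadata.
        return list(slice_.items())

    def reduce_phase(emitted, threshold_pct=20):
        # The leader aggregates usage and flags disks well above the cluster average.
        usage = dict(pair for node in emitted for pair in node)
        avg = mean(usage.values())
        return [disk for disk, used in usage.items() if used > avg * (1 + threshold_pct / 100)]

    overloaded = reduce_phase([map_phase(s) for s in node_slices])
    print("disk-balancing candidates:", overloaded)  # ['disk-1', 'disk-3']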

Prism

  • UI/API
  • Runs on every node in the cluster; 1 leader

Genesis

  • Cluster Component/Service Manager
  • Runs on each node, responsible for service interactions (start/stop/etc)
  • Runs independently of cluster
  • Requires Zookeeper

Chronos

  • Job/task Scheduler
  • Takes jobs/tasks from Curator scans and schedules/throttles them amongst the nodes
  • Runs on every node in the cluster; 1 leader
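
The scheduling/throttling behaviour can be sketched as round-robin task assignment with a per-node concurrency cap. The cap, task names, and schedule function are illustrative assumptions, not Chronos internals.

    # Hypothetical sketch: round-robin task distribution with a per-node throttle.
    from collections import defaultdict
    from itertools import cycle

    def schedule(tasks, nodes, max_per_node=2):
        """Assign tasks to nodes in turn, queueing the overflow for a later pass."""
        assignments, queued = defaultdict(list), []
        ring = cycle(nodes)
        for task in tasks:
            for _ in range(len(nodes)):
                node = next(ring)
                if len(assignments[node]) < max_per_node:
                    assignments[node].append(task)
                    break
            else:
                queued.append(task)  # every node is at its throttle limit
        return dict(assignments), queued

    tasks = [f"compact-extent-{i}" for i in range(7)]
    running, backlog = schedule(tasks, ["node-1", "node-2", "node-3"])
    print(running)  # at most 2 tasks per node
    print(backlog)  # ['compact-extent-6'] waits for a free slot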

Cerebro

  • Replication/DR Manager
  • Schedules snapshots, replication to remote sites, migrations, and failovers
  • Runs on every node in the cluster; 1 leader
  • All nodes participate in replication
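
One way to picture Cerebro is the leader working out which protection domains are due for a snapshot and then fanning the replication work out across all nodes, so every node ships a share of the data. The schedule format, names, and functions below are assumptions for illustration only.

    # Hypothetical sketch: find protection domains due for a snapshot, then
    # split the replication work across all nodes.
    import time

    schedules = {
        "pd-finance": {"interval_s": 3600, "last_snap": 0},
        "pd-web":     {"interval_s": 7200, "last_snap": 0},
    }

    def due_snapshots(now):
        return [pd for pd, s in schedules.items() if now - s["last_snap"] >= s["interval_s"]]

    def fan_out(pd, extents, nodes):
        """Leader splits the work; every node replicates a slice of the extents."""
        return {node: extents[i::len(nodes)] for i, node in enumerate(nodes)}

    now = time.time()
    for pd in due_snapshots(now):
        work = fan_out(pd, [f"{pd}-extent-{i}" for i in range(6)], ["node-1", "node-2", "node-3"])
        schedules[pd]["last_snap"] = now
        print(pd, "->", work)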

Pithos

  • vDisk Configuration Manager
  • vDisk (DSF) configuration data
  • Runs on every node in the cluster
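
A vDisk configuration entry can be pictured as a small record keyed by vDisk ID; the field names below are assumptions for illustration, since the real schema is internal to Pithos.

    # Hypothetical vDisk configuration records (field names assumed, not the real schema).
    vdisk_config = {
        "vdisk-1001": {"name": "vm1-scsi.0", "container": "default-ctr", "size_bytes": 50 * 2**30},
        "vdisk-1002": {"name": "vm2-scsi.0", "container": "default-ctr", "size_bytes": 100 * 2**30},
    }

    def lookup(vdisk_id):
        """Other services consult the vDisk configuration before performing I/O."""
        return vdisk_config[vdisk_id]

    print(lookup("vdisk-1001")["container"])  # 'default-ctr'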

Foundation

  • Bootstrap, imaging, and deployment tool for Nutanix clusters
  • The imaging process installs the desired version of AOS as well as the hypervisor of choice
  • AHV pre-installed by default
  • Foundation must be used to leverage a different hypervisor
  • NOTE: Some OEMs will ship directly from the factory with the desired hypervisor

Disk Partitioning

SSD Devices

  • Nutanix Home (CVM core)
  • Cassandra (metadata storage)
  • OpLog (persistent write buffer)
  • Unified Cache (SSD cache portion)
  • Extent Store (persistent storage)
  • Home is mirrored across the first two SSDs
  • Cassandra is on the first SSD
  • If that SSD fails, the CVM is restarted and Cassandra storage moves to the second SSD
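
The OpLog/Extent Store split can be illustrated with a toy write path: a write is acknowledged once it lands in the OpLog (the persistent write buffer on SSD) and is drained to the Extent Store in the background. Purely illustrative, not the real data path code.

    # Toy write path: acknowledge from the OpLog, drain to the Extent Store later.
    oplog, extent_store = [], {}

    def write(key, data):
        oplog.append((key, data))  # lands in the SSD write buffer
        return "ack"               # acknowledged before reaching the Extent Store

    def drain_oplog():
        while oplog:
            key, data = oplog.pop(0)
            extent_store[key] = data  # coalesced into long-term extent storage

    print(write("vm1-block-42", b"payload"))  # 'ack'
    drain_oplog()
    print(sorted(extent_store))               # ['vm1-block-42']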

HDD Devices

  • Curator Reservation (Curator Storage)
  • Extent Store (Persistent Storage)
