
Unified Lustre on GCP Marketplace

Deploy a Unified Lustre cluster directly from the GCP Marketplace using Google Cloud’s Infrastructure Manager (Terraform). The cluster provides a Lustre 2.17 parallel filesystem alongside NFS, SMB, S3, and SFTP access - all on the same OpenZFS pool, with active-active HA.

Prerequisites:

  • A GCP project with billing enabled
  • Compute Engine API enabled
  • Cloud Storage API enabled
  • IAM permissions: Compute Admin, Storage Admin, Network Admin, Service Account User
  • For Lustre clients: any Linux distribution and Lustre client version supported by Whamcloud for Lustre 2.17 servers (see the Whamcloud Lustre wiki for the supported matrix)
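The two API prerequisites can also be enabled from the command line. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated; `enable_unified_lustre_apis` is a hypothetical helper name and `PROJECT_ID` is a placeholder.

```shell
# Hypothetical helper: enable the APIs listed in the prerequisites.
# compute.googleapis.com is the Compute Engine API and
# storage.googleapis.com is the Cloud Storage API.
enable_unified_lustre_apis() {
  project_id="$1"
  gcloud services enable \
    compute.googleapis.com \
    storage.googleapis.com \
    --project="$project_id"
}

# Usage (replace PROJECT_ID with your own project):
#   enable_unified_lustre_apis PROJECT_ID
```

The IAM roles in the list still have to be granted separately by a project administrator.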

Step 1: Find Unified Lustre in GCP Marketplace

  1. Go to GCP Marketplace
  2. Search for High Performance Lustre on Object Storage with Active-Active HA
  3. Click Deploy

The marketplace wizard presents the following sections:

| Field | Description | Default |
|---|---|---|
| Deployment Name | Unique name for the cluster (used for all resource naming) | |

Note: Unified Lustre deployments are always active-active HA - the MDT-on-primary topology requires it. Other deployment types are not exposed in this listing.

| Field | Description | Default |
|---|---|---|
| Primary Zone | GCE zone for the primary node (hosts MDT plus an OST plus NAS daemons) | |
| Machine Type | VM instance type | c3d-standard-90 |

For high-throughput HPC workloads, use c3d-standard-90 or larger with Tier_1 networking enabled.

| Field | Description | Default |
|---|---|---|
| Boot disk type | Disk type for OS | pd-balanced |
| Boot disk size | Size in GB | 20 |

| Field | Description | Default |
|---|---|---|
| Metadata disk type | SSD type for MDT | pd-ssd |
| Metadata Disk Size (GB) | Size of the MDT disk - holds Lustre metadata (file/dir entries, extended attributes, layout) | 50 |

The MDT lives only on the primary node, on a dedicated SSD persistent disk (pd-ssd by default). Size it at roughly 1 KB per inode: the default 50 GB holds about 50 million files.
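The sizing rule can be checked with a little arithmetic. This is a rough sketch, assuming about 1 KiB of MDT space per inode (the figure quoted here) and treating 1 GB as 1 GiB; both helper names are hypothetical.

```shell
# Rough MDT sizing, assuming ~1 KiB of MDT space per inode.
# Helper names are illustrative, not part of any product CLI.

# Print the approximate number of inodes an MDT of SIZE_GB can hold.
mdt_inode_estimate() {
  size_gb="$1"
  bytes_per_inode="${2:-1024}"
  echo $(( size_gb * 1024 * 1024 * 1024 / bytes_per_inode ))
}

# Print the MDT size in GB (rounded up) needed for FILE_COUNT inodes.
mdt_size_gb_for() {
  file_count="$1"
  bytes_per_inode="${2:-1024}"
  gib=$(( 1024 * 1024 * 1024 ))
  echo $(( (file_count * bytes_per_inode + gib - 1) / gib ))
}

mdt_inode_estimate 50        # default 50 GB disk: prints 52428800
mdt_size_gb_for 200000000    # 200 million files: prints 191
```

Pass a larger second argument (for example 2048) for a more conservative estimate when files carry many extended attributes or wide layouts.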

| Field | Description | Default |
|---|---|---|
| Cloud Storage Size | Logical storage size per OST (e.g., 1T, 500G, 10T) | 1T |

Each storage node hosts one OST plus dataset capacity for NAS shares. Charges are based on actual GCS usage, not the logical size.

| Field | Description | Default |
|---|---|---|
| Allow Web UI Access (Port 2020) | Opens firewall for Web UI | true |
| Source IP ranges for Web UI | Restrict access to specific IPs | 0.0.0.0/0 |

Click Deploy and wait for Infrastructure Manager to provision all resources. This typically takes 3-5 minutes for VM, disk, and bucket provisioning. After provisioning completes, the cluster auto-configures itself in under 2 minutes:

  • MDT created on the primary node’s SSD persistent disk
  • OST0 created on primary node (GCS-backed OpenZFS)
  • OST1 created on secondary node (GCS-backed OpenZFS)
  • HeartBeat active-active HA enabled with floating VIPs
  • Lustre filesystem zettafs started across both nodes
  • NAS server daemons (nfsd, smbd, S3 gateway, sshd) started on both nodes for use when you create shares

The Web UI password is auto-generated. Retrieve it from the deployment outputs or from within the instance:

curl -H "Metadata-Flavor: Google" \
http://metadata.google.internal/computeMetadata/v1/instance/attributes/mayanas-cloud_user_password
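For scripted use, the same metadata query can be wrapped so that a missing attribute produces an error instead of an HTML error page. This is a minimal sketch; `get_instance_attribute` is a hypothetical helper name, and it must run on the instance itself.

```shell
# Hypothetical wrapper around the metadata-server query shown above.
# -f makes curl exit non-zero on HTTP errors (for example a 404 when the
# attribute is not set); -sS hides the progress bar but keeps error text.
get_instance_attribute() {
  attr="$1"
  curl -fsS -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/attributes/${attr}"
}

# Usage:
#   get_instance_attribute mayanas-cloud_user_password
```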

SSH to the primary node:

gcloud compute ssh mayanas@PRIMARY_NODE --zone=ZONE --project=PROJECT_ID

Run mayacli show lustre to confirm the filesystem is healthy:

sudo mayacli show lustre

Expected output:

Name Status MGS NID Size Used Targets
--------- ----------- ---------------------- --------- --------- -------
zettafs HEALTHY 10.100.198.21@tcp 23.3T 12.0M MDT0000,OST0000,OST0001

Note the MGS NID (e.g., 10.100.198.21@tcp) - clients use this address to mount the Lustre filesystem. It is a floating VIP that survives node failure.
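When scripting client setup, the status and MGS NID can be pulled out of that table instead of being copied by hand. The sketch below assumes the column layout shown in the expected output (Name, Status, MGS NID, ...); the helper names are hypothetical and the real output may differ between releases.

```shell
# Hypothetical helpers that read `mayacli show lustre` output on stdin.
# They assume the first three columns are Name, Status and MGS NID.

# Print the Status column for the filesystem named in $1.
lustre_status() {
  awk -v fs="$1" '$1 == fs { print $2 }'
}

# Print the MGS NID column for the filesystem named in $1.
lustre_mgs_nid() {
  awk -v fs="$1" '$1 == fs { print $3 }'
}

# Usage on the primary node:
#   sudo mayacli show lustre | lustre_status zettafs
#   sudo mayacli show lustre | lustre_mgs_nid zettafs
```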

Follow the upstream Whamcloud Lustre client install guide for your distribution. For most distributions, this means installing the kmod-lustre-client and lustre-client packages that match your running kernel version from the Lustre community repository.

Alternatively, on a Linux client with kernel headers and build tools installed, use install-lustre-client.sh from the zettalane-terraform repo. This helper builds the Lustre client kernel modules from source on the client.

# On the client machine (requires kernel headers and build tools)
curl -O https://raw.githubusercontent.com/zettalane-systems/zettalane-terraform/main/install-lustre-client.sh
chmod +x install-lustre-client.sh
sudo ./install-lustre-client.sh

The script handles kernel-version-specific module compilation and persists module load across reboots.

Mount Lustre on the client using the MGS NID from mayacli show lustre:

sudo mkdir -p /mnt/lustre
sudo mount -t lustre 10.100.198.21@tcp:/zettafs /mnt/lustre
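The mount command above does not survive a reboot. One common way to persist a Lustre client mount is an /etc/fstab entry; the sketch below only prints such a line and is not taken from the product documentation. `lustre_fstab_line` is a hypothetical helper name.

```shell
# Hypothetical helper: print an /etc/fstab line for a Lustre client mount.
# The _netdev option defers the mount until networking is available.
lustre_fstab_line() {
  mgs_nid="$1"
  fsname="$2"
  mountpoint="$3"
  printf '%s:/%s %s lustre defaults,_netdev 0 0\n' \
    "$mgs_nid" "$fsname" "$mountpoint"
}

lustre_fstab_line 10.100.198.21@tcp zettafs /mnt/lustre
# prints: 10.100.198.21@tcp:/zettafs /mnt/lustre lustre defaults,_netdev 0 0
```

Review the printed line before appending it to /etc/fstab, for example with `sudo tee -a /etc/fstab`.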

Verify the mount:

df -h /mnt/lustre
lfs df /mnt/lustre

lfs df shows the breakdown across MDT and OSTs.
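A quick scripted check is to count the OST rows in that breakdown; this deployment should show two. The sketch assumes the usual `lfs df` layout, in which each reachable target row ends with a tag such as `[OST:0]`. `count_visible_osts` is a hypothetical helper name.

```shell
# Hypothetical helper: count the OST rows in `lfs df` output read on stdin.
# Assumes reachable OSTs are tagged [OST:<index>] in the last column.
count_visible_osts() {
  awk '/\[OST:[0-9]+\]/ { n++ } END { print n + 0 }'
}

# Usage on a client:
#   lfs df /mnt/lustre | count_visible_osts
```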

Optional: Use NAS Protocols Alongside Lustre


The same cluster also exposes NFS, SMB, S3, and SFTP. Create shares via the Web UI or mayacli, then access from any client:

# NFS: pool name and dataset come from the marketplace deployment.
# Each node has its own pool (e.g., <deployment>-pool-node1 and <deployment>-pool-node2)
# and the default dataset injected by the marketplace package is named data1.
sudo mkdir -p /mnt/nfs /mnt/smb
sudo mount -t nfs <VIP>:/<POOL_NAME>/data1 /mnt/nfs
# SMB (Linux clients)
sudo mount -t cifs //<VIP>/<SHARE_NAME> /mnt/smb -o credentials=/path/to/creds
# SMB (Windows clients): open this UNC path in File Explorer
#   \\<VIP>\<SHARE_NAME>
# S3 (any S3-compatible client)
aws --endpoint-url http://<VIP>:9000 s3 ls s3://<BUCKET_NAME>/
# SFTP
sftp mayanas@<VIP>

NFS, SMB, S3, and SFTP shares live on their own ZFS datasets in the same OpenZFS pool that hosts the Lustre OST. They share the same active-active HA failover via the same floating VIPs. Run mayacli show pool and mayacli show shares on a storage node to list the deployment’s actual pool names, dataset names, and share names.

To run a quick sequential throughput test against the Lustre mount:
# Write 10 GB to Lustre
sudo dd if=/dev/zero of=/mnt/lustre/test bs=1M count=10240 conv=fsync
# Read back
sudo dd if=/mnt/lustre/test of=/dev/null bs=1M
# Cleanup
sudo rm /mnt/lustre/test
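The numbers in the dd test follow from simple arithmetic: 10240 blocks of 1 MiB is 10 GiB, and dividing the amount written by the elapsed time gives the average throughput. The helpers below are a hypothetical sketch of that arithmetic and do not touch the filesystem.

```shell
# Hypothetical helpers for reasoning about the dd test above.

# Number of blocks dd must write to reach SIZE_GIB with a BS_MIB block size.
dd_count_for() {
  size_gib="$1"
  bs_mib="${2:-1}"
  echo $(( size_gib * 1024 / bs_mib ))
}

# Average throughput in MiB/s for TOTAL_MIB moved in ELAPSED_S seconds.
mib_per_sec() {
  total_mib="$1"
  elapsed_s="$2"
  [ "$elapsed_s" -gt 0 ] || elapsed_s=1   # avoid dividing by zero
  echo $(( total_mib / elapsed_s ))
}

dd_count_for 10 1      # prints 10240, the count used in the test
mib_per_sec 10240 20   # prints 512 for a hypothetical 20-second run
```

Note that a read-back issued immediately after the write may be served partly from the client page cache, so the read figure can overstate what the servers deliver.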

The deployment creates the following resources:

  • 2 Compute Engine instances (active-active HA pair)
  • 2 GCS buckets, one per node (used by both the OST and NAS datasets)
  • 1 SSD persistent disk for the MDT (primary node)
  • Service account with Compute / Storage / Network Admin roles
  • Firewall rules (SSH, Lustre 988/tcp, NFS 2049, SMB 445, S3 9000, optional Web UI 2020)
  • VIP alias IP range for HA failover
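To confirm from a client that the firewall rules listed here are in effect, each port can be probed over TCP. This is a minimal sketch, assuming a netcat (`nc`) that supports `-z` and `-w`, and assuming SSH listens on its default port 22; the variable and function names are hypothetical.

```shell
# Ports from the firewall rule list above (SSH assumed on default port 22).
UNIFIED_LUSTRE_PORTS="22 988 2049 445 9000 2020"

# Hypothetical helper: report which ports on HOST accept TCP connections.
# nc -z only tests the connection; -w 3 gives up after three seconds.
check_cluster_ports() {
  host="$1"
  for port in $UNIFIED_LUSTRE_PORTS; do
    if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
      echo "$port open"
    else
      echo "$port closed"
    fi
  done
}

# Usage from a client in the same VPC:
#   check_cluster_ports <VIP>
```

A closed result for 2020 is expected when Web UI access was disabled in the wizard or the client is outside the allowed source range.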

Useful troubleshooting commands:

# Lustre filesystem status
sudo mayacli show lustre
# Per-target status
sudo lctl dl
# Network connectivity from client to MGS
sudo lctl ping <MGS_NID>
# Server logs
sudo tail -f /var/log/mayanas-lustre.log
# Cluster failover status
sudo mayacli show failover