Unified Lustre on GCP Marketplace
Deploy a Unified Lustre cluster directly from the GCP Marketplace using Google Cloud’s Infrastructure Manager (Terraform). The cluster provides a Lustre 2.17 parallel filesystem alongside NFS, SMB, S3, and SFTP access - all on the same OpenZFS pool, with active-active HA.
Prerequisites
- A GCP project with billing enabled
- Compute Engine API enabled
- Cloud Storage API enabled
- IAM permissions: Compute Admin, Storage Admin, Network Admin, Service Account User
- For Lustre clients: any Linux distribution and Lustre client version supported by Whamcloud for Lustre 2.17 servers (see the Whamcloud Lustre wiki for the supported matrix)
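The two API prerequisites can also be enabled from the command line. The sketch below wraps `gcloud services enable` in a small function. The function name and the `PROJECT_ID` placeholder are illustrative, and it assumes `compute.googleapis.com` and `storage.googleapis.com` are the service names behind the Compute Engine and Cloud Storage APIs listed above.

```shell
# Enable the APIs this deployment needs in the given project.
# enable_required_apis is a hypothetical helper name, not part of any tool.
enable_required_apis() {
  project_id="$1"
  gcloud services enable \
    compute.googleapis.com \
    storage.googleapis.com \
    --project="$project_id"
}

# Usage (replace PROJECT_ID with your project):
#   enable_required_apis PROJECT_ID
```

The IAM roles in the list still have to be granted separately by a project administrator.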
Step 1: Find Unified Lustre in GCP Marketplace
- Go to GCP Marketplace
- Search for High Performance Lustre on Object Storage with Active-Active HA
- Click Deploy
Step 2: Configure Deployment
The marketplace wizard presents the following sections:
Deployment Configuration
| Field | Description | Default |
|---|---|---|
| Deployment Name | Unique name for the cluster (used for all resource naming) | — |
Note: Unified Lustre deployments are always active-active HA - the MDT-on-primary topology requires it. Other deployment types are not exposed in this listing.
Compute Configuration
| Field | Description | Default |
|---|---|---|
| Primary Zone | GCE zone for the primary node (hosts MDT plus an OST plus NAS daemons) | — |
| Machine Type | VM instance type | c3d-standard-90 |
For high-throughput HPC workloads, use c3d-standard-90 or larger with Tier_1 networking enabled.
Boot Disk
| Field | Description | Default |
|---|---|---|
| Boot disk type | Disk type for OS | pd-balanced |
| Boot disk size | Size in GB | 20 |
Metadata Disk (MDT)
| Field | Description | Default |
|---|---|---|
| Metadata disk type | SSD type for MDT | pd-ssd |
| Metadata Disk Size (GB) | Size of the MDT disk - holds Lustre metadata (file/dir entries, extended attributes, layout) | 50 |
The MDT lives only on the primary node’s local NVMe SSD. Size it at roughly 1 KB per inode: the 50 GB default suits a few tens of millions of files.
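The 1 KB-per-inode rule of thumb can be turned into a quick estimate. This is a minimal sketch, assuming 1 KB means 1024 bytes and 1 GB means 1024³ bytes; `mdt_size_gb` is a hypothetical helper name, and the result includes no headroom for growth.

```shell
# Estimate the MDT size in GB for an expected number of files,
# using the ~1 KB-per-inode rule of thumb from this guide.
mdt_size_gb() {
  files="$1"
  bytes_per_inode=1024
  gib=$((1024 * 1024 * 1024))
  # Round up to the next whole GB.
  echo $(( (files * bytes_per_inode + gib - 1) / gib ))
}

mdt_size_gb 50000000   # prints 48
```

For 50 million files the estimate is 48 GB, which is in line with the 50 GB default.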
Cloud Storage (OSTs)
| Field | Description | Default |
|---|---|---|
| Cloud Storage Size | Logical storage size per OST (e.g., 1T, 500G, 10T) | 1T |
Each storage node hosts one OST plus dataset capacity for NAS shares. Charges are based on actual GCS usage, not the logical size.
Networking
| Field | Description | Default |
|---|---|---|
| Allow Web UI Access (Port 2020) | Opens firewall for Web UI | true |
| Source IP ranges for Web UI | Restrict access to specific IPs | 0.0.0.0/0 |
Step 3: Deploy
Click Deploy and wait for Infrastructure Manager to provision all resources. This typically takes 3-5 minutes for VM, disk, and bucket provisioning. After provisioning completes, the cluster auto-configures itself in under 2 minutes:
- MDT created on primary node’s local NVMe SSD
- OST0 created on primary node (GCS-backed OpenZFS)
- OST1 created on secondary node (GCS-backed OpenZFS)
- HeartBeat active-active HA enabled with floating VIPs
- Lustre filesystem zettafs started across both nodes
- NAS server daemons (nfsd, smbd, S3 gateway, sshd) started on both nodes for use when you create shares
Post-Deployment
Retrieve Access Credentials
The Web UI password is auto-generated. Retrieve it from the deployment outputs or from within the instance:

    curl -H "Metadata-Flavor: Google" \
      http://metadata.google.internal/computeMetadata/v1/instance/attributes/mayanas-cloud_user_password

Verify Cluster State
SSH to the primary node:

    gcloud compute ssh mayanas@PRIMARY_NODE --zone=ZONE --project=PROJECT_ID

Run mayacli show lustre to confirm the filesystem is healthy:

    sudo mayacli show lustre

Expected output:

    Name      Status      MGS NID                Size      Used      Targets
    --------- ----------- ---------------------- --------- --------- -------
    zettafs   HEALTHY     10.100.198.21@tcp      23.3T     12.0M     MDT0000,OST0000,OST0001

Note the MGS NID (e.g., 10.100.198.21@tcp) - clients use this address to mount the Lustre filesystem. It is a floating VIP that survives node failure.
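When scripting client setup, the status and MGS NID can be pulled out of that table instead of copied by hand. The sketch below is a hypothetical helper (`lustre_state` is not part of mayacli) and assumes the output keeps the whitespace-separated column layout shown in the example.

```shell
# Read `mayacli show lustre` output on stdin and print
# "<status> <mgs_nid>" for the named filesystem.
# Assumes whitespace-separated columns as in the example output.
lustre_state() {
  awk -v fs="$1" '$1 == fs { print $2, $3 }'
}

# Usage on a storage node:
#   sudo mayacli show lustre | lustre_state zettafs
```

With the example output, this prints `HEALTHY 10.100.198.21@tcp`. An unknown filesystem name prints nothing.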
Install the Lustre Client
Follow the upstream Whamcloud Lustre client install guide for your distribution.
For most distributions, install the kmod-lustre-client and lustre-client packages that match your running kernel version from the Lustre community repository.
Alternative: ZettaLane helper script
For Linux systems with kernel headers and build tools, the zettalane-terraform repo includes install-lustre-client.sh - a helper that builds the Lustre client kernel modules from source on the client.

    # On the client machine (requires kernel headers and build tools)
    curl -O https://raw.githubusercontent.com/zettalane-systems/zettalane-terraform/main/install-lustre-client.sh
    chmod +x install-lustre-client.sh
    sudo ./install-lustre-client.sh

The script handles kernel-version-specific module compilation and persists module load across reboots.
Mount the Lustre Filesystem
Mount Lustre on the client using the MGS NID from mayacli show lustre:

    sudo mkdir -p /mnt/lustre
    sudo mount -t lustre 10.100.198.21@tcp:/zettafs /mnt/lustre

Verify the mount:

    df -h /mnt/lustre
    lfs df /mnt/lustre

lfs df shows the breakdown across MDT and OSTs.
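If the mount should come back after a client reboot, one common approach is an /etc/fstab entry. This guide does not prescribe one, so treat the following as a sketch: `lustre_fstab_entry` is a hypothetical helper that only prints a candidate line, and the `defaults,_netdev` options are an assumption based on the usual pattern for network filesystems.

```shell
# Print an /etc/fstab line for a Lustre client mount.
# _netdev defers the mount until networking is up.
lustre_fstab_entry() {
  mgs_nid="$1"
  fsname="$2"
  mountpoint="$3"
  printf '%s:/%s %s lustre defaults,_netdev 0 0\n' "$mgs_nid" "$fsname" "$mountpoint"
}

lustre_fstab_entry 10.100.198.21@tcp zettafs /mnt/lustre
# prints: 10.100.198.21@tcp:/zettafs /mnt/lustre lustre defaults,_netdev 0 0
```

Review the printed line before appending it to /etc/fstab, and substitute the MGS NID reported by your own cluster.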
Optional: Use NAS Protocols Alongside Lustre
The same cluster also exposes NFS, SMB, S3, and SFTP. Create shares via the Web UI or mayacli, then access from any client:

    # NFS: pool name and dataset come from the marketplace deployment.
    # Each node has its own pool (e.g., <deployment>-pool-node1 and <deployment>-pool-node2)
    # and the default dataset injected by the marketplace package is named data1.
    sudo mount -t nfs <VIP>:/<POOL_NAME>/data1 /mnt/nfs

    # SMB (Linux clients)
    sudo mount -t cifs //<VIP>/<SHARE_NAME> /mnt/smb -o credentials=/path/to/creds

    # SMB (Windows clients)
    \\<VIP>\<SHARE_NAME>

    # S3 (any S3-compatible client)
    aws s3 --endpoint-url http://<VIP>:9000 ls s3://<BUCKET_NAME>/

    # SFTP
    sftp mayanas@<VIP>

NFS, SMB, S3, and SFTP shares live on their own ZFS datasets in the same OpenZFS pool that hosts the Lustre OST. They share the same active-active HA failover via the same floating VIPs. Run mayacli show pool and mayacli show shares on a storage node to list the deployment’s actual pool names, dataset names, and share names.
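The SMB mount above points at a credentials file. A minimal sketch for creating one follows, assuming the standard `username=` / `password=` format read by mount.cifs; `write_cifs_credentials` is a hypothetical helper, and the user name is whatever account your SMB share accepts.

```shell
# Create a mount.cifs credentials file readable only by its owner.
# The password is read from stdin so it stays out of the argument list.
write_cifs_credentials() {
  path="$1"
  user="$2"
  IFS= read -r pass
  (
    umask 077
    printf 'username=%s\npassword=%s\n' "$user" "$pass" > "$path"
  )
  # Tighten permissions even if the file already existed.
  chmod 600 "$path"
}

# Usage (type the password, then press Enter):
#   write_cifs_credentials /path/to/creds SMB_USER
```

Reading the password from stdin keeps it out of the process list, though it is still echoed to the terminal when typed interactively.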
Smoke Test
    # Write 10 GB to Lustre
    sudo dd if=/dev/zero of=/mnt/lustre/test bs=1M count=10240 conv=fsync
    # Read back
    sudo dd if=/mnt/lustre/test of=/dev/null bs=1M
    # Cleanup
    sudo rm /mnt/lustre/test

What Gets Deployed
- 2 Compute Engine instances (active-active HA pair)
- 2 GCS buckets, one per node (used by both the OST and NAS datasets)
- 1 SSD persistent disk for the MDT (primary node)
- Service account with Compute / Storage / Network Admin roles
- Firewall rules (SSH, Lustre 988/tcp, NFS 2049, SMB 445, S3 9000, optional Web UI 2020)
- VIP alias IP range for HA failover
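Because charges follow actual GCS usage rather than the logical OST size, it can be useful to check how much the two buckets really hold. The sketch below uses `gcloud storage ls` and `gcloud storage du`. The helper names are hypothetical, and matching buckets by deployment name is an assumption based on the deployment name being used for resource naming.

```shell
# List buckets in the project whose names contain the deployment name.
# Assumes bucket names include the deployment name; adjust the match if not.
cluster_buckets() {
  deployment="$1"
  project_id="$2"
  gcloud storage ls --project="$project_id" | grep -F -- "$deployment"
}

# Print the total size currently stored in one bucket.
bucket_usage() {
  gcloud storage du --summarize --readable-sizes "$1"
}

# Usage:
#   cluster_buckets DEPLOYMENT_NAME PROJECT_ID
#   bucket_usage gs://BUCKET_NAME
```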
Troubleshooting
    # Lustre filesystem status
    sudo mayacli show lustre

    # Per-target status
    sudo lctl dl

    # Network connectivity from client to MGS
    sudo lctl ping <MGS_NID>

    # Server logs
    sudo tail -f /var/log/mayanas-lustre.log

    # Cluster failover status
    sudo mayacli show failover
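If a client cannot mount or reach a share, a basic TCP probe of the service ports from the firewall list can help separate firewall problems from service problems. This is a sketch, not a ZettaLane tool: `check_port` and `report_ports` are hypothetical helpers that rely on bash's /dev/tcp pseudo-device and the coreutils `timeout` command.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
check_port() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Report reachability of the Lustre, NFS, SMB, and S3 ports.
report_ports() {
  host="$1"
  for port in 988 2049 445 9000; do
    if check_port "$host" "$port"; then
      echo "$port open"
    else
      echo "$port unreachable"
    fi
  done
}

# Usage from a client (replace VIP with the cluster's floating address):
#   report_ports VIP
```

An open port 988 only shows that TCP gets through. `lctl ping` remains the check for LNet connectivity to the MGS.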