Role of Secondary Storage in CloudStack Fundamentals

CloudStack Secondary Storage serves as the persistent repository for all non-volatile virtual assets within a cloud ecosystem; this includes virtual machine templates, ISO images, and volume snapshots. In the architectural hierarchy, while Primary Storage provides the high-performance, low-latency disk space required for active virtual machine instances, Secondary Storage functions as the regional library that ensures data durability and portability across multiple clusters. The primary engineering problem addressed by this component is the decoupling of virtual machine definitions from specific compute nodes. By utilizing a centralized secondary tier, administrators can achieve high availability and rapid disaster recovery. Without a functional secondary storage layer, the cloud orchestration engine cannot deploy new instances or move data between different storage pools. This infrastructure component relies on the Secondary Storage Virtual Machine (SSVM) to act as a proxy, managing data transfer requests and ensuring that the internal payload of each template is correctly verified and stored.

Technical Specifications

| Requirement | Default Port/Operating Range | Protocol/Standard | Impact Level (1-10) | Recommended Resources |
| :--- | :--- | :--- | :--- | :--- |
| NFS Storage Server | TCP 111, 2049 | NFSv3 / NFSv4 | 10 | 4 vCPU / 8GB RAM / 10Gbps |
| SSVM Appliance | TCP 3922 (SSH) | KVM / Xen / VMware | 9 | 1 vCPU / 2GB RAM |
| Management Link | TCP 8080, 8443 | HTTPS / REST API | 8 | Low Overhead |
| Network Bandwidth | 1Gbps to 40Gbps | IEEE 802.3 (802.3ad for link aggregation) | 7 | Category 6a or Fiber |
| Disk I/O Throughput | 150+ MB/s | SATA III / NVMe | 8 | Enterprise SSD / RAID 10 |

The Configuration Protocol

Environment Prerequisites:

Before initiating the deployment of CloudStack Secondary Storage, ensure the host environment adheres to the following standards. The physical storage server must run a stable Linux distribution such as RHEL 8 or Ubuntu 22.04 LTS. All network interfaces involved in storage traffic should use the same MTU end to end; Jumbo Frames (MTU 9000) reduce per-packet header overhead during high-volume template transfers, but only when every interface and switch port on the path supports them. You must possess root-level permissions on the storage node and administrative access to the CloudStack Management Server. Ensure that the Network Time Protocol (NTP) is synchronized across all nodes to prevent cryptographic handshake failures caused by clock drift.

Section A: Implementation Logic:

The engineering design of the secondary storage system keeps bulk data movement out of the management plane. When a template is requested, the SSVM retrieves the file via HTTP, HTTPS, or FTP and stores it on the NFS mount point. The system avoids direct mounting from the Management Server to the storage array; instead, it uses the SSVM as a specialized worker. This isolation ensures that if a storage-heavy operation encounters a bottleneck or a hardware failure, the core management plane remains responsive. The design also accounts for network latency by allowing administrators to place secondary storage geographically close to the zones it serves, reducing the time required for template replication.

Step-By-Step Execution

Step 1: Initialize the NFS Export on the Storage Node

On the storage server, execute mkdir -p /export/secondary followed by chown -R root:root /export/secondary. Edit the /etc/exports file to include the line /export/secondary *(rw,async,no_root_squash,no_subtree_check). Apply these changes by running exportfs -a.
System Note: This action triggers the kernel-level NFS daemon to register the directory in the export table. The no_root_squash flag is critical as it allows the SSVM to perform operations with root privileges over the network, which is required for template manipulation.
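Re-running Step 1 by hand can leave duplicate lines in /etc/exports. The following is a minimal sketch of a repeatable version of the edit; the helper name add_export_line and the EXPORTS_FILE variable are illustrative and not part of CloudStack or the NFS tooling. It defaults to a scratch file so it can be tried without touching the real configuration.

```shell
#!/bin/sh
# Hypothetical helper: append an export entry only if that exact line is missing.
# EXPORTS_FILE defaults to a scratch file so the sketch is safe to try;
# point it at /etc/exports (as root) on a real storage node.
EXPORTS_FILE="${EXPORTS_FILE:-${TMPDIR:-/tmp}/exports.sketch}"

add_export_line() {
    entry="$1"
    touch "$EXPORTS_FILE"
    # -x: match the whole line, -F: treat the entry as a fixed string
    if ! grep -qxF -- "$entry" "$EXPORTS_FILE"; then
        printf '%s\n' "$entry" >> "$EXPORTS_FILE"
    fi
}

add_export_line '/export/secondary *(rw,async,no_root_squash,no_subtree_check)'
# On the real server, follow up with: exportfs -a
```

Calling the helper a second time with the same entry leaves the file unchanged, so the step can be repeated safely from a provisioning script.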

Step 2: Configure the Firewall for Storage Traffic

Run the command firewall-cmd --permanent --add-service=nfs followed by firewall-cmd --reload. Ensure that the rpc-bind and mountd services are also permitted through the firewall, for example with --add-service=rpc-bind and --add-service=mountd.
System Note: This modifies the iptables or nftables chains within the Linux kernel to allow incoming packets on the specific ports required for the NFS protocol. Failure to open these ports will result in a mount timeout and an SSVM transition to an “Error” state.
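The firewall changes from Step 2 can be collected into one small script. This is a sketch that assumes firewalld is the active firewall; the run wrapper and the DRY_RUN variable are illustrative names. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 and run as root to apply them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

open_nfs_services() {
    # nfs, rpc-bind and mountd are predefined firewalld service names
    for svc in nfs rpc-bind mountd; do
        run firewall-cmd --permanent --add-service="$svc"
    done
    run firewall-cmd --reload
}

open_nfs_services
```

Keeping the reload as the last command means the permanent rules are activated once, after all three services have been added.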

Step 3: Seed the System VM Templates

On the management server, mount the NFS export at a local path such as /mnt/secondary, then execute the seeding script: /usr/share/cloudstack-common/scripts/storage/secondary/cloud-install-sys-tmplt -m /mnt/secondary -u [URL] -h kvm. Use the specific URL for the system VM template provided by the CloudStack documentation for your release.
System Note: This script performs a checksum verification of the downloaded payload before expanding the template into the secondary storage path. It ensures that the initial SSVM has a bootable image ready for deployment.

Step 4: Add Secondary Storage to the CloudStack UI

Log into the CloudStack Management Console and navigate to Infrastructure > Secondary Storage. Select “Add Secondary Storage” and provide the NFS path in the format nfs://[IP-ADDRESS]/export/secondary.
System Note: The Management Server sends an API call to the database to register the storage entry. It then instructs the orchestration engine to spin up an SSVM. The SSVM will attempt to mount this NFS path using its internal helper scripts.
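A malformed path is an easy mistake in Step 4. The sketch below builds the URL in the nfs://[IP-ADDRESS]/export/secondary format from its parts; the function name nfs_url is illustrative, and 192.0.2.10 is a documentation-range placeholder rather than a real server.

```shell
#!/bin/sh
# Hypothetical helper: build the secondary storage URL expected by the UI form.
nfs_url() {
    # $1 = storage server address, $2 = absolute export path
    case "$2" in
        /*) printf 'nfs://%s%s\n' "$1" "$2" ;;
        *)  echo "export path must be absolute: $2" >&2
            return 1 ;;
    esac
}

# 192.0.2.10 is a placeholder address; substitute your storage server
nfs_url 192.0.2.10 /export/secondary

# The same registration is also available through the addImageStore API.
# Assumed CloudMonkey form, to be checked against your version:
#   cmk add imagestore provider=NFS zoneid=[ZONE-ID] url="$(nfs_url 192.0.2.10 /export/secondary)"
```

Rejecting relative paths up front avoids registering a store that the SSVM can never mount.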

Step 5: Verify SSVM Connectivity and Status

Run ssh -i /var/cloudstack/management/.ssh/id_rsa -p 3922 root@[SSVM-IP] to access the internal appliance. Inside the VM, run /usr/local/cloud/systemvm/ssvm-check.sh.
System Note: This diagnostic script audits the internal state of the SSVM. It checks for DNS resolution, connectivity to the management server, and the health of the NFS mount point.

Section B: Dependency Fault-Lines:

The most common point of failure is a mismatch in MTU settings between the SSVM and the physical switch. If the storage network uses Jumbo Frames but the SSVM, or any hop in between, is restricted to a standard 1500 MTU, large packets are dropped, and template transfers stall or fail. Another frequent bottleneck is thermal throttling on the storage server: as drive temperatures rise during sustained writes, drives and controllers may reduce throughput to protect the hardware. Ensure adequate cooling and monitor the iostat output to identify physical disk contention.
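An MTU mismatch can be confirmed with a ping that forbids fragmentation. The largest ICMP payload that fits in one frame is the MTU minus 28 bytes (20 for the IPv4 header, 8 for the ICMP header). The sketch below does that arithmetic; the function name is illustrative, and the ping flags shown in the comment are those of the Linux iputils ping.

```shell
#!/bin/sh
# Hypothetical helper: largest ICMP payload that fits in one unfragmented IPv4 packet.
icmp_payload_for_mtu() {
    # 20-byte IPv4 header + 8-byte ICMP header = 28 bytes of overhead
    echo $(( $1 - 28 ))
}

icmp_payload_for_mtu 9000   # prints 8972
icmp_payload_for_mtu 1500   # prints 1472

# From the SSVM or a hypervisor host, with the storage server address substituted:
#   ping -M do -c 3 -s "$(icmp_payload_for_mtu 9000)" [STORAGE-IP]
# -M do sets the Don't Fragment bit, so the ping fails if any hop has a smaller MTU.
```

If the 8972-byte ping fails while the 1472-byte ping succeeds, some device on the path is not passing Jumbo Frames.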

The Troubleshooting Matrix

Section C: Logs & Debugging:

The primary log for auditing storage operations on the management server is located at /var/log/cloudstack/management/management.log. Search for the string “StoragePool” or “SecondaryStorageVM” to identify lifecycle failures. On the SSVM itself, the most pertinent data is found in /var/log/cloud/cloud.log.
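The log search can be wrapped in a short helper. This is a sketch; the function name storage_events and the MGMT_LOG variable are illustrative, and the search strings are the two named above.

```shell
#!/bin/sh
# Default to the management log path named in the text; override MGMT_LOG to test elsewhere.
MGMT_LOG="${MGMT_LOG:-/var/log/cloudstack/management/management.log}"

storage_events() {
    # Print matching lines, prefixed with their line numbers,
    # from the given log file (default: MGMT_LOG)
    grep -nE 'StoragePool|SecondaryStorageVM' "${1:-$MGMT_LOG}"
}

# Show the 20 most recent matches, if the log exists on this machine
if [ -r "$MGMT_LOG" ]; then
    storage_events | tail -n 20
fi
```

The line numbers make it easier to jump to the surrounding context in the full log, where the stack trace or preceding request usually explains the failure.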

If the SSVM remains in a “Starting” state for more than ten minutes, inspect the following:
1. Storage Access: Can the management server ping the storage IP? If not, check for cabling faults or incorrect VLAN tagging on the switch ports.
2. DNS Failures: If the SSVM cannot resolve download.cloudstack.org, it cannot fetch the required templates. Check the /etc/resolv.conf within the SSVM.
3. Permissions: Check the NFS server logs at /var/log/messages. Portmap or mountd errors usually indicate that the SSVM IP is not allowed in the /etc/exports configuration.
4. Resource Exhaustion: Monitor the storage server’s RAM. High concurrency in template downloads can lead to an Out-of-Memory (OOM) kill of the nfsd process.

Optimization & Hardening

Performance Tuning:
To increase throughput and reduce latency, tune the NFS mount options within the CloudStack global settings. Set the nfs.ops.timeout to a duration that accounts for slower mechanical drives. Increase the rsize and wsize to 1048576 to optimize the payload size of each network transaction. This reduces the number of round-trips required for large file transfers.

Security Hardening:
Limit the NFS export access strictly to the Management Network CIDR and the Pod Network CIDR. Implement an aggressive firewall policy on the storage server that drops all non-NFS traffic. Use SSH keys for SSVM access and disable password-based authentication. Periodically audit the /var/lib/cloudstack/management/.ssh/ directory to ensure that only authorized keys are present for management-to-SSVM communication.
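Restricting the export to specific networks means replacing the wildcard client with one entry per CIDR. The sketch below generates such a line; the function name export_entry is illustrative, and the two CIDRs are made-up placeholders for the Management and Pod networks. The options are the ones used earlier in this guide.

```shell
#!/bin/sh
# Export options reused from the export line in Step 1
OPTS="rw,async,no_root_squash,no_subtree_check"

# Hypothetical helper: emit one /etc/exports line with a client entry per CIDR.
export_entry() {
    path="$1"
    shift
    line="$path"
    for cidr in "$@"; do
        line="${line} ${cidr}(${OPTS})"
    done
    printf '%s\n' "$line"
}

# 10.10.0.0/24 and 10.20.0.0/24 are placeholder CIDRs; use your own networks
export_entry /export/secondary 10.10.0.0/24 10.20.0.0/24
```

The generated line replaces the wildcard entry in /etc/exports; run exportfs -a afterwards so the change takes effect.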

Scaling Logic:
As the cloud environment grows, a single NFS mount point may become a bottleneck. CloudStack supports multiple secondary storage pools per zone. When the first pool reaches 85% capacity, add another, ideally in a different physical rack. This adds capacity, spreads load across servers, and allows template synchronization to run in parallel.
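The 85% rule is easy to turn into a periodic check. This is a minimal sketch; the function names and the SECONDARY_MOUNT variable are illustrative, and it falls back to the root filesystem so it runs on any machine.

```shell
#!/bin/sh
# Hypothetical helpers for a capacity check against the 85% mark.
usage_percent() {
    # df -P gives stable POSIX columns; field 5 of the second line is "Capacity", e.g. "42%"
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

over_threshold() {
    # $1 = current usage percent, $2 = threshold percent (defaults to 85)
    [ "$1" -ge "${2:-85}" ]
}

# SECONDARY_MOUNT would be the mounted export, e.g. /mnt/secondary
pct="$(usage_percent "${SECONDARY_MOUNT:-/}")"
if over_threshold "$pct" 85; then
    echo "secondary storage at ${pct}%: time to add another store"
else
    echo "secondary storage at ${pct}%: below the 85% mark"
fi
```

Run from cron on a host that has the export mounted, the same check could feed an alert instead of printing a message.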

The Admin Desk

How do I clear a “stalled” template download?
Access the template_store_ref table in the CloudStack database. Identify the row with the “DOWNLOAD_ERROR” state and set the download_state to “NOT_DOWNLOADED” to force the SSVM to restart the process.
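A sketch of that change as a statement generator follows. It only prints the SQL, so it can be reviewed before being run against the database. The function name is illustrative, the row id 42 is made up, and the statement assumes the table has an id column; back up the database before applying it.

```shell
#!/bin/sh
# Hypothetical helper: print the UPDATE for one stalled row; nothing is executed.
reset_download_sql() {
    # Accept only a plain positive integer to avoid building malformed SQL
    case "$1" in
        ''|*[!0-9]*) echo "row id must be a positive integer" >&2
                     return 1 ;;
    esac
    printf "UPDATE template_store_ref SET download_state = 'NOT_DOWNLOADED' WHERE id = %s AND download_state = 'DOWNLOAD_ERROR';\n" "$1"
}

# 42 is a made-up row id; look up the real one with a SELECT first
reset_download_sql 42
```

The extra condition on DOWNLOAD_ERROR keeps the statement from touching a row whose download has since recovered.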

Why is my SSVM showing a 0% disk usage after mounting?
This usually indicates the SSVM has mounted the local filesystem of its own appliance rather than the remote NFS export. Verify the mount command parameters in the SSVM logs and ensure the NFS server is reachable via the storage network.

Can I use S3 instead of NFS for Secondary Storage?
Yes; CloudStack supports S3-compliant object storage. This transition shifts the storage logic from file-level protocols to object-based APIs, which significantly improves scalability and reduces the overhead associated with maintaining a traditional NFS server.

How does network latency affect template deployment?
High latency between the SSVM and the primary storage results in slow “Copy Volume” operations. If latency exceeds 50ms, SSVM operations may time out. Ensure that secondary storage is connected via high-speed, low-latency switching fabric.

What happens if the Secondary Storage goes offline?
Existing virtual machines will continue to run without interruption. However, you will be unable to deploy new instances, take snapshots, or perform volume migrations until the storage connectivity is restored and the SSVM resumes its heartbeats.
