How to use cloud-init with Scaleway Instances
Cloud-init is the industry standard for cloud Instance customization. It automates many tasks at boot time, such as software installation and configuration, disk partitioning and formatting, and the execution of custom commands.
Installation
Cloud-init is packaged for most GNU/Linux distributions and part of their default repositories. This means you can simply install the cloud-init package with your system's package manager. Some distributions may require you to manually enable the services after the package installation.
Alternatively, you can download the unpackaged source code of the latest release from the project's GitHub repository.
How Cloud-init operates
Cloud-init consists of a series of system services that trigger at different stages of the Instance's boot. Each of these stages takes part in customizing the system according to the operating system distribution at hand, the cloud provider, and the user's requirements. To that end, Cloud-init gathers information about the current Instance, combining data from three main sources:
- meta-data: These data represent the Instance's own introspection, as if it queried itself through the API.
- vendor-data: These are configuration data exposed by the cloud provider. They can be overridden by user-data.
- user-data: These are optional data that the Instance's owner can inject before the Instance boots.
As a Scaleway Instance user, you rely on user-data to specify the customizations you want to apply to your Instance. User-data can be written in a variety of formats, of which cloud-config (YAML-based) is the most common. In the Scaleway Instances API, the user-data actually used by Cloud-init are stored under the cloud-init key of the Instance's User Data. Cloud-init uses the Scaleway datasource to interact with the Metadata API.
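As an illustration of the cloud-config format mentioned above, the following minimal sketch writes a small user-data file and, when possible, validates it. The file name my-new-userdata.yaml, the nginx package, and the commands are arbitrary choices for the example. The `cloud-init schema --config-file` subcommand is assumed to be available, which is only the case on recent Cloud-init releases, so the validation step is skipped if the tool is missing.

```shell
# Write a minimal cloud-config document (illustrative content only).
cat > my-new-userdata.yaml <<'EOF'
#cloud-config
package_update: true
packages:
  - nginx
runcmd:
  - [ systemctl, enable, --now, nginx ]
EOF

# Optionally validate the file if cloud-init is installed locally.
# Assumption: the installed release provides the "schema" subcommand.
if command -v cloud-init >/dev/null 2>&1; then
  cloud-init schema --config-file my-new-userdata.yaml \
    || echo "validation failed or is not supported by this cloud-init release"
fi
```

The first line, `#cloud-config`, is what tells Cloud-init to interpret the file as cloud-config rather than as another user-data format.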
Creating an Instance with user-data
Updating user-data of an existing Instance
Querying meta-data from within the Instance
For any given Instance, it is possible to query its meta-data from the guest system. There are several methods at your disposal:
- Running the scw-metadata command will output all meta-data related to this Instance:
    $ scw-metadata
    ...
    NAME=my-instance-name
    ...
- Providing the name of a meta-data field as an argument to scw-metadata will output only the value of this specific field:
    $ scw-metadata NAME
    my-instance-name
- You can request JSON output with the scw-metadata-json command:
    $ scw-metadata-json
    {...,"name":"my-instance-name",...}
- Alternatively, you can send a raw HTTP request to http://169.254.42.42/conf (or http://[fd00:42::42]/conf for IPv6 connectivity):
    $ curl http://169.254.42.42/conf
    ...
    NAME=my-instance-name
    ...
    $ curl http://169.254.42.42/conf?format=json
    {...,"name":"my-instance-name",...}
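When the meta-data is retrieved over raw HTTP in its KEY=VALUE form, a single field can be extracted with standard tools. The sketch below assumes one KEY=VALUE pair per line, as in the output shown above. The helper name get_metadata_field and the sample input (including EXAMPLE_FIELD) are hypothetical and only serve the illustration.

```shell
# Hypothetical helper: print the value of one field from KEY=VALUE lines on stdin.
# Everything after the first "=" is kept, so values containing "=" survive.
get_metadata_field() {
  awk -F= -v key="$1" '$1 == key { sub(/^[^=]*=/, ""); print; exit }'
}

# Sample input standing in for real meta-data output (values are made up).
printf 'NAME=my-instance-name\nEXAMPLE_FIELD=a=b\n' | get_metadata_field NAME
```

On an Instance, the same helper could be fed from the endpoint, for example `curl -s http://169.254.42.42/conf | get_metadata_field NAME`.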
Querying user-data from within the Instance
For any given Instance, it is possible to query its user-data from the guest system. There are several methods at your disposal:
- Running the scw-userdata command will output the available user_data keys (USER_DATA_X=) along with their total count (USER_DATA=):
    # scw-userdata
    USER_DATA=2
    USER_DATA_0=cloud-init
    USER_DATA_1=ssh-host-fingerprints
- Passing a key name as an argument to the command will print the actual data it points to. The user-data consumed by Cloud-init are pointed to by the cloud-init key:
    # scw-userdata cloud-init
  Keep in mind that the cloud-init key will not be visible from an Instance where no user-data has been configured.
- If a second argument is passed, it is considered the new value to set for the given key:
    # scw-userdata cloud-init "$(cat my-new-userdata.yaml)"
- Alternatively, you can send a raw HTTP request to http://169.254.42.42/user_data (or http://[fd00:42::42]/user_data for IPv6 connectivity). The request must originate from a port below 1024:
    # curl --local-port 1-1023 http://169.254.42.42/user_data
    USER_DATA=2
    USER_DATA_0=cloud-init
    USER_DATA_1=ssh-host-fingerprints
    # curl --local-port 1-1023 http://169.254.42.42/user_data?format=json
    {"user_data":["cloud-init","ssh-host-fingerprints"]}
    # curl --local-port 1-1023 http://169.254.42.42/user_data/cloud-init
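The key listing described above is easy to post-process in a script. The following sketch extracts only the key names from a USER_DATA listing, assuming the exact line format shown in this section. The helper name list_userdata_keys is hypothetical, and the sample input simply reproduces the example output.

```shell
# Hypothetical helper: print the key names from a USER_DATA listing on stdin.
# Only USER_DATA_<n>= lines are kept; the USER_DATA=<count> line is ignored.
list_userdata_keys() {
  sed -n 's/^USER_DATA_[0-9][0-9]*=//p'
}

# Sample input reproducing the listing shown in this section.
printf 'USER_DATA=2\nUSER_DATA_0=cloud-init\nUSER_DATA_1=ssh-host-fingerprints\n' \
  | list_userdata_keys
```

On an Instance, the listing could come from the endpoint instead, for example `curl -s --local-port 1-1023 http://169.254.42.42/user_data | list_userdata_keys`, run as root because of the privileged source port.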
Special case: Instances booting with no public IP address
Under normal circumstances, when running on a Scaleway Instance, Cloud-init will expect to be able to query the Metadata API on http://169.254.42.42 (or http://[fd00:42::42] for IPv6) to retrieve the data it needs to apply system customizations. This requires the Instance to be provided with any available kind of public IP address, be it a Flexible IP or a Dynamic IP.
Consequently, Instances booting without any public IP address cannot access the Metadata API, which prevents Cloud-init from taking the expected actions. To work around this issue, such Instances are provided with an additional storage device containing static files, so that the NoCloud datasource can take over.
If the Instance has no public IP address when powered on, a CD-ROM drive is attached to it. This drive serves a medium containing the NoCloud configuration files. Because the filesystem is labeled cidata, Cloud-init detects and consumes it automatically: no manual action is required.
If you have a local shell on your Instance, you can check for the presence of the CD-ROM drive:
# grep -C1 'QEMU CD-ROM' /proc/scsi/scsi
Host: scsi2 Channel: 00 Id: 00 Lun: 00
Vendor: QEMU Model: QEMU CD-ROM Rev: 2.5+
Type: CD-ROM ANSI SCSI revision: 05
# blkid -L cidata
/dev/sr0
# mount LABEL=cidata /media
mount: /media: WARNING: source write-protected, mounted read-only.
# find /media/*
/media/meta-data
/media/network-config
/media/user-data
/media/vendor-data
# umount /media
You can also confirm that your system was customized using the NoCloud datasource by running the following command:
# cloud-init query cloud_id
nocloud
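The check above can be wrapped in a small script, for instance to log which datasource was used. This is only a sketch: the helper name describe_cloud_id and the messages are hypothetical, and any value other than nocloud is reported as-is without interpretation.

```shell
# Hypothetical helper: turn a cloud_id value into a human-readable message.
describe_cloud_id() {
  case "$1" in
    nocloud) echo "NoCloud datasource: configuration was read from the cidata volume" ;;
    "")      echo "no cloud_id reported" ;;
    *)       echo "other datasource: $1" ;;
  esac
}

# Query cloud-init only when it is installed on the current machine.
if command -v cloud-init >/dev/null 2>&1; then
  describe_cloud_id "$(cloud-init query cloud_id 2>/dev/null)"
else
  echo "cloud-init is not installed on this machine"
fi
```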