This blog was authored by: Jose Mendes | 29 October 2024
The preview of the Terraform provider for Microsoft Fabric was announced at FabCon Europe, and it was one of my favourite announcements. Using Infrastructure as Code (IaC) is essential for any organisation aiming to manage their cloud resources effectively and efficiently. Organisations can leverage Terraform to, among other things, strengthen their governance and compliance processes, automate complex deployments and quickly set up their infrastructure. Before proceeding to the setup, it's important to highlight that, at the time of writing, the provider is experimental and should not be used in a production environment. You can find the HashiCorp documentation here.
Pre-Requisites
- A Fabric capacity
- A service principal added to the Fabric capacity as a capacity administrator
- Terraform >= 1.8.x (see the configuration sketch after this list)
- VS Code with the Terraform extension
- Azure CLI, for authentication with a user context where service principals are not yet supported
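To tie these prerequisites together, a minimal root configuration might look like the sketch below. It is only indicative: the version pin matches the provider version used later in this post, and the authentication comment reflects the Azure CLI fallback covered in the provider's documentation.

```hcl
terraform {
  # The provider requires Terraform 1.8 or later
  required_version = ">= 1.8.0"

  required_providers {
    fabric = {
      source  = "microsoft/fabric"
      version = "0.1.0-beta.4"
    }
  }
}

provider "fabric" {
  # With no explicit credentials configured, authentication falls back
  # to the Azure CLI user context (az login), per the prerequisites above.
}
```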
Setup
I started this exercise using a service principal and later switched to authentication with a user context due to existing limitations. The setup for both scenarios, along with the steps required to configure the Fabric capacity, is detailed in the documentation mentioned above.

Although not required, I decided to use Terraform modules. This offers several advantages, such as reusability, maintainability and scalability, and ultimately enforces best practices by standardising the code and reducing the risk of misconfigurations. The project is structured as follows:

- .terraform – generated by Terraform once the init command is executed. It contains a list of the modules available in the project and the providers used. In this instance, I'm using version 0.1.0-beta.4.
- modules – contains the modules used in the project. Each module represents a Fabric item. To date, the provider contains 22 resources and, as new Fabric REST APIs are released, that number is expected to grow. Each module contains a main.tf, outputs.tf and variables.tf. Although the provider is specified in the primary main.tf file, due to a conflict with the providers we must specify the fabric provider in each module. In some cases, the same module uses multiple resources, like the workspace one (see the module sketch at the end of this section).
  - main.tf
  - outputs.tf
  - variables.tf
- platform – contains the definition of the objects to be deployed. I successfully uploaded a Spark notebook to my Fabric workspace; however, when attempting to do the same with a data pipeline, I wasn't so lucky. If the pipeline contains an activity that references an existing connection (e.g. a lakehouse), the deployment fails. One positive aspect is that it is possible to update parameters during deployment (see the notebook sketch at the end of this section).
  - pipeline-content.json
  - fabric_playground.ipynb
- locals.tf – defines local values, which are essentially named expressions or variables that can be reused within a configuration.
- main.tf – the primary configuration file, where the core infrastructure resources, data sources and providers are defined.
- secret.tfvars – can be used to store secret values, such as the client secret for the service principal (see the variable sketch at the end of this section). The file can be invoked during the Terraform plan and apply like this: terraform plan -var-file="secret.tfvars".
- terraform.tfstate – stores information about the resources managed by Terraform. This file is essential for Terraform to track and manage resources across different runs.
- terraform.tfvars – defines values for the input variables declared in the configuration files (*.tf).
- variables.tf – defines input variables that allow the configuration to be more flexible, reusable and parameterised.

Once all the elements are configured, all that is required is to run the three familiar commands: terraform init, terraform plan -var-file="secret.tfvars" and terraform apply -var-file="secret.tfvars". In the image below it's possible to identify the items that were deployed via the service principal and via the user account.
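For illustration, here is a minimal sketch of what the workspace module could look like. The resource and attribute names are based on the provider's documentation but, given the provider is still in preview, treat them as indicative rather than definitive. Note the per-module required_providers block: because the provider is published under the microsoft namespace rather than hashicorp, every child module must declare it explicitly, which is the provider conflict mentioned above.

```hcl
# modules/workspace/main.tf
terraform {
  required_providers {
    fabric = {
      # Providers outside the hashicorp namespace must be declared
      # in every module that uses them.
      source = "microsoft/fabric"
    }
  }
}

resource "fabric_workspace" "this" {
  display_name = var.display_name
  capacity_id  = var.capacity_id
}

# modules/workspace/variables.tf
variable "display_name" {
  type = string
}

variable "capacity_id" {
  type = string
}

# modules/workspace/outputs.tf
output "workspace_id" {
  value = fabric_workspace.this.id
}
```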
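The notebook deployment could then be sketched as below. The definition/tokens structure follows the provider's item-definition convention, which is also how parameters can be updated during deployment; the part key, the format attribute and the token name are assumptions for illustration and may differ in the beta version used here.

```hcl
resource "fabric_notebook" "playground" {
  workspace_id = module.workspace.workspace_id
  display_name = "fabric_playground"
  format       = "ipynb" # assumption: ipynb definitions are supported

  definition = {
    "notebook-content.ipynb" = {
      source = "${path.module}/platform/fabric_playground.ipynb"
      # Tokens are substituted inside the definition file at deploy time,
      # which is how parameters can be updated during deployment.
      tokens = {
        "LAKEHOUSE_ID" = var.lakehouse_id # hypothetical parameter
      }
    }
  }
}
```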
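Finally, a small sketch of how a secret value can travel from secret.tfvars into the configuration when the file is passed with -var-file; the variable name is my own choice.

```hcl
# variables.tf
variable "client_secret" {
  description = "Client secret of the service principal"
  type        = string
  sensitive   = true
}

# secret.tfvars (kept out of version control)
client_secret = "<service-principal-secret>"
```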
Final Considerations
The provider is still fairly basic for some items, such as the Warehouse resource, where only a display name can be passed, and some resources need to be improved. However, it's a great first step towards a more robust approach to managing and deploying a Fabric-based data platform. As always, if you would like to understand how Telefónica Tech can help accelerate your data potential, please get in touch here.