Infrastructure as Code
Table of Contents
- Purpose
- Who Is This For?
- Infrastructure Overview
- Terraform Structure
- Modules Reference
- Environment Configuration
- Terraform State Management
- Common Operations
- Best Practices
- Architecture Diagram
Purpose
This document describes the Infrastructure as Code (IaC) implementation for the Algesta platform using Terraform. All infrastructure is provisioned and managed through Terraform modules: the Azure resources (AKS clusters, Azure Container Registry, Key Vault, Storage Accounts) and the MongoDB Atlas databases.
By following this guide, you will understand:
- The modular Terraform structure shared across environments
- How to provision and update Azure infrastructure
- State management and backend configuration
- Environment-specific configuration and how the environments differ
- Common operations and troubleshooting
Who Is This For?
This guide is for DevOps engineers managing infrastructure, platform engineers provisioning resources, and SREs troubleshooting infrastructure issues. It assumes familiarity with Terraform, Azure, Kubernetes, and general infrastructure concepts.
Infrastructure Overview
The Algesta platform infrastructure consists of:
| Component | Technology | Purpose | Managed By |
|---|---|---|---|
| Kubernetes Cluster | Azure AKS | Container orchestration | Terraform |
| Container Registry | Azure ACR | Docker image storage | Terraform |
| Secrets Management | Azure Key Vault | Credentials and secrets | Terraform |
| Object Storage | Azure Storage Account | Terraform state, application data | Terraform |
| Database | MongoDB Atlas | Primary database (M0 free tier) | Terraform |
| Load Balancing | AKS Ingress (nginx) | External traffic routing | Kubernetes manifests |
| TLS Certificates | cert-manager + Let’s Encrypt | HTTPS encryption | Kubernetes manifests |
| Monitoring | Grafana + Prometheus + Loki | Observability stack | Helm charts |
Regions:
- Azure Primary Region: East US (configurable)
- MongoDB Atlas Region: US-EAST-1 (AWS)
Terraform Structure
All infrastructure code is located in the ops-algesta/infrastructure/ directory:
    ops-algesta/infrastructure/
    ├── envs/
    │   ├── dev/
    │   │   └── Azure/
    │   │       ├── main.tf        # Dev environment resources
    │   │       ├── variables.tf   # Dev-specific variables
    │   │       ├── provider.tf    # Azure provider config
    │   │       └── outputs.tf     # Resource outputs
    │   └── production/
    │       └── Azure/
    │           ├── main.tf        # Production environment resources
    │           ├── variables.tf   # Production-specific variables
    │           ├── provider.tf    # Azure provider config
    │           └── outputs.tf     # Resource outputs
    └── modules/
        ├── aks/                   # Azure Kubernetes Service
        │   ├── main.tf
        │   ├── variables.tf
        │   └── outputs.tf
        ├── acr/                   # Azure Container Registry
        │   ├── main.tf
        │   ├── variables.tf
        │   └── outputs.tf
        ├── secrets/               # Azure Key Vault
        │   ├── main.tf
        │   ├── variables.tf
        │   └── outputs.tf
        ├── storage/               # Azure Storage Account
        │   ├── main.tf
        │   └── variables.tf
        ├── database/
        │   └── MongoDB/           # MongoDB Atlas
        │       ├── main.tf
        │       ├── variables.tf
        │       └── outputs.tf
        └── events/                # Azure Event Grid (optional)
            ├── main.tf
            ├── variables.tf
            └── outputs.tf

Design Principles:
- Environment Isolation: Each environment (dev, production) has separate Terraform state and resources
- Module Reusability: Common infrastructure patterns abstracted into reusable modules
- Configuration Management: Environment-specific values live in variables.tf, not hardcoded in resources
- State Separation: Each environment maintains independent state to prevent cross-environment drift
Modules Reference
1. AKS Module (modules/aks/)
Purpose: Provisions an Azure Kubernetes Service cluster with configurable node pools
Key Resources:
- azurerm_kubernetes_cluster: Main AKS cluster
- azurerm_kubernetes_cluster_node_pool: Additional node pools (user workloads)
Configuration Example (from envs/production/Azure/main.tf:20-71):
    module "aks" {
      source              = "../../../modules/aks"
      cluster_name        = "aks-${var.project_name}-${var.environment}"
      location            = var.location
      resource_group_name = var.rg_name
      dns_prefix          = "${var.project_name}-${var.environment}"
      sku_tier            = "Free"

      tags = {
        environment = var.environment
      }

      # System pool: hosts AKS system services
      default_node_pool = {
        name                = "default"
        vm_size             = "Standard_B2s"
        node_count          = 1
        os_disk_size_gb     = 30
        os_type             = "Linux"
        mode                = "System"
        enable_auto_scaling = true
        max_count           = 1
        min_count           = 1
      }

      # User pools: run application workloads
      node_pools = {
        stdar_pool = {
          name                = "stdar${var.environment}"
          vm_size             = "Standard_B2s"
          node_count          = 1
          mode                = "User"
          os_disk_size_gb     = 30
          os_type             = "Linux"
          enable_auto_scaling = true
          min_count           = 1
          max_count           = 3
        }
      }
    }

Key Features:
- System Node Pool: Required for AKS system services (CoreDNS, metrics-server)
- User Node Pool: Runs application workloads, supports auto-scaling (1-3 nodes)
- Web App Routing: Enabled for ingress controller integration
- Managed Identity: System-assigned identity for Azure resource access
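As a rough sketch of how a module could turn the node_pools map into user node pools, the block below iterates over the map with for_each. The resource label user, the variable shape, and the lifecycle rule are assumptions and not the actual contents of modules/aks/main.tf; argument names follow the azurerm 3.x provider. Ignoring node_count is a common choice when the autoscaler owns the node count, because otherwise a later plan may try to reset it.

```hcl
# Hypothetical variable shape, mirroring the node_pools map in the module call.
variable "node_pools" {
  type = map(object({
    name                = string
    vm_size             = string
    node_count          = number
    mode                = string
    os_disk_size_gb     = number
    os_type             = string
    enable_auto_scaling = bool
    min_count           = number
    max_count           = number
  }))
  default = {}
}

# One node pool per map entry. azurerm_kubernetes_cluster.aks is assumed to be
# defined elsewhere in the same module.
resource "azurerm_kubernetes_cluster_node_pool" "user" {
  for_each = var.node_pools

  name                  = each.value.name
  kubernetes_cluster_id = azurerm_kubernetes_cluster.aks.id
  vm_size               = each.value.vm_size
  mode                  = each.value.mode
  os_type               = each.value.os_type
  os_disk_size_gb       = each.value.os_disk_size_gb
  enable_auto_scaling   = each.value.enable_auto_scaling
  min_count             = each.value.min_count
  max_count             = each.value.max_count
  node_count            = each.value.node_count

  lifecycle {
    # The cluster autoscaler changes the live node count; do not fight it.
    ignore_changes = [node_count]
  }
}
```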
Outputs:
- aks_id: Cluster resource ID
- kube_config: Kubernetes admin config (sensitive)
- kubelet_identity_object_id: Identity for ACR pull permissions
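For illustration, the three outputs listed above could be declared roughly as follows. This is a sketch that assumes the cluster resource is labelled aks (as the state addresses later in this document suggest); the real outputs.tf may differ.

```hcl
output "aks_id" {
  description = "Cluster resource ID"
  value       = azurerm_kubernetes_cluster.aks.id
}

output "kube_config" {
  description = "Raw admin kubeconfig; marked sensitive so it is hidden in plan output"
  value       = azurerm_kubernetes_cluster.aks.kube_config_raw
  sensitive   = true
}

output "kubelet_identity_object_id" {
  description = "Object ID of the kubelet identity, used for the AcrPull role assignment"
  value       = azurerm_kubernetes_cluster.aks.kubelet_identity[0].object_id
}
```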
2. ACR Module (modules/acr/)
Purpose: Provisions Azure Container Registry for Docker images
Configuration Example (from envs/production/Azure/main.tf:73-84):

    module "acr" {
      source              = "../../../modules/acr"
      acr_name            = "acr${var.project_name}${var.environment}"
      resource_group_name = var.rg_name
      location            = var.location
      sku                 = "Basic"
      admin_enabled       = true

      tags = {
        environment = var.environment
      }
    }

SKU Comparison:
| Feature | Basic | Standard | Premium |
|---|---|---|---|
| Storage | 10 GB | 100 GB | 500 GB |
| Webhooks | 2 | 10 | 500 |
| Geo-replication | No | No | Yes |
| Content Trust | No | No | Yes |
| Cost | $ | $$ | $$$ |
Current Setup: Basic tier (dev/prod), upgradeable as needed
ACR Integration with AKS:
    resource "azurerm_role_assignment" "aks_to_acr" {
      scope                = module.acr.acr_id
      role_definition_name = "AcrPull"
      principal_id         = module.aks.kubelet_identity_object_id
    }

This grants AKS nodes permission to pull images from ACR without manual credentials.
3. Key Vault Module (modules/secrets/)
Purpose: Manages secrets, keys, and certificates for the platform
Configuration Example (from envs/production/Azure/main.tf:86-94):

    module "keyvault" {
      source              = "../../../modules/secrets"
      name                = "akv-${var.project_name}-${var.environment}"
      location            = var.location
      resource_group_name = var.rg_name
      tenant_id           = var.tenant_id
      sku_name            = "standard"
    }

Key Features (from modules/secrets/main.tf):
- RBAC Authorization: enable_rbac_authorization = true
- Soft Delete: 7-day retention for deleted secrets
- Disk Encryption: Enabled for VM disk encryption keys
- Secret Management: Supports bulk secret creation via for_each
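The features above map onto a small amount of Terraform. The following is a minimal sketch, not the module's actual code: the resource labels and the secrets variable shape are assumptions, and the remaining variables (name, location, resource_group_name, tenant_id, sku_name) are taken from the module call shown earlier and assumed to be declared in variables.tf. Iterating over the key names instead of the map itself keeps for_each usable when the secret values are sensitive.

```hcl
# Hypothetical input: secret name => secret value.
variable "secrets" {
  description = "Secrets to create in the vault"
  type        = map(string)
  default     = {}
}

resource "azurerm_key_vault" "this" {
  name                        = var.name
  location                    = var.location
  resource_group_name         = var.resource_group_name
  tenant_id                   = var.tenant_id
  sku_name                    = var.sku_name
  enable_rbac_authorization   = true # access is granted through Azure RBAC roles
  soft_delete_retention_days  = 7    # minimum retention the provider accepts
  enabled_for_disk_encryption = true
}

resource "azurerm_key_vault_secret" "this" {
  # Only the (non-sensitive) names drive for_each; values are looked up per key.
  # This fails if the whole variable is declared sensitive.
  for_each = toset([for name, value in var.secrets : name])

  name         = each.key
  value        = var.secrets[each.key]
  key_vault_id = azurerm_key_vault.this.id
}
```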
Typical Secrets Stored:
- MongoDB connection strings (MONGODB_URI)
- JWT signing keys (JWT_SECRET)
- API keys for external services
- TLS certificates (if not using cert-manager)
Access Control:
- Use Azure RBAC roles: Key Vault Secrets Officer, Key Vault Secrets User
- Service principals for CI/CD pipelines
- Managed identities for AKS pods (via Azure Key Vault Provider for Secrets Store CSI Driver)
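With RBAC authorization enabled on the vault, read access can be granted through a role assignment. The sketch below is illustrative only: the app_principal_id variable and the key_vault_id module output are hypothetical names that do not appear in the source.

```hcl
# Hypothetical: object ID of the identity (pipeline or workload) that reads secrets.
variable "app_principal_id" {
  description = "Object ID of the principal that needs read access to secrets"
  type        = string
}

resource "azurerm_role_assignment" "app_reads_secrets" {
  scope                = module.keyvault.key_vault_id # assumed output name
  role_definition_name = "Key Vault Secrets User"     # read-only access to secret values
  principal_id         = var.app_principal_id
}
```

Key Vault Secrets Officer would be assigned the same way to identities that also need to create or rotate secrets.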
4. Storage Module (modules/storage/)
Purpose: Provisions an Azure Storage Account for Terraform state and application data
Configuration Example (from envs/production/Azure/main.tf:1-17):

    module "storage" {
      source               = "../../../modules/storage"
      name_storage_account = "storage${var.project_name}${var.environment}"
      resource_group_name  = var.rg_name
      location             = var.location

      containers = [
        {
          container_name        = "tf-state"
          container_access_type = "private"
        },
        {
          container_name        = "data"
          container_access_type = "private"
        }
      ]
    }

Containers:
- tf-state: Stores Terraform state files (backend)
- data: Application file uploads, logs, backups
Storage Configuration (from modules/storage/main.tf):
- Account Tier: Standard (default, configurable to Premium)
- Replication: LRS (Locally Redundant Storage) by default, upgradeable to GRS for production
- CORS: Configurable for web application access
- Access Tier: Hot (for frequently accessed data)
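A minimal sketch of how the storage settings above could be expressed, assuming azurerm 3.x argument names. The resource labels are hypothetical, and the real modules/storage/main.tf also covers options such as CORS that are omitted here.

```hcl
variable "name_storage_account" { type = string }
variable "resource_group_name" { type = string }
variable "location" { type = string }

variable "containers" {
  type = list(object({
    container_name        = string
    container_access_type = string
  }))
  default = []
}

resource "azurerm_storage_account" "this" {
  name                     = var.name_storage_account
  resource_group_name      = var.resource_group_name
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = "LRS" # "GRS" is the usual upgrade for production
  access_tier              = "Hot"
}

resource "azurerm_storage_container" "this" {
  # Key each container by name so list reordering does not recreate containers.
  for_each = { for c in var.containers : c.container_name => c }

  name                  = each.value.container_name
  storage_account_name  = azurerm_storage_account.this.name
  container_access_type = each.value.container_access_type
}
```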
5. MongoDB Atlas Module (modules/database/MongoDB/)
Purpose: Provisions a MongoDB Atlas cluster (M0 free tier)
Configuration Example:
    module "mongodb" {
      source                    = "../../../modules/database/MongoDB"
      project_id                = var.mongodb_project_id
      cluster_name_mongodbatlas = var.project_name
      current_environment       = var.environment
      cluster_type              = "REPLICASET"

      replication_specs = [
        {
          num_shards = 1
          regions_config = [
            {
              region_name     = "US_EAST_1"
              electable_nodes = 3
              priority        = 7
              read_only_nodes = 0
            }
          ]
        }
      ]

      cloud_backup                 = true
      auto_scaling_disk_gb_enabled = true
      mongo_db_major_version       = "7.0"
      provider_name                = "TENANT" # M0 free tier
      provider_instance_size_name  = "M0"

      admin_username = var.mongodb_admin_username
      admin_password = var.mongodb_admin_password

      cidr_block_mongodbatlas_project_ip_access_list = "0.0.0.0/0" # Restrict in production
      comment_mongodbatlas_project_ip_access_list    = "Allow all (for dev)"
    }

Key Features (from modules/database/MongoDB/main.tf):
- Cluster Type: ReplicaSet (3 nodes for high availability)
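- Cloud Backup: cloud_backup = true requests automated snapshots. Note that Atlas does not offer cloud backups or point-in-time restore on the M0 free tier, so managed backups only become available after upgrading to a dedicated tier (M10+)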
- IP Whitelist: mongodbatlas_project_ip_access_list (currently open to 0.0.0.0/0; should be restricted)
- Database User: Admin user with readWrite and atlasAdmin roles
Production Recommendations:
- Upgrade to M10+ for dedicated resources and advanced features
- Restrict IP whitelist to AKS cluster egress IPs
- Enable VPC peering for private connectivity
- Configure backup schedules and retention policies
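The IP restriction recommended above could look roughly like this. It is a sketch under assumptions: aks_egress_cidrs is a hypothetical variable, the example address comes from a documentation-only range, and the real egress IPs must be taken from the cluster's outbound configuration.

```hcl
# Hypothetical input: the public CIDRs the AKS cluster uses for outbound traffic.
variable "aks_egress_cidrs" {
  description = "CIDR blocks allowed to reach the Atlas project"
  type        = set(string)
  default     = ["203.0.113.10/32"] # placeholder from a documentation range
}

# One access-list entry per CIDR, replacing the single 0.0.0.0/0 entry.
resource "mongodbatlas_project_ip_access_list" "aks_egress" {
  for_each = var.aks_egress_cidrs

  project_id = var.mongodb_project_id
  cidr_block = each.value
  comment    = "AKS egress (${var.environment})"
}
```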
Environment Configuration
Development Environment (envs/dev/Azure/)
Characteristics:
- Purpose: Testing, development, CI/CD integration
- AKS: 1 system node, 1-3 user nodes (auto-scaling)
- ACR: Basic tier
- MongoDB: M0 free tier, 0.0.0.0/0 IP whitelist
- Cost: Minimal (~$50-100/month)
Namespace Conventions:
- development: Application deployments
- monitoring: Grafana, Prometheus, Loki
Production Environment (envs/production/Azure/)
Characteristics:
- Purpose: Live customer traffic
- AKS: 1 system node, 1-3 user nodes (auto-scaling)
- ACR: Basic tier (consider Standard for webhooks)
- MongoDB: M0 free tier (upgrade to M10+ recommended)
- Cost: ~$100-200/month (depends on usage)
Namespace Conventions:
- production: Application deployments
- monitoring: Observability stack
Production-Specific Settings:
- TLS certificates from Let’s Encrypt (via cert-manager)
- Ingress hosts: algesta-api-prod.3astronautas.com
- Auto-scaling enabled for resilience
Environment Variables
Each environment requires these variables (in variables.tf):
    variable "project_name" {
      description = "Project name"
      type        = string
      default     = "algesta"
    }

    variable "environment" {
      description = "Environment (dev, production)"
      type        = string
    }

    variable "location" {
      description = "Azure region"
      type        = string
      default     = "East US"
    }

    variable "rg_name" {
      description = "Resource group name"
      type        = string
    }

    variable "tenant_id" {
      description = "Azure AD tenant ID"
      type        = string
    }

    variable "mongodb_project_id" {
      description = "MongoDB Atlas project ID"
      type        = string
      sensitive   = true
    }

    variable "mongodb_admin_username" {
      description = "MongoDB admin username"
      type        = string
      sensitive   = true
    }

    variable "mongodb_admin_password" {
      description = "MongoDB admin password"
      type        = string
      sensitive   = true
    }

Setting Variables:
- Create terraform.tfvars (git-ignored); the values below are placeholders:

      environment            = "production"
      rg_name                = "rg-algesta-production"
      tenant_id              = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
      mongodb_project_id     = "xxxxxxxxxxxxxxxxxxxxxxxx"
      mongodb_admin_username = "admin"
      mongodb_admin_password = "SecurePassword123!"

- Or set via environment variables:

      export TF_VAR_mongodb_project_id="xxx"
      export TF_VAR_mongodb_admin_password="xxx"
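Because the environment value feeds resource names and state isolation, it can help to reject unexpected values early. The following is an optional sketch, not part of the existing variables.tf; it would replace the plain environment declaration shown earlier rather than sit alongside it.

```hcl
variable "environment" {
  description = "Environment (dev, production)"
  type        = string

  validation {
    # Fail at plan time if a typo such as "prod" slips into terraform.tfvars.
    condition     = contains(["dev", "production"], var.environment)
    error_message = "The environment value must be either \"dev\" or \"production\"."
  }
}
```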
Terraform State Management
Backend Configuration
Terraform state is stored in Azure Storage Account (created by the storage module).
Backend Config (in provider.tf):
    terraform {
      backend "azurerm" {
        resource_group_name  = "rg-algesta-production"
        storage_account_name = "storagealgestaproduction"
        container_name       = "tf-state"
        key                  = "production.tfstate"
      }
    }

State File Naming:
- Dev: dev.tfstate
- Production: production.tfstate
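One way to keep per-environment backend values out of provider.tf is partial backend configuration: the backend "azurerm" block is left empty and the settings are supplied from a small file at init time. The file name and the dev resource group and storage account names below are assumptions derived from the production naming pattern, not values confirmed by the source.

```hcl
# dev.backend.hcl (hypothetical file name)
# Used with an empty `backend "azurerm" {}` block and:
#   terraform init -backend-config=dev.backend.hcl

resource_group_name  = "rg-algesta-dev"    # assumed, by analogy with production
storage_account_name = "storagealgestadev" # assumed: "storage" + project + environment
container_name       = "tf-state"
key                  = "dev.tfstate"
```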
State Operations
View Current State:

    terraform state list

Inspect Resource:

    terraform state show module.aks.azurerm_kubernetes_cluster.aks

Remove Resource from State (dangerous):

    terraform state rm module.aks.azurerm_kubernetes_cluster.aks

Import Existing Resource:

    terraform import module.aks.azurerm_kubernetes_cluster.aks \
      /subscriptions/xxx/resourceGroups/rg-algesta-production/providers/Microsoft.ContainerService/managedClusters/aks-algesta-production

State Locking
The Azure Storage backend locks state automatically using a blob lease. If a lock is left behind by an interrupted run, release it with the lock ID shown in the error message:

    terraform force-unlock LOCK_ID

Best Practices:
- Never edit state files manually
- Use terraform state mv for refactoring
- Enable versioning on the storage account (Azure Blob versioning)
- Restrict access to state storage (RBAC + private Endpoints)
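Newer Terraform versions offer declarative counterparts to terraform state mv and terraform import, which keep the change reviewable in version control. The sketch below assumes Terraform 1.1+ for moved blocks and 1.5+ for import blocks; the old resource address is hypothetical, and the import ID reuses the placeholder from the import command above.

```hcl
# Inside modules/aks: record a rename so existing state follows the new address
# instead of being destroyed and recreated.
moved {
  from = azurerm_kubernetes_cluster.main # hypothetical previous label
  to   = azurerm_kubernetes_cluster.aks
}

# In the root module (envs/production/Azure): adopt an existing cluster on the
# next apply. The subscription ID is left as the placeholder "xxx".
import {
  to = module.aks.azurerm_kubernetes_cluster.aks
  id = "/subscriptions/xxx/resourceGroups/rg-algesta-production/providers/Microsoft.ContainerService/managedClusters/aks-algesta-production"
}
```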
Common Operations
Initial Infrastructure Provisioning
1. Clone Repository:

    git clone https://dev.azure.com/tres-astronautas/Algesta/_git/ops-algesta
    cd ops-algesta/infrastructure/envs/production/Azure

2. Authenticate to Azure:

    az login
    az account set --subscription "Algesta Production"

3. Authenticate to MongoDB Atlas:

    export MONGODB_ATLAS_PUBLIC_KEY="xxx"
    export MONGODB_ATLAS_PRIVATE_KEY="xxx"

4. Initialize Terraform:

    terraform init

5. Plan Changes:

    terraform plan -out=tfplan

6. Apply Infrastructure:

    terraform apply tfplan

7. Save Outputs:

    terraform output -json > outputs.json

The JSON output includes sensitive values such as kube_config in plain text, so keep outputs.json out of version control.

Updating Infrastructure
Scenario: Increase AKS node pool max_count from 3 to 5
1. Edit Configuration:

    node_pools = {
      stdar_pool = {
        ...
        max_count = 5 # Changed from 3
      }
    }

2. Plan and Review:

    terraform plan
    # Review changes, ensure only the node pool is updated

3. Apply Changes:

    terraform apply

4. Verify in AKS:

    az aks show --resource-group rg-algesta-production --name aks-algesta-production \
      --query "agentPoolProfiles[].maxCount"

Adding New Secrets to Key Vault
1. Update Module Call:

    module "keyvault" {
      source = "../../../modules/secrets"
      ...
      secrets = {
        "mongodb-uri" = var.mongodb_uri
        "jwt-secret"  = var.jwt_secret
        "new-api-key" = var.new_api_key # New secret
      }
    }

2. Add Variable:

    variable "new_api_key" {
      description = "API key for external service"
      type        = string
      sensitive   = true
    }

3. Set Value in terraform.tfvars:

    new_api_key = "sk_live_xxxxxxxxxxxx"

4. Apply:

    terraform apply

Destroying Infrastructure (USE WITH CAUTION)
Development Environment Only:

    cd envs/dev/Azure
    terraform plan -destroy
    terraform destroy # Requires confirmation

Production:
Never run terraform destroy on production without explicit approval and backup verification.
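As an additional guard, Terraform's prevent_destroy lifecycle flag makes any plan that would delete a resource fail. The sketch below applies it to the container registry; the resource label is hypothetical and the variables mirror the inputs of the acr module call (assumed to be declared in the module's variables.tf). prevent_destroy only accepts a literal value, so it applies to every environment that uses the module, including dev.

```hcl
resource "azurerm_container_registry" "this" {
  name                = var.acr_name
  resource_group_name = var.resource_group_name
  location            = var.location
  sku                 = var.sku
  admin_enabled       = var.admin_enabled
  tags                = var.tags

  lifecycle {
    # terraform destroy (or any plan that deletes this registry) errors out
    # until this flag is removed in a reviewed change.
    prevent_destroy = true
  }
}
```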
Best Practices
Security
- Never Commit Secrets:
  - Use .gitignore for terraform.tfvars and *.tfstate
  - Store sensitive variables in Azure Key Vault or environment variables
- Use Managed Identities:
  - Avoid service principal credentials in code
  - Use AKS managed identity for Azure resource access
- Restrict Network Access:
  - Configure MongoDB IP whitelist to AKS egress IPs only
  - Use Azure Private Link for Key Vault, ACR, Storage
- Enable RBAC:
  - Use Azure RBAC for resource access control
  - Use Kubernetes RBAC for pod-level permissions
Operations
- Always Plan Before Apply:

      terraform plan -out=tfplan
      # Review changes thoroughly
      terraform apply tfplan

- Use Workspaces for Isolation:

      terraform workspace new staging
      terraform workspace select production

- Version Pin Providers:

      terraform {
        required_providers {
          azurerm = {
            source  = "hashicorp/azurerm"
            version = "~> 3.0"
          }
          mongodbatlas = {
            source  = "mongodb/mongodbatlas"
            version = "~> 1.0"
          }
        }
      }

- Document Changes:
  - Add comments in main.tf for complex logic
  - Update this wiki when infrastructure evolves
CI/CD Integration
- Terraform apply runs in the Azure Pipeline DeployToK8S stage (future implementation)
- Use pipeline secret variables for credentials
- Implement approval gates for production Terraform changes
Architecture Diagram
graph TB
subgraph "Azure Cloud"
subgraph "Resource Group: rg-algesta-production"
AKS[AKS Cluster<br/>aks-algesta-production<br/>Free Tier]
ACR[Azure Container Registry<br/>acralgestaproduction<br/>Basic SKU]
KV[Key Vault<br/>akv-algesta-production<br/>Secrets + Certificates]
SA[Storage Account<br/>storagealgestaproduction<br/>LRS]
subgraph "AKS Node Pools"
SYS[System Pool<br/>default<br/>1 node<br/>Standard_B2s]
USER[User Pool<br/>stdarproduction<br/>1-3 nodes<br/>Auto-scaling]
end
subgraph "Storage Containers"
TFS[tf-state<br/>Terraform State]
DATA[data<br/>Application Data]
end
end
end
subgraph "MongoDB Atlas"
MONGO[MongoDB Cluster<br/>M0 Free Tier<br/>US-EAST-1<br/>3-node ReplicaSet]
end
subgraph "Kubernetes Workloads"
NS_PROD[Namespace: production<br/>Microservices]
NS_MON[Namespace: monitoring<br/>Grafana + Prometheus + Loki]
INGRESS[Ingress<br/>nginx + cert-manager<br/>TLS via Let's Encrypt]
end
AKS --> SYS
AKS --> USER
USER --> NS_PROD
USER --> NS_MON
USER --> INGRESS
ACR -->|Image Pull| USER
KV -->|Secrets| USER
SA --> TFS
SA --> DATA
NS_PROD -->|Database Queries| MONGO
INGRESS -->|HTTPS Traffic| NS_PROD
style AKS fill:#0078d4,stroke:#004578,color:#fff
style ACR fill:#0078d4,stroke:#004578,color:#fff
style KV fill:#ffb900,stroke:#c29400,color:#000
style SA fill:#0078d4,stroke:#004578,color:#fff
style MONGO fill:#13aa52,stroke:#0e7a3a,color:#fff
Key Resource Relationships:
- AKS → ACR: Kubelet identity granted the AcrPull role for image pulls
- AKS → Key Vault: Pods access secrets via CSI driver (future implementation)
- Terraform State → Storage Account: Backend stores state in the tf-state container
- Microservices → MongoDB: Connection string stored in Key Vault, accessed via environment variables
Related Documentation:
- CI/CD Pipelines: Automated infrastructure provisioning
- Kubernetes Operations: AKS cluster management
- Security Operations: Key Vault and secret management
- Backup & Disaster Recovery: MongoDB Atlas backup configuration
For Support:
- Check Terraform plan output for errors
- Review Azure Portal for resource status
- Verify MongoDB Atlas cluster health in Atlas console
- Contact infrastructure team for access and credentials