IT Architecture Introduction

The goal of this post is to present the general idea of what a company's IT architecture looks like, in order to give a little perspective on its complexity. It should be useful to someone who is starting to work in a CSIRT environment. The idea is to be complete without going too deep into the details of how each part of the IT environment connects to the others.

In the future, I would like to explore the possibility of automatically rebuilding a temporary infrastructure in the cloud using infrastructure as code (IaC), so that a ransomware attack has less impact on business operations. This will be the subject of a future post.

Two points I'd like to raise:

I haven't really tested this idea of automating the reconstruction of the IT infrastructure. I only think it's worth considering, as it may save IT staff a lot of time.
Terraform is meant to manage infrastructure components (servers, storage, networking devices, etc.). To manage the OS and applications, you may want to use another tool (such as Ansible or Puppet).

The Business Organization

Below is the typical organization of a business. It is divided into 3 parts:

The core business, which actually creates the goods or provides the services for the customers. It is often subdivided into Business Units, depending on the size of the business.
The sales team (including pre-sales, marketing, etc.), which manages relations with the clients.
The support functions, which are there to keep the organization running. This includes HR, Legal, IT, Finance, Facilities, etc.

I also took the liberty of dividing the IT team into 3 parts. I think it is a good idea to segregate their access rights when the IT department is large enough for this to make sense.

Basic business organization

Basic Network Zones

For security and IT management purposes, it is good practice to divide the network into multiple zones:

The internet represents the "outside". It is used both by visitors accessing the company's website and by internal users accessing external websites.
The user zone encompasses the free wifi for visitors and the employees, whether on-site or connected remotely using a VPN or a VDI.
The frontend is the first zone between the internet and the users or the servers. It usually contains network devices such as firewalls and load balancers.
The DMZ contains servers which can be accessed from both the internet and the users. It's where the web servers and mail servers are located.
The backend cannot be accessed directly from outside the organization's network. The backup servers, the databases and the management consoles are located there.

I also added an internal DMZ for software which only needs to be accessed internally (e.g. HR software). These services are usually not exposed to the internet.

Basic network zones
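
If these zones were recreated in Azure (the cloud used later in this post), the rule that the backend is only reachable from specific internal zones could be expressed as a network security group. The following is only a rough sketch: the variable names, the DMZ address range and the database port (1433, SQL Server) are assumptions made for illustration, not part of the architecture above.

```hcl
# Hypothetical inputs, named for this sketch only.
variable "zone_resource_group_name" {
  type = string
}

variable "zone_location" {
  type = string
}

variable "dmz_cidr" {
  description = "Address range of the DMZ zone (assumed value)"
  type        = string
  default     = "10.1.0.0/24"
}

resource "azurerm_network_security_group" "backend" {
  name                = "backend-nsg"
  location            = var.zone_location
  resource_group_name = var.zone_resource_group_name

  # Only the DMZ may reach the databases in the backend.
  security_rule {
    name                       = "allow-dmz-to-database"
    priority                   = 100
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "1433"
    source_address_prefix      = var.dmz_cidr
    destination_address_prefix = "*"
  }

  # Block every other flow coming from inside the virtual network,
  # which Azure's default rules would otherwise allow.
  security_rule {
    name                       = "deny-other-internal"
    priority                   = 4000
    direction                  = "Inbound"
    access                     = "Deny"
    protocol                   = "*"
    source_port_range          = "*"
    destination_port_range     = "*"
    source_address_prefix      = "VirtualNetwork"
    destination_address_prefix = "*"
  }
}
```

Inbound traffic from the internet needs no explicit rule here, because the default rules of a network security group already deny it. The group only takes effect once it is associated with the backend subnet.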

Advanced Network Architecture

Here is a more detailed view of the networks you could have in the different server zones (DMZ, Frontend, Backend). There are different services for both the clients (e.g. the external website) and the employees.

More advanced network architecture

First Steps After an Attack

In this case, we'll assume two things:

We've already extracted the necessary forensic data from the devices of interest and isolated any service which was not encrypted.
All backups were destroyed or encrypted and, according to the first findings of the forensic investigation, the attackers were present in the network for months and there are backdoors everywhere.

Since we have lost almost everything, we come to the conclusion that we need to rebuild the entire IT infrastructure from scratch.

The first few steps we need to take should create a new work environment for the IT team:

Create a new network zone for IT admins so that they can access both the internet and administration consoles without being part of the domain (some sort of guest wifi for admins).
Create new blank workstations which are not in the Active Directory. They will be used by the admins.
List the appliances you have (e.g. firewalls, routers, switches, workstations) as well as the services you need to bring back online (e.g. the ERP, mail servers).
Draw a new network architecture for all these appliances and services.

Once this has been done, we can really start working again. However, we first need to agree with management on what needs to be prioritized. For example, they may want to rebuild the payroll systems first if we're at the end of the month, whereas at the beginning of the month it may be in the company's best interest to start by rebuilding the services supporting the core business.

Automate the Reconstruction

Automating the reconstruction of a complex IT environment may not be worth the effort for a single company. However, it could pay off for a consulting firm, which may have to repeat this process multiple times.

The idea is to list the networks, users, groups, IP addresses, servers, etc. which need to exist, and then generate the infrastructure automatically so that business operations can continue as much as possible. This would however be temporary: the on-premises infrastructure will be used again once everything is back to normal, since the upfront investment in it has already been made.

Azure resource group hierarchy (©️ Azure)

Before you can create anything using Terraform, you need to create a resource group which will contain the resources you create. Here's how you can do this:


resource "azurerm_resource_group" "core" {
  name     = "core-resource-group"
  location = "West Europe"
}
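
For Terraform to create the resource group above, the configuration also needs to declare the Azure provider. A minimal sketch, assuming the 3.x series of the azurerm provider and credentials supplied outside the code (for example through an Azure CLI login); the version constraint is an assumption, not something the setup above depends on:

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0" # assumed version; adjust to the one you use
    }
  }
}

# The empty features block is mandatory for the azurerm provider.
provider "azurerm" {
  features {}
}
```

With this in place, `terraform init` downloads the provider, `terraform plan` shows what would be created, and `terraform apply` creates it.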

Here is a list of subnets to create:

Subnet	Purpose
10.0.0.0/24	Active Directory Domain Controllers
10.0.1.0/24	Email Servers
10.0.2.0/24	File Servers
10.0.3.0/24	Print Servers

And here is the Terraform code to build them:


resource "azurerm_virtual_network" "core" {
  name                = "core-network"
  location            = azurerm_resource_group.core.location
  resource_group_name = azurerm_resource_group.core.name
  address_space       = ["10.0.0.0/16"]

  # Inline subnet blocks have no "description" argument, so the purpose
  # of each subnet is recorded as a comment instead.
  # address_prefix is the azurerm 3.x spelling; newer provider versions
  # are expected to use address_prefixes = ["..."] here.

  # Active Directory Domain Controllers
  subnet {
    name           = "prod-ad-subnet"
    address_prefix = "10.0.0.0/24"
  }
  # Email Servers
  subnet {
    name           = "prod-exchange-subnet"
    address_prefix = "10.0.1.0/24"
  }
  # File Servers
  subnet {
    name           = "prod-share-subnet"
    address_prefix = "10.0.2.0/24"
  }
  # Print Servers
  subnet {
    name           = "prod-print-subnet"
    address_prefix = "10.0.3.0/24"
  }

  tags = {
    environment = "Production"
  }
}
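
The same subnets can also be declared as standalone resources generated from a map, which keeps the list of subnets in one place and makes their IDs easy to reference. This is only a sketch of that alternative: it assumes the inline subnet blocks are removed from the virtual network (the azurerm documentation warns against combining both styles on one network), and the `subnets` local and the `subnet_ids` output are names chosen for illustration.

```hcl
locals {
  # Subnet name => address range, mirroring the table above.
  subnets = {
    "prod-ad-subnet"       = "10.0.0.0/24" # Active Directory Domain Controllers
    "prod-exchange-subnet" = "10.0.1.0/24" # Email Servers
    "prod-share-subnet"    = "10.0.2.0/24" # File Servers
    "prod-print-subnet"    = "10.0.3.0/24" # Print Servers
  }
}

# One azurerm_subnet instance per entry in local.subnets.
resource "azurerm_subnet" "core" {
  for_each             = local.subnets
  name                 = each.key
  resource_group_name  = azurerm_resource_group.core.name
  virtual_network_name = azurerm_virtual_network.core.name
  address_prefixes     = [each.value]
}

# Expose the subnet IDs, keyed by subnet name.
output "subnet_ids" {
  value = { for name, subnet in azurerm_subnet.core : name => subnet.id }
}
```

Adding a subnet then only requires a new line in the map.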

Here is a list of servers to create:

IP	Name	Purpose
10.0.0.2	WIN-DC-01	Active Directory Domain Controller
10.0.0.3	WIN-DC-02	Active Directory Domain Controller
10.0.1.1	WIN-MAIL-01	Email Server
10.0.2.1	WIN-FILE-01	File Server
10.0.2.2	WIN-FILE-02	File Server

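And below is the Terraform code to build one of those servers (here, a file server). It creates the machine but does not configure it. Note that Azure reserves the first four addresses and the last address of every subnet, so in Azure the host addresses listed above would have to start at .4.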


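# Supplied at plan/apply time (e.g. through the TF_VAR_admin_password
# environment variable) instead of being hard-coded in the configuration.
variable "admin_password" {
  type      = string
  sensitive = true
}

# Look up the file server subnet declared inline in the virtual network.
data "azurerm_subnet" "share" {
  name                 = "prod-share-subnet"
  virtual_network_name = azurerm_virtual_network.core.name
  resource_group_name  = azurerm_resource_group.core.name
}

# Network interface with a fixed private IP in the file server subnet.
resource "azurerm_network_interface" "file_01" {
  name                = "win-file-01-nic"
  location            = azurerm_resource_group.core.location
  resource_group_name = azurerm_resource_group.core.name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = data.azurerm_subnet.share.id
    private_ip_address_allocation = "Static"
    private_ip_address            = "10.0.2.5"
  }
}

# The Windows server itself (created, but not configured).
resource "azurerm_windows_virtual_machine" "file_01" {
  name                = "WIN-FILE-01"
  resource_group_name = azurerm_resource_group.core.name
  location            = azurerm_resource_group.core.location
  size                = "Standard_F2"
  admin_username      = "adminuser"
  admin_password      = var.admin_password
  network_interface_ids = [
    azurerm_network_interface.file_01.id,
  ]

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "MicrosoftWindowsServer"
    offer     = "WindowsServer"
    sku       = "2016-Datacenter"
    version   = "latest"
  }
}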
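
To build every server in the table rather than a single one, the same two resources can be generated from a map with `for_each`. This is a sketch under a few assumptions: the `servers` local and the `subnet_ids` and `server_admin_password` variables are hypothetical names, the subnet IDs are passed in as an input, and the host addresses are shifted to start at .4 because Azure reserves the first four addresses of each subnet.

```hcl
variable "subnet_ids" {
  description = "Subnet name => subnet ID (hypothetical input)"
  type        = map(string)
}

variable "server_admin_password" {
  type      = string
  sensitive = true
}

locals {
  # Server name => subnet and static private IP.
  servers = {
    "WIN-DC-01"   = { subnet = "prod-ad-subnet", ip = "10.0.0.4" }
    "WIN-DC-02"   = { subnet = "prod-ad-subnet", ip = "10.0.0.5" }
    "WIN-MAIL-01" = { subnet = "prod-exchange-subnet", ip = "10.0.1.4" }
    "WIN-FILE-01" = { subnet = "prod-share-subnet", ip = "10.0.2.4" }
    "WIN-FILE-02" = { subnet = "prod-share-subnet", ip = "10.0.2.5" }
  }
}

# One network interface per server, in the subnet named in the map.
resource "azurerm_network_interface" "server" {
  for_each            = local.servers
  name                = "${lower(each.key)}-nic"
  location            = azurerm_resource_group.core.location
  resource_group_name = azurerm_resource_group.core.name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = var.subnet_ids[each.value.subnet]
    private_ip_address_allocation = "Static"
    private_ip_address            = each.value.ip
  }
}

# One Windows virtual machine per server, attached to its own interface.
resource "azurerm_windows_virtual_machine" "server" {
  for_each              = local.servers
  name                  = each.key
  resource_group_name   = azurerm_resource_group.core.name
  location              = azurerm_resource_group.core.location
  size                  = "Standard_F2"
  admin_username        = "adminuser"
  admin_password        = var.server_admin_password
  network_interface_ids = [azurerm_network_interface.server[each.key].id]

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "MicrosoftWindowsServer"
    offer     = "WindowsServer"
    sku       = "2016-Datacenter"
    version   = "latest"
  }
}
```

Keeping the inventory in a single map means the list drawn up after the attack translates almost directly into code. The machines are still unconfigured at this point.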

Note: this is a short introduction to the possibilities of IaC for disaster recovery. Hopefully, we'll explore this in another post in the future.

Application Infrastructure Complexity

We've seen how we can quickly deploy servers and networks. However, you need to be careful not to try to automate everything immediately after an attack, as it may take too much time and make the company lose money by not getting back to business quickly enough.

To better showcase the complexity an application can have, here is a diagram showing how a website could be implemented (infrastructure-wise) in a company:

Complexity in the IT architecture for a website

Conclusion

While we can somewhat quickly redeploy some services (and especially the infrastructure supporting them), we cannot easily get all our services running as they did before the attack. This endeavor will take a lot of time, which means the decision makers at your company have to prioritize the most essential services to rebuild first.