A quick primer on DNS and how AWS Route53 Resolver and hybrid DNS work, with a sample architecture

Sudhir Kumar
8 min read · Mar 26, 2023

This post covers DNS architecture, Route53 DNS, DNS security, and how to resolve on-premises DNS zones/records from AWS and Route53 private hosted zones from on-premises.

Hybrid DNS with Route53 Resolver connects your on-premises DNS servers with DNS in Amazon Web Services (AWS), so workloads on either side can resolve names hosted on the other. Because the Resolver endpoints are managed by AWS, you do not have to run and patch your own DNS forwarders in the cloud, and you can manage forwarding rules for many accounts from one central DNS account.

The following topics are discussed in detail:

  1. About DNS and types of DNS Servers
  2. DNS record types
  3. How DNS resolution works (high level)
  4. DNS Solution (highly available DNS architecture - on-premises)
  5. Alternate method to resolve DNS queries in AWS
  6. Route53 Resolver (How it works)
  7. Outbound Resolution from AWS to on-premises
  8. Inbound Resolution from On-Premises to Route53 private hosted zones
  9. Route53 Resolver Query Logging
  10. DNS Error Types and Alerting
  11. AWS DNS/Route53 Security

What is DNS, and what types of DNS servers exist?

DNS is a decentralized, hierarchical naming system that resolves domain names to IP addresses (forward resolution) and IP addresses back to domain names (reverse resolution).
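To make the reverse direction concrete, here is a minimal sketch of how an IP address is turned into the name that a reverse (PTR) lookup actually queries. The helper name `reverse_lookup_name` and the sample address are made up for illustration; only the standard-library `ipaddress` module is used.

```python
import ipaddress


def reverse_lookup_name(ip):
    """Return the domain name that a PTR (reverse) query asks for.

    IPv4 addresses map into in-addr.arpa with the octets reversed,
    IPv6 addresses map into ip6.arpa nibble by nibble.
    """
    return ipaddress.ip_address(ip).reverse_pointer


# 192.0.2.10 is a documentation address, not a real host.
print(reverse_lookup_name("192.0.2.10"))  # 10.2.0.192.in-addr.arpa
```

A reverse lookup is therefore just a normal DNS query for a PTR record under a special domain.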

Types of DNS Servers:

  1. Authoritative DNS: These servers hold the actual DNS records, i.e. they are the source of truth for a zone.
  2. Recursive DNS: An intermediate server that resolves a query on the client's behalf by querying other name servers (root, TLD, authoritative) until it has the answer. Recursive and local DNS servers can also cache records.
  3. Caching DNS: Dedicated servers set up to cache DNS records. Caching lowers DNS latency and reduces load on authoritative DNS.
  4. Forwarding DNS: These servers forward queries to other DNS servers. A forwarder checks its local cache and, if there is no result, sends the query to another DNS server. The difference between a recursive resolver and a forwarder is that the forwarder does not try to work out the answer itself; it just forwards the request to another DNS server.

DNS Record Types:

This list of DNS record types is an overview of resource records (RRs) permissible in zone files of the Domain Name System (DNS).

TYPE     Value and meaning
A        a host address
NS       an authoritative name server
MD       a mail destination (Obsolete - use MX)
MF       a mail forwarder (Obsolete - use MX)
CNAME    the canonical name for an alias
SOA      marks the start of a zone of authority
MB       a mailbox domain name (EXPERIMENTAL)
MG       a mail group member (EXPERIMENTAL)
MR       a mail rename domain name (EXPERIMENTAL)
NULL     a null RR (EXPERIMENTAL)
WKS      a well known service description
PTR      a domain name pointer
HINFO    host information
MINFO    mailbox or mail list information
MX       mail exchange
TXT      text strings

How DNS resolution works (high level) :

EC2 (client) → DNS resolver IP (recursive + caching) → root name server → top-level domain (TLD) server → authoritative name server.
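The chain above can be sketched as a toy simulation. This is not a real resolver: the three dictionaries stand in for the root, TLD, and authoritative servers, and every server name, domain, and address in them is invented for illustration.

```python
# Toy delegation data (all names and addresses are made up).
ROOT = {"com": "tld-com"}                                    # root knows the TLD servers
TLD = {"tld-com": {"example.com": "auth-example"}}           # TLD knows the authoritative servers
AUTH = {"auth-example": {"www.example.com": "192.0.2.10"}}   # authoritative holds the records


def resolve(name):
    """Walk root -> TLD -> authoritative the way a recursive resolver would.

    Returns (ip_address, list_of_servers_asked).
    """
    fqdn = name.rstrip(".")
    labels = fqdn.split(".")
    path = ["root"]

    tld_server = ROOT[labels[-1]]            # ask root: who serves "com"?
    path.append(tld_server)

    zone = ".".join(labels[-2:])             # "example.com"
    auth_server = TLD[tld_server][zone]      # ask TLD: who serves "example.com"?
    path.append(auth_server)

    return AUTH[auth_server][fqdn], path     # ask authoritative for the record


ip, path = resolve("www.example.com")
print(ip, "via", " -> ".join(path))
```

A real recursive resolver does the same walk over the network and caches each answer for its TTL, which is why repeated lookups rarely reach the root.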

[Sequence diagram of the flow above, created with mermaid.js]

Clients can also cache DNS records. In a few cases, organizations also use ETP (Enterprise Threat Protector) DNS endpoints for secure internet access; these can analyze DNS-related activity.

ETP and other external DNS resolvers can cache DNS records as well. Build automation to flush DNS caches when your requirements call for it.
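As a rough illustration of what "cache" and "flush" mean here, the sketch below models a TTL-based record cache with a flush operation. The class name `DnsCache` and its methods are hypothetical; real resolvers such as BIND or Unbound have their own cache implementations and flush commands.

```python
import time


class DnsCache:
    """Minimal TTL cache: entries expire on their own or can be flushed early."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock so expiry can be tested
        self._entries = {}       # name -> (value, expiry timestamp)

    def put(self, name, value, ttl_seconds):
        self._entries[name] = (value, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:   # TTL elapsed: drop the stale record
            del self._entries[name]
            return None
        return value

    def flush(self, name=None):
        """Flush one name, or the whole cache when no name is given."""
        if name is None:
            self._entries.clear()
        else:
            self._entries.pop(name, None)
```

Flushing matters after a record change: without it, clients keep receiving the old answer until the TTL runs out.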

DNS Solution (highly available DNS architecture for on-premises):

The diagram shows the following flow:

The VM sends traffic to highly available DNS endpoints, i.e. multi-region load balancer virtual interfaces, which redirect it to the recursive DNS stacks (also multi-region).

  1. Directional DNS forwards traffic to a region based on the source VM's location. If the source VM is in Oregon, its request goes to the Oregon recursive DNS stack; if it is in Virginia, the traffic is redirected to Virginia.
  2. So the load balancer and DNS need to be smart enough to forward traffic based on the requester's origin (the same concept as in CloudFront).
  3. Once the recursive DNS receives the request, it forwards it to another load balancer virtual interface, i.e. the one in front of the authoritative DNS.
  4. The authoritative DNS sends the response back to the recursive DNS, which caches the record and sends the response back to the client.
  5. Caching at the recursive layer is helpful because it keeps repeat requests away from the authoritative DNS.
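The directional routing in steps 1 and 2 can be sketched as a small selection function. The region names come from the example above; the hostnames, the `healthy` set, and the fall-back-to-another-region behaviour are assumptions added for illustration, not a description of any particular load balancer.

```python
# Hypothetical recursive DNS stacks per region (hostnames are made up).
RECURSIVE_STACKS = {
    "oregon": "dns-recursive.oregon.example.internal",
    "virginia": "dns-recursive.virginia.example.internal",
}


def pick_recursive_stack(source_region, healthy):
    """Prefer the stack in the requester's own region, else any healthy one."""
    if source_region in RECURSIVE_STACKS and source_region in healthy:
        return RECURSIVE_STACKS[source_region]
    for region, endpoint in RECURSIVE_STACKS.items():
        if region in healthy:            # assumed failover to another region
            return endpoint
    raise RuntimeError("no healthy recursive DNS stack available")


print(pick_recursive_stack("oregon", healthy={"oregon", "virginia"}))
print(pick_recursive_stack("oregon", healthy={"virginia"}))  # falls back to Virginia
```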

You should have automation in place to flush the DNS cache on all recursive/caching DNS servers (if needed).

Logs can be enabled at each layer and sent to a SIEM tool for visualization.

Alternate method to resolve DNS in AWS :

Deploy DNS proxies in each account/region to resolve queries from on-premises to AWS and vice versa.

You can forward queries from on-premises to the cloud using any resolver, such as BIND or Unbound, in order to resolve private hosted zones.

Issues:

→ Dependency on the uptime of the EC2 proxies.

→ Patching the EC2 proxies for security bugs/vulnerabilities and sharing the patched images with multiple accounts.

→ Redeploying proxies and making sure they come up with the same IP address (by assigning them the same network interface). You will face some downtime during an AMI upgrade.

→ Running, managing, and monitoring proxies in 100+ AWS accounts can be costly and challenging.

This solution might be fine in the few scenarios where you want to take full control of DNS.

Route53 Resolver (how it works):

Inbound and outbound DNS endpoints: These endpoints are deployed in the central DNS account.

  • Deploying R53 Resolver endpoints creates private IP addresses in each AZ (or in the subnets you choose). Make sure routes exist between these internal IP addresses and the on-premises DNS resolvers, in both directions.
[Diagram source: AWS]
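As a sketch of what deploying an endpoint involves, the helper below builds the request parameters for the `create_resolver_endpoint` call of the boto3 `route53resolver` client. The helper name, the endpoint name, and the security group and subnet IDs are placeholders, and the two-subnet check is an assumption meant to reflect the one-IP-per-AZ layout described above. Nothing is sent to AWS unless you uncomment the last lines.

```python
import uuid


def build_endpoint_request(name, direction, security_group_id, subnet_ids):
    """Build keyword arguments for route53resolver.create_resolver_endpoint."""
    if direction not in ("INBOUND", "OUTBOUND"):
        raise ValueError("direction must be INBOUND or OUTBOUND")
    if len(subnet_ids) < 2:
        # Assumption: one IP per AZ, at least two AZs for availability.
        raise ValueError("use at least two subnets in different AZs")
    return {
        "CreatorRequestId": str(uuid.uuid4()),   # idempotency token
        "Name": name,
        "Direction": direction,
        "SecurityGroupIds": [security_group_id],
        # Omitting "Ip" lets AWS pick a free private IP in each subnet.
        "IpAddresses": [{"SubnetId": subnet_id} for subnet_id in subnet_ids],
    }


request = build_endpoint_request(
    "central-outbound",                   # placeholder name
    "OUTBOUND",
    "sg-0123456789abcdef0",               # placeholder security group
    ["subnet-aaaa1111", "subnet-bbbb2222"],
)

# With boto3 installed and credentials for the central DNS account:
#   import boto3
#   boto3.client("route53resolver").create_resolver_endpoint(**request)
```

The security group has to allow DNS (TCP and UDP port 53) between the endpoint IPs and the on-premises resolvers, otherwise the routes alone are not enough.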

Route53 Resolver honors the TTL (time to live) set by the authoritative server, but only up to 5 minutes. If the TTL is longer than 5 minutes, the cached record expires after 5 minutes.
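The TTL rule as stated above amounts to a simple clamp. This sketch only models that statement; the 5-minute cap is taken from this post rather than looked up in an API, and the function name is made up.

```python
MAX_CACHE_TTL_SECONDS = 5 * 60   # cap as described in this post


def effective_ttl(authoritative_ttl_seconds):
    """TTL actually applied to a cached record under the 5-minute cap."""
    return min(authoritative_ttl_seconds, MAX_CACHE_TTL_SECONDS)


print(effective_ttl(60))     # 60   (shorter TTLs are honored as-is)
print(effective_ttl(3600))   # 300  (longer TTLs are cut to 5 minutes)
```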

Outbound resolution (from AWS to on-premises):

Traffic flow:

EC2 instance → default VPC DNS IP (i.e. VPC CIDR base IP + 2) → shared Resolver rule (from the main DNS account) → R53 outbound endpoint → on-premises DNS endpoint (via IPsec VPN/Direct Connect) → DNS resolution by the on-premises authoritative server, which returns the answer to the EC2 instance via the same path in reverse.
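The "VPC CIDR + 2" address is easy to compute. A minimal sketch using only the standard library; the function name and the sample CIDR ranges are illustrative.

```python
import ipaddress


def vpc_resolver_ip(vpc_cidr):
    """Return the default VPC DNS IP: the base of the VPC CIDR plus two."""
    network = ipaddress.ip_network(vpc_cidr)
    return str(network.network_address + 2)


print(vpc_resolver_ip("10.0.0.0/16"))     # 10.0.0.2
print(vpc_resolver_ip("172.31.0.0/16"))   # 172.31.0.2
```

This is the address to query directly (for example with dig) when testing a Resolver rule before switching instances over to it.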

[Sequence diagram of the outbound flow above, created with mermaid.js]

Create Route53 Resolver rules and share them with participant accounts through AWS Resource Access Manager. The best way is to share Resolver rules with the organization ID rather than with each account ID. Using the AWS Organizations ID eases management and rolls the rules out to all current and future accounts under the same org.

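You can also add DNS exclusions (if needed), for example a system rule for a subdomain that should still be resolved by Route53 Resolver itself rather than forwarded.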
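When several rules cover a query name, Resolver uses the most specific match. The sketch below models that selection, including an exclusion expressed as a SYSTEM rule. The domains and target IPs are made up, and the matching logic is a simplified model for reasoning about rules, not AWS code.

```python
# Hypothetical rule set: forward corp.example.com on-premises,
# but let Resolver itself answer for aws.corp.example.com.
RULES = [
    {"domain": "corp.example.com", "type": "FORWARD",
     "targets": ["10.10.0.53", "10.20.0.53"]},
    {"domain": "aws.corp.example.com", "type": "SYSTEM", "targets": []},
]


def match_rule(query_name, rules):
    """Return the rule with the longest matching domain suffix, or None."""
    name = query_name.rstrip(".").lower()
    best = None
    for rule in rules:
        domain = rule["domain"].rstrip(".").lower()
        # Match the domain itself or any name below it (label boundary).
        if name == domain or name.endswith("." + domain):
            if best is None or len(domain) > len(best["domain"]):
                best = rule
    return best


print(match_rule("db.corp.example.com", RULES)["type"])       # FORWARD
print(match_rule("api.aws.corp.example.com", RULES)["type"])  # SYSTEM
print(match_rule("example.org", RULES))                       # None
```

Names that match no rule fall through to normal Resolver behaviour, so only the domains you list are sent on-premises.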

Associate the Resolver rules with VPCs in the participant accounts and test all DNS records against the default VPC DNS IP.

Once the change is confirmed, associate a DHCP options set with the VPC so that instances start using the default VPC DNS IP.

Check whether you have any custom settings:

  • Check the PEERDNS setting in your eth0/eth1 interface configuration. If PEERDNS=no, the DNS settings received over DHCP will not be written to resolv.conf, and the VM might keep using old/static DNS resolvers.

Inbound Resolution (from on-premises to Route53 private hosted zones):

Inbound resolution is for resolving Route53 private hosted zones from on-premises.

Traffic flow: VM (on-premises) → on-premises recursive DNS → R53 inbound DNS endpoint (over IPsec VPN/Direct Connect) → R53 Resolver at VPC CIDR + 2 → R53 Resolver resolves the query and returns the answer to the client via the same path in reverse.

[Sequence diagram of the inbound flow above, created with mermaid.js]

Make sure the DNS records of all private hosted zones are resolvable from the centralized DNS account, i.e. the zones are associated with the VPC that hosts the inbound endpoint.

Inbound endpoints allow DNS queries to your VPC from another VPC or on-premises.

Route53 Resolver Query Logging

Enable Route53 Resolver query logging at the account level. You can also share the logging configuration with other accounts via AWS Resource Access Manager.

You can log the following DNS queries:

  • Queries that originate in Amazon Virtual Private Cloud VPCs that you specify, as well as the responses to those DNS queries.
  • Queries from on-premises resources that use an inbound Resolver endpoint.
  • Queries that use an outbound Resolver endpoint for recursive DNS resolution.
  • Queries that use Route 53 Resolver DNS Firewall rules to block, allow, or monitor domain lists.

You can forward these logs to a centralized CloudWatch Logs log group or S3 bucket. Afterwards, you can run analytics tools on them or send them to your SIEM for further analysis and dashboards. For troubleshooting, you can monitor fields such as the DNS response code, response data, and source.
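A first pass over exported query logs could look like the sketch below. It assumes one JSON object per line with `query_name`, `rcode`, and `srcaddr` fields, which is how I understand the Resolver query log format; the sample records themselves are invented.

```python
import json
from collections import Counter

# Made-up sample records, trimmed to the fields used here.
SAMPLE_LOG_LINES = [
    '{"query_name": "db.corp.example.com.", "query_type": "A", "rcode": "NOERROR", "srcaddr": "10.0.1.15"}',
    '{"query_name": "old.corp.example.com.", "query_type": "A", "rcode": "NXDOMAIN", "srcaddr": "10.0.1.15"}',
    '{"query_name": "api.corp.example.com.", "query_type": "A", "rcode": "SERVFAIL", "srcaddr": "10.0.2.40"}',
    '{"query_name": "db.corp.example.com.", "query_type": "A", "rcode": "NOERROR", "srcaddr": "10.0.2.40"}',
]


def summarize(lines):
    """Count response codes and the sources that received non-NOERROR answers."""
    rcodes = Counter()
    failing_sources = Counter()
    for line in lines:
        record = json.loads(line)
        rcode = record.get("rcode", "UNKNOWN")
        rcodes[rcode] += 1
        if rcode != "NOERROR":
            failing_sources[record.get("srcaddr", "unknown")] += 1
    return rcodes, failing_sources


rcodes, failing_sources = summarize(SAMPLE_LOG_LINES)
print(dict(rcodes))
print(dict(failing_sources))
```

Grouping failures by source address is often the quickest way to tell a single misconfigured host from a broken forwarding rule.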

DNS Error Types and Alerting:

  1. NXDOMAIN: Domain name does not exist.
  2. SERVFAIL: DNS server is unable to resolve the query due to an error.
  3. REFUSED: DNS server refuses to process the query, often due to a lack of authorization.
  4. NOERROR: Query is successful and an answer is found.
  5. FORMERR: DNS server was unable to interpret the query due to a formatting error.
  6. NOTIMP: DNS server does not support the requested query type.
  7. YXDOMAIN: Domain name already exists.
  8. YXRRSET: Resource record set already exists.
  9. NOTAUTH: DNS server is not authoritative for the requested zone.
  10. NOTZONE: Name is not within the zone.
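These names correspond to the 4-bit RCODE field in the DNS message header. The sketch below decodes it from raw response bytes; the numeric values follow RFC 1035 and RFC 2136 (which also defines value 8, NXRRSET, not listed above), while the function name and the sample headers are made up.

```python
import struct

RCODES = {
    0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN",
    4: "NOTIMP", 5: "REFUSED", 6: "YXDOMAIN", 7: "YXRRSET",
    8: "NXRRSET", 9: "NOTAUTH", 10: "NOTZONE",
}


def rcode_from_header(message):
    """Extract the response code name from a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    _query_id, flags = struct.unpack("!HH", message[:4])
    code = flags & 0x000F                 # RCODE is the low 4 bits of the flags
    return RCODES.get(code, "RCODE%d" % code)


# Hand-built headers: ID, flags, then four zeroed section counts.
nxdomain = struct.pack("!HHHHHH", 0x1234, 0x8183, 1, 0, 0, 0)
success = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 1, 0, 0)
print(rcode_from_header(nxdomain))   # NXDOMAIN
print(rcode_from_header(success))    # NOERROR
```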

Enable alerting for all critical metrics related to DNS.

  1. Query Volume: The number of DNS queries received by the DNS server over a period of time.
  2. Latency: The time it takes for a DNS query to receive a response.
  3. Query Error Rate: The percentage of DNS queries that result in an error (e.g. NXDOMAIN, SERVFAIL).
  4. Cache Hit Ratio: The percentage of DNS queries that are resolved from the DNS server’s cache.
  5. Zone Transfer Activity: The frequency and volume of zone transfers between DNS servers.
  6. Server Availability: The percentage of time the DNS server is available and responsive.
  7. Resource Utilization: The amount of CPU, memory, and disk space used by the DNS server.
  8. Network Traffic: The amount of incoming and outgoing network traffic generated by the DNS server.
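As one example of turning a metric into an alert, the sketch below computes the query error rate from response-code counts and compares it with a threshold. The 5% default is an arbitrary placeholder, and counting every non-NOERROR response as an error is a simplification; in practice you may want to track NXDOMAIN separately from SERVFAIL.

```python
def error_rate(rcode_counts):
    """Percentage of queries that did not end in NOERROR."""
    total = sum(rcode_counts.values())
    if total == 0:
        return 0.0
    errors = total - rcode_counts.get("NOERROR", 0)
    return 100.0 * errors / total


def should_alert(rcode_counts, threshold_percent=5.0):
    """True when the error rate exceeds the (assumed) alert threshold."""
    return error_rate(rcode_counts) > threshold_percent


window = {"NOERROR": 90, "NXDOMAIN": 6, "SERVFAIL": 4}   # made-up 5-minute window
print(error_rate(window))     # 10.0
print(should_alert(window))   # True
```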

AWS DNS/Route53 Security:

DNS security is underrated yet very important. Here are a few important tips:

  1. Use Enterprise Threat Protector (ETP) endpoints for external DNS resolution. ETP provides policy-based defense against phishing, malware, ransomware, DNS tunneling, and other threat events.
  2. Enable logging/monitoring at each level to diagnose, debug, and trace back issues:
     • Enable Route53 Resolver query logging (logs DNS queries)
     • Enable CloudTrail logging (logs all Route53 Resolver API calls)
     • Enable AWS Config (records the configuration timeline of AWS resources)
     • Review Route53 IAM access (examine cross-account IAM roles, federated user access, and apps running on EC2 with an instance profile that grants Route53 access)
     • Enable GuardDuty to detect DNS-related threats

  3. Route53 Resolver DNS Firewall: It can be enabled at the org level via AWS Firewall Manager. DNS Firewall provides filtering for outbound DNS queries that pass through the Route 53 Resolver from applications within your VPCs. You can also configure DNS Firewall to send custom responses for queries to blocked domain names.
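To show how domain-list filtering behaves, here is a simplified model of rule evaluation. It assumes that rules are checked in priority order, that a leading `*.` matches subdomains only, and that unmatched queries are allowed; the domain names are invented, and this is a mental model rather than the DNS Firewall implementation.

```python
def matches(pattern, query_name):
    """Match an exact domain, or any subdomain for patterns starting with '*.'."""
    name = query_name.rstrip(".").lower()
    pattern = pattern.rstrip(".").lower()
    if pattern.startswith("*."):
        return name.endswith(pattern[1:])   # ".bad.example" suffix, apex excluded
    return name == pattern


def evaluate(query_name, rules):
    """Return the action of the first matching rule, or ALLOW when none match."""
    for rule in rules:                       # rules assumed sorted by priority
        if any(matches(p, query_name) for p in rule["domains"]):
            return rule["action"]
    return "ALLOW"


FIREWALL_RULES = [
    {"action": "ALLOW", "domains": ["safe.bad.example"]},
    {"action": "BLOCK", "domains": ["*.bad.example", "malware.example"]},
]

print(evaluate("safe.bad.example", FIREWALL_RULES))   # ALLOW
print(evaluate("c2.bad.example.", FIREWALL_RULES))    # BLOCK
print(evaluate("example.com", FIREWALL_RULES))        # ALLOW
```

Putting the allow list ahead of the block list is a deliberate choice here: it lets a known-good name survive a broad wildcard block.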

Thank you for taking the time to read this post.
