Cloud Networking & Storage
VPCs, subnets, S3 buckets, and block storage — the plumbing of every cloud application. Here's how data moves and where it lives, explained without drowning in acronyms.
The startup that got a $47,000 bill
A developer at a startup spun up a test database on AWS. He chose the wrong storage class — provisioned IOPS SSD instead of general purpose. He forgot about it. Three weeks later, the AWS bill arrived: $47,000. For a test database nobody was using.
Cloud storage is cheap. Cloud storage you do not understand is very, very expensive. And cloud networking — VPCs, subnets, security groups — is the invisible plumbing that connects everything. Get it wrong, and your database is exposed to the internet. Get it right, and attackers cannot reach it even if they know the IP address.
This module is the "save yourself from a surprise bill" guide. By the end, you will be able to design a VPC with public and private subnets, choose the right storage type and class for any workload, and audit a cloud bill for waste.
You saw VPCs and security groups briefly in Module 4 (cloud security). You learned about storage and networking as core service categories in Module 1. Now we go hands-on with the plumbing — how it actually works, what costs what, and where the $47,000 mistakes hide.
Cloud networking: your private internet
When you deploy an application in the cloud, it needs a network — just like your home needs Wi-Fi. But in the cloud, you build the network yourself.
Virtual Private Cloud (VPC)
A VPC is your own private section of the cloud. Think of it as renting a floor in an office building — you share the building with other tenants, but your floor is completely private. You control who gets in, where the walls go, and how rooms connect.
| Concept | What it is | Analogy |
|---|---|---|
| VPC | Your isolated network in the cloud | Your private floor in the building |
| Subnet | A section of your VPC | A room on your floor |
| Public subnet | A subnet accessible from the internet | A room with a window facing the street |
| Private subnet | A subnet NOT accessible from the internet | An interior room with no windows |
| Route table | Rules for where traffic goes | Hallway signs pointing to rooms |
| Internet Gateway | Connects your VPC to the internet | The front door of the building |
| NAT Gateway | Lets private subnets access the internet without being accessible from it | A one-way mirror — you can see out, nobody can see in |
Step 1: Create a VPC with a CIDR block (your address range, e.g., 10.0.0.0/16 = 65,536 addresses)
Step 2: Create subnets — public ones for web servers, private ones for databases
Step 3: Attach an Internet Gateway for public internet access
Step 4: Configure route tables so traffic flows correctly
Step 5: Set up security groups (firewall rules per instance) and NACLs (firewall rules per subnet)
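The first four steps can be sketched with Python's standard `ipaddress` module. This is only a local model of the layout, not a call to any cloud API: the subnet names and the route-table dictionaries are illustrative assumptions, and the "public means a default route to the Internet Gateway" rule is the usual convention rather than something this module defines.

```python
import ipaddress

# Step 1: the VPC address range used in the text.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Step 2: carve /24 subnets out of the VPC (subnet names are hypothetical).
blocks = list(vpc.subnets(new_prefix=24))
subnets = {
    "public-web": blocks[0],   # 10.0.0.0/24
    "private-db": blocks[1],   # 10.0.1.0/24
}

# Steps 3-4: one route table per subnet, modeled as a plain dict.
# "local" keeps traffic inside the VPC; the default route 0.0.0.0/0
# decides whether the subnet has a path to the internet.
route_tables = {
    "public-web": {"10.0.0.0/16": "local", "0.0.0.0/0": "internet-gateway"},
    "private-db": {"10.0.0.0/16": "local"},
}


def is_public(subnet_name: str) -> bool:
    """Treat a subnet as public when its default route targets the Internet Gateway."""
    return route_tables[subnet_name].get("0.0.0.0/0") == "internet-gateway"


for name, block in subnets.items():
    kind = "public" if is_public(name) else "private"
    print(f"{name}: {block} ({kind})")
```

In this model, the only difference between the two subnets is one route table entry, which is why route tables deserve as much attention as the subnets themselves.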
There Are No Dumb Questions
Why would I put anything in a private subnet?
Databases, internal APIs, and backend services should never be directly accessible from the internet. A private subnet means attackers cannot reach them even if they know the IP address. Your web server in the public subnet talks to the database in the private subnet — but the internet cannot.
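One way to picture "the web server can reach the database, but the internet cannot" is as an allow-list check, similar in spirit to the security group rules from Step 5. The sketch below is a simplified model, not the AWS API: the `Rule` class, the rule sets, and the example IP addresses are all assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    port: int    # destination port the rule opens
    source: str  # CIDR range that is allowed to connect


# Hypothetical rule sets: the web server accepts HTTPS from anywhere,
# the database accepts PostgreSQL traffic only from the public web subnet.
WEB_RULES = [Rule(port=443, source="0.0.0.0/0")]
DB_RULES = [Rule(port=5432, source="10.0.0.0/24")]


def is_allowed(rules, source_ip: str, port: int) -> bool:
    """Return True if any rule opens this port to this source address."""
    ip = ipaddress.ip_address(source_ip)
    return any(
        rule.port == port and ip in ipaddress.ip_network(rule.source)
        for rule in rules
    )


print(is_allowed(WEB_RULES, "203.0.113.7", 443))   # internet -> web server: True
print(is_allowed(DB_RULES, "10.0.0.15", 5432))     # web server -> database: True
print(is_allowed(DB_RULES, "203.0.113.7", 5432))   # internet -> database: False
```

In a real deployment the private subnet's routing already keeps internet traffic away from the database. Rules like these add a second layer on top of that.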
What is a CIDR block?
It is a way of defining IP address ranges. 10.0.0.0/16 means "all addresses starting with 10.0" — that gives you 65,536 addresses. 10.0.1.0/24 means "all addresses starting with 10.0.1" — that gives you 256 addresses. The smaller the number after the slash, the bigger the network.
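The arithmetic behind those numbers is easy to check with Python's standard `ipaddress` module. A short sketch follows; the `/28` block and the two test addresses are extra examples added here for comparison, not values from the module.

```python
import ipaddress


def address_count(prefix_length: int) -> int:
    """An IPv4 address has 32 bits; the prefix fixes some, the rest are free."""
    return 2 ** (32 - prefix_length)


for cidr in ("10.0.0.0/16", "10.0.1.0/24", "10.0.1.0/28"):
    net = ipaddress.ip_network(cidr)
    # num_addresses agrees with the manual formula above.
    assert net.num_addresses == address_count(net.prefixlen)
    print(f"{cidr}: {net.num_addresses} addresses, {net[0]} to {net[-1]}")

vpc = ipaddress.ip_network("10.0.0.0/16")
print(ipaddress.ip_address("10.0.42.7") in vpc)   # starts with 10.0, so inside
print(ipaddress.ip_address("10.1.0.1") in vpc)    # starts with 10.1, so outside
```

Each extra bit in the prefix halves the block, which is why a /24 is 256 times smaller than a /16.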
Design a VPC
You are deploying a web application with a frontend, an API server, and a database. Design the network:
1. How many subnets do you need? Which are public, which are private?
2. Where does the frontend go? The API? The database?
3. What connects the frontend to the internet?
4. How does the API talk to the database if they are in different subnets?
5. Can the database access the internet? Should it?
Load balancers and CDNs
Load balancers distribute incoming traffic across multiple servers. If one server is overloaded or crashes, the load balancer sends traffic to the others. Like a restaurant host seating guests at different tables instead of cramming everyone at table 1.
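The restaurant-host idea can be sketched as a small round-robin balancer that skips servers marked unhealthy. This is a toy model under stated assumptions: the class name, method names, and server names are made up for illustration, and managed load balancers use more sophisticated algorithms and real health checks.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in turn, skipping any that are marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        # At least one server is healthy, so this loop always terminates.
        while True:
            server = next(self._ring)
            if server in self.healthy:
                return server


lb = RoundRobinBalancer(["web-1", "web-2", "web-3"])
print([lb.next_server() for _ in range(4)])  # ['web-1', 'web-2', 'web-3', 'web-1']

lb.mark_down("web-2")                        # simulate a crashed server
print([lb.next_server() for _ in range(4)])  # ['web-3', 'web-1', 'web-3', 'web-1']
```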
CDNs (Content Delivery Networks) cache your content at edge locations around the world. A user in Tokyo gets your website from a server in Tokyo, not from Virginia. Faster load times, lower bandwidth costs. Remember from Module 3 where we discussed edge locations as "pop-up kiosks" — CDNs are the service that uses those edge locations.
DNS services translate human-readable domain names (like theocto.net) into IP addresses that computers use. Every cloud provider offers a managed DNS service. You register your domain, point it at your cloud resources, and the DNS service handles the rest — including health checks that can automatically route traffic away from unhealthy endpoints.
| Service | AWS | Azure | GCP |
|---|---|---|---|
| Load balancer | ALB / NLB | Azure Load Balancer | Cloud Load Balancing |
| CDN | CloudFront | Azure CDN | Cloud CDN |
| DNS | Route 53 | Azure DNS | Cloud DNS |
The names differ across providers, but the concepts are identical — exactly as Module 2 explained. If you learn how VPCs and load balancers work on AWS, you can apply the same mental model to Azure VNets and GCP VPCs within days.
Cloud storage: where your data lives
Cloud storage comes in three flavors. Choosing the wrong one is how you get a $47,000 bill.
Object storage (S3 / Blob / GCS)
What it is: A service that stores any file (images, videos, backups, logs) as objects in buckets. There is no real folder hierarchy; folders are simulated with key prefixes. Capacity is effectively unlimited.
When to use it: Static assets, backups, data lakes, website hosting, media files.
Key feature: Storage classes for cost optimization:
| Storage class (AWS) | Use case | Cost per GB/month |
|---|---|---|
| S3 Standard | Frequently accessed data | ~$0.023 |
| S3 Infrequent Access | Data accessed less than once/month | ~$0.0125 |
| S3 Glacier | Long-term archive (retrieval takes hours) | ~$0.004 |
| S3 Glacier Deep Archive | Compliance archives (retrieval takes up to 12 hours, or up to 48 hours with bulk retrieval) | ~$0.00099 |
The 23x cost difference between Standard and Deep Archive is why lifecycle policies matter. If you store 10 TB of logs in Standard when they should be in Glacier, you are paying $230/month instead of $40. Over a year, that is $2,280 wasted — on a single bucket.
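The arithmetic in that paragraph can be reproduced in a few lines. The prices are the approximate figures from the table above, not live AWS pricing, and the sketch assumes 1 TB = 1,000 GB, which is what the $230 figure implies. `Decimal` is used to avoid floating-point rounding noise in the dollar amounts.

```python
from decimal import Decimal

# Approximate prices per GB-month, copied from the table above.
PRICE_PER_GB_MONTH = {
    "S3 Standard": Decimal("0.023"),
    "S3 Infrequent Access": Decimal("0.0125"),
    "S3 Glacier": Decimal("0.004"),
    "S3 Glacier Deep Archive": Decimal("0.00099"),
}


def monthly_cost(size_gb: int, storage_class: str) -> Decimal:
    """Storage cost only; request, retrieval, and transfer fees are ignored."""
    return Decimal(size_gb) * PRICE_PER_GB_MONTH[storage_class]


size_gb = 10_000  # 10 TB of logs, assuming 1 TB = 1,000 GB
for name in PRICE_PER_GB_MONTH:
    print(f"{name}: ${monthly_cost(size_gb, name):,.2f}/month")

wasted_per_year = 12 * (
    monthly_cost(size_gb, "S3 Standard") - monthly_cost(size_gb, "S3 Glacier")
)
print(f"Standard instead of Glacier: ${wasted_per_year:,.2f} wasted per year")
```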
Block storage (EBS / Managed Disks)
What it is: Virtual hard drives that attach to virtual machines. Fast, consistent performance. Fixed size — you pay for the capacity you provision, not what you use.
When to use it: Operating system drives, databases, applications that need low-latency disk access.
The $47,000 mistake: Provisioned IOPS SSD (io1/io2) can cost 10-50x more than general purpose (gp3) for the same capacity, because you pay for the IOPS you provision on top of the storage itself. Reserve provisioned IOPS for mission-critical databases that need guaranteed performance.
Object storage (S3)
- Store any file type as objects in buckets
- Virtually unlimited capacity
- Pay for what you store
- Best for: media files, backups, data lakes
- Access via HTTP API
Block storage (EBS)
- Virtual hard drives attached to VMs
- Fixed size: pay for provisioned capacity
- Fast, consistent low-latency performance
- Best for: databases, OS drives
- Access via file system (mount as a drive)
File storage (EFS / Azure Files / Filestore)
What it is: Shared file systems that multiple servers can access simultaneously. Like a shared network drive.
When to use it: Applications where multiple servers need to read/write the same files (content management, shared configurations). If you have ever used a shared network drive at work, file storage is the cloud equivalent.
Cost note: File storage is the most expensive of the three types per GB. Use it only when you genuinely need shared access. If each server can work with its own copy, block storage is cheaper. If the files are static assets, object storage is cheapest.
Pick the right storage
Classify each scenario into the best storage type.
Categories: Object storage (S3/Blob), Block storage (EBS), File storage (EFS), S3 Glacier
1. 10 million product images for an e-commerce site → ___
2. A PostgreSQL database for your main application → ___
3. 5 years of financial records you must keep for compliance but rarely access → ___
4. Log files that three different servers need to write to simultaneously → ___
5. A machine learning training dataset of 500 GB that you access weekly → ___
Hint: Images and backups are objects. Databases need low-latency block storage. Compliance archives go to Glacier. Shared files across servers need file storage.
Cost optimization: do not be the $47,000 guy
Cloud cost optimization is not a one-time exercise — it is a continuous practice. The pricing strategies you learned in Module 1 (on-demand, reserved, spot) combine with the storage classes above and the resource management habits below to form a complete cost picture. The best cloud engineers review costs weekly.
Right-size instances: Many cloud VMs sit at 10-20% average CPU utilization. Check actual usage with monitoring tools, then move over-provisioned instances to a smaller size.
Use reserved instances: If you know you will need a server for 1-3 years, reserved pricing saves 30-72% over on-demand.
Lifecycle policies: Automatically move old S3 objects to cheaper storage classes, for example to Infrequent Access after 30 days and to Glacier after 90 days.
Set billing alerts: ALWAYS set a budget alert. "Email me when spending exceeds $100/month." This prevents surprises.
Delete unused resources: Unattached EBS volumes, idle load balancers, orphaned snapshots — they all cost money silently.
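The 30-day and 90-day lifecycle example can be written down as a rule and simulated locally. The dictionary below is a sketch that mirrors the general shape of an S3 lifecycle configuration (the structure that boto3's `put_bucket_lifecycle_configuration` accepts); check the current AWS documentation before using it. The rule ID, the `logs/` prefix, and the `storage_class_for_age` helper are assumptions introduced for illustration, and nothing here calls AWS.

```python
# A lifecycle rule expressed as plain data. No AWS call is made here.
LIFECYCLE_CONFIG = {
    "Rules": [
        {
            "ID": "age-out-logs",            # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},   # hypothetical key prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}


def storage_class_for_age(age_days: int, rule: dict) -> str:
    """Local simulation: which class would an object of this age be in?"""
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = LIFECYCLE_CONFIG["Rules"][0]
for age in (7, 45, 120):
    print(f"{age} days old -> {storage_class_for_age(age, rule)}")
```

Once a rule like this is attached to a bucket, the transitions happen without anyone having to remember them, which is the point of the practice.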
There Are No Dumb Questions
"Should I always choose the cheapest storage class?"
No. Glacier is cheap to store data but expensive to retrieve it — retrieval from Deep Archive can take 12+ hours and costs $0.02/GB. If you need to access your data regularly, the "cheaper" class can end up costing more. Match the storage class to your access pattern, not just the price per GB. This is exactly the kind of trade-off tested on both the AWS Cloud Practitioner (Module 5) and AZ-900 (Module 6) exams.
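A rough way to see the trade-off is to add storage and retrieval costs together for a given access pattern. The sketch below uses the approximate prices quoted in this module and assumes S3 Standard has no per-GB retrieval fee. Request and data transfer charges are ignored, and the function names are made up for illustration.

```python
# Approximate prices from this module, in dollars per GB.
STANDARD_STORAGE = 0.023
DEEP_ARCHIVE_STORAGE = 0.00099
DEEP_ARCHIVE_RETRIEVAL = 0.02   # Standard is assumed to have no retrieval fee


def monthly_total(size_gb, storage_price, retrieval_price, retrieved_gb):
    """Monthly storage cost plus the cost of the data pulled back out."""
    return size_gb * storage_price + retrieved_gb * retrieval_price


def cheaper_class(size_gb, retrieved_gb):
    standard = monthly_total(size_gb, STANDARD_STORAGE, 0.0, retrieved_gb)
    deep = monthly_total(
        size_gb, DEEP_ARCHIVE_STORAGE, DEEP_ARCHIVE_RETRIEVAL, retrieved_gb
    )
    return "Deep Archive" if deep < standard else "Standard"


SIZE_GB = 1_000
for retrieved_gb in (0, 500, 2_000):
    winner = cheaper_class(SIZE_GB, retrieved_gb)
    print(f"retrieve {retrieved_gb} GB/month -> cheaper: {winner}")
```

With these numbers, Deep Archive stops being cheaper once you read back slightly more than the full data set each month. The hours-long wait for each retrieval is a separate cost that the arithmetic does not capture.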
"Do I really need a NAT Gateway? They cost $32/month even when idle."
Only if your private subnet instances need to reach the internet (for software updates, API calls, etc.). If your private instances only talk to other services within your VPC, you can skip the NAT Gateway and save $32/month. For many small deployments, VPC endpoints are a cheaper alternative for reaching AWS services like S3 and DynamoDB without internet access.
Optimize this cloud bill
Your company's monthly cloud bill is $8,500. Here is the breakdown:
- EC2 instances: $4,200 (10 instances, all on-demand, avg 15% CPU utilization)
- EBS storage: $1,800 (includes 500 GB of provisioned IOPS SSD for a test database)
- S3 storage: $600 (2 TB of logs in Standard, never accessed after 7 days)
- Data transfer: $900 (serving images from us-east-1 to users worldwide)
- Other: $1,000 (3 idle load balancers, 5 unattached EBS volumes)
Write a cost optimization plan. For each line item, suggest a specific change and estimate the savings.
Next up: You now have the technical foundation (cloud concepts, architecture, security, certification prep, networking, and storage). In the final module, you will turn all of this into a career plan, covering which roles exist, what they pay, and exactly how to land your first cloud job.
Key takeaways
- A VPC is your private network in the cloud — public subnets face the internet, private subnets do not
- Security groups and NACLs are your cloud firewalls — control what traffic goes where
- Object storage (S3) is for files, block storage (EBS) is for disks, file storage (EFS) is for sharing
- S3 storage classes save money: Standard for frequent access, Glacier for archives
- Load balancers distribute traffic, CDNs speed up delivery worldwide
- Set billing alerts, right-size instances, and delete unused resources: industry surveys commonly estimate that around 30% of cloud spend is wasted
- Both the AWS Cloud Practitioner and AZ-900 exams test storage and networking concepts heavily — this module is direct exam prep
- The $47,000 bill happened because nobody understood storage types. Now you do.
Knowledge Check
1. What is the purpose of putting a database in a private subnet?
2. You have 2 TB of log files in S3 Standard that are never accessed after 7 days. What should you do?
3. What is a NAT Gateway used for?
4. Why is 30% of cloud spending considered waste?