Working with Files & Data
Reading files, writing CSVs, parsing JSON, and calling APIs — how to get real-world data in and out of your Python programs.
The intern who automated 4 hours of work in 12 lines
In 2019, a marketing intern at a mid-size e-commerce company was given a daily task: download a CSV report from the analytics dashboard, open it in Excel, filter out rows where revenue was below $10, calculate the daily average, and email a summary to the team. It took about 4 hours every day — downloading, clicking, copying, pasting, formatting.
After learning basic Python, the intern wrote 12 lines of code that did the entire job in 3 seconds. Download the file, filter it, calculate the average, format the summary. The manager was so impressed that the intern was promoted within three months.
That intern did not use machine learning. Did not use AI. Just read a file, processed the data, and wrote the output. The most practically valuable Python skill is not fancy algorithms — it is moving data in and out of files and APIs.
In Module 5, you learned to organize data in memory using lists and dictionaries. But when your program ends, that data vanishes. This module teaches you to make data permanent — reading it from files and writing it back.
Reading and writing text files
The simplest way to work with data is plain text files. Python's built-in open() function handles this.
# Writing to a file
with open("notes.txt", "w") as file:
    file.write("Line 1: Hello from Python\n")
    file.write("Line 2: This is a text file\n")
    file.write("Line 3: Easy, right?\n")

# Reading a file
with open("notes.txt", "r") as file:
    content = file.read()
    print(content)

# Reading line by line
with open("notes.txt", "r") as file:
    for line in file:
        print(line.strip())  # strip() removes trailing newline

| Mode | What it does | Creates file? |
|---|---|---|
| "r" | Read only | No (crashes if file missing) |
| "w" | Write (overwrites everything) | Yes |
| "a" | Append (adds to end) | Yes |
| "r+" | Read and write | No |
There Are No Dumb Questions
"What does 'w' mode do if the file already exists?"
It erases everything and starts fresh. This is the most dangerous file mode for beginners: you can accidentally delete hours of data with one `open("important.txt", "w")`. If you want to add to a file without erasing, use `"a"` (append) mode. Always double-check your mode before running.
"What is `\n`?"
It is a "newline character": it tells the computer "go to the next line." When you press Enter in a text editor, it inserts a `\n` behind the scenes. When reading files, `strip()` removes it from the end of each line.
Which File Mode?
25 XP
Match each task to the right file mode: "r" (read), "w" (write/overwrite), or "a" (append).
1. Read a configuration file at program startup
2. Save a brand new report to disk
3. Append today's log entry to an existing log file
4. Overwrite a settings file with updated values
5. Add a new row to the bottom of an existing CSV
6. Read a list of usernames from a text file
_Hint: Read mode (r) is for loading existing data without changing it. Write mode (w) creates or overwrites; use it for new files or complete rewrites. Append mode (a) adds to the end without erasing; use it for logs, growing CSVs, or cumulative data._
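To see the write and append modes from this section in action, here is a minimal sketch. It works inside a throwaway temporary directory so no real files are touched, and the file name `demo.txt` is made up for illustration.

```python
import tempfile
from pathlib import Path

# Work in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.txt"  # hypothetical file name

    # "w" creates the file, or wipes it if it already exists.
    with open(path, "w") as file:
        file.write("first\n")
    with open(path, "w") as file:
        file.write("second\n")  # "first" is gone now

    # "a" keeps what is there and adds to the end.
    with open(path, "a") as file:
        file.write("third\n")

    # Each line read from a file still ends with its newline character.
    with open(path, "r") as file:
        raw_lines = [line for line in file]
    clean_lines = [line.strip() for line in raw_lines]

print(raw_lines)    # ['second\n', 'third\n']
print(clean_lines)  # ['second', 'third']
```

Note that `strip()` removes all leading and trailing whitespace, not only the newline. If you want to remove just the newline, `line.rstrip("\n")` does that.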
CSV files — the spreadsheet of programming
CSV (Comma-Separated Values) is one of the most common formats for exchanging data. Nearly every spreadsheet app can export it, and most databases can import it. It is just text with commas between values:
name,age,city,salary
Alice,30,London,75000
Bob,25,New York,68000
Charlie,35,Tokyo,82000
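Because CSV is plain text, it is tempting to split each line on commas by hand. The sketch below shows where that breaks down: a value that itself contains a comma is wrapped in quotes, and a plain `split(",")` cuts it in half. The "Smith, John" row is made up for illustration and is not part of the sample file above.

```python
import csv

simple_line = "Bob,25,New York,68000"
print(simple_line.split(","))  # ['Bob', '25', 'New York', '68000']

# A value that contains a comma is wrapped in quotes in CSV.
# This row is hypothetical, made up to show the problem.
tricky_line = '"Smith, John",41,Paris,90000'
naive = tricky_line.split(",")
print(len(naive))  # 5 pieces: the name was cut in half

# csv.reader understands the quoting rules and returns 4 fields.
parsed = next(csv.reader([tricky_line]))
print(parsed)  # ['Smith, John', '41', 'Paris', '90000']
```

This is the main reason to reach for a proper CSV parser instead of string splitting, even for small files.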
Python has a built-in csv module:
import csv

# Reading a CSV file (newline="" lets the csv module handle line endings itself)
with open("employees.csv", "r", newline="") as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(f"{row['name']} earns ${row['salary']}")

# Writing a CSV file
data = [
    {"name": "Alice", "score": 95},
    {"name": "Bob", "score": 87},
    {"name": "Charlie", "score": 92},
]
with open("scores.csv", "w", newline="") as file:
    writer = csv.DictWriter(file, fieldnames=["name", "score"])
    writer.writeheader()
    writer.writerows(data)

Analyze a CSV
25 XP
Create a file called `sales.csv` with this content:
```
product,units_sold,price
Widget A,150,9.99
Widget B,89,24.99
Widget C,210,4.99
Widget D,45,49.99
Widget E,178,14.99
```
Then write a Python script that:
1. Reads the CSV
2. Calculates the revenue for each product (units_sold * price)
3. Finds the product with the highest revenue
4. Prints a summary

_Hint: Use `csv.DictReader`. Convert `units_sold` to `int` and `price` to `float`. Track the best product as you loop._

JSON — the language of APIs
JSON (JavaScript Object Notation) is how data moves across the internet. When an app fetches weather data, user profiles, or stock prices, it arrives as JSON. And JSON looks almost identical to Python dictionaries:
{
  "name": "Alice",
  "age": 30,
  "skills": ["Python", "SQL", "Excel"],
  "address": {
    "city": "London",
    "country": "UK"
  }
}

Python's json module converts between JSON strings and Python data:
import json

# Python dict → JSON string
data = {"name": "Alice", "age": 30, "skills": ["Python", "SQL"]}
json_string = json.dumps(data, indent=2)
print(json_string)

# JSON string → Python dict
json_text = '{"name": "Bob", "age": 25}'
parsed = json.loads(json_text)
print(parsed["name"])  # "Bob"

# Read JSON from a file
with open("data.json", "r") as file:
    data = json.load(file)

# Write JSON to a file
with open("output.json", "w") as file:
    json.dump(data, file, indent=2)

| Function | Direction | Works with |
|---|---|---|
| json.dumps() | Python → JSON string | Object in memory (e.g. a dictionary) |
| json.loads() | JSON string → Python | String variable |
| json.dump() | Python → JSON file | Open file (writing) |
| json.load() | JSON file → Python | Open file (reading) |
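JSON and Python dictionaries look alike, but they are not identical. As a small illustration, the sketch below sends a made-up record through `json.dumps()` and back through `json.loads()` to show how a few Python values are written in JSON and what comes back.

```python
import json

# A made-up record using several Python types.
record = {
    "name": "Alice",
    "active": True,
    "manager": None,
    "scores": (95, 87),  # a tuple
}

text = json.dumps(record)
print(text)
# {"name": "Alice", "active": true, "manager": null, "scores": [95, 87]}

restored = json.loads(text)
print(restored["scores"])  # [95, 87] (a list now, not a tuple)
print(restored == record)  # False, because (95, 87) != [95, 87]
```

`True` becomes `true`, `None` becomes `null`, and tuples come back as lists, so a round trip through JSON does not always return exactly what you put in.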
Working with APIs — getting live data
An API (Application Programming Interface) lets one program talk to another. A web API, the kind used here, is a URL that returns data instead of a web page: you send a request, and the server sends back a response, very often as JSON.
import urllib.request
import json

# Fetch data from a public API
url = "https://api.open-meteo.com/v1/forecast?latitude=40.71&longitude=-74.01&current_weather=true"
with urllib.request.urlopen(url) as response:
    data = json.loads(response.read())

weather = data["current_weather"]
print(f"Temperature: {weather['temperature']}C")
print(f"Wind speed: {weather['windspeed']} km/h")

This fetches live weather data for New York City using a free, no-signup API. No API key required.
Step 1 — Construct the URL with any required parameters (latitude, longitude, etc.)
Step 2 — Send the request with urllib.request.urlopen()
Step 3 — Read the response and parse the JSON with json.loads()
Step 4 — Access the data like a Python dictionary — because it IS one now
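The four steps can also be split into small functions. The sketch below is one possible arrangement, not the only way to do it: `build_url` and `summarize` are hypothetical helper names, and the response body is a made-up stand-in so the example runs without a network connection.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.open-meteo.com/v1/forecast"


def build_url(latitude, longitude):
    """Step 1: build the request URL from its parameters."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "current_weather": "true",
    }
    return f"{BASE_URL}?{urlencode(params)}"


def summarize(raw_json):
    """Steps 3 and 4: parse the JSON text, then read it like a dict."""
    data = json.loads(raw_json)
    weather = data["current_weather"]
    return f"{weather['temperature']}C, wind {weather['windspeed']} km/h"


url = build_url(40.71, -74.01)
print(url)

# Step 2 would be urllib.request.urlopen(url). To keep this runnable
# offline, a made-up response body stands in for the server's reply.
sample_body = '{"current_weather": {"temperature": 21.5, "windspeed": 9.4}}'
print(summarize(sample_body))  # 21.5C, wind 9.4 km/h
```

Using `urlencode` instead of pasting values into the string by hand means special characters in a parameter get escaped correctly.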
There Are No Dumb Questions
"Do I need the `requests` library? I see it in every tutorial."
`requests` is a third-party library that makes API calls easier and more readable. `urllib` is built into Python, so no installation is needed. For learning, `urllib` is fine. For real projects, install `requests` (we will cover this in Module 7). The concepts are identical.
"What if the API is down or the request fails?"
Your program will crash with a `URLError`. In production code, you wrap API calls in a `try/except` block to handle failures gracefully. For now, just know that network requests can fail and error handling is important.
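As a preview of what that wrapping can look like, here is a minimal sketch. The function name `fetch_json` and the 10-second timeout are choices made for this example, not something the lesson's code requires.

```python
import json
import urllib.error
import urllib.request


def fetch_json(url, timeout=10):
    """Return parsed JSON from url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read())
    except urllib.error.URLError as error:
        # HTTPError (404, 500, ...) is a subclass of URLError,
        # so this one clause covers both kinds of failure.
        print(f"Request failed: {error}")
        return None
```

The caller can then check the result, for example `data = fetch_json(url)` followed by `if data is None:`, and decide whether to retry, use a default, or stop with a friendly message.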
Fetch Live Data
25 XP
Use the Open-Meteo weather API to fetch the current weather for your city. Find your city's latitude and longitude (Google it), then modify this code:
```python
import urllib.request
import json

lat = ___  # Your city's latitude
lon = ___  # Your city's longitude
url = f"https://api.open-meteo.com/v1/forecast?latitude={lat}&longitude={lon}&current_weather=true"
with urllib.request.urlopen(url) as response:
    data = json.loads(response.read())

weather = data["current_weather"]
print(f"Temperature: {weather['temperature']}C")
print(f"Wind speed: {weather['windspeed']} km/h")
```
Bonus: convert the temperature to Fahrenheit using your function from Module 4.

_Hint: London is roughly 51.51, -0.13. Tokyo is 35.68, 139.69. Paris is 48.86, 2.35._

Error handling — when things go wrong
Files can be missing. APIs can be down. Users can enter garbage. Error handling lets your program deal with problems gracefully instead of crashing.
# Without error handling: crashes on bad input
age = int(input("Enter your age: "))  # Crashes if user types "abc"

# With error handling: recovers gracefully
try:
    age = int(input("Enter your age: "))
    print(f"You are {age} years old")
except ValueError:
    print("That is not a valid number!")

# Multiple except blocks
try:
    with open("data.csv", "r") as file:
        data = file.read()
    value = int(data.split(",")[0])
except FileNotFoundError:
    print("File not found — check the filename")
except ValueError:
    print("File contains non-numeric data")
except Exception as e:
    print(f"Unexpected error: {e}")

The pattern: `try` the risky thing. If it fails, `except` catches the specific error and runs alternative code.
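One common way to use this pattern is to wrap the risky step in a small function that returns `None` when something goes wrong, so the rest of the program can decide what to do. The sketch below is illustrative only: `parse_age`, `read_first_value`, and the file name `no_such_file.csv` are hypothetical names chosen for this example.

```python
def parse_age(text):
    """Return text as an int, or None if it is not a whole number."""
    try:
        return int(text)
    except ValueError:
        return None


def read_first_value(path):
    """Return the first comma-separated value in a file as an int.

    Returns None when the file is missing or the value is not numeric.
    """
    try:
        with open(path, "r") as file:
            data = file.read()
        return int(data.split(",")[0])
    except FileNotFoundError:
        print(f"File not found: {path}")
    except ValueError:
        print(f"First value in {path} is not a number")
    return None


print(parse_age("42"))   # 42
print(parse_age("abc"))  # None

# Assumes no file with this made-up name exists in the current folder.
print(read_first_value("no_such_file.csv"))
```

Catching only the specific errors you expect keeps genuine bugs visible instead of hiding them behind a generic message.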
Build a Data Pipeline
50 XP
Write a complete script that:
1. Reads `employees.csv` (create it with 5 employees: name, department, salary)
2. Filters only employees in the "Engineering" department
3. Calculates the average salary of engineers
4. Writes the results to `engineering_report.json`

Include error handling for the case where the CSV file does not exist.

Expected JSON output:
```json
{
  "department": "Engineering",
  "employee_count": 2,
  "average_salary": 85000.0,
  "employees": ["Alice", "Charlie"]
}
```
_Hint: Read with `csv.DictReader`. Filter with a list comprehension. Calculate average with `sum()/len()`. Write with `json.dump()`. Wrap file reading in try/except._

Back to the intern
That marketing intern's 12 lines of Python? You could write them now. Read a CSV with csv.DictReader, filter rows with a list comprehension, calculate an average with sum()/len(), and format the output with an f-string. Twelve lines, three seconds, four hours of manual work eliminated.
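The intern's actual script is not shown in the story, so the following is only a guess at its shape. The column names `order_id` and `revenue`, the sample rows, and the wording of the summary are all assumptions; an in-memory string stands in for the downloaded report so the sketch runs on its own.

```python
import csv
import io

# Made-up stand-in for the downloaded report. The real file's
# columns are not given in the story, so "revenue" is an assumption.
report = io.StringIO(
    "order_id,revenue\n"
    "1001,25.00\n"
    "1002,4.50\n"
    "1003,15.00\n"
)

rows = list(csv.DictReader(report))

# Keep only rows with revenue of at least $10, converting text to numbers.
revenues = [float(row["revenue"]) for row in rows if float(row["revenue"]) >= 10]

average = sum(revenues) / len(revenues)
summary = f"{len(revenues)} orders at $10 or more, average revenue ${average:.2f}"
print(summary)  # 2 orders at $10 or more, average revenue $20.00
```

With a real file, the `io.StringIO(...)` part would be replaced by `open("report.csv", newline="")` inside a `with` block. A production version would also guard against an empty list before dividing.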
The difference between a junior developer and someone who just finished a tutorial is the ability to move data between the real world and your code. You just learned that skill — CSV in, JSON out, APIs on demand, errors handled gracefully.
Next up: You have been using Python's built-in tools — csv, json, urllib. They work, but the Python ecosystem has hundreds of thousands of third-party libraries that make everything easier. In the next module, you will learn to install packages with pip, manage projects with virtual environments, and use pandas (data analysis), requests (cleaner APIs), and matplotlib (charts and visualizations).
Key takeaways
- `with open()` is the safe way to read and write files: it automatically closes the file, even on errors
- `"w"` mode erases everything; use `"a"` (append) if you want to add to an existing file
- Values read with the `csv` module are always strings: convert to `int()` or `float()` before doing math
- `json.dumps()`/`json.loads()` work with strings; `json.dump()`/`json.load()` work with files (the "s" stands for "string")
- Many web APIs return JSON: fetch with `urllib`, parse with `json.loads()`, access like a dictionary
- `try/except` handles errors gracefully; always wrap file and network operations
- This is the most practical Python skill: most real-world automation is reading, processing, and writing data
Knowledge Check
1. What is the danger of opening a file with `open('data.txt', 'w')`?
2. When reading a CSV file with `csv.DictReader`, what data type are all values?
3. What is the difference between `json.dump()` and `json.dumps()`?
4. What Python construct should you use to handle a `FileNotFoundError` gracefully?