Beyond Hardware Limits: Unraveling Disk Physical Structure with Microbenchmarking
Recently, an interesting 2019 article was brought back into the spotlight via Hacker News: “Discovering hard disk physical geometry through microbenchmarking.” In an era where high-performance SSDs are commonplace, why is it important to understand the physical structure of rotational media (HDDs)?
The core of that article goes beyond hard-disk structure itself: it demonstrates a methodology for inferring a device's internal operation from observable performance. The same principle applies to analyzing the performance characteristics of modern ZNS (Zoned Namespace) NVMe SSDs, and even to low-power network devices like the recently discussed LoRa-based BYOMesh.
In this post, we will practice the microbenchmarking technique of uncovering the hardware’s “Physical Geometry” by writing a simple code ourselves.
Why Microbenchmarking?
Software developers can work without knowing complex hardware details thanks to the abstraction layers between the OS and hardware. However, this changes when developing systems that require high performance, such as e-commerce platforms handling high transaction volumes or analytical systems processing large amounts of data.
It is difficult to learn the actual sector layout, cache size, or rotational latency from standard interfaces alone, whether that is the lsblk command or the fstat() system call. This is where microbenchmarking, issuing read/write operations and measuring how long they take, becomes the most powerful tool.
Fundamental Principles of Benchmarking
The data access speed of a hard disk drive (HDD) is determined by the following three factors:
- Seek Time: The time it takes for the head to move to the relevant track (physical movement).
- Rotational Latency: The time until the sector containing the data rotates under the head.
- Transfer Time: The time to actually read the data.
We will focus on ‘Seek Time’. The further the head has to move, the longer it takes. By measuring the time difference between reading adjacent sectors and sectors far apart, we can infer the disk’s physical layout (track and cylinder structure).
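The idea can be sketched in a few lines of Python: perform a read at a fixed base offset to position the head, then time a second read at increasing distances from it. On a real HDD, the latency curve steps up at track and cylinder boundaries. This is a minimal sketch; the helper name `seek_latency` is ours, and when run against a regular file (or any cached device) the numbers mostly reflect the page cache rather than head movement.

```python
import os
import time

def seek_latency(path, distances, block_size=4096, samples=50):
    """For each distance, read a block at offset 0, then time a read
    `distance` bytes away. Returns {distance: avg latency in ms}."""
    results = {}
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        for dist in distances:
            if dist + block_size > size:
                continue  # distance does not fit on this device/file
            total = 0.0
            for _ in range(samples):
                os.lseek(fd, 0, os.SEEK_SET)
                os.read(fd, block_size)       # position the head at offset 0
                start = time.perf_counter()
                os.lseek(fd, dist, os.SEEK_SET)
                os.read(fd, block_size)       # timed read at the target distance
                total += time.perf_counter() - start
            results[dist] = total / samples * 1000
    finally:
        os.close(fd)
    return results
```

Plotting latency against distance is exactly how the original article reconstructs track sizes: each jump in the curve marks a physical boundary the head had to cross.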
Hands-on: Exploring Disk Structure with Python
Now, let's use Python to measure the performance difference between random and sequential access. The following script is a simple example that exposes the cost of physical head movement by comparing the two access patterns.
Caution: This script accesses actual disk devices (e.g., /dev/sdX). Be sure to use a test disk with no data on it, or run it in a VM environment. Accessing the wrong device can lead to data corruption.
```python
import os
import sys
import time

# Disk path to test (change to a VM or separate test disk).
# Example: '/dev/sdb' on Linux, '/dev/rdisk2' on macOS.
DISK_PATH = '/dev/sdb'

# Read block size (4 KB)
BLOCK_SIZE = 4096

# Number of measurements
ITERATIONS = 1000


def device_size(fd, fallback):
    """Return the device size in bytes. os.path.getsize() reports 0
    for block devices, so seek to the end instead."""
    size = os.lseek(fd, 0, os.SEEK_END)
    os.lseek(fd, 0, os.SEEK_SET)
    return size if size > 0 else fallback


def benchmark_random_access(fd, size):
    """Measures performance when accessing random locations."""
    total_bytes = device_size(fd, size)
    start_time = time.perf_counter()
    for _ in range(ITERATIONS):
        # Pick a random offset and keep it block-aligned.
        offset_int = int.from_bytes(os.urandom(8), 'big') % (total_bytes - BLOCK_SIZE)
        aligned_offset = (offset_int // BLOCK_SIZE) * BLOCK_SIZE
        os.lseek(fd, aligned_offset, os.SEEK_SET)
        os.read(fd, BLOCK_SIZE)
    end_time = time.perf_counter()
    return (end_time - start_time) * 1000  # Convert to ms


def benchmark_sequential_access(fd):
    """Measures performance when accessing sequential locations."""
    start_time = time.perf_counter()
    for _ in range(ITERATIONS):
        os.read(fd, BLOCK_SIZE)
    end_time = time.perf_counter()
    return (end_time - start_time) * 1000  # Convert to ms


if __name__ == "__main__":
    if not os.path.exists(DISK_PATH):
        print(f"Error: {DISK_PATH} not found. Please update DISK_PATH.")
        sys.exit(1)

    print(f"Benchmarking {DISK_PATH}...")
    try:
        # O_DIRECT is recommended on Linux to bypass the page cache, but it
        # requires aligned buffers; we open with O_SYNC here for
        # compatibility. For serious measurements, use O_DIRECT and drop
        # the page cache between runs.
        fd = os.open(DISK_PATH, os.O_RDONLY | os.O_SYNC)

        print("1. Measuring Random Access (Simulating Head Seek)...")
        # Random access is slow because the head must move continuously.
        random_time = benchmark_random_access(fd, 1024 * 1024 * 1024)  # 1 GB fallback
        print(f"   Random Access Time: {random_time:.2f} ms")

        print("2. Measuring Sequential Access (Minimal Head Movement)...")
        os.lseek(fd, 0, os.SEEK_SET)  # Reset file pointer to the beginning
        sequential_time = benchmark_sequential_access(fd)
        print(f"   Sequential Access Time: {sequential_time:.2f} ms")

        print("\n--- Analysis ---")
        print(f"Performance Gap (Seek Cost): {random_time - sequential_time:.2f} ms")
        print("The gap represents the time spent physically moving the disk head.")
        os.close(fd)
    except PermissionError:
        print("Error: Permission denied. Try running with 'sudo'.")
    except OSError as e:
        print(f"Error: {e}")
```
Interpreting and Utilizing Results
Running the code above, you will observe that random access is significantly slower than sequential access. This ‘Gap’ is precisely the time spent on physical seeking and rotation.
If you were to perform this measurement separately at the beginning of the disk (outer tracks) and at the end (inner tracks), you might discover that the outer tracks have a faster transfer rate than the inner tracks due to the disk’s Zone Bit Recording (ZBR) structure. In the past, this was utilized to tune data placement to the front of the disk.
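A rough way to check for ZBR is to compare sequential throughput near the start and near the end of the device. The sketch below reads a fixed-size region at both ends; the helper name `zone_throughput` is ours, not from the article. On an HDD, the first number (outer tracks) is typically noticeably higher; on a file backed by the page cache, both numbers will be similar and inflated.

```python
import os
import time

def zone_throughput(path, region_bytes=64 * 1024 * 1024, block_size=1024 * 1024):
    """Sequentially read `region_bytes` from the start and from the end
    of the device. Returns (outer_MBps, inner_MBps)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        region_bytes = min(region_bytes, size // 2)  # don't let regions overlap
        rates = []
        for offset in (0, size - region_bytes):
            os.lseek(fd, offset, os.SEEK_SET)
            start = time.perf_counter()
            remaining = region_bytes
            while remaining > 0:
                data = os.read(fd, min(block_size, remaining))
                if not data:
                    break
                remaining -= len(data)
            elapsed = time.perf_counter() - start
            rates.append(region_bytes / elapsed / (1024 * 1024))
        return tuple(rates)
    finally:
        os.close(fd)
```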
Modern Relevance: Lessons from the SSD and Cloud Era
Although spinning disk technology is becoming a thing of the past, the principle of “understanding a system’s internals through performance measurement” remains unchanged.
- SSD Internal Parallelism: SSDs internally operate multiple channels and planes in parallel. If throughput keeps scaling as you issue concurrent reads from multiple threads, that is a signal from which you can infer the controller's internal parallelism.
- Cloud Storage I/O: By capturing phenomena like the ‘Burst’ followed by a ‘Baseline’ drop in disk I/O performance on AWS or Azure through microbenchmarking, you can design cost-effective architectures.
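The same measurement style carries over to SSDs. The hypothetical sketch below (assuming the file is at least one block long) issues block-aligned random reads from a configurable number of threads using os.pread, which takes an explicit offset so the workers never race on a shared file position. If aggregate throughput keeps climbing with the thread count, the controller is servicing requests in parallel; on an HDD it plateaus almost immediately.

```python
import os
import random
import time
from concurrent.futures import ThreadPoolExecutor

def parallel_read_throughput(path, threads, block_size=4096, reads_per_thread=256):
    """Issue block-aligned random reads from `threads` workers and
    return the aggregate throughput in MB/s."""
    fd = os.open(path, os.O_RDONLY)
    size = os.lseek(fd, 0, os.SEEK_END)

    def worker(seed):
        rng = random.Random(seed)
        for _ in range(reads_per_thread):
            # os.pread takes an explicit offset, so no thread touches
            # the shared file position.
            offset = rng.randrange(0, size - block_size) // block_size * block_size
            os.pread(fd, block_size, offset)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        list(pool.map(worker, range(threads)))
    elapsed = time.perf_counter() - start
    os.close(fd)
    return threads * reads_per_thread * block_size / elapsed / (1024 * 1024)
```

Sweeping `threads` from 1 upward and watching where the returned figure stops improving gives a crude estimate of the device's useful queue depth.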
Conclusion
The ‘Discovering hard disk physical geometry’ article that regained attention on Hacker News is more than a curiosity: it is a reminder of the most fundamental attitude in diagnosing system performance bottlenecks.
Instead of vaguely concluding that “the disk is slow,” run a simple script yourself and prove with data where and why it is slow. That is the first step toward true performance tuning.
We encourage you to run the benchmarking code written in today’s post in your development environment. Discovering unexpected hardware characteristics and directly observing their impact on system performance will be a very interesting experience.