Notes on Modern Operating Systems, Chapter 2, Part 3: Scheduling

This is the last part of Chapter 2 of Modern Operating Systems, at long last. I never noticed when I was in college how incredibly long the chapters in these textbooks are; most of the chapters in Modern Operating Systems run around a hundred pages. It was less of an issue with Computer Networking, because those chapters were like 75% fluff and examples that one could skim over for understanding, but Modern Operating Systems is very text-dense. Breaking the chapter into parts let me sustain a sense of momentum and avoid feeling like I was on an endless slog through the same chapter, which was of great psychological benefit.

2.5 Scheduling

  • Need to choose which process gets the CPU next when it’s free.

2.5.1 Intro to Scheduling

  • Scheduling is much less important on PCs because most of the time is spent waiting for user input, not on CPU.
  • It’s much more important on batch systems, mainframes, and servers.
  • Process switching is expensive so scheduling must also mind the efficiency of CPU use:
    • Must go into kernel mode
    • Then must save all state of the current process
    • And save the memory map (the memory reference bits in the page table)
    • Then select a new process
    • Then reload the MMU with the memory map of the new process
    • Start the new process
    • And also the entire memory cache is invalidated by the switch, forcing expensive reloads from main memory.
  • Processes alternate bursts of computing with waits on IO (disk, network, user response). Processes with long bursts of CPU usage are called CPU-bound. Those with short bursts of CPU usage are called I/O-bound.
  • These terms are defined in terms of length of CPU burst, not length of I/O wait, because I/O waits don’t really depend on the process; they’re determined by external factors like the speed of the disk or how long it takes a user to respond.
  • As CPUs get faster, processes become more I/O bound.
  • When scheduling happens:
    • When a process starts
    • When a process exits and a new one needs to start
    • When a process blocks
    • When an I/O interrupt happens—can choose to let the formerly blocked process run now that its I/O is available.
  • Clock interrupts provide an opportunity for scheduling to occur. In preemptive systems, the running process will be suspended and another scheduled after some number of clock interrupts. In non-preemptive systems, clock interrupts are not allowed to interrupt the running process.
  • Batch systems will be non-preemptive; there’s no reason not to run each job to completion.
  • Interactive systems will be preemptive. E.g. they need to simulate running a word processor and a web browser at the same time, neither of which has a set exit point and both of which might need to accept user input.
  • Real time systems often don’t use preemption; processes are simply written to exit very quickly.
  • Goals of scheduling algorithms:
    • Fairness. Make sure similar processes are treated similarly.
    • Keep CPU and I/O devices busy.
    • In batch systems:
      • Throughput: number of jobs completed per unit time.
      • Turnaround time: average time to complete jobs.
    • In interactive systems:
      • Response time: make sure the user gets a response quickly.
      • Proportionality: conform to users’ perception of how long something should take.
    • In real-time systems:
      • Hitting deadlines
      • Predictability

2.5.2 Scheduling in Batch Systems

  • I skipped this because I don’t care about batch systems.

2.5.3 Scheduling in Interactive Systems

  • Generally each process gets a quantum, i.e. a unit of time which it is allowed to run for before the clock interrupts it. This must be chosen carefully. Too short wastes CPU time on excessive process switching. Too long means ready processes wait noticeably long for their turn, hurting response times.
  • Round Robin: just cycle through all the processes.
  • Priority Scheduling: Each process gets a priority. Choose the highest priority process to run. Degrade priorities as processes run to avoid starvation of lower priority processes.
  • Shortest Process Next: Estimate how long each process will take using a weighted average of the time previous runs took. Choose the one predicted to be shortest.
  • Lottery Scheduling: Give each process lottery tickets. At scheduling time, choose a ticket at random and run the process holding that ticket. Good for allocating fixed percentages of the CPU time; to give a process 25% of the CPU, give it 25% of the available tickets.
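The weighted-average estimate behind Shortest Process Next is easy to sketch. This is a toy illustration, not code from the book; the weight a = 1/2 and the initial guess are arbitrary choices (1/2 is common because it reduces to a cheap shift-and-add):

```python
# "Aging" estimate for Shortest Process Next: blend each newly measured CPU
# burst with the running estimate, so older runs decay in influence.

def aged_estimate(bursts, a=0.5, initial=10.0):
    """Fold a sequence of measured CPU bursts into a running estimate."""
    estimate = initial
    for measured in bursts:
        # New estimate = a * latest measurement + (1 - a) * old estimate.
        estimate = a * measured + (1 - a) * estimate
    return estimate

# Starting from a guess of 10, bursts of 8, 6, 4 give 9, 7.5, then 5.75:
# the most recent burst always carries the most weight.
print(aged_estimate([8, 6, 4]))  # 5.75
```

The scheduler would then pick the runnable process with the smallest current estimate.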

2.5.4 Scheduling in Real-Time Systems

  • Real-time systems often have physical devices feeding them data which must be processed on a deadline. E.g. a CD player receives bits from the laser that reads the disc and must translate them into music quickly to produce the correct sound. An autopilot system must respond to the readings of gauges and meters and adjust the plane’s movement quickly enough to account for current conditions.
  • Hard real-time: Missing a deadline is a disaster.
  • Soft real-time: Occasionally missing a deadline is tolerable.
  • The scheduler must decide how to schedule processes to not miss deadlines. However, processes in real-time systems are usually written to exit very quickly.

2.5.5 Policy vs. Mechanism

  • Sometimes the user processes have information that can be used to make better scheduling decisions. Separating the scheduling mechanism (the actual algorithm, in the kernel) from the scheduling policy (its parameters) lets processes pass that information in to tune scheduling.

2.5.6 Thread Scheduling

  • Varies a lot between user and kernel threads.
  • User-level:
    • Scheduler picks a process, which can run whatever threads it wants whenever it wants for whatever reason it wants.
    • Can end up wasting time running blocked threads
    • Can use a custom thread scheduler, with less worry about separating policy from mechanism.
  • Kernel-level:
    • The kernel just picks a thread and runs it.
    • Requires a full context switch to swap threads
    • Can choose the highest priority thread regardless of what process it belongs to. So if a single process has two threads that are both higher priority than anything else currently running, they can be run back to back instead of waiting for that process to get control of the CPU again.

N.B. Question 50 is #problematic.

Notes on Modern Operating Systems, Chapter 2, Part 2: Interprocess Communication

Notes on the second part of Chapter 2 of Modern Operating Systems, on interprocess communication. These sections cover the basic ideas of mutual exclusion and coordination of concurrent processes. There wasn’t a ton in this section I hadn’t seen before, but it was a good refresher and a good chance to nail down some of the terminology.

2.3 Interprocess Communication

Three issues to consider in interprocess communication:

  1. Passing information between processes
  2. Preventing concurrent processes from interfering with each other
  3. Sequencing of dependency acquisition

2) and 3) also apply to threads. 1) less so because threads can share memory.

2.3.1 Race Conditions

  • A race condition occurs when two processes are reading or writing shared data and the result depends on the order of execution.
  • The shared data can be in memory, a file, a directory, whatever.
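A race condition can be illustrated without real threads by replaying two interleavings of the same read-then-write steps. This is a toy simulation (the process names and schedule format are made up), but it shows why the result depends purely on execution order:

```python
# Toy illustration (not real concurrency): two "processes" each increment a
# shared counter via a separate read step and write step. The final value
# depends entirely on how those steps interleave.

def run(schedule):
    """Run the read/write steps of processes A and B in the given order."""
    shared = {"count": 0}
    local = {}
    for proc, step in schedule:
        if step == "read":
            local[proc] = shared["count"]      # copy the shared value
        else:
            shared["count"] = local[proc] + 1  # write back incremented copy
    return shared["count"]

# Safe interleaving: A finishes before B starts, so both increments land.
print(run([("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]))  # 2
# Racy interleaving: both read 0 before either writes, so one update is lost.
print(run([("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]))  # 1
```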

2.3.2 Critical Regions

  • Need mutual exclusion to lock other processes out of shared data when it’s being used.
  • The part of the program that uses the shared data, requiring mutual exclusion, is called the critical region.
  • If you ensure that two processes are never executing the critical region at the same time, you can avoid race conditions.

Four conditions must hold for a good solution to this:

  1. No two processes are simultaneously in the critical region.
  2. No assumptions may be made about speed of processes or the number of CPUs.
  3. No process outside its critical region should be able to block other processes.
  4. No process should have to wait forever to enter its critical region.

2.3.3 Mutual Exclusion with Busy Waiting

In practice busy waiting is rarely useful, so I’m not going to write down any notes about it, but the book lists several bad solutions for mutual exclusion using busy waiting and goes over how they introduce race conditions and other badness.

A lock that is acquired by looping and doing nothing until it frees up is called a spin lock; the looping itself is the busy waiting.

The one useful thing the section covers is the TSL (test-and-set-lock) instruction, which is a CPU instruction that atomically sets and checks a lock variable. This instruction is useful in other, less terrible kinds of mutual exclusion.
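Python has no TSL instruction, but `threading.Lock.acquire(blocking=False)` behaves like an atomic test-and-set: it atomically tries to grab the flag and reports whether it succeeded. That makes it a reasonable stand-in for sketching a TSL-style spin lock (as an illustration of the idea, not how you'd normally synchronize in Python):

```python
import threading

class SpinLock:
    """TSL-style spin lock: busy-wait on an atomic test-and-set."""
    def __init__(self):
        self._flag = threading.Lock()

    def enter_region(self):
        # Spin until the atomic test-and-set succeeds (busy waiting).
        while not self._flag.acquire(blocking=False):
            pass

    def leave_region(self):
        self._flag.release()

# Two threads incrementing a shared counter under the spin lock.
counter = 0
lock = SpinLock()

def worker():
    global counter
    for _ in range(10_000):
        lock.enter_region()
        counter += 1          # critical region
        lock.leave_region()

threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 20000 -- no lost updates
```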

2.3.4 Sleep and Wakeup

Want to get rid of busy waiting.

Producer-Consumer Problem

  • We’ll use this problem as an example for the mutual exclusion techniques we’re about to discuss.
  • Two processes share a fixed-size buffer
  • One process produces things and puts them on the buffer.
  • The other consumes things from the buffer and does something with them.
  • Access to the shared buffer must be carefully managed.
  • Conceptually, we want the producer to sleep when the buffer is full and wake up and add things to it when there is space. Conversely, we want the consumer to sleep when the buffer is empty and wake up and take things from it when it is not empty.

2.3.5 Semaphores

  • One way to achieve sleep and wakeup is with semaphores.
  • A semaphore is an integer variable that supports two operations: up and down.
  • down decrements the semaphore. If the semaphore is zero, the process that called down blocks until the semaphore is no longer zero.
  • up increments the semaphore.
  • Semaphores can represent “tickets” or “passes” to do a specific operation.
  • You can use them like this to synchronize two processes, imposing a certain ordering on their actions.
  • For example, in the producer-consumer problem, you can use empty and full semaphores to track how many spaces are empty and full in the buffer.
  • Then you can force the consumer to do a down on full so that it blocks when there’s nothing on the buffer to take, and force the producer to do a down on empty so that it blocks when there’s no space for it to add to the buffer.
  • This ensures that consumption only happens after production, and that production only proceeds to a certain point before some consumption happens.
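As a sketch, here is that arrangement using Python's `threading.Semaphore`. The buffer size and item counts are arbitrary; a separate mutex additionally guards the buffer structure itself:

```python
import threading
from collections import deque

# Producer-consumer with semaphores: `empty` counts free slots, `full` counts
# filled slots, and `mutex` protects the buffer data structure.

BUFFER_SIZE = 4
buffer = deque()
empty = threading.Semaphore(BUFFER_SIZE)  # free slots remaining
full = threading.Semaphore(0)             # filled slots available
mutex = threading.Lock()
consumed = []

def producer(items):
    for item in items:
        empty.acquire()          # down(empty): block if the buffer is full
        with mutex:
            buffer.append(item)
        full.release()           # up(full): one more item available

def consumer(n):
    for _ in range(n):
        full.acquire()           # down(full): block if the buffer is empty
        with mutex:
            consumed.append(buffer.popleft())
        empty.release()          # up(empty): one more free slot

p = threading.Thread(target=producer, args=(range(10),))
c = threading.Thread(target=consumer, args=(10,))
p.start(); c.start(); p.join(); c.join()
print(consumed)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With one producer and one consumer the items arrive in order; the semaphores guarantee the consumer never runs ahead of production and the producer never overruns the buffer.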

2.3.6 Mutexes

  • A mutex is another way of using a semaphore. It only has the values 0 and 1.
  • If the value is 1, processes can enter the critical region. If 0, they block on trying to enter the critical region.
  • If your processor has a TSL instruction, you can implement mutex locking and unlocking in user space.
  • Importantly, this means you can implement mutexes so that a thread that fails to acquire one yields the CPU instead of busy waiting.
  • Most modern OSes allow limited sharing of memory between processes, which allows mutex sharing.

2.3.7 Monitors

  • A collection of variables and functions that guarantee only a single thread at once can access them.
  • Monitors are a programming language-specific feature. The compiler handles mutual exclusion, often using a mutex or semaphore under the hood.
  • Monitors still need a way to block processes that can’t proceed anymore so that another process can start using the monitor. They do this with…

Condition Variables

  • Two operations: wait and signal.
  • If a process can’t continue, it waits on a condition variable and blocks.
  • Then another process can enter the monitor and wake up the sleeping process by doing a signal on the condition variable.
  • You can either immediately suspend the signaling process, resuming it later, and let the awakened process run; or you can require a signal to be the last thing in a function.
  • Without the monitor’s guarantee of only one process running the code at once, wait and signal are subject to race conditions where signals cross and wakeup signals disappear into the void just as a process is about to go to sleep, causing them to never wake up.

Java objects can also be used as monitors with the synchronized keyword and the wait and notifyAll methods. This combined with threads is the basis for all Java concurrency.
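The same monitor pattern can be sketched in Python with `threading.Condition`, which bundles a lock (the monitor's mutual exclusion) with wait/notify. The one-slot buffer here is a made-up example; note the while loop around wait(), which rechecks the condition after every wakeup to guard against the lost-wakeup races mentioned above:

```python
import threading

class OneSlotBuffer:
    """Monitor-style one-slot buffer: put() waits while full, get() while empty."""
    def __init__(self):
        self._cond = threading.Condition()
        self._item = None

    def put(self, item):
        with self._cond:                 # enter the monitor
            while self._item is not None:
                self._cond.wait()        # full: sleep until a get() happens
            self._item = item
            self._cond.notify_all()      # wake any waiting getters

    def get(self):
        with self._cond:                 # enter the monitor
            while self._item is None:
                self._cond.wait()        # empty: sleep until a put() happens
            item, self._item = self._item, None
            self._cond.notify_all()      # wake any waiting putters
            return item

buf = OneSlotBuffer()
out = []
t = threading.Thread(target=lambda: [out.append(buf.get()) for _ in range(3)])
t.start()
for x in "abc":
    buf.put(x)                           # blocks whenever the slot is occupied
t.join()
print(out)  # ['a', 'b', 'c']
```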

2.3.8 Message Passing

  • Semaphores and monitors require shared memory, so they can’t be used in distributed systems.
  • Message passing, on the other hand, is more general and works even across machines.
  • Uses two operations: send and receive. These are implemented as system calls.
  • send gets a process ID and a message.
  • receive also gets a process ID and in C implementations it gets passed an empty message to hydrate.
  • A receiver can block until a message arrives, or return immediately with an error code.

Message passing raises several design challenges.

  • Messages can be lost by the network. You can have receivers send back an acknowledgment signal, and have senders retry the message if they don’t get it after a certain timeout.
  • Local message passing is slower than semaphores or monitors.
  • You can implement message passing with a mailbox that buffers received messages, or you can have threads block and wait on each other. The latter is called a rendezvous.
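A mailbox-style message-passing scheme can be sketched with thread-safe queues standing in for the kernel's buffers. The process IDs and message contents here are invented for illustration:

```python
import queue
import threading

# Each "process" owns a mailbox that buffers incoming messages; send and
# receive mirror the system calls described above, addressed by process ID.

mailboxes = {"A": queue.Queue(), "B": queue.Queue()}

def send(dest, message):
    mailboxes[dest].put(message)

def receive(own_id):
    # Blocks until a message arrives in this process's mailbox.
    return mailboxes[own_id].get()

def server():
    # "Process B": echo messages back in upper case until told to quit.
    while True:
        msg = receive("B")
        if msg == "quit":
            break
        send("A", msg.upper())

t = threading.Thread(target=server)
t.start()
send("B", "ping")        # "Process A" sends a request...
reply = receive("A")     # ...and blocks until the reply arrives.
print(reply)  # PING
send("B", "quit")
t.join()
```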

2.3.9 Barriers

  • Block a group of processes all at the same point in the program to coordinate their start on the next phase.
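Python's threading module includes a barrier primitive; this sketch (with made-up phase names) shows three workers all finishing phase 1 before any starts phase 2:

```python
import threading

# All three workers must reach the barrier before any proceeds to phase 2.

results = []
barrier = threading.Barrier(3)
lock = threading.Lock()

def worker(name):
    with lock:
        results.append((name, "phase 1"))
    barrier.wait()              # block here until all 3 workers arrive
    with lock:
        results.append((name, "phase 2"))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Whatever order the threads ran in, every phase-1 entry precedes every
# phase-2 entry.
print(all(p == "phase 1" for _, p in results[:3]))  # True
```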

2.4 Classical IPC Problems

These are all well-known and there’s a ton of information out there on them, so I didn’t take detailed notes about them.

2.4.1 Dining Philosophers

Models a situation where there is contention for a set of shared resources, which can lead to deadlock or starvation.

2.4.2 Readers and Writers

A simplistic model of a database, where multiple concurrent processes can read but only one can write at a time.

2.4.3 Sleeping Barber

Models a situation where a bounded size queue of tasks is being serviced by a single process but filled by multiple processes.

Notes on Modern Operating Systems, Chapter 2, Part 1: Processes and Threads

These are my notes on Chapter 2, Sections 2.1 and 2.2 of Modern Operating Systems, Second Edition, by Andrew Tanenbaum. The chapter is quite long, and dense with text, so I’m splitting my notes into three parts. Part 1, this part, is the basics of processes and threads. Part 2 will be Sections 2.3 and 2.4 on interprocess communication and concurrency control. Part 3 will be Section 2.5 on scheduling.

As with Computer Networking: A Top-Down Approach, I bought an old edition of this book for cheap about six years ago and happened to have it laying around. The second edition is from the year 2000, so it’s now twenty years old and somewhat out of date. But operating systems haven’t changed as much in twenty years as the Internet has, so although some of the chapters talk as if single-core processors are still the status quo, and the Case Studies section in the back talks about Windows 2000, I’ve yet to stumble on anything that makes me suspicious.

The writing style is clear, but the sections aren’t always the best organized and sometimes make references to material that hasn’t been covered yet. Partly that’s just inherent to the way operating system concepts all connect into a whole, so although it does bother me a little, it was a reasonable choice. If you can find a newer edition for under fifty dollars it might be worth picking up. There are a couple sections towards the end of the book where Tanenbaum talks about his work on MINIX and his opinions on Linux and Linus Torvalds that read like aggrieved Usenet posts cleaned up for publication, and he also constantly dunks on other people’s research, writing in Chapter 3 that research on distributed deadlock detection is not “even remotely practical in real systems” and that its “main function seems to be keeping otherwise unemployed graph theorists off the streets”. I find it highly amusing, not least because Tanenbaum must have known that the undergraduate students who would mostly be reading this book didn’t care.

Section 2.1 Processes

  • Processes pretend to run sequentially, but behind the scenes they run discontinuously as the CPU swaps between them.
  • Each process maintains its own logical program counter which is loaded into the real program counter when the CPU picks it up to run.
  • Run order and run time of processes might be non-deterministic, so you cannot make any assumptions about when processes will run or how long they will run for.

2.1.1 Process Creation

Four events cause process creation:

  1. System initialization
  2. System call by a running process (fork on Unix).
  3. User request to create a process (e.g. running a terminal command, clicking an icon).
  4. Batch job initialization (on mainframes, submitted by users).
  • fork on Unix creates new processes.
    • The child process starts with the same memory, environment strings, and open files as its parent.
    • Then it runs execve or some similar system call to change the memory image and run a new program.
    • The child’s memory image is a copy of the parent’s into a new address space. Children do not share address spaces with their parents, so the parent and child cannot access each other’s memory.
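The fork-then-execve pattern can be sketched directly in Python on Unix systems (`os.fork` is not available on Windows). The program the child runs, `echo`, is just a convenient stand-in:

```python
import os

# fork() duplicates the process; the child then replaces its memory image
# with a new program via the exec family of calls. (Unix only.)

pid = os.fork()
if pid == 0:
    # Child: starts as a copy of the parent's memory image, then replaces it.
    # execvp never returns on success.
    os.execvp("echo", ["echo", "hello from the child"])
else:
    # Parent: fork returned the child's PID; wait for it to exit.
    _, status = os.waitpid(pid, 0)
    print("child exited with", os.WEXITSTATUS(status))  # child exited with 0
```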

2.1.3 Process Destruction

Processes can terminate for the following reasons:

  1. Normal exit (voluntary), triggered on Unix by the exit system call
  2. Error exit (voluntary), e.g. program cannot finish, such as compiler called on nonexistent file
  3. Fatal error (involuntary), e.g. divide by zero error, attempt to access protected memory (null pointer dereference in C)
  4. Murder by another process (involuntary), e.g. someone calls kill on it.

2.1.4 Process Hierarchies

  • Do not exist on Windows.
  • On Unix a process and its descendants form a group. Process groups share some signals and other things.
  • Since all processes on Unix are children of the init process, there is also a global process group.

2.1.5 Process States

Three basic states:

  1. Running (using the CPU)
  2. Ready (stopped, but could be picked up and run at any time)
  3. Blocked (unable to run until some event happens)
  • Running ⬌ Blocked can happen due to a system call made by the running process, or automatically if the process reads from a pipe or a special file like a terminal or socket when there’s no input available.
  • Running ⬌ Ready is managed by the process scheduler, which decides what process to run next.

2.1.6 Implementation of Processes

  • The operating system has a process table that stores the following data for each process:
    • Process state (running, ready, blocked)
    • Program counter (i.e. address of the next instruction to execute)
    • Stack pointer (i.e. address of the top of the runtime stack)
    • Memory allocation
    • Status of open files
    • Accounting and scheduling information
    • Anything else that needs to be saved between process switches

The process table is what makes interrupt handling possible (i.e. hardware interrupts, which “interrupt” the currently running process to deliver something from a hardware device):

  1. Hardware pushes the program counter, registers, state, etc. onto the stack.
  2. Hardware loads a new program counter from the interrupt vector, a special location in memory assigned to each class of I/O device (e.g. hard disk, timer, terminal devices on mainframes).
  3. An assembly language procedure saves the current values of the registers in the process table.
  4. The assembly language procedure sets up a new stack.
  5. The interrupt service procedure, written in C, runs. It reads input from the I/O device and buffers it into memory.
  6. Now that the input from the hardware has been pulled in, the scheduler runs and decides which process to run next.
  7. The C procedure returns control to assembly.
  8. Assembly starts the new current process: it loads up the registers, resets the program counter and stack pointer, etc.

2.2 Threads

2.2.1 The Thread Model

  • A process can have multiple threads of execution
  • Processes group resources together; threads are the thing that actually runs on the CPU.
  • Each thread has its own:
    • Program counter
    • Registers
    • Runtime stack with execution history
  • All the threads in a process share:
    • An address space (so they can access each other’s memory)
    • Global variables
    • Open files
    • Child processes
    • Pending alarms
    • Signals and signal handlers
  • All threads in a process share memory so they can read and write each other’s variables and even stacks. This is part of the benefit of threads; it allows more efficient cooperation between concurrent entities (compared to processes, which can’t read each other’s memory and thus must use other means of communicating).
  • Threads have four states:
    1. Running
    2. Blocked
    3. Ready
    4. Terminated
  • Processes start with one thread and create more using library calls.
  • Threads can exit by calling another library procedure, e.g. thread_exit.
  • In some systems threads can wait for other threads to exit by calling thread_wait.
  • thread_yield gives up the CPU for another thread. The CPU won’t actually stop a thread until a process switch occurs, so threads in the same process have to yield to each other.
  • Threads introduce many complications, including synchronizing their shared memory and coordinating the yields so you don’t waste time running blocked threads.
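As a minimal sketch of the model above: two Python threads created with library calls, sharing a global list, with `join` playing the role of thread_wait. The names and counts are arbitrary:

```python
import threading

# Threads in one process share globals: both workers append into the same
# list, which is exactly why shared memory needs synchronization.

shared = []
lock = threading.Lock()

def worker(name, count):
    for i in range(count):
        with lock:                      # guard the shared structure
            shared.append((name, i))

threads = [threading.Thread(target=worker, args=(n, 3)) for n in ("a", "b")]
for t in threads:
    t.start()                           # create and run the thread
for t in threads:
    t.join()                            # wait for the thread to exit

print(len(shared))  # 6 -> both threads wrote into the same memory
```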

2.2.2 Thread Usage

  • Threads let a single application do multiple things, modeled as sequential jobs running side by side instead of as a mess of interrupts and signals.
  • On multiprocessor systems or multi-core CPUs, multiple threads can actually run simultaneously. On single-core systems threads will switch on and off the one core.
  • Threads can start up faster than processes since they require fewer resources to get going.
  • Threads can block and cooperatively switch off since they can signal each other using their shared memory.

2.2.3 Threads in User Space

  • Processes manage their own threads in user space.
  • The kernel knows nothing of threads.
  • Each process has a thread table with the threads’ program counters, stack pointers, registers, etc.
  • The threads run on a run-time system, a library of thread-managing procedures.

Advantages:

  • Thread switching is very fast since applications don’t need to trap to the kernel to switch.
  • Applications can implement custom thread scheduling.
  • User-space threads scale better since they don’t require space in the kernel.

Drawbacks:

  • If a user-space thread makes a blocking system call, it will block without giving you a chance to switch threads before handing off control to the kernel, because the kernel knows nothing of threads and cannot give you that chance. This means the whole process is blocked.
  • If a thread causes a page fault, e.g. because it needs to run some code that isn’t loaded into memory yet, it gets blocked, again without allowing any chance to switch threads.

Possible solutions to the drawbacks (that all suck):

  • The select system call can detect when a system call will block. You can use it before making a possibly blocking system call, and if it detects that the system call will block, you can switch to another thread and make the system call later. E.g. if the current thread needs to read some data, but the read call will block on a disk read, you can switch threads until the interrupt executes later and the data is buffered in memory, and then switch back to that thread.
  • Using select sucks because you have to wrap all your blocking system calls with the select check, which means changing the system call library, and it’s also annoying and tedious logic to implement.
  • You can also use a clock interrupt to stop the currently running thread every so often and check if it’s been blocked and should be swapped for another thread.
  • Using a clock interrupt sucks because it’s an ugly hack (you might still let the blocked thread have the CPU for a while before the clock interrupt stops it), and apparently each process only gets one clock so if you use it to manage thread swapping you can’t use it for anything else.
  • System calls can all be rewritten to be non-blocking. E.g. rather than waiting for input to be available, a read call would just return 0 (for 0 bytes read) and you could try again later.
  • Rewriting system calls sucks because you have to rewrite a bunch of system calls around threads and that’s not cool.
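The select check can be sketched in Python, using a pipe as a stand-in for whatever descriptor the thread wants to read. With a timeout of zero, select simply polls whether a read would block:

```python
import os
import select

# Before reading, ask select whether the read would block. A pipe stands in
# for the file descriptor; the same pattern applies to sockets and terminals.

r, w = os.pipe()

# Nothing has been written yet, so a read would block.
ready_before = bool(select.select([r], [], [], 0)[0])  # timeout 0: just poll
print(ready_before)  # False -> switch to another thread instead of reading

os.write(w, b"data")

# Now the data is buffered, so the read is safe to make.
ready_after = bool(select.select([r], [], [], 0)[0])
data = os.read(r, 4)
print(ready_after, data)  # True b'data'
os.close(r)
os.close(w)
```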

2.2.4 Threads in the Kernel

  • You can instead put threads in the kernel. The kernel then manages the thread table and thread swapping instead of the user program.
  • All blocking calls are system calls and when a thread blocks, the kernel can choose another thread to run. If there is no suitable thread in the same process, the kernel can choose to let a different process run for a while.
  • Creating, destroying, waiting, etc. are all more expensive since they are system calls.

2.2.5 Hybrid Implementations

  • The idea is to combine the lightweight nature of user-space threads with the ease of swapping out blocked threads that comes with kernel threads.

2.2.6 Scheduler Activations

  • The kernel assigns each process some virtual CPUs (or they can request and release them). [It’s not actually clear to me from the text how the virtual CPUs play into this process.]
  • The kernel can detect when a thread blocks and notify the process through a mechanism called upcalls.
  • Upcalls work by activating the process’s runtime system at a known starting address, similar to a Unix signal. Once activated, the runtime system can choose to run another thread instead of the blocked one.
  • When a hardware interrupt runs, the process will be resumed and the runtime system can decide which thread to run. It can run the thread that was interrupted, or, if the hardware interrupt completed something that a blocked thread cares about (like reading in some data and buffering it in memory), it can choose to run the blocked thread, or it can just choose to run a completely different thread.
  • This system isn’t great because you now have two-way calls between the kernel and user space, whereas a normal operating system has a nice layered model where user space calls the kernel but not the reverse.

To be continued next time with interprocess communication. And then…in the epic final part of Chapter 2…SCHEDULING!!!

Notes on Computer Networking: A Top-Down Approach Chapter 2

These are my notes on Computer Networking: A Top-Down Approach (4th Edition) Chapter 2, Application Layer.

I don’t particularly recommend this book. It came out in 2008, and a lot’s happened to the internet since then. It’s not that well organized; you’ll see if you read the notes that they take some bizarre leaps from topic to topic. And it tends to be extremely wordy; these notes condense almost 100 pages of information. Because of that, you’ll notice that I didn’t take notes on every section. But I happened to have this book laying around, so I read it. I’m putting up the notes for my own future reference and in case someone else someday finds them useful.

2.1.1 Application Architectures

  • Client-server: one process (client) initiates communication, another (server) responds.
  • P2P: No central server, any process can act as either a client or a server.
    (Note: “process” refers to an operating system process, i.e. a running program.)
  • Sockets are the interface between the transport layer (of the OSI layer model) and the application layer. A process listens to a socket for messages and can put messages into the socket to send them across the network to the other end.

2.1.3 Transport Services Available to Applications

  • Messages pushed into a socket are sent across the network using a transport layer protocol.
  • There are four services transport layer protocols could theoretically provide.
    • Reliable data transfer
    • Throughput guarantees (e.g. guaranteed delivery of bits at 50 kbps)
    • Timing (e.g. guaranteed receipt of messages within 100ms)
    • Security

2.1.4 Transport Services on the Internet

  • TCP
    • Uses handshaking to establish a connection between client and server
    • Guarantees lossless data transfer
    • Implements throttling to reduce congestion over the network; this means it provides no timing or throughput guarantees.
  • UDP
    • Does not use handshaking or establish connections; packets sent over UDP might just disappear and never reach their destination.
    • Does not throttle.
    • Theoretically, UDP would be good for loss-tolerant time-bound applications, since the lack of throttling often means more throughput and faster delivery. In practice, UDP is usually blocked by firewalls, so no one uses it.
  • Neither protocol provides throughput guarantees, timing guarantees, or security, and UDP does not provide reliable data transfer either, which makes it puzzling that we took the time to go over those four services above. Applications built on these protocols must cope with the lack of these services.
  • SSL is an application-layer add-on to TCP that provides encryption. Other application-layer protocols send their messages through SSL, which encrypts them and writes them into the socket; on the receiving end, SSL reads the encrypted message from the socket and decrypts it before passing it up to the application-layer protocol on that end.
  • The transport layer uses IP addresses and port numbers to find the correct destination machine and socket on that machine.

2.2 The Web and HTTP

  • HTTP is defined by RFCs 1945 and 2616.
  • An HTTP server stores no information about the clients, so HTTP is a stateless protocol.
  • HTTP uses persistent connections by default, i.e. after opening a TCP connection, it will keep that connection open and reuse it for a period of time for subsequent interactions. This increases performance; establishing a connection requires an extra network round trip to go through the handshaking process, and a single web page might require several requests to fully load (to get the HTML, Javascript, CSS, images, etc.), so reusing a connection reduces that overhead.

HTTP request message format:
– <HTTP method, e.g. GET, POST> <URL> <version, e.g. HTTP/1.1>
– <header name>: <header value>
– <More headers>
– Empty line
– <Entity body, which can contain any data>

HTTP response message format:
– <version> <status code, e.g. 400> <explanation of status code, e.g. “Bad Request”>
– <header name>: <header value>
– <More headers>
– Empty line
– <Entity body>
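The request format above can be assembled mechanically. This helper is a toy (the host and path are placeholders, not from the book), but the line structure, the blank separator line, and the CRLF line endings follow the format:

```python
# Build an HTTP/1.1 request string following the message format above:
# request line, headers, an empty line, then the entity body.

def build_request(method, url, host, headers=None, body=""):
    lines = [f"{method} {url} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    lines.append("")   # the empty line separating headers from the body
    lines.append(body) # entity body (may be empty)
    return "\r\n".join(lines)

req = build_request("GET", "/index.html", "example.com",
                    headers={"Cookie": "user=1234"})
print(req.splitlines()[0])  # GET /index.html HTTP/1.1
```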

2.2.4 Cookies

  • Defined in RFC 2965
  • Response header Set-cookie tells client to store something, such as a user id, in a cookie file, keyed by the host name.
  • Request header Cookie contains the value set by the host to which the request was made.
  • Cookies allow servers to maintain information about clients even over the stateless HTTP protocol.

2.2.5 Web Caching

  • Cache make go fast
  • The cache can send requests with an If-Modified-Since header; if the object being requested has not been updated, the server will return status 304 Not Modified, telling the cache it can return the version it has.
  • The book doesn’t mention this, but ETags are another mechanism used for cache validation.

2.3 FTP

  • FTP allows navigating a remote filesystem and transferring files to and from that filesystem.
  • FTP uses two TCP connections
    • A control connection on port 21 sends user id, password, and commands
    • A data connection on port 20 sends actual files.
  • Using the separate connection for control information is called sending control information out of band.
  • FTP is not stateless; it stores username, password, and current directory for clients.
  • Control connections are persistent; they are kept open for a period of time and reused.
  • Data connections are not persistent; they are opened by the server in response to user commands, used to send or receive a single file, and closed.

2.4 Email

  • Uses two types of protocols: mail send (SMTP) and mail read (POP3, IMAP).

2.4.1 SMTP

  • Uses TCP’s reliable data transfer.
  • Has both a client and server mode. Message senders are clients, message receivers are servers.
  • All text in an SMTP message must be 7-bit ASCII because the protocol is extremely old.
  • Uses port 25.
  • Handshakes to establish a connection.
  • Uses a single TCP connection to send all outgoing messages it has.
  • Is a push protocol—clients mainly send data to servers. By contrast, HTTP is mainly a pull protocol; clients typically receive data from servers.
  • SMTP itself is defined in RFC 821; RFC 822 defines the message format and the headers for sending ASCII text.
  • MIME (Multipurpose Internet Mail Extensions) defines headers for sending non-ASCII text.
    • Defined in RFCs 2045 and 2046.
    • Uses the Content-type and Content-Transfer-Encoding headers.
    • Content-type tells the receiver what the data actually is, e.g. image/jpeg, audio/vorbis, text/csv.
    • Content-Transfer-Encoding tells the receiver how the data was converted to 7-bit ASCII text, e.g. base64.
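Python's email library shows both headers in action. This is a sketch, not from the book; the addresses are placeholders, and the "image" is just arbitrary bytes standing in for a real JPEG:

```python
from email.message import EmailMessage

# Attach binary data to a mail message: the library sets Content-type
# from maintype/subtype and base64-encodes the bytes so the whole
# message fits in 7-bit ASCII.
msg = EmailMessage()
msg["From"] = "alice@example.com"   # placeholder addresses
msg["To"] = "bob@example.com"
msg["Subject"] = "A picture"
msg.set_content("See attached.")
msg.add_attachment(bytes(range(256)), maintype="image", subtype="jpeg",
                   filename="pic.jpg")

attachment = next(iter(msg.iter_attachments()))
print(attachment.get_content_type())             # image/jpeg
print(attachment["Content-Transfer-Encoding"])   # base64
print(msg.as_bytes().isascii())                  # True
```

Even though the attachment contains every possible byte value, the serialized message is pure 7-bit ASCII, which is exactly why MIME exists.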

2.4.4 Mail-Access Protocols

  • Since SMTP is push-based, it can’t really be used to retrieve mail by non-server clients like desktop PCs.
  • POP3 and IMAP are pull-based protocols for email clients to retrieve mail.
  • POP3 is very simple; it cannot manage email directories on a remote server. It has “read and delete” mode, where it downloads mail and deletes it from the remote server, as well as “read and keep”, where it leaves the mail on the server.
  • IMAP is much more complex.
    • It can manage mail directories.
    • It can obtain parts of messages, which can be useful when bandwidth is low; you can avoid downloading large video and audio files.
  • Web mail clients like Gmail just use HTTP to retrieve messages. POP3 and IMAP are useful for desktop clients like Thunderbird and Outlook, or for terminal clients like Mutt. Gmail and other web-based email services sometimes provide POP3 or IMAP access so you can use clients like Thunderbird and Mutt to read your mail.

2.5 DNS

  • DNS (Domain Name System) translates human-readable domain names to numeric IP addresses.
  • The term “DNS” actually refers to two things:
    • A distributed database implemented in a hierarchy of DNS servers.
    • An application-level protocol for querying this database.
  • DNS servers often run the Berkeley Internet Name Domain (BIND) software.
  • DNS runs on UDP and uses port 53.
  • Clients implementing other protocols (e.g. HTTP, FTP, SMTP) will include a step where they use DNS to translate user-supplied hostnames into IP addresses.
  • DNS can add a delay to requests, but the delay is minimized with caching in nearby servers.
  • DNS also provides:
    • Host aliasing: can translate a short alias into a longer canonical hostname
    • Mail server aliasing: can give a web server and mail server the same alias but different canonical hostnames
    • Load distribution: DNS can map a hostname to a set of IP addresses, e.g. a set of replica servers all running the same application. The DNS server will (according to the book) rotate the order of the IP addresses it returns, and clients will (according to the book) always take the first one, so requests get distributed across the servers. (Proceed with caution; I have doubts that this is still a good idea, and I asked a question on SE about it.)
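The rotation behavior the book describes can be modeled in a few lines. This is a toy simulation, not a real resolver; the addresses come from the reserved TEST-NET range:

```python
# Model of DNS round-robin load distribution: the server rotates the
# address list on each reply, and clients always take the first entry.
class RoundRobinDNS:
    def __init__(self, addresses):
        self.addresses = list(addresses)

    def resolve(self):
        reply = list(self.addresses)
        # rotate so the next caller sees a different first address
        self.addresses.append(self.addresses.pop(0))
        return reply

replicas = RoundRobinDNS(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
first_choices = [replicas.resolve()[0] for _ in range(6)]
print(first_choices)
# ['192.0.2.1', '192.0.2.2', '192.0.2.3', '192.0.2.1', '192.0.2.2', '192.0.2.3']
```

Requests end up spread evenly across the three replicas, which is the whole trick.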

2.5.2 How DNS?

DNS uses four types of servers:
– Root DNS servers. There are 13 of these in the world, controlled by various institutions and entities. They are the first stop in a DNS lookup.
– TLD DNS servers. These each service a top-level domain (.com, .edu, .gov, .jp, .fr, .tv, etc.). Different companies own the different TLD servers.
– Authoritative DNS servers. The owner of a domain controls the authoritative DNS server for the domain.
– Local DNS servers. These act as brokers / caching centers for a local area.

A basic example of a DNS interaction:
1. A client requests an internet connection from an ISP. The ISP provides the address of one of its local DNS servers using the DHCP protocol. The local DNS server will be within a few routers of the client for speed.
2. The client will send requests for hostname lookups to its local DNS server. The local DNS server will forward them to a root DNS server.
3. The root DNS server will send back the identity of a TLD server that can handle the request, e.g. for a hostname under .com, the root DNS server will send back a TLD server that can handle .com.
4. The local DNS server contacts the provided TLD server. The TLD server looks at the domain and returns an authoritative DNS server for that domain, e.g. for a .com hostname, the TLD server for .com will send back the authoritative DNS server registered for the hostname's second-level domain.
5. The local DNS server contacts the provided authoritative DNS server. The authoritative DNS server will provide the IP address of a machine mapped to the hostname, possibly after communicating with other authoritative DNS servers to resolve subdomains, e.g. a domain's authoritative DNS server might contact another authoritative DNS server to resolve one of the domain's subdomains to an IP address.
6. The local DNS server returns the IP address to the client. The client uses that IP address to send a request.
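The walk above can be sketched as a toy model. Every server name, domain, and address here is made up for illustration; a real resolver speaks the DNS protocol over UDP rather than doing dictionary lookups:

```python
# Toy model of the iterative lookup: the local DNS server asks the root,
# then the TLD server, then the authoritative server, caching the result.
ROOT = {"com": "tld-server-for-com"}                       # root knows TLD servers
TLD = {"tld-server-for-com": {"shop.com": "ns.shop.com"}}  # TLD knows authoritative servers
AUTH = {"ns.shop.com": {"www.shop.com": "192.0.2.10"}}     # authoritative knows hosts

CACHE = {}  # the local DNS server's cache

def resolve(hostname):
    if hostname in CACHE:                       # cached: skip every step below
        return CACHE[hostname]
    tld = ROOT[hostname.rsplit(".", 1)[-1]]     # 1. ask a root server for the TLD server
    domain = ".".join(hostname.split(".")[-2:])
    ns = TLD[tld][domain]                       # 2. ask the TLD server for the authoritative server
    ip = AUTH[ns][hostname]                     # 3. ask the authoritative server for the address
    CACHE[hostname] = ip
    return ip

print(resolve("www.shop.com"))   # 192.0.2.10  (full walk)
print(resolve("www.shop.com"))   # 192.0.2.10  (served straight from cache)
```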

The local DNS server can cache a lot of the information it receives for later reuse, which lets it skip many of these steps. For example, if the local DNS server has cached the IP address for a hostname, it can immediately send that address back to any client that requests it. It can also cache the identity of a domain's authoritative DNS server, so when a client requests any subdomain of that domain, it can go straight to the authoritative DNS server instead of going through the root and TLD servers again.

Cached DNS records are evicted after a period of time, typically 24–48 hours.

2.5.3 DNS Messages and Resource Records

  • A resource record is a four-tuple: (Name, Value, Type, TTL)
  • The TTL is the time before the record should be evicted from cache.

Four types:
A: Name is host, Value is host’s IP address. Nowadays we also have AAAA for IPv6 addresses.
NS: Name is a domain, Value is the hostname of an authoritative DNS server for that domain.
CNAME: Name is an alias, Value is the canonical hostname.
MX: Name is an alias, Value is the canonical hostname of a mail server. This exists so a mail server and another kind of server (such as a web server) can have the same alias but different canonical hostnames: web clients request the CNAME record for the alias and get back the web server's canonical hostname, while mail clients request the MX record for the same alias and get back the mail server's canonical hostname.

Note 1: There are actually an assload of DNS record types.
Note 2: On macOS and Linux (probably also through WSL on Windows) you can use the terminal command dig to do DNS lookups.
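The (Name, Value, Type, TTL) four-tuple maps naturally onto a namedtuple. A small sketch with made-up records, one of each type discussed above:

```python
from collections import namedtuple

# A resource record is a four-tuple: (Name, Value, Type, TTL).
ResourceRecord = namedtuple("ResourceRecord", ["name", "value", "type", "ttl"])

# Placeholder records, one per type covered above.
records = [
    ResourceRecord("www.example.com", "192.0.2.5", "A", 3600),
    ResourceRecord("example.com", "ns1.example.com", "NS", 86400),
    ResourceRecord("w3.example.com", "www.example.com", "CNAME", 3600),
    ResourceRecord("example.com", "mail.example.com", "MX", 86400),
]

a_records = [r for r in records if r.type == "A"]
print(a_records[0].value)   # 192.0.2.5
```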

Format of a DNS message:

ID (16 bits)                 | Flags (16 bits)
Number of questions          | Number of answers
Number of authority records  | Number of additional records
  • The 16-bit ID links requests and responses together.
  • The flags indicate such things as whether the message is a query or reply, if the queried server is authoritative for the queried name, or whether to allow recursive queries (a DNS server receiving a query and querying another DNS server to fulfill it).
  • The numbers indicate the number of DNS records returned in the later sections.
  • Questions are queries with host names and types (A for host address, MX for mail server canonical name, etc.)
  • Answers are only in replies and are the answers for the queries (IP addresses for A queries, canonical names for CNAME and MX queries). You might get multiple answers when a hostname has multiple IPs.
  • Authority is the records of other authoritative servers.
  • Additional depends on the query but will contain other useful information. E.g. for an MX query, the answers section will contain the canonical hostname of the mail server, and the additional section will contain the A records mapping the canonical name to an IP address so you don’t have to query again for them. Similarly, for an NS query, the answers section will have the canonical names of the authoritative DNS servers, and the additional section will have the A records for the IP addresses. You can mess around with dig to see these.
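The 12-byte header laid out above is easy to build by hand; a sketch of the query side, with the query ID chosen arbitrarily:

```python
import struct

# Pack the DNS header: ID, flags, then the four 16-bit record counts,
# all big-endian. In a query, only the question count is nonzero.
def build_query_header(query_id, recursion_desired=True, num_questions=1):
    flags = 0
    if recursion_desired:
        flags |= 0x0100   # the RD (recursion desired) bit
    return struct.pack(">HHHHHH", query_id, flags,
                       num_questions, 0, 0, 0)

header = build_query_header(0xBEEF)
print(len(header))        # 12
print(header[:2].hex())   # beef
```

A full query would append the question section (the encoded hostname and query type) after this header.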

Records are inserted into DNS by paying a registrar (registrars are accredited by ICANN) to get your domain mapped to your authoritative DNS server in a TLD server for your top-level domain.

There were two more sections, one about P2P protocols and BitTorrent, another about implementing custom application-layer protocols in Java by using the various socket classes. I read the P2P section but chose not to take extensive notes on it. I skimmed the implementation section.

Aphorisms from The Pragmatic Programmer(‘s first half)

I can’t remember the last time I read more than halfway through a tech book. Tech books tend to peak early. You get more than half the value from the first half of the book. A lot of times the second half is specialized or optional topics, and a lot of times those topics are out of date unless you buy the book the second it’s off the presses. Of course, I’m also lazy, and about halfway through a tech book I’m usually starting to lose interest and decide I’m okay with building CRUD web apps for the rest of my life if it means I don’t have to read any more of this.

I still haven’t made it all the way through The Pragmatic Programmer either, but I’ve made it further than I usually do—about two-thirds of the way through. And I actually want to read the rest of it. I don’t always like its metaphors or cutesy terminology like “binary chop”, but it’s memorable and easy to read.

The Pragmatic Programmer is structured as a list of 100 tips. Some of them cover code structure and design (“Tip 47: Avoid global data”). Some cover development practices (“Tip 28: Always use version control”). And some relate to personal development (“Tip 11: English is just another programming language”), philosophy (“Tip 3: You have agency”), and ethics (“Tip 99: Don’t enable scumbags”). Each tip cross-references related tips. The end result feels more like reading a bunch of blog posts than a single coherent book. The tips are grouped broadly by topic, but a single tip can go from explaining a technical concept to coding advice to rats chewing through cables. As “pragmatic” suggests, the book imposes no hard lines on its topics and the commentary on the tips goes wherever it needs to go.

In the rest of this post, I’ll go through some of the tips and my idiotic hot takes on them.

Tips 1–4

I tried to read The Pragmatic Programmer for the first time about six years ago. I looked at the Kindle sample, which only had Tips 1–4. I thought, “Wow, these guys are smug as hell”, deleted the sample, and didn’t come back to the book for another six years.

Tips 1–4 are basically the preamble that some college professors put in their syllabi where they lecture you about how you’re an adult now, you need to take responsibility for your own actions, only you benefit from attending this course and finishing your assignments, so on, yada yada, etc. If you’re a bit of a beaver-cleaver, you’ll read these four tips, pat yourself on the back for being so responsible and never making lame excuses, and a deep contentment will warm you from the heart outward for the rest of the day. Otherwise you can just skip to Tip 5. Go on, be a rebel. I won’t tell.

Tip 5: Don’t Live with Broken Windows

When it comes to software rot (or “technical debt”), there is definitely some truth to this idea. If developers feel the code is badly designed and full of hacks, they will contemptuously pile more hacks on top. I’ve seen it happen firsthand. We spent seven months talking about rewriting the entire codebase because the whole thing was such a cancerous dumpster fire that it could never be improved. Then most of the team left because the code was such a cancerous dumpster fire, and about a year after that, management finally gave the go-ahead to rewrite the code.

On the other hand, The Pragmatic Programmer pushes an absolute zero tolerance policy towards broken windows, which has not been realistic in my experience. All the codebases I’ve worked on were written as quickly as possible by people who are long gone. They’re usually bad in various ways. Sometimes in ways that are easy to fix—bad formatting can be fixed by automated tools, and fancy IDEs like IntelliJ IDEA can stick a big glowing flag on certain kinds of anti-patterns and code smells. Sometimes you can go back and patch up the rot and make sure to do better going forward. But sometimes the broken windows are so fundamental or widespread that you can’t fix them without tearing down huge swathes of code and rebuilding deep-seated parts of it. And sometimes you simply aren’t allowed the time to clean up bad code, because your team is tiny and you can’t afford to put a whole developer on a cleanup job for a month unless customers are complaining about it.

However, sometimes, even if you can’t fix the broken windows, you can at least quarantine them so most of your daily work doesn’t touch that code. One codebase I worked on had an absolutely hideous subsystem where a developer wrote all the code in a single god class of 500-line methods that mutated local variables and then returned them to be passed as arguments into other methods that also mutated them. It was horrid, and so brittle that with the original developer gone there was no chance of fixing it without breaking something, but on the bright side, it was all in a single god class, so we could just never look at that class until we had to.

Tip 9: Invest Regularly in Your Knowledge Portfolio

Steve Yegge points out in several of his blog rants that the half-life of technical knowledge can vary a lot. So while it might be tempting to study the hip new language or framework that everyone on Hacker News is talking about, it’s a good idea to invest in long-term assets as well.

Algorithms, data structures, math, and statistics rarely, if ever, go out of date. Unix tools are practical, have been around forever, and show no sign of going away, so getting familiar with grep and sed can be a great investment. It’ll pay off when you have some weird bug that only happens on the third Thursday of every month and you can bust out a grep command to find all those logs in your log files without even looking at the man pages. find, tr, cut, awk, curl, nc, jq, and screen are also great commands to look into, as well as the shell language itself. Recently I finished a script in four hours that I thought was going to take two days, because I realized I could replace a bunch of API calls and ugly JSON structure rearrangement that I was going to write out in Python with a short bash script using curl and jq. Editors like Vim and Emacs have also been around forever, and learning some of the more powerful commands can save you a ton of time. For a while, I had to debug data pipelines, which usually meant receiving a CSV of data that someone wasn’t satisfied with, taking the UUIDs out of it, and querying a bunch of tables to find out where in the pipeline data got dropped or corrupted. cut and Emacs saved me a ton of time; I could use cut to extract the UUIDs from the CSV, then paste them into Emacs and use replace-regexp to quote them and reformat them into a comma-separated list that I could paste straight into an IN clause of an SQL query. Which brings me to SQL—it’s been around since the ’70s, and unlike disco, it doesn’t seem to be going anywhere, so you could do worse than to study it. Many of the weird bugs ORMs cause only become clear once you understand the SQL they must be generating.

The Pragmatic Programmer also suggests investing time in some of the weirder languages to have a bigger stock of ideas to draw on. I think this idea bears repeating. I see too many people online who know Python, Ruby, and Javascript and want to have a language war over which of those three is better. I also see people who know Python, Ruby, and Javascript complain that Java is ancient, inscrutable gibberish that somehow is also the most widely used language in the world. I was this person. When I first started learning to program, I wrote a huge rant (that I thankfully never published) comparing Visual Basic, Java, and Python. My arguments were that Visual Basic was better than Java because you have to write End If, so you know it ends an If, whereas in Java it’s always just a brace, but Python was the best because whitespace. I was stupid. What I was doing was equivalent to reviewing three books which all cover the same material in the same order and only comparing what fonts the text was printed in. I definitely support spending time with Icon, Prolog, OCaml, Io, or Idris, and learning just how different a language can be, before discussing differences between languages.

Tip 10: Critically Analyze What You Read and Hear

I majored in computer science, but before that, I was an English major, and before that, I spent ten years wanting to be an English major. One of the most important things you learn studying humanities is exactly what this tip says—how to critically analyze what you read and hear.

When you’re evaluating a new technology, be critical about the claims. Don’t just accept flashy benchmarks; probe the methodology behind those flashy benchmarks. Think about the context behind claims: maybe a group at Amazon is building great things with an experimental JIT compiler for Lua, but does that mean your team will be able to do the same? Maybe the software actually isn’t business critical and only has five users. Maybe the creator of the experimental JIT compiler is on the team and can fix any bugs they run into. Maybe Amazon, being Amazon, can make it worth the creator’s while to solve their problems.

Being critical doesn’t mean being constantly and unreservedly cynical. You can be positive about things. You can love Rust if you want. But you should love Rust because you critically analyzed the claims made about it and decided they were reasonably accurate, and you should always be on the lookout for new information that contradicts your current point of view. When you find that new information, critically analyze it and decide if it might be true or not. Then decide if you need to care. Even if your favorite language chokes on 5 HTTP requests per second, you can still enjoy hacking together a hobby project with it. But be honest; don’t convince your company to build something in your favorite language when there’s a strong chance the system will need to handle 1,000 HTTP requests per second.

Tip 14: Good Design is Easier to Change Than Bad Design

This is true, but no design is infinitely flexible. You always have to make some choices that will lock you in. I’ve seen (and written) a ton of code that was easy to extend in a way that turned out to never be necessary. I’ve also seen (and written) a ton of code that was built to be extensible, but not quite extensible enough to capture a later use case. This is where the tips about making your design orthogonal and decoupled come in. If you can rip out a whole module and replace it with something else, then you can change it, even if it’s not as easy to change as it would be if you’d thought of something when you first designed it.

Tip 17: Eliminate Effects between Unrelated Things

This is the larger principle behind a ton of well-known best practices. Don’t use global variables, because they provide a channel for two unrelated things to affect each other. The C preprocessor is a pain because unrelated files can affect each other if they happen to be compiled together. I also find a lot of semi-common practices in hyper-dynamic languages like Ruby, Javascript, and Python to have a lot of potential to create effects between unrelated things. Javascript didn’t always have real modules, so libraries would shove things into the global namespace. It was a complete nightmare to deal with libraries fighting for symbol real estate in the window object. (JS for some reason allows just $ as a function name, so everybody fought over $ until jQuery came in and definitively staked claim to it.) I’ve had some infuriating bugs due to Ruby’s ability to define a class across several files combined with Rails’s implicit global imports. In Ruby, if you define a class with the same name in two different files and load both of those files, they will be combined into a single class. Rails will automatically load every file you put in its blessed directories. So I’ve encountered ridiculous bugs where two programmers made classes with the same name in different contexts, and Rails helpfully imported both files, causing Ruby to combine those two unrelated classes into a single class and breaking code in two places that had nothing to do with each other.

Tip 19: Forgo Following Fads

This tip is pretty useless. The idea is good, but it offers no guidance on figuring out what’s a fad and should be ignored. I’ll offer a pointer back to Tip 10, though: critically analyze what you read and hear.

Tip 20: Use Tracer Bullets to Find the Target

This tip is awesome and the book is worth the cover price just for this.

The basic idea is that when you start a new system, you build an end-to-end skeleton of the entire architecture with some example flows. Stub out anything you have to, just get the end-to-end working as quickly as possible. This will show you what the architecture you have planned will look like in practice. It’s called a “tracer bullet” because it lets you see where you’re aiming, and you can assess how close your shot is to the target.

The book suggests showing your skeleton to customers. This probably depends on your organization’s culture and relationship to customers. Nothing I’ve worked on benefited from being shown to customers. The customers would get hung up on details like the color of the banner on mocked up web pages, or they would seize on random technical details that they happened to know about (“It won’t allow escalation of privilege, right?”, “The JSON responses are small and streamlined, right? That makes it fast.”, “Did you use Angular? Angular makes rich interfaces.”), or they would look at it, grunt, and go back to whatever they were doing. But it can be a great benefit to show it to engineering leadership, or to engineers on other teams, or product managers or UX experts. And if you happen to have customers able to give useful feedback, sure, show it to them.

Tip 32: Read the Damn Error Message

Part of this tip discusses rubber ducking, which is where you explain your problem to a rubber duck because your problems are idiotic and should not be inflicted on other humans. Because your problem is so idiotic, the rubber duck’s silence will make you feel shame, causing you to quit the industry and take up competitive bass fishing.

In case you don’t want to quit the industry yet, I’ve found that a more productive strategy than talking to a duck is writing out what the buggy code is doing as if you’re manually running through something you’ve written for an interview. If you don’t know why something happens, note that and the result and move on. Start a little before the bug, and keep going to a little bit after—the moment where the exception is thrown or the erroneous result returned is usually a good place. For me this usually causes something to click eventually, and I see what I’m missing. It can also help, once you’ve isolated the buggy code and the bad result it outputs, to treat it like a brainstorming session—what are five ways this bad result could have been created? Get completely crazy; “solar wind knocking an electron out of alignment, causing the bits to swap” is on the table. How do those five ways relate to the surrounding code? What else is going on in the program at the same time this code is executing?

Tip 37: Design with Contracts

I’ve come around a little on statically typed languages. Java is still kinda painful to write, but at least it’s also kinda painful to write for that guy who was three months out of university when he was asked to write an entire analytics tool in a week and didn’t have time to write comments or real error handling—his code is probably a mess, but at least Java forced him to leave a few more breadcrumbs about what the hell he was doing than Python or Javascript would have. And newer languages like Go and Rust reduce a lot of the annoying parts of Java by adding better type inference and other goodies.

Statically typed languages force you to write basic contracts for every function. It’s nicer to write contracts that express some real invariant about your data—“x represents an age, so it must be between 0 and 200”—but at least “x is an integer” is more than nothing.

But contracts are most important for system boundaries, where different people’s code comes together, and nowadays that often happens over a network. So that’s where statically typed data transfer formats, like protocol buffers, become useful.


There’s a lot of good advice in The Pragmatic Programmer; most of the tips I didn’t bother to discuss because I just nodded when I read them. But there’s also some stuff implicit in the book’s worldview that I don’t agree with—I have a whole rant about “people are not their code” that I might go into when I finish up the book. I recommend you read it, but that you keep in mind Tip 10. Don’t be a beaver-cleaver. Be a cool rebel who studies humanities. Donne is subversive, man.

You Won’t Be Remembered For Your Code

A lot of people who are first getting into the software industry seem to think that they’re going to make some kind of impact with their code. The dumber ones think it’s going to be some kind of huge splash, like people are going to print their code on giant wall scrolls and put it up in museums so future programmers can come through and admire its structure and that amazing little trick with the lambda. The smarter ones know it will be a small impact, but they still think it will be an impact of some kind.

Allow me to rain on your parade: you will never be remembered for your code. Code in industry software is endlessly erased and rewritten. Even that picture of a penis you carved into the wall of a bathroom stall in sixth grade and then initialed like it was some great work of art is more likely to still be there, because replacing the walls of bathroom stalls is expensive and laborious. But erasing your clever little trick with the lambda and replacing it with ten lines of plodding that even the biggest dunderhead couldn’t possibly fail to understand takes about five minutes, and people who erase code are on salary, so it didn’t even cost the company five minutes of work, whereas people who replace bathroom stall walls get paid hourly. Plus, now even the biggest dunderhead couldn’t possibly fail to understand that code, so they can fire the programmer who replaced your clever little lambda trick and hire the biggest dunderhead for a lower salary.

No one is remembered for code. Steve Jobs is remembered for product design and marketing. Bill Gates will be remembered for being insanely rich. Programming languages are products, so programming language creators are also remembered as product designers, not for their wonderful code, and they’re easily forgotten once no one is using the language anymore; give it twenty years and Larry Wall will be a random piece of programming trivia alongside John Ousterhout, Jean Ichbiah, and Niklaus Wirth. Same goes for frameworks, which are even more disposable than programming languages. If anyone remembers David Heinemeier Hansson or Rails at all, it won’t be for some ingenious metaprogramming trick he used in Active Record.

And these are the really famous people, the people whose code actually affected the way thousands of others do their jobs. Even they will not be remembered for how wonderful their code was. So trust me when I say that you won’t be remembered for your great code.

However, if your code sucks, it will keep your memory alive for a while. Every time someone goes into a sloppily formatted file with dozens of unnecessary global variables, lots of overlong methods, and comments that say x++; // Add one to x, they’ll head on over to Git and find out who wrote this crap. If you wrote enough of the code, you might even become something of a legend for a while. Of course, the other programmers will make a point of extinguishing your imprint from the code, so your code is still headed for death, and your legend will live on only in the tales the other programmers tell of the crappy code they suffered with until they could sneak in enough refactoring to eliminate it. You’ll be remembered, but you’ll be remembered the same way Tom Green is remembered, as an irritating bore that everyone raced to expurgate as quickly as humanly possible.

So don’t get into software if you think code is a good way to be remembered. It’s not. Product design is more likely to get you remembered by someone, but that’s not foolproof either; not many people, even in programming circles, have heard of Ward Cunningham, who invented the wiki, or Dan Bricklin, who invented the spreadsheet, even though those are both products at least as important as anything Steve Jobs ever made.

Every Freaking Tech Job Ad Ever

We’re looking for a superstar COBOL ninja to join our team! Applicants must have at least 40 years experience with COBOL. COBOL.NET experience preferred. You’ll spend every day hacking away on our exciting, disruptive legacy banking product, making it more robust and scalable as well as more exciting and disruptive. We are an exciting, young, hip, radical, groovy team of hackers, rockstars, ninjas, and wizards who love tech, startups, and tech startups, as well as disruption, excitement, and disruptive excitement, particularly in tech! We’re changing the face of banking by using the latest, most exciting technologies to disrupt all the old ideas about storing money and replace them with exciting, disruptive new ones! So come join us and remake the face of banking by making hacky temporary workarounds to problems with our mainframe OS that were fixed thirty years ago by upgrading to Windows 3.1!


  • 40 years COBOL experience.
  • Experience working with decimal computers
  • Teletype experience
  • Experience with mainframes
  • Experience with banking
  • Experience with banking mainframes
  • Experience with exciting, disruptive developments in legacy banking mainframes
  • PhD. in Computer Science, Computer Engineering, or Nucular Engineering
  • 10+ years software experience
  • Must be under 24 years of age
  • Must be able to lift up to five grams
  • Great team player who loves working in teams and can’t do anything alone. Anything.
  • Be a hip, radical, groovy ninja who’s excited about disrupting banking with disruptive legacy mainframe systems

Nice to have:

  • Experience with TCL, MUMPS, or RPG
  • Experience with Visual Foxpro
  • Dataflex experience
  • Juggling three or more balls
  • Knowledge of wilderness survival
  • Willing to spend business trips living in a van down by the river
  • A pulse

From Java to Python in Pictures

I’ve been working mostly in Java at my job for the past several months. It’s started to feel kind of like this:

[screenshot]

[screenshot]

And every time I have to whip out a design pattern or build yet another layer or deal with generics or interfaces or add 50 public setters for the benefit of some library, it kind of feels like this:

[screenshot]

A few weekends ago I worked on a small hobby project in Python. After all that Java I’ve been doing recently, it felt kind of like this:


Your mileage may vary.

The Ballad of Leftpad: Or, Batteries Not Included

By now, you’ve probably heard about the leftpad debacle. If not, let me be the first to explain it. Some guy wrote an 11-line Javascript function that prepends padding characters to a string (if you need them on the right, you’re outta luck), and for some reason he published it on NPM. Then a bunch of people, including the React team at Facebook apparently, used it as a dependency in their projects. Then the guy who wrote leftpad got mad, ragequit NPM, and took down his package. This broke Node and Javascript projects all across Silicon Valley, and since some of those projects were making money, and their making money was inextricably tied to their not being broken, this was not very good at all.
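For the curious, the function at the heart of all this is small enough to sketch from the description alone. This is a reconstruction, not the actual published package source:

```javascript
// A minimal left-pad: prepend `ch` (default: a space) to `str`
// until it is at least `len` characters long. Reconstructed for
// illustration; not the exact NPM package code.
function leftpad(str, len, ch) {
  str = String(str);
  ch = ch === undefined ? ' ' : String(ch);
  while (str.length < len) {
    str = ch + str;
  }
  return str;
}

console.log(leftpad('5', 3, '0')); // "005"
console.log(leftpad('abc', 2));    // "abc" (already long enough)
```

Eleven lines, give or take. Hold that thought.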

I read a blog post about this incident in which the author wondered if we’d all forgotten how to program and couldn’t have written this function ourselves. A bunch of commenters responded that you shouldn’t have to write this function yourself and it was right of these projects to include an 11-line dependency, and everyone argued.

I happen to feel that both sides of this argument are wrong. Of course you don’t want to include a dependency for this stupid little crap function. And of course you don’t want to write it yourself, even if you’re perfectly capable of doing so. Something like this really belongs in the language’s standard library.

Languages used to come with these things called standard libraries. Remember those? You didn’t have to include them in a package file or download them from a repository somewhere. They were just there. If you had the language, you had all these libraries. They could do simple things like left-padding a string, or joining the elements of a list, or even right-padding a string, that you could have done yourself. They could also do more complicated things, like converting between character encodings, or parsing CSV files, or making HTTP calls, or creating threads. Sometimes they even had these things called data structures, for storing data.
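To be fair to Javascript, some of those simple things did eventually show up: padStart and padEnd landed in the standard library with ES2017, the year after the leftpad incident, and join has been there all along. For example:

```javascript
// No dependency, no copy-paste: these ship with the language.
console.log(['a', 'b', 'c'].join(', ')); // "a, b, c"
console.log('5'.padStart(3, '0'));       // "005" (ES2017)
console.log('5'.padEnd(3, '0'));         // "500" (ES2017)
```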

Presumably most of these people arguing in favor of leftpad were Javascript programmers. Javascript kind of has a standard library, but not really. It’s really spotty; compared with Python, Java, or C#, it’s missing a lot. Even when the library gets expanded, you have to wait for Safari and Edge to implement that part of it before you can use it. God forbid you need to support IE, even the reasonably standards-compliant IE10 and 11. So Javascript programmers will often use polyfills, which are NPM packages that implement whatever the browser is missing. AKA, dependencies. In the old days, the old days being approximately four years ago, there was no NPM; people would link in CDNs or just copy and paste the code into their current project. In that cultural context, you can see why Javascript programmers would argue in favor of leftpad: it kind of is the sort of thing you shouldn’t have to write yourself, even if you’re capable of doing so, but if you’re not going to write it yourself, you’ve got to get it from somewhere, and getting it from NPM sure is nicer than copying it off someone’s blog and pasting it into your codebase.
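Concretely, a polyfill is just a conditional patch for the missing standard-library piece. Here’s the usual pattern, as a simplified sketch rather than the spec-compliant polyfill you’d actually ship:

```javascript
// The polyfill pattern: define the method only if the environment
// lacks it. Simplified sketch; real polyfills handle more edge cases.
if (!String.prototype.padStart) {
  String.prototype.padStart = function (len, ch) {
    ch = ch === undefined ? ' ' : String(ch);
    var str = String(this);
    while (str.length < len) {
      str = ch + str;
    }
    return str;
  };
}

console.log('7'.padStart(3, '0')); // "007", native or not
```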

On the other hand, I have a lot of sympathy for this comment from the blog:

Prior to the emergence of jQuery, JavaScript development was [a] mess. Polyfills for different browsers and other abominations [were] all copy-pasted from project to project. It was a toy language that could be hacked into doing cool things (Google Maps and Gmail). The idea that a layer of abstraction that [sic] could hide browser complexity was a revelation. This idea took hold and grew till we got to the sitation [sic] where we are now. Simultaneously the “one language to rule them all” cargo cult and SPA trends emerged and JS ended up being in the right place, at the right time with the right toolsets.

Any edge cases and incompatible language implementations -including incorrect or missing operators(!) can be abstracted away and incorporated into the package and shared. I personally think that modules are a good way of working around the rickety mess that is JavaScript development. Perhaps the real answer it [sic] to stop fetishising a 20-year-old language that was thrown together in a week, and make a concerted effort to standardize on something better.

Javascript is not a great language. It’s definitely better than PHP, or COBOL. I’m gonna say it’s nicer to use than C++, too. It’s an okay language, and Brendan Eich making it as good as it is, given the constraints he was under, is a laudable achievement. But it’s not a great language, and there’s a ton of over-the-top love for it going around these days that doesn’t seem justified. But it’s what we’ve got, and we’re probably stuck with it; if it’s this hard and takes this long to get ES6, I can’t imagine a switch to a completely new language happening anytime this century. And lots of people have worked hard to make it better, and they’ve mostly done a good job.

However, despite these efforts, the Javascript ecosystem is definitely not first-class. I recently converted a project at work from Gulp to Webpack. The experience was not pleasant. It was a side project that I mostly pursued nights and weekends, and it took me over two months to finish, because Webpack is horribly complex. At the end of the day, there were still things I wasn’t satisfied with, things I had to hack in weird ways to make them work. And after those two and a half months of work, I could create modules. Webpack can do more, but I wasn’t doing more; I was doing what Python does right out of the box, effortlessly. While I appreciate the engineering effort it must have taken to create Webpack, I can’t call something a great tool if it makes basic shit like creating modules that difficult.
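For a sense of scale, the basic bundling that took all that effort reduces, in the simplest case, to a config along these lines (the entry and output paths here are hypothetical placeholders, not my actual project):

```javascript
// webpack.config.js: a minimal bundling setup. One entry point,
// one bundled output file. Everything beyond this is where the
// two and a half months went.
const path = require('path');

module.exports = {
  entry: './src/index.js',
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: 'bundle.js',
  },
};
```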

I’m not saying these things to insult Javascript or Javascript programmers. I’m telling you guys not to settle for less. And definitely don’t create a philosophy around settling for less. Don’t create a philosophy that it’s okay not to have a standard library because you can just include a million little dependencies from NPM. That’s silly. It’s like Java needing to download the BigInteger class from Maven Central. Javascript programmers deserve a first-class standard library, just like Python and Java and Ruby and C# have. You haven’t got one right now; you’ve got a bunch of tiny packages on NPM, of questionable quality, that some guy can pull whenever he feels like it. If you own the deficiencies of your ecosystem instead of arguing that they aren’t problems, you’ll be further on your way to fixing them.

Harry Potter and Bilbo Baggins: Two Approaches to Magic

There’s a lot of magic in modern programming.

When I say “magic”, I really mean “sufficiently-advanced technology”. And when I say “sufficiently-advanced technology”, I really mean “library that you have no fricking idea what the kowloon putonghua it’s doing, but you use it anyway because it makes your job so much easier.”

There are lots of libraries where I have no idea what the mugu gaipan they’re doing to accomplish what they do. Just today, I confronted yet another one: Mockito, the mock object library for Java. Some of what it does, I know it does with reflection. Some, I can’t even imagine how it does it without hacking the bytecode. (It doesn’t. PowerMock, on the other hand, does hack the bytecode. It’s like adding capabilities to C by building a library that opens up a binary as if it were its diary, writes in a bunch of bits, and closes it up again. It’s bionic programs.)

Some people probably wouldn’t be bothered by this. They would use the magic without understanding it, happy that they have these magical artifacts to help them do their job. On the other hand, all this magic drives me mad. I complained in my post on web frameworks that I didn’t like Rails because, as Princess Jasmine said breathlessly, “It’s all so magical!” Then, later, I tried Rails again, and all of a sudden, it didn’t bother me. What I didn’t realize at the time was that I had spent several months between those two incidents reading the odd article on metaprogramming, usually in a Lisp/Clojure context but sometimes also pertaining to Python and Ruby. A lot of these articles would drop little hints about how RSpec and ActiveRecord were implemented. By the time I came back to Rails, it suddenly wasn’t so magical, because I could see the threads of fate behind the tapestry, weaving themselves into finders and schema definitions and migrations.

I’m not going to pass judgment on people who aren’t bothered by the magic. I’m not going to say they’re ruining programming or whatever. Frankly, most of this magic is pretty well packaged up; if you ever do have problems with it, just Google whatever cryptic passage flashes on screen and you’ll find a tome of lore with the answer you seek. In practice, this is what I have to do most of the time because otherwise I would never get anything done. So I can’t blame anyone if this doesn’t bother them.

However, this does remind me of a difference in approach between two famous fantasy heroes, Bilbo Baggins and Harry Potter.

Bilbo Baggins is a hobbit. He has a mithril vest and a magic sword, Sting, that glows in the presence of orcs, and a magic ring that turns him invisible when he wears it. He doesn’t question how or why they work; they work, so he uses them to fight with the goblins, battle a troll, and escape in a barrel from the Elf-King’s hall. (If you’re not familiar with the story, you can find out all about it from the original work. Just Google “The Ballad of Bilbo Baggins”.)

Bilbo hangs out with Gandalf, a wizard, and Elrond, an elf lord. Wizards can do magic. So can elf lords. Bilbo doesn’t know or care why or how. They just can; he accepts it.

For the most part, this strategy works for Bilbo. He survives all sorts of things that would have killed a lesser man, becomes fantastically rich, and ends up sailing off into the West to be sponge-bathed by elf maidens in his dotage. He does run into a rather nasty edge case fifty years on, when it turns out the magic ring is actually the One Ring to rule them all, created by the Dark Lord Sauron in the fires of Mordor. But by now, he’s retired, so it’s really someone else’s problem.

Like Gandalf, Harry Potter is a wizard. Unlike Gandalf, Harry Potter didn’t spring into creation fully formed and able to do magic just because some cosmic guy played a harp. Harry Potter had to slave away for six years learning magic at Hogwarts. (It would have been seven years, but destroying Lord Voldemort’s Horcruxes gave him enough extracurricular credit to skip a year, sort of like doing a year-abroad in Muggle colleges, except with more death.) Harry had to take all kinds of crap to learn magic; he had to put up with Snape bullying him, and McGonagall riding him to win at Quidditch, and he had to stay up until 1 AM writing essays full of ludicrous made-up garbage for Professor Trelawney. He had to deal with Dumbledore, who refused to ever tell him anything straight out, instead making him engage in some insane version of the Socratic method where you die if you don’t guess right.

At the end of all this, Harry Potter still isn’t that powerful. He manages to do the Imperius Curse and Sectumsempra, but that doesn’t help him much since those are both illegal. After six years of slaving away, it’s hard to see how Harry Potter is more powerful than he was when he started.

In the Harry Potter world, magic is hard. There are some pre-packaged artifacts to help you, like the invisibility cloak, but to be really effective at magic, you have to learn it. You have to study hard. Harry Potter doesn’t really study hard. Maybe I should have used a different example, like His Dark Materials, or Edward Elric. But Harry Potter works; while he does benefit from pre-packaged artifacts like the Invisibility Cloak and the Elder Wand, there’s always the implication that anyone smart and hardworking can learn how those things work, and even reproduce or improve on them.

It bugs me to trust anything that can think if I can’t see where it keeps its brain. It bothers me if I can’t understand how something is implemented. It bothers me when I can’t understand how something is even possible without bytecode manipulation, and I want to know the answer, even if the answer is “it’s not possible without bytecode manipulation”. I take a Harry Potter approach to programming.

Well, here’s something else we can all fight about! Soon we’ll be insulting people by calling them hobbit programmers.