Recently, Sijin Joseph’s Programmer Competency Matrix was trending on Hacker News. Sadly, the conversation quickly devolved to whether or not it’s truly a reliable means of measuring a programmer’s competency. While that may indeed be a valid point to debate, a discussion more helpful to the larger programming community would have been how developers starting at level 0 could master all of the concepts discussed in the matrix.

In the spirit of open source, as well as to refresh my own understanding of these topics and fill a few holes in my knowledge, I have decided to work through one competency level of one section of the matrix per month here on my blog. We’ll start with level 0 (2^n) and work our way up to level 3 (log(n)). (And I’m sure we’ll discuss what that notation means along the way.)

Since there are 5 sections to the matrix and each section has 4 levels of competency, that yields 20 months worth of posts doing one section + level per month. However, in order to be as thorough as possible, things may need to be taken in smaller chunks (especially the extra-long “Programming” section of the matrix). Thus, to avoid confusion, I will always thoroughly document the relevant section + level of the matrix covered in each post.

I invite you to buckle up and join me as we get started by tackling Computer Science: Level 0 from the Programmer Competency Matrix.

Data Structures: Arrays vs. Linked Lists

Before we dig into arrays and linked lists, it would first be good to understand the concept of a data structure. A data structure, in its simplest form, is really just a fancy word for:

A means of organizing and representing information so as to be more usable by its intended audience.

For example, say you have a collection of rare books. Each of your books has information we call metadata (things like author, publishing date, etc.). Putting this metadata into a form that allows you to quickly and easily find the information you want is the job of a data structure. For example, you could arrange your books alphabetically or in the order in which they were published, and this structure would aid someone in retrieving a desired book at a later date.

Arrays and linked lists are just different types of data structures with different uses. Let’s look at each in turn.

Arrays

You can think of an array like a row of seats at a theater. For example, suppose we have theater row A with seats 1-5:

Row A: [1] [2] [3] [4] [5]

Let’s suppose that the seats are assigned as follows:

Seat 1 - Peter
Seat 2 - Olivia
Seat 3 - Walter
Seat 4 - Astrid
Seat 5 - Charlie

Now, if you wanted to speak with Astro, er, Astrid, you would need to tap the should of the person in seat number 4. For Walter, it would be seat number 3.

Translating this metaphor back to arrays in the real world:

  • The row is our array,
  • The seat number is what we call the array index, and
  • The person in a given seat is the value for a given index number.

The basic idea behind an array is that, given an index, we can retrieve the value at that index.

In computer science, arrays can be visualized as follows:

[myArray]
 |
 | 
 |
[1]["Peter"]
[2]["Olivia"]
[3]["Walter"]
[4]["Astrid"]
[5]["Charlie"]

// Most of the time, an array variable simply
// points to a place in memory that holds the actual data

The trouble is: What an array actually is and what it can do varies drastically from programming language to programming language. What Ruby refers to as an array is more akin to a vector in C++, but C doesn’t even have vectors…but it does have arrays - confusing I know!

Thus, from a 10,000 foot point of view, all we can definitively say about an array is that:

An array is a sequentially indexed list of values.

Beyond that, refer to your language’s documentation for further details on its specific implementation of this data structure.

(P.S. Getting comfortable using arrays in your programming language of choice really pays off as they are a highly useful and very efficient data structure.)

Linked Lists

Moment of total geek honesty: You can probably get through your entire programming career without ever absolutely needing to use a linked list.

That said:

  • Understanding them will help you be a better programmer
  • Many higher-level functions and data structures use linked lists under the hood (so you probably can’t get by without using them, at least indirectly ;)

Where linked lists shine is against certain implementations of the array data structure we discussed above. In such implementations, you define the size of your array when you create it. . . stop and let that sink in for a minute and hopefully the limitations of that approach will become painfully obvious.

(If you’re struggling, maybe this quote from the visionary Bill Gates, uttered in the 1980s, will help: “640K [of RAM] ought to be enough for anyone.”)

The issue then becomes how does one add more elements to an array that is at capacity? Additionally, how do you remove elements from an array (especially the middle)? While these operations are possible, there is a data structure better equipped for handling data that changes frequently - linked lists to the rescue!

In a linked list, each item (node) in the list simply points to the next item (node) in the list:

["H"]--> ["e"]--> ["l"]--> ["l"]--> ["o"]

So:

  • Adding items to a linked list entails:
    1. Adding a pointer to a new list item
    2. To the item that was previously the last item in the list
  • Deleting an item means:
    1. Changing the node that comes before the deleted item
    2. To point to the node that comes after the deleted item (inserting an item works similarly; think about it)

The main drawback to linked lists as opposed to arrays is that there is no way to randomly access an item in a linked list. To find a particular value in a linked list, you need to start at the first item in the list and follow the pointers from item to item until you find the one you want.

Bottom line: Whereas arrays shine in accessing a value quickly given its index, linked lists shine in managing lists of data that need to update often but whose items don't need to be accessed randomly.

Algorithms: Average of Numbers in an Array

To solve this problem, think about what you need for an average of anything:

  • The sum of all the items in the data set
  • The number of items in the data set

The sum divided by the number of items yields the average.

Thus, in the context of the sum of an array, you need to:

  1. Loop through the array, adding the value at each index to a running total
  2. Increment a counter as you go through the array and track the number of items processed (or, alternately if permitted, call an arrayLength() method to get the size of the array)
  3. Divide the final total by the number of items in the array
  4. Display the results to the user

As with any programming challenge where you accept input, it would be wise to ensure that your program:

  • Checks that each value in the array is a valid integer (and responds appropriately if it's not)
  • Accounts for empty arrays

Systems Programming: Compilers, linkers, and interpreters

Compiler
Translates human-readable source code into machine-executable code. A compiler executes several stages (lexing, parsing, semantic checking, code generation) to accomplish this translation.
Linker
In the case where a program requires outside code (i.e., libraries, executables, etc.), a linker bundles (links) the program’s compiled code with any required external resources. The end result of this process is a single file incorporating all of the required files.
Interpreter
A compiler creates a program that can speak directly to the CPU; an interpreter receives code and executes the code (i.e., talks to the CPU) on the code’s behalf. Much like a human interpreter, a computer interpreter is a go-between for your code and the CPU that performs the instructions in your code. Rather than being statically compiled into executable code, code from interpreted languages is often executed by the interpreter "on the fly."

Conclusion

In writing the above, I referenced a number of blog posts and Wikipedia articles to refresh my memory of a few topics. Since I'll likely be doing that quite often throughout the course of this series, I will only cite such sources if a) I quote from them directly or b) if not doing so would mean plagiarizing content or ideas.

I hope this post was helpful – please let me know your thoughts if you get a chance.