Search Strategies For Learning

Of all the skills that a programmer needs to perform well, the most essential is the ability to learn quickly. The landscape of programming is steadily changing, and the only way to keep up is to learn new things as fast as you can. Programmers are constantly dealing with new problems, using new interfaces, learning new languages, and programming in new domains. We need to make our programs faster, bigger, slicker, cleaner, safer, more robust, more secure, and more useful.

To accomplish all of these goals, we're going to have to learn. It comes with the territory, and there's no getting around it, but being programmers, we're naturally going to try to optimize it. Much of learning to do something new consists of searching for what you need to know to take the next step towards your goal. It can take the form of searching the web for documentation or tutorials, reading books that will teach you all about the subject that you need to know, or searching manuals and specs for the exact information you need for a particular task.

So learning requires a lot of searching. Conveniently, computer science provides more than a few search algorithms that we might find useful. Of all the search algorithms available, the path-finding algorithms are probably the most applicable because most of the time, when programmers are learning something, we want to find a way to get from our current insufficient knowledge to a working system by following a series of logical, easily understood steps. That pretty much sounds like a path-finding problem to me, so let's look at some of the best options we have at our disposal to solve that problem.

Graphs, Vertices, and Edges, Oh My!

A graph is basically a set of objects that are connected together in some way. In mathematical terms, the objects are called vertices and the connections between them are called edges. Here is an example of a graph:

If the graph is connected and has no cycles, you can imagine grabbing the center vertex, dragging it up to the top, and shaking the graph a bit. What you get is a different representation of the same graph that's commonly referred to as a tree:

This tree could be thought of as representing a body of knowledge about a particular subject, some of which you need to know to be able to do something you want to do, like build a web page that can find the most efficient route from point A to point B or build a robot that can make your dinner. The root node is the general subject under consideration, the next level of nodes would be main topics of that subject, the next level would be subtopics, and so on. As you descend the levels, you get more and more detailed knowledge of more specific aspects of the subject.
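To make the tree a little more concrete, here is a minimal sketch of how such a body of knowledge might be written down in Python. The topic names and the `TOPICS`, `children`, and `depth` identifiers are all made up for illustration; a real subject would have far more nodes and levels.

```python
# Hypothetical topic tree for the robot-chef example; every name is illustrative.
# Each key is a topic and its value is the list of subtopics one level down.
TOPICS = {
    "robot chef": ["mechanics", "electronics", "software"],
    "mechanics": ["arms", "grippers"],
    "electronics": ["sensors", "motors"],
    "software": ["motion planning", "recipes"],
}


def children(topic):
    """Return the subtopics of a topic (leaf topics have none)."""
    return TOPICS.get(topic, [])


def depth(topic, root="robot chef"):
    """Return how many levels below the root a topic sits, or None if absent."""
    if topic == root:
        return 0
    for child in children(root):
        d = depth(topic, child)
        if d is not None:
            return d + 1
    return None
```

The root sits at depth 0, the main topics at depth 1, and the subtopics at depth 2, which mirrors the "more detail as you descend" idea.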

BFS

Breadth-First Search (BFS) is one of the two fundamental graph search methods. The other is Depth-First Search (DFS), which we'll get to in a minute. If you were looking for a particular piece of information on building and programming a robot chef, and you did a BFS of the subject, it would look something like this:

[Figure: tree with breadth-first search vertices highlighted]

You would start by doing a general survey of topics relating to building a robot, possibly by reading a book (or more) on the subject, researching online for general information about it, and going through reviews and documentation for a few robot starter kits. Once you had a general idea of how to go about building a robot, you could drop down a level and continue learning about each topic in more detail. At some point, you would feel comfortable enough to start building your own robot.

Of course, you can't learn and remember everything about robots before getting started, so there will be a balance between continuing the BFS and actually building the robot. The search will end up being somewhat guided by issues that come up when you run into gaps in your knowledge, but the general idea is that you take a broad view of the subject. You're trying to learn as many different topics as possible so that you'll know what's possible and have a better idea of where to look when you get stuck.
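For the curious, the level-by-level order that BFS produces can be sketched in a few lines of Python. This is only an illustration under some assumptions: the `TOPICS` tree and its topic names are hypothetical, and because it is a tree (no cycles) the sketch skips the "already visited" bookkeeping a general graph would need.

```python
from collections import deque

# Hypothetical topic tree; the names are illustrative only.
TOPICS = {
    "robot chef": ["mechanics", "electronics", "software"],
    "mechanics": ["arms", "grippers"],
    "electronics": ["sensors", "motors"],
    "software": ["motion planning", "recipes"],
}


def bfs_order(tree, root):
    """Return topics in breadth-first order: a whole level before the next."""
    order = []
    queue = deque([root])  # FIFO queue: oldest discovered topic comes out first
    while queue:
        topic = queue.popleft()
        order.append(topic)
        # Subtopics go to the back, so they wait until this level is finished.
        queue.extend(tree.get(topic, []))
    return order
```

Calling `bfs_order(TOPICS, "robot chef")` visits the three main topics before any of their subtopics, which is the "survey first, details later" pattern described above.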

I have a particular affinity for this approach. I'm not sure if I was always biased towards it, or if an experience drove me into adopting it as my primary strategy. But if it was the latter, I can distinctly remember what that experience was.

In fourth or fifth grade, I was in Cub Scouts, and one of the activities we did at one of our nightly meetings was a worksheet that had a large number of directions on it. I can't remember how many, exactly, maybe twenty directions in total. It was supposed to be a game, and the Scoutmaster upped the ante by making it a race. The first scout to complete the worksheet correctly won a prize.

The first instruction was to read all of the directions and then do exactly what they said. What followed was a series of instructions to complete a bunch of random tasks in the white space on the front and back of the worksheet - doing math problems and drawing pictures in specific places, folding the worksheet in various ways, circling and underlining different words on the page, etc.

Everyone immediately got to work frantically scribbling on the page. Well, almost everyone did. A couple of kids just sat there and read the whole worksheet, like the directions said. The last instruction simply said to ignore all of the other directions on the worksheet, draw a square in the upper right corner of the page, and put your pencil down. One other scout and I smoked the rest of the troop in that race, but even more importantly, the lesson of that worksheet really sank in for me.

When you race to try to finish something as fast as possible without knowing all (or most) of your options, you will likely end up wasting a lot of time doing meaningless, tedious, and superfluous work. If you take some time up front to survey the landscape, you can make better choices, accomplish much more in less time, and be more aware of available alternatives. Okay, that's why I love BFS and gravitate towards it when learning new things. What about other strategies?

DFS

The counterpart of BFS is DFS, which follows one branch all the way down before backing up to try the next. Getting back to our tree, it looks like this:

If you were using a DFS strategy while building our robot chef, you would probably go out and buy a robot kit, decide on a feature to get working first, and then dig into the documentation to try to implement that feature right away. Once that feature was done, you would move onto the next one and repeat the process. Your searching for information about robots would consist almost exclusively of finding the detailed information necessary to get each feature working as fast as possible.
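Here is a rough Python sketch of that order of exploration. As before, the `TOPICS` tree and its names are hypothetical, and the sketch assumes a tree with no cycles, so it doesn't track visited vertices.

```python
# Hypothetical topic tree; the names are illustrative only.
TOPICS = {
    "robot chef": ["mechanics", "electronics", "software"],
    "mechanics": ["arms", "grippers"],
    "electronics": ["sensors", "motors"],
    "software": ["motion planning", "recipes"],
}


def dfs_order(tree, root):
    """Return topics in depth-first order: finish one branch before the next."""
    order = []
    stack = [root]  # LIFO stack: the most recently discovered topic comes out first
    while stack:
        topic = stack.pop()
        order.append(topic)
        # Push subtopics in reverse so the first-listed one is explored first.
        stack.extend(reversed(tree.get(topic, [])))
    return order
```

The only real difference from a breadth-first traversal is the data structure: a stack instead of a queue. With this tree, `dfs_order(TOPICS, "robot chef")` goes through everything under "mechanics" before it ever looks at "electronics".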

The advantages of this approach are that you can get something working much more quickly than if you did a BFS, and you don't waste time learning things you may never need to know. I've rarely been good at this approach because I seem to need a good overview of a subject on which to hang more detailed knowledge, and I can't integrate the necessary details for a project unless I understand how the entire system works together. Some people are really good at DFS, though, and it's impressive to see them in action.

Beyond BFS and DFS, there are a couple of other path-finding algorithms that are applicable to learning.

A* Search

Nearly 60 years ago, Edsger Dijkstra invented an algorithm to find the entire set of shortest paths from one vertex to all other vertices in a weighted graph, even one that has cycles. It's still in widespread use today. We don't really need to find all paths, so it's overkill for our purposes. A* search is an extension of Dijkstra's algorithm that can be used to quickly find the shortest path to a particular vertex in a graph.
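As a point of reference, a common heap-based formulation of Dijkstra's algorithm looks roughly like the sketch below. The `GRAPH` dictionary, its vertex names, and its weights are invented for illustration, and the sketch assumes every edge weight is non-negative, which the algorithm requires.

```python
import heapq

# Hypothetical weighted graph: GRAPH[u][v] is the cost of the edge from u to v.
GRAPH = {
    "start": {"a": 1, "b": 4},
    "a": {"b": 2, "goal": 6},
    "b": {"goal": 1},
    "goal": {},
}


def dijkstra(graph, source):
    """Return the shortest known distance from source to every reachable vertex."""
    dist = {source: 0}
    heap = [(0, source)]  # min-heap of (distance so far, vertex)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter route to node was already found
        for neighbor, weight in graph[node].items():
            new_d = d + weight
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                heapq.heappush(heap, (new_d, neighbor))
    return dist
```

Notice that it computes a distance for every vertex it can reach, whether or not we care about that vertex, which is exactly the "overkill" part.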

A* search uses heuristics to guide the search on a graph that has edges of different weights. For the graphs of subject knowledge that we're dealing with, we can think of the edge weights as reflecting the likelihood that a path will lead to the knowledge necessary to get a particular feature working in our project, with more promising paths costing less. Using an appropriate heuristic to guide the search, an A* search would look like this:

It starts by examining all topics of the subject, and then picking the most likely subtopic to explore next. It ends up being a hybrid approach between BFS and DFS, combining the advantages and softening the disadvantages of both of them. With A* search you would probably find a good book to get a general overview of the topics on building robots, and then you would drill down on those topics that were the most interesting for building a robot that could cook. You may not get a working robot as fast as if you did DFS, but you'll be better versed in building robots in general and will probably end up with a better design.

Another feature of A* search is that it keeps a list of the next vertices to search in priority order. As each vertex is visited, the unvisited vertices connected to it are added to the list, and the vertex at the front of the priority queue is visited next. When studying something new, we naturally create a list like this in our heads and loosely follow it as we're looking for what we need to know. If we make that list more explicit, by keeping tabs open and ordered in our browser or taking notes on what we're reading, we can improve the heuristics we use and find what we need faster.
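The priority list is easiest to see in code. Below is a minimal A* sketch in Python. Everything specific in it is an assumption made for illustration: the `GRAPH` weights, the vertex names, and especially the `HEURISTIC` table, which stands in for our gut estimate of how much work is left from each vertex. The estimates are chosen so they never overestimate the true remaining cost, which is what A* needs to guarantee a shortest path.

```python
import heapq

# Hypothetical weighted graph: GRAPH[u][v] is the cost of the edge from u to v.
GRAPH = {
    "start": {"a": 1, "b": 4},
    "a": {"b": 2, "goal": 6},
    "b": {"goal": 1},
    "goal": {},
}

# Hypothetical estimates of remaining cost to "goal"; none exceeds the true cost.
HEURISTIC = {"start": 3, "a": 2, "b": 1, "goal": 0}


def a_star(graph, heuristic, start, goal):
    """Return (cost, path) for a cheapest path from start to goal, or None."""
    # Priority = cost so far + estimated cost remaining.
    frontier = [(heuristic[start], 0, start, [start])]
    best = {start: 0}  # cheapest cost found so far to reach each vertex
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale entry: a cheaper route to node was already found
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                priority = new_cost + heuristic[neighbor]
                heapq.heappush(
                    frontier, (priority, new_cost, neighbor, path + [neighbor])
                )
    return None
```

The `frontier` heap plays the role of the ordered list of browser tabs: each newly discovered vertex gets filed by how promising it looks, and the most promising one is always opened next. The search also stops as soon as it reaches the goal instead of mapping out the whole graph.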

A* search is the gold standard of search algorithms. There are many variations and optimizations of this algorithm that work better under certain conditions, but with a good heuristic, A* search is the go-to algorithm for path-finding searches. It also works quite well as a learning strategy. Let's look at one more search algorithm to round out our options.

Sample Search

Sample search is a polite way of saying random search. You're basically picking a topic or subtopic at random and learning about it. Sample search would look something like this:
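In code, a sample search is about as simple as it gets. The sketch below is only an illustration: the `TOPICS` tree is hypothetical, and `sample_search` and its `seed` parameter are names I made up, with the seed there purely so a run can be repeated.

```python
import random

# Hypothetical topic tree; the names are illustrative only.
TOPICS = {
    "robot chef": ["mechanics", "electronics", "software"],
    "mechanics": ["arms", "grippers"],
    "electronics": ["sensors", "motors"],
    "software": ["motion planning", "recipes"],
}


def all_topics(tree, root):
    """Collect every topic reachable from the root."""
    found, stack = [], [root]
    while stack:
        topic = stack.pop()
        found.append(topic)
        stack.extend(tree.get(topic, []))
    return found


def sample_search(tree, root, k, seed=None):
    """Pick up to k distinct topics at random, ignoring the tree's structure."""
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    topics = all_topics(tree, root)
    return rng.sample(topics, min(k, len(topics)))
```

Unlike the other strategies, nothing about where a topic sits in the tree influences whether it gets picked.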

You may be thinking, why the heck would I ever want to do that? But we do it all the time. My RSS reader is a constant stream of articles that may pique my interest and set me off exploring in one area or another. A well-curated Twitter feed would probably be much the same. (Although I haven't been able to get past all of the noise on the few attempts I've made so far.) Watching for interesting questions on Stackoverflow.com or other StackExchange.com sites would be another example of doing a sample search. Any time you come across a piece of information that interests you and you pick it up and run with it, you're unwittingly doing a sample search.

It may not be the best way to quickly learn about a specific problem that you're currently dealing with, but it's a great way to find new things that interest you that you never knew about before. And every once in a while serendipity happens, and you come across an article or a book reference during your random searching that turns out to be exactly what you need for that troublesome problem that you can't figure out.

As a primary means of learning, sample search is impractical, but the random nature of it can be the catalyst to overcome barriers that you may be running up against. That makes it quite powerful in certain situations, so it's worth keeping in mind when you get stuck.

Know Your Learning Strategies

Each of these search strategies can work well under the right circumstances. BFS works best in cases where you don't know much about the subject in question and you're going to be working with it for a long time, so you need to build a solid foundation before proceeding. DFS works best if you already have some knowledge in the area, and you need to figure out something specific that can be easily found and used. A* search works best when you need an overview of a subject before delving into one or more specific areas of interest. And sample search works best when none of the others do, and you need something to kick-start progress again.

Different people will make better use of different strategies, so be sure to experiment and find what works best for you. I naturally use BFS because I need to know how everything works, and I can think about problems much more effectively when I have a good idea of how everything fits together. But I've used all of these approaches, and they've all worked in the right context.

The trick is to realize when a particular search strategy is not working. Don't keep plowing along with a strategy that's going nowhere. Switch it up when you get stuck and you may find that a different strategy is what you needed to find exactly what you were looking for.
