Reading unfamiliar code

This is a procedure for reading unfamiliar code in C++.

1. Have a goal in mind.

Don’t approach the code without something to do. Start with a plan, and learn only the code you need in order to carry it out. You should never expect to understand the entire codebase.

2. Notes

Keep a notes file, as if you were writing documentation for someone else. More often than not, that someone else will be you, when you try to understand the code again in six months.

Describe the flow of the program for the part you are interested in, and note the data structures it uses. If you don’t understand what some operation does, write the question down, then go find the answer.

3. Find the main loop

Depending on what you want to do with the code, find the entry point or main loop that actually runs it. If a bash script calls some binary, go find the main() for that binary. Walk through main() until you reach the part of the code relevant to your goal, and concentrate only on that part.
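As a rough illustration of what you are looking for, here is a minimal sketch of the dispatch logic that main() often boils down to in a command-line tool. Every name in it (run_train, run_eval, dispatch) is hypothetical and only stands in for whatever the real codebase uses.

```cpp
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <vector>

using Args = std::vector<std::string>;

// Hypothetical subcommand handlers. In a real codebase each one would lead
// into a different subsystem; here they only report that they were called.
int run_train(const Args& args) {
    std::cout << "train called with " << args.size() << " argument(s)\n";
    return 0;
}

int run_eval(const Args& args) {
    std::cout << "eval called with " << args.size() << " argument(s)\n";
    return 0;
}

// Picks a handler based on the first argument and passes the rest along.
// Returns the handler's exit code, or 2 if the subcommand is missing or unknown.
int dispatch(const Args& args) {
    static const std::map<std::string, std::function<int(const Args&)>> handlers = {
        {"train", run_train},
        {"eval", run_eval},
    };
    if (args.empty()) {
        return 2;
    }
    const auto it = handlers.find(args[0]);
    if (it == handlers.end()) {
        return 2;
    }
    const Args rest(args.begin() + 1, args.end());
    return it->second(rest);
}
```

A real main(int argc, char** argv) would typically forward its arguments with something like `return dispatch(Args(argv + 1, argv + argc));`. If your goal only concerns evaluation, follow run_eval and ignore the other branch.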

4. Doxygen

Doxygen is extremely useful when approaching an unfamiliar codebase. Set up some reasonable defaults in the Doxyfile by changing the following settings:

  • HAVE_DOT = YES (requires the Graphviz dot tool to be installed)
  • CLASS_GRAPH = YES
  • COLLABORATION_GRAPH = YES
  • GROUP_GRAPHS = YES
  • TEMPLATE_RELATIONS = YES
  • INCLUDE_GRAPH = YES
  • DISTRIBUTE_GROUP_DOC = YES
  • EXTRACT_ALL = YES
  • SOURCE_BROWSER = YES
  • INLINE_SOURCES = YES
  • GENERATE_LATEX = NO
    • (Set this to YES only if you also want LaTeX output alongside the HTML.)

Run doxygen to generate the HTML, and open it in a browser. You can then browse the class structures and graphs while reading the code, which makes it easier to understand.
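To get a feel for what the graph settings above produce, consider this small sketch. The classes (Shape, Rectangle, Canvas) are made up for illustration. With the settings above, I would expect the class graph to show an inheritance arrow from Rectangle to Shape, and the collaboration graph to show Canvas using Rectangle through its member.

```cpp
/// @brief Base interface for anything that has an area.
/// Hypothetical example class, not taken from any real codebase.
class Shape {
public:
    virtual ~Shape() = default;

    /// @brief Area in square units.
    virtual double area() const = 0;
};

/// @brief Axis-aligned rectangle.
/// Derives from Shape, which is the relation the class graph draws.
class Rectangle : public Shape {
public:
    Rectangle(double width, double height) : width_(width), height_(height) {}

    double area() const override { return width_ * height_; }

private:
    double width_;
    double height_;
};

/// @brief Holds a Rectangle as a data member.
/// The member is the relation the collaboration graph draws.
class Canvas {
public:
    explicit Canvas(const Rectangle& frame) : frame_(frame) {}

    /// @brief Area of the frame, delegated to the Rectangle member.
    double frame_area() const { return frame_.area(); }

private:
    Rectangle frame_;  ///< Member that links Canvas to Rectangle.
};
```

Because EXTRACT_ALL is set to YES, classes and members without any doc comments should still appear in the output, which matters for codebases that were never written with Doxygen in mind.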

5. Changing the code

Attempt to change a part of the code to do what you want. Mirror the style and conventions of the existing code. Repeat steps 2 through 5 as often as needed.
