Progress update on the documentation generator
I’ve made a lot of progress on the code documentation tool.
I took a lot of inspiration (stole) from Eskil Steenberg’s website gamepipeline.org.
On “stealing” things
On this subject, whenever I see something cool from another developer online, especially someone with more skill and experience, I try to analyze it and then integrate it into my own code.
If one day I manage to release something, I will of course integrate the proper licensing and mention when some bits of code are literally other people’s code. I also never fail to mention who I got the ideas from (like in this article), because I think it’s important.
I try to not just copy and paste pieces of code, but rewrite the whole thing letter by letter until I understand it fully (but even doing that, we all know where the idea comes from…).
Those developers are benevolent souls, as they post their code online for people like me to use.
I am not comfortable enough to share whole chunks of my code.
Maybe someday I will?
The latest theft is from Sean Barrett’s stb, for my vector class.
I tried to rethink it but couldn’t find a better way to write it, so most of it works the same way. I just added things like insertion, removal, and a fixed version that doesn’t need memory allocations.
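To illustrate the idea, here is a minimal sketch of an stb-style growable array with insertion and removal added. It is neither stb's code nor mine: the `vec_*` names are invented for this example, the fixed (allocation-free) variant is left out, and the macros evaluate their arguments more than once.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Bookkeeping stored just before the element data, so the user
 * only ever holds a plain typed pointer (the stb trick). */
typedef struct {
    size_t count;    /* elements in use */
    size_t capacity; /* elements allocated */
} VecHeader;

#define vec_header(v)   ((VecHeader *)(v) - 1)
#define vec_count(v)    ((v) ? vec_header(v)->count : 0)
#define vec_capacity(v) ((v) ? vec_header(v)->capacity : 0)

/* Makes room for at least `needed` elements; aborts on allocation failure. */
static void *vec_grow(void *v, size_t needed, size_t elem_size)
{
    assert(elem_size > 0);
    size_t capacity = vec_capacity(v);
    if (needed <= capacity)
        return v;
    size_t new_capacity = capacity ? capacity * 2 : 8;
    if (new_capacity < needed)
        new_capacity = needed;
    VecHeader *header = realloc(v ? vec_header(v) : NULL,
                                sizeof(VecHeader) + new_capacity * elem_size);
    if (header == NULL)
        abort();
    if (v == NULL)
        header->count = 0; /* fresh array */
    header->capacity = new_capacity;
    return header + 1;
}

/* Appends `value` at the end. */
#define vec_push(v, value) \
    ((v) = vec_grow((v), vec_count(v) + 1, sizeof *(v)), \
     (v)[vec_header(v)->count++] = (value))

/* Inserts `value` at `index` (0 <= index <= count), shifting the tail up. */
#define vec_insert(v, index, value) \
    ((v) = vec_grow((v), vec_count(v) + 1, sizeof *(v)), \
     memmove((v) + (index) + 1, (v) + (index), \
             (vec_header(v)->count - (index)) * sizeof *(v)), \
     vec_header(v)->count++, \
     (v)[index] = (value))

/* Removes the element at `index` (0 <= index < count), shifting the tail down. */
#define vec_remove(v, index) \
    (memmove((v) + (index), (v) + (index) + 1, \
             (vec_header(v)->count - (index) - 1) * sizeof *(v)), \
     vec_header(v)->count--)

#define vec_free(v) \
    do { if (v) free(vec_header(v)); (v) = NULL; } while (0)
```

Usage starts from a null pointer: `int *v = NULL; vec_push(v, 10); vec_insert(v, 0, 5);` and the elements are then read with plain `v[i]` indexing.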
I finally finished Eskil’s 2-hour video on how he programs C, and while I learned a lot of tips here and there (I’ll probably rewatch it several times because it’s really a treasure trove), I took a special interest in his memory debugging functions. I went through that code and decided to adopt the method because it was just way better than what I had in place before.
I had a C++-heavy MemoryTracker that, when enabled, would save the entire callstack for each allocation / free call.
While it was very precise, I would disable it most of the time because it was just too slow. I still had allocation tracking and leak detection, but without callstacks it wasn’t very useful, and it was a pain to switch callstacks on and off.
So I replicated Eskil’s memory tracking functions, and they’re really neat.
Over-allocating and putting a magic number past the end of the block to detect overshoots, that’s just great! Especially since I had a bug earlier in the string resize code that overshot in exactly that way.
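A minimal sketch of the guard-word technique, under my own assumptions: this is not Eskil's code, all the `debug_mem_*` names are invented for the example, and a fixed-size table stands in for proper bookkeeping. Each allocation gets a few extra bytes holding a magic number, and the call site is recorded so an overshoot can be reported with file and line.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEBUG_MEM_MAX_ALLOCS 1024

static const uint32_t debug_mem_magic = 0xDEADBEEFu;

typedef struct {
    unsigned char *ptr;  /* start of the user block */
    size_t size;         /* bytes the caller asked for */
    const char *file;    /* call site of the allocation */
    int line;
} DebugAlloc;

static DebugAlloc debug_allocs[DEBUG_MEM_MAX_ALLOCS];
static size_t debug_alloc_count = 0;

/* Over-allocates by one guard word and writes the magic number after the block. */
static void *debug_mem_malloc(size_t size, const char *file, int line)
{
    assert(debug_alloc_count < DEBUG_MEM_MAX_ALLOCS);
    unsigned char *ptr = malloc(size + sizeof debug_mem_magic);
    if (ptr == NULL)
        return NULL;
    memcpy(ptr + size, &debug_mem_magic, sizeof debug_mem_magic);
    debug_allocs[debug_alloc_count++] = (DebugAlloc){ ptr, size, file, line };
    return ptr;
}

static int debug_mem_guard_intact(const DebugAlloc *a)
{
    return memcmp(a->ptr + a->size, &debug_mem_magic, sizeof debug_mem_magic) == 0;
}

/* Checks every live block; returns how many have a damaged guard word. */
static size_t debug_mem_check(void)
{
    size_t damaged = 0;
    for (size_t i = 0; i < debug_alloc_count; i++) {
        const DebugAlloc *a = &debug_allocs[i];
        if (!debug_mem_guard_intact(a)) {
            fprintf(stderr, "overshoot on %zu-byte block allocated at %s:%d\n",
                    a->size, a->file, a->line);
            damaged++;
        }
    }
    return damaged;
}

/* Frees a tracked block, reporting an overshoot if the guard was overwritten. */
static void debug_mem_free(void *ptr)
{
    if (ptr == NULL)
        return;
    for (size_t i = 0; i < debug_alloc_count; i++) {
        if (debug_allocs[i].ptr == ptr) {
            if (!debug_mem_guard_intact(&debug_allocs[i]))
                fprintf(stderr, "overshoot detected on free (%s:%d)\n",
                        debug_allocs[i].file, debug_allocs[i].line);
            debug_allocs[i] = debug_allocs[--debug_alloc_count];
            free(ptr);
            return;
        }
    }
    fprintf(stderr, "free of untracked pointer %p\n", ptr);
}

#define DEBUG_MALLOC(size) debug_mem_malloc((size), __FILE__, __LINE__)
```

Whatever is left in the table at exit is a leak, with the file and line of the allocation already at hand, which is a lot cheaper than saving a callstack per allocation.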
As mentioned in earlier articles, I was also inspired by sds and µnit.
Although for these, I didn’t even look at the implementation in detail, just saw the API and thought “huh that looks cool, let me try and code something like that.”
The doc generator
So, back to the documentation. I like Eskil’s doc website a lot: it’s simple, clean-looking and easy to navigate. Naturally I’m “borrowing” the structure for my own code documentation.
I had a very quick look at Eskil’s DocGen code, and while at first glance it seems simple and elegant, I decided to not study it, but try to code it myself to see where I would end up (and also by now you know I don’t just copy and paste and tweak things here and there, I just start from scratch!).
My new tool goes something like this:
- read the C code and store it into a vector of C data elements
- go through these elements and transform them into documentation elements
- go through those doc elements and generate html pages
So I had to make a set of separate tools, for C parsing, HTML/CSS generating and then doc functions that would combine everything.
“Why all those steps?” you say, “Surely you’re making it slower than it needs to be”.
Well yes, but I have other goals than just documentation. Maybe I don’t want just HTML/CSS files but LaTeX too, or other formats. I will definitely need C parsing later on when I want to generate code based on pseudo-C files.
I also added a CMake macro parser, but that was child’s play compared to C parsing.
Speaking of that, I’m sacrificing some performance. I could surely improve the algorithm, but for now it works.
The parser loads the file into memory, then reads that memory block 3 times: the first pass saves all the comments and removes them, the second saves all the preprocessor directives and removes them, and the third finally parses the remaining C code.
After that I have a merge step that assigns comments to preprocessor and code elements, based on line numbers. That seems overkill but that REALLY simplifies the parsing steps, not having to worry about comments.
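A rough sketch of the first pass and of the merge lookup, assuming comments are blanked out with spaces rather than deleted so that line numbers stay valid for the merge step. The names (`CommentSpan`, `strip_comments`, `find_comment_for_line`) are made up for this example and the real parser may well work differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int first_line;  /* 1-based line where the comment starts */
    int last_line;   /* 1-based line where the comment ends */
    size_t start;    /* byte offset of the comment in the source */
    size_t length;   /* comment length in bytes, delimiters included */
} CommentSpan;

/* Records every comment found in `src` and writes into `code_out` a copy of
 * the source where comments are replaced by spaces (newlines are kept).
 * `code_out` must hold at least strlen(src) + 1 bytes.
 * Returns the number of comments stored in `out`. */
static size_t strip_comments(const char *src, char *code_out,
                             CommentSpan *out, size_t max_out)
{
    assert(src != NULL && code_out != NULL && out != NULL);
    size_t count = 0, i = 0;
    int line = 1;
    strcpy(code_out, src);
    while (src[i] != '\0') {
        char c = src[i];
        if (c == '"' || c == '\'') {
            /* Skip literals so that "//" inside a string is not a comment. */
            char quote = c;
            i++;
            while (src[i] != '\0' && src[i] != quote) {
                if (src[i] == '\\' && src[i + 1] != '\0')
                    i++;
                if (src[i] == '\n')
                    line++;
                i++;
            }
            if (src[i] == quote)
                i++;
        } else if (c == '/' && (src[i + 1] == '/' || src[i + 1] == '*')) {
            int block = (src[i + 1] == '*');
            size_t start = i;
            int first_line = line;
            i += 2;
            while (src[i] != '\0') {
                if (block && src[i] == '*' && src[i + 1] == '/') { i += 2; break; }
                if (!block && src[i] == '\n') break;
                if (src[i] == '\n') line++;
                i++;
            }
            for (size_t k = start; k < i; k++)
                if (code_out[k] != '\n')
                    code_out[k] = ' ';
            if (count < max_out)
                out[count++] = (CommentSpan){ first_line, line, start, i - start };
        } else {
            if (c == '\n')
                line++;
            i++;
        }
    }
    return count;
}

/* Merge step: a comment documents an element if it ends on the line just
 * above it or sits on the same line. Returns the comment index or -1. */
static int find_comment_for_line(const CommentSpan *comments, size_t count,
                                 int element_line)
{
    for (size_t i = 0; i < count; i++) {
        if (comments[i].last_line + 1 == element_line ||
            comments[i].first_line == element_line)
            return (int)i;
    }
    return -1;
}
```

The same blank-and-record approach would apply to the preprocessor pass, leaving the last pass with nothing but C code at the original line positions.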
As for the documentation itself, it has a simple structure:
- a doc project has a name and a description, and modules
- a module has a name, description, and sections
- a section has a name, a description and a list of elements
- elements have a name, content, and description, and can be C define, C struct, cmake macro, C code snippet, plain text, etc
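The hierarchy above could be laid out roughly like this in C. This is a sketch under assumptions: the type and field names are invented for the example and do not come from the actual util library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element kinds, following the list above. */
typedef enum {
    DOC_ELEMENT_C_DEFINE,
    DOC_ELEMENT_C_STRUCT,
    DOC_ELEMENT_CMAKE_MACRO,
    DOC_ELEMENT_C_CODE,
    DOC_ELEMENT_TEXT
} DocElementKind;

typedef struct {
    DocElementKind kind;
    const char *name;
    const char *content;      /* e.g. the define or struct source text */
    const char *description;  /* usually taken from the attached comment */
} DocElement;

typedef struct {
    const char *name;
    const char *description;
    DocElement *elements;
    size_t element_count;
} DocSection;

typedef struct {
    const char *name;
    const char *description;
    DocSection *sections;
    size_t section_count;
} DocModule;

typedef struct {
    const char *name;
    const char *description;
    DocModule *modules;
    size_t module_count;
} DocProject;

/* Walks the whole tree and counts the documented elements. */
static size_t doc_project_element_count(const DocProject *project)
{
    assert(project != NULL);
    size_t total = 0;
    for (size_t m = 0; m < project->module_count; m++) {
        const DocModule *module = &project->modules[m];
        for (size_t s = 0; s < module->section_count; s++)
            total += module->sections[s].element_count;
    }
    return total;
}
```

Keeping the tree free of any HTML knowledge is what would let another backend, LaTeX for instance, walk the same structure.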
By default a module corresponds to a library in the project, and in the case of a C module, each header file creates a new section.
The module description can be written in a header using the @module tag in a comment, and sections work the same way with @section. Sections are flexible: a section can have elements in several files, and one file can have several sections.
That way almost all of the documentation can live in the code itself.
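As an illustration, here is one way a tag could be pulled out of a saved comment. The syntax is an assumption on my part (the tag, then the name on the rest of the line), and `doc_tag_value` is a made-up helper, not a function from the tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Looks for `tag` (e.g. "@section") in `comment` and copies the rest of that
 * line, trimmed, into `out`. Returns 1 if the tag was found, 0 otherwise. */
static int doc_tag_value(const char *comment, const char *tag,
                         char *out, size_t out_size)
{
    size_t tag_length = strlen(tag);
    assert(tag_length > 0 && out_size > 0);

    /* Find the tag as a whole word, so "@section" does not match "@sections". */
    const char *p = comment;
    while ((p = strstr(p, tag)) != NULL) {
        char next = p[tag_length];
        if (next == ' ' || next == '\t' || next == '\n' ||
            next == '\r' || next == '\0')
            break;
        p += tag_length;
    }
    if (p == NULL)
        return 0;

    /* Skip the tag and the blanks after it. */
    p += tag_length;
    while (*p == ' ' || *p == '\t')
        p++;

    /* Take everything up to the end of the line, minus trailing blanks. */
    size_t length = strcspn(p, "\r\n");
    while (length > 0 && (p[length - 1] == ' ' || p[length - 1] == '\t'))
        length--;
    if (length >= out_size)
        length = out_size - 1;
    memcpy(out, p, length);
    out[length] = '\0';
    return 1;
}
```

With a header comment containing `@module core` and `@section memory tracking`, the two calls would yield "core" and "memory tracking", and the remaining comment text could serve as the description.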
I’ve made a small executable that just defines the project name and the description, then assigns modules, so that code is fairly simple:
    const char *project_name = "Another Memory Ends";
    const char *project_path = argv[1]; /* assumed: project root passed as the first argument */
    const char *project_description = "Reference for the ame code.\nContains libraries and cmake macros.\n\nWas generated by code_doc_generator.";

    c_output("Starting code_doc_generator for project %s", project_name);
    DocP *project = util_doc_create(project_name, project_description);
    bool res = true;
    CORE_PATH_MK(path);

    /* the cmake macros, then one module per library */
    res &= util_doc_add_cmake_module(project, "cmake", project_path);
    c_path_format(path, "%s/%s", project_path, "src/lib/ftest");
    res &= util_doc_add_c_module(project, "ftest", path);
    c_path_format(path, "%s/%s", project_path, "src/lib/math_ame");
    res &= util_doc_add_c_module(project, "math_ame", path);
    c_path_format(path, "%s/%s", project_path, "src/lib/core");
    res &= util_doc_add_c_module(project, "core", path); /* was "res =", which discarded earlier failures */
    c_path_format(path, "%s/%s", project_path, "src/lib/util");
    res &= util_doc_add_c_module(project, "util", path);

    /* a hand-written module for documentation that is not tied to a library */
    res &= util_doc_add_module(project, "documentation", "project documentation");
    res &= util_doc_add_section(project, "documentation", "code writing", "information about ame code writing");
    c_path_format(path, "%s/%s", project_path, "doc/coding_style.h");
    cstr style = c_file_text_read(path);
    res &= util_doc_add_c_code_element(project, "documentation", "code writing", "coding style", "", style);
    c_cstr_free(style);

    /* output_dir is defined outside this excerpt */
    res &= util_doc_generate(project, output_dir);
    util_doc_delete(project);
That way I don’t need documentation config files, which I find annoying.
The HTML tools are quite precise, generating perfectly indented tags and content. If one day I need to write web related stuff I’ll have solid tools.
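To give an idea of what generating indented tags can look like, here is a small sketch of a writer that tracks nesting depth. It is an assumption-laden illustration: `HtmlWriter` and the `html_*` functions are invented names, the output goes to a fixed buffer, and text is not HTML-escaped.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    char text[1024];  /* generated markup, always NUL-terminated */
    size_t length;    /* bytes used in `text` */
    int depth;        /* current nesting level */
} HtmlWriter;

static void html_writer_init(HtmlWriter *w)
{
    memset(w, 0, sizeof *w);
}

/* Writes one line, indented two spaces per nesting level. Output that does
 * not fit in the buffer is silently truncated. */
static void html_line(HtmlWriter *w, const char *format, ...)
{
    va_list args;
    for (int i = 0; i < w->depth * 2 && w->length + 1 < sizeof w->text; i++)
        w->text[w->length++] = ' ';
    va_start(args, format);
    int written = vsnprintf(w->text + w->length, sizeof w->text - w->length,
                            format, args);
    va_end(args);
    if (written > 0) {
        size_t room = sizeof w->text - w->length - 1;
        w->length += ((size_t)written < room) ? (size_t)written : room;
    }
    if (w->length + 1 < sizeof w->text)
        w->text[w->length++] = '\n';
    w->text[w->length] = '\0';
}

/* Opens a tag, optionally with a CSS class, and indents what follows. */
static void html_open(HtmlWriter *w, const char *tag, const char *css_class)
{
    if (css_class != NULL)
        html_line(w, "<%s class=\"%s\">", tag, css_class);
    else
        html_line(w, "<%s>", tag);
    w->depth++;
}

/* Closes a tag at the indentation level of its opening line. */
static void html_close(HtmlWriter *w, const char *tag)
{
    assert(w->depth > 0);
    w->depth--;
    html_line(w, "</%s>", tag);
}

/* Writes a line of text content (not escaped in this sketch). */
static void html_text(HtmlWriter *w, const char *text)
{
    html_line(w, "%s", text);
}
```

Opening a `section` with a class, then an `h2` with some text, and closing both in reverse order leaves each closing tag aligned with its opening tag.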
The CSS for now is pretty much ad hoc: I just hardcode it in the doc functions when calling the HTML generation. I also have a whole CSS style file that is embedded in the util library and written out every time the tool is used to generate the documentation.
[Image: example of adding a <section> tag with custom CSS]
I might add specific CSS tools later on if I have a use for them, but contrary to HTML, where there aren’t endless combinations to make it work, CSS is MASSIVE.
Anyway a look at the work in progress:
The code structure is nearly done now, still a few things to implement like writing tables in html, and then the fun part… tune up the CSS, the fonts, and the color palette.
My goal is to finish the tool, polish it, and then write all the doc for the current code I have, before continuing to convert my C++ code.
In terms of speed, with a release build, I think it takes under 2 seconds to read through the code (some headers are massive, at 900 and 1100 lines) and generate the HTML. I’m sure it can be faster. When I get to re-implementing the thread system I might try to add it in there.
But aren’t you just remaking Doxygen?
Yes, it seems I am doing just that. But I am having tremendous fun doing it and getting much, much better at programming in C.
Again, a HUGE thank you to Eskil Steenberg, who has really helped reignite my love of programming.