Overview

Purpose and Scope

This document introduces github-wiki-to-html, a static site generator that converts GitHub Wiki repositories into standalone HTML websites. It covers the system's purpose, high-level architecture, core components, and key capabilities.

For detailed setup instructions, see Getting Started. For in-depth architectural documentation, see Architecture. For deployment procedures, see Deployment.

Sources: README.md:1-6


System Purpose

The github-wiki-to-html system converts GitHub Wiki content (Markdown files tracked in Git) into a static HTML website suitable for hosting on GitHub Pages or other static hosting platforms. The system preserves wiki content structure while adding metadata from Git commit history, including publication dates, modification dates, and author attribution.

Key Design Goals

| Goal | Implementation |
| --- | --- |
| Preserve Git history | Extract publication and modification dates from commit history with rename tracking |
| Support GitHub Flavored Markdown | Use commonmarker with GFM extensions (tables, strikethrough, task lists) |
| Generate SEO-friendly output | Include Open Graph, X Card, and Schema.org metadata in templates |
| Maintain clean URLs | Strip .md extensions from internal links for prettier URLs |
| Enable mathematical notation | Detect LaTeX expressions and conditionally include MathJax |
| Ensure reproducible builds | Docker containerization with locked dependencies |
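The "Enable mathematical notation" goal implies some check for LaTeX in each page. This document does not show how that check is done, so the following is only a minimal sketch. The helper name `contains_math?` and the set of delimiters it recognizes are assumptions.

```ruby
# Hypothetical helper: returns true when the Markdown source appears to
# contain LaTeX math, so a template could include MathJax only when needed.
# The delimiters checked here are an assumption, not the project's actual rule.
MATH_PATTERNS = [
  /\$\$.+?\$\$/m,  # display math: $$ ... $$
  /\\\(.+?\\\)/m,  # inline math: \( ... \)
  /\\\[.+?\\\]/m   # display math: \[ ... \]
].freeze

def contains_math?(markdown)
  MATH_PATTERNS.any? { |pattern| markdown.match?(pattern) }
end

puts contains_math?('Euler: $$e^{i\pi} + 1 = 0$$') # display math present
puts contains_math?('Costs $5 and $10, no math.')  # single dollar signs are ignored
```

Single `$...$` delimiters are deliberately left out of this sketch because they collide with ordinary currency amounts.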

Sources: constants.rb:1-71 gollum-config.rb github-wiki-to-html.rb:1-143


High-Level Architecture

The system follows an ETL (Extract, Transform, Load) pattern, processing wiki content through a pipeline of Ruby and Node.js tools:

[Diagram: System Data Flow]

Sources: github-wiki-to-html.rb:1-143 constants.rb:10-71 methods.rb


Core Components

The system consists of six primary code components that work together to convert wiki content:

[Diagram: Component Architecture]

Component Responsibilities

| Component | File | Key Responsibilities |
| --- | --- | --- |
| Main Orchestrator | github-wiki-to-html.rb | Page iteration (lines 36-117), metadata extraction (lines 56-87), file generation (lines 90-106) |
| Configuration Hub | constants.rb | Wiki source paths (lines 10-20), output directory (line 26), site identity (lines 32-47), asset URLs (lines 50-67) |
| Utility Library | methods.rb | ensure_trailing_slash(), postprocess_html(), generate_html_file(), generate_sitemap_file() |
| Wiki Engine Config | gollum-config.rb | GOLLUM_OPTIONS hash, Commonmarker configuration, filter chain setup |
| HTML Template | template.html.liquid | Page layout, metadata tags, conditional rendering, asset inclusion |
| Shell Wrapper | github-wiki-to-html.sh | Script execution, output directory capture, html-beautify invocation |

Sources: github-wiki-to-html.rb:1-143 constants.rb:1-71 methods.rb gollum-config.rb


Processing Pipeline

The conversion process executes in five distinct stages, each transforming data closer to the final HTML output:

[Diagram: Pipeline Stages with Code Mappings]

Sources: github-wiki-to-html.rb:24-143 methods.rb


Key Capabilities

Markdown Processing with GFM Extensions

The system uses commonmarker with GitHub Flavored Markdown extensions enabled in gollum-config.rb:

| Extension | Purpose |
| --- | --- |
| :strikethrough | Support for ~~deleted text~~ |
| :table | Pipe-delimited table syntax |
| :tasklist | Interactive checkboxes - [ ] and - [x] |
| :footnotes | Footnote references and definitions |

The unsafe: true option in gollum-config.rb allows raw HTML elements such as <details> and <summary>, which makes collapsible sections possible.
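For orientation, the extensions and the unsafe setting described here map onto a Commonmarker options hash roughly as follows. This is a sketch only. `COMMONMARKER_OPTIONS` is a hypothetical name, and the way gollum-config.rb actually passes these settings to the wiki engine is not shown in this document.

```ruby
# Sketch of a Commonmarker options hash matching the extensions listed above.
# COMMONMARKER_OPTIONS is a hypothetical name; the real gollum-config.rb may
# structure or pass these settings differently.
COMMONMARKER_OPTIONS = {
  extension: {
    strikethrough: true, # ~~deleted text~~
    table: true,         # pipe-delimited tables
    tasklist: true,      # - [ ] and - [x]
    footnotes: true      # footnote references and definitions
  },
  render: {
    unsafe: true         # pass raw HTML such as <details> through unchanged
  }
}.freeze

begin
  require 'commonmarker'
rescue LoadError
  # Gem not installed; the options hash above is still valid on its own.
end

markdown = "<details><summary>More</summary>\n\n~~old~~ new\n\n</details>\n"

if defined?(Commonmarker)
  puts Commonmarker.to_html(markdown, options: COMMONMARKER_OPTIONS)
else
  puts COMMONMARKER_OPTIONS.inspect
end
```

Without `unsafe: true`, Commonmarker replaces raw HTML with a placeholder comment, so the `<details>` wrapper would not survive rendering.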

Sources: gollum-config.rb

Git History Integration

The system extracts page metadata (publication date, modification date, and author) from Git commit history with rename tracking.

The follow: true option (github-wiki-to-html.rb:58) ensures that renaming a wiki page does not change its publication date.
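As a rough illustration of how dates could be derived once a rename-aware history is available, the sketch below takes a list of commits for one page and picks the earliest as the publication date and the latest as the modification date. The `Commit` struct, the `page_dates` helper, and the choice of the first commit's author for attribution are all assumptions. The real script obtains its history through the wiki engine, not through this structure.

```ruby
require 'time'

# Hypothetical commit record; a real build would read these from Git history
# (for example `git log --follow -- Page.md`, which follows a file across renames).
Commit = Struct.new(:author_name, :authored_at, :path)

# Derive page dates from a commit list: earliest commit = published,
# latest commit = modified. The order of the input does not matter.
def page_dates(commits)
  raise ArgumentError, 'page has no commits' if commits.empty?

  first = commits.min_by(&:authored_at)
  last  = commits.max_by(&:authored_at)
  {
    published_date_iso: first.authored_at.utc.iso8601,
    modified_date_iso: last.authored_at.utc.iso8601,
    author_name: first.author_name # assumption: credit the original author
  }
end

history = [
  Commit.new('Ada', Time.utc(2024, 5, 2, 9, 30), 'New-Name.md'), # edit after rename
  Commit.new('Ada', Time.utc(2023, 1, 15, 12, 0), 'Old-Name.md') # original creation
]
p page_dates(history)
```

If the history were not followed across the rename, only the `New-Name.md` commit would be visible and the publication date would wrongly jump to 2024.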

Sources: github-wiki-to-html.rb:56-87

Clean URLs

The postprocess_html() function in methods.rb uses Nokogiri to strip .md extensions from internal wiki links, converting /Page-Name.md to /Page-Name for cleaner URLs in the static site.
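The link rewrite can be approximated without any dependencies. The sketch below swaps Nokogiri for a regular expression purely to stay self-contained, so it is not the project's implementation. `strip_md_extensions` is a hypothetical name, and the rule that only relative links are rewritten is an assumption.

```ruby
# Regex-based stand-in for the Nokogiri approach described above (an assumption
# made to keep the sketch dependency-free). strip_md_extensions is a hypothetical name.
# Rewrites href="...Page.md" to href="...Page" for relative links only,
# preserving any #fragment; links with a scheme or a leading // are left alone.
MD_LINK = %r{href="(?!(?:[a-z][a-z0-9+.-]*:|//))([^"#?]+)\.md((?:#[^"]*)?)"}i

def strip_md_extensions(html)
  html.gsub(MD_LINK) do
    %(href="#{Regexp.last_match(1)}#{Regexp.last_match(2)}")
  end
end

puts strip_md_extensions('<a href="/Page-Name.md">Page</a>')
puts strip_md_extensions('<a href="Guide.md#setup">Setup</a>')
puts strip_md_extensions('<a href="https://example.com/raw/file.md">External</a>')
```

A DOM parser such as Nokogiri is the more robust choice for real pages, since a regular expression can be fooled by unusual attribute quoting or by href-like text inside code samples.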

Sources: methods.rb

Template Rendering with Liquid

The system uses Liquid templates with strict error mode (github-wiki-to-html.rb:25) to ensure undefined variables cause build failures rather than silent errors. The template receives a context hash (github-wiki-to-html.rb:90-106) containing:

  • is_home: Boolean flag for conditional rendering
  • main_heading: Page title
  • canonical_url, wiki_page_url: Attribution URLs
  • published_date_iso, modified_date_iso: ISO 8601 timestamps
  • author_name, author_url: Attribution data
  • page_footer_html: Footer content from _Footer.md
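To make the shape of that context concrete, here is a minimal sketch of a builder that produces a hash with the keys listed above. The function name, its arguments, the example URLs, and the rule that a page named "Home" maps to the site root are all assumptions. Liquid itself is left out so the sketch runs with the standard library alone.

```ruby
require 'time'

# Hypothetical builder for the template context. Key names follow the list
# above; argument names and the URL rules are assumptions for illustration.
# String keys are used because Liquid looks variables up by string.
def build_template_context(title:, slug:, site_url:, wiki_url:, published:, modified:,
                           author_name:, author_url:, footer_html:)
  base = site_url.end_with?('/') ? site_url : "#{site_url}/"
  is_home = slug == 'Home' # assumption: the wiki home page becomes the site root
  {
    'is_home' => is_home,
    'main_heading' => title,
    'canonical_url' => is_home ? base : "#{base}#{slug}",
    'wiki_page_url' => "#{wiki_url}/#{slug}",
    'published_date_iso' => published.utc.iso8601,
    'modified_date_iso' => modified.utc.iso8601,
    'author_name' => author_name,
    'author_url' => author_url,
    'page_footer_html' => footer_html
  }
end

context = build_template_context(
  title: 'Getting Started', slug: 'Getting-Started',
  site_url: 'https://example.com', wiki_url: 'https://github.com/example/repo/wiki',
  published: Time.utc(2023, 1, 15), modified: Time.utc(2024, 5, 2),
  author_name: 'Ada', author_url: 'https://example.com/ada',
  footer_html: '<p>Footer</p>'
)
context.each { |key, value| puts "#{key}: #{value}" }
```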

Sources: github-wiki-to-html.rb:24-106 template.html.liquid

Sitemap Generation

The generate_sitemap_file() function, defined in methods.rb and called at github-wiki-to-html.rb:142, creates an XML sitemap with <url> elements containing:

  • <loc>: Canonical URL
  • <lastmod>: ISO 8601 modified date

Pages are sorted by modification date, newest first (github-wiki-to-html.rb:136-139).
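The sorting and the element structure can be sketched as follows. The signature of the real generate_sitemap_file() is not shown in this document, so `build_sitemap`, `xml_escape`, and the page hash keys are hypothetical names chosen for illustration.

```ruby
require 'time'

# Hypothetical sitemap builder; the real generate_sitemap_file() signature is
# not shown in this document, so names and arguments here are assumptions.
XML_ESCAPES = {
  '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;', "'" => '&apos;'
}.freeze

def xml_escape(text)
  text.gsub(/[&<>"']/, XML_ESCAPES)
end

# pages: array of hashes with :canonical_url (String) and :modified (Time).
def build_sitemap(pages)
  newest_first = pages.sort_by { |page| page[:modified] }.reverse
  entries = newest_first.map do |page|
    "  <url>\n" \
    "    <loc>#{xml_escape(page[:canonical_url])}</loc>\n" \
    "    <lastmod>#{page[:modified].utc.iso8601}</lastmod>\n" \
    "  </url>\n"
  end
  %(<?xml version="1.0" encoding="UTF-8"?>\n) +
    %(<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n) +
    entries.join +
    "</urlset>\n"
end

pages = [
  { canonical_url: 'https://example.com/Old-Page', modified: Time.utc(2023, 1, 1) },
  { canonical_url: 'https://example.com/New-Page', modified: Time.utc(2024, 6, 1) }
]
puts build_sitemap(pages)
```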

Sources: github-wiki-to-html.rb:136-142 methods.rb

HTML Beautification

The shell wrapper github-wiki-to-html.sh pipes generated HTML through npx html-beautify with configuration from .jsbeautifyrc, ensuring consistent indentation and formatting. The 404.html file is excluded from beautification.

Sources: github-wiki-to-html.sh .jsbeautifyrc


Configuration Points

The system exposes several configuration points for customization:

| Configuration File | Purpose | Key Constants |
| --- | --- | --- |
| constants.rb | Central configuration | WIKI_REPO, OUTPUT_DIRECTORY, SITE_NAME, SITE_URL, PUBLISHER_NAME, STYLESHEET_URL |
| gollum-config.rb | Wiki engine behavior | GOLLUM_OPTIONS, Commonmarker extensions, filter chain |
| .jsbeautifyrc | HTML formatting | indent_size, end_with_newline, HTML-specific options |
| .gitmodules | Git submodules | Wiki source and output repositories |

For complete configuration documentation, see Configuration Reference.

Sources: constants.rb1-71 gollum-config.rb .jsbeautifyrc .gitmodules


Technology Stack

The system integrates Ruby and Node.js ecosystems:

Runtime Dependencies

| Technology | Version | Purpose |
| --- | --- | --- |
| Ruby | 3.4 | Core scripting language |
| Node.js | Latest | HTML beautification runtime |
| gollum-lib | 6.1.0 | Git-backed wiki engine |
| commonmarker | 2.5.0 | Markdown parsing with native extensions |
| liquid | 5.11.0 | Template rendering |
| nokogiri | 1.18.10 | HTML/XML manipulation |
| js-beautify | 1.15.4 | HTML formatting |

For comprehensive dependency documentation, see Dependencies.

Sources: Gemfile Gemfile.lock package.json package-lock.json Dockerfile


Deployment Model

The system uses Git submodules to manage both the source (the wiki repository) and the destination (the output repository).

The wiki submodule uses HTTPS for read-only access, while the output submodule uses SSH for authenticated push access. For deployment procedures, see Deployment.

Sources: .gitmodules constants.rb:10-26


Build Process

The build process is orchestrated by a shell wrapper that executes the Ruby script and post-processes the output:

  1. Execute conversion: Run github-wiki-to-html.rb to generate HTML files
  2. Capture output directory: Extract OUTPUT_DIRECTORY from Ruby script
  3. Discover files: Use find to locate all .html files except 404.html
  4. Beautify HTML: Pipe each file through npx html-beautify
  5. Write formatted output: Overwrite original files with beautified versions
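The file discovery in step 3 can be expressed in Ruby for comparison. The actual wrapper is a shell script that uses find, so this is a translation for illustration only. `html_files_to_beautify` is a hypothetical name, and the beautification step itself is omitted.

```ruby
require 'tmpdir'
require 'fileutils'

# Ruby rendering of the discovery step (the actual wrapper uses `find` in shell).
# html_files_to_beautify is a hypothetical name.
# Returns every .html file under output_dir, at any depth, except 404.html.
def html_files_to_beautify(output_dir)
  Dir.glob(File.join(output_dir, '**', '*.html'))
     .reject { |path| File.basename(path) == '404.html' }
     .sort
end

# Demonstration against a throwaway directory tree.
Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'docs'))
  %w[index.html 404.html docs/Setup.html notes.txt].each do |name|
    File.write(File.join(dir, name), '')
  end
  puts html_files_to_beautify(dir).map { |path| path.delete_prefix("#{dir}/") }
end
```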

For detailed build system documentation, see Build System.

Sources: github-wiki-to-html.sh Dockerfile