The wikinder-rebase-2025-12-10 repository is a specialized Git history maintenance tool designed to clean up and optimize the commit history of the Wikinder wiki repository (https://github.com/wikinder/wikinder.wiki.git). The system executes a three-stage pipeline that transforms messy, verbose commit history into a consolidated, maintainable format by rewriting commit messages, squashing related commits, and removing empty commits while preserving chronological order.
This document describes the overall system architecture, processing workflow, and repository structure. For detailed information about the wiki content being processed, see The Wikinder Wiki. For technical details about individual processing scripts, see Processing Scripts. For usage instructions, see Usage Guide.
The system addresses the following Git history maintenance problems through a three-stage transformation pipeline:
| Problem | Solution Stage | Script/Tool | Input | Output |
|---|---|---|---|---|
| Verbose "Revert X...Y on Title" messages with unstable hash references | Stage 1 | `01-revert-to-updated.pl` | `$FIRST_LOG` | `$FIRST_REBASE_TODO` with `exec git commit --amend` commands |
| Multiple sequential edits to same file by same author on same day (JST) | Stage 2 | `02-squash.pl` | `$SECOND_LOG` | `$SECOND_REBASE_TODO` with `pick`/`fixup` sequences |
| Empty commits after squashing, mismatched committer metadata | Stage 3 | `git-filter-repo` in `00.sh:101-105` | `$THIRD_LOG` | `$FOURTH_LOG` with pruned history |
Transformation metrics:
The final result is a clean, linear history where each file has at most one consolidated commit per author per calendar day (JST timezone), with standardized "Updated Title (markdown)" messages and proper author/committer metadata alignment.
Sources: 00.sh:1-16, 00.sh:32-37, README.md:1-4
Diagram: Core System Components and Data Flow
Sources: 00.sh:1-124, .gitignore:1-2
The system executes the following sequential stages, with user confirmation checkpoints between each stage:
Diagram: Checkpoint-Driven Execution Flow
Sources: 00.sh:66-116
The first stage normalizes commit messages by converting verbose "Revert" messages into standardized "Updated" messages.
Input format (from `01-git-log.txt`):
pick e1372d3 # "2025-03-10T10:49:49+09:00", "yuuki", "Revert 3135fb48...fb0a8ca7 on yuuki"
pick a6ccc7a # "2025-03-10T13:20:54+09:00", "yuuki", "Updated yuuki (markdown)"
pick 9adb70e # "2025-03-10T13:21:08+09:00", "yuuki", "Revert 8a0c6523...8062ba60 on bear"
Output format (to `01-git-rebase-todo.txt`):
pick e1372d3 # "2025-03-10T10:49:49+09:00", "yuuki", "Revert 3135fb48...fb0a8ca7 on yuuki"
exec GIT_COMMITTER_DATE='2025-03-10T10:49:49+09:00' git commit --amend --date='2025-03-10T10:49:49+09:00' --message='Updated yuuki (markdown)'
pick a6ccc7a # "2025-03-10T13:20:54+09:00", "yuuki", "Updated yuuki (markdown)"
pick 9adb70e # "2025-03-10T13:21:08+09:00", "yuuki", "Revert 8a0c6523...8062ba60 on bear"
exec GIT_COMMITTER_DATE='2025-03-10T13:21:08+09:00' git commit --amend --date='2025-03-10T13:21:08+09:00' --message='Updated bear (markdown)'
The script inserts exec commands after pick commands with "Revert" messages to amend them with standardized messages while preserving the original timestamp.
Sources: 01-revert-to-updated.pl:1-34, 01-git-rebase-todo.txt:40-44
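The Stage 1 rewrite can be approximated in a few lines of shell. This is a minimal sketch, not the contents of `01-revert-to-updated.pl`: the function name `revert_to_updated` is hypothetical, the parsing assumes subjects contain no embedded double quotes, and titles containing a single quote would need extra escaping before being used in an `exec` line.

```shell
# Sketch only: an awk approximation of the Stage 1 rewrite (hypothetical helper).
revert_to_updated() {
  awk -v q="'" '
    {
      print                     # every input line is kept unchanged
      n = split($0, f, "\"")    # f[2] = author date, f[6] = subject
      if (n >= 7 && f[6] ~ /^Revert [0-9a-f]+\.\.\.[0-9a-f]+ on /) {
        title = f[6]
        sub(/^Revert [0-9a-f]+\.\.\.[0-9a-f]+ on /, "", title)
        # Append an exec line that amends the message and keeps the original date
        printf "exec GIT_COMMITTER_DATE=%s%s%s git commit --amend --date=%s%s%s --message=%sUpdated %s (markdown)%s\n", q, f[2], q, q, f[2], q, q, title, q
      }
    }
  '
}

# Demo on two lines taken from the input format shown above
revert_to_updated <<'EOF'
pick a6ccc7a # "2025-03-10T13:20:54+09:00", "yuuki", "Updated yuuki (markdown)"
pick 9adb70e # "2025-03-10T13:21:08+09:00", "yuuki", "Revert 8a0c6523...8062ba60 on bear"
EOF
```

Lines whose subject does not start with "Revert" pass through untouched, so the output can be used directly as a rebase todo file.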
The second stage consolidates commits based on three criteria: same file title, same author, same day (JST timezone).
Squashing criteria:
| Criterion | Implementation | Location |
|---|---|---|
| Same file title | Extracted from commit message via regex | 02-squash.pl:45-48 |
| Same day | Date comparison in JST timezone | 00.sh:40, 02-squash.pl:52 |
| Same author | Author comparison (yuuki or bear only) | 02-squash.pl:51 |
Example transformation from `02-git-log.txt` to `02-git-rebase-todo.txt`:
Before (multiple commits to same file on same day):
pick 2c2acd4 # "2025-04-16T10:01:20+09:00", "yuuki", "Updated Mathematics of blocks (markdown)"
pick f4edbe1 # "2025-04-16T10:44:40+09:00", "yuuki", "Updated Mathematics of blocks (markdown)"
pick 1109fe5 # "2025-04-16T11:29:20+09:00", "yuuki", "Updated Mathematics of blocks (markdown)"
After (squashed with preserved final timestamp):
pick 2c2acd4 # "2025-04-16T10:01:20+09:00", "yuuki", "Updated Mathematics of blocks (markdown)"
fixup f4edbe1
fixup 1109fe5
exec GIT_COMMITTER_DATE='2025-04-16T11:29:20+09:00' git commit --amend --date='2025-04-16T11:29:20+09:00' --no-edit
Sources: 02-squash.pl:1-86, 02-git-rebase-todo.txt:72-91
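The grouping rule can be illustrated with a short shell sketch. It is not the logic of `02-squash.pl` itself: the function name `squash_todo` is hypothetical, the title is derived by stripping an assumed `Created`/`Updated` prefix and a trailing `(markdown)`-style suffix, and the author filter (yuuki or bear only) is omitted for brevity. The day is taken from the first ten characters of the date, which works only because the log is already rendered in JST.

```shell
# Sketch only: turn runs of same-title, same-author, same-day picks into fixups.
squash_todo() {
  awk -v q="'" '
    function flush_group() {
      # After a squashed run, restore the timestamp of the last commit in the run
      if (count > 1)
        printf "exec GIT_COMMITTER_DATE=%s%s%s git commit --amend --date=%s%s%s --no-edit\n", q, last_date, q, q, last_date, q
    }
    {
      split($0, f, "\"")               # f[2] = date, f[4] = author, f[6] = subject
      title = f[6]
      sub(/^(Created|Updated) /, "", title)   # assumed subject prefixes
      sub(/ \([a-z]+\)$/, "", title)          # assumed format suffix
      key = title SUBSEP f[4] SUBSEP substr(f[2], 1, 10)
      if (NR > 1 && key == prev_key) {
        print "fixup " $2              # fold into the preceding pick
        count++
      } else {
        flush_group()
        print                          # first commit of a new group stays a pick
        count = 1
      }
      prev_key = key
      last_date = f[2]
    }
    END { flush_group() }
  '
}

# Demo on the three commits from the example above
squash_todo <<'EOF'
pick 2c2acd4 # "2025-04-16T10:01:20+09:00", "yuuki", "Updated Mathematics of blocks (markdown)"
pick f4edbe1 # "2025-04-16T10:44:40+09:00", "yuuki", "Updated Mathematics of blocks (markdown)"
pick 1109fe5 # "2025-04-16T11:29:20+09:00", "yuuki", "Updated Mathematics of blocks (markdown)"
EOF
```

Only adjacent lines are compared, so the sketch squashes sequential edits and leaves interleaved edits to other pages as separate picks.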
The final stage uses git-filter-repo to remove empty commits and standardize metadata.
Operations performed:
commit.committer_name = commit.author_name
commit.committer_email = commit.author_email
commit.committer_date = commit.author_date
This ensures that the committer metadata matches the author metadata for all commits, fixing any discrepancies introduced during rebasing.
The --prune-empty=always flag removes commits that become empty after squashing (e.g., commits that were pure reverts).
Sources: 00.sh:101-105
File inventory with variable names:
| File | Variable Name | Type | Purpose | Git Status |
|---|---|---|---|---|
| `00.sh` | N/A | Bash script | Master orchestrator with `set -xeuo pipefail` | Tracked |
| `01-revert-to-updated.pl` | N/A | Perl script | Stage 1: Parses "Revert" messages, generates exec commands | Tracked |
| `02-squash.pl` | N/A | Perl script | Stage 2: Groups commits by file/author/day, generates fixup sequences | Tracked |
| `README.md` | N/A | Markdown | Project identifier with DeepWiki documentation badge | Tracked |
| `LICENSE` | N/A | Text | MIT License (2025) | Tracked |
| `.gitignore` | N/A | Config | Excludes `production-repo.git/` and `work-repo/` directories | Tracked |
| `production-repo.git/` | N/A | Directory | Bare mirror created by `git clone --mirror` | Ignored |
| `work-repo/` | N/A | Directory | Working repository where `git rebase -i` operations execute | Ignored |
| `../01-git-log.txt` | `$FIRST_LOG` | Log file | Initial history: `git log --reverse` before Stage 1 | Generated |
| `../01-git-rebase-todo.txt` | `$FIRST_REBASE_TODO` | Todo file | Stage 1 rebase instructions with pick/exec commands | Generated |
| `../02-git-log.txt` | `$SECOND_LOG` | Log file | History after Stage 1 message rewriting | Generated |
| `../02-git-rebase-todo.txt` | `$SECOND_REBASE_TODO` | Todo file | Stage 2 rebase instructions with pick/fixup/exec commands | Generated |
| `../03-git-log.txt` | `$THIRD_LOG` | Log file | History after Stage 2 squashing (before pruning) | Generated |
| `../04-git-log.txt` | `$FOURTH_LOG` | Log file | Final history after git-filter-repo pruning | Generated |
Configuration format:
All log files use the format configured by:
git config --local log.date iso-strict-local
git config --local format.pretty 'pick %h # "%ad", "%an", "%s"'
Sources: 00.sh:32-37, 00.sh:62-63, .gitignore:1-2, README.md:1-4
Diagram: Repository Configuration and Backup Strategy
Remote configuration variables:
# Lines 26-27 in 00.sh
PRODUCTION_REMOTE='https://github.com/wikinder/wikinder.wiki.git'
WORK_REMOTE="git@github.com:wikinderbear/wikinder-rebase-$TIMESTAMP.git"
# Lines 29-30 in 00.sh
BACKUP_BRANCH='backup-before-rebase'
BACKUP_TAG="$BACKUP_BRANCH-$TIMESTAMP"
The $PRODUCTION_REMOTE uses HTTPS for anonymous read access, while $WORK_REMOTE uses SSH for authenticated write access. The $TIMESTAMP variable is set via TIMESTAMP=$(date -u +'%Y-%m-%d') (00.sh:24), so each run derives a staging repository name such as wikinder-rebase-2025-12-10 from the current UTC date. The name is therefore unique per day rather than per execution.
Backup strategy execution order:
1. `git branch "$BACKUP_BRANCH" master` creates branch `backup-before-rebase` from `master`.
2. `git tag --annotate "$BACKUP_TAG" "$BACKUP_BRANCH" --message="Backup before rebase ($TIMESTAMP)"` creates an annotated tag.
3. `git push work-remote "$BACKUP_BRANCH" "$BACKUP_TAG"` pushes both to the work remote before any history modification.

This ensures complete rollback capability via `git reset --hard backup-before-rebase` if issues are discovered after rebasing.
Sources: 00.sh:24-59
The system configures git log format to match the git rebase --interactive todo file format:
git config --local log.date iso-strict-local
git config --local format.pretty 'pick %h # "%ad", "%an", "%s"'
This produces output like:
pick fc7162f # "2025-01-01T21:09:52+09:00", "yuuki", "Initial Home page"
pick 350891d # "2025-01-16T17:54:18+09:00", "yuuki", "Created _Footer (markdown)"
The format includes:
- `%h`: abbreviated commit hash
- `%ad`: author date in ISO 8601 format with local timezone offset
- `%an`: author name
- `%s`: commit subject (first line of message)

This format is directly parseable by the Perl scripts and can be used as-is in git rebase -i todo files.
Sources: 00.sh:62-66, 01-git-log.txt:1-10
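Because later stages depend on this exact line shape, a quick format check can catch malformed lines before they reach a rebase. The following is a minimal sketch under assumptions: `check_log_format` is a hypothetical helper that is not part of `00.sh`, and the regular expression encodes only the shape shown above (it also accepts a `Z` suffix in case a UTC offset is rendered that way).

```shell
# Sketch only: report any line that does not look like
#   pick <hash> # "<ISO date>", "<author>", "<subject>"
check_log_format() {
  # grep -v selects lines that do NOT match, so grep succeeding means bad lines exist.
  if grep -Ev '^pick [0-9a-f]+ # "[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(Z|[+-][0-9]{2}:[0-9]{2})", "[^"]+", ".*"$' >&2; then
    return 1
  fi
  return 0
}

# Demo on the two sample lines shown above
check_log_format <<'EOF' && echo "format ok"
pick fc7162f # "2025-01-01T21:09:52+09:00", "yuuki", "Initial Home page"
pick 350891d # "2025-01-16T17:54:18+09:00", "yuuki", "Created _Footer (markdown)"
EOF
```

Offending lines are echoed to standard error, which makes the check usable as a guard in front of either Perl script.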
The system includes automated error recovery for the common case where squashing creates empty commits:
git -c sequence.editor="cp '$SECOND_REBASE_TODO'" \
  rebase -i --root --committer-date-is-author-date \
  || {
    until git commit --amend --allow-empty --no-edit && git rebase --continue; do
      :
    done
  }
When git rebase encounters an empty commit and exits with an error, the shell loop automatically:
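1. Amends the stopped commit with `--allow-empty` to accept the empty state
2. Resumes the rebase with `git rebase --continue`
3. Repeats both steps until the rebase completes

This automation prevents manual intervention during Stage 2, which can have hundreds of squash operations.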
Sources: 00.sh:87-92
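The control flow of that `until` loop is easier to see with the Git commands replaced by stand-ins. In this sketch, `fake_amend`, `fake_continue`, `remaining_empty`, and `attempts` are all hypothetical names; the stub simulates a rebase that stops on two more empty commits before finishing.

```shell
# Sketch only: the retry pattern with stubs in place of the real Git commands.
remaining_empty=2   # hypothetical: empty commits the simulated rebase will still hit
attempts=0

fake_amend() {      # stands in for: git commit --amend --allow-empty --no-edit
  return 0
}

fake_continue() {   # stands in for: git rebase --continue
  attempts=$((attempts + 1))
  if [ "$remaining_empty" -gt 0 ]; then
    remaining_empty=$((remaining_empty - 1))
    return 1        # stopped again on the next empty commit
  fi
  return 0          # rebase finished
}

# The condition does all the work; the loop body is a no-op.
until fake_amend && fake_continue; do
  :
done

echo "finished after $attempts continue attempts"
```

One design consequence worth noting: the loop retries on any failure, so a stop caused by something other than an empty commit (a conflict, for instance) would presumably keep it spinning until interrupted.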
After all processing completes, the system verifies that all commits remain in chronological order:
git log --reverse --format='%at' | sort -nc
This command:
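1. Prints the author timestamp of every commit, oldest first (`%at` = author timestamp in Unix epoch format)
2. Pipes the timestamps to `sort -nc`, which checks if the input is numerically sorted and exits with a non-zero status if it is not

This verification ensures that despite extensive history rewriting, the final timeline remains logically consistent.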
Sources: 00.sh:111
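The behavior of `sort -nc` can be tried without a repository. This sketch feeds it made-up epoch values; `is_chronological` is a hypothetical wrapper, not something defined in `00.sh`.

```shell
# Sketch only: sort -c checks ordering without producing sorted output.
is_chronological() {
  # Exit status 0 means the numeric input is already in non-decreasing order.
  sort -nc 2>/dev/null
}

# Made-up timestamps: equal neighbours are allowed, a decrease is not.
printf '%s\n' 100 200 200 300 | is_chronological && echo "ordered"
printf '%s\n' 100 300 200     | is_chronological || echo "out of order"
```

Since `00.sh` runs under `set -xeuo pipefail`, a non-zero status from the real check would stop the script at that point.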
The entire pipeline operates in JST (Japan Standard Time, UTC+9):
export TZ='Asia/Tokyo'
This ensures that day-based grouping in Stage 2 uses JST dates consistently. For example, commits at "2025-04-16T23:30:00+09:00" and "2025-04-17T00:30:00+09:00" are treated as different days despite being only one hour apart, because they occur on different calendar days in JST.
The iso-strict-local date format preserves timezone information in all log files for audit purposes.
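Because every logged date already carries the JST offset, the calendar day used for grouping is simply the date prefix of the timestamp. The sketch below shows that comparison with two hypothetical helpers, `jst_day` and `same_day`; it assumes its inputs were rendered with `TZ='Asia/Tokyo'` and does no timezone conversion of its own.

```shell
# Sketch only: derive the grouping day from an ISO 8601 timestamp rendered in JST.
jst_day() {
  # Strip everything from the first "T" onward, leaving YYYY-MM-DD
  printf '%s\n' "${1%%T*}"
}

same_day() {
  [ "$(jst_day "$1")" = "$(jst_day "$2")" ]
}

# The two timestamps from the example above are one hour apart
if same_day "2025-04-16T23:30:00+09:00" "2025-04-17T00:30:00+09:00"; then
  echo "same day"
else
  echo "different days"
fi
```

Timestamps logged under a different timezone would need converting to JST first, since the date prefix alone would then name the wrong day near midnight.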