Add Discord mirror exporter, config, and docs#1

Merged
psjamesp merged 1 commit into main from discord-mirror on Apr 7, 2026

Conversation

@mikenelson-io
Contributor

This should not affect prod, but it is hard to test without Hugo in a test repo.

Introduce a PowerShell-based Discord mirror that exports selected channels into the Hugo site. Adds tools/discord-mirror/Export-DiscordMirror.ps1, config/discord-mirror.json, static assets (search index, JS, CSS), and content-generation helpers, plus three docs (deployment, moderation guide, quickstart). Also updates hugo.yaml to surface the Discord Archive in the site menu.

Copilot AI review requested due to automatic review settings April 7, 2026 16:15

Copilot AI left a comment

Pull request overview

Adds a PowerShell-based Discord mirror exporter and associated site/config/docs to publish curated Discord content into the Hugo site under /discord/.

Changes:

  • Introduces tools/discord-mirror/Export-DiscordMirror.ps1 to fetch, moderate-filter, and generate Hugo content + search assets.
  • Adds initial mirror configuration (config/discord-mirror.json) and placeholder static assets under static/discord/.
  • Updates site navigation and adds operational docs (deployment, moderation guide, quickstart).

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| tools/discord-mirror/Export-DiscordMirror.ps1 | New exporter that pulls Discord messages/threads, applies moderation rules, generates Hugo pages + search index + static assets. |
| config/discord-mirror.json | Adds initial guild/channel allowlist and export settings used by the exporter. |
| hugo.yaml | Adds a “Discord Archive” entry to the main menu pointing at /discord/. |
| static/discord/styles.css | Placeholder file indicating CSS is generated by the exporter. |
| static/discord/search.js | Placeholder file indicating JS is generated by the exporter. |
| static/discord/search-index.json | Placeholder empty search index to be replaced by the exporter. |
| DISCORD_MIRROR_SITE_OWNER_QUICKSTART.md | Quick-start steps for enabling the mirror via GitHub Actions secrets + config. |
| DISCORD_MIRROR_MODERATION_GUIDE.md | Documents moderation/approval modes and recommended rollout policies. |
| DISCORD_MIRROR_DEPLOYMENT.md | Deployment/configuration guidance and validation checklist. |

Comment on lines +537 to +570
let index = [];
try {
const response = await fetch('/$SectionPath/$SearchIndexFileName');
index = await response.json();
} catch (error) {
results.innerHTML = '<p>Search index could not be loaded.</p>';
return;
}

const render = (items) => {
if (!items.length) {
results.innerHTML = '<p>No results found.</p>';
return;
}
results.innerHTML = items.map(item => `
<div class="discord-search-result">
<p><strong><a href="${item.url}">${item.channel}</a></strong></p>
<p>${item.excerpt}</p>
<p><small>${item.author} — ${item.timestamp}</small></p>
</div>
`).join('');
};

input.addEventListener('input', () => {
const query = input.value.trim().toLowerCase();
if (!query) {
results.innerHTML = '<p>Start typing to search.</p>';
return;
}
const filtered = index.filter(item => item.text.toLowerCase().includes(query)).slice(0, 100);
render(filtered);
});

results.innerHTML = '<p>Start typing to search.</p>';
Copilot AI Apr 7, 2026

The generated search UI builds results.innerHTML using item.channel, item.excerpt, item.author, etc. Those values ultimately come from Discord message content/usernames and are not HTML-escaped, so a message containing HTML can become executable script in the search results page (XSS). Render search results using text nodes (textContent) / DOM APIs, or HTML-escape values before inserting them into the page.

Suggested change
let index = [];
try {
const response = await fetch('/$SectionPath/$SearchIndexFileName');
index = await response.json();
} catch (error) {
results.innerHTML = '<p>Search index could not be loaded.</p>';
return;
}
const render = (items) => {
if (!items.length) {
results.innerHTML = '<p>No results found.</p>';
return;
}
results.innerHTML = items.map(item => `
<div class="discord-search-result">
<p><strong><a href="${item.url}">${item.channel}</a></strong></p>
<p>${item.excerpt}</p>
<p><small>${item.author} — ${item.timestamp}</small></p>
</div>
`).join('');
};
input.addEventListener('input', () => {
const query = input.value.trim().toLowerCase();
if (!query) {
results.innerHTML = '<p>Start typing to search.</p>';
return;
}
const filtered = index.filter(item => item.text.toLowerCase().includes(query)).slice(0, 100);
render(filtered);
});
results.innerHTML = '<p>Start typing to search.</p>';
const renderStatus = (message) => {
results.textContent = '';
const paragraph = document.createElement('p');
paragraph.textContent = message;
results.appendChild(paragraph);
};
let index = [];
try {
const response = await fetch('/$SectionPath/$SearchIndexFileName');
index = await response.json();
} catch (error) {
renderStatus('Search index could not be loaded.');
return;
}
const render = (items) => {
results.textContent = '';
if (!items.length) {
renderStatus('No results found.');
return;
}
items.forEach(item => {
const result = document.createElement('div');
result.className = 'discord-search-result';
const titleParagraph = document.createElement('p');
const strong = document.createElement('strong');
const link = document.createElement('a');
link.href = item.url;
link.textContent = item.channel;
strong.appendChild(link);
titleParagraph.appendChild(strong);
const excerptParagraph = document.createElement('p');
excerptParagraph.textContent = item.excerpt;
const metaParagraph = document.createElement('p');
const small = document.createElement('small');
small.textContent = `${item.author} — ${item.timestamp}`;
metaParagraph.appendChild(small);
result.appendChild(titleParagraph);
result.appendChild(excerptParagraph);
result.appendChild(metaParagraph);
results.appendChild(result);
});
};
input.addEventListener('input', () => {
const query = input.value.trim().toLowerCase();
if (!query) {
renderStatus('Start typing to search.');
return;
}
const filtered = index.filter(item => item.text.toLowerCase().includes(query)).slice(0, 100);
render(filtered);
});
renderStatus('Start typing to search.');

Copilot uses AI. Check for mistakes.
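
The comment above also allows HTML-escaping as an alternative to DOM APIs. A minimal sketch of that option, assuming the template-string renderer is kept; `escapeHtml` and `renderResultHtml` are hypothetical helper names and are not part of this PR:

```javascript
// Hypothetical helper: escape the five characters that are significant in HTML
// text and attribute values so untrusted strings are inserted as plain text.
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

function escapeHtml(value) {
  return String(value ?? '').replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Mirrors the markup of the original render(), but escapes every index field first.
function renderResultHtml(item) {
  return `
<div class="discord-search-result">
<p><strong><a href="${escapeHtml(item.url)}">${escapeHtml(item.channel)}</a></strong></p>
<p>${escapeHtml(item.excerpt)}</p>
<p><small>${escapeHtml(item.author)} &mdash; ${escapeHtml(item.timestamp)}</small></p>
</div>`;
}
```

Escaping alone does not stop a `javascript:` URL in `href`, so `item.url` would still need separate validation with this approach.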
Comment on lines +639 to +651
foreach ($message in $approved) {
$text = Convert-DiscordMentions -Text ([string]$message.content) -Message $message -ChannelLookup $channelLookup -SanitizeMentions:([bool]$export.sanitizeMentions)
if ([string]::IsNullOrWhiteSpace($text)) { continue }
$excerpt = $text
if ($excerpt.Length -gt 220) { $excerpt = $excerpt.Substring(0,220) + '…' }
$searchIndex.Add([pscustomobject]@{
channel = $page.title
url = "$($page.url)#msg-$($message.id)"
author = (Get-MessageAuthorName -Message $message)
timestamp = [datetimeoffset]::Parse($message.timestamp).ToString('yyyy-MM-dd HH:mm') + ' UTC'
text = $text
excerpt = $excerpt
})
Copilot AI Apr 7, 2026

The search index records store raw Discord message text in text/excerpt (and also author/channel). Since those fields are later rendered into the search results HTML, they should be treated as untrusted input. Consider exporting an explicitly escaped/encoded variant (or only plain text) and ensure the frontend never injects these fields as HTML.
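
One way to treat the index as untrusted on the frontend is to validate each record right after it is fetched. The sketch below assumes the exporter writes root-relative URLs (such as `/discord/...`); `normalizeIndexRecord` and `loadIndex` are hypothetical names, not functions from this PR:

```javascript
// Hypothetical frontend guard: coerce every field of a search-index record to a
// string and accept only same-site, root-relative URLs. Returns null otherwise.
function normalizeIndexRecord(record) {
  if (record === null || typeof record !== 'object') return null;
  const asText = (value) => (typeof value === 'string' ? value : '');
  const url = asText(record.url);
  // Must start with exactly one "/": "//host" and "/\host" can leave the site,
  // and "javascript:..." is a scheme rather than a path.
  if (!/^\/(?![\/\\])/.test(url)) return null;
  return {
    url,
    channel: asText(record.channel),
    author: asText(record.author),
    timestamp: asText(record.timestamp),
    text: asText(record.text),
    excerpt: asText(record.excerpt),
  };
}

// Drop anything that fails validation before the index is searched or rendered.
function loadIndex(rawRecords) {
  if (!Array.isArray(rawRecords)) return [];
  return rawRecords.map(normalizeIndexRecord).filter((record) => record !== null);
}
```

This only constrains shape and link targets; the fields must still be rendered as text rather than injected as HTML.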

Comment on lines +78 to +92
foreach ($mention in $Message.mentions) {
$display = if ($mention.global_name) { $mention.global_name } elseif ($mention.username) { $mention.username } else { 'user' }
$output = $output -replace "<@!?$($mention.id)>", "@$display"
}
}

$roleMentions = @($Message.mention_roles)
foreach ($roleId in $roleMentions) {
$output = $output -replace "<@&$roleId>", '@role'
}

foreach ($key in $ChannelLookup.Keys) {
$channelName = $ChannelLookup[$key]
$output = $output -replace "<#${key}>", "#$channelName"
}
Copilot AI Apr 7, 2026

-replace treats the replacement string as a regex replacement pattern. Since $display/$channelName can contain $ or \, Discord-provided names can be mangled (e.g., $1 interpreted as a capture group). Use a MatchEvaluator/scriptblock replacement (or escape replacement metacharacters) so mention/channel display values are inserted literally.
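
The same class of bug exists outside PowerShell, which makes it easy to demonstrate. The snippet below is a JavaScript analogy only, not the exporter's code: `String.prototype.replace` also expands `$&` and `$1` in a replacement string, while a callback inserts its return value literally. The message content and display name are made up for illustration:

```javascript
// Illustration only: the display name is an invented value containing "$&",
// which JavaScript's replace() treats as "insert the matched text".
const content = 'hi <@42>';
const display = 'Cash$&Carry';
const mentionPattern = /<@!?42>/g;

// String replacement: "$&" is expanded, so the name is mangled.
const mangled = content.replace(mentionPattern, '@' + display);

// Function replacement: the return value is inserted literally.
const literal = content.replace(mentionPattern, () => '@' + display);

console.log(mangled); // hi @Cash<@42>Carry
console.log(literal); // hi @Cash$&Carry
```

The MatchEvaluator or scriptblock replacement that the comment suggests plays the role of the callback here.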

)

$author = HtmlEncode -Value (Get-MessageAuthorName -Message $Message)
$timestamp = [datetimeoffset]::Parse($Message.timestamp).ToString('yyyy-MM-dd HH:mm') + ' UTC'
Copilot AI Apr 7, 2026

The timestamp is labeled as UTC but the value is not converted to UTC before formatting. If Discord ever returns a non-UTC offset, this will display the wrong time. Convert to UTC (e.g., ToUniversalTime()) before formatting or omit the hard-coded UTC label.

Suggested change
$timestamp = [datetimeoffset]::Parse($Message.timestamp).ToString('yyyy-MM-dd HH:mm') + ' UTC'
$timestamp = [datetimeoffset]::Parse($Message.timestamp).ToUniversalTime().ToString('yyyy-MM-dd HH:mm') + ' UTC'
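
To see why the conversion matters, here is a small JavaScript sketch of the same formatting rule (convert to UTC first, then label). `formatUtc` is a hypothetical helper and the sample timestamps are invented:

```javascript
// Format an ISO 8601 timestamp as "yyyy-MM-dd HH:mm UTC", converting to UTC first.
function formatUtc(isoTimestamp) {
  const date = new Date(isoTimestamp);
  if (Number.isNaN(date.getTime())) {
    throw new RangeError(`Invalid timestamp: ${isoTimestamp}`);
  }
  // toISOString() always reports UTC, e.g. "2026-04-07T16:15:00.000Z".
  return date.toISOString().slice(0, 16).replace('T', ' ') + ' UTC';
}

console.log(formatUtc('2026-04-07T18:15:00+02:00')); // 2026-04-07 16:15 UTC
console.log(formatUtc('2026-04-07T16:15:00+00:00')); // 2026-04-07 16:15 UTC
```

Without the conversion, the first input would be printed as 18:15 and still labeled UTC, which is two hours off.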

channel = $page.title
url = "$($page.url)#msg-$($message.id)"
author = (Get-MessageAuthorName -Message $message)
timestamp = [datetimeoffset]::Parse($message.timestamp).ToString('yyyy-MM-dd HH:mm') + ' UTC'
Copilot AI Apr 7, 2026

The search index timestamp is labeled as UTC but is not converted to UTC before formatting. Convert to UTC (or avoid labeling as UTC) to prevent incorrect timestamps if the source timestamp includes a non-UTC offset.

Suggested change
timestamp = [datetimeoffset]::Parse($message.timestamp).ToString('yyyy-MM-dd HH:mm') + ' UTC'
timestamp = ([datetimeoffset]::Parse($message.timestamp).ToUniversalTime()).ToString('yyyy-MM-dd HH:mm') + ' UTC'

Comment on lines +45 to +46
Start-Sleep -Milliseconds 150
return Invoke-RestMethod -Method Get -Uri $Uri -Headers $headers
Copilot AI Apr 7, 2026

Invoke-DiscordApi uses a fixed sleep but doesn't handle Discord rate limiting (HTTP 429) or transient failures. In practice this exporter will intermittently fail on busy servers. Handle 429 responses by reading the retry_after value / rate-limit headers and retrying with backoff, and consider retrying transient network errors.

Suggested change
Start-Sleep -Milliseconds 150
return Invoke-RestMethod -Method Get -Uri $Uri -Headers $headers
$maxAttempts = 6
$attempt = 0
while ($true) {
$attempt++
try {
$response = Invoke-WebRequest -Method Get -Uri $Uri -Headers $headers
if ([string]::IsNullOrWhiteSpace($response.Content)) {
return $null
}
return ($response.Content | ConvertFrom-Json -Depth 100)
}
catch {
$statusCode = $null
$responseHeaders = $null
$responseBody = $null
$retryAfterSeconds = $null
$shouldRetry = $false
if ($_.Exception.PSObject.Properties.Name -contains 'Response' -and $null -ne $_.Exception.Response) {
$response = $_.Exception.Response
if ($response.PSObject.Properties.Name -contains 'StatusCode' -and $null -ne $response.StatusCode) {
$statusCode = [int]$response.StatusCode
}
if ($response.PSObject.Properties.Name -contains 'Headers') {
$responseHeaders = $response.Headers
}
try {
if ($response.PSObject.Properties.Name -contains 'Content' -and -not [string]::IsNullOrWhiteSpace($response.Content)) {
$responseBody = $response.Content
}
elseif ($response.PSObject.Properties.Name -contains 'GetResponseStream') {
$stream = $response.GetResponseStream()
if ($null -ne $stream) {
$reader = [System.IO.StreamReader]::new($stream)
try {
$responseBody = $reader.ReadToEnd()
}
finally {
$reader.Dispose()
$stream.Dispose()
}
}
}
}
catch {
}
}
if ($statusCode -eq 429) {
$shouldRetry = $true
if (-not [string]::IsNullOrWhiteSpace($responseBody)) {
try {
$rateLimitBody = $responseBody | ConvertFrom-Json -Depth 100
if ($null -ne $rateLimitBody.retry_after) {
$retryAfterSeconds = [double]$rateLimitBody.retry_after
}
}
catch {
}
}
if ($null -eq $retryAfterSeconds -and $null -ne $responseHeaders) {
$retryAfterHeader = $responseHeaders['Retry-After']
if (-not [string]::IsNullOrWhiteSpace($retryAfterHeader)) {
$retryAfterSeconds = [double]$retryAfterHeader
}
}
if ($null -eq $retryAfterSeconds -and $null -ne $responseHeaders) {
$resetAfterHeader = $responseHeaders['X-RateLimit-Reset-After']
if (-not [string]::IsNullOrWhiteSpace($resetAfterHeader)) {
$retryAfterSeconds = [double]$resetAfterHeader
}
}
}
elseif ($statusCode -eq 408 -or ($null -ne $statusCode -and $statusCode -ge 500)) {
$shouldRetry = $true
}
elseif ($null -eq $statusCode) {
$shouldRetry = $true
}
if (-not $shouldRetry -or $attempt -ge $maxAttempts) {
throw
}
if ($null -eq $retryAfterSeconds) {
$retryAfterSeconds = [Math]::Min([Math]::Pow(2, $attempt - 1), 30)
}
$jitterMilliseconds = Get-Random -Minimum 100 -Maximum 500
Start-Sleep -Milliseconds ([int]([Math]::Ceiling($retryAfterSeconds * 1000)) + $jitterMilliseconds)
}
}
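
The decision logic in the suggestion above is easier to review when separated from the HTTP plumbing. A compact JavaScript sketch of just that logic follows; `retryDelayMs` is a hypothetical helper, jitter is left out so the result is deterministic, and `retryAfterSeconds` stands for whichever of `retry_after`, `Retry-After`, or `X-RateLimit-Reset-After` was found:

```javascript
// Hypothetical helper mirroring the retry decision of the suggestion above:
// returns the wait in milliseconds before the next attempt, or null to give up.
// `status` is undefined when the request failed without an HTTP response.
function retryDelayMs({ status, retryAfterSeconds, attempt, maxAttempts = 6 }) {
  const transient = status === undefined || status === 408 || status >= 500;
  if (status !== 429 && !transient) return null; // e.g. 401/403/404: retrying will not help
  if (attempt >= maxAttempts) return null;

  // Prefer the server-provided wait for 429; otherwise exponential backoff capped at 30 s.
  const seconds =
    status === 429 && Number.isFinite(retryAfterSeconds)
      ? retryAfterSeconds
      : Math.min(2 ** (attempt - 1), 30);
  return Math.ceil(seconds * 1000);
}

console.log(retryDelayMs({ status: 429, retryAfterSeconds: 1.5, attempt: 1 })); // 1500
console.log(retryDelayMs({ status: 503, attempt: 3 })); // 4000
console.log(retryDelayMs({ status: 404, attempt: 1 })); // null
```

A real caller would add a small random jitter to the returned delay, as the suggestion does.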

Comment on lines +579 to +584
function Remove-GeneratedChannelDirectories {
param([string]$ContentDir)
if (-not (Test-Path -LiteralPath $ContentDir)) { return }
Get-ChildItem -LiteralPath $ContentDir -Directory | Where-Object { $_.Name -ne 'search' } | Remove-Item -Recurse -Force -ErrorAction SilentlyContinue
Get-ChildItem -LiteralPath $ContentDir -File | Where-Object { $_.Name -ne '_index.md' } | Remove-Item -Force -ErrorAction SilentlyContinue
}
Copilot AI Apr 7, 2026

Remove-GeneratedChannelDirectories recursively deletes almost everything under the export content directory. That will also delete any manually maintained pages placed under content/discord/ (or any future content not generated by the tool). Consider writing generated content under a dedicated subfolder (e.g., content/discord/generated/) or deleting only directories/files that match the configured/generated slugs.
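
The slug-based option can be reduced to a small selection step that runs before anything is deleted. The JavaScript sketch below shows only that step; `selectGeneratedEntries` is a hypothetical helper and the directory and slug names are invented:

```javascript
// Hypothetical helper: given the names found in the content directory and the
// slugs the exporter is configured to generate, return only the names that are
// safe to delete. Anything else (manual pages, _index.md, search) is left alone.
function selectGeneratedEntries(entryNames, generatedSlugs) {
  const generated = new Set(generatedSlugs);
  return entryNames.filter((name) => generated.has(name));
}

const onDisk = ['general', 'announcements', 'search', '_index.md', 'about-this-archive'];
const slugs = ['general', 'announcements'];
console.log(selectGeneratedEntries(onDisk, slugs)); // [ 'general', 'announcements' ]
```

One trade-off: a channel that is later removed from the config would leave a stale directory behind, so a manifest of previously generated slugs may be needed as well.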

Comment on lines +426 to +435
<div class="discord-channel-page">
<div class="discord-channel-meta">
<p><strong>Source channel:</strong> #$slug</p>
<p><strong>Exported messages:</strong> $count</p>
<p><strong>Last generated:</strong> $generated</p>
</div>
<div class="discord-message-list">
$($htmlBlocks -join "`n")
</div>
</div>
Copilot AI Apr 7, 2026

Write-StaticAssets generates styles.css, but none of the generated pages include a <link> tag to load it (and the theme doesn’t reference /discord/styles.css). As a result, the mirrored pages/search page will render unstyled. Add a stylesheet reference in the generated Markdown/HTML (or include it via a Hugo template/partial for the /discord/ section).

Comment on lines +501 to +507
function Write-StaticAssets {
param(
[string]$StaticDir,
[string]$SearchIndexFileName,
[string]$SectionPath,
[string]$FooterText
)
Copilot AI Apr 7, 2026

Write-StaticAssets takes FooterText but doesn’t use it, and the generated pages don’t include the configured footer/disclaimer text. Either remove the unused parameter/CSS class, or render FooterText into the generated pages (and load the corresponding CSS) so visitors see the publishing/disclaimer context.

Comment thread hugo.yaml
Comment on lines +60 to +62
- name: "Discord Archive"
url: "/discord/"
weight: 62
Copilot AI Apr 7, 2026

Adding the Discord Archive menu entry points users to /discord/, but this section is generated by the exporter and won’t exist when the export step is skipped (e.g., missing DISCORD_BOT_TOKEN). Consider adding a committed stub content/discord/_index.md (or making the menu entry conditional) so the site doesn’t ship with a dead link when the exporter hasn’t run.

Suggested change
- name: "Discord Archive"
url: "/discord/"
weight: 62

Contributor

@psjamesp left a comment

Looks good to me

@psjamesp merged commit 7230c88 into main on Apr 7, 2026
4 of 5 checks passed

4 participants