Initial import: web4beginners editor and deployment setup

2026-03-06 13:49:43 +01:00
commit fd9ea482bf
73 changed files with 4043 additions and 0 deletions

@@ -0,0 +1,88 @@
---
title: "feat: DOM-to-JSON content extraction for static snapshot"
type: feat
status: completed
date: 2026-03-04
---
# feat: DOM-to-JSON content extraction for static snapshot
## Overview
Create a first-stage extraction workflow that converts the existing HTML snapshot into a nested JSON content file. This plan is intentionally limited to extraction and content mapping. It does not include building the WYSIWYG editor yet.
## Problem Statement / Motivation
Content updates are currently tied to manual HTML edits. A JSON representation is needed so text and selected image properties can be adapted more easily and later edited through an interface.
## Proposed Solution
Build a deterministic DOM-to-JSON extraction flow for `web4beginners.com.html` that captures visible text, selected metadata, and image fields (`src`, `alt`).
The JSON structure should be DOM-first with section-based top-level subtopics, matching the brainstorm decisions and keeping context for editors. Duplicate text handling should follow the agreed hybrid policy: keep section-local duplicates; dedupe only clearly global/common items.
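As a rough illustration of the hybrid duplicate policy, the sketch below moves a string into a shared bucket only when it appears in several distinct sections, and leaves all other repeats where they are. The function name `apply_duplicate_policy`, the `common` bucket, and the `min_sections` threshold are assumptions made for this example; the plan itself leaves the exact "global/common" rule to be documented.

```python
from collections import Counter


def apply_duplicate_policy(sections, min_sections=3):
    """Split section texts into a shared 'common' list and per-section lists.

    `sections` maps a section name to its list of extracted strings.
    `min_sections` is a hypothetical threshold: a string counts as
    global/common once it occurs in at least that many distinct sections.
    """
    # Count in how many distinct sections each string occurs.
    spread = Counter()
    for texts in sections.values():
        spread.update(set(texts))

    # Collect common strings in order of first appearance (deterministic).
    common = []
    for texts in sections.values():
        for text in texts:
            if spread[text] >= min_sections and text not in common:
                common.append(text)

    # Section-local repeats are kept; only common strings are removed.
    local = {
        name: [text for text in texts if text not in common]
        for name, texts in sections.items()
    }
    return {"common": common, "sections": local}
```

With this rule, a string repeated twice inside one section stays duplicated there, while a string that appears in three or more sections is stored once.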
## Scope
In scope:
- Extract visible text content from page sections
- Extract metadata: `title`, `description`, Open Graph, Twitter
- Extract image fields: `img src`, `img alt`
- Produce nested JSON output aligned with DOM sections
- Define stable content identity strategy (reuse existing selectors/IDs; add `data-*` only when needed)
Out of scope:
- WYSIWYG editing UI
- Styling/layout editing
- Full responsive image source editing (`srcset`, `picture`)
- Full bidirectional sync mechanics
## Technical Considerations
- The repository is a static snapshot with bundled/minified assets; there is no existing i18n framework.
- Extraction rules must avoid pulling non-content technical strings from scripts/styles.
- Section mapping should remain stable even if content text changes.
- Output should be deterministic so repeated runs produce predictable key ordering/paths.
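One way to check the determinism requirement is to serialize the content tree without any run-specific data and compare digests across runs. The following is a minimal sketch, assuming the tree is a plain nested dict built in DOM order; `serialize` and `digest` are illustrative names, not part of the project.

```python
import hashlib
import json


def serialize(content):
    """Render the content tree as JSON text.

    Python dicts keep insertion order, so keys appear in the order the
    extractor added them (DOM order). No timestamps or random IDs are added.
    """
    return json.dumps(content, ensure_ascii=False, indent=2) + "\n"


def digest(content):
    """Return a SHA-256 digest of the serialized tree for run-to-run comparison."""
    return hashlib.sha256(serialize(content).encode("utf-8")).hexdigest()
```

If two extraction runs on unchanged input yield different digests, something non-deterministic has entered key generation or ordering.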
## SpecFlow Analysis
Primary flow:
1. Input snapshot HTML is parsed.
2. Eligible text nodes and target attributes are identified.
3. Content is grouped by top-level page sections.
4. Metadata and image fields are merged into the same JSON tree.
5. Output JSON is written.
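The primary flow can be sketched with Python's standard `html.parser`. This is an illustration only, not the project's extractor: the tag sets `SKIP_TAGS` and `SECTION_TAGS`, the fallback section names, and the output shape (`meta`, `sections`, `text`, `images`) are all assumptions made for this example.

```python
from html.parser import HTMLParser

SKIP_TAGS = {"script", "style", "noscript", "template"}  # assumed non-content tags
SECTION_TAGS = {"header", "section", "footer"}           # assumed section boundaries


class ContentExtractor(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.content = {"meta": {}, "sections": {}}
        self._skip_depth = 0     # > 0 while inside a skipped tag
        self._section_depth = 0  # nesting level of section tags
        self._section = None     # currently open top-level section
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif self._skip_depth:
            return
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Covers description, Twitter (name=...) and Open Graph (property=...).
            key = attr.get("name") or attr.get("property")
            if key and attr.get("content") is not None:
                self.content["meta"][key] = attr["content"]
        elif tag in SECTION_TAGS:
            self._section_depth += 1
            if self._section_depth == 1:
                # Reuse an existing id; otherwise fall back to tag plus position.
                fallback = f"{tag}-{len(self.content['sections']) + 1}"
                name = attr.get("id") or fallback
                self._section = self.content["sections"].setdefault(
                    name, {"text": [], "images": []}
                )
        elif tag == "img" and self._section is not None:
            # A missing alt attribute is recorded as None rather than guessed.
            self._section["images"].append(
                {"src": attr.get("src", ""), "alt": attr.get("alt")}
            )

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag == "title":
            self._in_title = False
        elif tag in SECTION_TAGS and self._section_depth > 0:
            self._section_depth -= 1
            if self._section_depth == 0:
                self._section = None

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if not text or self._skip_depth:
            return  # drop whitespace-only nodes and script/style content
        if self._in_title:
            self.content["meta"]["title"] = text
        elif self._section is not None:
            self._section["text"].append(text)


def extract(html):
    parser = ContentExtractor()
    parser.feed(html)
    parser.close()
    return parser.content
```

The sketch ignores text outside the assumed section tags and does not handle conditionally visible content such as cookie dialogs; a real extractor would need explicit rules for both.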
Edge cases to cover:
- Empty or whitespace-only nodes
- Repeated text across sections
- Links/buttons with nested elements
- Missing `alt` attributes
- Cookie/modal/footer content that may be conditionally visible
## Acceptance Criteria
- [x] A single extraction run generates one nested JSON file from `web4beginners.com.html`.
- [x] JSON includes visible page text grouped by section subtopics.
- [x] JSON includes `title`, `description`, Open Graph, and Twitter metadata values.
- [x] JSON includes `img src` and `img alt` values where present.
- [x] Duplicate policy is applied: section-local duplicates kept; global/common duplicates deduped.
- [x] Extraction excludes JS/CSS artifacts and non-content noise.
- [x] Re-running extraction on unchanged input produces stable output structure.
## Success Metrics
- Editors can locate and update target strings in JSON without editing HTML directly.
- JSON organization is understandable by section/context without reverse-engineering selectors.
- No unintended layout/content regressions in source HTML (read-only extraction phase).
## Dependencies & Risks
Dependencies:
- Final agreement on section boundaries for grouping
- Final output file location/name convention
Risks:
- Over-extraction of non-user-facing strings
- Unstable keys if selector strategy is inconsistent
- Ambiguity around “global/common” duplicate classification
Mitigations:
- Explicit extraction allowlist for elements/attributes
- Deterministic key-generation policy
- Documented duplicate decision rules with examples
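A deterministic key-generation policy could look like the sketch below. It derives keys from identity and position rather than from the text itself, so a key stays the same when the wording changes. The precedence order, the `data_key` parameter (standing in for a hypothetical `data-*` attribute), and the key format are assumptions for illustration.

```python
import re


def slugify(value):
    """Lowercase and replace runs of non-alphanumeric characters with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


def content_key(section, tag, index, element_id=None, data_key=None):
    """Build a stable key for one content node.

    Precedence (assumed): explicit data-* key, then existing element id,
    then tag name plus position within the section.
    """
    if data_key:
        return data_key
    if element_id:
        return f"{section}.{slugify(element_id)}"
    return f"{section}.{tag}-{index}"
```

Position-based fallbacks shift when elements are inserted or removed, which is the case where adding an explicit `data-*` attribute would be worth it.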
## References & Research
- Brainstorm: `docs/brainstorms/2026-03-03-dom-json-wysiwyg-sync-brainstorm.md`
- Source snapshot: `web4beginners.com.html`
- Existing site bundle references: `web4beginners.com_files/*`

@@ -0,0 +1,103 @@
---
title: "feat: Inline WYSIWYG editor with HTML-JSON sync"
type: feat
status: completed
date: 2026-03-04
---
# feat: Inline WYSIWYG editor with HTML-JSON sync
## Overview
Build step 2 of the content workflow: a local-first WYSIWYG editor for the existing static page where creators can directly edit content on the rendered page and persist changes to both HTML and JSON.
This plan is limited to content editing and synchronization behavior. It explicitly excludes style/layout editing.
## Problem Statement / Motivation
The repository now has extractable structured content (`content/site-content.de.json`) but no practical editing surface for creators. Editors need direct, low-friction page editing (double-click text, click image) while keeping HTML and JSON in sync.
## Proposed Solution
Add an in-page edit mode with inline `contenteditable` text editing and an image overlay editor for `src` and `alt`. Implement autosave (blur/enter) plus manual save/undo controls. Persist edits via a local helper service that writes both `web4beginners.com.html` and `content/site-content.de.json`.
Synchronization is intended to be bidirectional. When the HTML and JSON values for the same mapped key diverge, the default conflict resolution is that the HTML value wins.
## Scope
In scope:
- Text editing on double-click for editable content nodes
- Image editing overlay for `src` and `alt`
- Non-blocking aspect-ratio warning when a replacement image's ratio differs by more than 15%
- Autosave + manual save/undo controls
- Local persistence endpoint to write HTML + JSON
- Content identity mapping between DOM elements and JSON keys
- Conflict handling policy: HTML wins
Out of scope:
- CSS/layout editing
- `srcset`/`picture` editing
- Multi-user collaboration or remote persistence
- Authentication/authorization layer
## Technical Considerations
- Current site scripts are bundled/minified, so editor behavior should be isolated in a dedicated script layer.
- Content identity mapping must be stable enough for repeat edits and sync.
- Editing rules should avoid hidden/system nodes (cookie mechanics/scripts/non-content regions unless explicitly intended).
- Local persistence requires a trusted local helper process and clear file write boundaries.
- Undo scope must be defined (session-level content undo, not full VCS-like history).
## SpecFlow Analysis
Primary flow:
1. The user switches the page into edit mode.
2. The user double-clicks a text node and edits it inline.
3. Leaving the field (blur) or pressing Enter triggers the autosave path.
4. Mapping resolves edited node to JSON key.
5. HTML and JSON are both updated and persisted via local helper.
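Steps 4 and 5 depend on resolving an edited node to exactly one location in the JSON tree. The sketch below assumes the mapping yields a path of dict keys and list indices; `set_by_path` is a hypothetical helper that refuses to create new keys, so a stale mapping fails loudly instead of writing to the wrong place.

```python
def set_by_path(content, path, value):
    """Replace the value at `path` in a nested dict/list tree.

    `path` is a sequence of dict keys and list indices. Raises KeyError or
    IndexError when the path does not exist. Returns the previous value.
    """
    node = content
    for key in path[:-1]:
        node = node[key]  # fails if the mapping points at a missing branch
    last = path[-1]
    if isinstance(node, dict) and last not in node:
        raise KeyError(last)  # never create keys implicitly
    previous = node[last]
    node[last] = value
    return previous
```

Returning the previous value gives the caller what it needs for a simple session-level undo stack.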
Image flow:
1. User selects editable image.
2. Overlay opens with current `src` and `alt`.
3. New source is validated; ratio warning shown if aspect ratio differs by >15%.
4. User can still save despite warning.
5. HTML and JSON are updated and persisted.
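The ratio check in step 3 can be expressed as a small pure function. The plan does not say how the difference is measured; this sketch assumes a relative deviation from the original image's width-to-height ratio, and the name `ratio_warning` is illustrative.

```python
def ratio_warning(old_width, old_height, new_width, new_height, threshold=0.15):
    """Return True when the new aspect ratio deviates by more than `threshold`.

    Deviation is measured relative to the original ratio (an assumption).
    The result is advisory only; the caller may still save.
    """
    if min(old_width, old_height, new_width, new_height) <= 0:
        raise ValueError("image dimensions must be positive")
    old_ratio = old_width / old_height
    new_ratio = new_width / new_height
    deviation = abs(new_ratio - old_ratio) / old_ratio
    return deviation > threshold
```

For example, swapping a 1600x900 image for an 800x450 one produces no warning, while swapping it for a square 1000x1000 image does.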
Conflict flow:
1. Divergent values detected for same mapped key.
2. Default resolution applies: HTML value wins.
3. JSON is reconciled to HTML on save.
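The HTML-wins rule can be sketched over two flat dictionaries keyed by mapped content key. Treating both sides as flat mappings and touching only keys present in both are simplifying assumptions for this example; `reconcile` is a hypothetical name.

```python
def reconcile(html_values, json_values):
    """Apply the HTML-wins default to keys present on both sides.

    Returns the reconciled JSON mapping and the list of keys that diverged.
    Keys that exist on only one side are left untouched (an assumption).
    """
    reconciled = dict(json_values)  # do not mutate the caller's data
    conflicts = []
    for key, html_value in html_values.items():
        if key in reconciled and reconciled[key] != html_value:
            conflicts.append(key)
            reconciled[key] = html_value  # HTML value wins
    return reconciled, conflicts
```

Returning the list of diverged keys makes it possible to log or display conflicts during development rather than resolving them silently.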
## Acceptance Criteria
- [x] Double-click enables inline text editing on intended content elements.
- [x] Text autosaves on blur/enter and also supports explicit save/undo controls.
- [x] Clicking editable images opens an overlay with `src` and `alt` fields.
- [x] Image ratio check shows a non-blocking warning when the replacement's aspect ratio differs by more than 15%.
- [x] Save operation persists both `web4beginners.com.html` and `content/site-content.de.json`.
- [x] Sync mapping updates the correct JSON key for edited text/image values.
- [x] Conflict resolution follows HTML-wins default.
- [x] No CSS/layout properties are modified by editor actions.
## Success Metrics
- Editors can modify headings/body text/images directly on page without manual JSON editing.
- Saved output remains consistent between HTML and JSON for edited items.
- Editing interactions feel immediate and require minimal training.
- No unintended style/layout changes caused by the editor.
## Dependencies & Risks
Dependencies:
- Defined DOM↔JSON mapping contract for editable nodes
- Local helper service runtime available in the editor environment
Risks:
- Incorrect key mapping leading to wrong JSON updates
- Over-editability (allowing non-content nodes)
- Unexpected side effects from integrating with existing bundled scripts
- File write race conditions during rapid autosave
Mitigations:
- Explicit editable-node allowlist and mapping tests
- Isolated editor namespace/events
- Debounced autosave + write serialization
- Dry-run/preview diagnostics for mapping during development
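The "debounced autosave + write serialization" mitigation might look like the following on the helper side. This is a minimal sketch under stated assumptions: it covers only the JSON file, the class name `DebouncedWriter` and the default delay are invented for illustration, and the temp-file-plus-rename step is an addition to avoid half-written files rather than something the plan specifies.

```python
import json
import os
import tempfile
import threading


class DebouncedWriter:
    def __init__(self, path, delay=0.5):
        self.path = path
        self.delay = delay                   # seconds of quiet before writing
        self._state_lock = threading.Lock()  # guards _pending and _timer
        self._write_lock = threading.Lock()  # allows one file write at a time
        self._pending = None
        self._timer = None

    def request_save(self, content):
        """Remember the newest content and restart the debounce timer."""
        with self._state_lock:
            self._pending = content
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.delay, self.flush)
            self._timer.daemon = True
            self._timer.start()

    def flush(self):
        """Write pending content now. Returns False if nothing was pending."""
        with self._write_lock:
            # Take the pending content while holding the write lock so an
            # older snapshot can never be written after a newer one.
            with self._state_lock:
                content, self._pending = self._pending, None
                if self._timer is not None:
                    self._timer.cancel()
                    self._timer = None
            if content is None:
                return False
            self._atomic_write(content)
            return True

    def _atomic_write(self, content):
        # Write to a temp file in the same directory, then rename over the
        # target so readers never see a partially written file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(content, handle, ensure_ascii=False, indent=2)
        os.replace(tmp_path, self.path)
```

Rapid edits call `request_save` repeatedly and only the last state is written once input goes quiet; a manual save button can call `flush` directly.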
## References & Research
- Brainstorm input: `docs/brainstorms/2026-03-04-wysiwyg-inline-editor-sync-brainstorm.md`
- Prior extraction plan: `docs/plans/2026-03-04-feat-dom-to-json-content-extraction-plan.md`
- Extractor/source contract: `scripts/extract_dom_content.php`, `content/site-content.de.json`
- Target HTML: `web4beginners.com.html`