Initial import: web4beginners editor and deployment setup

2026-03-06 13:49:43 +01:00
commit fd9ea482bf
73 changed files with 4043 additions and 0 deletions
--- a/docs/plans/2026-03-04-feat-dom-to-json-content-extraction-plan.md
+++ b/docs/plans/2026-03-04-feat-dom-to-json-content-extraction-plan.md
@@ -0,0 +1,88 @@
+---
+title: "feat: DOM-to-JSON content extraction for static snapshot"
+type: feat
+status: completed
+date: 2026-03-04
+---
+
+# feat: DOM-to-JSON content extraction for static snapshot
+
+## Overview
+Create a first-stage extraction workflow that converts the existing HTML snapshot into a nested JSON content file. This plan is intentionally limited to extraction and content mapping. It does not include building the WYSIWYG editor yet.
+
+## Problem Statement / Motivation
+Content updates are currently tied to manual HTML edits. A JSON representation is needed so text and selected image properties can be adapted more easily and later edited through an interface.
+
+## Proposed Solution
+Build a deterministic DOM-to-JSON extraction flow for `web4beginners.com.html` that captures visible text, selected metadata, and image fields (`src`, `alt`).
+
+The JSON structure should be DOM-first, with section-based subtopics at the top level, matching the brainstorm decisions and preserving context for editors. Duplicate text handling should follow the agreed hybrid policy: keep section-local duplicates and deduplicate only clearly global/common items.
+
+## Scope
+In scope:
+- Extract visible text content from page sections
+- Extract metadata: `title`, `description`, Open Graph, Twitter
+- Extract image fields: `img src`, `img alt`
+- Produce nested JSON output aligned with DOM sections
+- Define stable content identity strategy (reuse existing selectors/IDs; add `data-*` only when needed)
+
+Out of scope:
+- WYSIWYG editing UI
+- Styling/layout editing
+- Full responsive image source editing (`srcset`, `picture`)
+- Full bidirectional sync mechanics
+
+## Technical Considerations
+- The repository is a static snapshot with bundled/minified assets; there is no existing i18n framework.
+- Extraction rules must avoid pulling non-content technical strings from scripts/styles.
+- Section mapping should remain stable even if content text changes.
+- Output should be deterministic so repeated runs produce predictable key ordering/paths.
+
+## SpecFlow Analysis
+Primary flow:
+1. Input snapshot HTML is parsed.
+2. Eligible text nodes and target attributes are identified.
+3. Content is grouped by top-level page sections.
+4. Metadata and image fields are merged into the same JSON tree.
+5. Output JSON is written.
+
+Edge cases to cover:
+- Empty or whitespace-only nodes
+- Repeated text across sections
+- Links/buttons with nested elements
+- Missing `alt` attributes
+- Cookie/modal/footer content that may be conditionally visible
+
+## Acceptance Criteria
+- [x] A single extraction run generates one nested JSON file from `web4beginners.com.html`.
+- [x] JSON includes visible page text grouped by section subtopics.
+- [x] JSON includes `title`, `description`, Open Graph, and Twitter metadata values.
+- [x] JSON includes `img src` and `img alt` values where present.
+- [x] Duplicate policy is applied: section-local duplicates kept; global/common duplicates deduped.
+- [x] Extraction excludes JS/CSS artifacts and non-content noise.
+- [x] Re-running extraction on unchanged input produces stable output structure.
+
+## Success Metrics
+- Editors can locate and update target strings in JSON without editing HTML directly.
+- JSON organization is understandable by section/context without reverse-engineering selectors.
+- No unintended layout/content regressions in source HTML (read-only extraction phase).
+
+## Dependencies & Risks
+Dependencies:
+- Final agreement on section boundaries for grouping
+- Final output file location/name convention
+
+Risks:
+- Over-extraction of non-user-facing strings
+- Unstable keys if selector strategy is inconsistent
+- Ambiguity around “global/common” duplicate classification
+
+Mitigations:
+- Explicit extraction allowlist for elements/attributes
+- Deterministic key-generation policy
+- Documented duplicate decision rules with examples
+
+## References & Research
+- Brainstorm: `docs/brainstorms/2026-03-03-dom-json-wysiwyg-sync-brainstorm.md`
+- Source snapshot: `web4beginners.com.html`
+- Existing site bundle references: `web4beginners.com_files/*`
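The primary flow in the extraction plan above (parse, filter, group by section, merge metadata and image fields, write JSON) can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not the repository's extractor. It assumes that sections are `<section>` elements identified by their `id`, that content outside any section falls into a hypothetical `page` bucket, and that `script`, `style`, `noscript`, and `template` are the only non-content containers. It does not check CSS visibility and does not implement the duplicate policy.

```python
import json
from html.parser import HTMLParser

# Assumption: these containers never hold user-facing content.
SKIP_TAGS = {"script", "style", "noscript", "template"}


class SnapshotExtractor(HTMLParser):
    """Collects text, image fields, and selected metadata per section."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.meta = {}
        self.sections = {}
        self._section_stack = []
        self._skip_depth = 0
        self._in_title = False

    def _bucket(self):
        # Content outside any <section> goes to the hypothetical "page" bucket.
        key = self._section_stack[-1] if self._section_stack else "page"
        return self.sections.setdefault(key, {"text": [], "images": []})

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = a.get("name") or a.get("property") or ""
            if name == "description" or name.startswith(("og:", "twitter:")):
                self.meta[name] = a.get("content") or ""
        elif tag == "section":
            inherited = self._section_stack[-1] if self._section_stack else "page"
            self._section_stack.append(a.get("id") or inherited)
        elif tag == "img" and self._skip_depth == 0:
            # A missing alt attribute is recorded as None (null in JSON).
            self._bucket()["images"].append({"src": a.get("src"), "alt": a.get("alt")})

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False
        elif tag == "section" and self._section_stack:
            self._section_stack.pop()

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace, drop empty nodes
        if not text or self._skip_depth > 0:
            return
        if self._in_title:
            self.meta["title"] = text
        else:
            self._bucket()["text"].append(text)


def extract(html: str) -> dict:
    parser = SnapshotExtractor()
    parser.feed(html)
    parser.close()
    return {"meta": parser.meta, "sections": parser.sections}


def to_json(content: dict) -> str:
    # sort_keys keeps key ordering stable across runs; lists stay in document order.
    return json.dumps(content, ensure_ascii=False, indent=2, sort_keys=True)
```

Sorting keys while leaving lists in document order is one way to meet the "stable output structure" criterion: the same input always serializes to the same bytes, and editors still see text in reading order.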
--- a/docs/plans/2026-03-04-feat-inline-wysiwyg-html-json-sync-plan.md
+++ b/docs/plans/2026-03-04-feat-inline-wysiwyg-html-json-sync-plan.md
@@ -0,0 +1,103 @@
+---
+title: "feat: Inline WYSIWYG editor with HTML-JSON sync"
+type: feat
+status: completed
+date: 2026-03-04
+---
+
+# feat: Inline WYSIWYG editor with HTML-JSON sync
+
+## Overview
+Build step 2 of the content workflow: a local-first WYSIWYG editor for the existing static page where creators can directly edit content on the rendered page and persist changes to both HTML and JSON.
+
+This plan is limited to content editing and synchronization behavior. It explicitly excludes style/layout editing.
+
+## Problem Statement / Motivation
+The repository now has extractable structured content (`content/site-content.de.json`) but no practical editing surface for creators. Editors need direct, low-friction page editing (double-click text, click image) while keeping HTML and JSON in sync.
+
+## Proposed Solution
+Add an in-page edit mode with inline `contenteditable` text editing and an image overlay editor for `src` and `alt`. Implement autosave on blur or Enter, plus manual save/undo controls. Persist edits via a local helper service that writes both `web4beginners.com.html` and `content/site-content.de.json`.
+
+Synchronization is intended to be bidirectional. When the HTML and JSON values for the same mapped key diverge, the default conflict resolution is that HTML wins.
+
+## Scope
+In scope:
+- Text editing on double-click for editable content nodes
+- Image editing overlay for `src` and `alt`
+- Non-blocking aspect-ratio warning when a replacement image's ratio differs by more than 15%
+- Autosave + manual save/undo controls
+- Local persistence endpoint to write HTML + JSON
+- Content identity mapping between DOM elements and JSON keys
+- Conflict handling policy: HTML wins
+
+Out of scope:
+- CSS/layout editing
+- `srcset`/`picture` editing
+- Multi-user collaboration or remote persistence
+- Authentication/authorization layer
+
+## Technical Considerations
+- Current site scripts are bundled/minified, so editor behavior should be isolated in a dedicated script layer.
+- Content identity mapping must be stable enough for repeat edits and sync.
+- Editing rules should avoid hidden/system nodes, such as cookie mechanics, scripts, and non-content regions, unless explicitly intended.
+- Local persistence requires a trusted local helper process and clear file write boundaries.
+- Undo scope must be defined (session-level content undo, not full VCS-like history).
+
+## SpecFlow Analysis
+Primary flow:
+1. Editor enters edit mode.
+2. User double-clicks text node and edits inline.
+3. Blur or Enter triggers the autosave path.
+4. Mapping resolves edited node to JSON key.
+5. HTML and JSON are both updated and persisted via local helper.
+
+Image flow:
+1. User selects editable image.
+2. Overlay opens with current `src` and `alt`.
+3. The new source is validated; a ratio warning is shown if its aspect ratio differs from the current image's by more than 15%.
+4. User can still save despite warning.
+5. HTML and JSON are updated and persisted.
+
+Conflict flow:
+1. Divergent values are detected for the same mapped key.
+2. Default resolution applies: HTML value wins.
+3. JSON is reconciled to HTML on save.
+
+## Acceptance Criteria
+- [x] Double-click enables inline text editing on intended content elements.
+- [x] Text autosaves on blur/enter and also supports explicit save/undo controls.
+- [x] Clicking editable images opens an overlay with `src` and `alt` fields.
+- [x] Image ratio check shows a non-blocking warning when the replacement's aspect ratio differs by more than 15%.
+- [x] Save operation persists both `web4beginners.com.html` and `content/site-content.de.json`.
+- [x] Sync mapping updates the correct JSON key for edited text/image values.
+- [x] Conflict resolution follows HTML-wins default.
+- [x] No CSS/layout properties are modified by editor actions.
+
+## Success Metrics
+- Editors can modify headings/body text/images directly on page without manual JSON editing.
+- Saved output remains consistent between HTML and JSON for edited items.
+- Editing interactions feel immediate and require minimal training.
+- No unintended style/layout changes caused by the editor.
+
+## Dependencies & Risks
+Dependencies:
+- Defined DOM↔JSON mapping contract for editable nodes
+- Local helper service runtime available in the editor environment
+
+Risks:
+- Incorrect key mapping leading to wrong JSON updates
+- Over-editability (allowing non-content nodes)
+- Unexpected side effects from integrating with existing bundled scripts
+- File write race conditions during rapid autosave
+
+Mitigations:
+- Explicit editable-node allowlist and mapping tests
+- Isolated editor namespace/events
+- Debounced autosave + write serialization
+- Dry-run/preview diagnostics for mapping during development
+
+## References & Research
+- Brainstorm input: `docs/brainstorms/2026-03-04-wysiwyg-inline-editor-sync-brainstorm.md`
+- Prior extraction plan: `docs/plans/2026-03-04-feat-dom-to-json-content-extraction-plan.md`
+- Extractor/source contract: `scripts/extract_dom_content.php`, `content/site-content.de.json`
+- Target HTML: `web4beginners.com.html`
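The image-flow ratio check and the conflict flow in the plan above can be sketched as follows. This is a hedged illustration in Python, not the editor's actual code. It assumes the 15% threshold is measured relative to the current image's width/height ratio (the plan does not say which ratio is the baseline), and it uses hypothetical flat keys such as `hero.title` to stand in for the DOM↔JSON mapping.

```python
def aspect_ratio_warning(old_size, new_size, threshold=0.15):
    """Return True when the replacement's aspect ratio differs by more than threshold.

    Assumption: the difference is relative to the current (old) image's ratio.
    The result is advisory only; callers may still save.
    """
    old_w, old_h = old_size
    new_w, new_h = new_size
    if min(old_w, old_h, new_w, new_h) <= 0:
        raise ValueError("image dimensions must be positive")
    old_ratio = old_w / old_h
    new_ratio = new_w / new_h
    return abs(new_ratio - old_ratio) / old_ratio > threshold


def reconcile_html_wins(html_values, json_values):
    """Merge two key -> value maps, letting the HTML value win on divergence.

    Returns the reconciled map and the sorted list of keys that conflicted.
    Keys present only in JSON are kept unchanged.
    """
    merged = dict(json_values)
    conflicts = []
    for key in sorted(html_values):
        if key in json_values and json_values[key] != html_values[key]:
            conflicts.append(key)
        merged[key] = html_values[key]  # HTML is the default source of truth
    return merged, conflicts
```

For example, `aspect_ratio_warning((1600, 900), (1200, 900))` returns `True` because a 4:3 replacement is 25% narrower in ratio than the 16:9 original, while a same-ratio resize such as `(800, 450)` does not warn. Returning the list of conflicting keys alongside the merged map gives the mapping diagnostics mentioned under Mitigations something concrete to report.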