feat: enrich /lookup with university domain list check

Add a second detection path alongside ASN lookup: a self-maintained
list of university domains (uni_domains.txt) loaded at startup.

- New /lookup param: email= (the domain is extracted from the address); domain= is unchanged
- Suffix matching: insti.uni-stuttgart.de matches list entry uni-stuttgart.de
  without false positives (evil-uni-stuttgart.de does not match)
- New response fields: asn_match, domain_match, matched_domain (omitempty)
- nren remains true if either asn_match OR domain_match is true (backwards compat)
- /healthz now returns JSON body: {"asn_count":N,"domain_count":N}
- asn-updater: new update_uni_domains() merges hs-kompass.de TSV + Hipo JSON
  (configurable via UNI_DOMAIN_COUNTRIES / HS_KOMPASS_URL env vars)
- 7 new tests; all existing tests pass unchanged
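
The suffix-matching rule above can be sketched roughly as follows. This is a
minimal illustration, not the code from this commit: the names
`domainFromEmail` and `matchDomain` and the map-based list are assumptions.
The idea is to compare whole DNS labels, stripping one label at a time, so
that a plain string-suffix hit such as evil-uni-stuttgart.de never matches.

```go
package main

import (
	"fmt"
	"strings"
)

// domainFromEmail (hypothetical helper) returns the lowercased part after
// the last "@", or "" if the address has no domain part.
func domainFromEmail(email string) string {
	i := strings.LastIndex(email, "@")
	if i < 0 || i == len(email)-1 {
		return ""
	}
	return strings.ToLower(strings.TrimSpace(email[i+1:]))
}

// matchDomain (hypothetical helper) checks host and each of its parent
// domains against the list. It only cuts at label boundaries (dots), so
// "evil-uni-stuttgart.de" is never compared as "uni-stuttgart.de".
func matchDomain(list map[string]struct{}, host string) (string, bool) {
	host = strings.TrimSuffix(strings.ToLower(strings.TrimSpace(host)), ".")
	for host != "" {
		if _, ok := list[host]; ok {
			return host, true
		}
		i := strings.IndexByte(host, '.')
		if i < 0 {
			break
		}
		host = host[i+1:] // drop the leftmost label and retry
	}
	return "", false
}

func main() {
	// Assumed in-memory form of uni_domains.txt: one entry per map key.
	list := map[string]struct{}{"uni-stuttgart.de": {}}
	hosts := []string{
		"insti.uni-stuttgart.de",
		"evil-uni-stuttgart.de",
		domainFromEmail("someone@insti.uni-stuttgart.de"),
	}
	for _, h := range hosts {
		matched, ok := matchDomain(list, h)
		fmt.Printf("%s domain_match=%t matched_domain=%q\n", h, ok, matched)
	}
}
```

A set lookup per label keeps each check proportional to the number of labels
in the host rather than to the size of the list.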

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 15:10:49 +01:00
parent 15476898c3
commit 082ecc579a
7 changed files with 435 additions and 60 deletions

@@ -42,6 +42,11 @@ Client
- Category: `Research and Education`
- updated monthly
- **University domain list (`uni_domains.txt`)**
- merged from the hs-kompass.de TSV and the Hipo university-domains-list JSON
- country filter configurable via `UNI_DOMAIN_COUNTRIES` (default: `DE,AT`)
- after each update: `uni_domains_meta.json` with per-source counts
## Provided headers
| Header | Description |
@@ -85,6 +90,8 @@ Please add these to the service for which the desired headers
- `PDB_BASE`, `PDB_INFO_TYPE`, `PDB_LIMIT`: PeeringDB filters.
- `HTTP_TIMEOUT`: timeout per HTTP request.
- `INTERVAL_SECONDS`: update interval (default: 30 days).
- `UNI_DOMAIN_COUNTRIES`: ISO country codes for the Hipo filter (default: `DE,AT`).
- `HS_KOMPASS_URL`: URL of the hs-kompass.de TSV file (can be overridden without rebuilding the image).
## Update strategy