Description
Complete SEO meta tag parser for HTML. It extracts title, description, canonical, lang, robots, viewport, Open Graph (og:title, og:description, og:image, og:type, og:url), Twitter Card (twitter:card, title, description, image, site, creator), structured JSON-LD (Schema.org Article, Product, Organization, BreadcrumbList, FAQPage, HowTo, Recipe), favicon, multi-language hreflang, og:locale, and theme-color. The output follows a uniform schema suited to automated SEO scoring, feeding an audit DB, generating email reports, and AI agent rewrite suggestions. Difference from the sibling actions: action_meta_extract does PURELY local parsing (HTML → structured object, no network). For full SEO scoring with weighted rules, use action_seo_audit downstream. For keyword density analysis, use action_keyword_density. For a broken-link audit of a page, use action_link_audit. For redirect chain analysis, use action_redirect_chain. Typical pipeline: action_web_fetch_advanced (fetch HTML) → action_meta_extract (parse meta) → action_seo_audit (score) → action_send_email (digest report). Zero network in this node: parsing is local, via cheerio plus robust regexes. Tolerant of malformed HTML (improperly closed tags, unquoted attributes, inconsistent UTF-8/Latin-1 encoding). Output: `{ title, description, canonical, lang, robots, viewport, openGraph: {...}, twitter: {...}, jsonLd: [{...}], favicon, hreflang: [{lang, href}], meta_extra: { theme-color, generator, application-name, ... } }`. JSON-LD is normalized to an array (even when the HTML contains a single block) so the downstream loop is uniform. hreflang values are validated (ISO 639-1 language code + ISO 3166-1 region).
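The JSON-LD normalization and hreflang validation described above could look roughly like the sketch below. It is not the node's actual implementation: `normalizeJsonLd`, `isValidHreflang`, and `filterHreflang` are hypothetical names, the hreflang check only tests the shape of the code (two-letter language, optional two-letter region) rather than looking it up in the ISO lists, and accepting `x-default` is an assumption.

```typescript
// Hypothetical helpers for illustration; not the action's real API.
type JsonLdNode = Record<string, unknown>;

// Turn the raw bodies of <script type="application/ld+json"> tags into one flat array.
// A single object, an array of objects, or several blocks all end up in the same shape.
function normalizeJsonLd(rawBlocks: string[]): JsonLdNode[] {
  const nodes: JsonLdNode[] = [];
  for (const raw of rawBlocks) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      continue; // malformed block: skip it instead of crashing
    }
    const items = Array.isArray(parsed) ? parsed : [parsed];
    for (const item of items) {
      // Keep only plain objects; strings, numbers, null, and nested arrays are dropped.
      if (item !== null && typeof item === "object" && !Array.isArray(item)) {
        nodes.push(item as JsonLdNode);
      }
    }
  }
  return nodes;
}

// Assumed pattern: "xx" or "xx-YY" (case-insensitive), plus the special value "x-default".
const HREFLANG_PATTERN = /^(?:x-default|[a-z]{2}(?:-[a-z]{2})?)$/i;

function isValidHreflang(code: string): boolean {
  return HREFLANG_PATTERN.test(code.trim());
}

interface HreflangEntry {
  lang: string;
  href: string;
}

// Drop entries whose lang code has the wrong shape or whose href is empty.
function filterHreflang(entries: HreflangEntry[]): HreflangEntry[] {
  return entries.filter((e) => isValidHreflang(e.lang) && e.href.length > 0);
}
```

With this shape, a downstream step can always iterate over `jsonLd` as an array, whether the page had one block, several blocks, or a block that was itself an array.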
Sistine-Chapel-grade use cases: (1) **Monthly SEO audit** of your own 100 landing pages: a cron loop over the sitemap → fetch + meta_extract + seo_audit → digest email with red flags (missing description / wrong canonical / broken og:image); (2) **Open Graph preview tester**: the user pastes a URL into a form, the workflow runs fetch + meta_extract → renders the preview as it would appear on Facebook/Twitter/LinkedIn → lets the editor iterate before publishing; (3) **Schema.org validation** for your own e-commerce products: checks that the JSON-LD Product schema is complete (price/availability/aggregateRating) for Google Rich Results; (4) **Competitor SERP monitoring**: weekly fetch of 50 competitor landing pages, extract title/description, alert if they change (= new keyword strategy). Safety budget: HTML parsing capped at 5 MB (anything beyond is truncated), robust regexes with a 100 ms timeout (anti-ReDoS), safe JSON-LD parsing (no crash on malformed input), audit log with URL hash + meta count for cost monitoring.
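Two items of the safety budget, the 5 MB truncation and the audit log entry, can be sketched as follows. Everything here is an assumption made for illustration: the helper names are hypothetical, "5 MB" is taken to mean 5 × 1024 × 1024 bytes of UTF-8, "meta count" is taken to mean the number of top-level keys in the output object, and the 32-bit FNV-1a hash is a dependency-free stand-in because the description does not say which hash the node uses. The regex timeout is left out of this sketch.

```typescript
// Hypothetical helpers for illustration; not the action's real API.
const MAX_HTML_BYTES = 5 * 1024 * 1024; // assumed meaning of the 5 MB cap

// Cut the input to at most maxBytes of UTF-8 and report whether a cut happened.
function truncateHtml(
  html: string,
  maxBytes: number = MAX_HTML_BYTES
): { html: string; truncated: boolean } {
  const bytes = new TextEncoder().encode(html);
  if (bytes.length <= maxBytes) {
    return { html, truncated: false };
  }
  // If the cut splits a multi-byte character, the decoder emits U+FFFD for the fragment.
  const cut = new TextDecoder("utf-8").decode(bytes.subarray(0, maxBytes));
  return { html: cut, truncated: true };
}

// 32-bit FNV-1a over the UTF-8 bytes, returned as 8 hex digits.
// Stand-in only: a real audit log would more likely use a cryptographic hash.
function fnv1aHex(input: string): string {
  const bytes = new TextEncoder().encode(input);
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    // Multiply by the FNV prime 16777619 using shifts, then wrap to unsigned 32 bits.
    hash =
      (hash + ((hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24))) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

interface AuditEntry {
  urlHash: string;
  metaCount: number;
  truncated: boolean;
}

// Build one audit record: hashed URL, number of top-level output keys, truncation flag.
function buildAuditEntry(
  url: string,
  meta: Record<string, unknown>,
  truncated: boolean
): AuditEntry {
  return { urlHash: fnv1aHex(url), metaCount: Object.keys(meta).length, truncated };
}
```

Logging a hash rather than the raw URL keeps the audit trail compact and avoids storing query strings verbatim, while the meta count gives a cheap per-page signal for cost monitoring.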
