GitHub Issue: html5lib/html5lib-python
Title
<template> element placed in <body> instead of <head> (tree builder bug)
Body
Summary
html5lib-python's tree builder places <template> elements in <body> instead of <head> when they are encountered before the <body> start tag. The WHATWG spec (section 13.2.6.4.7 "The 'in body' insertion mode") requires <template> start tags to be processed using the "in head" rules, which places them in <head>.
Reproduction
import html5lib
doc = html5lib.parse("<template></template>", treebuilder="dom", namespaceHTMLElements=False)
# Walk the tree to find where <template> ended up
for child in doc.documentElement.childNodes:
    print(f"<{child.localName}>")
    for grandchild in child.childNodes:
        print(f" <{grandchild.localName}>")
Actual output:
Expected output (per WHATWG spec):
Spec reference
WHATWG HTML spec, section 13.2.6.4.7:
A start tag whose tag name is one of: "base", "basefont", "bgsound", "link", "meta", "noframes", "script", "style", "template", "title"
Process the token using the rules for the "in head" insertion mode.
This means <template> in "in body" mode should be handled identically to <script>, <style>, <title>, etc. -- all of which html5lib-python correctly places in <head>.
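To make the placement check assertable rather than eyeballed, a small helper along these lines could report the parent of the first <template> directly. This is only a sketch: `parent_tag_of` is a hypothetical name, not part of html5lib, and the two trees are hand-built with the standard library's minidom as stand-ins for parser output so the snippet runs without html5lib installed.

```python
from xml.dom import minidom


def parent_tag_of(doc, tag_name):
    """Return the tag name of the parent of the first <tag_name> element, or None.

    Hypothetical helper for this issue; not an html5lib API.
    """
    matches = doc.getElementsByTagName(tag_name)
    if not matches:
        return None
    parent = matches[0].parentNode
    # The document node has no tagName, so fall back to None in that case.
    return getattr(parent, "tagName", None)


# Hand-built stand-ins for parser output (assumption: not produced by html5lib).
expected_tree = minidom.parseString("<html><head><template/></head><body/></html>")
reported_tree = minidom.parseString("<html><head/><body><template/></body></html>")

print(parent_tag_of(expected_tree, "template"))  # head
print(parent_tag_of(reported_tree, "template"))  # body
```

The same helper should accept the document returned by the `html5lib.parse(..., treebuilder="dom", namespaceHTMLElements=False)` call from the reproduction, since that treebuilder also produces a minidom document.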
Additional context
There is also a secondary issue: html5lib-python's treebuilders (minidom, etree) do not implement the template content document fragment. The spec says <template> elements store their children in a separate "content" document fragment, not as regular child nodes. html5lib-python stores template children as regular childNodes instead. This is noted in existing code comments but compounds the placement bug above.
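The secondary issue can be checked in a similar way. The sketch below lists whatever is stored directly under the first <template>; `inline_template_children` is a hypothetical helper, and the tree is again a hand-built minidom stand-in rather than real html5lib output. In a tree that modelled the separate "content" fragment, this list would be expected to be empty.

```python
from xml.dom import minidom


def inline_template_children(doc):
    """List node names stored directly under the first <template>, or None if absent.

    Hypothetical helper for this issue; not an html5lib API.
    """
    templates = doc.getElementsByTagName("template")
    if not templates:
        return None
    return [child.nodeName for child in templates[0].childNodes]


# Stand-in for a tree that keeps template children as regular child nodes.
inline_tree = minidom.parseString(
    "<html><body><template><p/>text</template></body></html>"
)

print(inline_template_children(inline_tree))  # ['p', '#text']
```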
Environment
- html5lib version: 1.1 (also reproduced with 1.2-dev from git)
- Python: 3.13
- Treebuilder: dom (minidom) -- also affects etree
- Compared against: html5ever 0.38 (Servo) -- which places <template> in <head>