
HTML to Markdown

Standard · 1 credit

Convert HTML to clean Markdown, optionally extracting just the article content and estimating token count for LLM pipelines.

POST /api/html-to-markdown

Request body

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| html | string | Yes | | HTML content to convert |
| options.extract_article | boolean | No | false | Extract main article content, stripping navigation, sidebars, and boilerplate |
| options.include_links | boolean | No | true | Preserve hyperlinks in Markdown output |
| options.include_images | boolean | No | true | Preserve image references in Markdown output |

Response

| Field | Type | Description |
| --- | --- | --- |
| markdown | string | Converted Markdown |
| title | string | Extracted page title |
| byline | string | Extracted author byline (when available) |
| excerpt | string | Short excerpt or meta description |
| token_estimate | integer | Estimated token count of the Markdown output |
| error | string | Warning if conversion had issues (output may still be present) |
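
The field tables above can be mirrored in a small client-side helper. The following is a minimal sketch, not an official client: the names `DEFAULT_OPTIONS`, `build_payload`, and `split_result` are hypothetical, and treating an empty `error` string as "no warning" is an assumption based on the sample responses on this page.

```python
# Option defaults as listed in the request table.
DEFAULT_OPTIONS = {
    "extract_article": False,
    "include_links": True,
    "include_images": True,
}


def build_payload(html, **overrides):
    """Build the request body as a dict, filling in the documented defaults."""
    if not isinstance(html, str):
        raise TypeError("html is required and must be a string")
    unknown = set(overrides) - set(DEFAULT_OPTIONS)
    if unknown:
        # Catch misspelled option names locally instead of sending them.
        raise ValueError(f"unknown options: {sorted(unknown)}")
    return {"html": html, "options": {**DEFAULT_OPTIONS, **overrides}}


def split_result(response):
    """Return (markdown, warning) from a decoded response dict.

    The error field is a warning, so the Markdown is returned even when
    it is set. Assumption: an empty string means no warning.
    """
    markdown = response.get("markdown", "")
    warning = response.get("error") or None
    return markdown, warning
```

Returning the warning next to the Markdown, rather than raising, matches the note that output may still be present when `error` is set.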

token_estimate is useful in LLM pipelines: check it to gauge the output size before sending the Markdown to a model.
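
One way to act on `token_estimate` is a budget check before building a prompt. This is a sketch under assumptions: `fits_budget` and `select_within_budget` are hypothetical helper names, and because the value is an estimate, a real pipeline would likely leave headroom for the target model's own tokenizer.

```python
def fits_budget(response, max_tokens):
    """True if the reported token_estimate is within max_tokens."""
    return response["token_estimate"] <= max_tokens


def select_within_budget(responses, max_tokens):
    """Keep responses in order while the running total stays within budget.

    Stops at the first response that would exceed max_tokens.
    Returns (kept_responses, total_estimated_tokens).
    """
    kept, total = [], 0
    for response in responses:
        estimate = response["token_estimate"]
        if total + estimate > max_tokens:
            break
        kept.append(response)
        total += estimate
    return kept, total


# Token estimates taken from the two sample responses on this page.
docs = [{"token_estimate": 18}, {"token_estimate": 52}]
kept, total = select_within_budget(docs, max_tokens=60)
print(len(kept), total)  # prints "1 18"
```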

curl -X POST https://morso.dev/api/html-to-markdown \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"html": "<h1>Hello World</h1><p>This is a <strong>test</strong> with a <a href=\"https://example.com\">link</a>.</p>"}'

Response:

{
  "markdown": "# Hello World\n\nThis is a **test** with a [link](https://example.com).\n",
  "title": "Hello World",
  "byline": "",
  "excerpt": "",
  "token_estimate": 18,
  "error": ""
}
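
The same call can be made from Python with only the standard library. This is a sketch rather than an official client: `make_request` and `convert` are hypothetical names, the 30-second timeout is an arbitrary choice, and non-2xx responses are left to raise `urllib.error.HTTPError` because this page does not document the error format.

```python
import json
import urllib.request

API_URL = "https://morso.dev/api/html-to-markdown"


def make_request(payload, api_key):
    """Build the POST request with the same headers as the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def convert(payload, api_key, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = make_request(payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (performs a network call and needs a real key):
# result = convert({"html": "<h1>Hello World</h1>"}, "YOUR_API_KEY")
# print(result["markdown"])
```

Letting `json.dumps` serialize the body avoids the manual quote escaping that the shell version needs.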

Set options.extract_article to true to strip navigation, headers, footers, and sidebar chrome and keep only the article body.

curl -X POST https://morso.dev/api/html-to-markdown \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"html": "<html><head><title>Blog — Acme Corp</title></head><body><nav><a href=\"/\">Home</a><a href=\"/blog\">Blog</a></nav><header><h1>Acme Corp</h1></header><article><h2>Why Markdown Matters</h2><p>By <span class=\"author\">Jane Park</span></p><p>Markdown keeps your content portable and readable. No vendor lock-in, no proprietary formats — just plain text that converts anywhere.</p><p>Read more at <a href=\"https://daringfireball.net/projects/markdown/\">Daring Fireball</a>.</p></article><aside><h3>Popular Posts</h3><ul><li>Post A</li><li>Post B</li></ul></aside><footer><p>&copy; 2026 Acme Corp</p></footer></body></html>",
"options": {"extract_article": true}
}'

Response:

{
  "markdown": "## Why Markdown Matters\n\nBy Jane Park\n\nMarkdown keeps your content portable and readable. No vendor lock-in, no proprietary formats — just plain text that converts anywhere.\n\nRead more at [Daring Fireball](https://daringfireball.net/projects/markdown/).\n",
  "title": "Blog — Acme Corp",
  "byline": "Jane Park",
  "excerpt": "Markdown keeps your content portable and readable.",
  "token_estimate": 52,
  "error": ""
}

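Navigation, sidebar, header, and footer content are all removed; only the <article> body survives. In this example the title field still reflects the page's <title> element rather than the article heading.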
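
A pipeline that depends on this behavior could verify it with a simple check on the returned Markdown. The sketch below is illustrative only: `leaked_chrome` is a hypothetical helper, the snippet list comes from the sample page's nav, aside, and footer, and the Markdown string is an abbreviated copy of the sample response.

```python
def leaked_chrome(markdown, chrome_snippets):
    """Return the chrome snippets that still appear in the Markdown."""
    return [snippet for snippet in chrome_snippets if snippet in markdown]


# Abbreviated copy of the sample response's markdown field.
markdown = (
    "## Why Markdown Matters\n\n"
    "By Jane Park\n\n"
    "Markdown keeps your content portable and readable."
)

# Text that appears only in the sample page's nav, aside, and footer.
chrome = ["Home", "Popular Posts", "Post A", "2026 Acme Corp"]

print(leaked_chrome(markdown, chrome))  # prints "[]"
```

A non-empty result suggests the extraction kept some page chrome for that document, which may be worth logging before the text reaches a model.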