Content Between HTML Tags Regex

Matches an HTML element whose content is plain text (no nested tags) and captures both the tag name and the text between the opening and closing tags.

Pattern

/<(\w+)[^>]*>([^<]+)<\/\1>/g

Test example

<h1>Hello World</h1> <p>This is a paragraph.</p> <span class="name">John</span>
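
As a rough check of what the pattern captures from the test string above, the following Python sketch runs it and lists the (tag, content) pairs. The names `sample` and `pairs` are illustrative and not part of the original snippet.

```python
import re

# Same pattern as above; group 1 is the tag name, group 2 the text content.
pattern = re.compile(r'<(\w+)[^>]*>([^<]+)</\1>')

sample = '<h1>Hello World</h1> <p>This is a paragraph.</p> <span class="name">John</span>'

# findall returns one (tag, content) tuple per match because the pattern has two groups.
pairs = pattern.findall(sample)
print(pairs)
# [('h1', 'Hello World'), ('p', 'This is a paragraph.'), ('span', 'John')]
```

Attributes such as class="name" are consumed by [^>]* and do not appear in either group.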

Code examples

javascript

// str is the input HTML string; m[1] is the tag name, m[2] the text content.
const regex = /<(\w+)[^>]*>([^<]+)<\/\1>/g;
const result = [...str.matchAll(regex)].map(m => ({tag: m[1], content: m[2]}));

python

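import re
# Group 1 is the tag name, group 2 the text content; \1 requires a matching closing tag.
pattern = re.compile(r'<(\w+)[^>]*>([^<]+)</\1>')
# text is the input HTML string; findall returns a list of (tag, content) tuples.
result = pattern.findall(text)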
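
The pattern is deliberately narrow, and a few inputs that look like they should match do not. The sketch below probes three such cases; the input strings and variable names are made up for illustration.

```python
import re

pattern = re.compile(r'<(\w+)[^>]*>([^<]+)</\1>')

# Nested markup: the outer <div> is skipped because its content contains '<';
# only the innermost element is matched.
nested = pattern.findall('<div><b>bold</b> text</div>')

# Mismatched tags: the backreference \1 requires the closing name
# to equal the opening name, so nothing matches here.
mismatched = pattern.findall('<p>oops</span>')

# Empty elements: [^<]+ needs at least one character of content.
empty = pattern.findall('<p></p>')

print(nested)      # [('b', 'bold')]
print(mismatched)  # []
print(empty)       # []
```

For documents with nested elements, a real HTML parser is usually the more reliable choice; the regex is best suited to flat, well-formed fragments.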

go

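import "regexp"
// Go's regexp package (RE2 syntax) has no backreferences, so \1 makes MustCompile panic.
// Capture the closing tag name as group 3 instead and keep matches where m[1] == m[3].
re := regexp.MustCompile(`<(\w+)[^>]*>([^<]+)</(\w+)>`)
result := re.FindAllStringSubmatch(text, -1) // m[1] = tag, m[2] = content, m[3] = closing tag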