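HTML parsing libraries for C#
A web search turns up the following libraries:
- SGMLReader (not updated for a long time)
- html-agility-pack (actively maintained)
- AngleSharp (actively maintained)
- CsQuery (not updated for a long time)
I chose AngleSharp.
Reason: elements can be selected the same way as in JavaScript, using CSS selectors.
Install AngleSharp 1.2.0-beta.431 from NuGet.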
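```csharp
using System;
using AngleSharp.Html.Parser;

namespace HtmlParse01;

class Program
{
    static void Main(string[] args)
    {
        // Sample markup: two table cells, each wrapping its text in a div.cell.
        var htmlPart =
            @"<td>
                <div class='cell'>
                    <div>编码</div>
                </div>
            </td>
            <td>
                <div class='cell'>
                    <div>
                        <div>1234567</div>
                    </div>
                </div>
            </td>";

        // Outside a <table> the HTML parser drops the stray <td> tags,
        // but the div.cell elements inside them are kept.
        var parser = new HtmlParser();
        var htmlDocument = parser.ParseDocument(htmlPart);

        // Select by CSS class, just like document.querySelectorAll in JavaScript.
        var cellDivs = htmlDocument.QuerySelectorAll(".cell");
        Console.WriteLine(cellDivs.Length);

        foreach (var cellDiv in cellDivs)
        {
            // TextContent includes the whitespace between tags, so trim it.
            Console.WriteLine(cellDiv.TextContent.Trim());
        }
    }
}
```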
The result is as follows: the program prints 2 (the number of .cell elements), then 编码, then 1234567.
The DOM can also be manipulated the same way as in native JavaScript; see the article below.
https://www.jb51.net/article/251499.htm
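To make the JavaScript-style DOM manipulation concrete, here is a minimal sketch, assuming the same AngleSharp package as above. The class DomEditSketch, the method AddCell, the obsolete class name, and the data-source attribute are hypothetical names chosen only for illustration; they do not come from the linked article.

```csharp
using System;
using AngleSharp.Html.Dom;
using AngleSharp.Html.Parser;

static class DomEditSketch
{
    // Hypothetical helper: parses the markup, appends a new div.cell with the
    // given text, removes the first .obsolete element (if any), and returns
    // the edited document.
    public static IHtmlDocument AddCell(string html, string text)
    {
        var document = new HtmlParser().ParseDocument(html);
        var body = document.Body
            ?? throw new InvalidOperationException("Parsed document has no body.");

        var cell = document.CreateElement("div");     // JS: document.createElement("div")
        cell.ClassList.Add("cell");                   // JS: cell.classList.add("cell")
        cell.SetAttribute("data-source", "sketch");   // JS: cell.setAttribute(...)
        cell.TextContent = text;                      // JS: cell.textContent = text
        body.AppendChild(cell);                       // JS: document.body.appendChild(cell)

        // JS: document.querySelector(".obsolete")?.remove()
        document.QuerySelector(".obsolete")?.Remove();

        return document;
    }

    static void Main()
    {
        var document = AddCell(
            "<div class='cell'>1234567</div><p class='obsolete'>old</p>",
            "7654321");

        // Serialize the edited body back to markup.
        Console.WriteLine(document.Body?.InnerHtml);
    }
}
```

Each call mirrors its browser counterpart closely, so code written against the JavaScript DOM API can usually be ported line by line.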
References
https://github.com/jamietre/CsQuery
https://www.nuget.org/packages/HtmlAgilityPack/
https://github.com/MindTouch/SGMLReader
https://html-agility-pack.net/
https://github.com/AngleSharp/AngleSharp
https://scrapingant.com/blog/parse-html-dot-net
https://www.jb51.net/article/251499.htm