PlayWrightBrowserToolkit in LangChain
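Long time no see, friends! I (Xiaoyun) haven't written a blog post in ages. With AI being adopted everywhere and a whole wave of large language models arriving, today we'll take a look at PlayWrightBrowserToolkit in LangChain, a framework for building applications on top of large language models.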
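1. What is LangChain, and what are its features?
If you ask me what LangChain is, I can't explain it very clearly either. For anything you want to know, ask Baidu or ChatGPT, thanks. OK, enough chit-chat.
Here are a few websites to start from. Other bloggers have written this up in more detail than I can, so I won't compete with them!!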
2. LangChain websites
Official site: https://docs.langchain.com/docs/
Chinese translation: https://www.langchain.com.cn/
Chinese examples: https://liaokong.gitbook.io/llm-kai-fa-jiao-cheng/
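3. PlayWrightBrowserToolkit in LangChain
Instantiating the browser toolkit
<1> Open the Python section of the official documentation
<2> Then find Integrations > Agent toolkits
<3> In there you'll see Playwright Browser Toolkit
<4> As you can see, it has 7 tools (for convenience, let's jump straight to the Chinese version of the docs; we need to read along with the official documentation here)
Chinese docs: Playwright Browser Toolkit
So how do we use it? The official docs use the async API. I'm not a big fan of async, and the docs thoughtfully provide a sync version as well.
So below I'll write the sync version for you.
We can use create_sync_playwright_browser for this: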
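# Import paths match the langchain.tools.playwright layout seen in the output below
from langchain.agents.agent_toolkits import PlayWrightBrowserToolkit
from langchain.tools.playwright.utils import create_sync_playwright_browser
# Launch a synchronous Playwright browser and wrap it in the toolkit
sync_browser = create_sync_playwright_browser()
toolkit = PlayWrightBrowserToolkit.from_browser(sync_browser=sync_browser)
tools = toolkit.get_tools()
print(tools)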
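Here it prints a summary of the 7 tools. Did you follow it? I didn't 0.0... Note that the listing below shows sync_browser=None and a populated async_browser, which is what the async version produces; with the sync browser created above, expect the two fields the other way around.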
[ClickTool(name='click_element', description='Click on an element with the given CSS selector', args_schema=<class 'langchain.tools.playwright.click.ClickToolInput'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, sync_browser=None, async_browser=<Browser type=<BrowserType name=chromium executable_path=/Users/wfh/Library/Caches/ms-playwright/chromium-1055/chrome-mac/Chromium.app/Contents/MacOS/Chromium> version=112.0.5615.29>),
NavigateTool(name='navigate_browser', description='Navigate a browser to the specified URL', args_schema=<class 'langchain.tools.playwright.navigate.NavigateToolInput'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, sync_browser=None, async_browser=<Browser type=<BrowserType name=chromium executable_path=/Users/wfh/Library/Caches/ms-playwright/chromium-1055/chrome-mac/Chromium.app/Contents/MacOS/Chromium> version=112.0.5615.29>),
NavigateBackTool(name='previous_webpage', description='Navigate back to the previous page in the browser history', args_schema=<class 'pydantic.main.BaseModel'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, sync_browser=None, async_browser=<Browser type=<BrowserType name=chromium executable_path=/Users/wfh/Library/Caches/ms-playwright/chromium-1055/chrome-mac/Chromium.app/Contents/MacOS/Chromium> version=112.0.5615.29>),
ExtractTextTool(name='extract_text', description='Extract all the text on the current webpage', args_schema=<class 'pydantic.main.BaseModel'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, sync_browser=None, async_browser=<Browser type=<BrowserType name=chromium executable_path=/Users/wfh/Library/Caches/ms-playwright/chromium-1055/chrome-mac/Chromium.app/Contents/MacOS/Chromium> version=112.0.5615.29>),
ExtractHyperlinksTool(name='extract_hyperlinks', description='Extract all hyperlinks on the current webpage', args_schema=<class 'langchain.tools.playwright.extract_hyperlinks.ExtractHyperlinksToolInput'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, sync_browser=None, async_browser=<Browser type=<BrowserType name=chromium executable_path=/Users/wfh/Library/Caches/ms-playwright/chromium-1055/chrome-mac/Chromium.app/Contents/MacOS/Chromium> version=112.0.5615.29>),
GetElementsTool(name='get_elements', description='Retrieve elements in the current web page matching the given CSS selector', args_schema=<class 'langchain.tools.playwright.get_elements.GetElementsToolInput'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, sync_browser=None, async_browser=<Browser type=<BrowserType name=chromium executable_path=/Users/wfh/Library/Caches/ms-playwright/chromium-1055/chrome-mac/Chromium.app/Contents/MacOS/Chromium> version=112.0.5615.29>),
CurrentWebPageTool(name='current_webpage', description='Returns the URL of the current page', args_schema=<class 'pydantic.main.BaseModel'>, return_direct=False, verbose=False, callbacks=None, callback_manager=None, sync_browser=None, async_browser=<Browser type=<BrowserType name=chromium executable_path=/Users/wfh/Library/Caches/ms-playwright/chromium-1055/chrome-mac/Chromium.app/Contents/MacOS/Chromium> version=112.0.5615.29>)]
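The raw repr above is hard to scan. Here is a minimal sketch of a more readable summary that prints only each tool's name and description. To keep it runnable without a browser, the tools list is a stand-in built from the names and descriptions in the output above; in the real script you would loop over the list returned by toolkit.get_tools() instead. ToolInfo and summarize are hypothetical names, not part of LangChain.

```python
from collections import namedtuple

# Stand-in for the real tool objects; only .name and .description are used.
ToolInfo = namedtuple("ToolInfo", ["name", "description"])

tools = [
    ToolInfo("click_element", "Click on an element with the given CSS selector"),
    ToolInfo("navigate_browser", "Navigate a browser to the specified URL"),
    ToolInfo("previous_webpage", "Navigate back to the previous page in the browser history"),
    ToolInfo("extract_text", "Extract all the text on the current webpage"),
    ToolInfo("extract_hyperlinks", "Extract all hyperlinks on the current webpage"),
    ToolInfo("get_elements", "Retrieve elements in the current web page matching the given CSS selector"),
    ToolInfo("current_webpage", "Returns the URL of the current page"),
]


def summarize(tools):
    """Return one 'name: description' line per tool."""
    return [f"{tool.name}: {tool.description}" for tool in tools]


for line in summarize(tools):
    print(line)
```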
Never mind, let's keep following the official walkthrough. I wanted to try the official code first; my hands were just itching.
tools_by_name = {tool.name: tool for tool in tools}
navigate_tool = tools_by_name["navigate_browser"]
get_elements_tool = tools_by_name["get_elements"]
Ah, so here it builds a dictionary keyed by tool name, and then looks up each tool by its name.
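One thing to watch with this lookup is that a misspelled tool name raises a bare KeyError. Below is a small sketch of a lookup helper that lists the valid names instead. get_tool is a hypothetical helper (not part of LangChain), and the dictionary holds placeholder strings in place of real tool objects so that it runs without a browser.

```python
def get_tool(tools_by_name, name):
    """Look up a tool by name, listing the valid names on a miss."""
    try:
        return tools_by_name[name]
    except KeyError:
        available = ", ".join(sorted(tools_by_name))
        raise KeyError(f"unknown tool {name!r}; available: {available}") from None


# Placeholder values stand in for the real tool objects.
tools_by_name = {
    "navigate_browser": "<NavigateTool>",
    "get_elements": "<GetElementsTool>",
    "current_webpage": "<CurrentWebPageTool>",
}

print(get_tool(tools_by_name, "get_elements"))

try:
    get_tool(tools_by_name, "current_page")  # wrong: the name is current_webpage
except KeyError as err:
    print(err)
```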
(1) Navigate to a URL. The official example is async:
await navigate_tool.arun(
{"url": "https://www.baidu.com"}
)
The sync version looks like this:
navigate_tool.run(
{"url": "https://www.baidu.com"}
)
(2) GetElementsTool (get_elements): select elements with a CSS selector. The official async example:
await get_elements_tool.arun(
{"selector": ".container__headline", "attributes": ["innerText"]}
)
Converted to sync (a quick note on how to convert, though you probably know already: drop the await and change arun to run):
get_elements_tool.run(
{"selector": ".container__headline", "attributes": ["innerText"]}
)
(3) CurrentWebPageTool (current_webpage): get the URL of the current page. Async:
await tools_by_name["current_webpage"].arun({})
Sync version:
tools_by_name["current_webpage"].run({})
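All three conversions above follow the same rule: the async call await tool.arun(x) becomes tool.run(x). The sketch below shows that relationship with a hypothetical stand-in class, EchoTool (not a LangChain class), so it runs without a browser. The real tools expose run and arun in the same way but drive Playwright underneath.

```python
import asyncio


class EchoTool:
    """Stand-in tool exposing a sync run() and an async arun()."""

    def run(self, tool_input):
        # Synchronous entry point: call it directly.
        return f"navigating to {tool_input['url']}"

    async def arun(self, tool_input):
        # Asynchronous entry point: must be awaited inside a coroutine.
        return self.run(tool_input)


tool = EchoTool()
args = {"url": "https://www.baidu.com"}

sync_result = tool.run(args)                 # sync style: no await
async_result = asyncio.run(tool.arun(args))  # async style, driven from a plain script

print(sync_result == async_result)
```

In a notebook an event loop is already running, which is why the official examples can use await at the top level. In a plain script you either wrap the call with asyncio.run(...) or use the sync API as this post does.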
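That's everything in the "instantiating the browser toolkit" part of the official docs. Coming up in the next post:
1. ClickTool (click_element): click an element specified by a selector, and how to use it (tested, works)
2. ExtractTextTool (extract_text): extract the text of the current page using Beautiful Soup (not tried yet)
3. ExtractHyperlinksTool (extract_hyperlinks): extract the hyperlinks of the current page using Beautiful Soup (not tried yet)
4. NavigateBackTool (previous_webpage): navigate back to the previous page in the browser history (not tried yet)
5. Has anyone wondered about this: since the toolkit calls Playwright under the hood, why is there only a click tool and nothing for typing text? Maybe it exists and I just missed it. Since I couldn't find one, I'll wrap my own FillTool: analyze the internal calls and write a tool that locates a search box and fills in text.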
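As a preview of the FillTool idea, here is a minimal sketch under stated assumptions. Playwright's sync Page object has a fill(selector, value) method, which is the call such a tool would most likely wrap. fill_element and FakePage are hypothetical names: FakePage only records calls so the example runs without a browser, and wiring this into a LangChain tool class is not shown here.

```python
class FakePage:
    """Stand-in for a Playwright page; records fill() calls instead of typing."""

    def __init__(self):
        self.filled = {}

    def fill(self, selector, value):
        # Same call shape as Playwright's page.fill(selector, value).
        self.filled[selector] = value


def fill_element(page, selector, text):
    """Fill the element matching `selector` with `text` and report what was done."""
    page.fill(selector, text)
    return f"Filled element '{selector}' with text '{text}'"


page = FakePage()
# "#kw" is assumed to be the search box selector; check the page before relying on it.
print(fill_element(page, "#kw", "langchain"))
```

With a real Playwright page in place of FakePage, the same function would type into the matched input.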
And finally, if you like Xiaoyun's blog posts, please hit like and bookmark 0.0*!!!