Seeing the Essence Through the Source: Some Thoughts on How Selenium WebDriver Works

Loong_T, published 2019-07-31 11:13 / 1166 reads

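As an engineer who has built UI automation with Selenium for years, I was never quite clear on how Selenium WebDriver actually works. How does a script end up controlling a browser and performing all kinds of operations? I suspect many Selenium users share the same question. I recently read a number of articles and books on the topic, added some thinking and organizing of my own, and would like to share the result so we can learn together. If anything in this article is inaccurate, corrections are welcome.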

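Structure

Automated testing with Selenium needs three main things:

Test code

WebDriver

Browser

Test code

The test code is what the programmer writes in some language using that language's Selenium API library. This article uses Python as the example.

WebDriver

A WebDriver is developed for a specific browser, and each browser has its own. Chrome, for example, uses chromedriver.

Browser

Each browser is paired with its corresponding WebDriver.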

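Let us first look at how these three parts relate to each other.
An example from everyday life makes a good analogy for the relationship.

Consider taking a taxi: the passenger talks to the driver and names the destination, the driver drives the car there, and so the passenger arrives where they wanted to go.
WebDriver works in a similar way. The test code contains the operations we want performed on the browser UI, such as a click. The test code sends commands to the WebDriver so that it knows which operations are wanted, and the WebDriver drives the browser UI accordingly. In this way the test code achieves its goal of operating the browser UI.
With the relationship among the three key parts of Selenium test automation sorted out, we can now analyze the most important of those relationships in detail.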

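Interaction between the test code and WebDriver

I will use a basic operation, getting a page element, as the example for analyzing how the two interact.
The first thing the test code does is create an object of a webdriver class: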

from selenium import webdriver
driver = webdriver.Chrome()

The driver object created here is an instance of webdriver.Chrome, and webdriver.Chrome is really just an alias introduced by this import:

from .chrome.webdriver import WebDriver as Chrome

In other words, it is the WebDriver class from the chrome package. This .chrome.webdriver.WebDriver in turn inherits from selenium.webdriver.remote.webdriver.WebDriver:

from selenium.webdriver.remote.webdriver import WebDriver as RemoteWebDriver
...
class WebDriver(RemoteWebDriver):
    """
    Controls the ChromeDriver and allows you to drive the browser.

    You will need to download the ChromeDriver executable from
    http://chromedriver.storage.googleapis.com/index.html
    """

    def __init__(self, executable_path="chromedriver", port=0,
                 chrome_options=None, service_args=None,
                 desired_capabilities=None, service_log_path=None):
...
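
The inheritance relationship can be illustrated with a small stand-alone sketch. The class names RemoteDriverSketch and ChromeDriverSketch, the recorded-call list, the "findElement" label and port 9515 are all assumptions made for illustration, not Selenium's real classes or values. The point is only that a browser-specific subclass can supply its own start-up logic while every browser operation comes from the shared base class.

```python
class RemoteDriverSketch:
    """Hypothetical stand-in for the shared remote WebDriver base class."""

    def __init__(self, command_executor):
        # URL of the server that will receive commands.
        self.command_executor = command_executor
        self.calls = []  # records commands instead of sending them

    def execute(self, command, params=None):
        self.calls.append((command, params))
        return {"value": None}

    def find_element_by_id(self, id_):
        # Defined once here; every subclass inherits it unchanged.
        return self.execute("findElement", {"using": "id", "value": id_})["value"]


class ChromeDriverSketch(RemoteDriverSketch):
    """Hypothetical browser-specific subclass: only start-up differs."""

    def __init__(self, executable_path="chromedriver", port=9515):
        # A real driver would launch the executable here; this sketch
        # only derives the URL that the base class needs.
        self.executable_path = executable_path
        super().__init__(command_executor="http://127.0.0.1:%d" % port)


driver = ChromeDriverSketch()
driver.find_element_by_id("kw")  # method comes from the base class
```

ChromeDriverSketch defines no find_element_by_id of its own, yet the call works, which mirrors why the browser operations can be studied in the remote base class alone.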

Taking Python as the example, this is how the selenium library gets a page element by ID:

from selenium import webdriver
driver = webdriver.Chrome()
driver.find_element_by_id("element-id")  # "element-id" is a placeholder ID

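find_element_by_id is an instance method of the selenium.webdriver.remote.webdriver.WebDriver class. In our code we do not use selenium.webdriver.remote.webdriver.WebDriver directly; we use the browser-specific webdriver classes such as webdriver.Chrome, which inherit this method.
So the methods that test code calls to perform browser operations are all really instance methods of selenium.webdriver.remote.webdriver.WebDriver.
Next, let us go deeper into selenium.webdriver.remote.webdriver.WebDriver and see how an instance method such as find_element_by_id() is implemented.
In the source code, the find_element_by_* methods delegate to find_element: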

    def find_element(self, by=By.ID, value=None):
        """
        "Private" method used by the find_element_by_* methods.

        :Usage:
            Use the corresponding find_element_by_* instead of this.

        :rtype: WebElement
        """
        if self.w3c:
      ...
        return self.execute(Command.FIND_ELEMENT, {
            "using": by,
            "value": value})["value"]

This method ends by calling execute, which is defined as follows:

    def execute(self, driver_command, params=None):
        """
        Sends a command to be executed by a command.CommandExecutor.

        :Args:
         - driver_command: The name of the command to execute as a string.
         - params: A dictionary of named parameters to send with the command.

        :Returns:
          The command's JSON response loaded into a dictionary object.
        """
        if self.session_id is not None:
            if not params:
                params = {"sessionId": self.session_id}
            elif "sessionId" not in params:
                params["sessionId"] = self.session_id

        params = self._wrap_value(params)
        response = self.command_executor.execute(driver_command, params)
        if response:
            self.error_handler.check_response(response)
            response["value"] = self._unwrap_value(
                response.get("value", None))
            return response
        # If the server doesn't send a response, assume the command was
        # a success
        return {"success": 0, "value": None, "sessionId": self.session_id}
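
The sessionId handling at the top of execute can be tried in isolation. The sketch below copies just that branch into a stand-alone function; the function name add_session_id and the session ID "abc123" are made up for illustration.

```python
def add_session_id(params, session_id):
    """Mirror the sessionId handling at the top of execute."""
    if session_id is not None:
        if not params:
            # No parameters at all: the session ID becomes the only one.
            params = {"sessionId": session_id}
        elif "sessionId" not in params:
            # Note: this adds the key to the caller's dict in place.
            params["sessionId"] = session_id
    return params


only_session = add_session_id(None, "abc123")
with_locator = add_session_id({"using": "id", "value": "kw"}, "abc123")
already_set = add_session_id({"sessionId": "other"}, "abc123")
no_session = add_session_id({"using": "id"}, None)
```

An explicitly supplied sessionId is never overwritten, and before a session exists (session_id is None) the parameters pass through untouched.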

As the docstring of execute says, the key line is:

response = self.command_executor.execute(driver_command, params)

An object named command_executor runs its own execute method.
command_executor is an instance of the RemoteConnection class, and it is assigned when the selenium.webdriver.remote.webdriver.WebDriver object is created: self.command_executor = RemoteConnection(command_executor, keep_alive=keep_alive).
Read this together with the class docstring of selenium.webdriver.remote.webdriver.WebDriver:

class WebDriver(object):
    """
    Controls a browser by sending commands to a remote server.
    This server is expected to be running the WebDriver wire protocol
    as defined at
    https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol

    :Attributes:
     - session_id - String ID of the browser session started and controlled by this WebDriver.
     - capabilities - Dictionaty of effective capabilities of this browser session as returned
         by the remote server. See https://github.com/SeleniumHQ/selenium/wiki/DesiredCapabilities
     - command_executor - remote_connection.RemoteConnection object used to execute commands.
     - error_handler - errorhandler.ErrorHandler object used to handle errors.
    """

    _web_element_cls = WebElement

    def __init__(self, command_executor="http://127.0.0.1:4444/wd/hub",
                 desired_capabilities=None, browser_profile=None, proxy=None,
                 keep_alive=False, file_detector=None):

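The job of the WebDriver class is to control a browser by sending commands to a remote server, and that remote server is one that runs the WebDriver wire protocol. RemoteConnection is the class responsible for the connection to the remote WebDriver server.
Note the command_executor parameter used when creating a WebDriver object, whose default value is "http://127.0.0.1:4444/wd/hub". This value is the URL of the remote server. It is passed on to the RemoteConnection constructor, because a URL is required to connect to the remote server.
Now let us look at the execute instance method of RemoteConnection.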

    def execute(self, command, params):
        """
        Send a command to the remote server.

        Any path subtitutions required for the URL mapped to the command should be
        included in the command parameters.

        :Args:
         - command - A string specifying the command to execute.
         - params - A dictionary of named parameters to send with the command as
           its JSON payload.
        """
        command_info = self._commands[command]
        assert command_info is not None, "Unrecognised command %s" % command
        data = utils.dump_json(params)
        path = string.Template(command_info[1]).substitute(params)
        url = "%s%s" % (self._url, path)
        return self._request(command_info[0], url, body=data)

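This method takes two parameters:

command

params

command is the name of the command to execute. Looking at the self._commands dict, we can see that it stores the mapping between the command constants in the selenium.webdriver.remote.command.Command class and the commands defined by the WebDriver wire protocol.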

self._commands = {
            Command.STATUS: ("GET", "/status"),
            Command.NEW_SESSION: ("POST", "/session"),
            Command.GET_ALL_SESSIONS: ("GET", "/sessions"),
            Command.QUIT: ("DELETE", "/session/$sessionId"),
...
            Command.FIND_ELEMENT: ("POST", "/session/$sessionId/element"),
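
How a table entry becomes a concrete request can be sketched with the standard library alone. This is a simplified imitation of the lookup and substitution steps in RemoteConnection.execute, not the library code itself: the string keys "quit" and "findElement", the function name build_request and the session ID "abc123" are assumptions made for illustration.

```python
import json
import string

# Illustrative subset of the command table; the (method, path) pairs are
# the ones quoted in the article, the string keys are made up.
COMMANDS = {
    "quit": ("DELETE", "/session/$sessionId"),
    "findElement": ("POST", "/session/$sessionId/element"),
}


def build_request(base_url, command, params):
    """Return (method, url, body) for one command."""
    method, path_template = COMMANDS[command]
    body = json.dumps(params)
    # substitute() fills $sessionId from params and ignores unused keys.
    path = string.Template(path_template).substitute(params)
    return method, "%s%s" % (base_url, path), body


method, url, body = build_request(
    "http://127.0.0.1:4444/wd/hub",
    "findElement",
    {"sessionId": "abc123", "using": "id", "value": "kw"},
)
```

Here url comes out as http://127.0.0.1:4444/wd/hub/session/abc123/element, and the same params dict travels along unchanged as the JSON body.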

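Taking FIND_ELEMENT as the example, a command entry is made up of several parts:

The HTTP request method. The commands defined by the WebDriver wire protocol follow RESTful conventions, so different request methods correspond to different operations.

sessionId. The protocol defines a session as follows:

The server should maintain one browser per session. Commands sent to a session will be directed to the corresponding browser.

In other words, sessionId identifies one session between the remote server and a browser, and a command sent through that session becomes an operation on that browser.

element. This part of the path identifies the specific command.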
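
The three parts can be seen working together from the receiving side. The sketch below is a hypothetical router, not code from Selenium or any driver: given an HTTP method and a concrete path, it recovers the session ID and the command. The route table is a small illustrative subset, and the command labels on the right are names chosen for this sketch.

```python
# Illustrative subset of wire-protocol routes. The same path maps to
# different commands depending on the HTTP method.
ROUTES = {
    ("POST", "/session/$sessionId/url"): "get",           # navigate to a URL
    ("GET", "/session/$sessionId/url"): "getCurrentUrl",  # read the current URL
    ("DELETE", "/session/$sessionId"): "quit",
    ("POST", "/session/$sessionId/element"): "findElement",
}


def route(method, path):
    """Split a concrete request into (command name, session ID)."""
    segments = path.strip("/").split("/")
    if len(segments) < 2 or segments[0] != "session":
        raise ValueError("not a session-scoped path: %s" % path)
    session_id = segments[1]
    # Rebuild the template form of the path so it can be looked up.
    template = "/".join(["", "session", "$sessionId"] + segments[2:])
    return ROUTES[(method, template)], session_id


navigate = route("POST", "/session/abc123/url")
read_url = route("GET", "/session/abc123/url")
find = route("POST", "/session/abc123/element")
```

navigate and read_url share a path and differ only in the request method, which is the RESTful convention described above.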

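The command constants in selenium.webdriver.remote.command.Command are in turn passed as the argument to execute inside concrete instance methods such as find_elements. This yields a one-to-one correspondence between the operation methods of selenium.webdriver.remote.webdriver.WebDriver and the commands defined by the WebDriver wire protocol.
The operations on a WebElement in selenium.webdriver.remote.webelement.WebElement are implemented on the same principle.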
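
A minimal sketch of that principle for elements, under stated assumptions: FakeDriver and ElementSketch are hypothetical classes, and "clickElement" and "elem-42" are illustrative values. The element keeps a reference to its parent driver and its own ID, and every operation adds that ID to the parameters before going through the driver's execute.

```python
class FakeDriver:
    """Hypothetical parent driver that records commands instead of sending them."""

    def __init__(self):
        self.sent = []

    def execute(self, command, params=None):
        self.sent.append((command, params))
        return {"value": None}


class ElementSketch:
    """Hypothetical stand-in for a WebElement."""

    def __init__(self, parent, element_id):
        self._parent = parent
        self._id = element_id

    def _execute(self, command, params=None):
        # Attach this element's ID, then reuse the driver's execute path.
        params = dict(params or {})
        params["id"] = self._id
        return self._parent.execute(command, params)

    def click(self):
        self._execute("clickElement")


driver = FakeDriver()
ElementSketch(driver, "elem-42").click()
```

After the click, driver.sent holds a single command carrying the element ID, so an element operation ends up as an ordinary driver command.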

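The other parameter of RemoteConnection's execute method, params, holds the arguments of the command. It is converted to JSON and sent to the remote server as the body of the HTTP request.
After the remote server has performed the operation on the browser, the resulting data comes back to the test code as the body of the HTTP response, and the test code parses it to obtain the data it wants.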
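
The request and response round trip can be reproduced end to end with the standard library. In this sketch FakeRemoteServer is a hypothetical stand-in for a driver server that never touches a browser, and the reply it returns, the session ID "abc123" and the element ID "elem-1" are invented for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeRemoteServer(BaseHTTPRequestHandler):
    """Hypothetical driver server: answers every POST with a canned reply."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        params = json.loads(self.rfile.read(length))
        reply = json.dumps({
            "sessionId": params.get("sessionId"),
            "status": 0,
            "value": {"ELEMENT": "elem-1"},
        }).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):
        pass  # keep the example quiet


def send_command(url, params):
    """POST params as a JSON body and parse the JSON response body."""
    request = urllib.request.Request(
        url,
        data=json.dumps(params).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler keeps the local request away from any system proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read().decode("utf-8"))


server = HTTPServer(("127.0.0.1", 0), FakeRemoteServer)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = "http://127.0.0.1:%d/session/abc123/element" % server.server_address[1]
response = send_command(url, {"sessionId": "abc123", "using": "id", "value": "kw"})

server.shutdown()
server.server_close()
```

The test-code side consists only of send_command: it serializes the parameters, sends them, and turns the response body back into a dictionary.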

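The relationship between WebDriver and the browser

This part is the territory of the browser vendors and the WebDriver developers, so we do not need to pay much attention to it. What we mainly care about is the relationship between the test code and WebDriver, just as a passenger does not need to care how the taxi driver operates the car.

Summary

The relationship among Selenium's three components can be summed up simply: the test code sends commands to the WebDriver, and the WebDriver controls the browser. I hope this analysis of the Python selenium library helps you understand more deeply how Selenium and WebDriver work, so that you can locate and fix problems faster in day-to-day automation scripting.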

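Copyright belongs to the author; do not repost without permission. If this article violates any rules, you can contact the administrator to have it removed.

When reposting, please cite the address of this article: https://www.ucloud.cn/yun/44786.html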
