Implementing HTTP routing with Boost.URL

April 22, 2023 · 5 min read


A brick-hauling coder who is currently lying flat

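When building an HTTP server with Boost.Beast, I never had a good way to handle routing. The release of Boost.URL offers a reasonably good solution.

At first I simply hard-coded the routes: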

#include <boost/beast/http.hpp>

// Dispatch by comparing the request target against every known path.
void handleRequest(const boost::beast::http::request<boost::beast::http::string_body> &request) {
    if (request.target() == "/public") {
        // do something
    } else if (request.target() == "/login") {
        // do something
    } else if (request.target() == "/about") {
        // do something
    }
}

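This is the simplest, most brute-force approach, and it has plenty of drawbacks: it scales poorly, it is hard to maintain, and the source file balloons as routes are added.

Later I found the open-source library 0xdead4ead/BeastHttp. Its code is quite sophisticated: it uses std::regex, heavy template techniques, and a large number of macros, which makes it hard to read. After trying to understand the code and using the library to implement HTTP routing, I lost the joy of writing an HTTP server for a while. The library is concise to use, but depending on a fairly niche open-source project whose code I cannot follow is unsettling.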

router.get(R"(^/$)", [](auto beast_http_request, auto context) {
    // Send content message to client and wait to receive next request
    context.send(make_200<beast::http::string_body>(beast_http_request, "Main page\n", "text/html"));
});

router.all(R"(^.*$)", [](auto beast_http_request, auto context) {
    context.send(make_404<beast::http::string_body>(beast_http_request, "Resource is not found\n", "text/html"));
});

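Later, Boost released the URL library, which provides a set of tools for parsing URLs. Among the more powerful ones is a group of tools for parsing strings with ABNF-like grammar rules.

First, the goal. I need to parse paths like the following: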

/user/list
/user/login/{username}
/public/{resource+}

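In a URL, the strings separated by / are called segments, so a path is simply an ordered sequence of segments. Looking at the paths above again, the set of routes can be abstracted as a tree when routing HTTP paths.

We then match level by level to arrive at the handler.

First we define SegmentTemplateRule, the rule class that matches a segment, and SegmentTemplate, the value class that the rule produces.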
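
The level-by-level matching can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the implementation discussed in this post: the names splitPath, Node, addRoute and match are hypothetical, handlers are represented by plain strings, and the tree uses owning pointers for brevity. Literal children are tried first, a {name} child consumes exactly one segment, and a {name+} child consumes all remaining segments.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Split "/user/login/bob" into {"user", "login", "bob"}; empty segments are skipped.
std::vector<std::string> splitPath(std::string_view path) {
    std::vector<std::string> segments;
    std::size_t pos = 0;
    while (pos < path.size()) {
        std::size_t next = path.find('/', pos);
        if (next == std::string_view::npos) next = path.size();
        if (next > pos) segments.emplace_back(path.substr(pos, next - pos));
        pos = next + 1;
    }
    return segments;
}

// Hypothetical tree node: children are keyed by the segment text of the pattern.
struct Node {
    std::map<std::string, std::unique_ptr<Node>> children;
    std::optional<std::string> handler;  // stand-in for a real handler object
};

bool isTemplate(const std::string &s) {
    return s.size() >= 2 && s.front() == '{' && s.back() == '}';
}

bool isPlus(const std::string &s) {
    return isTemplate(s) && s.size() >= 3 && s[s.size() - 2] == '+';
}

// Register a pattern such as "/user/login/{username}".
void addRoute(Node &root, std::string_view pattern, std::string handler) {
    Node *node = &root;
    for (const auto &segment : splitPath(pattern)) {
        auto &child = node->children[segment];
        if (!child) child = std::make_unique<Node>();
        node = child.get();
    }
    node->handler = std::move(handler);
}

// Depth-first match starting at segment i; returns the node that owns the handler.
const Node *matchFrom(const Node &node, const std::vector<std::string> &segments, std::size_t i) {
    if (i == segments.size()) return node.handler ? &node : nullptr;
    // 1. Literal children take priority.
    auto it = node.children.find(segments[i]);
    if (it != node.children.end() && !isTemplate(it->first)) {
        if (const Node *found = matchFrom(*it->second, segments, i + 1)) return found;
    }
    // 2. Template children: {id} takes one segment, {id+} takes the rest.
    for (const auto &[key, child] : node.children) {
        if (!isTemplate(key)) continue;
        if (isPlus(key)) {
            if (child->handler) return child.get();
        } else if (const Node *found = matchFrom(*child, segments, i + 1)) {
            return found;
        }
    }
    return nullptr;
}

std::optional<std::string> match(const Node &root, std::string_view path) {
    const Node *node = matchFrom(root, splitPath(path), 0);
    if (node == nullptr) return std::nullopt;
    return node->handler;
}
```

With the three target paths registered, /user/login/bob resolves through the {username} child, while /public alone matches nothing because {resource+} needs at least one segment. The ? and * modifiers are left out of this sketch.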

// Value produced by parsing one path segment.
class SegmentTemplate {
public:
    enum class Modifier {
        None,
        Optional,   // {id?}
        Star,       // {id*}
        Plus        // {id+}
    };
    bool isLiteral{true};
    Modifier modifier{Modifier::None};
    std::string string;
};

// Rule that parses one segment into a SegmentTemplate.
class SegmentTemplateRule {
public:
    using value_type = SegmentTemplate;
    boost::urls::result<value_type> parse(char const *&it, char const *end) const noexcept;
};
constexpr auto segmentTemplateRule = SegmentTemplateRule{};

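Note that a segment may be a literal string, such as list, or a template, such as {username}, which makes it easy to retrieve that value once a request has been routed to the path.

In ABNF-style notation, the grammar can be written as follows: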

URL             ::=  ['/'] SegmentTemplate {'/' SegmentTemplate}*
SegmentTemplate ::=  ['{'] [arg_id] [modifier] ['}']
arg_id          ::=  integer | identifier
integer         ::=  digit+
digit           ::=  "0"..."9"
identifier      ::=  id_start id_continue*
id_start        ::=  "a"..."z" | "A"..."Z" | "_"
id_continue     ::=  id_start | digit
modifier        ::=  '*' | '+' | '?'
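
To make the SegmentTemplate production concrete, here is a hand-written parser for a single segment that uses only the standard library. It is a sketch, not the parse() member of SegmentTemplateRule: the function names parseSegment and isArgId are hypothetical, the struct is a simplified copy of SegmentTemplate, and two choices are my own assumptions (braces must be balanced, and a literal may contain any character except '{', '}' and '/').

```cpp
#include <optional>
#include <string>
#include <string_view>

enum class Modifier { None, Optional, Star, Plus };

// Simplified stand-in for the SegmentTemplate class shown earlier.
struct SegmentTemplate {
    bool isLiteral{true};
    Modifier modifier{Modifier::None};
    std::string string;  // literal text, or the argument id of a template
};

// arg_id ::= integer | identifier
bool isArgId(std::string_view s) {
    if (s.empty()) return false;
    auto isDigit = [](char c) { return c >= '0' && c <= '9'; };
    auto isStart = [](char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    };
    if (isDigit(s.front())) {  // integer ::= digit+
        for (char c : s) {
            if (!isDigit(c)) return false;
        }
        return true;
    }
    if (!isStart(s.front())) return false;  // identifier ::= id_start id_continue*
    for (char c : s.substr(1)) {
        if (!isStart(c) && !isDigit(c)) return false;
    }
    return true;
}

// Parse one segment (text between two '/'); returns nullopt on malformed input.
std::optional<SegmentTemplate> parseSegment(std::string_view s) {
    SegmentTemplate out;
    const bool braced = s.size() >= 2 && s.front() == '{' && s.back() == '}';
    if (!braced) {
        // Assumption: a literal must not contain braces or a slash.
        if (s.find_first_of("{}/") != std::string_view::npos) return std::nullopt;
        out.string = std::string(s);
        return out;
    }
    std::string_view body = s.substr(1, s.size() - 2);
    out.isLiteral = false;
    if (!body.empty()) {
        switch (body.back()) {
            case '?': out.modifier = Modifier::Optional; break;
            case '*': out.modifier = Modifier::Star; break;
            case '+': out.modifier = Modifier::Plus; break;
            default: break;
        }
        if (out.modifier != Modifier::None) body.remove_suffix(1);
    }
    // The grammar makes arg_id optional, so "{}" and "{+}" are accepted.
    if (!body.empty() && !isArgId(body)) return std::nullopt;
    out.string = std::string(body);
    return out;
}
```

For example, parseSegment("{resource+}") yields a non-literal segment with Modifier::Plus and the id resource, while parseSegment("{9a}") is rejected because an id that starts with a digit must be all digits.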

In Boost.URL, the rule for parsing a path is then:

constexpr auto pathTemplateRule = boost::urls::grammar::tuple_rule(
    boost::urls::grammar::squelch(
        boost::urls::grammar::optional_rule(boost::urls::grammar::delim_rule('/'))),
    boost::urls::grammar::range_rule(
        segmentTemplateRule,
        boost::urls::grammar::tuple_rule(boost::urls::grammar::squelch(
                                             boost::urls::grammar::delim_rule('/')),
                                         segmentTemplateRule)));

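For node storage we do not use a generic linked tree structure. Instead, to make traversal and sorting efficient, we organize the tree with two vector structures.

The node data structure is: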

class SegementNode {
public:
    std::size_t parentIndex{npos};
    ChildIndexVector childIndexes;
    std::shared_ptr<AnyResource> resource;
};

Here ChildIndexVector stores the indexes of this node's children within the std::vector<SegementNode> that holds all nodes.
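
The index-based layout can be illustrated with a small standard-library sketch. The names FlatNode, FlatTree, kNoIndex and resourceId are hypothetical, a plain std::vector<std::size_t> stands in for ChildIndexVector, and an int stands in for the resource pointer. The point it shows is that nodes refer to each other by position in one vector, so the links stay valid when the vector reallocates.

```cpp
#include <cstddef>
#include <limits>
#include <string>
#include <utility>
#include <vector>

constexpr std::size_t kNoIndex = std::numeric_limits<std::size_t>::max();

// Hypothetical node: links are indexes into FlatTree::m_nodes, not pointers.
struct FlatNode {
    std::string segment;
    std::size_t parentIndex{kNoIndex};
    std::vector<std::size_t> childIndexes;  // stand-in for ChildIndexVector
    int resourceId{-1};                     // stand-in for the resource pointer
};

class FlatTree {
public:
    FlatTree() { m_nodes.emplace_back(); }  // index 0 is the root

    std::size_t size() const { return m_nodes.size(); }
    const FlatNode &node(std::size_t index) const { return m_nodes[index]; }

    // Return the child of `parent` named `segment`, or kNoIndex if absent.
    std::size_t findChild(std::size_t parent, const std::string &segment) const {
        for (std::size_t child : m_nodes[parent].childIndexes) {
            if (m_nodes[child].segment == segment) return child;
        }
        return kNoIndex;
    }

    // Insert a path given as segments, reusing existing nodes for shared prefixes.
    std::size_t insert(const std::vector<std::string> &segments, int resourceId) {
        std::size_t current = 0;
        for (const auto &segment : segments) {
            std::size_t next = findChild(current, segment);
            if (next == kNoIndex) {
                next = m_nodes.size();
                FlatNode created;
                created.segment = segment;
                created.parentIndex = current;
                m_nodes.push_back(std::move(created));  // may reallocate
                m_nodes[current].childIndexes.push_back(next);  // index is still valid
            }
            current = next;
        }
        m_nodes[current].resourceId = resourceId;
        return current;
    }

    // Rebuild the path of a node by walking parentIndex up to the root.
    std::string pathOf(std::size_t index) const {
        std::string path;
        while (index != 0) {
            path = "/" + m_nodes[index].segment + path;
            index = m_nodes[index].parentIndex;
        }
        return path.empty() ? "/" : path;
    }

private:
    std::vector<FlatNode> m_nodes;
};
```

Inserting /user/list and then /user/login/{username} produces five nodes in total (root, user, list, login, {username}), because the user prefix is shared.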

class ChildIndexVector {
    static constexpr std::size_t N = 5;
public:
    bool empty() const;
    std::size_t size() const;
    std::size_t *begin();
    const std::size_t *begin() const;
    std::size_t *end();
    const std::size_t *end() const;
    void erase(std::size_t *it);
    void push_back(std::size_t v);

private:
    std::size_t m_capcity{0};
    std::size_t m_size{0};
    std::size_t *m_childIndexes{nullptr};
    std::size_t m_staticChildIndexes[N]{};
};

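The implementation of ChildIndexVector is also optimized to some extent. Given how URLs are typically laid out, a path rarely has many sub-paths, so to use dynamic memory allocation as little as possible we reserve space for N indexes inside the object itself.

The class definitions and the full implementation are in HttpProxy/UrlRouterPrivate.cpp, so I will not go through them one by one here.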
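
The inline-storage idea can be sketched as follows. This is not the code from the repository: the class name SmallIndexVector and the usesHeap() helper are hypothetical, and the growth policy (doubling) is an assumption. The first N indexes live in an array inside the object, and the heap is touched only when an (N+1)-th child is added.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical small-buffer vector of child indexes.
class SmallIndexVector {
    static constexpr std::size_t N = 5;

public:
    SmallIndexVector() = default;
    // Copying is disabled to keep the sketch short; a real class stored in a
    // std::vector of nodes would also need correct move operations.
    SmallIndexVector(const SmallIndexVector &) = delete;
    SmallIndexVector &operator=(const SmallIndexVector &) = delete;

    bool empty() const { return m_size == 0; }
    std::size_t size() const { return m_size; }
    bool usesHeap() const { return m_heap != nullptr; }

    std::size_t *begin() { return data(); }
    const std::size_t *begin() const { return data(); }
    std::size_t *end() { return data() + m_size; }
    const std::size_t *end() const { return data() + m_size; }

    void push_back(std::size_t value) {
        if (m_size == m_capacity) grow();
        data()[m_size++] = value;
    }

    // Remove the element at `it` by shifting the tail one slot to the left.
    void erase(std::size_t *it) {
        std::copy(it + 1, end(), it);
        --m_size;
    }

private:
    std::size_t *data() { return m_heap ? m_heap.get() : m_inline; }
    const std::size_t *data() const { return m_heap ? m_heap.get() : m_inline; }

    // Move the elements to a heap buffer with twice the current capacity.
    void grow() {
        const std::size_t newCapacity = m_capacity * 2;
        auto bigger = std::make_unique<std::size_t[]>(newCapacity);
        std::copy(begin(), end(), bigger.get());
        m_heap = std::move(bigger);
        m_capacity = newCapacity;
    }

    std::size_t m_capacity{N};
    std::size_t m_size{0};
    std::unique_ptr<std::size_t[]> m_heap;  // null while the inline buffer suffices
    std::size_t m_inline[N]{};
};
```

With N = 5, a node with up to five children never allocates. The trade-off is that every node carries the inline array even when it has no children.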