

$$ \newcommand{\hessian}{\mathbf{H}} \newcommand{\grad}{\mathbf{g}} \newcommand{\invhessian}{\mathbf{H}^{-1}} \newcommand{\qquad}{\hspace{1em}} $$

Numerical optimization is at the core of much of machine learning. Once you’ve defined your model and have a dataset ready, estimating the parameters of your model typically boils down to minimizing some multivariate function $f(x)$, where the input $x$ is in some high-dimensional space and corresponds to model parameters. In other words, if you solve:

\[x^* = \arg\min_x f(x)\]

then $x^*$ is the ‘best’ choice for model parameters according to how you’ve set your objective.1

In this post, I’ll focus on the motivation for the L-BFGS algorithm for unconstrained function minimization, which is very popular for ML problems where ‘batch’ optimization makes sense. For larger problems, online methods based around stochastic gradient descent have gained popularity, since they require fewer passes over data to converge. In a later post, I might cover some of these techniques, including my personal favorite AdaDelta.

Note: Throughout the post, I’ll assume you remember multivariable calculus. So if you don’t recall what a gradient or Hessian is, you’ll want to bone up first.

"Illustration of iterative function descent"

Newton’s Method

Most numerical optimization procedures are iterative algorithms that consider a sequence of ‘guesses’ $x_n$ which ultimately converge to $x^*$, the true global minimizer of $f$. Suppose we have an estimate $x_n$ and we want our next estimate $x_{n+1}$ to have the property that \(f(x_{n+1}) < f(x_n)\).

Newton’s method is centered around a quadratic approximation of $f$ for points near $x_n$. Assuming that $f$ is twice-differentiable, we can approximate $f$ near a fixed point $x$ with a second-order Taylor expansion:

\[\begin{align} f(x + \Delta x) &\approx f(x) + \Delta x^T \nabla f(x) + \frac{1}{2} \Delta x^T \left( \nabla^2 f(x) \right) \Delta x \end{align}\]

where $\nabla f(x)$ and $\nabla^2 f(x)$ are the gradient and Hessian of $f$ at the point $x$. This approximation holds in the limit as $|| \Delta x || \rightarrow 0$. This is a generalization of the single-dimensional Taylor polynomial expansion you might remember from Calculus.
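
For intuition, the single-variable special case is the familiar second-order Taylor polynomial, with the gradient collapsing to $f'(x)$ and the Hessian to $f''(x)$:

\[f(x + \Delta x) \approx f(x) + f'(x)\, \Delta x + \frac{1}{2} f''(x)\, \Delta x^2\]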

In order to simplify much of the notation, we’re going to think of our iterative algorithm as producing a sequence of such quadratic approximations $h_n$. Without loss of generality, we can write \(x_{n+1} = x_n + \Delta x\) and re-write the above equation,

\[\begin{align} h_n(\Delta x) &= f(x_n) + \Delta x^T \grad_n + \frac{1}{2} \Delta x^T \hessian_n \Delta x \end{align}\]

where $\grad_n$ and $\hessian_n$ represent the gradient and Hessian of $f$ at $x_n$.

We want to choose $\Delta x$ to minimize this local quadratic approximation of $f$ at $x_n$. Differentiating the above with respect to $\Delta x$ yields:

\[\begin{align} \frac{\partial h_n(\Delta x)}{\partial \Delta x} = \grad_n + \hessian_n \Delta x \end{align}\]

Recall that any $\Delta x$ which yields $\frac{\partial h_n(\Delta x)}{\partial \Delta x} = 0$ is a local extremum of $h_n(\cdot)$. If we assume that $\hessian_n$ is positive definite (psd) then we know this $\Delta x$ is also the global minimum of $h_n(\cdot)$. Solving for $\Delta x$:2

\[\Delta x = - \invhessian_n \grad_n\]

This suggests moving $x_n$ along the direction $-\invhessian_n \grad_n$. In practice, we set \(x_{n+1} = x_n - \alpha (\invhessian_n \grad_n)\) for a value of $\alpha$ where $f(x_{n+1})$ is ‘sufficiently’ smaller than $f(x_n)$.
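
As a quick sanity check, if $f$ is itself an exact quadratic, say $f(x) = \frac{1}{2} x^T A x - b^T x$ with $A$ positive definite, then $\grad_n = A x_n - b$ and $\hessian_n = A$, so a full step ($\alpha = 1$) lands exactly on the minimizer in a single iteration:

\[x_{n+1} = x_n - A^{-1}(A x_n - b) = A^{-1} b = x^*\]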

Iterative Algorithm

The above suggests an iterative algorithm:

\[\begin{align} & \mathbf{NewtonRaphson}(f,x_0): \\ & \qquad \mbox{For $n=0,1,\ldots$ (until converged)}: \\ & \qquad \qquad \mbox{Compute $\grad_n$ and $\invhessian_n$ for $x_n$} \\ & \qquad \qquad d = \invhessian_n \grad_n \\ & \qquad \qquad \alpha = \min_{\alpha \geq 0} f(x_{n} - \alpha d) \\ & \qquad \qquad x_{n+1} \leftarrow x_{n} - \alpha d \end{align}\]

The computation of the $\alpha$ step-size can use any number of line search algorithms. The simplest of these is backtracking line search, where you simply try smaller and smaller values of $\alpha$ until the function value is ‘small enough’.
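
To make that concrete, here is a minimal sketch of backtracking line search in Java (the class and method names are just for illustration). The shrink factor of 0.5 and the plain ‘did the value decrease’ acceptance test are simplifying assumptions; practical implementations typically require a sufficient-decrease (Armijo) condition instead.

import java.util.function.ToDoubleFunction;

// Minimal backtracking line search: shrink alpha until stepping along -d improves f.
public class Backtracking {
  public static double stepSize(ToDoubleFunction<double[]> f, double[] x, double[] d) {
    double fx = f.applyAsDouble(x);
    double alpha = 1.0;
    while (alpha > 1e-10) {
      double[] candidate = x.clone();
      for (int i = 0; i < candidate.length; i++) candidate[i] -= alpha * d[i];
      if (f.applyAsDouble(candidate) < fx) return alpha;  // 'small enough'
      alpha *= 0.5;                                       // try a smaller step
    }
    return 0.0; // no improving step found
  }
}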

In terms of software engineering, we can treat $\mathbf{NewtonRaphson}$ as a blackbox for any twice-differentiable function which satisfies the Java interface:

public interface TwiceDifferentiableFunction {
  // compute f(x)
  public double valueAt(double[] x);

  // compute grad f(x)
  public double[] gradientAt(double[] x);

  // compute inverse hessian H^-1
  public double[][] inverseHessian(double[] x);
}
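
To see the interface in use, here is a minimal sketch of the $\mathbf{NewtonRaphson}$ loop written against it. It reuses the hypothetical Backtracking.stepSize helper sketched above, and the gradient-norm convergence test is a simplifying assumption.

// A minimal Newton-Raphson driver over the TwiceDifferentiableFunction interface.
public class NewtonRaphsonDriver {
  public static double[] minimize(TwiceDifferentiableFunction f, double[] x0, double tol) {
    double[] x = x0.clone();
    while (true) {
      double[] g = f.gradientAt(x);
      if (norm(g) < tol) return x;                         // converged
      double[] d = multiply(f.inverseHessian(x), g);       // d = H^{-1} g
      double alpha = Backtracking.stepSize(f::valueAt, x, d);
      for (int i = 0; i < x.length; i++) x[i] -= alpha * d[i];
    }
  }

  // Dense matrix-vector product H^{-1} * v.
  static double[] multiply(double[][] m, double[] v) {
    double[] r = new double[m.length];
    for (int i = 0; i < m.length; i++)
      for (int j = 0; j < v.length; j++) r[i] += m[i][j] * v[j];
    return r;
  }

  static double norm(double[] v) {
    double sum = 0.0;
    for (double vi : v) sum += vi * vi;
    return Math.sqrt(sum);
  }
}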

With quite a bit of tedious math, you can prove that for a convex function, the above procedure will converge to a unique global minimizer $x^*$, regardless of the choice of $x_0$. For non-convex functions that arise in ML (almost all latent variable models or deep nets), the procedure still works but is only guaranteed to converge to a local minimum. In practice, for non-convex optimization, users need to pay more attention to initialization and other algorithm details.

Huge Hessians

The central issue with $\mathbf{NewtonRaphson}$ is that we need to be able to compute the inverse Hessian matrix.3 Note that for ML applications, the dimensionality of the input to $f$ typically corresponds to model parameters. It’s not unusual to have hundreds of millions of parameters, or in some vision applications even billions. For these reasons, computing the Hessian or its inverse is often impractical. For many functions, the Hessian may not even be analytically computable, let alone representable.

As a result, $\mathbf{NewtonRaphson}$ is rarely used in practice to optimize functions corresponding to large problems. Luckily, the above algorithm still works even if $\invhessian_n$ doesn’t correspond to the exact inverse Hessian at $x_n$, but is instead a good approximation of it.

Quasi-Newton

Suppose that instead of requiring $\invhessian_n$ be the inverse hessian at $x_n$, we think of it as an approximation of this information. We can generalize $\mathbf{NewtonRaphson}$ to take a $\mbox{QuasiUpdate}$ policy which is responsible for producing a sequence of $\invhessian_n$.

\[\begin{align} & \mathbf{QuasiNewton}(f,x_0, \invhessian_0, \mbox{QuasiUpdate}): \\ & \qquad \mbox{For $n=0,1,\ldots$ (until converged)}: \\ & \qquad \qquad \mbox{// Compute search direction and step-size } \\ & \qquad \qquad d = \invhessian_n \grad_n \\ & \qquad \qquad \alpha \leftarrow \min_{\alpha \geq 0} f(x_{n} - \alpha d) \\ & \qquad \qquad x_{n+1} \leftarrow x_{n} - \alpha d \\ & \qquad \qquad \mbox{// Store the input and gradient deltas } \\ & \qquad \qquad \grad_{n+1} \leftarrow \nabla f(x_{n+1}) \\ & \qquad \qquad s_{n+1} \leftarrow x_{n+1} - x_n \\ & \qquad \qquad y_{n+1} \leftarrow \grad_{n+1} - \grad_n \\ & \qquad \qquad \mbox{// Update inverse hessian } \\ & \qquad \qquad \invhessian_{n+1} \leftarrow \mbox{QuasiUpdate}(\invhessian_{n},s_{n+1}, y_{n+1}) \end{align}\]

We’ve assumed that $\mbox{QuasiUpdate}$ only requires the previous inverse Hessian estimate as well as the input and gradient differences ($s_n$ and $y_n$ respectively). Note that if $\mbox{QuasiUpdate}$ just returns the exact inverse Hessian $\left( \nabla^2 f(x_{n+1}) \right)^{-1}$, we recover exact $\mathbf{NewtonRaphson}$.

In terms of software, we can blackbox optimize an arbitrary differentiable function (with no need to compute second derivatives) using $\mathbf{QuasiNewton}$, assuming we have a quasi-Newton approximation update policy. In Java, this might look like:

public interface DifferentiableFunction {
  // compute f(x)
  public double valueAt(double[] x);

  // compute grad f(x)
  public double[] gradientAt(double[] x);
}

public interface QuasiNewtonApproximation {
  // update the H^{-1} estimate (using x_{n+1}-x_n and grad_{n+1}-grad_n)
  public void update(double[] deltaX, double[] deltaGrad);

  // H^{-1} (direction) using the current H^{-1} estimate
  public double[] inverseHessianMultiply(double[] direction);
}

Note that the only use we have of the Hessian is via its product with the gradient direction. This will become useful for the L-BFGS algorithm described below, since we never need to represent the Hessian approximation explicitly in memory. If you want to see these abstractions in action, here’s a link to a Java 8 and golang implementation I’ve written.
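
Putting the two interfaces together, the $\mathbf{QuasiNewton}$ driver itself might look like the following sketch, again leaning on the hypothetical Backtracking.stepSize helper from earlier; the gradient-norm convergence test is an assumption made for brevity.

// A minimal quasi-Newton driver over DifferentiableFunction and QuasiNewtonApproximation.
public class QuasiNewtonDriver {
  public static double[] minimize(DifferentiableFunction f,
                                  QuasiNewtonApproximation approx,
                                  double[] x0, double tol) {
    double[] x = x0.clone();
    double[] grad = f.gradientAt(x);
    while (norm(grad) > tol) {
      double[] d = approx.inverseHessianMultiply(grad);      // d = H^{-1} g
      double alpha = Backtracking.stepSize(f::valueAt, x, d);
      double[] xNext = x.clone();
      for (int i = 0; i < x.length; i++) xNext[i] -= alpha * d[i];
      double[] gradNext = f.gradientAt(xNext);
      // s_{n+1} = x_{n+1} - x_n and y_{n+1} = grad_{n+1} - grad_n
      double[] deltaX = new double[x.length];
      double[] deltaGrad = new double[x.length];
      for (int i = 0; i < x.length; i++) {
        deltaX[i] = xNext[i] - x[i];
        deltaGrad[i] = gradNext[i] - grad[i];
      }
      approx.update(deltaX, deltaGrad);
      x = xNext;
      grad = gradNext;
    }
    return x;
  }

  static double norm(double[] v) {
    double sum = 0.0;
    for (double vi : v) sum += vi * vi;
    return Math.sqrt(sum);
  }
}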

Behave like a Hessian

What form should $\mbox{QuasiUpdate}$ take? Well, if we have $\mbox{QuasiUpdate}$ always return the identity matrix (ignoring its inputs), then this corresponds to simple gradient descent, since the search direction is always $\grad_n$ and each step moves along the negative gradient. While this actually yields a valid procedure which will converge to $x^*$ for convex $f$, intuitively this choice of $\mbox{QuasiUpdate}$ isn’t attempting to capture second-order information about $f$.
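
For example, this trivial policy (a sketch against the interface above) ignores the deltas entirely and multiplies by the identity, which turns $\mathbf{QuasiNewton}$ into plain gradient descent:

// The trivial quasi-Newton policy: pretend H^{-1} is the identity matrix.
public class IdentityApproximation implements QuasiNewtonApproximation {
  public void update(double[] deltaX, double[] deltaGrad) {
    // ignore the curvature information entirely
  }

  public double[] inverseHessianMultiply(double[] direction) {
    return direction.clone(); // I * direction = direction
  }
}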

Let’s think about the quadratic approximation of $f$ near $x_{n}$ implied by our choice of \(\hessian_{n}\):

\[h_{n}(d) = f(x_{n}) + d^T \grad_{n} + \frac{1}{2} d^T \hessian_{n} d\]

Secant Condition

A good property for \(h_{n}(d)\) is that its gradient agrees with the gradient of $f$ at $x_n$ and $x_{n-1}$. In other words, we’d like to ensure:

\[\begin{align} \nabla h_{n}(x_{n}) &= \grad_{n} \\ \nabla h_{n}(x_{n-1}) &= \grad_{n-1}\\ \end{align}\]

Using both of the equations above:

\[\nabla h_{n}(x_{n}) - \nabla h_{n}(x_{n-1}) = \grad_{n} - \grad_{n-1}\]

Using the gradient of $h_{n}(\cdot)$ and canceling terms we get

\[\hessian_{n}(x_{n} - x_{n-1}) = (\grad_{n} - \grad_{n-1})\]

This yields the so-called “secant condition”, which ensures that $\hessian_{n}$ behaves like the Hessian at least for the difference \((x_{n} - x_{n-1})\). Assuming \(\hessian_{n}\) is invertible (which is true if it is psd), then multiplying both sides by \(\invhessian_{n}\) yields

\[\invhessian_{n} \mathbf{y}_{n} = \mathbf{s}_{n}\]

where \(\mathbf{y}_{n} = \grad_{n} - \grad_{n-1}\) is the difference in gradients and \(\mathbf{s}_{n} = x_{n} - x_{n-1}\) is the difference in inputs.

Symmetric

Recall that the Hessian is the matrix of second-order partial derivatives: $\hessian^{(i,j)} = \partial^2 f / \partial x_i \partial x_j$. The Hessian is symmetric since the order of differentiation doesn’t matter.

The BFGS Update

Intuitively, we want $\hessian_n$ to satisfy the two conditions above:

  • Secant condition holds for $\mathbf{s}_n$ and $\mathbf{y}_n$
  • $\hessian_n$ is symmetric

Given the two conditions above, we’d like to take the most conservative change relative to $\invhessian_{n-1}$. This is reminiscent of the MIRA update, where we have conditions that any good solution must satisfy but, all other things being equal, want the ‘smallest’ change.

\[\begin{aligned} \min_{\invhessian} & \hspace{0.5em} \| \invhessian - \invhessian_{n-1} \|^2 \\ \mbox{s.t. } & \hspace{0.5em} \invhessian \mathbf{y}_{n} = \mathbf{s}_{n} \\ & \hspace{0.5em} \invhessian \mbox{ is symmetric } \end{aligned}\]

The norm used here \(\| \cdot \|\) is the weighted Frobenius norm.4 The solution to this optimization problem is given by

\[\invhessian_{n+1} = (I - \rho_n s_n y_n^T) \invhessian_n (I - \rho_n y_n s_n^T) + \rho_n s_n s_n^T\]

where $\rho_n = (y_n^T s_n)^{-1}$. Proving this is relatively involved and mostly symbol crunching. I don’t know of any intuitive way to derive this unfortunately.
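
That said, it is easy to at least verify that the update satisfies the secant condition: multiplying the right-hand side by $y_n$ and using $\rho_n s_n^T y_n = 1$,

\[\begin{align} \invhessian_{n+1} y_n &= (I - \rho_n s_n y_n^T) \invhessian_n \left( y_n - \rho_n y_n (s_n^T y_n) \right) + \rho_n s_n (s_n^T y_n) \\ &= (I - \rho_n s_n y_n^T) \invhessian_n (y_n - y_n) + s_n \\ &= s_n \end{align}\]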

This update is known as the Broyden–Fletcher–Goldfarb–Shanno (BFGS) update, named after the original authors. Some things worth noting about this update:

  • $\invhessian_{n+1}$ is positive definite (psd) when $\invhessian_n$ is. Assuming our initial estimate $\invhessian_0$ is psd, it follows by induction that each inverse Hessian estimate is as well. Since we can choose any $\invhessian_0$ we want, including the identity matrix $I$, this is easy to ensure.

  • The above also specifies a recurrence relationship between \(\invhessian_{n+1}\) and \(\invhessian_{n}\). We only need the history of \(s_n\) and \(y_n\) to re-construct \(\invhessian_n\).

The last point is significant since it will yield a procedural algorithm for computing $\invhessian_n d$, for a direction $d$, without ever forming the $\invhessian_n$ matrix. Repeatedly applying the recurrence above we have

\[\begin{align} & \mathbf{BFGSMultiply}(\invhessian_0, \{s_k\}, \{y_k\}, d): \\ & \qquad r \leftarrow d \\ & \qquad \mbox{// Compute right product} \\ & \qquad \mbox{for $i=n,\ldots,1$}: \\ & \qquad \qquad \alpha_i \leftarrow \rho_{i} s^T_i r \\ & \qquad \qquad r \leftarrow r - \alpha_i y_i \\ & \qquad \mbox{// Compute center} \\ & \qquad r \leftarrow \invhessian_0 r \\ & \qquad \mbox{// Compute left product} \\ & \qquad \mbox{for $i=1,\ldots,n$}: \\ & \qquad \qquad \beta \leftarrow \rho_{i} y^T_i r \\ & \qquad \qquad r \leftarrow r + (\alpha_{i}-\beta)s_i \\ & \qquad \mbox{return $r$} \end{align}\]

Since the only use for $\invhessian_n$ is via the product $\invhessian_n \grad_n$, we only need the above procedure to use the BFGS approximation in $\mbox{QuasiNewton}$.
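
For reference, here is the same two-loop recursion sketched in Java. The history lists are assumed to be ordered oldest to newest, and \(\invhessian_0\) is taken to be the identity matrix for simplicity.

import java.util.List;

// The BFGS two-loop recursion: computes H^{-1} * d from the (s_k, y_k) history,
// assuming H_0 = I. Histories are ordered oldest to newest.
public class BfgsMultiply {
  public static double[] multiply(List<double[]> sHistory, List<double[]> yHistory, double[] d) {
    int n = sHistory.size();
    double[] r = d.clone();
    double[] alpha = new double[n];
    // Right product: walk the history from newest to oldest.
    for (int i = n - 1; i >= 0; i--) {
      double rho = 1.0 / dot(yHistory.get(i), sHistory.get(i));
      alpha[i] = rho * dot(sHistory.get(i), r);
      axpy(r, -alpha[i], yHistory.get(i));        // r <- r - alpha_i * y_i
    }
    // Center: r <- H_0 r, with H_0 = I here (a no-op).
    // Left product: walk the history from oldest to newest.
    for (int i = 0; i < n; i++) {
      double rho = 1.0 / dot(yHistory.get(i), sHistory.get(i));
      double beta = rho * dot(yHistory.get(i), r);
      axpy(r, alpha[i] - beta, sHistory.get(i));  // r <- r + (alpha_i - beta) * s_i
    }
    return r;
  }

  static double dot(double[] a, double[] b) {
    double s = 0.0;
    for (int i = 0; i < a.length; i++) s += a[i] * b[i];
    return s;
  }

  static void axpy(double[] r, double c, double[] v) {
    for (int i = 0; i < r.length; i++) r[i] += c * v[i];
  }
}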

L-BFGS: BFGS on a memory budget

The BFGS quasi-Newton approximation has the benefit of not requiring us to be able to analytically compute the Hessian of a function. However, we still must maintain a history of the $s_n$ and $y_n$ vectors for each iteration. Since one of the core concerns with the $\mathbf{NewtonRaphson}$ algorithm was the memory required to maintain a Hessian, the BFGS quasi-Newton algorithm doesn’t fully address that concern, since this history can grow without bound.

The L-BFGS algorithm, named for limited-memory BFGS, simply truncates the \(\mathbf{BFGSMultiply}\) update to use the last $m$ input differences and gradient differences. This means we only need to store \(s_n, s_{n-1},\ldots, s_{n-m+1}\) and \(y_n, y_{n-1},\ldots, y_{n-m+1}\) to compute the update. The center product can still use any symmetric psd matrix \(\invhessian_0\), which can also depend on the stored \(\{s_k\}\) and \(\{ y_k \}\).
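
A minimal sketch of the bookkeeping, written against the QuasiNewtonApproximation interface and reusing the BfgsMultiply sketch above (so \(\invhessian_0 = I\)): keep the last $m$ pairs in a deque and drop the oldest pair when full. A common refinement, not shown here, is to scale \(\invhessian_0\) by \(s_n^T y_n / y_n^T y_n\).

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;

// L-BFGS bookkeeping: keep only the last m (s, y) pairs.
public class LbfgsApproximation implements QuasiNewtonApproximation {
  private final int m;
  private final Deque<double[]> sHistory = new ArrayDeque<>();
  private final Deque<double[]> yHistory = new ArrayDeque<>();

  public LbfgsApproximation(int m) { this.m = m; }

  public void update(double[] deltaX, double[] deltaGrad) {
    sHistory.addLast(deltaX.clone());
    yHistory.addLast(deltaGrad.clone());
    if (sHistory.size() > m) {        // forget the oldest pair
      sHistory.removeFirst();
      yHistory.removeFirst();
    }
  }

  public double[] inverseHessianMultiply(double[] direction) {
    return BfgsMultiply.multiply(new ArrayList<>(sHistory), new ArrayList<>(yHistory), direction);
  }
}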

L-BFGS variants

There are lots of variants of L-BFGS which get used in practice. For non-differentiable functions, there is an orthant-wise variant which is suitable for training $L_1$-regularized losses.

One of the main reasons not to use L-BFGS is in very large data settings, where an online approach can converge faster. There are in fact online variants of L-BFGS, but to my knowledge none have consistently outperformed SGD variants (including AdaGrad or AdaDelta) for sufficiently large data sets.

  1. This assumes there is a unique global minimizer for $f$. In practice, unless $f$ is convex, the parameters used are whatever pops out the other side of an iterative algorithm.

  2. We know $- \invhessian \nabla f$ is a local extremum since the gradient is zero; since the Hessian has positive curvature, we know it’s in fact a local minimum. If $f$ is convex, we know the Hessian is always positive definite and we know there is a single unique global minimum.

  3. As we’ll see, we really only require being able to compute the product $\invhessian d$ for a direction $d$.

  4. I’ve intentionally left out the weighting matrix $W$ used to define the norm, since you get the same solution under many choices. In particular, for any positive-definite $W$ such that $W s_n = y_n$, we get the same solution.
