Learning Objectives
- State the chain rules for one or two independent variables.
- Use tree diagrams as an aid to understanding the chain rule for several independent and intermediate variables.
Chain Rules for One or Two Independent Variables
Recall that the chain rule for the derivative of a composite of two functions can be written in the form
$$\frac{d}{dx}\bigl(f(g(x))\bigr)=f'(g(x))\,g'(x).$$
In this equation, both f(x) and g(x) are functions of one variable. Now suppose that f is a function of two variables and g is a function of one variable. Or perhaps they are both functions of two variables, or even more. How would we calculate the derivative in these cases? The following theorem gives us the answer for the case of one independent variable.
Theorem: Chain Rule for One Independent Variable
Suppose that x=g(t) and y=h(t) are differentiable functions of t and z=f(x,y) is a differentiable function of x and y. Then z=f(x(t),y(t)) is a differentiable function of t and
$$\frac{dz}{dt}=\frac{\partial z}{\partial x}\cdot\frac{dx}{dt}+\frac{\partial z}{\partial y}\cdot\frac{dy}{dt},$$
where the ordinary derivatives are evaluated at t and the partial derivatives are evaluated at (x,y).
Proof
The proof of this theorem uses the definition of differentiability of a function of two variables. Suppose that $f$ is differentiable at the point $P(x_0,y_0)$, where $x_0=g(t_0)$ and $y_0=h(t_0)$ for a fixed value of $t_0$. We wish to prove that $z=f(x(t),y(t))$ is differentiable at $t=t_0$ and that the Chain Rule for One Independent Variable holds at that point as well.
Since f is differentiable at P, we know that
$$z(t)=f(x,y)=f(x_0,y_0)+f_x(x_0,y_0)(x-x_0)+f_y(x_0,y_0)(y-y_0)+E(x,y),$$
where $\lim_{(x,y)\to(x_0,y_0)}\dfrac{E(x,y)}{\sqrt{(x-x_0)^2+(y-y_0)^2}}=0$. We then subtract $z_0=f(x_0,y_0)$ from both sides of this equation:
$$z(t)-z(t_0)=f(x(t),y(t))-f(x(t_0),y(t_0))=f_x(x_0,y_0)\bigl(x(t)-x(t_0)\bigr)+f_y(x_0,y_0)\bigl(y(t)-y(t_0)\bigr)+E(x(t),y(t)).$$
Next, we divide both sides by t−t0:
$$\frac{z(t)-z(t_0)}{t-t_0}=f_x(x_0,y_0)\left(\frac{x(t)-x(t_0)}{t-t_0}\right)+f_y(x_0,y_0)\left(\frac{y(t)-y(t_0)}{t-t_0}\right)+\frac{E(x(t),y(t))}{t-t_0}.$$
Then we take the limit as t approaches t0:
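$$\lim_{t\to t_0}\frac{z(t)-z(t_0)}{t-t_0}=f_x(x_0,y_0)\lim_{t\to t_0}\left(\frac{x(t)-x(t_0)}{t-t_0}\right)+f_y(x_0,y_0)\lim_{t\to t_0}\left(\frac{y(t)-y(t_0)}{t-t_0}\right)+\lim_{t\to t_0}\frac{E(x(t),y(t))}{t-t_0}.$$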
The left-hand side of this equation is equal to dz/dt, which leads to
$$\frac{dz}{dt}=f_x(x_0,y_0)\frac{dx}{dt}+f_y(x_0,y_0)\frac{dy}{dt}+\lim_{t\to t_0}\frac{E(x(t),y(t))}{t-t_0}.$$
The last term can be rewritten as
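$$\begin{aligned}
\lim_{t\to t_0}\frac{E(x(t),y(t))}{t-t_0}
&=\lim_{t\to t_0}\left(\frac{E(x,y)}{\sqrt{(x-x_0)^2+(y-y_0)^2}}\cdot\frac{\sqrt{(x-x_0)^2+(y-y_0)^2}}{t-t_0}\right)\\
&=\lim_{t\to t_0}\left(\frac{E(x,y)}{\sqrt{(x-x_0)^2+(y-y_0)^2}}\right)\lim_{t\to t_0}\left(\frac{\sqrt{(x-x_0)^2+(y-y_0)^2}}{t-t_0}\right).
\end{aligned}$$
As $t$ approaches $t_0$, $(x(t),y(t))$ approaches $(x(t_0),y(t_0))$, so we can rewrite the last product as
$$\lim_{(x,y)\to(x_0,y_0)}\left(\frac{E(x,y)}{\sqrt{(x-x_0)^2+(y-y_0)^2}}\right)\lim_{(x,y)\to(x_0,y_0)}\left(\frac{\sqrt{(x-x_0)^2+(y-y_0)^2}}{t-t_0}\right).$$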
Since the first limit is equal to zero, we need only show that the second limit is finite:
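$$\begin{aligned}
\lim_{(x,y)\to(x_0,y_0)}\left(\frac{\sqrt{(x-x_0)^2+(y-y_0)^2}}{t-t_0}\right)
&=\lim_{(x,y)\to(x_0,y_0)}\sqrt{\frac{(x-x_0)^2+(y-y_0)^2}{(t-t_0)^2}}\\
&=\lim_{(x,y)\to(x_0,y_0)}\sqrt{\left(\frac{x-x_0}{t-t_0}\right)^2+\left(\frac{y-y_0}{t-t_0}\right)^2}\\
&=\sqrt{\left(\lim_{(x,y)\to(x_0,y_0)}\frac{x-x_0}{t-t_0}\right)^2+\left(\lim_{(x,y)\to(x_0,y_0)}\frac{y-y_0}{t-t_0}\right)^2}.
\end{aligned}$$
Since $x(t)$ and $y(t)$ are both differentiable functions of $t$, both limits inside the last radical exist. Therefore, this value is finite. This proves the chain rule at $t=t_0$; the rest of the theorem follows from the assumption that all functions are differentiable over their entire domains.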
■
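The key step of the proof, that the error term divided by $t-t_0$ tends to zero, can be observed numerically. The following is a minimal sketch, assuming the illustrative choices $f(x,y)=4x^2+3y^2$, $x=\sin t$, $y=\cos t$, and $t_0=0.7$; the function name `error_ratio` is hypothetical and the printed ratios are only a numerical illustration, not a proof.

```python
import math

# Illustrative choice (an assumption): f(x, y) = 4x^2 + 3y^2 with x = sin t, y = cos t.
def f(x, y):
    return 4 * x**2 + 3 * y**2

def fx(x, y):
    return 8 * x  # partial derivative of f with respect to x

def fy(x, y):
    return 6 * y  # partial derivative of f with respect to y

def error_ratio(t0, h):
    """Return E(x(t), y(t)) / (t - t0) for t = t0 + h."""
    x0, y0 = math.sin(t0), math.cos(t0)
    x, y = math.sin(t0 + h), math.cos(t0 + h)
    # Linear approximation of f at (x0, y0); E is what is left over.
    linear = f(x0, y0) + fx(x0, y0) * (x - x0) + fy(x0, y0) * (y - y0)
    return (f(x, y) - linear) / h

ratios = [error_ratio(0.7, h) for h in (1e-1, 1e-2, 1e-3)]
for r in ratios:
    print(r)
```

Each tenfold reduction of the step should shrink the printed ratio by roughly a factor of ten, which is consistent with the limit being zero.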
Closer examination of the Chain Rule for One Independent Variable reveals an interesting pattern. The first term in the equation is $\frac{\partial f}{\partial x}\cdot\frac{dx}{dt}$ and the second term is $\frac{\partial f}{\partial y}\cdot\frac{dy}{dt}$. Recall that when multiplying fractions, cancelation can be used. If we treat these derivatives as fractions, then each product “simplifies” to something resembling $\partial f/dt$. The variables $x$ and $y$ that disappear in this simplification are often called intermediate variables: they are independent variables for the function $f$, but are dependent variables for the variable $t$. Two terms appear on the right-hand side of the formula, and $f$ is a function of two variables. This pattern works with functions of more than two variables as well, as we see later in this section.
Example: Using the chain rule
Calculate dz/dt for each of the following functions:
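a. $z=f(x,y)=4x^2+3y^2,\quad x=x(t)=\sin t,\quad y=y(t)=\cos t$
b. $z=f(x,y)=\sqrt{x^2-y^2},\quad x=x(t)=e^{2t},\quad y=y(t)=e^{-t}$
Solution
a. To use the chain rule, we need four quantities: $\partial z/\partial x$, $\partial z/\partial y$, $dx/dt$, and $dy/dt$.
$$\frac{\partial z}{\partial x}=8x,\quad \frac{\partial z}{\partial y}=6y,\quad \frac{dx}{dt}=\cos t,\quad \frac{dy}{dt}=-\sin t.$$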
Now, we substitute each of these into the Chain Rule for One Independent Variable:
$$\frac{dz}{dt}=\frac{\partial z}{\partial x}\cdot\frac{dx}{dt}+\frac{\partial z}{\partial y}\cdot\frac{dy}{dt}=(8x)(\cos t)+(6y)(-\sin t)=8x\cos t-6y\sin t.$$
This answer has three variables in it. To reduce it to one variable, use the fact that $x(t)=\sin t$ and $y(t)=\cos t$. We obtain
$$\frac{dz}{dt}=8x\cos t-6y\sin t=8(\sin t)\cos t-6(\cos t)\sin t=2\sin t\cos t.$$
This derivative can also be calculated by first substituting $x(t)$ and $y(t)$ into $f(x,y)$, then differentiating with respect to $t$:
$$z=f(x,y)=f(x(t),y(t))=4(x(t))^2+3(y(t))^2=4\sin^2 t+3\cos^2 t.$$
Then
$$\frac{dz}{dt}=2(4\sin t)(\cos t)+2(3\cos t)(-\sin t)=8\sin t\cos t-6\sin t\cos t=2\sin t\cos t,$$
which is the same solution. However, it may not always be this easy to differentiate in this form.
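b. To use the chain rule, we again need the four quantities $\partial z/\partial x$, $\partial z/\partial y$, $dx/dt$, and $dy/dt$:
$$\frac{\partial z}{\partial x}=\frac{x}{\sqrt{x^2-y^2}},\quad \frac{\partial z}{\partial y}=\frac{-y}{\sqrt{x^2-y^2}},\quad \frac{dx}{dt}=2e^{2t},\quad \frac{dy}{dt}=-e^{-t}.$$
We substitute each of these into the Chain Rule for One Independent Variable:
$$\frac{dz}{dt}=\frac{\partial z}{\partial x}\cdot\frac{dx}{dt}+\frac{\partial z}{\partial y}\cdot\frac{dy}{dt}=\left(\frac{x}{\sqrt{x^2-y^2}}\right)(2e^{2t})+\left(\frac{-y}{\sqrt{x^2-y^2}}\right)(-e^{-t})=\frac{2xe^{2t}+ye^{-t}}{\sqrt{x^2-y^2}}.$$
To reduce this to one variable, we use the fact that $x(t)=e^{2t}$ and $y(t)=e^{-t}$. Therefore,
$$\frac{dz}{dt}=\frac{2xe^{2t}+ye^{-t}}{\sqrt{x^2-y^2}}=\frac{2(e^{2t})e^{2t}+(e^{-t})e^{-t}}{\sqrt{e^{4t}-e^{-2t}}}=\frac{2e^{4t}+e^{-2t}}{\sqrt{e^{4t}-e^{-2t}}}.$$
To eliminate negative exponents, we multiply the top by $e^{2t}$ and the bottom by $\sqrt{e^{4t}}$ (the two are equal, so the value is unchanged):
$$\frac{dz}{dt}=\frac{2e^{4t}+e^{-2t}}{\sqrt{e^{4t}-e^{-2t}}}\cdot\frac{e^{2t}}{\sqrt{e^{4t}}}=\frac{2e^{6t}+1}{\sqrt{e^{8t}-e^{2t}}}=\frac{2e^{6t}+1}{\sqrt{e^{2t}(e^{6t}-1)}}=\frac{2e^{6t}+1}{e^{t}\sqrt{e^{6t}-1}}.$$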
Again, this derivative can also be calculated by first substituting x(t) and y(t) into f(x,y), then differentiating with respect to t:
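$$z=f(x,y)=f(x(t),y(t))=\sqrt{(x(t))^2-(y(t))^2}=\sqrt{e^{4t}-e^{-2t}}=\left(e^{4t}-e^{-2t}\right)^{1/2}.$$
Then
$$\frac{dz}{dt}=\frac{1}{2}\left(e^{4t}-e^{-2t}\right)^{-1/2}\left(4e^{4t}+2e^{-2t}\right)=\frac{2e^{4t}+e^{-2t}}{\sqrt{e^{4t}-e^{-2t}}}.$$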
This is the same solution.
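The closed forms from both parts of this example can be sanity-checked numerically. The sketch below compares each closed form with a central-difference approximation of the composite function at $t=0.5$; the helper name `central_difference`, the step size, and the sample point are assumptions chosen for illustration (part b requires $t>0$ so that the radicand is positive).

```python
import math

def central_difference(func, t, h=1e-6):
    """Approximate func'(t) with a symmetric difference quotient."""
    return (func(t + h) - func(t - h)) / (2 * h)

# Part a: z = 4x^2 + 3y^2 with x = sin t, y = cos t.
def z_a(t):
    x, y = math.sin(t), math.cos(t)
    return 4 * x**2 + 3 * y**2

def dz_dt_a(t):
    return 2 * math.sin(t) * math.cos(t)

# Part b: z = sqrt(x^2 - y^2) with x = e^(2t), y = e^(-t).
def z_b(t):
    x, y = math.exp(2 * t), math.exp(-t)
    return math.sqrt(x**2 - y**2)

def dz_dt_b(t):
    return (2 * math.exp(6 * t) + 1) / (math.exp(t) * math.sqrt(math.exp(6 * t) - 1))

t = 0.5
print(central_difference(z_a, t), dz_dt_a(t))
print(central_difference(z_b, t), dz_dt_b(t))
```

In each printed pair the two numbers should agree to several decimal places; a visible mismatch would point to an algebra slip such as a dropped sign.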
Try it
Calculate dz/dt given the following functions. Express the final answer in terms of t.
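$$z=f(x,y)=x^2-3xy+2y^2,\quad x=x(t)=3\sin 2t,\quad y=y(t)=4\cos 2t$$
Solution
$$\begin{aligned}
\frac{dz}{dt}&=\frac{\partial z}{\partial x}\cdot\frac{dx}{dt}+\frac{\partial z}{\partial y}\cdot\frac{dy}{dt}\\
&=(2x-3y)(6\cos 2t)+(-3x+4y)(-8\sin 2t)\\
&=-92\sin 2t\cos 2t-72\left(\cos^2 2t-\sin^2 2t\right)\\
&=-46\sin 4t-72\cos 4t.
\end{aligned}$$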
It is often useful to create a visual representation of the Chain Rule for One Independent Variable. This is called a tree diagram for the chain rule for functions of one variable, and it provides a way to remember the formula (Figure 1). This diagram can be expanded for functions of more than one variable, as we shall see very shortly.
Figure 1. Tree diagram for the case $\frac{dz}{dt}=\frac{\partial z}{\partial x}\cdot\frac{dx}{dt}+\frac{\partial z}{\partial y}\cdot\frac{dy}{dt}$.
In this diagram, the leftmost corner corresponds to $z=f(x,y)$. Since $f$ has two independent variables, there are two lines coming from this corner. The upper branch corresponds to the variable $x$ and the lower branch corresponds to the variable $y$. Since each of these variables is then dependent on one variable $t$, one branch then comes from $x$ and one branch comes from $y$. Last, each of the branches on the far right has a label that represents the path traveled to reach that branch. The top branch is reached by following the $x$ branch, then the $t$ branch; therefore, it is labeled $(\partial z/\partial x)\times(dx/dt)$. The bottom branch is similar: first the $y$ branch, then the $t$ branch. This branch is labeled $(\partial z/\partial y)\times(dy/dt)$. To get the formula for $dz/dt$, add all the terms that appear on the rightmost side of the diagram. This gives us the Chain Rule for One Independent Variable.
In the Chain Rule for Two Independent Variables, z=f(x,y) is a function of x and y, and both x=g(u,v) and y=h(u,v) are functions of the independent variables u and v.
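Theorem: Chain Rule for Two Independent Variables
Suppose $x=g(u,v)$ and $y=h(u,v)$ are differentiable functions of $u$ and $v$, and $z=f(x,y)$ is a differentiable function of $x$ and $y$. Then, $z=f(g(u,v),h(u,v))$ is a differentiable function of $u$ and $v$, and
$$\frac{\partial z}{\partial u}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial u}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial u}$$
and
$$\frac{\partial z}{\partial v}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial v}.$$
We can also draw a tree diagram for each of these formulas, as follows.
Figure 2. Tree diagram for $\frac{\partial z}{\partial u}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial u}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial u}$ and $\frac{\partial z}{\partial v}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial v}$.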
To derive the formula for ∂z/∂u, start from the left side of the diagram, then follow only the branches that end with u and add the terms that appear at the end of those branches. For the formula for ∂z/∂v, follow only the branches that end with v and add the terms that appear at the end of those branches.
There is an important difference between these two chain rule theorems. In the Chain Rule for One Independent Variable, the left-hand side of the formula for the derivative is not a partial derivative, but in the Chain Rule for Two Independent Variables it is. The reason is that, in the Chain Rule for One Independent Variable, $z$ is ultimately a function of $t$ alone, whereas in the Chain Rule for Two Independent Variables, $z$ is a function of both $u$ and $v$.
Example: using the chain rule for two variables
Calculate ∂z/∂u and ∂z/∂v using the following functions:
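$$z=f(x,y)=3x^2-2xy+y^2,\quad x=x(u,v)=3u+2v,\quad y=y(u,v)=4u-v$$
Solution
To implement the chain rule for two variables, we need six partial derivatives: $\partial z/\partial x$, $\partial z/\partial y$, $\partial x/\partial u$, $\partial x/\partial v$, $\partial y/\partial u$, and $\partial y/\partial v$.
$$\frac{\partial z}{\partial x}=6x-2y,\quad \frac{\partial z}{\partial y}=-2x+2y,\quad \frac{\partial x}{\partial u}=3,\quad \frac{\partial x}{\partial v}=2,\quad \frac{\partial y}{\partial u}=4,\quad \frac{\partial y}{\partial v}=-1.$$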
To find ∂z/∂u, we use the Chain Rule for Two Independent Variables:
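$$\frac{\partial z}{\partial u}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial u}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial u}=3(6x-2y)+4(-2x+2y)=10x+2y.$$
Next, we substitute $x(u,v)=3u+2v$ and $y(u,v)=4u-v$:
$$\frac{\partial z}{\partial u}=10x+2y=10(3u+2v)+2(4u-v)=38u+18v.$$
To find $\partial z/\partial v$, we use the Chain Rule for Two Independent Variables:
$$\frac{\partial z}{\partial v}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial v}=2(6x-2y)+(-1)(-2x+2y)=14x-6y.$$
Then we substitute $x(u,v)=3u+2v$ and $y(u,v)=4u-v$:
$$\frac{\partial z}{\partial v}=14x-6y=14(3u+2v)-6(4u-v)=18u+34v.$$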
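The two-variable chain rule for this example can also be evaluated directly, term by term, and compared with the simplified results $38u+18v$ and $18u+34v$. The following is a small sketch; the function name `chain_rule_partials` and the sample points are hypothetical choices for illustration.

```python
def chain_rule_partials(u, v):
    """Evaluate dz/du and dz/dv for z = 3x^2 - 2xy + y^2, x = 3u + 2v, y = 4u - v."""
    x = 3 * u + 2 * v
    y = 4 * u - v
    # Partial derivatives of z with respect to the intermediate variables.
    dz_dx = 6 * x - 2 * y
    dz_dy = -2 * x + 2 * y
    # Partial derivatives of the intermediate variables (constants here).
    dx_du, dx_dv = 3, 2
    dy_du, dy_dv = 4, -1
    # One product per path through the tree diagram, summed per independent variable.
    dz_du = dz_dx * dx_du + dz_dy * dy_du
    dz_dv = dz_dx * dx_dv + dz_dy * dy_dv
    return dz_du, dz_dv

for u, v in [(1, 0), (0, 1), (2, -3)]:
    dz_du, dz_dv = chain_rule_partials(u, v)
    print((u, v), dz_du, 38 * u + 18 * v, dz_dv, 18 * u + 34 * v)
```

Because all quantities are integers at these sample points, the chain-rule values and the simplified formulas should match exactly.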
Try it
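Calculate $\partial z/\partial u$ and $\partial z/\partial v$ given the following functions:
$$z=f(x,y)=\frac{2x-y}{x+3y},\quad x(u,v)=e^{2u}\cos 3v,\quad y(u,v)=e^{2u}\sin 3v$$
Solution
$$\frac{\partial z}{\partial u}=0,\quad \frac{\partial z}{\partial v}=\frac{-21}{(3\sin 3v+\cos 3v)^2}$$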
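One way to see why the partial derivative with respect to $u$ vanishes is to substitute first and notice that the common factor $e^{2u}$ cancels. The derivation below is a sketch of that alternative route (direct substitution followed by the quotient rule), not the chain-rule computation the exercise asks for, but it arrives at the same answer.

```latex
\begin{aligned}
z &= \frac{2x-y}{x+3y}
   = \frac{e^{2u}(2\cos 3v-\sin 3v)}{e^{2u}(\cos 3v+3\sin 3v)}
   = \frac{2\cos 3v-\sin 3v}{\cos 3v+3\sin 3v},
\qquad\text{so } \frac{\partial z}{\partial u}=0, \\[4pt]
\frac{\partial z}{\partial v}
  &= \frac{3\left[(-2\sin 3v-\cos 3v)(\cos 3v+3\sin 3v)-(2\cos 3v-\sin 3v)(-\sin 3v+3\cos 3v)\right]}{(\cos 3v+3\sin 3v)^2} \\
  &= \frac{3\left[-7\sin^2 3v-7\cos^2 3v\right]}{(\cos 3v+3\sin 3v)^2}
   = \frac{-21}{(3\sin 3v+\cos 3v)^2}.
\end{aligned}
```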
The Generalized Chain Rule
Now that we’ve seen how to extend the original chain rule to functions of two variables, it is natural to ask: Can we extend the rule to more than two variables? The answer is yes, as the generalized chain rule states.
Theorem: Generalized Chain Rule
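Let $w=f(x_1,x_2,\ldots,x_m)$ be a differentiable function of $m$ independent variables, and for each $i\in\{1,\ldots,m\}$, let $x_i=x_i(t_1,t_2,\ldots,t_n)$ be a differentiable function of $n$ independent variables. Then
$$\frac{\partial w}{\partial t_j}=\frac{\partial w}{\partial x_1}\frac{\partial x_1}{\partial t_j}+\frac{\partial w}{\partial x_2}\frac{\partial x_2}{\partial t_j}+\cdots+\frac{\partial w}{\partial x_m}\frac{\partial x_m}{\partial t_j}$$
for any $j\in\{1,2,\ldots,n\}$.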
In the next example we calculate the derivative of a function of three independent variables in which each of the three variables is dependent on two other variables.
Example: using the generalized Chain Rule
Calculate ∂w/∂u and ∂w/∂v using the following functions:
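$$w=f(x,y,z)=3x^2-2xy+4z^2,\quad x=x(u,v)=e^{u}\sin v,\quad y=y(u,v)=e^{u}\cos v,\quad z=z(u,v)=e^{u}.$$
Solution
The formulas for $\partial w/\partial u$ and $\partial w/\partial v$ are
$$\frac{\partial w}{\partial u}=\frac{\partial w}{\partial x}\cdot\frac{\partial x}{\partial u}+\frac{\partial w}{\partial y}\cdot\frac{\partial y}{\partial u}+\frac{\partial w}{\partial z}\cdot\frac{\partial z}{\partial u}$$
$$\frac{\partial w}{\partial v}=\frac{\partial w}{\partial x}\cdot\frac{\partial x}{\partial v}+\frac{\partial w}{\partial y}\cdot\frac{\partial y}{\partial v}+\frac{\partial w}{\partial z}\cdot\frac{\partial z}{\partial v}.$$
Therefore, there are nine different partial derivatives that need to be calculated and substituted. We need to calculate each of them:
$$\begin{aligned}
\frac{\partial w}{\partial x}&=6x-2y, & \frac{\partial w}{\partial y}&=-2x, & \frac{\partial w}{\partial z}&=8z,\\
\frac{\partial x}{\partial u}&=e^{u}\sin v, & \frac{\partial y}{\partial u}&=e^{u}\cos v, & \frac{\partial z}{\partial u}&=e^{u},\\
\frac{\partial x}{\partial v}&=e^{u}\cos v, & \frac{\partial y}{\partial v}&=-e^{u}\sin v, & \frac{\partial z}{\partial v}&=0.
\end{aligned}$$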
Now, we substitute each of them into the first formula to calculate ∂w/∂u:
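$$\frac{\partial w}{\partial u}=\frac{\partial w}{\partial x}\cdot\frac{\partial x}{\partial u}+\frac{\partial w}{\partial y}\cdot\frac{\partial y}{\partial u}+\frac{\partial w}{\partial z}\cdot\frac{\partial z}{\partial u}=(6x-2y)e^{u}\sin v-2xe^{u}\cos v+8ze^{u},$$
then substitute $x(u,v)=e^{u}\sin v$, $y(u,v)=e^{u}\cos v$, and $z(u,v)=e^{u}$ into this equation:
$$\begin{aligned}
\frac{\partial w}{\partial u}&=(6x-2y)e^{u}\sin v-2xe^{u}\cos v+8ze^{u}\\
&=(6e^{u}\sin v-2e^{u}\cos v)e^{u}\sin v-2(e^{u}\sin v)e^{u}\cos v+8e^{2u}\\
&=6e^{2u}\sin^2 v-4e^{2u}\sin v\cos v+8e^{2u}\\
&=2e^{2u}\left(3\sin^2 v-2\sin v\cos v+4\right).
\end{aligned}$$
Next, we calculate $\partial w/\partial v$:
$$\frac{\partial w}{\partial v}=\frac{\partial w}{\partial x}\cdot\frac{\partial x}{\partial v}+\frac{\partial w}{\partial y}\cdot\frac{\partial y}{\partial v}+\frac{\partial w}{\partial z}\cdot\frac{\partial z}{\partial v}=(6x-2y)e^{u}\cos v-2x(-e^{u}\sin v)+8z(0),$$
then we substitute $x(u,v)=e^{u}\sin v$, $y(u,v)=e^{u}\cos v$, and $z(u,v)=e^{u}$ into this equation:
$$\begin{aligned}
\frac{\partial w}{\partial v}&=(6x-2y)e^{u}\cos v-2x(-e^{u}\sin v)\\
&=(6e^{u}\sin v-2e^{u}\cos v)e^{u}\cos v+2(e^{u}\sin v)(e^{u}\sin v)\\
&=2e^{2u}\sin^2 v+6e^{2u}\sin v\cos v-2e^{2u}\cos^2 v\\
&=2e^{2u}\left(\sin^2 v+3\sin v\cos v-\cos^2 v\right).
\end{aligned}$$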
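The generalized chain rule is a sum of products, one product per intermediate variable, which makes it straightforward to express as a short routine. The sketch below is one possible way to organize the computation for this example; the names `generalized_chain_rule`, `example_partials`, and `closed_form`, and the sample point $(u,v)=(0.3,1.1)$, are hypothetical. The closed forms in the code come from expanding the substituted expressions, for instance $6e^{2u}\sin v\cos v-2e^{2u}\cos^2 v+2e^{2u}\sin^2 v$ for $\partial w/\partial v$.

```python
import math

def generalized_chain_rule(grad_w, jacobian):
    """Return [dw/dt_1, ..., dw/dt_n].

    grad_w[i] is dw/dx_i, and jacobian[i][j] is dx_i/dt_j.
    """
    n = len(jacobian[0])
    return [
        sum(grad_w[i] * jacobian[i][j] for i in range(len(grad_w)))
        for j in range(n)
    ]

def example_partials(u, v):
    """Chain-rule values of dw/du and dw/dv for w = 3x^2 - 2xy + 4z^2."""
    eu = math.exp(u)
    x, y, z = eu * math.sin(v), eu * math.cos(v), eu
    grad_w = [6 * x - 2 * y, -2 * x, 8 * z]
    jacobian = [
        [eu * math.sin(v), eu * math.cos(v)],   # dx/du, dx/dv
        [eu * math.cos(v), -eu * math.sin(v)],  # dy/du, dy/dv
        [eu, 0.0],                              # dz/du, dz/dv
    ]
    return generalized_chain_rule(grad_w, jacobian)

def closed_form(u, v):
    """Simplified expressions obtained after substituting x, y, and z."""
    s, c, e2u = math.sin(v), math.cos(v), math.exp(2 * u)
    dw_du = 2 * e2u * (3 * s**2 - 2 * s * c + 4)
    dw_dv = 2 * e2u * (s**2 + 3 * s * c - c**2)
    return [dw_du, dw_dv]

print(example_partials(0.3, 1.1))
print(closed_form(0.3, 1.1))
```

The two printed lists should agree up to floating-point rounding. Keeping the gradient and the table of partial derivatives as separate inputs mirrors the structure of the theorem, so the same `generalized_chain_rule` helper would apply to any number of intermediate and independent variables.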
Try it
Calculate ∂w/∂u and ∂w/∂v given the following functions:
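$$w=f(x,y,z)=\frac{x+2y-4z}{2x-y+3z},\quad x=x(u,v)=e^{2u}\cos 3v,\quad y=y(u,v)=e^{2u}\sin 3v,\quad z=z(u,v)=e^{2u}.$$
Solution
$$\frac{\partial w}{\partial u}=0,\quad \frac{\partial w}{\partial v}=\frac{15-33\sin 3v+6\cos 3v}{(3+2\cos 3v-\sin 3v)^2}$$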
Example: drawing a tree diagram
Create a tree diagram for the case when
w=f(x,y,z),x=x(t,u,v),y=y(t,u,v),z=z(t,u,v)
and write out the formulas for the three partial derivatives of w.
Solution
Starting from the left, the function f has three independent variables: x, y, and z. Therefore, three branches must be emanating from the first node. Each of these three branches also has three branches, for each of the variables t, u, and v.
Figure 3. Tree diagram for a function of three variables, each of which is a function of three independent variables.
The three formulas are
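$$\frac{\partial w}{\partial t}=\frac{\partial w}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial w}{\partial y}\frac{\partial y}{\partial t}+\frac{\partial w}{\partial z}\frac{\partial z}{\partial t}$$
$$\frac{\partial w}{\partial u}=\frac{\partial w}{\partial x}\frac{\partial x}{\partial u}+\frac{\partial w}{\partial y}\frac{\partial y}{\partial u}+\frac{\partial w}{\partial z}\frac{\partial z}{\partial u}$$
$$\frac{\partial w}{\partial v}=\frac{\partial w}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial w}{\partial y}\frac{\partial y}{\partial v}+\frac{\partial w}{\partial z}\frac{\partial z}{\partial v}.$$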
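Reading formulas off a tree diagram is mechanical enough to sketch in code: for each independent variable, collect one product per branch that ends at that variable. The following is an illustrative sketch only; the function name `chain_rule_formulas` and the dictionary representation of the tree are assumptions, and the output uses plain `d` in place of $\partial$ (it does not distinguish ordinary from partial derivatives).

```python
def chain_rule_formulas(top, tree):
    """Build one chain-rule formula (as a string) per independent variable.

    `tree` maps each intermediate variable to the independent variables
    it depends on, e.g. {"x": ["t", "u", "v"], ...}.
    """
    # Collect the independent variables in order of first appearance.
    independents = []
    for leaves in tree.values():
        for t in leaves:
            if t not in independents:
                independents.append(t)

    formulas = {}
    for t in independents:
        # Each term follows one path: top -> intermediate -> independent variable.
        terms = [
            f"(d{top}/d{x})(d{x}/d{t})"
            for x, leaves in tree.items()
            if t in leaves
        ]
        formulas[t] = f"d{top}/d{t} = " + " + ".join(terms)
    return formulas

tree = {"x": ["t", "u", "v"], "y": ["t", "u", "v"], "z": ["t", "u", "v"]}
formulas = chain_rule_formulas("w", tree)
for line in formulas.values():
    print(line)
```

Passing a smaller tree, such as one with only `x` and `y` as intermediate variables, produces the corresponding two-term formulas in the same way.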
Try it
Create a tree diagram for the case when
w=f(x,y),x=x(t,u,v),y=y(t,u,v)
and write out the formulas for the three partial derivatives of w.
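Solution
$$\frac{\partial w}{\partial t}=\frac{\partial w}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial w}{\partial y}\frac{\partial y}{\partial t}$$
$$\frac{\partial w}{\partial u}=\frac{\partial w}{\partial x}\frac{\partial x}{\partial u}+\frac{\partial w}{\partial y}\frac{\partial y}{\partial u}$$
$$\frac{\partial w}{\partial v}=\frac{\partial w}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial w}{\partial y}\frac{\partial y}{\partial v}.$$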
Figure 4.