So here's one approach: let Nn denote the size of your population at time n without the additional condition that p(0,1)=1 and let (Xi,j)1≤i,j<∞ be i.i.d. random variables distributed according to your offspring distribution. That is, P(X1,1=k)=pk for all nonnegative integers k. Then:
ENn+1=E(∑k=1NnXn+1,k)=ENnEXn+1,1=μE(Nn)
The second equality here follows from the fact that all of the Xn+1,k's are independent of Nn. From this recurrence, we deduce that ENn=N0μn. Thus,
ENn=N0μn=∑k=1∞kP(Nn=k)≥∑k=1∞P(Nn=k)=1−P(Nn=0)
Rearranging, this implies that:
P(Nn=0)≥1−N0μn
Let τ0 denote the hitting time of 0 by Nn. Then P(Nn=0)=P(τ0≤n) since once our population dies, it stays dead. Thus, it follows that:
P(τ0>n)≤N0μn
From here, since τ0 is a nonnegative random variable, the layer- cake representation tells us that:
Eτ0=∑k=1∞P(τ0≥k)≤1+∑k=1∞P(τ0>k)≤1+∑k=1∞N0μk=1+N0μ1−μ<∞
The summability of this last geometric series is where we made use of the fact that μ<1. Notice now that since your Markov chain behaves the same way as the original branching process up until the point we hit 0, τ0 is unchanged when we add on the condition that p(0,1)=1. In summa, we see that Eτ0<∞ as desired and so the resulting Markov process is positive recurrent.