In this paper we demonstrate limits to the performance of sensors based on terminal restriction fragment length polymorphism (T-RFLP). The analysis uses Shannon and Rényi information entropy to measure the uncertainty of the a priori and a posteriori probability distributions over bacterial organisms. The conditional ‘information gain’ then serves as a metric for the effectiveness of a sensor based on either a single restriction enzyme or multiple restriction enzymes. We demonstrate this principle by computing the information gain and the number of distinguishable bacterial ‘equivalence classes’ for typical restriction enzymes used in the phylogenetic analysis of bacterial colonies in the human gut.
Typical T-RFLP instrumental methods combine digests from multiple restriction enzymes. For this case, we formulate and solve for the maximum entropy a posteriori probability distribution over bacterial organisms. We further show how to optimize information gain in two characteristic cases. The first is the optimal combination of multiple digests to form a single T-RFLP chromatogram, which leads to a large-scale convex optimization problem. The second, multiple enzymes with multiple length chromatograms, leads to an intractably large optimization problem, for which we nevertheless propose performance metrics.
These ideas are demonstrated using literature data for T-RFLP in medical and ecological applications.
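As a minimal illustration of the information gain and equivalence class notions described above, the following sketch treats a single enzyme as an idealized, noise-free sensor that maps each organism to one terminal fragment length. The organism names, fragment lengths, uniform prior, and function names are all hypothetical and chosen only for illustration; this is not the formulation or the data used in the paper.

```python
import math
from collections import defaultdict


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def equivalence_classes(fragments):
    """Group organisms that share a terminal fragment length.

    Organisms in the same group are indistinguishable to this sensor.
    """
    classes = defaultdict(list)
    for organism, length in fragments.items():
        classes[length].append(organism)
    return dict(classes)


def information_gain(prior, fragments):
    """Prior entropy minus the expected posterior entropy, in bits."""
    h_prior = shannon_entropy(prior.values())
    h_post = 0.0
    for members in equivalence_classes(fragments).values():
        p_class = sum(prior[m] for m in members)
        if p_class > 0:
            # Posterior over organisms given the observed fragment length.
            posterior = [prior[m] / p_class for m in members]
            h_post += p_class * shannon_entropy(posterior)
    return h_prior - h_post


# Hypothetical toy data: fragment lengths (base pairs) for one enzyme.
fragments = {"A": 120, "B": 120, "C": 300, "D": 450}
# Assumed uniform a priori distribution over the four organisms.
prior = {name: 0.25 for name in fragments}

classes = equivalence_classes(fragments)
gain = information_gain(prior, fragments)
print(len(classes), "equivalence classes; information gain =", gain, "bits")
```

In this toy case organisms A and B share a fragment length, so the sensor distinguishes three classes rather than four, and the gain falls short of the 2 bits of prior entropy.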