The assembly resulted in 95 contigs with a total length of 5.4 Mb. Mapping the contigs to available genomes from the Rhizobiaceae family showed that Agrobacterium tumefaciens is the most similar to ATCC 31749. However, not all contigs mapped to A. tumefaciens, indicating differences between the two organisms.
Genes previously reported to influence curdlan production were identified in the ATCC 31749 sequences. The curdlan synthesis operon crdASC, which has homologous sequences in the linear chromosome of A. tumefaciens, was identified, as was the phosphatidylserine synthase gene (pssAG), which has a homologous sequence in the circular chromosome of A. tumefaciens. Other relevant metabolic genes were also identified, including genes for sucrose hydrolysis and for nitrogen regulation and fixation.
Despite the presence of crdASC and pssAG homologs, curdlan production has not been detected in A. tumefaciens. Possible explanations include a lack of expression of these genes or the absence of other, as yet unidentified, genes involved in curdlan production. Genes related to sugar import and transport were identified in ATCC 31749 sequences that do not map to the A. tumefaciens genome. The available sequences offer an opportunity to identify genes relevant to curdlan production, improve our understanding of metabolic regulation in ATCC 31749, and reveal targets for future metabolic engineering.