Date: 2019-08-12

"The bacterial genome is like a book with long strings of letters, only some of which encode the information necessary to make proteins," says Bhatt. "Traditionally, we identify the presence of protein-coding genes within this book by searching for combinations of letters that indicate the 'start' and 'stop' signals that sandwich genes. This works well for larger proteins. But the smaller the protein, the more likely that this technique yields large numbers of false positives that muddy the results."
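The start-and-stop search Bhatt describes can be illustrated with a minimal sketch. It is not the method her group used; it is a simplified scan that assumes only `ATG` as a start signal, the three standard stop codons, and the forward strand alone. The function name `find_orfs` and the `min_codons` parameter are hypothetical, chosen for illustration.

```python
import random

# Assumption: ATG is the only start signal and the three standard stop
# codons apply. Real bacterial gene finders also consider alternative
# starts and the reverse strand.
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """Return (start, end, frame) for each ATG...stop stretch, forward strand only.

    min_codons counts codons from the start codon up to, but not
    including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # open a candidate at the first start signal
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the candidate and keep scanning
    return orfs


if __name__ == "__main__":
    # Random letters contain no real genes, so every hit here is a false positive.
    rng = random.Random(0)
    noise = "".join(rng.choice("ACGT") for _ in range(30000))
    for threshold in (10, 50, 100):
        print(threshold, len(find_orfs(noise, min_codons=threshold)))
```

Run on random letters, which by construction contain no genes, the scan can never report more candidates as the length threshold rises. Lowering the threshold to catch small proteins is what lets the spurious hits back in, which is the trade-off the quote points to.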