National Institutes of Health

Eunice Kennedy Shriver National Institute of Child Health and Human Development

2023 Annual Report of the Division of Intramural Research

Regulatory Small RNAs and Small Proteins

Gigi Storz
  • Gisela Storz, PhD, Head, Section on Environmental Gene Regulation
  • Aixia Zhang, PhD, Staff Scientist
  • Aisha Burton Okala, PhD, Postdoctoral Fellow
  • Rajat Dhyani, PhD, Postdoctoral Fellow
  • Chelsey R. Fontenot, PhD, Postdoctoral Fellow
  • Shuwen Shan, PhD, Postdoctoral Fellow
  • Narumon Thongdee, PhD, Postdoctoral Fellow
  • Rilee D. Zeinert, PhD, Postdoctoral Fellow
  • Aoshu Zhong, PhD, Postdoctoral Fellow
  • Dennis X. Zhu, PhD, Postdoctoral Fellow
  • Miranda Alaniz, BS, Postbaccalaureate Fellow
  • Amanda P. Brewer, BS, Postbaccalaureate Fellow
  • Juwaan Douglas-Jenkins, BS, Postbaccalaureate Fellow
  • Tiara D. Tillis, BA, Postbaccalaureate Fellow

The group currently has two main interests: identification and characterization of small noncoding RNAs (sRNAs), and identification and characterization of small proteins of less than 50 amino acids. Both small RNAs and small proteins have been overlooked because they are not detected in biochemical assays, and the corresponding genes are missed by genome annotation and are poor targets for genetic approaches. However, both classes of small molecules are being found to have important regulatory roles in organisms ranging from bacteria to humans.

Identification and characterization of small regulatory RNAs

During the past 20 years, we have carried out several different systematic screens for small regulatory RNAs in Escherichia coli. The screens included computational searches for conservation of intergenic regions and direct detection after size selection or co-immunoprecipitation with RNA–binding proteins. Most recently, we have been using deep sequencing approaches to map the 5′ and 3′ ends of all transcripts to further extend our identification of small RNAs in a range of bacterial species [Reference 1]. This work showed that sRNAs are encoded by diverse loci, including sequences overlapping mRNAs.

A major focus for the group has been to elucidate the functions of the small RNAs that we and others identified. Early on, we showed that the OxyS RNA, whose expression is induced in response to oxidative stress, acts to repress translation through limited base pairing with target mRNAs. We discovered that OxyS action is dependent on the Sm-like Hfq protein, which acts as a chaperone to facilitate OxyS RNA base pairing with its target mRNAs. Follow-up studies allowed us to learn more about the mechanism by which the Hfq protein facilitates base pairing through multiple RNA–binding domains [Reference 2]. We also started to explore the role of ProQ, a second RNA chaperone in E. coli, and, by comparing the sRNA–mRNA interactomes by deep sequencing, found that ProQ and Hfq have overlapping as well as competing roles in the cell. It is likely that still other RNA–binding proteins, such as KH domain proteins, are involved in small RNA–mediated regulation [Reference 3].
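
The idea of limited base pairing between an sRNA and a target mRNA can be illustrated with a small sequence search. The sketch below is not the group's method: it only looks for the longest contiguous stretch of Watson-Crick complementarity between two RNA strings, ignores G-U wobble pairs, RNA structure, and chaperones such as Hfq, and uses made-up sequences and a hypothetical function name (longest_pairing).

```python
# Complement table for RNA (Watson-Crick pairs only; G-U wobble is ignored here).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def longest_pairing(srna, mrna):
    """Find the longest contiguous antiparallel duplex between srna and mrna.

    Returns (length, srna_start, mrna_start), where the start positions are
    0-based offsets of the paired region in each input string.
    Implemented as a longest-common-substring search between the reverse
    complement of the sRNA and the mRNA.
    """
    target = reverse_complement(srna)  # what the mRNA must contain to pair
    mrna = mrna.upper()
    best = (0, -1, -1)
    prev = [0] * (len(mrna) + 1)
    for a in range(1, len(target) + 1):
        cur = [0] * (len(mrna) + 1)
        for b in range(1, len(mrna) + 1):
            if target[a - 1] == mrna[b - 1]:
                cur[b] = prev[b - 1] + 1
                if cur[b] > best[0]:
                    # Map the end of the match in `target` back to sRNA coordinates.
                    best = (cur[b], len(srna) - a, b - cur[b])
        prev = cur
    return best


# Hypothetical sequences, for illustration only.
length, srna_start, mrna_start = longest_pairing("GGAUCCUAC", "AAAGUAGGAUUU")
print(length, srna_start, mrna_start)
```

A real target search would also weigh pairing energy, accessibility of the site, and conservation, which this sketch leaves out.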

Hfq–binding small RNAs, which act through limited base pairing, are integral to many different stress responses in E. coli and other bacteria, as well as during the interaction between bacteria and bacteriophage. For example, we showed that the Spot 42 RNA, whose levels are highest when glucose is present, plays a broad role in catabolite repression by directly repressing genes involved in central and secondary metabolism, redox balancing, and the consumption of diverse non-preferred carbon sources. Similarly, we found that an sRNA derived from the 3′ UTR (untranslated region) of the glnA mRNA, encoding glutamine synthetase, impacts E. coli growth under low nitrogen conditions by modulating the expression of genes that affect carbon and nitrogen flux [Reference 4]. We recently discovered four UTR–derived sRNAs (UhpU, MotR, FliX, and FlgO), whose expression is controlled by the flagella sigma factor σ28 (fliA), and reported that MotR and FliX modulate the timing of flagella synthesis in E. coli [Reference 5], work that illustrated how sRNA–mediated regulation can overlay a complex network, enabling nuanced control of flagella synthesis. A recent collaborative study of Vibrio cholerae revealed that the QrrX RNA controls quorum sensing dynamics and biofilm formation [Reference 6]. As more and more sRNAs encoded by 5′ or 3′ UTRs, or that are internal to coding sequences, are being found, our observations raise the possibility that phenotypes currently attributed to protein defects are the result of deficiencies in unappreciated regulatory RNAs.

One interesting recent observation is that some small RNAs have dual functions in that they act both by base pairing and by encoding a small regulatory protein. For example, we discovered that the Spot 42 RNA also encodes a 15–amino acid protein (denoted SpfP) [Aoyama JJ, Raina M, Zhong A, Storz G. Dual-function Spot 42 RNA encodes a 15-amino acid protein that regulates the CRP transcription factor. Proc Natl Acad Sci USA 2022;119:e2119866119]. Overexpression of either the small protein alone, from a Spot 42 derivative deficient in base-pairing activity, or the base-pairing activity alone, from a Spot 42 derivative with a stop codon mutation, prevented growth on galactose, revealing that the small protein and the small RNA impact the same pathway. Copurification experiments showed that SpfP binds to the CRP (cAMP receptor protein) transcription factor, affecting the kinetics of induction when cells are shifted from glucose to galactose medium. Thus, the small protein reinforces the feedforward loop regulated by the base-pairing activity of the Spot 42 RNA. As a second example, we found that a 164–nucleotide RNA previously shown to encode a 28–amino acid protein (denoted AzuC) also base pairs with the cadA (encoding lysine decarboxylase, which is involved in maintaining pH homeostasis) and galE (encoding UDP-glucose 4-epimerase) mRNAs to block expression [Raina M, Aoyama JJ, Bhatt S, Paul BJ, Zhang A, Updegrove TB, Miranda-Ríos J, Storz G. Dual-function AzuCR RNA modulates carbon metabolism. Proc Natl Acad Sci USA 2022;119:e2117930119]. Interestingly, AzuC translation interferes with the observed repression of cadA and galE by the RNA, and base pairing interferes with AzuC translation, demonstrating that the translation and base-pairing functions compete. We hypothesize that many more dual-function RNAs remain to be discovered and suggest that they can be exploited to control gene expression at many levels [Aoyama JJ, Storz G. Two for one: regulatory RNAs that encode small proteins. Trends Biochem Sci 2023;48:1035-1043].

In addition to small RNAs that act via limited base pairing, we have been interested in regulatory RNAs that act by other mechanisms. For instance, early work showed that the 6S RNA binds to and modulates RNA polymerase by mimicking the structure of an open promoter. In another study, we discovered that a broadly conserved RNA structure motif, the yybP-ykoY motif, found in the 5′ UTR of the mntP gene encoding a manganese exporter, directly binds manganese, resulting in a conformation that liberates the ribosome-binding site.

Further studies to characterize other Hfq– and ProQ–binding RNAs and their physiological roles and evolution, as well as regulatory RNAs that act in ways other than base pairing, are ongoing.

Identification and characterization of small proteins

In our genome-wide screens for small RNAs, we found that a number of short RNAs actually encode small proteins. The correct annotation of the smallest proteins is one of the biggest challenges of genome annotation. Furthermore, there is limited evidence that proteins are synthesized from annotated and predicted short ORFs (open reading frames). Although these proteins have largely been missed, the few small proteins that have been studied in detail in bacterial and mammalian cells were shown to have important functions in regulation, signaling, and cellular defenses [Gray T, Storz G, Papenfort K. Small proteins; big questions. J Bacteriol 2022;204:e0034121]. We thus established a project to identify and characterize proteins of less than 50 amino acids.

We first used sequence conservation and ribosome binding-site models to predict genes encoding small proteins of 16–50 amino acids in the intergenic regions of the E. coli genome. We tested expression of these predicted as well as previously annotated small proteins by integrating the sequential peptide affinity tag directly upstream of the stop codon on the chromosome and assaying for synthesis using immunoblot assays. The approach confirmed that 20 previously annotated and 18 newly discovered proteins of 16–50 amino acids are synthesized. We also carried out a complementary approach based on genome-wide ribosome profiling of ribosomes arrested on start codons to identify many additional candidates; the presence of 38 of these small proteins was confirmed by chromosomal tagging. These studies, together with the work of others, documented that E. coli synthesizes over 150 small proteins.
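
As a rough illustration of the prediction step described above, the following Python sketch scans a DNA sequence for open reading frames of 16–50 codons that are preceded by a Shine-Dalgarno-like motif. It is a simplified stand-in, not the models used in the study: the motif pattern, the upstream window, the restriction to ATG start codons and the forward strand, and the function name find_small_orfs are all assumptions made for the example, and sequence conservation is not considered.

```python
import re

START = "ATG"                      # alternative starts (GTG, TTG) are ignored here
STOPS = {"TAA", "TAG", "TGA"}
# Simplified Shine-Dalgarno-like core; an assumption, not the report's model.
SD_MOTIF = re.compile(r"AGGA|GGAG|GAGG")


def find_small_orfs(seq, min_aa=16, max_aa=50, sd_window=(5, 15)):
    """Return (start, end, n_aa) for forward-strand ORFs of min_aa..max_aa codons.

    An ORF is reported only if an SD-like motif lies entirely within the
    region sd_window[1]..sd_window[0] nucleotides upstream of the start codon.
    `end` is the position just past the stop codon (0-based, exclusive).
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] != START:
            continue
        # Walk codon by codon, in frame, until the first stop codon.
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOPS:
                n_aa = (j - i) // 3  # codons before the stop, start included
                if min_aa <= n_aa <= max_aa:
                    lo = max(0, i - sd_window[1])
                    hi = max(0, i - sd_window[0])
                    if SD_MOTIF.search(seq[lo:hi]):
                        hits.append((i, j + 3, n_aa))
                break
    return hits


# Hypothetical intergenic sequence: SD-like motif, spacer, then a 20-codon ORF.
example = "TTTAGGAGGTTTTTT" + "ATG" + "GCT" * 19 + "TAA"
print(find_small_orfs(example))
```

Candidates found this way would still need experimental confirmation, such as the chromosomal tagging and immunoblot assays described above.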

Many of the initially discovered proteins were predicted to consist of a single transmembrane alpha-helix and were found to be in the inner membrane. Interestingly, despite their diminutive size, small membrane proteins display considerable diversity in topology and insertion pathways. Additionally, systematic assays for the accumulation of tagged versions of the proteins showed that many small proteins accumulate under specific growth conditions or after exposure to stress.

We are using the tagged derivatives and information about synthesis and subcellular localization, along with many of the approaches the group used to characterize the functions of small regulatory RNAs, to elucidate the functions of the small proteins. The combined approaches are beginning to give insights into how the small proteins act in E. coli. For example, we discovered that the 49–amino acid inner membrane protein AcrZ, whose synthesis is increased in response to noxious compounds such as antibiotics and oxidizing agents, associates with the inner membrane AcrB component of the AcrAB–TolC multidrug efflux pump. Mutants lacking AcrZ are sensitive to many, but not all, of the antibiotics transported by AcrAB–TolC as the result of AcrZ effects on the conformation of the AcrB drug-binding pocket. We also found that synthesis of the 42–amino acid protein MntS is repressed by the MntR transcription factor when manganese levels are high. The lack of MntS leads to reduced activities of manganese-dependent enzymes under manganese-poor conditions, while overproduction of MntS leads to very high intracellular manganese and bacteriostasis under manganese-rich conditions. These and other phenotypes led us to propose that MntS modulates intracellular manganese levels, possibly by inhibiting the manganese exporter MntP. Additionally, we showed that the 31–amino acid inner membrane protein MgtS, whose synthesis is induced at very low magnesium by the PhoPQ two-component regulatory system, acts to raise intracellular magnesium levels and maintain cell integrity upon magnesium depletion. Upon development of a functional tagged derivative of MgtS, we found that MgtS interacts with MgtA to increase the levels of this P-type ATPase magnesium transporter under magnesium-limiting conditions. Correspondingly, the effects of MgtS upon magnesium limitation are lost in an mgtA mutant, and MgtA overexpression can suppress the mgtS phenotype. MgtS stabilization of MgtA provides an additional layer of regulation of this tightly controlled magnesium importer. Unexpectedly, we found that MgtS also interacts with and modulates the activity of a second protein, the PitA cation–phosphate symporter, to further increase intracellular magnesium levels.

The ribosome profiling used to identify the intergenic-encoded small proteins revealed that there is significant translation initiation within larger open reading frames in the E. coli genome. All five E. coli genes encoding Rpn (recombination-promoting nuclease) proteins have such an internal translation site. We showed that the small, highly variable Rpn C–terminal domains (RpnS), which are translated separately from the full-length proteins (RpnL), directly block the activities of the toxic full-length RpnL proteins, constituting a novel toxin-antitoxin system [Zhong A, Jiang X, Hickman AB, Klier K, Teodoro GIC, Dyda F, Laub MT, Storz G. Toxic antiphage defense proteins inhibited by intragenic antitoxin proteins. Proc Natl Acad Sci USA 2023;120:e2307382120]. The crystal structure of RpnAS revealed a dimerization interface encompassing a helix that can have four amino acid repeats, whose number varies widely among strains of the same species. Consistent with strong selection for the variation, we documented that plasmid-encoded RpnP2L protects E. coli against certain phages, indicating that the RpnL–RpnS proteins represent a novel antiphage defense system (Figure 1). We propose that many more intragenic-encoded small proteins that serve regulatory roles remain to be discovered in all organisms.

Figure 1. Model for functions of RpnL and RpnS proteins

In normally growing cells, the RpnS proteins, synthesized from a translation initiation site internal to the rpnL coding sequences, associate with and block the DNA endonuclease activities of the longer RpnL proteins. Upon phage infection, the RpnS proteins dissociate, and the RpnL proteins constitute an antiphage defense. The structure of the RpnS dimer (green) was solved by X-ray crystallography, while the structures of the RpnL-RpnS tetramer (left) and RpnL dimer (right) were predicted with AlphaFold–Multimer.
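
Internal translation initiation sites of the kind described above were found experimentally by ribosome profiling. Purely as an illustration of what "in-frame internal start" means at the sequence level, the sketch below lists in-frame start codons inside an annotated ORF that would give rise to a short C-terminal product. The function name internal_inframe_starts, the set of start codons, the 50-residue cutoff, and the toy sequence are assumptions for the example; a sequence scan like this produces candidates only and does not show that translation actually initiates at any of them.

```python
def internal_inframe_starts(orf, max_aa=50, starts=("ATG", "GTG", "TTG")):
    """List in-frame internal start codons in an ORF (start codon through stop codon).

    Returns (nucleotide_offset, n_aa) tuples, where n_aa is the length of the
    product translated from that internal start to the end of the ORF.
    Only products of at most max_aa residues are reported.
    """
    orf = orf.upper()
    n_codons = len(orf) // 3 - 1       # coding codons, stop codon excluded
    candidates = []
    for k in range(1, n_codons):       # k = 0 is the annotated start; skip it
        codon = orf[3 * k:3 * k + 3]
        if codon in starts:
            n_aa = n_codons - k
            if n_aa <= max_aa:
                candidates.append((3 * k, n_aa))
    return candidates


# Toy ORF: annotated start, 10 codons, an internal ATG, 4 codons, stop.
toy_orf = "ATG" + "GCT" * 10 + "ATG" + "GCT" * 4 + "TAA"
print(internal_inframe_starts(toy_orf))
```

In practice, candidates from such a scan would be filtered against ribosome profiling signal at the internal start before any tagging experiments.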

The ribosome profiling also revealed that some regulatory RNAs encode a small protein and are thus dual-function RNAs. As mentioned above, we documented that the 109–nucleotide Spot 42 RNA, one of the best-characterized base-pairing small RNAs (sRNAs) in E. coli, encodes a 15–amino acid protein (denoted SpfP), which binds to the global transcriptional regulator CRP. The binding blocks the ability of CRP to activate specific genes, impacting the kinetics of induction when cells are shifted from glucose to galactose medium. Thus, the small protein reinforces the feedforward loop regulated by the base-pairing activity of the Spot 42 RNA. Another 164–nucleotide RNA was previously shown to encode a 28–amino acid, amphipathic-helix protein (denoted AzuC). We discovered that the membrane-associated AzuC protein interacts with GlpD, the aerobic glycerol-3-phosphate dehydrogenase, and increases dehydrogenase activity [Raina M, Aoyama JJ, Bhatt S, Paul BJ, Zhang A, Updegrove TB, Miranda-Ríos J, Storz G. Dual-function AzuCR RNA modulates carbon metabolism. Proc Natl Acad Sci USA 2022;119:e2117930119]. Overexpression of the RNA encoding AzuC results in a growth defect in glycerol and galactose medium. The defect in galactose medium was still observed for a stop codon mutant derivative, consistent with a second, base-pairing role for the RNA. Interestingly, the MgtS protein mentioned above is encoded divergently from the MgrR small regulatory RNA, which is also important for bacterial adaptation to low magnesium. We constructed synthetic dual-function RNAs comprising MgrR and MgtS. Such constructs allowed us to probe how the organization of the coding and base-pairing sequences and the distance between the two components contribute to the proper function of both activities of a dual-function RNA. By understanding the features of natural and synthetic dual-function RNAs, future synthetic molecules can be designed to maximize their regulatory impact.

Our work, along with related findings by others in eukaryotic cells, supports our hypothesis that small proteins are an overlooked but important class of proteins, which we continue to study.

Additional Funding

  • NICHD Early Career Award
  • NIGMS Postdoctoral Research Associate (PRAT) Program
  • Scientific Director's Award 2023–2024

Publications

  1. Petroni E, Esnault C, Tetreault D, Dale RK, Storz G, Adams PP. Extensive diversity in RNA termination and regulation revealed by transcriptome mapping for the Lyme pathogen Borrelia burgdorferi. Nat Commun 2023;14:3931.
  2. Kavita K, Zhang A, Tai CH, Majdalani N, Storz G, Gottesman S. Multiple in vivo roles for the C-terminal domain of the RNA chaperone Hfq. Nucleic Acids Res 2022;50:1718–1733.
  3. Olejniczak M, Jiang X, Basczok MM, Storz G. KH-domain proteins: another family of bacterial RNA matchmakers? Mol Microbiol 2022;117:10–19.
  4. Walling LR, Kouse AB, Shabalina SA, Zhang H, Storz G. A 3' UTR-derived small RNA connecting nitrogen and carbon metabolism in enteric bacteria. Nucleic Acids Res 2022;50:10093–10109.
  5. Melamed S, Zhang A, Jarnik M, Mills J, Silverman A, Zhang H, Storz G. σ28-dependent small RNA regulation of flagella biosynthesis. eLife 2023;12:RP87151.
  6. Huber M, Lippegaus A, Melamed S, Siemers M, Wucher BR, Hoyos M, Nadell C, Storz G, Papenfort K. An RNA sponge controls quorum sensing dynamics and biofilm formation in Vibrio cholerae. Nat Commun 2022;13:7585.

Collaborators

  • Shantanu Bhatt, PhD, Department of Biology, Saint Joseph's University, Philadelphia, PA
  • Ryan K. Dale, PhD, Bioinformatics and Scientific Programming Core, NICHD, Bethesda, MD
  • Fred Dyda, PhD, Laboratory of Molecular Biology, NIDDK, Bethesda, MD
  • Caroline Esnault, PhD, Bioinformatics and Scientific Programming Core, NICHD, Bethesda, MD
  • Susan Gottesman, PhD, Laboratory of Molecular Biology, Center for Cancer Research, NCI, Bethesda, MD
  • Todd Gray, PhD, Wadsworth Center, New York State Department of Health, Albany, NY
  • Alison B. Hickman, PhD, Laboratory of Molecular Biology, NIDDK, Bethesda, MD
  • Xiaofang Jiang, PhD, National Library of Medicine, NIH, Bethesda, MD
  • Michael T. Laub, PhD, Massachusetts Institute of Technology, Cambridge, MA
  • Nadim Majdalani, PhD, Laboratory of Molecular Biology, Center for Cancer Research, NCI, Bethesda, MD
  • Juan Miranda-Ríos, PhD, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Mexico City, Mexico
  • Carey D. Nadell, PhD, Department of Biological Sciences, Dartmouth College, Hanover, NH
  • Mikolaj Olejniczak, PhD, Institute of Molecular Biology and Biotechnology, Adam Mickiewicz University, Poznan, Poland
  • Kai Papenfort, PhD, Institute of Microbiology, Friedrich-Schiller-Universität, Jena, Germany
  • Brian J. Paul, PhD, International Flavors & Fragrances Inc., Wilmington, DE
  • Svetlana A. Shabalina, PhD, National Library of Medicine, NIH, Bethesda, MD
  • Chin-Hsien Tai, PhD, Laboratory of Molecular Biology, Center for Cancer Research, NCI, Bethesda, MD
  • Taylor B. Updegrove, PhD, Laboratory of Molecular Biology, Center for Cancer Research, NCI, Bethesda, MD
  • Joseph T. Wade, PhD, Wadsworth Center, New York State Department of Health, Albany, NY
  • Henry Zhang, PhD, Bioinformatics and Scientific Programming Core, NICHD, Bethesda, MD

Contact

For more information, email storzg@mail.nih.gov or visit https://www.nichd.nih.gov/research/atNICHD/Investigators/storz.
