A transfected attacin A construct faithfully reports Toll and Imd signaling
To explore how pathway responsiveness is encoded at the DNA level, we used a molecular genetic analysis of a single locus as the foundation for a global bioinformatic approach. Attacin A (AttA), an antimicrobial peptide gene, is responsive to both Toll and Imd signaling (
Asling et al, 1995;
Lemaitre et al, 1997;
De Gregorio et al, 2002;
Hedengren‐Olcott et al, 2004). Using cultured
Drosophila Schneider (S2*) cells, which express all three fly NF‐κB proteins and mediate robust Toll and Imd responses (
Samakovlis et al, 1990;
Han and Ip, 1999), we assayed transcription of an AttA reporter construct (
Figure 1A) upon innate immune signaling.
To stimulate the Toll pathway, we applied epidermal growth factor (EGF) to cells expressing EGFR‐Toll, a chimera fusing the extracellular and transmembrane domains of the human EGF receptor to the intracellular domain of Toll (
Sun et al, 2004). To specifically induce the Imd pathway, we used a peptidoglycan (PG) preparation from
Bacillus subtilis (
Leulier et al, 2003;
Stenbak et al, 2004). The AttA reporter construct exhibited a robust response to either inducer, with 5‐ to 11‐fold activation on EGF treatment and a 12‐ to 28‐fold increase on exposure to PG.
To further validate the cultured cell system, we used RNA interference (RNAi) to inactivate innate immune response loci in the Toll or Imd signaling cascades. RNAi against genes in the Toll pathway—
MyD88,
tube, or
pelle—specifically blocked induction by EGF, but not PG (
Figure 1B). Similarly, inactivation of Imd pathway components—
imd,
Tak1, or
key (IKKγ)—eliminated induction by PG, but not EGF. AttA induction by either innate immune pathway was strictly dependent on endogenous NF‐κB factors (
Figure 1C). For the Toll response, Dif and Dorsal had overlapping function, with either factor alone being sufficient for some signaling. For the Imd response, Relish alone was necessary and sufficient. The S2* cell system thus effectively recapitulates endogenous regulation of the AttA gene.
Promoter proximal κB sites govern AttA induction by the Toll and Imd pathways
The transcriptional start site for AttA lies approximately 650 bp downstream of the 3′ end of the neighboring transcription unit (
Figure 1A). Within the intergenic region lie four potential κB sites (
Dushay et al, 2000;
Senger et al, 2004). As is characteristic of prototypical κB motifs, the 5′ half‐site in each case contains either GGG or GGGG.
Using site‐directed mutagenesis to inactivate or reposition the κB‐related motifs, we assayed motif function in directing expression of the AttA reporter gene. Inactivation of pairs of sites revealed that only the κB motifs at positions −46 and −118 were necessary for activation by either Imd or Toll (
Figure 2A). Furthermore, each of these two sites, κB1 and κB2, preferentially mediates signaling by one innate immune pathway (
Figure 2B). Constructs containing one or more copies of κB1, but not κB2, responded more strongly to Imd than to Toll. Likewise, the presence of κB2, but not κB1, directed a stronger response to Toll than to Imd.
The sequence of a single κB site can thus determine the signaling pathway to which a Drosophila innate immune gene is preferentially responsive.
Bioinformatic analysis reveals that Toll‐ and Imd‐responsive loci differ in κB site structure and sequence
To establish whether there is a general κB sequence code for innate immunity, we carried out a bioinformatics analysis based on published microarray data sets for innate immune responses in wild‐type and mutant strains of
Drosophila (
De Gregorio et al, 2001,
2002). To identify loci responsive to Toll but not Imd, we screened for strong induction by fungal infection and by a constitutively active Toll receptor, as well as a dependence on a functional Toll pathway for induction by bacterial infection. Similarly, we identified genes specifically responsive to Imd as those that responded robustly to bacterial infection only in the presence of a functional Imd pathway, but were not appreciably induced by fungal infection. Using quantitative expressions of these criteria to define screening algorithms (see Materials and methods), we identified 16 Toll‐responsive genes and 11 Imd‐responsive loci. We noted good agreement of these two gene sets with a classification based on clustering of temporal expression patterns (
Boutros et al, 2002), eleven of the Toll loci and nine of the Imd loci being common to both analyses.
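The screening logic described above can be sketched as a simple filter over per-gene induction values. This is a minimal illustration only: the field names, the `STRONG` and `WEAK` cutoffs, and the example numbers are hypothetical stand-ins, since the actual quantitative criteria are given in Materials and methods and are not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical cutoffs (fold induction); the real thresholds are in Materials and methods.
STRONG = 3.0
WEAK = 1.5

@dataclass
class Induction:
    fungal: float              # fold induction after fungal infection
    active_toll: float         # fold induction with a constitutively active Toll receptor
    bacterial_wt: float        # fold induction after bacterial infection, wild type
    bacterial_toll_mut: float  # same infection, Toll pathway mutant
    bacterial_imd_mut: float   # same infection, Imd pathway mutant

def classify(gene: Induction) -> str:
    """Assign a gene to the Toll-specific set, the Imd-specific set, or neither."""
    toll_specific = (
        gene.fungal >= STRONG
        and gene.active_toll >= STRONG
        and gene.bacterial_toll_mut < WEAK   # bacterial induction depends on Toll
    )
    imd_specific = (
        gene.bacterial_wt >= STRONG
        and gene.bacterial_imd_mut < WEAK    # bacterial induction depends on Imd
        and gene.fungal < WEAK               # not appreciably induced by fungi
    )
    if toll_specific:
        return "Toll"
    if imd_specific:
        return "Imd"
    return "neither"

# Made-up values for illustration only.
print(classify(Induction(6.0, 5.0, 4.0, 1.1, 3.8)))   # Toll
print(classify(Induction(1.2, 1.0, 9.0, 8.5, 1.0)))   # Imd
```

Because the Toll criteria require strong fungal induction and the Imd criteria require its absence, the two sets are mutually exclusive under any cutoffs with `STRONG` greater than `WEAK`.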
The results of the bioinformatic analysis were striking. For both the Toll‐ and Imd‐responsive gene sets, MEME analysis identified a κB‐type sequence as the highest scoring motif (
Table I). The Toll and Imd gene sets were decidedly different, however, with regard to κB site composition and number.
Whereas nearly two‐thirds (62%) of κB motifs in the Imd set had a GGGGA 5′ half‐site, such a half‐site was absent from the κB motifs in the Toll set. The Toll κB motifs instead typically had either GGGA (12 examples) or GGAA (3 examples) as the 5′ half‐site (
Table II). Moreover, the difference in half‐site sequence was not an accident of motif definition during MEME analysis. A scan of the entire 3.2 kb sequence space comprising the upstream regions for the Toll gene set detected only a single example of GGGGA or its reverse complement, TCCCC; a parallel scan of the smaller (2.2 kb) sequence space for the Imd upstream regions detected 15 such instances.
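A scan of this kind, counting a half‐site and its reverse complement across a set of upstream sequences, can be sketched in a few lines. The function names and the toy sequence are illustrative assumptions, not the tool used in the study; only the hit counts and sequence lengths in the last line come from the text.

```python
import re

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def count_both_strands(seq: str, half_site: str = "GGGGA") -> int:
    """Count occurrences of half_site on either strand (overlapping matches allowed)."""
    seq = seq.upper()
    # A set avoids double counting if the pattern is its own reverse complement.
    patterns = {half_site, reverse_complement(half_site)}
    return sum(len(re.findall(f"(?={p})", seq)) for p in patterns)

def hits_per_kb(hits: int, length_kb: float) -> float:
    """Normalize a hit count by the size of the sequence space scanned."""
    return hits / length_kb

toy = "AAGGGGATTTCCCCAA"  # made-up sequence: one GGGGA and one TCCCC
print(count_both_strands(toy))  # 2

# Densities implied by the counts reported above: 1 hit in 3.2 kb vs 15 hits in 2.2 kb.
print(round(hits_per_kb(1, 3.2), 2), round(hits_per_kb(15, 2.2), 2))  # 0.31 6.82
```

Normalizing by sequence length makes the contrast explicit: roughly 0.3 instances per kb in the Toll upstream regions against roughly 6.8 per kb in the Imd upstream regions.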
Divergence between the potential κB sites in the Toll and Imd gene sets extended throughout the motifs. Representative Toll‐responsive κB motifs had four or five bases between the G cluster and the first C residue, for example, GGGAAAACCC. Conversely, those Imd elements containing strings of G's and C's typically had a two or three base separation, for example, GGGGATTCCT. Statistically, these differences were marked. Overall, a 4–5 bp (A,T)‐rich region separated G's and C's in 94% of the Toll motifs, but only 14% of the Imd motifs. Similarly, we found a 2–3 bp (A,T)‐rich region separating G's and C's in 52% of the Imd motifs, but only one of the Toll motifs.
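The spacing rule can be made concrete with a small helper that measures the (A,T) run between the leading G cluster and the first C residue. This is a simplified heuristic written for illustration, assuming a motif that begins with at least two G's; it is not the classification procedure used in the analysis, and the labels it returns are hypothetical names.

```python
import re
from typing import Optional

# Leading G cluster, then an (A,T)-only spacer, then the first C residue.
SPACER = re.compile(r"^(G{2,})([AT]+)C")

def spacer_length(motif: str) -> Optional[int]:
    """Length of the (A,T) run between the G cluster and the first C, or None."""
    match = SPACER.match(motif.upper())
    return len(match.group(2)) if match else None

def spacing_class(motif: str) -> str:
    """Label a motif by the spacing categories described in the text."""
    n = spacer_length(motif)
    if n in (4, 5):
        return "Toll-like spacing"
    if n in (2, 3):
        return "Imd-like spacing"
    return "unclassified"

for motif in ("GGGAAAACCC", "GGGGATTCCT", "GGGGATCCCC", "GGGGATTTTT"):
    print(motif, spacer_length(motif), spacing_class(motif))
```

For the two examples given in the text, the helper returns a spacer of 4 for GGGAAAACCC and 3 for GGGGATTCCT. A motif with no C residues, such as GGGGATTTTT, is left unclassified by this rule.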
In approximately half of the Imd motifs, the 3′ half‐site diverged significantly from a canonical κB motif. In place of two or more C residues, the 3′ half‐site of these motifs consisted largely or entirely of a string of T residues, for example, GGGGATTTTT. Studies in mammalian systems have demonstrated that motifs of this type, that is, those having only a single cognate κB half‐site, can nevertheless bind specifically to Rel proteins
in vitro and exhibit
cis‐regulatory activity
in vivo (see, e.g.,
Whitley et al, 1994).
The Toll‐ and Imd‐responsive gene sets differed not only in κB site sequence, but also in κB site number. Of the 14 Toll genes for which κB sites were detected, 12 had only a single presumptive κB site and none had more than two sites. In contrast, eight of the nine Imd genes with predicted κB sites had two or more sites.
A κB sequence code governs fly innate immunity
To determine whether the observed differences between the κB sites in the Toll and Imd gene sets correspond to a cis‐regulatory code, we returned to the cultured cell system. We inactivated the −118 site in the AttA reporter construct, introduced synthetic versions of the κB consensus motifs at the −46 position, and assayed responses to innate immune signaling.
The consensus κB motifs defined by the MEME analysis mediated pathway‐specific transcriptional responses in the context of the AttA reporter (
Figure 2C). In response to Toll signaling, the Toll consensus sequence, GGGAAAACCC, directed reporter gene expression many times that seen for the wild‐type gene. In contrast, the response to Imd signaling was well below that of the wild type. The two Imd consensus sites—GGGGATCCCC and GGGGATTTTT—also discriminated between pathways (
Figure 2C). On Imd induction, both sites provided a significant increase in reporter gene expression, as substantial as that seen for any single site introduced into AttA (
Figure 2B and unpublished results). On Toll induction, the increase in reporter expression with either Imd consensus site was an order of magnitude less than with the Toll consensus site.
The κB sequence code functions by specifying NF‐κB protein binding
There are at least two distinct mechanisms by which a κB sequence code could dictate specific transcriptional responses. First, NF‐κB proteins activated by different pathways could vary significantly in their affinity for particular κB motifs. Available data implicate this mechanism in the IKKβ‐independent NF‐κB response in humans (
Bonizzi et al, 2004). Second, a single κB sequence could bind a range of NF‐κB proteins indiscriminately, but motif sequence could determine which coactivators interact with bound NF‐κB proteins. Such a model can explain the regulation of several mammalian genes by pairs of κB motifs (
Leung et al, 2004).
In exploring which mechanism is operative in flies, we used a gel shift assay to determine whether or not NF‐κB proteins differ in their target site‐binding preference. We expressed Relish and DIF from pGEX vectors in bacteria, purified the GST fusion proteins by affinity chromatography, and carried out binding assays with labeled oligonucleotides.
The gel shift assays demonstrated that the Relish and Dif GST fusion proteins bind synthetic κB motifs
in vitro with a specificity identical to that observed
in vivo for signaling by Imd and Toll, respectively (
Figure 3A). Using GST–Relish, we observed a strong gel shift signal for the Imd‐specific sites GGGGATCCCC and GGGGATTCCC. Relish also bound well to the noncanonical Imd site, GGGGATTTTT, but not to the Toll consensus sequence, GGGAAAACCC. Results with GST–DIF were the inverse: strong binding to the Toll consensus site, but not the Imd consensus motifs.
The correlation between interaction seen in the gel shift assay and pathway specificity
in vivo held true not only for synthetic κB sites, but also for κB sites found upstream of innate immune loci (
Figure 3B). For example, GST–Relish gave a strong gel shift signal with the κB site from an Imd‐specific locus, diptericin, but had no detectable interaction with the κB site from a Toll‐specific gene, IM1. Furthermore, GST–Relish appeared to interact to a much greater extent with the −46 motif from AttA than with the −118 motif, consistent with the specificity observed in cells. GST–DIF exhibited complementary specificity, binding to the greatest extent with the IM1 motif and the −118 motif from AttA.
Given the consistent relationship between pathway responsiveness and Rel protein‐binding specificity, we extended our gel shift analysis to all 38 of the sites identified by the MEME program. Thirty‐four of the 38 sites bound to either DIF or Relish (
Table II). Of the 16 κB sites identified in the Toll gene set, 15 had significant binding to GST–DIF and none bound to GST–Relish. The results with the Imd motifs were likewise very clear. Of the 18 potential κB sites in the Imd gene set, 15 bound well to Relish and none bound to DIF.
DIF and Relish tolerate κB site variation
Together, the bioinformatic, cell culture, and gel shift analyses revealed that the differential response to the Toll and Imd pathways of
Drosophila was governed by a κB sequence code that directs binding of DIF and Relish. We noted, however, variation in both the sequence and length of the active κB sites bound by DIF and Relish. Work in mammalian systems had demonstrated that NF‐κB proteins interact with a variety of related target sites via alternative patterns of hydrogen bonding, with DNA binding mediated by flexible polypeptide loops (
Chen et al, 2000). To determine the extent to which DIF and Relish accommodate κB sequence variation, we extended our DNA binding and cell culture studies of synthetic κB motifs.
Illustrative examples from our parallel studies of κB site activity
in vivo and
in vitro are presented in
Figure 3A and
Table III. The preferred half‐site for Toll responsiveness and DIF binding was 5′GGGAA, with an additional G at the 5′ end being well tolerated. The preferred 5′ half‐site for Imd responsiveness and Relish binding was GGGGA, with a GGGAA tolerated in palindromic sites. Both pathways tolerate some variation in motif length (
Table III, compare rows 1 and 2 and rows 3 and 4). For example, the sites that bind Dif and are most responsive to Toll can have either one or no central residue separating half‐sites based on GGGAA.
The κB sequence code can specify dual responsiveness
Expression studies have shown that many innate immune loci are dual‐responsive, that is, activated by either Toll or Imd signaling. The cell culture studies revealed that dual responsiveness can reside in a single κB site. As shown by the examples in
Table III (rows 6 and 7), some κB sequences bind strongly to both Relish and DIF. Such sites were not identified in our bioinformatic study, as expected for an analysis restricted to loci responsive only to a single pathway.
We envision two mechanisms for dual responsiveness. One would be regulation by a single site that responds to either pathway. Alternatively, a dual‐responsive gene could have multiple
cis‐regulatory sites, with at least one responsive to each pathway. Regulation of metchnikowin (Mtk), a locus that has a robust response to both Toll and Imd (
De Gregorio et al, 2001), appears to reflect an amalgamation of the two strategies. The Mtk gene contains three potential κB sites upstream of the transcriptional start. As predicted from bioinformatic analysis and confirmed by gel shift studies (
Supplementary Figure 2), one, GGGAAGTCCCC, binds to both DIF and Relish, whereas two others, GGGGACTTTTT and GGGGAACCC, bind only to Relish.
κB‐site sequence and number regulate the immune response
Turning to the question of how κB site number influences transcriptional responses, we investigated two additional innate immune loci, IM1 and AttD (
Figure 4A). Using luciferase reporter genes and site‐directed mutagenesis, we demonstrated that the Toll‐specific IM1 gene has a single functional κB site, whereas the Imd‐specific AttD gene contains three κB sites that are each required for a wild‐type response (
Supplementary Figure 3). We generated a pair of additional constructs for each gene, replacing one endogenous site with a synthetic site specific for Relish or DIF binding.
Within the divergent contexts of the IM1 and AttD loci, the Relish‐ and Dif‐specific κB sites had behaviors comparable to those observed in the AttA reporter. In particular, when we compared expression induced by each pathway, the ratio of the Imd response to the Toll response was in each case greater for the Relish site than for the Dif site. The κB sequence code thus functions predictably in the context of loci varying in pathway responsiveness and κB site number.
In the course of our investigation of both the AttA and AttD genes, we observed a greater than additive effect of inducing the Toll and Imd pathways simultaneously (
Figure 4B). For each gene, the fold‐increase in reporter gene expression on activating both Toll and Imd was significantly greater than the sum of the increases seen upon activating the two pathways individually. Furthermore, concomitant Toll and Imd activation had an apparently multiplicative effect not only in these loci, but also in drosomycin, which in S2 cells has a significant response only to Toll (
Figure 4C).
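The distinction between additive and multiplicative behavior can be expressed as a comparison of the combined fold induction against two reference values. The sketch below is illustrative only: it assumes that "the sum of the increases" means the sum of the two single‐pathway fold inductions, the function name is hypothetical, the example values are invented, and it performs no statistical test.

```python
def compare_combined(fold_toll: float, fold_imd: float, fold_both: float) -> dict:
    """Compare combined induction with additive and multiplicative expectations."""
    additive = fold_toll + fold_imd        # sum of the single-pathway fold inductions
    multiplicative = fold_toll * fold_imd  # product of the single-pathway fold inductions
    return {
        "greater_than_additive": fold_both > additive,
        "ratio_to_additive": fold_both / additive,
        "ratio_to_product": fold_both / multiplicative,
    }

# Invented fold inductions, for illustration only.
result = compare_combined(fold_toll=8.0, fold_imd=20.0, fold_both=150.0)
print(result["greater_than_additive"])       # True
print(round(result["ratio_to_product"], 4))  # 0.9375
```

A ratio to the product close to 1 is what an apparently multiplicative effect would look like in this framing, whereas a ratio to the sum close to 1 would indicate a merely additive response.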
To explore the origin of these effects in drosomycin expression, we mapped and mutated the functional κB sites (
Supplementary Figure 3). There were two such sites, varying significantly in their behavior. The site at −303 bound DIF
in vitro; inactivation of this site eliminated responsiveness to Toll. The site at −139 bound to Relish; inactivation of this site left the response to Toll signaling largely intact, but abrogated the synergistic effect of activating both pathways. We interpret these results to indicate that Relish bound at −139 is in itself insufficient to direct significant gene expression, but that the presence of Relish at the −139 site enhances the transcriptional response directed by Dif bound at the −303 site. Consistent with this hypothesis, we observed only additive effects for the pathways with a drosomycin construct having a single, albeit dual‐responsive, site (
Figure 4C).
Taken together, the studies of AttA, AttD, and Drs reveal that pairs of κB sites can act together to enhance the strength of a transcriptional response. Furthermore, a greater than additive effect can arise from pairs of sites responsive to the same pathway (
Figure 2B) or to different pathways (
Figure 4B and C). The most parsimonious explanation for these observations is an interaction between κB protein homodimers bound to distinct κB sites. Based on similar findings, Ip and co‐workers suggest the formation of active heterodimers when both Dif and Relish sites are present (
Tanji et al, 2007), an explanation we cannot rule out.