Nucleotide excision repair (NER) excises a variety of environmentally derived DNA lesions. However, NER efficiencies for structurally different DNA lesions can vary by orders of magnitude, and the origin of this variance is poorly understood. Our goal is to develop computational strategies that identify the most hazardous, repair-resistant lesions among the plethora of such adducts. In the present work, we focus on lesion recognition by the xeroderma pigmentosum C protein complex (XPC), the first, required step for the subsequent assembly of the factors needed for successful NER. We have performed molecular dynamics simulations to characterize the initial binding of Rad4, the yeast orthologue of human XPC, to a library of 10 different lesion-containing DNA duplexes derived from environmental carcinogens. The lesions vary in chemical structure and in conformation within duplex DNA, and they exhibit a wide range of relative NER efficiencies, from repair-resistant to highly susceptible. We have determined a promising set of structural descriptors that characterize the initial binding of Rad4 to lesions that are resistant to NER. Key initial binding requirements for successful recognition are absent in the repair-resistant cases: there is little or no duplex unwinding, very limited interaction between the β-hairpin domain 2 of Rad4 and the minor groove of the lesion-containing duplex, and no conformational capture of a base on the lesion partner strand. By contrast, these key binding features are present to different degrees in NER-susceptible lesions and correlate with their relative NER efficiencies. Furthermore, we have gained a molecular understanding of how lesion structure in duplex DNA determines Rad4 initial binding and how the initial binding relates to repair efficiency. The development of a computational strategy for identifying NER-resistant lesions is grounded in this molecular understanding of the lesion recognition mechanism.