The conditional logit model is a standard workhorse for estimating customers' product-feature preferences from choice data. Using these models at scale, however, can result in numerical imprecision and optimization failure due to the combination of large-valued covariates and the softmax probability function. Standard machine learning practice alleviates these concerns by applying a normalization scheme to the matrix of covariates, scaling all values to lie within some interval (such as the unit interval). While this type of normalization is innocuous when the model is used for prediction, it has the side effect of perturbing the estimated coefficients, which are the quantities of interest for researchers conducting inference. This paper shows that, for two common classes of normalizers, designated scaling and centered scaling, the data-generating (non-scaled) model parameters can be analytically recovered, along with their asymptotic distributions. The paper also demonstrates the numerical performance of the analytical results using a scaling normalizer as an example.
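As a minimal illustration of the recovery idea (not the paper's implementation), the sketch below assumes a min-max normalizer that maps each covariate to the unit interval: it simulates conditional logit choices with one deliberately large-valued covariate, fits the model on the normalized data, and recovers the data-generating coefficients by dividing each fitted coefficient by its feature's range. The simulation setup, parameter values, and variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n_obs, n_alts, n_feats = 2000, 4, 3
beta_true = np.array([1.5, -0.002, 0.8])      # hypothetical data-generating parameters

# Covariates vary across alternatives; feature 1 is deliberately large-valued,
# the kind of covariate that destabilizes the softmax without normalization.
X = rng.normal(size=(n_obs, n_alts, n_feats))
X[:, :, 1] *= 1000.0

utils = X @ beta_true
probs = np.exp(utils - utils.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)
y = np.array([rng.choice(n_alts, p=p) for p in probs])

# Min-max scale each feature to the unit interval, pooling observations and alternatives.
lo = X.reshape(-1, n_feats).min(axis=0)
hi = X.reshape(-1, n_feats).max(axis=0)
X_scaled = (X - lo) / (hi - lo)

def neg_loglik(beta, X, y):
    u = X @ beta
    u -= u.max(axis=1, keepdims=True)         # numerically stable softmax
    return -(u[np.arange(len(y)), y] - np.log(np.exp(u).sum(axis=1))).sum()

beta_scaled = minimize(neg_loglik, np.zeros(n_feats), args=(X_scaled, y)).x

# In a conditional logit the centering constants cancel in the choice
# probabilities, so only the per-feature ranges (hi - lo) perturb the
# coefficients; dividing them back out recovers the original-scale betas
# (and, by the delta method, standard errors rescale by the same factors).
beta_recovered = beta_scaled / (hi - lo)
print("true:     ", beta_true)
print("recovered:", beta_recovered)
```

The same division-by-range logic applies to the estimated standard errors under this normalizer, since the recovered coefficient is a linear transformation of the scaled one; the paper derives the general asymptotic distributions for both normalizer classes.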