Old Wine in New Bottles
The fashion industry is subject to recurring cycles of popularity that are regular enough to be dubbed the “20-year rule.” Activewear, clothing suitable for both the gym and the street, was fashionable in the 1990s and again, 20 years later, in the 2010s. The same is true of intellectual fashion. The British economist Dennis Robertson once wrote, “Now, as I have often pointed out to my students, some of whom have been brought up in sporting circles, high-brow opinion is like a hunted hare; if you stand in the same place, or nearly the same place, it can be relied upon to come round to you in a circle.” In the same way, today’s data miners have rediscovered several statistical tools that were once fashionable. These tools have been given new life because they are mathematically complex, indeed beautifully complex, and many data miners are easily seduced by mathematical beauty. Too few think about whether the underlying assumptions make sense and whether the conclusions are reasonable.
Consider data mining with multiple regression models. Rummaging through a large database looking for the combination of explanatory variables that gives the best fit can be daunting. With 100 variables to choose from, there are more than 17 trillion possible combinations of 10 explanatory variables. With 1,000 possible explanatory variables, there are nearly a trillion trillion possible combinations of 10 explanatory variables. With 1 million possible explanatory variables, the number of 10-variable combinations grows to nearly a million trillion trillion trillion trillion (if we were to write it out, there would be 54 zeros). Stepwise regression was born back when computers were much slower than today. It has become a popular data-mining tool because it is less computationally demanding than a full search over all possible combinations of explanatory variables and, it is hoped, still gives a reasonable approximation to the results of a full search.
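The combination counts quoted above can be checked with a few lines of Python. This is only a sketch: the function name n_subsets is made up for illustration, and it assumes that a "combination" means an unordered choice of 10 distinct variables from the candidates.

```python
import math

def n_subsets(n_candidates, k=10):
    """Number of ways to choose k distinct explanatory variables
    from n_candidates, ignoring order (hypothetical helper)."""
    return math.comb(n_candidates, k)

# 100 candidates gives 17,310,309,456,440, i.e. "more than 17 trillion".
for n in (100, 1_000, 1_000_000):
    print(f"{n:>9,} candidates: {n_subsets(n):.3e} ten-variable subsets")
```

Under that assumption, the counts for 1,000 and 1 million candidates come out on the order of 10^23 and 10^53, which is why a full search quickly becomes impractical.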
The stepwise label comes from the fact that the calculations go through a number of steps, considering potential explanatory variables one by one. There are three main stepwise procedures. A forward-selection rule starts with the one explanatory variable that has the highest correlation with the variable being predicted. Then the procedure adds a second variable, the variable that improves the fit the most.
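The two forward-selection steps described so far can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure as any particular statistics package implements it: the function names and toy data are invented, "fit" is measured by the reduction in the sum of squared residuals, an intercept is included by centering every series, "highest correlation" is read as largest in absolute value, and the loop runs for a fixed number of steps rather than applying a stopping rule.

```python
def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def remove_component(vec, direction):
    """Return vec minus its least-squares projection onto direction."""
    coef = dot(vec, direction) / dot(direction, direction)
    return [v - coef * d for v, d in zip(vec, direction)]

def forward_select(columns, y, n_steps):
    """Greedy forward selection (hypothetical helper).

    columns: dict mapping a variable name to its list of observations.
    Returns the names in the order they were added.
    """
    # Centering is equivalent to fitting every regression with an intercept.
    resid_y = center(y)
    resid_x = {name: center(col) for name, col in columns.items()}
    chosen = []
    for _ in range(n_steps):
        best_name, best_gain = None, 0.0
        for name, col in resid_x.items():
            norm = dot(col, col)
            if norm < 1e-12:  # nothing new left in this candidate
                continue
            # Drop in the sum of squared residuals if this variable is added.
            # On the first pass this ranks candidates by squared correlation with y.
            gain = dot(col, resid_y) ** 2 / norm
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break
        chosen.append(best_name)
        direction = resid_x.pop(best_name)
        # Strip out what the newly added variable already explains.
        resid_y = remove_component(resid_y, direction)
        resid_x = {n: remove_component(c, direction) for n, c in resid_x.items()}
    return chosen

# Toy data (made up): y depends on x1 and x2 only; x3 is a distraction.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [1, -1, 1, -1, 1, -1, 1, -1]
x3 = [1, 1, -1, -1, 1, 1, -1, -1]
y = [2 * a + b for a, b in zip(x1, x2)]

order = forward_select({"x1": x1, "x2": x2, "x3": x3}, y, n_steps=2)
print(order)
```

In this toy data x2 happens to have zero correlation with y on its own, yet it is the variable that improves the fit most once x1 is in the model. That is one reason the order in which a stepwise procedure considers variables matters.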