A minor project for caching and aggregating NHL prospect and career data in SQLite for complex queries.
Several devout hockey-enthusiast friends and I theorized a novel metric, p, based on the Pareto principle, as a predictor of how young pre-NHL prospects will perform in their first year of professional hockey (which could be used to influence draft choices and trades).
The pareto-hockey-populate repo maintained a local cache of prospect data from several hockey data aggregators, such as EliteProspects, allowing faster access and more complicated queries over the data than the API allowed.
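As a rough illustration of the caching idea, here is a minimal sketch using Python's built-in sqlite3 module. The table name `season_stats`, its columns, and the helpers `build_cache` and `career_totals` are hypothetical and are not taken from the actual repo; the point is only that once rows are cached locally, cross-season aggregates become a single SQL query.

```python
import sqlite3


def build_cache(rows):
    """Load (player, season, league, games, goals, points) tuples into SQLite.

    The schema is an assumption made for illustration only.
    """
    conn = sqlite3.connect(":memory:")  # a file path would persist the cache
    conn.execute(
        """CREATE TABLE season_stats (
               player TEXT NOT NULL,
               season TEXT NOT NULL,
               league TEXT NOT NULL,
               games  INTEGER NOT NULL,
               goals  INTEGER NOT NULL,
               points INTEGER NOT NULL,
               PRIMARY KEY (player, season, league)
           )"""
    )
    # INSERT OR REPLACE lets a re-fetch of the same season overwrite stale rows.
    conn.executemany(
        "INSERT OR REPLACE INTO season_stats VALUES (?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return conn


def career_totals(conn, min_games=1):
    """Sum games and points per player across all cached seasons and leagues."""
    cur = conn.execute(
        """SELECT player, SUM(games) AS games, SUM(points) AS points
           FROM season_stats
           GROUP BY player
           HAVING SUM(games) >= ?
           ORDER BY points DESC, player ASC""",
        (min_games,),
    )
    return cur.fetchall()
```

Calling `career_totals(conn, min_games=100)` on a populated cache would return only players with at least 100 cached games, a filter that would otherwise take many separate remote requests.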
Unfortunately, the resultant data analysis did not provide the hoped-for results. While p was an excellent predictor of the points and goals a prospect earned in their first season, it and its derivative values performed no better than simpler metrics, like points per game or goals / team's goals.
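For reference, the two simpler baseline metrics named above can be sketched as follows. The function names and the choice to return 0.0 for empty denominators are assumptions for illustration; the definition of p itself is not given here, so it is not reproduced.

```python
def points_per_game(points, games):
    """Points divided by games played; 0.0 when no games were played."""
    if games <= 0:
        return 0.0
    return points / games


def goal_share(player_goals, team_goals):
    """Fraction of the team's goals scored by the player (goals / team's goals)."""
    if team_goals <= 0:
        return 0.0
    return player_goals / team_goals
```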