Discover more from Course Notes: Continuous Business Learning
Weapons of Math Destruction 1/6
The WMD (Weapons of Math Destruction) are algorithms that judge people on many criteria. But they’re opaque, unquestioned, and unaccountable, and they operate at a scale to sort, target or “optimize” millions of people. (direct quote)
1/ Bomb Parts: What is a Model?
A model is just an abstract representation of some process, usually utilized to predict outcomes and/or explain the past performance.
The people building models often lack the data they need to predict the subject’s behaviour, so they use proxies, draw statistical correlations with anything available about the person: a zip code, credit score, language patterns, etc. Some of this data is discriminatory (e.g., zip code) and illegal.
Models are simplifications, so they will never completely describe reality, but if the right variables and constraints are used, and the models are “trained” (i.e., the values in the formulas are adjusted based on the repeated “running” of the model), there’s a lot of practical value in them – in the form of probabilities of expected outcomes.
Whatever is not included in the model (i.e., “blind spots”) is an indication of what the creator of the model doesn’t find useful. These omissions can be deliberate or a result of poor model construction, laziness, or negligence. A model is only useful if its author makes it clear what was omitted and why.
Every model must “tell” what it is trying to accomplish: optimize a process, predict future results, ensure sufficient representation of a certain group of people, etc. The objective of a model is a reflection not only of the task at hand, but also of its author and, if it’s produced in a corporate environment, - of the corporate priorities.
Models have a useful life; over time they become less precise due to the variance of data and the emergence of unexpected variables.
The danger of simple models is that they may become persuasive enough (e.g., racism) to morph into beliefs; at this stage it’s very hard to change them because of the confirmation bias. Simplified beliefs can literally be the decisive factor in whether the person lives or dies, or at least gets a longer prison sentence.
Supposedly, using models that are blind to biases will lead to great improvements in fairness and consistency. But first it has to be proven that they don’t merely camouflage these biases with technology (say, the supposedly race-blind model being trained by a racist). Metadata (i.e. the data describing other data) reveals the needed details about a person without asking direct questions, which might be illegal. Bias usually hits where it hurts the most.
The outputs of models are rarely questioned, even if they are clearly biased. The unfair outputs of models may themselves be inputs into similar models (an ex-convict isn’t hired, can’t support themselves, steals, gets a longer sentence, thus increasing the risk of getting incarcerated yet again and again). This is a typical example of a WMD.
Models can be used dishonestly by not revealing the possible consequences to the person being profiled or questioned. This is true for both the law enforcement and the corporate settings. It’s quite clear why: like with security cameras – whoever knows the setup, knows how to avoid being caught. At the same time, even the most honest model can be distrusted for the mere reason of its existence.
Intellectual property rights are not helping create transparency: it’s very tempting to hide the inner workings and the suspected deficiencies of the models by claiming their proprietary nature. This may be a fair point: reverse engineering a model may be very hard, and it the model is self-learning (i.e., incorporates new data every time it runs), then this is an impossible task due to the limited number of external observations.
Models are by nature probabilistic, i.e., there will definitely be cases when they spit out false positives or false negatives; it sucks being the person on whom the model misfired. Wrong credit score can mean both reduced job prospects and increased mortgage payments. One single model glitch can propagate through an interconnected system affecting the entire person’s life. The danger is that the decisions based on the model outputs usually can’t be appealed. This creates the worst case of unaccountability.