Consider an Active Learning environment in which Crowd Sourcing serves as the Oracle. An attacker
could deploy malicious bots that label data erroneously on a large scale to disable the model. Besides
the Oracle, the Query Strategy is another potential point of attack. In a Pool-Based Sampling or Stream-Based
Selective Sampling scenario, an attacker could inject (unlabeled) malicious instances that appear
appealing to the strategy but impair the model. Even if the model is not impaired but merely not improved, this is
often a success for the attacker. In the application areas of Active Learning, data is often subject to Concept Drift.
Through this natural drift of the data, the model is then impaired little by little. These and other security problems are
presented and examined in various related research.
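The query-strategy attack described above can be sketched with a minimal toy example. The learner (a nearest-centroid classifier) and the concrete scoring rule (least-confidence Uncertainty Sampling) are illustrative assumptions, not taken from the cited work: an attacker-injected instance placed where the model is least confident is exactly the one a pool-based strategy queries next.

```python
import math
import random

random.seed(0)

# Small labeled seed set: two well-separated 1-D clusters.
labeled = [(random.gauss(-2, 0.5), 0) for _ in range(20)] + \
          [(random.gauss(2, 0.5), 1) for _ in range(20)]

# Nearest-centroid classifier as a stand-in for the learner.
c0 = sum(x for x, y in labeled if y == 0) / 20
c1 = sum(x for x, y in labeled if y == 1) / 20

def confidence(x):
    # Softmax over negative centroid distances; the larger class
    # probability is the model's confidence in its prediction.
    d0, d1 = abs(x - c0), abs(x - c1)
    p0 = math.exp(-d0) / (math.exp(-d0) + math.exp(-d1))
    return max(p0, 1.0 - p0)

# Unlabeled pool drawn from the same clusters, plus one attacker-injected
# instance placed exactly between the centroids (maximal uncertainty).
pool = [random.gauss(-2, 0.5) for _ in range(50)] + \
       [random.gauss(2, 0.5) for _ in range(50)]
pool.append((c0 + c1) / 2)

# Least-confidence Uncertainty Sampling: query the least confident instance.
query_idx = min(range(len(pool)), key=lambda i: confidence(pool[i]))

# The injected instance (the last pool index) is selected for labeling.
print(query_idx == len(pool) - 1)
```

The injected point sits equidistant from both centroids, so its confidence is exactly 0.5, the minimum possible, and the strategy spends its labeling budget on an instance the attacker controls.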
B. Miller et al. in [1] present an attack concept on a Pool-Based Active Learning environment that uses
Uncertainty Sampling as its Query Strategy. However, their concept assumes very simple data sets and considers
neither complex data distributions nor higher-dimensional features.
In this thesis, a new attack concept under very similar framework conditions is presented that can be applied to a
significantly wider spectrum of classification problems (arbitrarily distributed data of arbitrary size with an
arbitrary number of features).