Measuring Moderation
The role of moderation in digital spaces has never been more critical—or more complex. In an era where harmful content like hate speech, extremism, and disinformation proliferates, community managers are not only tasked with addressing these challenges but also with proving the effectiveness of their efforts.
At the 2023 All Things in Moderation conference, Dr. Andre Oboler, CEO of the Online Hate Prevention Institute, delivered a presentation on how we can better measure, refine, and future-proof our moderation strategies.
Here are some key takeaways from his session, tailored for community managers navigating the increasingly demanding landscape of trust and safety.
Framing Moderation Challenges
Dr. Oboler offered a helpful analogy: think of online hate, extremism, and other harmful behaviours as digital pollution. Like industrial waste, this content can be an unwanted by-product of digital engagement, damaging to social health and cohesiveness and difficult to manage. Framing these issues as pollution recasts the problem as one requiring systemic, scalable solutions rather than over-individualised, case-by-case responses.
This shift in perspective is timely. In 2023, the bar for moderation is rising. Regulators, advertisers, and the public are no longer satisfied with vague assurances of effort; they expect measurable outcomes and demonstrable accountability. As community managers on the front lines of these digital environments, we must ask ourselves: are we ready for this new era of transparency and scrutiny? And what role can we play, especially when we may not have full control?
4 Approaches to Measuring Moderation
Dr. Oboler outlined four key methods to examine moderation efforts, each with distinct strengths and limitations.
1. Demonstrating Problems
This approach is familiar to most of us: using specific examples to highlight harmful content or behaviour. Whether it’s a viral post spreading disinformation or hate speech on a platform, demonstrating the issue can draw public attention and spur immediate action.
While effective for raising awareness, this method has limitations. It’s reactive, often tackling symptoms rather than underlying causes, and it struggles to scale.
2. Counting Problems
Counting represents the next step up: aggregating harmful content into patterns and trends. This allows us to map the scope of an issue, identify systemic gaps, and evaluate where policies or interventions are falling short.
For example, the Online Hate Prevention Institute’s 2013 report on Islamophobia categorised 349 memes sourced from 50 anti-Muslim Facebook pages. By identifying patterns in the content and its spread, the report illuminated how these memes perpetuated harm at scale. (A new report was released in 2024.)
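As a loose illustration of what this aggregation looks like in practice (the category labels and counts below are hypothetical, not drawn from the report), even a simple tally of coded items shows which themes dominate and where interventions should focus:

```python
from collections import Counter

# Hypothetical category labels assigned to flagged items during review.
coded_items = [
    "dehumanising_meme", "conspiracy_claim", "dehumanising_meme",
    "targeted_harassment", "conspiracy_claim", "dehumanising_meme",
]

# Tally items per category and report each category's share of the total.
counts = Counter(coded_items)
total = sum(counts.values())
for category, count in counts.most_common():
    print(f"{category}: {count} ({count / total:.0%} of flagged items)")
```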
3. Manual Coding
Manual coding is the painstaking process of categorising content using a consistent framework, often involving multiple moderators to ensure accuracy. This approach builds the ‘source of truth’ needed to train and refine artificial intelligence (AI) tools for moderation.
However, Dr. Oboler stressed that human moderation isn’t just a stopgap until AI “catches up.” Errors in manual coding propagate into AI systems, and nuanced judgment is irreplaceable. Human expertise remains essential to effective moderation.
4. Modelling with AI
AI enables moderation at scale, applying trained models to vast datasets. While AI offers speed and efficiency, it has limitations. Models often struggle with edge cases, context, and emerging behaviours not reflected in their training data.
Dr. Oboler emphasised that AI and human moderation must work together. One enhances scale; the other ensures depth and nuance. Community managers know their communities better than anyone and are well placed to offer that context.
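As a rough sketch of that collaboration (the posts, labels, thresholds, and model choice here are all illustrative assumptions, not a method described in the session), manually coded examples can train a simple classifier that is applied at scale, while uncertain cases are routed back to human reviewers:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical manually coded training data: 1 = violates policy, 0 = acceptable.
texts = ["example of a hateful post", "friendly community update",
         "another hateful example", "question about the weekly event"]
labels = [1, 0, 1, 0]

# Train a simple text classifier on the human-coded 'source of truth'.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Apply the model at scale, but send low-confidence cases to human moderators.
new_posts = ["a borderline post needing context", "regular community chit-chat"]
for post, prob in zip(new_posts, model.predict_proba(new_posts)[:, 1]):
    if prob >= 0.8:
        print(f"auto-flag ({prob:.2f}): {post}")
    elif prob >= 0.4:
        print(f"human review ({prob:.2f}): {post}")
    else:
        print(f"no action ({prob:.2f}): {post}")
```

The thresholds are arbitrary placeholders; the point is that the model's confidence decides when a human needs to look, not whether humans are needed at all.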
How Do We Measure Success?
The heart of the session lay in answering one pressing question: how do we measure whether our moderation efforts are working? Dr. Oboler introduced five key methods:
1. Intercoder Agreement: Evaluating consistency among human moderators. This ensures reliability and highlights areas needing clearer guidelines or training.
2. Confusion Matrices: Comparing human-coded data with AI outputs to identify successes (true positives and true negatives) and errors (false positives and false negatives).
3. Precision and Recall: Balancing precision (of the content we flag, how much is genuinely harmful) with recall (of the genuinely harmful content, how much we actually catch). These metrics are critical for aligning moderation goals with platform or community values.
4. F-Scores: Combining precision and recall into a single metric; an F-score can be weighted to prioritise one over the other depending on context (see the sketch after this list).
5. Transparent Standards: The ultimate goal is to set and continually raise benchmarks for moderation, reflecting advances in tools and community expectations.
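To make the first four methods concrete, here is a minimal worked sketch (the coder and model labels are invented for illustration; in practice these would come from a properly coded sample). It computes Cohen's kappa as one common measure of intercoder agreement, builds a confusion matrix comparing human coding with model output, and derives precision, recall, and the F1 score from it:

```python
# 1 = flagged as harmful, 0 = not flagged (hypothetical labels for illustration).
coder_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
coder_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]   # second human coder, same items
human   = coder_a                           # treat coder A as the 'source of truth'
model   = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]   # AI model's decisions on the same items

# 1. Intercoder agreement (Cohen's kappa): agreement between coders beyond chance.
n = len(coder_a)
observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
p_a1, p_b1 = sum(coder_a) / n, sum(coder_b) / n
expected = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
kappa = (observed - expected) / (1 - expected)

# 2. Confusion matrix: compare human coding with the model's output.
tp = sum(h == 1 and m == 1 for h, m in zip(human, model))
tn = sum(h == 0 and m == 0 for h, m in zip(human, model))
fp = sum(h == 0 and m == 1 for h, m in zip(human, model))
fn = sum(h == 1 and m == 0 for h, m in zip(human, model))

# 3-4. Precision, recall, and the F1 score that combines them.
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"Cohen's kappa: {kappa:.2f}")
print(f"Confusion matrix: TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"Precision: {precision:.2f}  Recall: {recall:.2f}  F1: {f1:.2f}")
```

With the sample labels above, the two coders agree on 8 of 10 items (kappa of 0.6) and the model scores 0.8 on precision, recall, and F1; these are the kinds of figures a transparent standard would ask communities to report and improve over time.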
What Does This All Mean for Community Managers?
As the expectations of regulators and internet users rise, community managers must evolve their moderation strategies and tools.
Invest in Tools and Data: Systems like the OHPI’s *Fight Against Hate* tool can help communities collect and analyse harmful content more effectively, enabling data-driven decision-making.
Embrace AI-Human Collaboration: AI is a powerful ally, but it’s not a replacement for human judgment. Balancing the two will be key to meeting evolving standards.
Prepare for Accountability: Transparency will soon be non-negotiable. Community managers will need to report on moderation outcomes, whether to regulators, stakeholders (such as businesses or insurers), or users themselves.
Dr. Oboler also cautioned that increasing demands on moderators—both human and automated—may exacerbate burnout and stress. Supporting moderation teams with realistic policies and mental health resources is critical.
A Call to Action
Dr. Oboler’s analogy of moderation as pollution reframes the challenge ahead. Rather than only putting out fires and reacting to crises, community managers can champion systemic solutions. Collaboration with regulators, platforms, and civil society will be essential to address the root causes of digital harms.
Moderation is no longer a behind-the-scenes task; it’s a visible, measurable contribution to the health of online spaces. The frameworks and insights shared at All Things in Moderation empower us to think bigger—to see moderation as a field of practice, innovation, and real social impact.