If you work in community engagement and undertake online consultation activities, there is one question you should always be asking: how representative are my results? In other words, can you trust that the results you are getting represent the community at large? This level of evaluation is one of the most crucial aspects of online engagement, yet it is also one of the most overlooked.

When talking about ‘community’, engagement professionals can fall into the trap of thinking the community is represented by the squeaky wheels: the ‘usual suspects’ who turn out in large groups to have their say on an issue.

These squeaky wheels often get the grease as they are the most engaged members of the community, but they rarely represent the diversity of age groups, professions, education levels, socio-economic strata, gender identities and religious and cultural groups that make up a community.

If you rely on results that are not representative of your community, you run the risk of making decisions that are not well supported by reliable data.

In a perfect world, most community engagement projects would gather highly representative data sets that can be drawn on to inform decisions. But the truth is, most community engagement does not yield even broadly representative, statistically significant results. This means caution should be applied when using the data in isolation to draw conclusions and inform decision-making, unless the representativeness can be demonstrated.

What is representative data?

When data is ‘representative’, it either represents the majority of people within the community, or it represents the diversity of opinions and socio-spatial characteristics of the community. Representative results can be assumed to reflect the views, priorities and values of various cross-sections of the community in sufficient quantities that you can have confidence in the accuracy of the results.

When data is unrepresentative it can over-represent a particular interest or group, and you cannot know what the majority of people within your community think. If you are using your community engagement process to inform good decision-making, you need to have confidence that your data accurately represents your community and that certain voices and points of view are not disproportionately amplified or disadvantaged.

The simplest way to check whether your data is representative is to run a basic statistical test on your sample size, which gives you a baseline level of confidence in your results.

Testing for representation

There are a number of ways to check how representative your data is. Here’s a basic example of using a simple statistical test:

John works at a council where the population is 100,000. Using a sample size calculator, he knows he needs at least 383 respondents for his results to reflect the wider population with a 95% confidence level and a 5% margin of error.

He gets 400 responses, which clears that threshold. This means that even if John collected more than 400 responses, it is highly likely his results would not differ greatly from the original.

On the surface, his results appear to be representative based on his sample size.
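
If you want to reproduce this check without an online calculator, the sketch below shows the arithmetic most sample size calculators use: Cochran’s formula with a finite population correction. The function name and parameter defaults are illustrative, not a standard library API.

```python
import math

def required_sample_size(population, z=1.96, margin_of_error=0.05, proportion=0.5):
    """Minimum responses needed for a given population, confidence level and margin of error.

    Uses Cochran's formula with a finite population correction. z = 1.96 corresponds to a
    95% confidence level, and a proportion of 0.5 is the most conservative assumption.
    """
    # Cochran's formula for an effectively infinite population
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    # Adjust for the finite population
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

# John's council: population of 100,000 at a 95% confidence level with a 5% margin of error
print(required_sample_size(100_000))  # 383
```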

However, it’s easy to see that these kinds of tests provide a crude measurement of representation: what if that sample was stacked in an unrepresentative way by special interest groups or vocal minorities? What if some groups haven’t had any input at all? What if all respondents were above the age of 55?

Drilling deeper into results

To understand community representation in a deeper way, we must first understand the composition of the community we are targeting. Diving deeper into the results allows us to see which social, economic and spatial groups are being represented. Applying the same statistical tests to each of these key segments allows us to see where our results are biased and actively target groups that are under-represented.

Resources such as the census or .id’s demographic resource centre (in Australia) are useful tools for understanding your community composition and helping you break down your population by age, gender, location and socio-cultural background information (such as income, education and cultural heritage).

Armed with the information about our community composition, here’s another, more detailed example:

Christine is running an engagement on pedestrianising a main street. She is collecting demographic information as part of the engagement, including age. She knows that:

  • The community has a population of 30,000
  • There are approximately 5,000 people under the age of 30
  • She has received 2,000 responses in total
  • 400 of these responses are from people under 30

The vast majority of these responses indicate that people under 30 support pedestrianisation, so Christine takes this as her working hypothesis.

Using the same sample size calculator as above, Christine knows that from a population of 5,000 people under 30, she needs a minimum of 357 responses to achieve a 95% confidence level with a 5% margin of error. With 400 responses from this group, she knows that this demographic is well represented in her results and she can be confident in her hypothesis that people under 30 support pedestrianisation.
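
Here is the same check expressed in code, applied just to the under-30 segment. It reuses the Cochran-style helper from the earlier sketch; the variable names are illustrative.

```python
import math

# Same Cochran-with-finite-population-correction helper as in the earlier sketch
def required_sample_size(population, z=1.96, margin_of_error=0.05, proportion=0.5):
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    return math.ceil(n0 / (1 + (n0 - 1) / population))

under_30_population = 5_000
under_30_responses = 400

needed = required_sample_size(under_30_population)
print(needed)                        # 357
print(under_30_responses >= needed)  # True: the under-30 segment is adequately sampled
```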

Of course, not every demographic is likely to be well represented, or even needs to be, so you should identify the groups within your community that you want to target and that need to be represented in your data.

Implementing this approach during the feedback process allows you to track representation in real time and adjust your strategy to target under-represented groups.
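
As a rough illustration of what that tracking could look like, the sketch below compares the responses received from each segment against its minimum required sample and flags the segments falling short. The segment names and response counts are hypothetical, and the helper is the same Cochran-style calculation used above.

```python
import math

# Same Cochran-with-finite-population-correction helper as in the earlier sketches
def required_sample_size(population, z=1.96, margin_of_error=0.05, proportion=0.5):
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    return math.ceil(n0 / (1 + (n0 - 1) / population))

# Hypothetical segment populations and the responses collected so far
segment_populations = {"under 30": 5_000, "30 to 54": 15_000, "55 and over": 10_000}
responses_so_far = {"under 30": 120, "30 to 54": 800, "55 and over": 600}

# Flag the segments that still fall short of their minimum sample so outreach can be targeted
for segment, population in segment_populations.items():
    needed = required_sample_size(population)
    received = responses_so_far.get(segment, 0)
    shortfall = max(needed - received, 0)
    status = "on track" if shortfall == 0 else f"under-represented ({shortfall} more responses needed)"
    print(f"{segment}: {received}/{needed} responses - {status}")
```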

This approach was used in the award-winning Plan Brimbank project, which monitored results weekly to determine which groups needed more attention.

Conclusion

It is important to consider that online community engagement is essentially a form of social research: data is gathered, analysed and used to draw conclusions. Unless you can demonstrate your data’s representativeness, you shouldn’t rely on it to make big decisions or to conclude that certain views, priorities or values represent ‘the community’.

The first step to ensuring your data is representative is to analyse and monitor your results in the context of your community profile.

You should then consider other techniques to actively promote representation, such as mixing offline community engagement methods with online ones, building promotional campaigns around target demographics, and tailoring your content to specific audiences to enrich your data.