Introduction to Survey Design B

Designing the Survey Questions, Data Processing, and Missing Data

Two-day Workshop

Within this workshop we cover: how to design the questions for a survey (considering factors such as ambiguity, memory problems, and participant sensitivity), how to evaluate the questions (gathering data from the population about the use of a proposed survey), data pre-processing (the steps involved between data collection and statistical analysis), and the issue of missing data (both how to minimise it during data collection and how to reduce biases during data analysis).

This workshop is intended for audience members who are new to the methods of survey research and who want a solid foundation in these methods. A basic knowledge of statistics is assumed: audience members should be comfortable discussing terms such as mean and variance.

Workshop Contents

Each session comprises a lecture and a practical as per the program format below.

Session 1: Designing the Research Questions

A participant faces many obstacles when choosing their response to a survey question. These obstacles include misunderstanding what the research question is trying to ask, retrieving possible responses from their short-term and long-term memory, evaluating which pieces of remembered information to base their response on, and communicating their response to the survey interviewer. In this session we will explore the difficulties that a participant will have in providing a response, and describe how to develop a questionnaire to maximise the accuracy of provided responses.

Session 2: Evaluating the Chosen Research Questions

Following the design of a survey questionnaire, the questionnaire is presented to a pilot audience in order to identify potential weaknesses in the survey. This piloting can include presenting and discussing the questionnaire with application experts or members of the target population, or administering trial survey interviews with survey assistants or study participants. In this session we will explore these methods for evaluating the chosen research questions, and describe how to maximise learning through this evaluation process.

Session 3: Processing the Data Prior to Statistical Analysis

Between data collection and statistical analysis there are a number of tasks that must be performed with the survey data. These include coding non-numeric answers (such as a paragraph of text) into numeric data (e.g. categories), verifying that recorded data are consistent with expected responses (e.g. checking that responses were provided in the same units of measurement), and weighting the data to account for decisions made in the sampling design. In this session we will explore the various tasks involved in data processing, and describe how to maintain the quality of our dataset throughout this process.
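
As a rough illustration of the three tasks described above, the following Python sketch codes free-text answers into numeric categories, flags values that look as if they were recorded in the wrong unit, and computes design weights for a stratified sample. Everything in it is an assumption made for illustration: the records, the field names, the keyword codebook, the plausible height range, and the stratum population counts are all invented, and the workshop exercises themselves use R or SPSS rather than Python.

```python
from collections import Counter

# Hypothetical survey records; every field name and value is invented.
RESPONSES = [
    {"id": 1, "transport": "I usually take the bus", "height_cm": 172.0, "stratum": "urban"},
    {"id": 2, "transport": "drive my own car", "height_cm": 1.80, "stratum": "rural"},
    {"id": 3, "transport": "Bus or sometimes train", "height_cm": 165.0, "stratum": "urban"},
    {"id": 4, "transport": "walk", "height_cm": 158.0, "stratum": "rural"},
]

# Hypothetical codebook: keyword -> numeric category code.
CODEBOOK = {"bus": 1, "car": 2, "train": 3}
OTHER = 9  # code for answers that match no keyword

# Assumed population size of each sampling stratum.
POPULATION = {"urban": 8000, "rural": 2000}


def code_transport(text):
    """Return the code of the first codebook keyword found in the text."""
    lowered = text.lower()
    for keyword, code in CODEBOOK.items():
        if keyword in lowered:
            return code
    return OTHER


def flag_unit_problems(records, low=50.0, high=250.0):
    """Return ids whose height falls outside a plausible centimetre range."""
    return [r["id"] for r in records if not low <= r["height_cm"] <= high]


def design_weights(records, population):
    """Weight each record by stratum population size / stratum sample size."""
    sampled = Counter(r["stratum"] for r in records)
    return {r["id"]: population[r["stratum"]] / sampled[r["stratum"]] for r in records}


def weighted_share(records, weights, code):
    """Weighted proportion of records whose coded answer equals `code`."""
    total = sum(weights[r["id"]] for r in records)
    hits = sum(
        weights[r["id"]] for r in records if code_transport(r["transport"]) == code
    )
    return hits / total


codes = [code_transport(r["transport"]) for r in RESPONSES]
suspect_ids = flag_unit_problems(RESPONSES)  # record 2 looks like metres, not cm
weights = design_weights(RESPONSES, POPULATION)
bus_share = weighted_share(RESPONSES, weights, code=1)
```

In this toy data the unweighted share of bus users is 2 out of 4, while the weighted share is 0.8, because each urban respondent stands in for more people than each rural respondent. Keyword matching is only a stand-in for coding here; real coding of open text normally relies on a documented codebook and trained coders.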

Session 4: Accounting for Missing Data and Non-Response

Missing data can occur for a variety of reasons: the inability to contact particular people that we would like to include in our survey, potential participants refusing to participate in the survey, and participants who are willing to participate but are unable to provide the requested information. There are numerous approaches for addressing concerns over missing data that can be employed during the design of the survey and the questionnaire, during the interaction between the interviewer and the participant, and during the data analysis. In this session we will explore the various approaches for accounting for missing data, and review their strengths and weaknesses.
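
To make the analysis-stage options a little more concrete, here is a minimal Python sketch that compares a complete-case mean with a mean computed after replacing each missing value with the mean of its group. The data, the field names, and the choice of these two particular methods are assumptions for illustration only; the session may well cover different or more sophisticated approaches.

```python
from statistics import mean

# Hypothetical records; None marks an item the participant did not answer.
RECORDS = [
    {"group": "A", "income": 40.0},
    {"group": "A", "income": None},
    {"group": "A", "income": 44.0},
    {"group": "B", "income": 70.0},
    {"group": "B", "income": None},
    {"group": "B", "income": None},
]


def item_response_rate(records, field):
    """Fraction of records with a non-missing value for `field`."""
    answered = sum(1 for r in records if r[field] is not None)
    return answered / len(records)


def complete_case_mean(records, field):
    """Mean over only the records that answered `field`."""
    return mean(r[field] for r in records if r[field] is not None)


def group_mean_imputed(records, field, group_field):
    """Return values of `field`, filling gaps with the group's observed mean.

    Assumes every group has at least one observed value.
    """
    observed = {}
    for r in records:
        if r[field] is not None:
            observed.setdefault(r[group_field], []).append(r[field])
    group_means = {g: mean(values) for g, values in observed.items()}
    return [
        r[field] if r[field] is not None else group_means[r[group_field]]
        for r in records
    ]


rate = item_response_rate(RECORDS, "income")
cc_mean = complete_case_mean(RECORDS, "income")
imputed = group_mean_imputed(RECORDS, "income", "group")
imputed_mean = mean(imputed)
```

In this toy data group B has more missing values than group A, so the complete-case mean (about 51.3) sits below the mean after group-mean imputation (56.0). Mean imputation is shown only because it is simple; it understates the variability of the imputed variable.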

Teaching Style

This workshop uses a combination of three teaching styles:

  • Lectures and classroom discussions

  • Small group discussions

  • Computer exercises

During the lecture sessions the theory of statistics will be presented, and will be discussed in an interactive manner with the class.

Small Group Discussions

During one of the practicals in this workshop, we will read through a number of application papers from a range of fields (including medicine, education, business, and environmental sciences). For each paper we will explore the research question being asked, the statistical methods chosen, and the results obtained and their interpretation.

Computer Exercises

Each workshop will involve the use of laptop computers, and participants will be asked to bring their own laptops. For this workshop, participants will be able to choose which package (R or SPSS) they would like to use during the individual hands-on exercises.

  • Introduction to Statistics – can choose between SPSS and R

  • Introduction to Regression, Longitudinal Data Analysis – can choose between Stata, SPSS and R

Please note that a copy of R will be given to all participants at the start of the workshop. Participants who would like to use one of the other software packages are responsible for ensuring that the package is available on their laptop.

Program Format

The workshop will adhere to the following format. Please note that morning tea, lunch, and afternoon tea are catered on both days, so please be sure to include any dietary requirements on your registration form.

Day 1

8:30 - 9:00      Registration
9:00 - 10:30     Lecture 1
10:30 - 11:00    Morning Tea
11:00 - 12:30    Practical 1
12:30 - 1:30     Lunch
1:30 - 3:00      Lecture 2
3:00 - 3:30      Afternoon Tea
3:30 - 5:00      Practical 2

Day 2

9:00 - 10:30     Lecture 3
10:30 - 11:00    Morning Tea
11:00 - 12:30    Practical 3
12:30 - 1:30     Lunch
1:30 - 3:00      Lecture 4
3:00 - 3:30      Afternoon Tea
3:30 - 5:00      Practical 4



Dr Mark Griffin is the Founding Director of Insight Research Services Associated, and holds academic appointments at the University of Queensland and the University of Sydney. Mark is the Chair of the IIBA Business Analytics Special Interest Group and the IIBA Asia-Pacific Regional Director. Mark also serves on the Executive Committee for the Statistical Society of Australia, and is the Chair of their Section for Business Analytics. Mark has previously taught over 80 two-day workshops and 10 five-day workshops in the fields of Business Analytics and Statistics. Major analytics projects that Mark is or has been involved in include:

  • Mark leads a research group at the University of Queensland conducting analysis of incident reports collated by the Queensland Ambulance Service (QAS). The QAS attends approximately 700,000 incidents per year, and QAS staff complete a report detailing each incident. This project uses R for text analytics, market segmentation, and spatial mapping (GIS) (2017 to present).

  • Mark is leading a research group at the University of Queensland that is creating an online sample size calculator in R. This software will be used by managers of medical trials who wish to know how many patients to enrol in their trials. This work is being conducted in partnership with research collaborators at Harvard University. This project uses R for developing a web interface and for the mathematical equations involved (2017 to present).

  • Mark has developed software in R for SeqWater, which monitors the water quality of all 28 water reservoirs in South-East Queensland. This project uses R for developing a web interface and for statistical analysis using time-series data (2017).

  • Mark led a project team evaluating the delivery of the Positive Parenting Program for the Queensland Department of Communities, Child Safety and Disability Services. This included the collection and analysis of data from 140,000 parents and 1000 practitioners (psychologists) involved in the program. This project used R for statistical analysis and data visualization (2016-2017).