
Market research scales, ranks and trade-offs

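Most standard market research is the preserve of rating scales - marks out of 10, agree-disagree, satisfaction, likelihood to purchase. Within the research community there can be heated debate about which type of scale is best and how questions should be asked.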

Scales are not the only possible method of measurement. Using choices, ranks and trade-offs through techniques like conjoint analysis or MaxDiff can provide more actionable data for models and to forecast market behaviour.

Scales in market research

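Rating scales are meat and drink to market researchers - "Please rate each of the following out of 10", "How much do you agree or disagree with each statement?", "How satisfied are you?", "How likely, out of ten, are you to recommend?" (Net Promoter Score - NPS), "How likely are you to buy this product at this price?"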

Scales come in many different forms and formats. The most common type is the Likert scale - also known as an agree-disagree scale - where agreement is measured on five or seven ordered points (hence an ordinal scale). Ratings, typically from 1 to 10, are another use of scales to indicate a level of performance.

The benefit of scales is that they are easy to ask, provide data that can be analysed numerically for statistical analysis and that are stable on repeated measures across a sample (though not necessarily at the individual level).

How many points and what format?

The most common questions about scales are how many points? Should you have a midpoint? Should you label the points? And can you transform old answers into a new scale? Though these are easy questions, there is a surprising amount of academic study and debate around them, and some decisions come down to researcher preference.

For more academic researchers, a 7-point scale is often used as it gives more data for later analysis. Though 7-point scales are most 'pure' theoretically, it would be fair to say that most commercial researchers would use a 5-point scale as it is easier to label the points. In customer satisfaction a ten-point scale from 1 to 10 is often used, as it is familiar from use in grading papers and tests. An 11-point scale from 0 to 10 is preferable, because people scoring low tend to prefer to be able to allocate zero points rather than 1 point.

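In practice, 1 to 10 scales tend to show that respondents do not use the whole scale evenly. When allocating ratings, there is a bias towards using the upper half of the scale (7, 8, 9, 10). This bias is used explicitly as the basis of the Net Promoter Score, where high ratings (9, 10) are considered promoters and low ratings (0-6) are considered detractors, recognising the unevenness of the scale in actual use.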
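As a rough illustration of how the Net Promoter Score builds on this uneven use of the scale, the sketch below assumes the conventional 0 to 10 recommendation question, with scores of 7 and 8 treated as passives (a common convention that is not spelled out above). The function name `net_promoter_score` and the sample ratings are made up for illustration.

```python
def net_promoter_score(ratings):
    """Return NPS: % promoters (9-10) minus % detractors (0-6)."""
    if not ratings:
        raise ValueError("no ratings supplied")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Scores of 7 and 8 (passives) count in the base but in neither group
    return 100.0 * (promoters - detractors) / len(ratings)

# Hypothetical ratings, skewed towards the upper half of the scale
sample = [10, 9, 9, 8, 8, 7, 7, 6, 5, 10]
print(net_promoter_score(sample))
```

Note that a sample can rate almost entirely in the 7 to 10 range and still produce a modest score, because only 9s and 10s count as promoters.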

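Numeric scales can also be used in place of verbal points (eg 'Very valuable', 'Somewhat valuable', 'Not very valuable', 'Not at all valuable'), because in telephone surveys this saves the respondent from having to remember the actual labels. However, the order needs to be made very clear - is 1 good or bad?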

On the telephone a five-point scale can also be used as a 'roll out' scale. That is, rather than labelling all five points for the respondent, you split the question in two. The first part is to ask "Do you agree or disagree?" Then follow this with "Is that a lot or a little?" The combined result is "Disagree a lot, Disagree a little, Neither, Agree a little, Agree a lot". After a couple of goes respondents know the scale and automatically start to answer "Disagree a little", "Agree a lot" and so on.

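For online surveys, scales can be replaced by sliders, or by visual points - for example happy faces through to sad faces. Using images rather than words can help standardise meaning in international projects, assuming the same cultural inferences are drawn from the images.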

In practice, the choice of scale will depend on the subject, the presentation format and whether it will be written or spoken. Because scales are easy to create, there is a temptation to ask too many questions. Large question grids with banks of attitude statements and scales are a common reason for drop-out in online surveys.

Mid points

Scales without a mid-point (ie removing the neutral 'neither' point) force opinions in one direction or another. Ratings and likelihood scales naturally have no mid-point as they run from high to low ('Very likely', 'Somewhat likely', 'Not very likely', 'Not at all likely'), so the choice about mid-points normally arises with Likert-type scales.

The reason for forcing a choice is that it can make statistical analysis easier and it reduces the problem of mid-lining - simply choosing the mid-point on all answers to complete the survey quickly.

The debate comes on whether a five point scale should really be a four point. On the telephone, it's common to offer a four point scale (eg do you agree or disagree) explicitly, but allow the interviewer to code for Neither, so neither is different to Don't know. On screen or on paper, forcing responses is more difficult, and often creates annoyance for respondents if they feel their answer is not represented.

Since most surveys are now self-complete online, it is often better for completion rates and accuracy to include the mid-point, or at least a "Can't Say" option.

Straightlining

A common quality problem seen on scale and grid questions is straightlining. That is, the respondent simply gives the same answer to all questions - to get through a question quickly, due to boredom, inattentiveness or distractions, or just attempting to finish the survey fast to get to a reward. Straightlining is one of the quality checks we use to assess response quality on a survey. If the overall quality of a respondent's answers is low, then their questionnaire will be rejected.
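A minimal sketch of a straightlining check follows, assuming each respondent's grid answers are held as a list of numeric codes with `None` for skipped items. The names `is_straightliner` and `straightline_rate`, the `min_items` threshold and the sample data are all hypothetical. A real quality check would normally combine this flag with other signals rather than reject on it alone.

```python
def is_straightliner(answers, min_items=4):
    """True if a respondent gave the identical answer to every grid item.

    Grids with fewer than min_items answers are not flagged, since
    identical answers on a very short grid can easily be genuine.
    """
    answered = [a for a in answers if a is not None]
    return len(answered) >= min_items and len(set(answered)) == 1

def straightline_rate(grid_responses):
    """Share of respondents flagged as straightliners (0.0 to 1.0)."""
    if not grid_responses:
        return 0.0
    flagged = sum(is_straightliner(r) for r in grid_responses)
    return flagged / len(grid_responses)

# Hypothetical responses to a six-statement agree-disagree grid
responses = [
    [5, 5, 5, 5, 5, 5],   # same answer throughout
    [4, 2, 5, 1, 4, 3],
    [3, 3, 3, 3, 3, 3],   # mid-lining is a special case of straightlining
    [5, 4, 5, 2, 4, 4],
]
print(straightline_rate(responses))
```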

To counter straightlining, statements should be framed so that respondents would be expected to switch between positive and negative ratings, and so keep 'on task' rather than just run mechanically saying 5, 5, 5... So instead of just giving statements in the positive, encouraging agreement, some statements would also be reversed with an expectation of disagreement ("This store is clean", "The store is poorly laid out").

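This would then be combined with randomising the order in which the statements are shown, both to minimise order effects (top items being rated more highly because they are the first statements) and to ensure that the order of responses is different for each respondent.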
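The two ideas above, reversed statements and randomised order, can be sketched as follows. The sketch assumes a five-point agree-disagree scale coded 1 to 5, and re-codes reversed statements as 6 minus the answer so that a higher score is always favourable. The statement ids, the `STATEMENTS` table and both function names are hypothetical.

```python
import random

SCALE_MAX = 5  # five-point scale coded 1..5 (assumption)

# Hypothetical statements; True means agreement is unfavourable
STATEMENTS = {
    "clean": ("This store is clean", False),
    "layout": ("The store is poorly laid out", True),
    "staff": ("Staff are helpful", False),
}

def presentation_order(respondent_id):
    """Shuffle statement order per respondent, reproducibly."""
    rng = random.Random(respondent_id)  # seeded so the order can be recovered
    order = list(STATEMENTS)
    rng.shuffle(order)
    return order

def favourable_score(statement_id, answer):
    """Re-code an answer so that a higher score is always favourable."""
    _, is_reversed = STATEMENTS[statement_id]
    return (SCALE_MAX + 1 - answer) if is_reversed else answer

print(presentation_order(101))
# Strong disagreement (1) with a negative statement is a favourable answer
print(favourable_score("layout", 1))
```

Seeding the shuffle with the respondent id is one design choice among several. It means the order each person saw can be reconstructed at analysis time without storing it.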

Unbalanced scales

Though theoretically scales are better balanced, in practice unbalanced scales can also be used, particularly where there is a strong natural bias towards a positive rating. If you are talking to donors to a charity for instance there is a tendency to eulogise through the scores to the positive end. For tracking purposes, this can be difficult if everything is always rated at the top and no-one rates negatively. For this reason additional superlative levels may be added to the scale or alternative ways of asking the question used.

Mixing scale types and using choice type questions

There can be a tendency to overuse simple ordinal scales. For instance "How much do you agree or disagree with 'I know a lot about the brand'?" forces a Likert-type answer. But the response in terms of agreement doesn't explicitly say how much someone knows about the brand.

In this case, if the aim is to understand level of knowledge it might be better asked as a categorical question: "How much do you know about the brand?" - "Never heard of it", "Heard of it, but know nothing", "Know a little", "Know a lot". The clearer, more categorical approach will have more meaning to respondents and be easier to understand in analysis. Categorical approaches are used in Kano analysis of which features are needed for a new product and are often easier to interpret.

A second alternative to Likert scales is a choice: "Which of these two brands is better quality?" The choice can also be framed with a scale from Brand 1 to Brand 2. Choices offer powerful alternatives to standard scales and are used in techniques like conjoint analysis or MaxDiff as they are more actionable than simple scale points.

The concept of choices can extend to associative questions. "Which of these brands are ... friendly?" is asked with a list of brands that can be associated with the word, followed by "Which of these brands are ... unfriendly?" This type of associative approach leads to measures known as "image strength" (total associations made for a brand) and "image character" (the direction of the associations - positive or negative), allowing a large number of brands to be scored quickly and easily on many different characteristics.
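One way these two measures might be computed is sketched below. Image strength follows the description above (total associations made). The formula used for image character, net positive associations as a share of all associations, is an assumption for illustration, since the text gives only the idea of direction. The function name and brand counts are hypothetical.

```python
def image_measures(positive, negative):
    """Return (strength, character) for one brand.

    strength  = total associations made, positive plus negative
    character = net direction from -1 (all negative) to +1 (all positive);
                this particular formula is an assumption for illustration
    """
    strength = positive + negative
    character = (positive - negative) / strength if strength else 0.0
    return strength, character

# Hypothetical association counts per brand: (positive picks, negative picks)
counts = {"Brand A": (120, 30), "Brand B": (40, 60)}
for brand, (pos, neg) in counts.items():
    print(brand, image_measures(pos, neg))
```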

We take this approach further in our hot-cold, or thumbs-up-thumbs-down questions. In these types of questions, respondents choose which of the items to select - positively or negatively - and can give multiple ticks up or down like a scale. In choosing what to rate, no answers are forced so only genuine opinion is scored.

Analysis and reporting

Challenges moving means

Imagine just 8 respondents on a 1-4 scale. A mid-point score of 2.5 can come from many distributions, for example 4:0:0:4; 0:4:4:0; 2:2:2:2; 1:3:3:1; 3:1:1:3. By contrast, a score of 3.75 can be reached in only two ways: 0:1:0:7 or 0:0:2:6.

If a business wants to improve its score by 0.1 from the 2.5 mid-point there are multiple possibilities - between four and eight moves. But the move from 3.75 to 3.9 can only be accomplished one way, by arriving at 0:0:1:7. Thus, as the mean score gets higher, it gets harder to find improvements - something seen on customer satisfaction measures for high-performing businesses.
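The sidebar's argument can be checked by brute force. The sketch below enumerates every way eight respondents can be spread across a four-point scale and counts how many distinct distributions produce each mean. The function name `distributions_by_mean` is illustrative only.

```python
from collections import Counter
from itertools import product

def distributions_by_mean(n_respondents=8, n_points=4):
    """Count distinct answer distributions for each achievable mean.

    A distribution is a tuple of counts per scale point, for example
    (4, 0, 0, 4) means four respondents scored 1 and four scored 4.
    """
    tally = Counter()
    for counts in product(range(n_respondents + 1), repeat=n_points):
        if sum(counts) != n_respondents:
            continue  # not a valid split of the sample
        total = sum(point * c for point, c in enumerate(counts, start=1))
        tally[total / n_respondents] += 1
    return tally

tally = distributions_by_mean()
for mean in sorted(tally):
    print(f"{mean:.3f}: {tally[mean]} distribution(s)")
```

The counts peak at the mid-point and fall away sharply towards either end of the scale, which is the pattern the sidebar describes.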

Scales form the basis of a great many statistical techniques for understanding markets. Often scales are treated as numeric values to be used as independent variables in a regression model, and so determine which ratings are most important in decision making (note though that correlation doesn't necessarily imply causation).

The use of scales as statistical parameters opens up statistical techniques such as perceptual mapping, cluster analysis for segmentation, factor analysis to distil core meanings, and regression analysis to identify key drivers.

In reporting, scales are often reported as mean scores, particularly for academic reports. Mean scores implicitly assume that points on the scale are equally spaced and that respondents use the scale points in the same way. Mean scores give a single number to report, with a confidence interval (and p-values for comparisons between groups).

However, for commercial research, mean scores are less commonly used. A mean score itself can be difficult to interpret directly and more difficult for non-expert readers to understand. For instance, for a four-point scale, scored as 1 to 4, the mid-point - ie balance point - is 2.5. Most people would intuitively think it should be 2.

The mean score itself is arbitrary in the scale it uses. The points can be given any value, not just 1, 2, 3, 4, 5 say, but -2, -1, 0, 1, 2. This means the score can be converted to a number between 0 and 100, which is easier to understand - and avoids decimals for less numerate readers.
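A simple linear rescaling is one way to do this conversion, sketched below on the assumption that the bottom of the scale should map to 0 and the top to 100. The function name `rescale_mean` and the example means are hypothetical.

```python
def rescale_mean(mean, scale_min, scale_max):
    """Map a mean linearly onto 0-100 (scale_min -> 0, scale_max -> 100)."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must exceed scale_min")
    return 100.0 * (mean - scale_min) / (scale_max - scale_min)

# The same answers scored 1..5 or -2..2 land on the same 0-100 value
print(round(rescale_mean(3.8, 1, 5)))
print(round(rescale_mean(0.8, -2, 2)))

# The balance point of a four-point scale scored 1..4 becomes 50
print(rescale_mean(2.5, 1, 4))
```

On the rescaled score the balance point is always 50, whichever numeric codes were used for the original points.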

A second reason for avoiding showing means is that the mean itself is not evenly distributed in terms of actionable changes. What this means is that it is easier to move a mean score up by 0.1 in the middle, than to move it at the extremes (see sidebar). Consequently, researchers often prefer to report 'top-box' scores - usually the sum of the percentages of the top two items, or to show the actual percentages to show the distribution of answers.
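A short sketch of top-box reporting follows, assuming answers are stored as numeric codes where a higher code is more favourable. The function names and the sample answers are made up for illustration.

```python
from collections import Counter

def distribution_percentages(answers, scale_points):
    """Percentage of respondents choosing each scale point."""
    if not answers:
        raise ValueError("no answers supplied")
    counts = Counter(answers)
    n = len(answers)
    return {p: 100.0 * counts[p] / n for p in scale_points}

def top_box(answers, scale_points, boxes=2):
    """Sum of the percentages for the top `boxes` points of the scale."""
    pct = distribution_percentages(answers, scale_points)
    top_points = sorted(scale_points)[-boxes:]
    return sum(pct[p] for p in top_points)

# Hypothetical answers on a five-point scale coded 1..5
answers = [5, 4, 4, 3, 5, 2, 4, 5, 1, 4]
print(distribution_percentages(answers, range(1, 6)))
print(top_box(answers, range(1, 6)))
```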

Criticisms of scale use

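With such a large choice of scales and ratings it is not surprising that they are widely used in market research (and in clinical trials as outcome measures - eg pain reduction). Scales are very easy to create and use, they seem intuitive, and they are stable across samples, so they can be used for tracking and measuring improvements.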

There are criticisms of scales and a need to understand good practice. Ideally scales should be validated to show they measure what they are supposed to measure. Otherwise there is a concern that they are 'fuzzy' or unclear in meaning. What does the respondent mean if they say they agree slightly with something? Is this good or bad? Will it affect their decision making?

The meaning or impact of a scale rating can be shown by carrying out statistical analysis to link scale scores to behaviour on aggregate using regression type techniques. And factor analysis or principal components can be used to identify meta-concepts behind combinations of scale ratings.

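However, for many respondents, their individual ratings are not stable - they may give a different rating on the same measure one or two weeks later (or even within the same survey). So while ratings are stable at the sample level, can they be used as a predictive tool if an individual's view switches noticeably?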

Ratings also vary according to the scale used. For long-standing surveys with trends, switching scales from say 5 points to 7 points, or from 10 to 5, changes the ratings (and introduces a discontinuity into the trend series). Directionally, the ratings are usually the same, but the precise values shift.

In part, this may be because a rating is a forced item. Individuals are required to show an opinion on something that perhaps they do not value, or do not consider important, and they have to give their opinion within a frame created by an external researcher who may use language the individual would not normally use.

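Consequently, scales mix strongly held, informed opinions with the opinions of individuals who simply give an answer because that is what they have been asked to do. This 'positivist bias' (not 'positive') assumes that the questions the researcher asks are relevant to the respondent's opinions.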

In practice, low-involvement respondents will tend to guess and give a response they think is required. This provides stability to the statistics without necessarily representing genuine opinion.

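For this reason, unforced ratings, where respondents can choose which items to give an opinion on (as in our hot-cold scales), may be a better way of understanding the dynamics of underlying opinions.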

A second criticism of scales is that unless scales can be linked to actions or decisions, for business decision makers understanding and implementing change based on scale ratings is hard. If 20% of customers think you are unfriendly, is this good or bad? How do you change? Is it something you should change? How much effort and money should you put towards changing? And what will the return be?

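A third criticism is that these measures are rational - or 'considered' - measures. They make explicit items that someone would often only think about or react to implicitly.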

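By making them explicit, the respondent takes into account social norms - what would others think if I said I hate recycling? - or the anticipated view of the interviewer. Respondents mask opinions, even to themselves, to make themselves look more sociable, or of higher social standing, or because they think there is a right answer. Respondents therefore end up 'negotiating' the answers they give.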

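These norms and expectations can be mitigated by the way the question is asked, finding ways to allow implicitly unpopular opinions to be expressed fairly - for instance on attitudes to drug-taking or questions about sexual behaviour.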

Using choices instead of scales

In general, choice-researchers prefer categories to abstract scales - though naturally we use scales too. If you force someone to make a choice, for instance from two options, or by asking for a ranking or top three, you get a better perspective of relative values between items. For researchers from a scale-based background, the power of ranks seems difficult to understand and there is a tendency to try to turn the rank into a scale (how do I score the ranks? is a common question). But ranks indicate preference and trade-offs. They can be interpreted using statistical tools like those available for conjoint and we can ask modelling questions like if X wasn't available what would people choose next. There are issues with ranks too - determining the 'step-size' between items for instance, so there are a number of hybrid techniques that combine elements of scales and ranks to provide more information.
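The modelling question "if X wasn't available what would people choose next" can be illustrated directly from ranked data. The sketch below is a deliberately simple first-choice tally, not a conjoint model: it assumes each respondent supplies a full ranking and always picks their highest-ranked available option. The function name and the rankings are hypothetical.

```python
from collections import Counter

def first_choice_shares(rankings, unavailable=()):
    """Share of respondents whose top-ranked available item is each option.

    rankings: one list per respondent, most preferred first.
    unavailable: options to skip, so preference falls to the next rank.
    """
    picks = Counter()
    for ranking in rankings:
        for option in ranking:
            if option not in unavailable:
                picks[option] += 1
                break  # only the first available option counts
    total = sum(picks.values())
    if total == 0:
        return {}
    return {option: n / total for option, n in picks.items()}

# Hypothetical rankings of three options from five respondents
rankings = [
    ["X", "Y", "Z"],
    ["X", "Z", "Y"],
    ["Y", "X", "Z"],
    ["X", "Y", "Z"],
    ["Z", "Y", "X"],
]
print(first_choice_shares(rankings))
print(first_choice_shares(rankings, unavailable={"X"}))
```

A tally like this uses only the order of preferences and ignores the 'step-size' between ranks, which is the gap the hybrid techniques mentioned above aim to fill.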

And obviously, coming from conjoint analysis, we would prefer categorical answers rather than scale or ordinal answers as they provide a fuller description of where customers' needs are and what they want. You don't want a 'quite good' or 'very good' camera, you want a DSLR or a Canon - a specific category, not a rating. Defining categories can be difficult, but knowing categorical preferences is much more powerful than just understanding rated preferences.


For help and advice on market research design and development contact info@dobney.com

