
Introduction to the Sensibleness and Specificity Average (SSA)


The Sensibleness and Specificity Average (SSA) captures basic but important attributes of natural conversation for open-domain chatbots. The metric aims to assess how effectively your machine learning customer engagement mechanisms "make sense" to their audience, regardless of the cultural or linguistic context of the chat.



SSA is a new and promising customer experience metric. Existing chatbot quality metrics based on human evaluation are complex and don't produce consistent results, which motivated chatbot developers to design a human evaluation metric, the Sensibleness and Specificity Average (SSA), that captures basic but important attributes of natural conversation. Researchers found that SSA correlates strongly with perplexity, a proven automatic metric for neural conversational models such as Google's Meena. This correlation, the rapid development of chatbot AI, and increased chatbot adoption by customer success teams in need of better scale make SSA a metric every leader should now know.


Here is an example of Meena, an end-to-end chatbot that learns to respond sensibly to a given conversational context, in action.

Tracking an SSA metric alongside this kind of program allows leaders to understand how effectively the chatbot is learning while it engages with customers. Pairing your SSA scores with your Customer Satisfaction Score (CSAT) is essential to ensuring your customers are still receiving the engagement experience they expect, even though they're engaging with a machine instead of a human. To do this right, provide customers an in-app survey post-chat and modify your CSAT question slightly to ask "how satisfied were you with the agent's service?" Structure your internal data so there's no confusion between staff-driven and AI-driven chat performance, and overlay the SSA metric with your AI-chat CSAT to confirm that channel is delivering acceptable results and growth.


The SSA score is a straightforward average of two human-judged percentages, sensibleness and specificity, and the published benchmarks are solid enough for success leaders to ask their chatbot partners how theirs perform in this area. Google's research reports an SSA score of 72% for the base Meena model; the full version of Meena advances the SSA score to 79%. It's worth noting that 86% is the benchmark for human interactions.
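As a rough sketch of how those percentages combine: human raters label each chatbot response as sensible (does it make sense in context?) and specific (is it specific to that context, rather than a vague, safe reply?). Sensibleness and specificity are each the fraction of responses labeled positive, and SSA is their simple average. The helper below is a hypothetical illustration of that arithmetic, not an official evaluation tool.

```python
def ssa_score(labels):
    """Compute a Sensibleness and Specificity Average.

    labels: list of (sensible, specific) booleans, one pair per rated response.
    Returns the average of the sensibleness and specificity fractions.
    """
    if not labels:
        raise ValueError("need at least one labeled response")
    n = len(labels)
    # Fraction of responses rated sensible, and fraction rated specific
    sensibleness = sum(1 for sensible, _ in labels if sensible) / n
    specificity = sum(1 for _, specific in labels if specific) / n
    return (sensibleness + specificity) / 2

# Example: four rated responses; 3/4 sensible, 2/4 specific
ratings = [(True, True), (True, False), (True, True), (False, False)]
print(ssa_score(ratings))  # (0.75 + 0.5) / 2 = 0.625
```

In this toy example the chatbot scores 62.5%, which would sit below the 72% that Google reports for the base Meena model.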


Remaining forward-thinking about the tools you need to deliver your customer-centric strategy at scale is critical to your business. The takeaway for customer success leadership is that we're quickly reaching the point where customers won't be able to tell the difference between engaging with a chatbot and a human. Understanding this metric, and holding your chatbot partners accountable for reporting it to you, will ensure you're keeping pace with the rate of customer experience innovation.
