The Data Daily
Less than 2 minutes to read each morning.
Not sure if you want in? Read the archives below.
5 days a week since May 1st, 2023.
How much would they pay?
What if your business teams had to pay for the data team?
We’ve been talking about getting business teams to “buy in”. And how data teams need to think about what and how they are “selling”.
It’s been all theoretical. Business teams don’t actually buy anything that data teams sell.
But what if business teams did have to pay for the data?
Would they still buy?
The finance team puts in a request for their end-of-month financial reporting needs. The data team replies with a price. $3,500.
The admissions team at a university needs analytics on their applicants and on acceptance and rejection demographics. Price: $250 per report page.
A warehouse manager wants projections for inventory restocking needs throughout the holiday season. Price: $8,250.
The donations team needs insights into donor frequency around the end-of-year giving season. Price: $4,300.
Would they pay up?
Is the data worth it to them?
When it comes out of their department budget, are they still interested in data?
If you provide a first-class data experience
with data products built to meet business needs
that provide the business team with the feeling they want
then yes.
Yes, they will pay. Because it’s a key driver for them to hit their business goals.
Thanks for being here,
Sawyer
You might need to ask
Questions to ask your business team:
What has been on your backlog for the longest time?
What does success look like for your team this quarter?
Where do you see friction in your team's ability to hit your goals?
What difference would it make for the company if you hit that goal?
What’s a regular task or process that takes way longer than it should?
Where is there a big opportunity that you haven’t been able to execute on yet?
What could you accomplish if you had double the team size?
These aren’t data questions. They are questions about the hopes, dreams, and frustrations of the business.
Data can’t solve all of them. Maybe not even most of them.
But they make you far more likely to build data products that
move the needle
deliver value
meet a need
I’m here,
Sawyer
from The Data Shop
5 Signs that you’ve lost them
5 Signs your business team isn’t buying in on data.
It’s difficult to schedule meetings to gather requirements
The usage metrics show cobwebs on the dashboards and BI reports
When you get in a meeting to discuss data needs, you get half-hearted engagement
Every couple of months they share a different data goal for you to meet
They reject or resist any updates or upgrades to legacy reports
These are signs the business team isn’t interested in “buying” the data products you are selling.
These are signs that your data team is focused on data and not on the business.
These are signs you are selling the wrong thing.
These are signs that trust is gone.
I’m here,
Sawyer
from The Data Shop
When they aren’t buying
A data team needs to sell.
There’s no money exchanged. But you need the business teams to “buy” what you produce.
You want them to love what you are producing.
You need them to be repeat “buyers”.
You want them to request more, different, and innovative data products.
If your business team isn’t buying, then it’s time to revisit how you are selling.
You want to sell data quality —> They want to buy trustworthy data
You want to sell data analysis —> They want to buy confidence that they will hit their goals.
You want to sell high-performance data throughput —> They want to buy ease of finding their numbers while on a sales call.
You want to sell data governance —> They want to find the data they need to be creative and innovative in executing their business goals.
What they are “buying” is a feeling.
Selling a feeling requires knowing what your business team wants to feel.
Which means standing up
walking across the aisle and
beginning a conversation
with the business.
It was good to see you today,
Sawyer
What are you selling?
My toothpaste shocked me the other day
I was brushing my teeth and, for the first time ever, I read the text on the toothpaste tube.
It had a shocking promise.
“Up to 2x deeper clean feeling”
That last word surprised me. Feeling.
Not a promise about cleaner teeth. Not a promise for fewer cavities. No guarantees about a better smile.
Cleaner feeling.
When the megacorp branding department put together the marketing for this toothpaste they decided to sell a feeling. My hunch is all toothpastes provide a similar quality of cleaning your teeth (and preventing cavities). It’s hard to make distinctive claims about actually clean teeth.
So they chose to sell a feeling. And it works.
What I really want when I brush my teeth is for my teeth to feel clean.
What are you selling?
Data teams that provide reports and data products to business teams are selling. They aren’t exchanging money for the data, but you need them to buy in.
You could sell them a dashboard, report, or database. And then listen to the yawns and watch the cobwebs grow on your reports.
Or you can sell a feeling.
Confidence in the marketing campaign results.
Trust that the product inventory is sufficient to handle the busy season.
Excitement about the sales opportunities from your account analytics.
What feeling are you selling?
Is your business team buying?
I’m here,
Sawyer
Who's the hero?
You and the data team are not the heroes.
The data team doesn’t…
sell the product
build the product
deliver the product on time
establish brand reputation
satisfy customer support issues
etc. The business teams are the heroes. Data supports their work.
Get comfortable operating behind the scenes.
Enjoy helping others succeed.
You’ll be a hero of a different sort.
“You can have everything in life you want if you will just help enough other people get what they want” - Zig Ziglar
I’m here,
Sawyer
Quick Question - Your responses
Last Friday I asked you what first comes to mind when you think about data modeling.
Your replies were honest and led to some great dialogue.
Here are a few themes I pulled out (quoted parts are replies from you).
Data modeling is:
…frustrating. “Business users were so busy that they didn’t give me valuable feedback until after we’d launched the finished product.”
…time consuming. “our data team is slow...really slow”.
...”designing the shape(s) and possibly the flows for each data set.” Love this. Data modeling is putting the data in a shape.
…”ETL Processing”. Totally! ETL/ELT is all about delivering data models to business teams!
...about “creating a plan, whether tangibly or conceptually, to be able to do what we want with our data”.
…”nope, not for me”. Yes, I have a few people on this list who aren’t data people (so glad you are here).
More than one of you made the joke about data having a modeling job ;-)
Data modeling is a fundamental part of any data program and the linchpin connecting data teams and business teams.
I'm sure we'll talk about it here more in the future.
Have a great weekend.
Sawyer
Everybody does it. Why not be good at it?
It's data modeling week here. Read the past emails here.
One of my first classes in seminary was “Intro to Theology”. The professor started the class off with a statement I’ve never forgotten.
“You are already a theologian. So why not be a good one?”
Most of us sitting in the class thought “theologians” were people from history books or Ivy League professors writing the textbooks in our library. That professor shattered our idea of “a theologian” and gave us a simpler definition: Anyone who thinks or talks about God is a theologian. Most of us weren’t very good at developing our ideas, asking questions, or thinking through challenging theology topics well.
When he said “You are already a theologian. So why not be a good one?” the whole class was ready to go. We were already doing this thing... it was worth it to get better at it.
If you work with data in any way then you are already a “data modeler”. It’s not an ivory tower practice only for Enterprise Architects and academic book writers. Most of us do it.
The vast majority of us aren’t very good at it.
Data modeling is saving data in any shape or form. If you input data into Excel, you are modeling data. If you build Power BI reports, you are modeling data. If you are an application developer, you have to model data. If you are a sales exec building out a contract estimate in Excel, you are modeling data.
You are already a data modeler. So why not be good at it?
Learn the patterns, practice, and questions required
to shape data in the best way possible
to deliver value for
your business.
I’m here,
Sawyer
from The Data Shop
A data model is a shared language
I recently returned from a trip to New York City with my son. There were many amazing sights and experiences - especially for an 11-year-old who grew up in Michigan.
One specific thing that stood out to both of us was the number of languages we heard around us. Granted, we were in tourist-centric areas (Statue of Liberty, Times Square, Central Park, etc.). If I had to venture a guess, we overheard close to a dozen different languages being spoken around us. The diversity of languages and ethnicities surrounding us was astounding - especially for a 30-something who grew up in the Midwest.
It was enough that my son turned to me at one point and said “Dad, I feel like we aren’t in America anymore”.
It’s disorienting when the numerous voices around you are speaking a language you don’t understand.
Thankfully, because we were in America, all the signs were in English, and if I needed to turn to another tourist to ask something, we could speak in English just fine.
English provided shared concepts and vocabulary for people from around the world to connect.
A data model at your company is very similar.
The business teams speak their own languages.
The data teams speak their own language.
“Customer”, “lead”, “revenue”, “orders”, and “account” —> Each team often has different words and phrases to describe these concepts.
A “customer” could be represented one way in a database, another way in the Marketing software, another way in Salesforce, and yet another way in financial reporting.
Which way is correct?
None of them. And all of them.
It’s like asking a crowd of people in Times Square the correct way to say “Hello”. You first have to ask, “What language do you want to say ‘hello’ in?” And more importantly, “What language does the person you are speaking to understand?”
A core value of a data model is to create a shared language.
A shared way to communicate about core concepts.
A shared way to talk about “customer” or “orders”.
A shared way for business teams
to talk with data teams.
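If it helps to picture what that shared language looks like in code, here is a toy sketch (every field and system name is invented for illustration): each source system keeps its own shape for “customer”, and the data model is the single agreed-upon shape everything maps into.

```python
# Each system describes a "customer" in its own dialect (fields invented for illustration).
crm_record = {"AccountName": "Acme Co", "AccountId": "001A"}
billing_record = {"cust_nm": "Acme Co", "cust_no": 7731}

def to_shared_customer(name: str, source_id, source_system: str) -> dict:
    """Map a source-specific record into the shared 'customer' shape."""
    return {
        "customer_name": name,
        "source_system": source_system,
        "source_id": str(source_id),
    }

# The shared model gives business teams and data teams one vocabulary to point at.
customers = [
    to_shared_customer(crm_record["AccountName"], crm_record["AccountId"], "CRM"),
    to_shared_customer(billing_record["cust_nm"], billing_record["cust_no"], "Billing"),
]
print(customers)
```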
I’m here,
Sawyer
from The Data Shop
Let’s talk about grains
Not your breakfast cereal (that you might be enjoying while you read this).
Not the grains of sand lingering in your shoes, car, and hair after a vacation at the beach (happy summer to y’all).
I am talking about the grain of your data. And it might be the most important step in data modeling and report building.
“The grain” of a table is what a single row represents.
A row in a sales table could be:
A single transaction
Sum of sales amount for a single day
Number of items purchased by a single customer on a single day.
Total sales by month by region.
Three recommendations for grains:
1. Get clear on your grain.
Upfront, it’s crucial to make a conscious and clear choice about the grain of your data.
Ask this question: What does one row of your data equal?
Is it the total of a transaction? Is it the sum of sales for the week? Is it the average throughput of your department for the month?
2. Realize the trade-offs
Ideally, we would always have data at the lowest possible grain. But with big data sets that can demand a lot of storage.
So sometimes you need to choose a higher grain. That’s OK, but just acknowledge that with each step up in grain, you lose some ability to analyze your data. (You can’t analyze sales by day if your grain is at the monthly level). Be conscious of the trade-offs of your grain choice and own it from the start.
3. Never mix your grains.
If one line of your data is totaled at the month level and the next at the department level, doing analysis across your data will be increasingly painful. This happens more often than you would guess. Don’t do it.
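If seeing it in code helps, here is a minimal pandas sketch (the numbers and column names are made up) of the same sales data at two different grains. Notice that once you roll up to the monthly grain, the daily questions can no longer be answered.

```python
import pandas as pd

# Transaction grain: one row = one sale. The lowest grain we have.
transactions = pd.DataFrame({
    "sale_date": pd.to_datetime(["2023-07-01", "2023-07-01", "2023-07-02", "2023-07-02"]),
    "region": ["East", "West", "East", "West"],
    "amount": [120.00, 75.50, 210.00, 40.25],
})

# Daily grain: one row = total sales for one region on one day.
daily = transactions.groupby(["sale_date", "region"], as_index=False)["amount"].sum()

# Monthly grain: one row = total sales for one region in one month.
monthly = (
    transactions.assign(month=transactions["sale_date"].dt.to_period("M"))
    .groupby(["month", "region"], as_index=False)["amount"]
    .sum()
)

# From `daily` you can still ask "which day was strongest?".
# From `monthly` that detail is gone -- that's the trade-off you own up front.
print(daily)
print(monthly)
```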
I’m here,
Sawyer
What is a Data Model?
Long time reader (since day 1!), Kyle Gibson, wrote in and asked some great questions about data modeling.
Here’s an excerpt and my response (lightly edited for clarity; shared with permission):
“I’ve always associated “modeling data” with building Fact and Dimension tables to bring into Power BI for a star schema. So my main usage of model is just how I plan to model it in Power BI.
But I see the usage of “model” for modeling data for Data Scientists and AI usage, which is more for training their AI algorithms.
What is the best way to think of what people mean in building a data “model”?”
I love this question because it cuts through the noise and points out how sloppy our vocabulary is sometimes as data professionals.
So what exactly is a data model?
At a most basic level, to model data is to shape it. To put it into some kind of form. Anytime you "save" data to a file or table you have modeled it.
It may not be modeled well or in a usable way.
Most of the time when we talk about Data modeling, we mean putting the data into a shape that is usable for a business or application purpose.
Modeling data for analytical reporting (descriptive and diagnostic analysis) is often best served by a star schema (like in Power BI).
Modeling data for a machine learning model might mean feature engineering and/or flattening the data into one big table.
Modeling data for a web application database probably means highly normalized tables, like in an OLTP system.
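To make those shapes a bit more concrete, here is a tiny pandas sketch (tables and columns invented for illustration) of the same order data in two of those shapes: a star-schema-style fact table plus dimension table for reporting, and the pre-joined “one big table” you might hand to a machine learning pipeline.

```python
import pandas as pd

# Star-schema shape (analytical reporting, e.g. Power BI):
# a fact table of measures plus a dimension table of attributes,
# kept separate and joined by the BI tool at query time.
dim_customer = pd.DataFrame({
    "customer_id": [1, 2],
    "customer_name": ["Acme Co", "Globex"],
    "segment": ["SMB", "Enterprise"],
})
fact_orders = pd.DataFrame({
    "order_id": [100, 101, 102],
    "customer_id": [1, 2, 1],
    "order_total": [250.0, 1200.0, 90.0],
})

# "One big table" shape (common for ML feature sets):
# the same data pre-joined and flattened so every row stands on its own.
one_big_table = fact_orders.merge(dim_customer, on="customer_id")
print(one_big_table)
```

Same data, different shapes, each serving a different use case.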
There is no one right way to model data. It's a generic term. We are always modeling the data with a specific use case in mind - AI, BI, an application, etc.
Which requires data teams
talking
with
business teams.
I’m here,
Sawyer
Quick Question
Over the next several days, we are going to be talking about data modeling.
So today, just a question to get us started:
If you see “data modeling” on a job description (or in a memo about what the data team has been working on), what jumps to mind first?
Hit reply and tell me about it.
Sawyer
An LLM for you
I’ve talked with CIOs and CTOs who find Large Language Models (LLMs) scary.
In fact, numerous companies have enough concerns that they have restricted their employees from using LLMs like ChatGPT.
Why?
They are concerned ChatGPT will eat their data and not give it back. At this point, it’s safest to assume anything typed into the prompt is consumed by the model and used to continue training the model.
The minute a developer starts dropping code blocks into GPT for help finding a code bug, a potential code leak occurs.
The same thing with marketing, finance, HR, product, or any other type of confidential company information.
It’s a real problem.
Early solutions are forming:
Bing Enterprise promises “user and business data is protected and will not leak outside the organization. You can be confident that chat data is not saved, Microsoft has no eyes-on access to it, and it is not used to train the models.”
Essentially, you can use these great LLMs, but the LLM won’t eat your data.
Another alternative is to build an LLM for your company, on your company’s data, with security restrictions guarding access. In this case, you might want the LLM to eat your data and continue to train based on how your business teams are interacting with it.
Finding LLM use cases at your company is quickly becoming not just feasible, but a competitive advantage.
Hit reply and tell me how you are thinking about LLMs at your company.
I might be able to help bring some of your dreams to reality (or tell you that they should stay dreams for the time being).
It was good to see you today,
Sawyer
LLM Bottlenecks
I’ve been talking about Generative AI and LLMs a lot recently here.
There are dozens of exciting use cases with tangible business value. So what is stopping everybody from adopting these technologies across their business?
Two key bottlenecks:
Data quality and data management
If you need to build an LLM on your proprietary data, then the quality and management of your data is paramount. LLMs are already known to hallucinate. Giving the model bad data is only going to create more variability in the quality of responses you get.
Computing costs
These models are very compute intensive, which is costly. A larger model, or heavier usage, can drive up the costs significantly.
Computing costs will come down with time - as the hardware becomes cheaper and the models become more optimized.
But data quality and management only gets worse with time.
So while you wait for computing costs to drop, spend your time building data quality standards and management best practices.
That way, when your budget is ready for LLMs
Your data will be as well,
Sawyer
Here to stay
The big players are already finding Large Language Model (LLM) use cases. It’s already a revenue-generating feature for big tech businesses.
But what about your business?
Most business leaders are caught up in daily execution and quarterly planning. It’s hard to take in macro-technology trends.
Here are a few foundational ways LLMs could help your business:
HR
Employee onboarding buddy via a conversational AI model. New employees are bombarded by employee handbooks, company acronyms, department guidelines, processes, etc. A conversational AI chat model, trained on internal company data specifically for new employee orientation, could provide an improved new employee experience.
Benefit enrollment time: HR reps respond to tons of emails and questions around benefit enrollment time. They send out emails, video walk-throughs, and documentation, but it doesn’t stop the onslaught of questions. A conversational AI chat model trained on HR benefits and enrollment data could streamline the request queue dramatically.
Sales
Writing contracts and statements of work: Generative AI is bad at creative and unique content creation, but really good at rote and standardized template work. A conversational chat workflow could let the sales rep input customer information and then auto-generate the contract or SOW for review.
University Accreditation
Accreditation is a multi-year process that can take up several full-time roles at a university. The amount of paperwork, forms, documents, and requirements is overwhelming. LLM models are great at synthesizing and processing the large volumes of information an accreditation office is required to review, understand and abide by.
Project Management
Between status reports, budget burndown, and change requests, project managers handle a large amount of information and process flow. An LLM copilot could auto-generate core aspects of status reports. For executive leadership looking for status updates, a conversational AI model could provide easy access to up-to-date project information on demand.
LLMs are here to stay. Get creative. Have fun.
Tell me about your business and what ideas you have for how LLMs could improve your workflow.
Sawyer
Quality AI
Generative AI is inching its way into tons of consumer products.
Gmail will help you draft an email.
Linkedin will write a post for you.
Github Copilot will generate code for you.
Notion AI drafts ideas for you or continues writing where you left off.
Very few spaces will be without AI infusion over the next couple of years. Everybody will claim to have generative AI, but the quality will vary dramatically between these services. For the time being, most Generative AI is just a starting point, a draft, an idea generator. Only in rare cases will it produce content you can send out the door without review. The outputs can be repetitive, generic, and often straight-up wrong.
Infusing Generative AI into areas of your business can create significant value (more on business use cases in a future email).
What shapes the quality of a Large Language Model output? Three key factors:
The structure and quality of the prompt provided to the model.
The model parameters (how much randomness or uniqueness is allowed, the max response length, how much repetitive content is tolerated, etc.)
The data the model is trained on.
Number 1 is likely a skill business and analytical users will learn.
Number 2 is a deep technical specialization from an AI engineer.
Number 3 will fall on the data team.
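As a rough illustration of Number 2, here is what those dials look like in code. This is only a sketch against the OpenAI Python library as it looked in mid-2023 (the interface has since changed), and the prompt is invented; the point is simply that randomness, response length, and repetition are parameters somebody has to tune.

```python
import openai  # pre-1.0 openai package interface shown here

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Summarize this quarter's sales notes: ..."},
    ],
    temperature=0.2,        # less randomness -> more consistent, less creative output
    max_tokens=300,         # cap on response length
    frequency_penalty=0.5,  # discourage repetitive phrasing
)

print(response["choices"][0]["message"]["content"])
```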
Improving the quality of the AI output will require iterative collaboration between these groups
Business teams and data teams
Building better data experiences
With LLMs.
I’m here,
Sawyer
First-Class or Coach
There are two dramatically different experiences when flying.
The average flight experience looks like this:
Cheap and distant airport parking and shuttle to the terminal.
Regular security lines with queues the length of a few football fields
Scarce seating of questionable comfort around the departure gate
Long wait to board. Seat near the back. Minimal overhead bin space
Average service during the flight. Standard drink and snack options.
An elite experience looks like this:
Parking close to the airport terminal or Uber drop off curbside
TSA Pre-Check for a quick breeze through security.
Executive Lounge access with free food, drinks, fast internet, and a comfortable quiet place to sit.
Early and quick plane boarding for a first-class seat.
Comfortable flight with prompt service from flight staff.
Both these travelers are flying to their destination. The flight itself is the same length of time.
But the top-notch experience could require up to a couple of hours less time and lead to significantly less stress upon arrival (“Will the shuttle pick me up on time?”, “Will I make it through security quickly enough for my flight?”)
This is the difference between the average company’s data experience and those at the top of the field. Both companies are “doing data analytics”. But the amount of time it takes and the amount of stress it causes them tells the real story.
The north star for data teams is to build better data experiences.
It was good to see you today,
Sawyer
Board Games
You are throwing a party. A board game event for you and a group of your best friends.
On the invite, you say “Bring along your favorite board game or two”
The night arrives and your friends begin to show up.
The first friend, Joseph, arrives with a board game under his arm. He heads straight for the living room coffee table and begins to set up the game.
Heather shows up next and she quickly pulls out a board game and sets it up on the kitchen table.
Finally, Carlton arrives (he’s always a bit late). He also has a game under his arm and he marches straight to the dining room and sets up his game.
Initially, you are thrilled about the evening ahead. Three of your best friends are here and they brought some of their favorite games to play. It’ll be a great evening.
Then you notice Joseph is sitting at the coffee table neatly adjusting the pieces of his game, Heather hasn’t said hi to anyone because she is intently organizing the cards for her game in the kitchen, and Carlton (despite his late start) has his game entirely set up and is waiting patiently for someone to join him.
You wander around and individually ask each friend what’s going on. They each express excitement about their game, insisting it’s an excellent choice for the evening, and are confident that the others will love it.
So they each sit there, with an excellent game in front of them, but no one to play with.
Business teams do this with data tools all the time.
Each business org has a favorite data visualization and business intelligence tool. Marketing uses Tableau. Operations uses Power BI. A finance manager loves Qlik. There are a few people on Looker in the Sales Analytics team.
They each express excitement about their BI, insisting it’s a great choice for the company, and are confident that the others will love it if they give it a try. But no one wants to give up their favorite tool.
And so they all sit there, with an excellent tool in front of them, hoping - dreaming - of collaborating across divisions and sharing analytics. All while remaining quite comfortable in their BI tool silo.
I’m here,
Sawyer
p.s. Most data problems aren’t data problems. They are people and culture problems. Which is a problem Chat-GPT-n won’t be able to fix anytime soon.
The Treehouse
Building a treehouse is terribly inefficient.
A treehouse is far inferior to a regular house.
No plumbing
No electricity
Exposed to the elements of rain, wind, snow, and heat.
Only accessible via a ladder
Short life span depending on how healthy the trees are.
Much smaller square footage
Why would you ever want to build one?
Because they are awesome.
They are an amazing canvas for a kid's creativity, imagination, and exploration.
Climbing a ladder doesn’t bother them. It adds to the fun. The wind, rain, and snow? Part of the adventure.
Does it need to last for 30 years? No. It needs to last through the peak childhood years. 10-12 years max.
Building data analytics in Excel is terribly inefficient.
Excel is far inferior to a database and ETL technology.
Limited automation
Smaller capacity
Difficult to share or collaborate with colleagues
Prone to all sorts of formula errors, typos, and manual configuration mistakes.
Easy to lose, misplace or delete a file by mistake
Why would you ever build analytics there?
Because it’s an amazingly flexible and powerful tool for business users.
With just a bit of training and experience, it becomes a tremendous canvas for creativity, imagination, and exploration.
Do you care that it can’t hold 1TB of data? No. It only needs to hold this month's sales for the department.
Do you care that there is no automation? No. The business team wants the flexibility to adjust it as they see fit.
Prone to errors? Yes, but also very easy for a business user to write some formulas and uncover insights.
Does it need to last for 10 years as an enterprise data store? No. Just this month while we evaluate our sales process.
Do I want to live in a tree house? No way. Give me a home with climate control and indoor plumbing. But do I want a tree house in my backyard? Absolutely.
Do I want to build an enterprise data strategy around Excel? No way. Give me a scalable database, ETL technology, and analytical tools. But do I want Excel at my company? Absolutely.
I’m here
Sawyer
from The Data Shop
Before you walk.
You crawl before you walk.
I talk with companies almost weekly and hear something like "we want to do advanced analytics, AI/ML, LLM Models, etc."
I love to hear companies passionate about the transformation data can have on their business.
But as the conversation goes on, it's clear they are trying to run a marathon before they have figured out how to lace up their shoes.
The path to advanced analytics, with very few exceptions, goes through this progression.
✅ Step 1: Do you have batch ETL with a well-modeled data warehouse and operational reporting? Have you put basic data cleansing and quality standards in place? Is there any governance or data dictionary deployed?
Great. This is your descriptive analytics. You now know what’s happening in your business.
✅ Step 2: Do you have the talent and the tools to perform root cause analysis? Is your data stored so you can meaningfully look at historical trends, perform data mining, and run A/B analyses?
Good work. This is your diagnostic analytics. You know why something is happening in your business.
In most organizations, these two steps will take several months to a few years.
This work will have a much larger ROI than trying to jump into predictive or prescriptive analytics without a good foundation.
If you are working on these first two steps and not making the progress you hope for, hit reply and tell me about it.
I help companies take the first step.
Thanks for being here,
Sawyer