The Data Daily
Less than 2 minutes to read each morning.
Not sure if you want in? Read the archives below.
5 days a week since May 1st, 2023.
What if it was easy?
What is good data?
Earlier this week I shared that my “Why” for working in data is “Good data makes lives better”.
Good data measures up. It’s aligned to the standard.
Here’s one part of the standard…one way to think about measuring up.
Ease of use.
Without question, good data is easy to use, consume, and understand. It can’t make lives better without being easy.
Here’s a bit about what that could mean.
Accessible data - Data I can find easily, have access to, and can use without skills I don’t have.
Clear data - Data I can understand, can explain to others, and that answers my questions.
Reliable data - Data I can expect at a specific time, in the expected format, with an appropriate level of quality.
Integrated data - Data that merges, translates, and connects with my other key data points.
etc.
Together, we could continue to come up with a very long list.
In short, good data is easy to use.
And good data - data that measures up to the standard - makes lives better.
Before you go, tell me about one area of your life where you encounter data. Rate the ease of use that you experience. On a scale of 1-10, how does it measure up?
We’ll talk more about good data tomorrow.
Sawyer
The Standard
If “Good data makes lives better”
Then what is good data?
Your mind might jump to data quality and removing duplicates or filling missing values. But that’s a small part of what “Good data” really is.
In my home, we have a chart hanging on the wall marking the heights of my kids at different ages. At 3 years old one child was 35 inches, another child was 40 inches at 5 years old, and at 7 years old my eldest was nearly 49 inches. On and on it goes.
It’s enjoyable and nostalgic to revisit their heights at different ages, and maybe compare the growth curves at different points in time. But how tall they are doesn’t really matter much. Until you have a standard to measure against.
We planned a trip to a water park recently and beforehand we read online about the height requirements for some of the rides. Quickly I shuffled each of my kids against the wall chart to see how they “measured up” to a standard. The oldest two cleared most of the requirements. The youngest will need to wait a couple of years before he can ride everything.
Once we knew what was required, we had something to compare against. Sure, it was an arbitrary standard that doesn’t hold any meaning for the rest of life. But in the world of the waterpark, the height requirement was the law. It was a binding standard by which everything else had to be measured.
Good data meets the standard. It measures up.
The expectations, requirements, and usefulness of data will vary based on the work I do and the data I need. But there is always a standard.
Sometimes an unexpressed standard. Ideally, clear and thoughtful standards.
Over the coming days, I’ll share what I think “good data” standards are. And specifically what the standards are that make lives better.
I’m here,
Sawyer
Good data makes lives better
There’s a famous Simon Sinek video where he presents the main concept that made him famous: Start with why.
It’s compelling in a number of ways. Simon is an excellent speaker. The diagram he draws is simple and communicates clearly. He gives illustrations (Apple) that tickle our imagination and nostalgia.
But most importantly, he presents a vision for work we all desperately want: a meaningful “why” for what we are doing.
We want a purpose, a deep driver, something at the innermost circle to make us show up each morning at our desk.
What’s the why for working with data?
Let me present one idea that I’ve settled on for myself over the years for why I work with data:
Good data makes lives better.
There is a lot we could unpack in that sentence (and we will in the coming days), but for now, I offer it to us as a starting place. If you haven’t articulated “why” you work with data (or whatever your professional work is), then maybe that statement can get the creative energy flowing for you.
In the next few emails, we will talk more about this “why” statement.
What is “good data”?
How does it “make lives better”?
What does a “better” life look like?
Before you go today, tell me this - what is your “why” for your career, profession or job? Hit reply and tell me about it.
I’m here,
Sawyer
Bring your creativity to data
The ever-present debate on LinkedIn among data people is the importance of soft skills vs hard skills.
Should you learn SQL and Spark? Or communication skills, data storytelling, and requirements gathering?
A more interesting question to me is not “which skills are more important”.
But “which skills keep you in the data profession?”
Are you here because you love table partitioning strategies and tuning Spark jobs? Or explaining forecasts and analytical models to executives?
For me, it’s creativity.
Creative expression is what keeps me in the data profession.
Numerous technical and "hard" skills are involved in working with data, but creativity is the lifeblood of innovation and ideation.
Here are a few ways creativity shows up in data work:
Creativity in data connections. Combining metrics or attributes in unconventional or unique ways leads to insights.
Creativity in data modeling. Modeling data is an art form and is a canvas where a data modeler gets to represent business concepts in data.
Creativity in data storytelling. How you communicate data is just as important as what data you communicate.
Creativity in design. Many technical or design problems don't have correct answers. A bit of creativity can generate the ideas necessary to build excellent solutions.
Both business teams and data teams have abundant opportunities to express and explore their creative side when working with data. Don’t get stuck in the “hard skill” vs “soft skill” debate.
It was good to see you today,
Sawyer
What should we spend our time on?
This question comes up in data teams and business teams alike.
During leadership meetings, sprint reviews, and one-on-ones with managers.
We try to define what is valuable and so we ask:
“What should we be spending our time on?”
I've asked myself that many times.
Yet, time is a cheap metric for value.
“Spending time” on something doesn't produce value.
It's actually shorthand for something else.
We assume if we are spending time on something, then we will be producing a valuable output.
So we spend weeks or months creating systems and policies around “tracking time”.
When that was never the goal.
The real goal is to create valuable output. So instead of tracking time and hoping, praying, dreaming that we get the output we want…
Let's skip the shorthand and track output first.
Thanks for being here,
Sawyer
from The Data Shop
p.s. The opposite is true for personal life. Time spent is a great measurement. Time spent reading books to your kids, going on a walk with your partner, mountain biking, playing an instrument, etc. Those activities are done for the joy of it - please don’t try to measure output. Happy Friday - enjoy this weekend.
Contractors and Consultants
Before you send out that RFP or start contacting IT shops, answer this question.
Do you want a consultant or contractor?
Don't listen to the labels people put on themselves. Contractors call themselves consultants and vice versa.
Instead, listen to the questions they ask.
"What would you like me to do?" = Contractor
"Why are you interested in doing this project?" = Consultant
"When can I start?" = Contractor
"Why are you doing this project now?" = Consultant
"What's your budget?" = Contractor
"How does this project impact your business?" = Consultant
“Who else are you considering hiring for this project?” = Contractor
“What type of service provider do you think will best serve your company’s needs?” = Consultant
Deciding which questions you want to answer will tell you which person you want to hire.
There are plenty of scenarios where you might want a contractor. Plenty more when you might want a consultant.
Neither is right or wrong. But hopefully, you can start the conversations knowing what you are looking for.
It was good to see you today,
Sawyer
It’s more fun this way
I’ve had those moments in my career where I sit at my desk wondering why I chose this career path.
Data is hard.
Sure, lots of other job types are hard as well. But I work primarily with data, so that’s usually the object of my frustration and despair.
During the hard days, I’ve tried a number of things. Going for a walk. Taking a nap. Asking for help from a teammate.
The remedy that most often cures my ailment is returning to the business story.
Data is easier when you understand the sales team’s deal pipeline.
Data problems are reshaped when you understand the bank branch’s operational needs.
Prioritizing a backlog of data requests is easier when you understand the COO's objectives for the fiscal year.
Dashboard design finds new life when you grasp an accounting team’s reporting workflow.
Data is just more fun with business mixed in.
Question for you:
What business use case are you working on right now? What conversations can you have this week to enhance your understanding of how the business functions?
Sawyer
Expensive <> Price
A $10 book you only pick up once for 15 minutes is expensive.
A $150 book you reference once a week for years is cheap.
The cheap option that's wrong or ignored is very costly. The expensive option that meets the need is cheap.
This is true for books, electricians, plumbers, data consultants, and data tools.
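Here’s the arithmetic behind the two books, as a rough sketch. The function name is mine, and since “for years” isn’t a number, I’ve assumed 3 years of weekly use. Swap in your own figures.

```python
def cost_per_use(price: float, uses: int) -> float:
    """Price divided by the number of times the item is actually used."""
    if uses <= 0:
        raise ValueError("uses must be positive")
    return price / uses


# The $10 book you pick up once.
cheap_book = cost_per_use(10, 1)

# The $150 book you reference once a week.
# ASSUMED_YEARS is an assumption for illustration; the text only says "for years".
ASSUMED_YEARS = 3
reference_book = cost_per_use(150, 52 * ASSUMED_YEARS)

print(f"$10 book:  ${cheap_book:.2f} per use")
print(f"$150 book: ${reference_book:.2f} per use")
```

Under that assumption, the $150 book costs under a dollar per use, while the $10 book costs the full $10 for its single use.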
If you are a buyer, be cautious of cheap options. Rarely does cheap come with easy maintenance, great support, and high customer satisfaction. But also be wary of arbitrary budgets and price perceptions; expensive and sticky problems are worth the cost to fix.
If you are a seller, being cheap often introduces frustrations into the solution. Be great at delivering value and the revenue will take care of itself. This doesn't mean you do everything great, or even most things. But you are clear about what you can successfully deliver. You underpromise and over-deliver every time.
I’m here,
Sawyer
Niche Expertise
Recently I had a call with a Principal Software Engineer named Michael.
He has over a decade of experience in engineering at a variety of start-ups and enterprises. He holds an advanced degree in computer science. As he unpacked his credentials to me at the beginning of the call, I was impressed.
And a bit intimidated.
We were on the call because Michael had scheduled it to ask me for help.
He’d been tasked with building data pipelines and the beginnings of a data platform at his company. Despite all his vast experience in software engineering, Michael was struggling to design and implement a solution.
We spent over an hour talking through data platform design fundamentals, data modeling practices, and noticing areas of his work that might be landmines waiting to explode. At the end of the call, he said “I’m going to schedule more time in a couple months. This was really helpful”.
Even with his advanced skills and extensive experience, he was honest enough to ask for help when tackling something he wasn’t confident in. That’s probably why he’s so accomplished in his career.
You likely have elite expertise in certain areas that make you great at what you do. That expertise also contains inherent blind spots. The fastest way to scale your expertise is to collaborate with an expert in another niche.
You be great at what you do. Let someone else be great at what they do. Work together when you are ready to scale.
I’m here,
Sawyer
Intel for the front lines
“Leadership won’t get behind my data efforts.”
“The executives haven’t approved the budget we need to build our data team or tech stack.”
“I don’t think the VP has viewed the dashboard that we built for her a single time.”
Instead of trying to get leadership to care about data, shift your thinking to care about the business problems facing leadership.
Here’s one idea…
How often in the last 18 months have you thought about the impact of rapidly rising interest rates on your organization?
That topic has come up in every executive board room and has played a role in future forecasts for every C-Suite.
Hoping that leadership will one day start caring about data is futile.
Instead, find the topics that keep them up at night and that show up on board meeting agendas.
They don’t want a shiny new toy.
They want intel to help on the front lines of navigating complex macroeconomic conditions.
It was good to see you today,
Sawyer
Free Software
Yes, some software is free. It’s called Open Source.
The source code was built by practitioners, maintained by a community, and is freely available.
Some household names are open-source projects: Mozilla Firefox, WordPress, VLC Media Player.
Also, some very popular data tools: PostgreSQL, Spark, Airflow, and Kafka.
If ‘free’ sounds like a great price, you might be thinking “Sawyer, tell me how to get some free software!”
Free is great, but it comes with a ton of work. And while the software is free, the computers required to run it aren’t free. And the expertise required to maintain it isn’t free.
It’s like a “free puppies” sign next to the road.
Yes, the puppy was free, but between vet bills, supplies, and food it’s far from free. It also demands a ton of your time and restricts what and when you can do certain things.
Most people opt to pay for a managed version of open source projects: Databricks, Confluent, WordPress.com (instead of .org).
Open Source software plays a huge role in the data world. Hit reply if your team is using an Open Source project or considering using one.
I’m here,
Sawyer
p.s. Next week, June 20th, I’m presenting on the Open Source project Spark at the PASSMN SQL Server User Group. It’s a virtual user group so you can attend from anywhere. Here’s the info so you can attend.
Hits and Misses
Success is always surrounded by failures.
In 1996 musician Joni Mitchell released a compilation album of her greatest hits during her 30-year career.
It was exactly the sort of thing a beloved artist normally does after decades of success.
But when her record company came to her requesting that she release a "Hits" album, she agreed under one condition.
That she could also release a "Misses" album filled with songs that never hit the charts. They were hand selected for "Misses" by Joni and were some of her favorite and more experimental songs.
Both "Hits" and "Misses" were released on the same day.
Of course "Hits" became a certified Gold album and "Misses" didn't sell nearly as well. But the message was clear for everyone who paid attention.
Hit songs only show up alongside a bunch of misses.
Put in the work. Be resilient enough to take chances. Build a collection of misses.
Odds are you'll find a few hits scattered in as well.
I’m here,
Sawyer
Progressives and Preservationists
Every organization needs two kinds of people: preservationists and progressives.
In politics, we call them conservatives and progressives.
In religion, we call them orthodox and heterodox (or heretics).
In business, it's the "best practices keepers" and "disrupters".
These people either conserve or create.
No matter which group you are in, you are perpetually frustrated by the other.
A progressive is constantly recreating the old and reimagining it for the future. A preservationist believes in maintaining strong continuity with the past.
As you read this, at least one person popped into your head who's in the "other" group. You also have a pretty strong hunch about which one you fall into. The divide between groups occasionally falls along generation lines, but more often personality types and past experiences. Every company, team, and social group needs some of both.
In practice, this looks like a data team frustrated by business practices that prioritize the status quo over data-driven insights. Or, an innovative business strategy group that’s hamstrung by legacy data systems and processes.
The contrast and stress between both groups allow them to thrive by restraining their extreme tendencies and highlighting their strengths.
Enjoying and navigating this tension will take you far.
I’m here,
Sawyer
Where’s your data?
There are so many ways you can store data.
File system. Data lake. Database. Data warehouse. ERP or CRM.
There are tradeoffs everywhere you look.
An ERP is easy because most of the data management is handled by the software. But it also has rigidity and it’s isolated from data in other key systems.
A file system is infinitely flexible, with numerous file types and methods to structure your file storage. It is also cheap. But that flexibility can also make management a nightmare.
Databases are very popular and do many things very well. But only certain types of data (structured) thrive in databases which leaves you stuck with other types of data.
Data Warehouses are great at handling tons of data and tons of queries, but the cost can often be prohibitive for smaller companies.
The list of options and considerations can go on for a long time. Security, scaling, governance, costs, integrations, performance, etc.
You are always making trade-offs. Always attempting to optimize for business goals.
It was good to see you today,
Sawyer
p.s. Facing questions about how and where to store your data? Book a free 30-minute call.
When do you need it?
You’ve hit “Add to Cart” and landed on the Amazon checkout screen.
There’s one important decision left to make. It’s not about your payment information or shipping address. Amazon probably has that saved and ready to go for you.
No, the decision you need to make is this: When do you want it to arrive?
Usually, a few options are available to you ranging from same-day delivery to a week or more.
You pick the option that’s best for your situation. Order a birthday present for next month? Pick the 1-week delivery. A birthday present for tomorrow? Same day, please!
Business data has a shipping option as well. We call it latency.
Latency is the time from when data is produced to when it is available for a user to view.
When designing data solutions, data teams rely heavily on business teams to determine the required latency.
Is the birthday tomorrow or next month?
How quickly does this data point need to arrive?
The cost, complexity, and design of the data solution depend on answering “When do you need it?”
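If it helps to see latency as a number, here’s a minimal sketch. The timestamps, the one-hour requirement, and the function name are all made up for illustration. In real life the requirement comes from the business team.

```python
from datetime import datetime, timedelta


def latency(produced_at: datetime, available_at: datetime) -> timedelta:
    """Time between when data is produced and when a user can view it."""
    return available_at - produced_at


# Hypothetical example: an order is placed at 9:00 and shows up
# on the dashboard at 9:45.
produced = datetime(2023, 6, 1, 9, 0)
available = datetime(2023, 6, 1, 9, 45)
observed = latency(produced, available)

# Hypothetical requirement: the business team needs the data within an hour.
required = timedelta(hours=1)
meets_requirement = observed <= required

print(f"Observed latency: {observed}, meets requirement: {meets_requirement}")
```

Here the data arrives 45 minutes after it was produced, which fits inside the one-hour requirement. Tighten that requirement to five minutes and you are designing (and paying for) a very different solution.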
I’m here,
Sawyer
“Making Money” at a non-profit
Last week I suggested you should answer “how do you make money?” at your company. It’s the easiest way to connect your data to value.
A thoughtful reader wrote in about non-profits:
The challenge with the non-profit space is how to align that reality with the "we don't do this for the money" thinking. Mission and vision always come before fiscal feasibility, and then the development teams are told to just raise more funds. So how does this impact the data teams? I think it flattens the value of the organization's data assets and each department clambers to be at the top of the analytics list. Data teams get vague responses from executives on what's a priority and they're told everything is or will be critically important for the mission.
In for-profit companies, connecting value to money is easy. Money is the primary way companies maximize shareholder value - which is the purpose of a for-profit company.
Rather than “maximizing shareholder value”, in non-profits it could be:
Shareholders = the people behind the public good identified (students in education, communities needing disaster relief, hungry children in need, etc.)
Value = how you measure the good you provide to your shareholders.
“Maximizing shareholder value” becomes “Effectively serving a group of people”. All non-profit mission statements essentially boil down to that. Of course, the leadership likely comes up with a more eloquent way of stating their mission and vision than that.
The business teams and data teams then work together to define how to execute and measure progress toward that mission.
Which isn’t that much different than asking…
“How do you make money?”
I’m here,
Sawyer
The Rope Bridge
You look out across the gaping chasm in front of you.
The only way across is a rope bridge.
On close inspection, you see evidence of weakness. Fraying ropes. Cracked and sagging footboards.
You glance around at the people adventuring with you deciding what to do next.
What’s the best-case scenario?
No one walks across that bridge.
The potential for plunging into a rocky canyon below is too high.
You look at your data pipelines and the resulting dashboard.
On close inspection, you see evidence of inaccuracies. This metric appears to be missing data. This column looks like the wrong data type.
You glance around at the business teams and executives who rely on this data.
What’s the best-case scenario for your business when you have data quality issues?
No one walks across that bridge.
No one touches that dashboard.
The potential for making a key business decision based on faulty data is too high.
The potential for leadership to lose trust in the data is too costly.
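You can inspect the ropes before anyone steps onto the bridge. Here’s a rough sketch that checks for the two problems mentioned above: missing data and the wrong data type. The rows, column names, and the find_issues function are hypothetical. A real pipeline would likely lean on a dedicated data quality tool.

```python
def find_issues(rows: list, expected_types: dict) -> list:
    """Describe every missing value or wrong-typed value found in rows."""
    issues = []
    for i, row in enumerate(rows):
        for column, expected in expected_types.items():
            value = row.get(column)
            if value is None:
                issues.append(f"row {i}: {column} is missing")
            elif not isinstance(value, expected):
                issues.append(
                    f"row {i}: {column} is {type(value).__name__}, "
                    f"expected {expected.__name__}"
                )
    return issues


# Hypothetical rows feeding a dashboard.
rows = [
    {"region": "North", "revenue": 1200.0},
    {"region": "South", "revenue": None},   # missing data
    {"region": "East", "revenue": "950"},   # wrong data type (text, not a number)
]

issues = find_issues(rows, {"region": str, "revenue": float})
for issue in issues:
    print(issue)
```

A check like this won’t rebuild the bridge, but it tells you which boards are cracked before leadership puts their weight on them.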
It was good to see you today,
Sawyer
How do you make money?
If you want data to be a valuable part of your business (which is a good idea) then you need to define what’s valuable at your company.
In short, you need to know how your company makes money.
If that sounds like a simple question to answer, spend a few minutes with these questions:
What are our sources of revenue?
What is the profitability of different revenue sources?
What are the factors that impact our profit positively or negatively?
What are the main avenues available to grow revenue?
What specific value do our customers ascribe to our product or service?
How do our physical assets (real estate, vehicles, equipment) impact our revenue or profitability? Are they cost centers or profit centers?
How do our revenue and profit align with the company’s mission and values?
etc.
Adapt those questions to match the nature of your organization. Get curious. Come up with more questions. When you don’t know the answer, find people in your company who do.
Data will never deliver value if you haven’t defined what’s valuable for your organization.
I’m here,
Sawyer
Blindfolded
I never cease to be amazed at the games kids come up with.
We have a large open room in our basement with no furniture at the moment. My three young boys love playing down there. Their most recent game - a version of Marco Polo.
You’ve probably played something similar. In this version, one boy wraps a towel, sweatshirt, or robe around his head (seemingly to prevent him from seeing and breathing), and begins to call out “Marco” while waving his arms around trying to tag his brothers. His two brothers call out “Polo” in response.
They have one important rule. In addition to “Polo” the boys who aren’t blindfolded are required to yell “Stop” if the blindfolded “Marco” is about to run into a wall. I appreciate their attempt at safety.
Inevitably though, the blindfolded boy slams into a wall face-first. “Why didn’t you yell, ‘stop’?” he screams at his brothers while rubbing his head.
Responses along the lines of “I forgot”, “I didn’t see it coming”, or just “oops” are common.
Too many teams and companies attack Data Projects blindfolded.
There might be skills and experience on your team, but blind spots are inevitable. The worst part of blind spots is you don’t know when you’re about to hit a wall.
Finding someone to call out “Stop!” can save you more than a headache. It can save you from wasting months of time and hundreds of thousands of dollars. Whether in design, development, deployment, or delivery to stakeholders, an experienced outside perspective matters.
It was great to see you today,
Sawyer
AI anxiety
AI anxiety is a real thing.
At some point in the last 6 months, it’s likely you’ve felt some form of concern, stress, fear, or anxiety over a future with AI.
I have.
Much digital ink has been spilled talking about the future of jobs in a GPT-n world. Here’s a bit more ink poured out.
Yes, many things about your work will change if you write code, create content, organize lawsuits, or file taxes.
But there is no future I’ve yet imagined (or seen the interwebs imagine) where GPT/LLMs fix the most expensive problem companies face.
People problems.
GPT won't fix people problems and culture problems.
If there is one skill worth investing in for the AI future, it’s navigating complex people, teams, and organizational alignment.
GPT might automate many lower-level tasks, but higher-level (and higher-value) problem-solving appears to be still on our plate.
A tiny mission of this daily email is to make you better at just that.
Helping data people work better with business people. And vice versa.
I’m here,
Sawyer