DataFramed

Episode · 5 months ago

#102 How an Always-Learning Culture Drives Innovation at Shopify

ABOUT THIS EPISODE

Many times, data scientists can fall into the trap of resume-driven development: learning the shiniest, most advanced technique available to them in an attempt to solve a business problem. However, this is not what a learning mindset should look like for data teams.

As it turns out, taking a step back and focusing on the fundamentals and step-by-step iteration can be the key to growing as a data scientist, because when data teams develop a strong understanding of the problems and solutions lying underneath the surface, they will be able to wield their tools with complete mastery.

Ella Hilal joins the show to share why operating from an always-learning mindset opens the path to true mastery and innovation for data teams. Ella is the VP of Data Science and Engineering for Commercial and Service Lines at Shopify, a global commerce leader that helps businesses of all sizes grow, market, and manage their retail operations. Recognized as a leading woman in data science, the Internet of Things, and machine learning, Ella has over 15 years of experience spanning multiple countries and is an advocate for responsible innovation, women in tech, and STEM.

In this episode, we talk about the biggest mistakes data scientists make when solving business problems, how to create cohesion between data teams and the broader organization, how to be an effective data leader that prioritizes their team’s growth, and how developing an always-learning mindset based on iteration, experimentation, and deep understanding of the problems needing to be solved can accelerate the growth of data teams.

You're listening to DataFramed, a podcast by DataCamp. In this show you'll hear all the latest trends and insights in data science. Whether you're just getting started in your data career or you're a data leader looking to scale data-driven decisions in your organization, join us for in-depth discussions with data and analytics leaders at the forefront of the data revolution. Let's dive right in. Hello everyone, this is Adel, data science educator and evangelist at DataCamp. One thing we keep thinking about here at DataCamp is how important a culture of continuous learning is for data teams. Data science is still relatively nascent compared to other technology disciplines like software engineering, and ever so often we see new frameworks, new tools, and new ways of work for data teams. This definitely requires a culture of continuous learning for data scientists, and this is why I'm so excited for today's guest. Ella Hilal is the VP of data science and engineering at Shopify's commercial and service lines division. She is a well-seasoned data leader with an extensive resume that I'm not doing justice with this short introduction. She has led a variety of projects and is an expert in areas such as data analysis, machine learning, autonomous systems, and IoT, to name a few. She's also an incredible learning advocate for the data scientists that she leads. Throughout the episode, we speak about her experience leading data teams at Shopify, how data scientists can develop a continuous learning mindset, how data leaders can create space for innovation within their teams, some of the use cases that she's worked on and her biggest learnings from them, and much more. If you enjoy this episode, make sure to rate, comment, and subscribe, but only if you like it. Now on to today's episode. Hello, it's great to have you on the show. Thank you. I'm really excited to be here. Awesome.
So I'm excited to discuss with you the data science powering Shopify, how you approach an always-learning mindset, how you lead data teams, and more. But before we begin, I'd love to talk about your background and how you got to where you are today. So can you briefly walk us through your journey and how you joined Shopify? So I will start with: I'm a girl from the Middle East. I actually went to university in Cairo and I studied computer engineering. Then I traveled to do my master's. I did my master's jointly between Cairo and Ulm University in Germany, which was amazing. I got to learn a lot, take some courses from other German universities, and visit the different campuses around Germany. At the time I had full scholarships from Fulbright and DAAD, and in Canada too, from OGS and NSERC, and I ended up landing at the University of Waterloo, where I studied machine learning and AI, and then I graduated. I'm not going to take you through my whole career, but maybe I'll give you a couple of highlights. I have a PhD in pattern analysis and machine intelligence. I started my career as a software developer, because when I started, data science was not a thing at the time, and from there I started leading innovation teams, then data teams, and then grew into leading the data science organization in a company called Intelligent Mechatronic Systems. Then I moved into Shopify as director of data for Plus, which is our large-sized merchants, and International. Plus covers a lot of our big merchants, like Tesla, General Electric, some of the Kardashians, you name it. Anybody who is anybody is on there. We have lots of very amazing, very talented merchants building their own brands. And International started with the mission of making Shopify a perfect market fit for all the markets we're in.
We're already in a hundred and seventy-five markets, but we started with the intention of making it feel like it's solving local needs, not just a global platform that operates regardless of the needs of the merchants. And from there I grew...

...into leading the growth and revenue organizations. And now I'm the VP of data science, heading all the commercial and service data science teams. And given your extensive experience as a data leader at Shopify, one thing that I've seen you speak about, and this is definitely required to succeed in such roles, is developing an always-learning mindset, and that's really what I want to center today's episode on. So I'd love to set the stage for our conversation today: how do you define an always-learning mindset, and why do you think it's so important for data scientists to develop within their careers? I think this is the most important superpower data scientists can have. I talk to a lot of leaders in the data science world and they say, oh my goodness, we need somebody with a PhD, or we need somebody with a master's, or experience with X, Y, or Z, and I'm like, don't get me wrong, I do have a master's, I do have a PhD, but I don't think that's what makes a good data scientist. I actually think the good data scientist is the one who has this learner's mindset. A learner's mindset, I define it as the person who is able to go back and learn, who doesn't get stuck in what they know. They are able to keep collecting additional tools, additional formats of thinking about data, philosophies, mindsets, and frameworks along their journey to add to their toolbox. The data science craft in general is evolving rapidly, and it's at a relatively early stage compared to engineering and other crafts that have been there for many, many more years. Because of that, things are evolving fast. Frameworks keep forming and evolving fast, and we're coming up with new techniques and new approaches all the time. The trick is not knowing the latest all the time; the trick is being able to learn new techniques when the right questions and the right setups come.
So it's not about the shiny new thing, it's about picking the right tool out of your toolbox, and if it's not there, going and finding it, adding it, and learning how to use it. That's really great. I'm excited to expand on the methodology for learning that you've acquired over the years. One thing that you touched upon is that data science is not relatively mature yet compared to other fields. But also, data science is inherently multidisciplinary: data scientists are required to blend two broad skill sets to deliver value. One of them is business acumen, knowing the product that you're working on, having communication skills, being able to work with collaborators on business problems. The other is technical skills, the nuts and bolts of data science, as we say. So starting off maybe with the technical skill set, because that's arguably the more comfortable one to grow in as a data scientist. A lot of the growth data scientists have on the technical side comes from actually learning new tools and experimenting on the job, as you said. However, given the importance of delivering value in the near term, how should data scientists maneuver the trade-off between applying tried-and-tested techniques to solve problems, and learning and experimenting with new tools that may not pan out to deliver short-term value? Yeah, so I will divide this answer into two. I always believe in focusing on impact, and by focusing on impact, you can always iterate; I believe in incremental shipping. So let's say you're asked to do a simple thing: forecasting for a top-line metric. You can go with the latest, coolest, and fanciest paper that was published, about some neural network that lets you optimize a model with many parameters and some form of hyper-tuning for some regression or whatever.
You can go with some very complicated, multilayered techniques right away. But don't get it wrong: yes, you learned something cool, but did you really solve the business problem? Did you really know how to use it effectively? Did you effectively know your baseline? I don't think so. I think the right way to go about it is to start with your simplest: you know what, let's do a line fit, then take it one step further, let's apply some linear regression, and maybe, you know what, let's do some logistic regression. And as you iterate, you...

...understand the problem, you understand your data, you understand your different parameters, you understand the levers that you're pulling, and then, as you iterate, you're actually finding more and more, learning more and more, and better understanding why you're leveraging and using this. One of the biggest mistakes that I see data scientists make is to try to be at the cutting edge of technology and run to the shiniest thing right away, and the problem is that the shiniest thing doesn't mean it's the most important or practical thing. To be effective and successful, and to reach mastery in your craft, you need to understand what exact tool to pull out of your toolbox to solve the problem in the most impactful way. It's not the fanciest, shiniest tool; it's the appropriate, right-sized tool. And to do this you need to build this sense of iteration and incremental shipping. As you iterate, over time you get better, and accordingly you can use more sophisticated techniques. The more sophisticated techniques actually sometimes blind you to why the model is operating this way, because it's a black box, and you spend a lot of time throwing things at it, but the truth is you're just throwing things at the wall versus really trying to understand what levers you're pulling. At the end of the day, any machine learning model is literally a line fit, or a hyperplane fit, in multiple dimensions. That's it. It's linear math, guys. It's math. It's not rocket science. It's math. And if you understand that, then no new technique is shiny. You need to understand the underlying math to choose, and to understand the underlying math, you can't start with the most complicated equation. You need to start from the beginning and progress through it. So with that mindset, I think you tend to iterate and learn.
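The baseline-first iteration Ella describes can be sketched in a few lines. This is a minimal illustration, not anything from Shopify: the toy data, the mean baseline, and the least-squares line fit are all invented here to show how each iteration is compared against the last before reaching for anything fancier.

```python
# A minimal sketch of baseline-first iteration for a forecasting task.
# The data and the models are illustrative only.

def mean_baseline(ys):
    """Iteration 0: predict the historical mean for every point."""
    m = sum(ys) / len(ys)
    return [m] * len(ys)

def line_fit(xs, ys):
    """Iteration 1: ordinary least-squares line fit (y = slope*x + intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return [slope * x + intercept for x in xs]

def mae(ys, preds):
    """Mean absolute error: how far off the predictions are, on average."""
    return sum(abs(y - p) for y, p in zip(ys, preds)) / len(ys)

# Toy monthly metric with a clear upward trend and small noise.
xs = list(range(12))
ys = [10 + 2 * x + (1 if x % 2 else -1) for x in xs]

baseline_err = mae(ys, mean_baseline(ys))
line_err = mae(ys, line_fit(xs, ys))

# Only reach for a fancier model if the simple one leaves real error behind.
print(f"mean baseline MAE: {baseline_err:.2f}")
print(f"line fit MAE:      {line_err:.2f}")
```

The point of the sketch is the comparison, not the models: each step earns its complexity only by beating the error of the step before it, which is exactly the "know your baseline" discipline described above.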
The second thing that I always do with my teams is things like BLT time, or hack days, or pair-programming time, which are great ways to learn new things: data digests, which are places where people can present their work and teach each other, or hack days where you get to experiment with new things. So you always need to have some scheduled space to pick up new things, experiment, and learn. But on the day-to-day job, you can also learn through iteration, through pushing the boundary, and through experimentation. Just don't start with the thing that you cannot debug and understand why it's working or not. Start simple and iterate. So there are two frameworks at play here. The first framework is continuous iteration: starting off from simple solutions and avoiding that shiny-toy temptation, because I think a lot of data scientists fall into that resume-driven development pathway. And the second framework is, as a leader, creating that space for teams to share knowledge, experiment with new tools that may or may not be shiny, and present their work along the way. Is that correct? Yeah, totally. And whether it's shiny or not, if you understand the why underneath, if you understand the distribution of your data sets, you can iterate and have even better ways to enhance existing algorithms, or even build new algorithms. Now for the business acumen side. I think that's arguably a more challenging skill set for data scientists, because it blends communication skills, collaboration, product sense, and more, and it's not something that you necessarily learn in a data science education, nor something a technically minded person would naturally be geared towards. What are the frameworks, mental models, and similar mechanisms within a team that you find useful to improve that skill set continuously as well? I love this question.
I can't tell you how much I love this question. I think the biggest and most important thing that I keep repeating is: as data scientists, we need to focus on the outcome, not the output. And I know the sentence is very simple, but it's so true. A lot of times we focus on shipping an algorithm, but we don't ship the business impact. We don't focus on the business impact, which is the outcome; we focus on shipping the algorithm. So for us to tie ourselves to the business impact, I think one of the key tools that I recommend for...

...everybody to use, and that I actually reference a lot, is the five whys. You need to understand why we want to do this, and you want to actually debate it from a human level. So, for example, if I tell you, build me a recommendation engine, that's a sentence the PM, the product manager, can come and say. The question is: why? And then you can say, we need to recommend themes for merchants' stores. So then another why: well, we want to save time, and so on. Then, as you continue the conversation, you realize that when merchants come in to start their stores, one of their highest friction points is actually choosing what theme to use for their business. They want to make it unique, but they want to make it useful; they want to make it appropriate to the product that they're selling, but they still want to put their flair on it. So being this intelligent partner, this automated, intelligent recommendation assistant type of algorithm, tends to save them a lot of time and actually becomes a sounding board too. That has a real impact for merchants, and when you understand that, you can actually start with: you know what, you don't actually need a recommendation engine; maybe let's start with a ranking and take it from there, and then once you have the ranking, maybe the next iteration will be a full recommendation engine. So you can iterate over time, knowing the outcome that you're trying to drive towards, and using your skill sets and this massive toolbox that you've been building from step one to pull out the right thing, versus acting on a specific ask. And business acumen is built by being curious. There are no other hacks. I can give you a ton of frameworks, but all of them are founded in us asking questions, asking to understand the drivers, and keeping an eye on the outcome, not just on what we're shipping. That makes a big difference.
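The "ranking before a recommendation engine" iteration can be made concrete with a toy sketch. Everything here is hypothetical — the theme names, the install log, and the popularity-ranking choice are invented for illustration and are not Shopify's actual approach:

```python
# A minimal first iteration: rank themes by install popularity before
# investing in a full recommendation engine. All data is illustrative.
from collections import Counter

def rank_themes(install_log):
    """Return theme names ordered by how often merchants installed them."""
    return [theme for theme, _ in Counter(install_log).most_common()]

# Hypothetical log of theme installs by merchants.
installs = ["minimal", "bold", "minimal", "craft", "minimal", "bold"]

print(rank_themes(installs))  # ['minimal', 'bold', 'craft']
```

A global popularity ranking like this ships in a day, gives every merchant a sensible default, and produces the engagement data a later, personalized recommendation engine would need anyway — which is the outcome-first iteration being described.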
And you're also going to see a massive difference in engagement with your counterparts. If you're working with product managers, you'll get to see them interact with you differently, and the same with engineers, or even sales reps, or anybody. At the end of the day, you have a shared language, and this is really important. Regardless of craft, technical or non-technical, business problems are common and shared by all the crafts working in a certain group or company. Now your language changes from a data science craft language to a common business language, shared across crafts. So the bond, the connection, the alignment becomes much more amplified and faster. That's really great, and I love that. What's nice about the framework that you laid out is that by breaking down a business problem into its multiple component parts, like with the five whys, for example, you're able to also break down a technical solution into its component parts and iterate from there. Totally. And also, you get to understand the drivers, not just the translation of them by a PM. The PM heard something and came back. I was literally in a conversation yesterday where somebody came and said, I want a neural network, and it's like, why? And when we started talking about it, it's like, yes, he needs to do a classification, but maybe a neural network is not the best choice for the dataset. Because, again, any machine learning model has its underlying statistics. So maybe we're over-complicating; maybe it's linear data and we just need something much simpler. So it's all about this conversation to understand, and to understand also what assumptions are being made while discussing, because we're talking about the human, the usage. For example, if we go back to the example of the recommendation engine for themes, it also makes assumptions.
As you ask these whys, you get assumptions about when the merchant is going to use it, and at what phase of their journey, whether they're doing it early enough, right, when they're still new to Shopify. You get to understand that maybe they don't yet have a full theme for their business. So maybe that actually gets you another idea...

...of a different ranking, or recommendation, or whatever additional tool you can provide them in a separate step that makes this step easier for them. It can give you this sense of the merchant journey and the information around it, and accordingly you can build different components and get to see not just that product, but a whole ecosystem of products around it. That's really awesome. We had on the podcast last year Syafri Bahar, VP of data science at Gojek, which is also a highly data-mature organization, and one thing he mentioned is that it's really important to embed data scientists in different business teams, simply because it enables that common business language and gives data scientists skin in the game on the solutions they're developing. Do you share that worldview, and how has that been effective for you at Shopify? Data science at Shopify is a centralized craft, but you work with embedded teams. What that means is that each team is embedded within their own organization. The reason is that they need to be close to the business problem. Data science cannot be behind a wall where you throw questions over and expect proper answers to come back from the other side, because even basic questions have assumptions. For example, if I ask you, how are buyers doing on our merchants' websites (buyers being the customers buying from our merchants, merchants being our own customers), that's a very simple question. But what defines a buyer? Is it the one that comes to check out? Is it the one that just goes in to browse? Is it a session that is starting, somebody who just hops in and leaves? What defines a buyer?
So these discussions, this understanding, and being close to the problem space help build, number one, a better mindset and understanding of how things work, which enables data scientists to do their jobs better; they create the common language between the different groups, as well as a much bigger curiosity about how the product itself works. That's really great, I love that, and I think this marks a great segue into how data teams at Shopify are leveling up their skill sets and adopting this always-learning mindset that you talked about. Connecting back to the trade-off between short-term priorities and longer-term innovation investments: how do you approach that trade-off as a leader in your own teams, and how do you create time for your team to experiment with new skills? And do you want to walk us in a bit more detail through what these programs look like? Yeah, that's great, and there are multiple different programs. So we have something, I love this thing, I started it many years ago, like seven or eight years now, and I've been using it ever since for every single team I lead, called mini sprints. The idea is similar to the idea of hack days, where it's like, hey, hack days, everybody come and build, but you don't need to always invoke massive-scale hack days. Somebody on the team has an idea, and we believe in it; they say, you know what, I can make this twenty percent better, I just need a couple of things. Amazing, we can invoke a mini sprint. That person now invokes the mini sprint. It's not that they will do it by themselves; you can collect people from different groups and say, you guys, four people, there's this vision, go build a mini sprint, experiment with it, and come back. So the investment is small. The investment is two to three days; sometimes I go all the way up to a week, but usually it's like a spike.
It's small, but the value of it is that it's cross-team; it doesn't have to be within a specific team. It creates a strong bond between the different groups that are working on it, but it also creates this space for quick innovation and experimentation to prove a concept. It's similar to the idea of spikes, but instead of being pre-planned within the same group, it's across groups and it's invoked by either an important business need or a question. That allows a lot of experimenting fast, failing fast, and failing forward, right, where...

...this team, these four or five people, build upon it, and we usually diversify the people, so this way we continue building bonds and connections between the different teams. That's the worst case, and in the best-case scenario you will learn something very useful, whether it's positive learning or negative learning, which is learning about stuff that didn't work or stuff that did work. So that's the way to do it, where it fosters experimentation and innovation. But we also have a very specific cycle for what we call vault projects, which is proposal, prototype, and then we go into the build. In the build, we're building for the long term; we're building and productionizing, building robust, reliable engineering systems. The prototype, on the other hand, is not a normal cycle, not a normal sprint or two; in the prototyping phase you're standing something up fast to unlock the business. So what I shared with you is two techniques for experimentation, as well as the differentiation between fast experiments and long-term builds. Why am I saying that? Because having names for both, having phases for both, and intentionally calling them both out allows us to focus on the trade-offs that we're making. The problem is when you're building something fast and putting it aside and forgetting that it's fast and hacky; this is where technical debt arises. To solve for that, you need to have words and names for it, and you need to have intentionality. You need to differentiate between the quality of the output of these two phases: if you have an output from a prototype, the expectation is that it's a beta if you're lucky, if not an alpha, whereas the output of a productionized or build cycle is a fully productionized system, so it's more robust and more reliable.
So by having this naming and this intentionality, when you're building your roadmap you're clearly calling out what phase each thing is in, which creates the space and the intentionality for you to ship quickly to unlock the business, but also to plan and iterate for the longer term. Maybe one thing that I would like to touch on here, because I know that a lot of data scientists suffer from this, is ad hoc questions that tend to eat most of people's time. I think there is a big opportunity missed when we take the ad hoc question, we hate it (and that's okay, I know they are disruptive), and then we just walk away. The truth is that ad hoc question came because there is a system that is missing or a system that is broken. If we pause and reflect, maybe do an RCA, a root cause analysis, sit with the group and ask, why do you think we're getting these questions, what is missing? You might find specific reporting that is missing, you might find specific data pulls that are missing, and accordingly you can move these fast-turnaround questions into system building with an objective to reduce them. If you do this effectively, you might, and I have cases where we were very successful, reduce ad hoc questions by seventy or eighty percent. Wow, that's really awesome, and I want to unpack a lot of these different initiatives and programs you have set in place, maybe starting off with the mini sprints. How do you ensure here, as a leader, that the time spent by the team on mini sprints is justified? You mentioned that the worst-case scenario is that people bond. How do you balance between the absolute objectives that we need to land this quarter and this space for the mini sprints that we need to have within the quarter? What's the barometer that you use? I love that question, and that's also part of the mini sprint framework.
The rule that I have is that whenever you have an idea, it's not that you just go build; you need to bubble it up and bounce it off your leads. And if the leads believe in it, it's an investment that the leads make, because you never run it by yourself, you run it with a couple more people. So it's done with intentionality, and because we're making it visible, it's very public work. We get to know at the beginning of the mini sprint because there's a buzz around it, like, hey, we're starting a mini sprint.

On this and that, and then at the end of it people send out a summary of the mini sprint. Accordingly, it creates a sense of ownership and accountability, so people are not just running off randomly doing this unseen. Because it's visible, people want to do good work, and because it's communicated, people are intentional about whether it's worth it or not. And maybe touching upon the last element of your answer: system building and ad hoc requests. I know this is something data scientists really hate, as you mentioned. Ad hoc requests create that connection to understand what systems and what tools we need to provide. Walk us through maybe how self-service analytics can solve a lot of these ad hoc request problems, and maybe give some examples, in more detail, of how you were able to drive ad hoc requests down by as much as seventy or eighty percent, because I know there are a lot of data leaders listening to the show who want to learn that secret. Totally happy to. The fact is, ad hoc questions don't come without a real business need, and if they do, we should actually say, no thank you, we have other, more important stuff to do. But if they're coming in with a business need, let's look at what is recurring and what we can systematize. For example, one that came in: the team, the Plus data team at the time, was very annoyed by the fact that every time we were doing some email marketing back in the day, we needed to get a list of emails, and this is PII, so it needed to go through data, and we needed to do all the cross-checks to make sure that we were respecting people who were opting in and out, blocking out, and so on. And at the time, because the system was fragmented between Shopify and the Plus merchants and so on, we had to do many, many steps manually.
This is a problem that takes a good two to three hours every time a question like this comes in. Also, it used to come in after they had already built the whole campaign and now they needed it within the next twenty-four hours, a give-it-to-me-right-now type of thing. So if you look at this, it's definitely a candidate for systemization. First of all, requests need to have X amount of business days as turnaround, unless there are exceptions. Number two, a lot of these pieces in the system, like the manual validations and so on, can always be automated. So by doing this and creating the right reporting with the right alerts and the right checks, we just built a system, and now it's not as dreadful and doesn't need as much involvement from a data scientist every time we're sending an email for our massive-scale merchants. So that's a simple example. You can always be complaining, like, oh my goodness, these questions keep coming, but just see the pattern: each of them doesn't come as the same data pull. It comes as, oh, we're doing this campaign and we need data support; we're doing this new campaign, we need data support. Same thing with questions like, what's happening in our funnel? A very simple question. You can answer the question from scratch every time, which can be very, very complex, but if you do it enough times, you get to see that the answer is actually a systematic set of charts that you're looking for. So you can then go build a reporting suite, and I use the word suite, I don't say reporting dashboard. The reason I'm saying suite is because you need to think about what types of dashboards you are building and how they interact with each other. If you think about your reporting as a data product, that will set you up for success.
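The manual consent and opt-out cross-checks described above are the kind of step that lends itself to automation. The sketch below is purely hypothetical: the field names, the consent flag, and the filtering rule are invented for illustration and bear no relation to Shopify's actual schema or process.

```python
# Hypothetical sketch of automating an opt-out cross-check for an
# email campaign. Field names and data are illustrative only.

def eligible_recipients(contacts, opt_outs):
    """Return emails of contacts who consented and have not opted out."""
    opted_out = {e.lower() for e in opt_outs}
    return [
        c["email"]
        for c in contacts
        if c.get("marketing_consent") and c["email"].lower() not in opted_out
    ]

contacts = [
    {"email": "a@example.com", "marketing_consent": True},
    {"email": "b@example.com", "marketing_consent": False},  # never consented
    {"email": "C@example.com", "marketing_consent": True},   # opted out below
]
opt_outs = ["c@example.com"]

print(eligible_recipients(contacts, opt_outs))  # ['a@example.com']
```

Once a check like this lives in code with alerts around it, the two-to-three-hour manual request collapses into a self-service step, which is the systemization pattern being described.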
The reason I'm saying this is because when you think about it as a product, you think about user experience, you think about navigation, you think about uptime; you think about a lot of things. Actually, a big part of the reason dashboards get abandoned in the dark hole of dashboards is because we don't think about these things. We create a lot of one-off dashboards because it's easy, but we don't create navigation between them. We don't make sure that they answer cohesive, comprehensive questions. We just have each of them...

...answer a random piece of the puzzle. But how do we navigate this? Now we need to pull in a data scientist to do it, and the data scientist hates doing that because it is dreadful work, not cool work. So if you step back and think about it from a data product side, it becomes a data product, with all the user experience that comes with that, and with uptime; it's easy to navigate and works a lot better. So again, this is how I solved for a lot of these things: I stepped back, looked at the ad hoc questions coming in, and every time we see a good collection, we try to systematize by solving for the underlying root cause. That's really great. And the key word here that you mentioned is product, right, data product. When you develop a dashboard or reporting suite, as you mentioned, having that attention to user experience and to how your dashboard is going to be consumed is something that I think a lot of data scientists miss, because at the end of the day it is a digital product that people will consume, and it needs to meet the same type of expectations that people have of digital products. I do agree with that. Again, the whole idea is to think about your own experience with data: if you're a data scientist using, I don't know, Google Analytics, or the analytics on your Twitter, or any of the tools you use, what do you want to see and what makes sense to you? If you start noticing the themes of experiences that you enjoy and start bringing these into the dashboards and tooling that you build, it becomes easier to adopt and more enjoyable to use for business stakeholders, and accordingly there's less pull on your attention.
So we definitely talked about how creating these systems for the wider organization helps reduce the workload for data teams, but also helps accelerate data-driven decisions, improves business outcomes across the organization, and automates a lot of different tasks. How much do data culture and organizational data literacy for non-technical stakeholders play a role in creating good consumers for the data team's outputs? That's a great question, and I'll tell you, it makes a huge difference. However, in most organizations, when you start, the interaction is similar to any relationship, right? You don't start with everybody knowing exactly how to work with each other perfectly. Even if they're coming from a data-driven previous role or organization, or what have you, it doesn't mean that it's just gonna click. But by having high intentionality and showing value repeatedly, it tends to elevate the data understanding. So at Shopify we do have many courses for non data scientists to uplevel on data science: how do you understand charts, or how do you read SQL if you're interested, or any of that. But I think the real key that makes a pivotal change is having the right level of conversation and showing value. If you're talking with complicated equations, you lose people. If you're talking with the language, and that goes back to the business acumen piece, if you go back to talking about the business problems, which is a common shared language regardless of the craft, people tend to listen more and tend to understand more. It's on us, as experts in our domains, to be able to play this translator role where we talk from a business perspective. That doesn't mean we dumb it down or talk fancy; it means that we talk about what really matters, which is the business and the impact on the customers, the consumers.
I don't think talking with very high precision when it comes to the data science craft serves us better when nobody understands. I think being understood is more important than being precise when you're talking. If you're talking about your F1 score and your sensitivity and your precision and your false positives, all of these things...

...we all use them in day-to-day life when we're talking to each other, but if you talk to a business stakeholder about all of this and it just doesn't register at all in their head, then you're both on the losing side of this conversation. But if you simplify it to what really matters, and they are able to action your learnings because they understood them, you're both on the winning side. So it's really important to keep that in mind. I completely agree with the last point, and I think it's extremely detrimental for data teams, because if this happens in front of an executive, for example, what you're going to have is a loss of executive trust in the data team and less investment in the data team's longer-term output and proactivity. And I will tell you something funny. I actually did see that. So, for example, a data scientist runs an experiment, and the experiment is set up as an A/B test, but of course anything that is set up has some form of caveats. The data scientist comes in sharing the insight with the senior leadership team, and this is a true story, just abstracted, and this data scientist wants to be so precise in the words that they're using. The experiment had a positive impact, and their intention going into this meeting was to advocate for rolling this experiment out to everybody. But trying to be precise and unbiased, they did so much listing of the caveats that the people in this meeting just assumed the experiment was useless and dismissed it, although it was rigorous, it was done right, and there was proper statistical significance. This data scientist got so much in their head and talked so much in data science language that what happened was the opposite of their intention going into this meeting.
That's a great story, and they probably would have been better off saying, you know, hey, I ran this experiment, this is what we should do, this is the expected outcome, and if you want to read the appendix, here's the appendix. Exactly, or even, if you want to share the caveats, keep this in mind, but don't list everything you ever thought might happen in the whole wide world just in case. It just doesn't work. That's a great example. Now, as we reach the end of our chat, I would be remiss not to talk about some of the data science use cases that you've worked on at Shopify. So what were some of the highest-impact data science solutions that you've developed that you can publicly share? Of course. Wow, there are a lot of cool ones, so I will tell you. Definitely, we talked a lot about Shopify Capital, which is offering loans for merchants to scale their businesses, which is amazing. It is a very data-driven product, and it definitely does have a massive impact on merchants and their lives. We also have Shopify Balance. We have our product classification, as well as what we call Audiences, which enables merchants to market better by improving the return on ad spend so that they can actually scale, which is pretty darn cool. Because when you think about it, organizations that build return-on-ad-spend tools are usually themselves very data driven, so they already have large data teams or they use third-party tools to help with that. This is part of the Shopify offering, which is pretty cool. Some of the ones that I am personally very excited about and invested in are internal, like our own forecasting family of algorithms. Within the economic environment that's happening now, forecasting merchant count or any of that is a pretty hard problem, so this is pretty cool.
The other one is best next action, which is the recommendation engine I was telling you about. When Shopify merchants start, starting a business is not easy. There is a high probability of failure because entrepreneurship is hard. Now, Shopify aims to make it as simple as possible, removing as many barriers as possible, and because of that we have this recommendation engine, the best next action, which becomes your partner in your early journey...

...to make sure that it gets you to a successful start on Shopify, as well as in entrepreneurship in general. So there is a lot to be excited about and to be proud of. I love these use cases, and what I love the most about them is that, of course, there's a lot of value for Shopify that is generated from these use cases, but they also provide a lot of value for would-be entrepreneurs who potentially would not have become entrepreneurs without them, and that's amazing to see. So, connecting back to the theme of the episode, kind of a final question from my side: what were some of the biggest learnings for you from working on these projects? Yeah, that's a great question. Reflecting on it, I would say number one is, as I shared earlier, always start simple, because when you start simple you create a baseline and you understand what's possible with the lowest friction points. Even with something like best next action, we didn't start with the fancy algorithm we currently have. We started with, okay, what if we just organize this list? We'll do the analysis and order it manually, then maybe we sequence it automatically, and then maybe we feed it into machine learning, and we iterated on that. Starting simple made us understand the impact. We experimented so that we learned the value as we iterated, making sure that we gut-checked our hypotheses. So number one, start simple. Number two, experiment to learn, and iterate so as not to fall into confirmation bias, to make sure that you're really gut checking. Last but not least, creating a space for experimentation and mini-sprints actually surprises me every single time. I'm a big advocate of it. A lot of our cool internal solutions started as a mini-sprint that then stood up to become a fully productionized product afterwards, so this was very helpful.
I would definitely encourage us to continue doing that, and for others to use it. That's really great. And maybe, you know, on a personal note as well, what were some of the biggest learnings for you from moving from an individual contributor to someone who manages data teams? Because that's a jump as well that's not talked about a lot in data science, with its challenges and the different ballpark that you're in as a data leader. I'll tell you, honestly, every day is a learning, but I'll tell you, back when I did this transition, and I did it many years ago, I think the hardest thing, and I still see people who are moving from an individual contributor into a leader struggle with this, is learning to trust, to let go, and to create the space for others to learn and fall forward. Sometimes, as an individual contributor, especially when you're at the top of your craft, and this is why you got promoted into manager, you think, oh, I can just do it in fifteen minutes. Yes, maybe you can do it in fifteen minutes, and that other person might end up doing it in two hours, which is eight times as long. But if you let them do it in two hours today, tomorrow they will do it in one hour, and the day after they will do it in half an hour, and then you have scaled yourself up. As a manager, don't forget that your job is to work through others and lift up those around you. The best managers are not the smartest people at the table. The best managers are the ones who have very strong people around them, where everybody on the team lifts each other up. So that's a key reminder. It's not just hiring great people and getting out of their way, and I know that's a very popular quote from Steve Jobs.
It's hiring great people and giving them a space to learn and to level you up while you level them up. So it's an environment of shared learning, and I always call it collaborative intelligence, because together you get smarter. That's such an awesome ending. Now, finally, Ella, do you have any final call to action before we wrap up today's episode? All I can say is...

...maybe my final call to action is: data science is a great field and there is a lot that we can still do to shape it. So have fun, don't get stuck on a tool or a method, and just focus on the business problems. That is our superpower. We are problem solvers. Data scientists are problem solvers, so focus on that, and I think a lot of good will come after. Thank you so much, Ella, for coming on DataFramed. Thank you, Adel. I was excited to be here and happy to have the conversation. Thank you so much for having me. You've been listening to DataFramed, a podcast by DataCamp. Keep connected with us by subscribing to the show in your favorite podcast player. Please give us a rating, leave a comment, and share the episodes you love. That helps us keep delivering insights into all things data. Thanks for listening. Until next time.
