DataFramed

Episode 94 · 4 months ago

#94 How Data Science Enables Better Decisions at Merck

ABOUT THIS EPISODE

In pharmaceuticals, wrong decisions can not only cost a company revenue, but they can also cost people their lives. With stakes so high, it’s vital that pharmaceutical companies have robust systems and processes in place to accurately gather, analyze, and interpret data and turn it into actionable steps to solving health issues.

Suman Giri is the Global Head of Data Science of the Human Health Division at Merck, a biopharmaceutical research company that works to develop innovative health solutions for both people and animals. Suman joins the show today to share how Merck is using data to improve organizational decision-making, medical research outcomes, and how data science is transforming the pharmaceutical industry at scale. He also shares some of the biggest challenges facing the industry right now and what new trends are on the horizon.

You're listening to DataFramed, a podcast by DataCamp. In this show you'll hear all the latest trends and insights in data science. Whether you're just getting started in your data career or you're a data leader looking to scale data-driven decisions in your organization, join us for in-depth discussions with data and analytics leaders at the forefront of the data revolution. Let's dive right in.
Hello everyone, this is Adel, data science evangelist and educator at DataCamp. A few episodes back, we had Curren Katz on the podcast to discuss how data science is transforming healthcare. One of the themes that emerged in that episode is that, while there are incredible gains happening on the research side, there are many ways data science is moving the needle and improving health outcomes today. This is just as much the case in pharmaceuticals, and this is why I'm so excited to chat with Suman Giri on today's podcast. Suman Giri is the Global Head of Data Science of the Human Health Division at Merck. He's held a variety of data leadership roles throughout his career, has a PhD in Advanced Infrastructure Systems from Carnegie Mellon University, and throughout the episode displayed incredible insights when it comes to the state of data science in pharmaceuticals. Throughout our chat we talked about how data science is transforming the pharmaceutical industry today, the main data science challenges facing pharma organizations, data interoperability and data ethics, how to approach data culture, the right skill mix for data teams, and more. If you enjoy this episode, make sure to rate, subscribe, and comment, but only if you liked it. Now let's dive right in.
Suman, it's great to have you on the show.
Thanks for having me.
I am excited to talk to you about data science and machine learning in pharmaceuticals, your experience leading data science at Merck, and more. But before that, I'd love to learn about your background and what got you into the data space.
Yeah, sure. So, my name is Suman Giri. I head data science at Merck Human Health. Merck, as you know, is one of the world's largest pharmaceutical companies. We have world-leading products in the oncology, vaccines, and cardiovascular space. In my role at Merck, I'm responsible for all of the commercial analytics, data science, and MLOps that takes place for our early-stage, at-launch, and inline products. My background is in mathematics and engineering. I did my PhD at Carnegie Mellon, where I researched machine learning algorithms in the energy efficiency space, and since then most of my career has been spent in different healthcare-related companies. I've worked in payer organizations, payer-providers, and health tech, and as of a year ago, I started my new role at Merck. I landed in data science just by virtue of my education, which is what I studied, so it was easy for me to do that as a profession. I landed in healthcare a little bit by accident. It was one of the areas I was curious about, and somehow one thing led to another and I was fully immersed in the healthcare experience.
So, to start off our conversation, given your experience as a data leader in pharmaceuticals and healthcare, I'd love to understand the current state of data science and machine learning in pharmaceuticals. Arguably, the pharmaceutical space, and healthcare overall, is the most exciting space for data science because of the potential value that data science and machine learning applications can provide. How would you describe what the landscape of data science in pharmaceuticals looks like today, and how has it evolved over the past few years?
I think the major fields within pharma where data science gets used are drug development and discovery, diagnostics, clinical trials, manufacturing and supply chain, and the commercial and regulatory process. So these are kind of the major areas. But to give you a sense of where the impact is, we can start with COVID, especially around mRNA vaccines. There was a strong role that AI had in accelerating the discovery and deployment of...

...COVID vaccines. Then there was the AlphaFold announcement this past year from DeepMind, which basically solved the problem of protein folding and is going to accelerate drug discovery significantly. We're also seeing some interesting use cases in clinical operations, like selection of locations for clinical trials and compound screening to decide what to test in preclinical trials. And then there's the commercial space, which is where I sit, where we're seeing a lot of advanced machine learning being applied for effective engagement and promotional marketing for inline products. Finally, there are also some neat applications and potential on the reimbursement side, with partnerships with payers for value-based outcomes and similar things. So a lot of interesting things are happening in the space.
And to get a sense, there seem to be developments within the research space that you see today, like AlphaFold, as you mentioned, and also developments on the commercial side in applications of machine learning and data science. What are some of the main areas of value where you've seen data science and machine learning push the envelope forward for organizations working in the pharmaceutical space today?
Yeah, this is a tough one for me to pick because I think the envelope is being pushed in all directions, and everywhere you look people are doing amazing stuff using applications of data science and machine learning. But if I had to pick one, I would go back to the drug trial space, because that's been eye-opening for me as somebody relatively new to pharma, because the implications for patient safety are huge. In drug trials we're seeing intelligent patient selection based on multiple data sources and more targeted criteria, sometimes even using biomarker or genetic information.
We're also seeing automation of a bunch of preclinical quality control steps using AI and machine learning, which again we saw during the COVID vaccines. We're also seeing applications of the Internet of Things, so IoT, and real-time patient monitoring for patients in active trials, which helps avoid adverse events proactively. And then the area of event adjudication, as it's called in clinical trials, can reduce the time to market for drugs significantly. So it has huge implications from just an innovation standpoint. Then there's a lot of interesting work done around simulations on pre-trial compounds, using data from similar molecules and known effects to understand adverse implications before the molecule even reaches trial. This enhances patient safety significantly. And finally, I'd be remiss if I didn't mention how quickly the pharma industry was able to manufacture and distribute the vaccines for COVID, and a lot of that logistics and supply chain work was enabled by data science and machine learning. So I think this is the space that I would pick to answer your question, just because you forced me to pick one, because I could just as easily go on about the other areas where something like this is happening as well.
That's really awesome, and I definitely want to expand on the research areas you discussed here, especially the COVID vaccine. That's a very interesting topic to expand upon. But interestingly, following up on your last point around supply chain optimization and innovation there, when we hear about use cases of data science and machine learning in pharmaceuticals in the media, especially in popular discourse, we always talk about drug discovery. These are inspiring use cases related to drug discovery algorithms, but there are also a lot of operational aspects where data science can have a lot of value, accelerate the value for patients, and improve patient quality of life.
You mentioned here supply chain and clinical trial operations. Do you mind expanding on that area and going a bit deeper into some of these use cases, as well as, in your experience, how you've seen that value manifest for patients?
So you're absolutely right. Most people, when they hear, let's say, machine learning in pharma, think of drug discovery, and maybe one day we'll get to a stage where algorithms can predict the efficacy of a molecule in humans without having to go to trials. But until then there's a lot that happens in between where algorithms come in, both to enhance efficiency and productivity. I already talked about clinical trials and how machine learning is already enhancing patient safety. Data science also has a role to play in understanding, like, the diseases we...

...want to treat. That's the first step, right? There is a huge play there in understanding the prevalence of such diseases, just to make sure that there's a financially viable model for drug development and distribution. There is a strong role data science plays in creating personalized medicine strategies and accelerating the way we design and develop drugs. For inline products in the commercial space, there's a lot of sophisticated data science that takes place today, especially as it pertains to forecasting, calculating the effect of promotional marketing, and reviewing promotional content for compliance. Competitive threat analysis is a huge one from a commercial standpoint, to understand, let's say, where the sales and marketing efforts should be focused, and then just creating personalized engagement strategies for healthcare professionals to make them aware of the drug and its benefits. Then there are also use cases like AI-driven planograms that help productivity, automated data matching, promotional modeling, rare disease patient finding, etcetera, where machine learning is heavily leveraged. So basically, that's a quick preview of the areas, apart from just clinical discovery, where machine learning plays an important role.
And in terms of operationalization, given that machine learning is still very much in a research phase, a lot of this is still, in some sense, ideation. Do you see that a lot of these use cases are actually operationalized today within the pharmaceutical space, and that they're actually delivering value for pharma companies today?
That's a great question. So this is a framework that I look at it with. There are companies where data science is the core product. Take Uber as an example. Yes, on its surface it's an app, but everything you do in it has been facilitated by some algorithm that works on data, which inevitably becomes the product that we use.
So it's in those companies where this relentless push for productionalization comes into play. Then there's a second tier of companies, maybe where you are dealing with real-time data, so companies like Walmart or Target, where the data is coming in every split second and somehow you need to make intelligent decisions in real time. And then there's a third category of companies, which I think is where we sit, where data science is not part of your core product, but it is a decision support tool. Our core product is obviously the medical drugs that we manufacture, but the design, discovery, ideation, and distribution of it is enabled by data science. Now, in this context and in this framework, there are maybe models that don't necessarily meet the criteria of full-blown, let's say, production, but they're just, let's say, dashboards that have some intelligent component that is helping somebody make a quick decision. Or maybe there are questions that somebody has that AI or machine learning helps them get to some sort of strategy around. What I'm trying to say is I think these kinds of models and algorithms also have a place, even though they might not be in what would be considered, let's say, production in the traditional tech sense. In pharma, I've seen a lot of this get leveraged, primarily because we have the luxury of making decisions in batch mode. We don't have to make decisions in real time all the time. But, having said that, there's obviously a class of models, especially in the commercial space, where perhaps the considerations of safety and efficacy are a little bit less nuanced. That's where a lot of the models are in production. So areas like next best action, where we are enabling sales reps and marketers to come up with the optimal engagement strategy.
These are models that I've seen in full production mode, using a pretty sophisticated MLOps architecture, and I'm sure there are parallels around media mix modeling and sales force optimization, etcetera. So there are other models that are in production as well. So it's a good mix of models that are...

...maybe slightly ad hoc or one-time in nature versus models that are in full-blown production.
I love how you cross-section the nature of the product you serve with the degree of operationalization needed for it. Segueing here, I'd love to deep dive much more into the challenges of working in data science and machine learning in the pharmaceutical space. What would you say are the biggest challenges that are specific to data science in the pharma industry?
So the biggest challenge is around data, and it gets worse as you move outside of the US. We are a global company. We operate in more than sixty countries, and data gets increasingly sparse and hard to access as you go outside of the US. And without data you have no way to identify the prevalence of a disease, no way to know whether a molecule is financially viable, and no way to even market it effectively. Today, the pharma industry relies on a lot of syndicated data, but the lack of ability to bring together simple data sources like claims and EHR, as a simple example, holds us back. Forget about biomarker and genetic information, that's a whole new level of complexity. We're struggling to do even the basic things. Over the past few years there's been some interesting innovation in this space. There are companies like Datavant that are trying to bridge this gap, but I think it's still early days. I do believe that there's a real opportunity here for solutions in data de-identification, synthetic data generation, and federated learning to accelerate data science in pharma significantly. Having said that, I do think that the regulatory infrastructure needs to evolve in tandem as well to allow for this kind of innovation. Obviously, since we're talking about challenges from the pharma side, we also have a hard time recruiting the right talent for data science in pharma. I think there's also a third challenge, which is mindset.
So data science by definition has the word science in it. It requires a little bit of a cultural shift in how you think about processes, and this is true for healthcare in general. Along with it, I've found it difficult to do effective change management with the consumers of data science. So just to quickly summarize, I think data, talent, and culture is how I would describe the bigger challenges in data science in pharma.
That's perfect. So let's expand on these one by one. When thinking about some of these obstacles, let's say, for example, data collection, interoperability, and access, what needs to change so that data science innovation in pharmaceuticals accelerates? Is it regulatory innovation, which you mentioned as a key part of it? Industry standards that need to evolve? What do you think needs to be unlocked here to be able to push the envelope forward when it comes to data?
It's a great question. Again, I obviously talked about data access and interoperability. It's an active area of innovation in pharma, is how I would characterize it, because everybody knows it's a problem and everybody knows that's where the bottleneck lies. But I've seen a few major efforts in this field that I personally find exciting. There are pharma companies that are now beginning to collaborate on sharing anonymized clinical trial information. Some pharma companies have platforms where researchers can go in and submit their molecules, and then there are algorithms that score them on their potential, in the interest of maybe decentralized data sharing and collaboration. So basically we need more of this: tighter collaboration between companies, CROs, which are the clinical research organizations, academia, and the government.
I think regulation and innovation are obviously two opposing forces by design, so there will always be a push and pull, and issues like data privacy are extremely important. But I think there's still a wide gap between what we should be able to do to improve lives and what we are able to do today, just because our regulatory infrastructure hasn't caught up. So there definitely needs to be a close examination of the hurdles from the regulatory side that are preventing us from doing what we supposedly should be able to do, and there are probably some startups in this field that we'll see, or maybe some changes in this field that...

...we will see crop up in the next few years. And again, from the data side, I mentioned Datavant, but there's probably a bunch of other space here that can be taken by innovative companies who can enable easier access to data, not just pharma data, but also, let's say, social determinants of health and publicly available data that can also guide sound decision making, especially in a time frame like right now, where there are a bunch of environmental factors. COVID is still a thing. There are geopolitical considerations today, with all the wars going on, etcetera. All of this data will somehow inform some sort of strategy, and I think just having some way to access it in an easier fashion, so that research and innovation can take place, is going to be key.
You mentioned data privacy here. When it comes to the applications of data science in pharmaceuticals and healthcare in general, a major obstacle is bias and the ethical use of AI. I'd love to know how you evaluate the risk of harmful outcomes of machine learning and AI in pharmaceuticals and how you go about minimizing it, especially when having this regulatory discussion to be able to create that data access.
Yes, so this is something I spend a lot of my time thinking about. Maybe I'm just going to quickly share three examples with you that I learned about recently and have been thinking about a lot. This is all data science work related to COVID, and the reason I'm sharing it is to highlight how big of an issue this is and how underreported and under-thought this area is. So, number one for COVID: there was a group of researchers who used chest scans of children who did not have COVID as examples of non-COVID cases. Their intent was probably to identify COVID using chest scans, but what the algorithm learned was how to tell children from adults, and not COVID. But these are models that made it to a publication stage.
Because there was no framework for ethical use or bias measurement in place, this was able to slip through the cracks, so to speak. There's another piece of research where they used chest scans taken while patients were lying down and while they were standing up. Now, patients who were lying down are more likely to be sick, so what the AI in turn learned was to predict the risk from their position and not their actual risk. So again, another example where maybe the intention was right, but because the framework wasn't there to look through the downstream consequences, it ended up doing the wrong thing. And then the third example I will share with you is an algorithm that was found to pick up on the text font that certain hospitals used to label the scans, because they were probably doing OCR, and then doing a bunch of things around image recognition and whatnot. At the end of the day, the hospitals that had more serious caseloads, and the fonts associated with them, became predictors of COVID risk, and not actual COVID risk. So these are three examples that just elucidate how there are real issues with relying exclusively on algorithms without considering the biases in data. To mitigate this, at Merck we closely tie ourselves to what we call the good governance framework. Before we push a model into production, we check for explainability, fairness, robustness, transparency, and privacy, which we believe to be the major pillars of ethical use of data and algorithms. In tactical terms, we have a de-biasing layer that gets applied throughout the model life cycle, from data to model, and it's model agnostic, so that we're not causing any inadvertent consequences. But this is obviously not the final form of it. We have an ongoing partnership with Carnegie Mellon University where we continue to research ways to understand the downstream implications of heterogeneity in our data and models.
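The second example above, where a model learned the patient's position rather than the disease risk, suggests a simple pre-modeling check: measure how well a nuisance attribute alone predicts the label. The following is a minimal sketch of that idea, not a description of Merck's de-biasing layer. The field names (`position`, `label`), the helper `nuisance_accuracy`, and the toy records are all hypothetical.

```python
from collections import Counter, defaultdict


def nuisance_accuracy(records, nuisance_key, label_key):
    """Return (accuracy, baseline).

    accuracy: how often the label is guessed correctly using ONLY the
              nuisance attribute (majority label per nuisance value).
    baseline: accuracy of always guessing the overall majority label.
    A large gap hints that a model could exploit the nuisance attribute
    as a shortcut instead of learning the real signal.
    """
    by_value = defaultdict(Counter)
    overall = Counter()
    for r in records:
        by_value[r[nuisance_key]][r[label_key]] += 1
        overall[r[label_key]] += 1
    n = sum(overall.values())
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    baseline = overall.most_common(1)[0][1] / n
    return correct / n, baseline


# Hypothetical toy data: scan position is strongly tied to the label.
records = (
    [{"position": "lying", "label": 1}] * 40
    + [{"position": "lying", "label": 0}] * 10
    + [{"position": "standing", "label": 1}] * 10
    + [{"position": "standing", "label": 0}] * 40
)

acc, base = nuisance_accuracy(records, "position", "label")
print(f"nuisance-only accuracy={acc:.2f}, baseline={base:.2f}")
# prints: nuisance-only accuracy=0.80, baseline=0.50
```

In this toy case the position alone gets 80% of labels right against a 50% baseline, which is the kind of gap that would justify pausing before feature engineering or modeling.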
So all of this is to say this is something we think about very seriously, and we are continuing to iterate on our approach to make sure that we don't end up being the fourth example on this list that...

I just shared with you.
That's really awesome, and it really elucidates how a lot of this research and a lot of these applications that are exposed to this bias and these issues really stem from a great place. This is a use case of great intention that can be very harmful downstream if a lot of the bias in your data is actually bias that comes from gender or racial attributes or any of that type of demographic data. And I'd love to unpack even further. What do you think needs to change on the data preprocessing side and the data collection side to be able to unbias a lot of this data?
I think that's the premise, right? There's always going to be bias in the data. As long as there is some sort of, let's say, variance in your data, there's always going to be bias. So bias is just part of the experience, if you will. Now, I think the right way to do this is to think about it from the get-go: what are the implications of said bias, and what are the frameworks for us to go out and measure it? This should be part of a data scientist's toolkit from the get-go. A lot of the time the data that we deal with has already been collected, so we don't get a voice at the input stage. We're working with, let's say, third-party data or syndicated data that we purchase, so we have very limited input into how it gets collected. But that doesn't mean that once we have it, we don't get to evaluate it for inherent heterogeneity in the system and what the downstream implications could be. So I think part of it is education. It's such a new field that it's not even part of our vocabulary. Most data scientists haven't taken this class or maybe even heard about this as part of their education. So education and a solid framework, I think, is the way to solve this, and constant iteration. I think experimentation is part of data science. I think that's what makes it a science.
So fully experimenting and understanding what the implications are of a model that lives in production and then touches certain lives, and what bias is inherently built into the whole system, I think needs to be an ongoing conversation. To answer your question, I don't think I have a good answer for what needs to change on the data collection side, apart from this: once the data is collected, people should be evaluating it and not just pushing it into a modeling exercise. There needs to be a pause to think before you start pushing it into a feature engineering or modeling framework.
That's really great. And circling back to the other challenges you mentioned around data science in pharmaceuticals, I'd love to unpack the talent component, and you mentioned education here. What has been the most challenging aspect of finding the right talent within data science in pharmaceuticals? And what does a great talent profile look like within data science in pharmaceuticals?
So it's a two-part question. Let me answer the second one because that's the easy one, which is what does a great talent profile look like for, let's say, data science in the pharma industry? I think the biggest asset a data scientist can have is good problem-solving skills. Forget about data science or the technical aspects. A lot of the time, what I find is that the true value of data scientists in, again, the third category of companies that I mentioned earlier, where data science is primarily used as a decision support tool, is to understand the context in which the decision is being made and then formulate that into some sort of framework that can be improved by the use of an algorithm or some sort of intelligent automation. So I think problem solving is a key component of a high-performing data scientist.
And then there are aspects of collaboration, because usually data science doesn't happen in a vacuum. It's not a back-end job, so to speak. You have to be continuously iterating with your stakeholders, pushing back on certain things that don't make sense and maybe giving in on certain things that are just required to drive things forward. So that level of collaboration and communication, I think, is a second key...

...component. And third is, I would say, the foundational aspects of data science and machine learning, things like bias and variance, or the assumptions behind linear regression. The problem I see in today's talent is that they're so enamored by the fancy stuff, let's say federated learning or deep learning or this or that, that they have kind of overlooked the fundamentals, and that's another thing that further perpetuates the biases that we have. We look for somebody who has the fundamentals down, because the days of a data scientist who just imports x model and then applies it are increasingly numbered, especially with AutoML and the ease of use of certain tools. I think the true differentiator is going to be a data scientist who can frame a business problem in a context that makes sense and drives value, and who is able to execute in a collaborative fashion. Now, the fourth thing we look for, and this is not true for all the data scientists but depends on our need, is somebody who is heavy on the ops side, so the MLOps side. Again, as I said, model building is an increasingly productized skill set. Today you could just go and get DataRobot or H2O Driverless AI or Dataiku, and they will run through all of the models for you and create a thousand different features for you, and it's probably going to be better, barring a few cases, than what data scientists can do with the limited set of experiments that they can run. But where a data scientist is going to be needed is to take that model and put it into some sort of workflow that makes sense for the business. This will include components like model governance. Are you checking for drift in your data? Is this integrated into whatever API or end-user platform it feeds out of? And does this have components of CI/CD built into it?
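The question "are you checking for drift in your data?" can be made concrete with a small monitoring check. The sketch below uses the Population Stability Index (PSI) to compare a new sample of one numeric feature against a reference sample. PSI is one common choice among several, and nothing in the conversation says it is what Merck uses. The function name `psi`, the bin count, the sample data, and the 0.2 alert threshold (a widely quoted rule of thumb, not a standard) are assumptions for illustration.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a reference sample (`expected`)
    and a new sample (`actual`) of one numeric feature.

    Bins are equal-width over the reference range; values outside that
    range are clipped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values: training-time sample vs. a shifted live sample.
reference = [i / 100 for i in range(1000)]
shifted = [v + 3 for v in reference]

ALERT_THRESHOLD = 0.2  # rule-of-thumb cutoff, an assumption
print("no drift:", psi(reference, reference) > ALERT_THRESHOLD)
print("drift:   ", psi(reference, shifted) > ALERT_THRESHOLD)
```

A check like this would typically run on a schedule per feature, with alerts feeding the same governance process that approves a model for production.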
So these are the things that I think are best practices from the software engineering and DevOps world that are slowly transitioning over to the machine learning side, and that's going to be an increasingly rare skill set. So we do filter for that as well as we look to hire data scientists these days. Again, to summarize, what we look for in a high-performing data scientist is good problem-solving skills, good collaboration and communication, good foundational skills, especially as it pertains to statistics and machine learning, plus an engineering component to their skill set, or at least the aptitude to pick it up. Now, I know you had a question before this on the challenge. Today, you see, this is the age of the Great Resignation, so obviously there is a lot of talent mobility. The biggest challenge I see is a career pathway for data scientists, where they can feel like they are being productive, they have autonomy, and they have a sense of community. I think creating that environment is the biggest challenge. Not a lot of companies do this natively, and at Merck we're trying to solve this by creating a separate role just for data science community leaders, where they will be in charge of creating, let's say, upskilling pathways and talent growth pathways and a sense of community where data scientists can learn and grow. But again, it's an experiment that we have in progress, and it is a challenge to retain high performers just because it's a relatively new field, especially in places like healthcare. So creating that career mobility and those growth pathways is an ongoing challenge.
That's awesome, and thank you so much for this really holistic answer and for harping on ops skills for data scientists. Do you think that in the future a standard data scientist will need to have ops skills, or do you think that a new role will emerge, such as machine learning engineer or machine learning ops engineer?
Do you see the data science role staying general to a certain extent, or do you see it specializing more and more over time?
I think, again, it goes back to the industry. If we limit ourselves to the three kinds of industries, this answer is going to differ based on what industry you're talking about. If data science is a core component, it's basically your product, and there needs to be a strong ops component. So I think in those kinds of settings you will increasingly find that your data...

...scientist profile closely resembles an ML engineering profile, and that's probably true for the second category of companies as well, where data science is not their core product, but they have to make decisions in real time, because a lot of the decisions need to be integrated into their systems. I think it's the third profile of companies, where it's primarily a decision support tool, where there will still be room for statisticians and data scientists who can inform, let's say, decision making without necessarily having to go into full ops mode. But that set, I think, is going to get increasingly smaller with time. And I'm just neglecting one large group of data scientists, which is people in research roles within organizations, like Google labs or maybe Facebook labs, where perhaps there's still room for folks who are not ops heavy but fundamentally want to focus on theoretical algorithms. But again, those are what I would consider increasingly shrinking profiles.
That's really great. I'd love to pivot to also discuss your work leading data at Merck as a data and AI leader. What are some of the exciting use cases you've seen or worked on at Merck that really excited you as a data leader?
Yeah, so, when I was not at Merck, and when I was reading about Merck and considering it as a potential place of employment, there were some publicly available use cases that I ran across that had me really excited. There was a lot of work done in continuous drug manufacturing. We were basically revamping how we do manufacturing within our branches to facilitate, let's say, intelligent automation and continuous drug manufacturing. There's a lot of work that we were doing in our supply chain and logistics as well that involved data science and machine learning, so that was exciting too. And then on the commercial side, there was a lot of work that we did in intelligent and effective engagement.
Right. So how do you figure out what the right message and the right channel and the right content and the right cadence is to engage your customers with, so that they see the value and benefit of the life-saving products that we generate? I think that's where a lot of machine learning and data science comes in, because it is fundamentally an intractable problem if you try to do it by brute force. Right. So somehow you need to have this predictive and intelligent component to it. So I think these are some use cases that I was aware of even before I joined, and as I have entered this space, there's a lot of interesting work happening in, let's say, natural language processing, to look at, let's say, the promotional content that we send out and the engagement that happens with that, to understand what it is that is resonating in our messages and what is not resonating so effectively, right, so that we can be more curated in our engagement efforts. So that's a huge area of focus for us. There are other areas around, let's say, next-gen and advanced predictive modeling and forecasting to understand, like, what are the implications of certain decisions that we make today five years down the line? So that kind of interesting work is also happening in the commercial space, which I find extremely exciting. That's really awesome, and you know, as a data leader, one of the challenges you mentioned is change management and data culture. You're not only tasked with operationalizing data science use cases that have an impact in the short term. You're also focused on long-term transformational projects like change management, enabling data culture and even research and development to drive long-term use cases. How do you balance between these different priorities and these initiatives, and how do you allocate your team's time and resources as such? It's a loaded question.
So I think the balance between business as usual and innovation is something we continue to strive for. I'll share with you an interesting experiment that we're doing, and I keep on using the word experiment within our org structure because we're constantly undergoing...

...transformations. So we have strong leaders who believe in just being agile and adaptable. So there's a bunch of experiments that we have already in flight, both from our ways of working and cultural perspective. So that's why I keep on going back to the term experiment. So we today have this org structure where the majority of our data scientists sit in a flex pool, right, and as a result, they get to work on different kinds of projects, so they're not tied to one thing that they do throughout the year. They can work on, let's say, one franchise or one business unit or one type of problem, right. So that helps maintain a good balance between innovation and execution. We do also have a dedicated kind of research and innovation function within my team, and they focus exclusively on external partnerships, branding, innovation and best practices, right. So they provide that extra bandwidth for innovation, if you will, that kind of permeates throughout the organization. And then within the larger, like, CDO organization, we have, like I alluded to earlier, data science community champions, right. Those are dedicated roles that we have who are constantly iterating on our culture through events and activities to create that sense of community. We do follow constructs like objectives and key results, like OKRs, to keep and track our goals, and we have a dedicated spot there for various things, like, you know, upskilling, things that we would like to do. And then there's this concept that I think Paul Graham from Y Combinator pioneered, which is maker time versus manager time. It's very easy as a data scientist to be in meetings all day because you're just a glue that connects everything. So everybody would want a data scientist in their meeting, and that we call, let's say loosely, manager time.
So how do you kind of find the right balance between, like, heads-down problem-solving mode, which is maker time, versus, let's say, more managing people or managing stakeholders time? And loosely we try to strive for being at sixty-forty, where sixty percent of our time we spend on problem solving and forty we spend on, let's say, the quote-unquote managerial stuff as a group. So obviously this looks different from individual to individual. But that's the loose kind of compass that we drive towards. So these are some kind of guardrails that we have that help point us in the right direction. That's really great, and I love this analogy from Paul Graham. I think that's a really great way to think about it. Now, we mentioned as well data culture here and change management. How do you view the importance and the challenge of data culture when enabling the adoption of the solutions you create, and what have been some of the ways you've been able to move the needle on data culture? Data culture is huge, right. So I mean this is what I refer to as change management, because there's two kinds of culture, right: culture that is for data scientists. So that obviously needs to be there so that we can be happy and productive and we can retain them. And the major components there for effective data culture are autonomy, like a sense of autonomy on their work; a sense of community, right, so that they feel like they're part of something larger; and then just a sense of growth, right, so they feel like they are learning either from a technical side or from a domain side and improving consistently. So I think those are the three key pillars we anchor around. But there's a whole different side of data culture, which is data culture in the organization, right. So a data-driven mindset, as this is commonly referred to, and unless we are asking the right questions, we will never be working on the right problems. Right.
So to solve this at Merck, what we have is basically an analytics translator role, you know. So within the larger organization, which is the CDO organization, we have dedicated data professionals. The right way to think about them is maybe they have a major in data science and a minor in business, right. So they will sit very closely with our business stakeholders and they act as thought partners, you know. So every time there is a question, there is a certain process that they follow. For instance, what is the action that this question is going to drive? Say the answer is a hundred or, let's say, the answer is zero: how does your decision-making change? Is this like an ad hoc thing, or is there a larger problem that you're trying to get to which has more of a predictive or...

...or a product component to it? So just kind of having that level of dialogue over time, I think, is going to generate maybe more advanced data-driven thinking and a change in data culture. So again, another experiment that we have in progress, but we take this aspect of culture very seriously. Yeah, that's really great and I completely agree with you. I think having someone in the room that can speak both languages, the business language and the data science language, will elevate everyone's skill set, whether that's the business folks getting more of a data language or the data folks getting more commercial acumen. Now, Suman, as we close out, I'd love to look into the future and see what you think are the data trends and innovations that you're particularly looking forward to seeing within the pharmaceutical and healthcare space in the next few years. Yeah, so maybe I'll limit it to three things, because I am looking forward to a lot of things. Like everybody else, I'm looking to see how AlphaFold and the major announcement last year are going to accelerate drug discovery, so that I'll be watching very closely. Second, I think there's a lot of work that's been done on NLP and conversational AI outside of healthcare, right, and I'm looking very closely to see how that translates into the pharmaceutical industry, because we could definitely use some advanced kind of methods on the unstructured data that we deal with on a regular basis.
And then the third is a little bit out there, but it's a space I'm watching very closely, around, let's say, Web3 and blockchains and how it is going to affect marketing, right, especially this concept of, let's say, privacy-first, first-party data, where end users have control over their data and there's no kind of middleman in between, so there's no Google or Facebook that is kind of trying to track you, and companies like, let's say, Merck will have direct access to your data with your consent and you get to monetize it. That will change the commercial landscape fundamentally, right, because today the data we have is reliant on what, let's say, data aggregators provide us. Tomorrow it's going to be all first-party data, or data that you provide to us directly. And this concept of linking across multiple, let's say, of your web experiences is going to be easy because it's just one ID that you have throughout your Web3 experience. So I mean, very early days for Web3. Obviously it's still kind of in the ideation phase in many ways, but it's a space I'm watching very closely because that's going to have huge implications for how we do sales and marketing and analytics around it in the future. That's really awesome and really exciting. Now, Suman, as we close out, do you have any final call to action for today's listeners? First of all, stay safe. I know it doesn't feel like COVID is still around, but it very much is, so continue following guidelines. Second, I would just say follow the life sciences data science space very closely, because the majority of the disruptions are going to happen in this space very soon. And then, third, I will say we're hiring, so if you guys are looking for opportunities, please apply or reach out. That's awesome. Thank you so much, Suman, for coming on the podcast. Of course, thank you for having me. I had a good time. You've been listening to DataFramed, a podcast by DataCamp.
Keep connected with us by subscribing to the show in your favorite podcast player. Please give us a rating, leave a comment and share episodes you love. That helps us keep delivering insights into all things data. Thanks for listening. Until next time.
