top of page
Schedule a Consultation

Have you read our blog and still have questions? We offer no-cost consultations to portfolio firms and organizations seeking further advice or assistance in strengthening and growing their Product and Tech teams. 

Sign up now to schedule your session with one of our expert principals. 

Recent Posts
What's next?

Interna principals present at events worldwide. We send out a monthly newsletter with information on where to find us next and how to stream our talks to wherever you are. Our newsletter is filled with additional tips and tricks for executive leadership and the latest stories in global tech news.

 

Stay up-to-date by subscribing to our newsletter using the button below.  

DavidSubarHS1 (2).jpg
I'm David Subar,
Managing Partner of Interna.

 

We enable technology companies to ship better products faster, to achieve product-market fit more quickly, and to deploy capital more efficiently.

 

You might recognize some of our clients. They range in size from small, six-member startups to the Walt Disney Company. We've helped companies such as Pluto on their way to a $340MM sale to Viacom, and Lynda.com on their path to a $1.5B sale to Linkedin.

From Neuroimaging to Chief AI Officer: Jon Morra’s Journey






In a revealing discussion with David Subar, Jon Morra, a seasoned technology executive at Zefr, delves into the nuanced utilization of machine learning across different sectors. With a background in biomedical engineering and extensive experience in data quality and machine learning policies, Morra explores the strategic deployment of AI technologies, emphasizing the importance of discerning where and how to implement these tools effectively. He discusses the substantial costs associated with advanced AI models, such as LLMs from OpenAI, and provides insights on making judicious use of such technologies in business operations to optimize performance without incurring unnecessary expenses.


Further, Morra shares his journey from his academic pursuits in neuroimaging at UCLA to spearheading innovative projects at Zefr. He touches on his initial ventures into entrepreneurship, followed by roles at eHarmony and eventually Zefr, where he now leads as Chief AI Officer. Throughout the conversation, Morra underscores the challenges and advancements in machine learning, from early algorithmic trials to contemporary issues in AI deployment and policy formulation. This rich dialogue not only sheds light on Morra's technical expertise and leadership in integrating AI with corporate strategy but also reflects on the broader implications of AI in enhancing business processes and decision-making frameworks.


Some of the things Jon covers are:


  • Embracing Vector Databases

Why they are necessary and how they can supplement and supplant ML.


  • Machine Learning Evolution

How ML has changed and why some older techniques are still better from some.


  • Effective ML Ops Strategy

It is not just about algorithms - you need a process to make ML succeed.


  • Cost-Effective AI Strategies

Foundation models are expensive. If your company uses them you need a model that pays back. Jon tells us how Zefr does it.


  • From Tech to Leader

Few of the things you need to be an AI leader are taught. Jon tells us his path.



Timestamps


00:00 Introduction and Background


01:01 Jon’s PhD Research on Neuroimaging and Machine Learning


04:13 Early Machine Learning in Dating Apps


05:31 Zefr: Understanding Social Media Content for Brands


07:28 Post-Bid Verification on TikTok, Meta, Facebook, and Instagram


08:07 Approach to Content Moderation: Clustering vs. Large Language Models


11:29 AI Policy Development and Codification


15:12 Variable Cost with Fixed Price - Applying AI to Challenging Economics


21:20 Active Learning Paradigm for Content Moderation


22:39 Applying the Right Model to the Right Problem


23:43 Machine Learning and AI Impact on Engineering Model


29:27 Testing and Deploying LLMs


32:12 Vector Databases


43:15 Board Understanding of Generative Models



Transcript


[00:00:00] David Subar: Hello, everybody. Today I talked to Jon Morra. Jon is a technology executive at Zefr, and he has some very interesting things to say. He deals a lot with data quality, with machine learning, with policy around machine learning, and has been doing it for his whole career since his PhD. Jon's going to talk about some generative techniques, but he's also going to talk about where do you not want to use machine learning? Where is it too costly? Where does it not make sense? And where you do use it. What kind of techniques do you use? He's going to talk about the cost of using some of these AI techniques, these models, LLMs, for instance, with OpenAI are expensive and you don't want to use them everywhere. How do you decide where to use them and when do you not use them? What do you replace them with? He's going to talk about vector databases and how they provide new search opportunities for companies, and then he's going to talk about what does it mean to mature in this industry. This industry is new and it has different implications for teams, for processes, and for executives like Jon.


[00:01:20] David Subar: How they need to think about their jobs differently, including how do you talk to your board about it. Jon's a super bright guy. I've known him for years. I think you'll enjoy this.


[00:01:38] David Subar: Jon, thanks for being here today. Good to chat with you. I've known you for a long time, but other folks, folks listening, have not known you for so long. Tell people about your career, a little bit about how you started with your PhD.


[00:01:55] David Subar: Zefr. Draw the line from here to there.


[00:02:00] Jon Morra: Sure. So when I got my PhD in biomedical engineering from UCLA, focused on neuroimaging. So when I showed up at UCLA, the lab was trying to understand how Alzheimer's disease affected people's brains, specifically how their hippocampus and their amygdala changed shape throughout disease. And one of the problems they had was that in order to determine where the hippocampus was, they had undergraduates go and take slices of brains, of MRIs, of brains and trace. So here's a slice trace, here's a slice trace, here's a slice trace. And they needed to do that for thousands of brains over time in order to show the efficacy of some treatment. What I was very interested in ...


[00:02:45] David Subar: How many slices an MRI deblock in a brain.


[00:02:48] Jon Morra: Oh, my gosh, that is a great question. Probably hundreds, maybe, but it depends on the resolution. So depending on how thick each voxel is, a voxel is, a 3D pixel would determine how many slices you would need to do and what I was very interested in at the time was making that process better because. Because I saw people tracing and I saw lots of mistakes. I saw lots of people who were bored. I saw lots of people who whatever. And the field that really intrigued me was not neuroimaging so much, but it was machine learning. And so my PhD was in this idea of, can we make this labor much better, much faster, much cheaper using machine learning.


[00:03:32] Jon Morra: And so at the time, the hottest algorithm was boosting. So this is GBMs, maybe an acronym that people have heard gradient boosting machines. And so we developed an extension upon GBMs, and at the time, that was worthy of a PhD. So after that, I graduated from UCLA. I worked for some time as a software developer, writing pax picture archiving communication system. It's software that radiologists use to store, retrieve, view medical images, and decided that software engineering was fine, but my passion was in machine learning. So I started a company, with some people from UCLA applying my dissertation technology to not brain MRIs, but to cancer treatments. So I had a company that helped with what's called external beam therapy.


[00:04:20] Jon Morra: So what that means is that if somebody has a mass tumor. We worked a lot on prostate in order to decide where to irradiate the person from outside. So this is, you lay down, you have a machine that fires photons on a line, kills tissue on that line, and then exits your body. You have to know what you can hit and what you can't. And it turns out that if you hit your fatty deposits on the side of your body, not a quality of life issue. If we nick your bladder, that's a quality of life issue. So you have the same problem of saying, we have a CT, in this case, of somebody, we have to trace out all these organs. Very same application.


[00:04:52] Jon Morra: Unfortunately, in 2011 and twelve, when I was working on the company, the FDA didn't really approve machine learning methods for patient outcomes. So 510K approval was. Was too hard. So that company ended up folding. And then I found myself over at eHarmony, which in Los Angeles at the time was one of the preeminent shops doing machine learning. I focused on imaging. So, understanding your photo. And then I…


[00:05:21] David Subar: eHarmony being a dating app.


[00:05:23] Jon Morra: Oh, yes. Thank you. Yes. I assume that people know what eHarmony is and the brand isn't what it used to be, so maybe that's unfair.


[00:05:31] David Subar: No swiping. No swiping on eHarmony.


[00:05:33] Jon Morra: No. And in fact, by the time when I left in 16, Tinder was just rising. We're like what do we do with these new guys? Are they even gonna make it? They're not gonna make it. So, yeah. So I worked my way up to lead all of data science there. I led algorithmic matchmaking, dynamic pricing, fraud modeling, churn modeling, and we were at the cutting edge. Some of the problems that we were facing are now solved, like, how do you even deploy a machine learning model? How do you ensure it's the same at training and inference time? These were new problems at the time that I was there. After that, I came over to Zefr, where I was first started as the VP of data science, and I oversaw data science as a whole.


[00:06:11] Jon Morra: When I started here, we didn't really have data science, and subsequently, I've been here for seven and a half years, and now I'm the Chief AI Officer, and I lead data science, machine learning, engineering, content policy, and product as well.


[00:06:25] David Subar: Excellent. So tell. Tell everyone what Zefr is.


[00:06:29] Jon Morra: Great. What's the Zefr? What's the Zefr? Well, it's not spelled like the. I think it's a blimp. Z A P H Y R. It's spelled Z E F R. It's actually the name of a skate troop in Venice. And so the founder was a guy who lived in Venice, and that's how the name came about. But what the company does, it's had a couple of iterations, but it's an ad tech company focused on understanding social media content for brands.


[00:06:54] Jon Morra: So we have two main product offerings today. One is a targeting business that's primarily on YouTube. And so in this business, a brand would come to us, usually via their agency, and say, we have a campaign we want to run on YouTube, and we want to target certain types of content. So imagine you're a big box retailer and you want to do a back to school campaign. You might come to us to find the YouTube videos and channels that are about back to school, and you run your ad adjacent to those, those videos and channels. The other business that we offer is what's called a post bid verification. So what that means is on the targeting, it's before the impression is shown. Post bid is after the impression is shown.


[00:07:35] Jon Morra: What content did the ad support? What content was it adjacent to? So we offer that on TikTok, on , with Facebook and Instagram, and on YouTube. So our customers will get a report on this is the type of content you ran on. Maybe it's good, maybe it's bad, and then some action that they could take to help influence this campaign and subsequent campaigns.


[00:07:55] David Subar:

So is that real time data, or is that. Or, I mean, is it real time data? Is it near real times? It happens a month later, they get that data.


[00:08:03] Jon Morra: So it's definitely not real time. And a lot of the technical decisions that we've made is because we do not have these really fast response times, because the amount of content coming through is billions, tens of billions of pieces of content. But we do have SLA's that are usually measured in about a day. And so the way that it works is the brand goes to the platform and authorizes the platform to send us the data. So what we have is we have a truly independent observation once we have the content ID, and then we rely on the platforms to tell us what the adjacent content was.


[00:08:36] David Subar: Okay, so do you run a whole bunch of stuff there?


[00:08:42] Jon Morra: Yeah.


[00:08:45] David Subar: So do that list again of the stuff you run, because I want to explore a couple of those things and ask you questions against those.


[00:08:53] Jon Morra: Yeah. So reporting into me is product content policy, which is a trust and safety function, machine learning, engineering, and data science.


[00:09:01] David Subar: Great. Okay, so, as I remember, you're not really doing large language models, you're doing more clustering.


[00:09:15] Jon Morra: Yeah. So basically, we have so much content, we come in, and when we originally designed this, back in 2017, 2018, 2019, the state of the art was, and you could argue maybe is maybe is not maybe, it's changing. Getting mass human review of a bunch of content with a policy in mind and then training a supervised machine learning algorithm on top of that. All our systems were built around that, saying we have a crowdsourcing partner. They watch content, we observe their work, we kick out some people who are bad, we teach some people who are okay, and the people are good. We give more work to be done, and we pay them for that. We would scoop all that data up and learn a model on top of it, and that's what we deploy. What's happening right now with LLMs is this idea that the act of human labeling is kind of moving from a human function to arguably a very large model function.


[00:10:17] Jon Morra: And this is arguable. You could see times when the LLM spectacularly outperforms people and other times where it makes terrible mistakes. The way I like to think about it is you're trading mistakes. You no longer have to deal with mistakes of people who are tired, people who are adversarial, people who don't show up with other mistakes, which is the LM doesn't understand current events. It may not understand specific languages, it may not understand colloquialisms and so there are different problems that you have to tackle. But what I would argue is the other thing that's happening is foundational models everywhere have gotten a lot more democratized. When I was at eHarmony, we trained everything from scratch. You'd have a bunch of matches.


[00:10:57] Jon Morra: And our predictive model was seven day, two way communication. So given a man and a woman, would they talk in seven days? We started from no knowledge of the world, nothing learned everything from scratch. No data scientists should ever do that. Today we have great foundational models on hugging, face being the most prominent. And so the idea is, do we still have to have massive millions and millions and millions of labels, or do we need tens of thousands, thousands or even hundreds of really high quality labels that really tell the story in order to train a good classifier? And so that's why I think the problem is moving to the search paradigm.


[00:11:36] David Subar: The search paradigm mean to generate those small number of labels.


[00:11:42] Jon Morra: That's exactly right. So when you look at a platform like, let's just take TikTok, for instance, you can find content about everything. I mean, everything with a capital e in the world is on TikTok, right? And so if you want to train a classifier to do something, let's say you want to train a classifier to, say, somebody using drugs on this TikTok video, which is something our customers care about, right. What you really need to do to train that classifier is no longer find millions and millions of examples. You need to find the right thousand, 2000, 3000 that really codify the difference between drug use and us according to whatever policy you have. Now, you're in the search paradigm, and you need to label those accurately, obviously.


[00:12:19] David Subar: Okay. So that there's a whole bunch of stuff packed in there. Let's talk about what does policy mean?


[00:12:27] Jon Morra: Oh, gosh, policy. So anytime you're dealing with a trust and safety workload, right, what usually happens is you start with some market facing idea of what you want to do. So in our industry, there is an industry body called the Global Alliance for Responsible Media, or GARM, who publishes a two page standard that we try and implement stuff like adult content, crime, weapons. These are all bad things that brands would want to avoid. You can look this policy up, and when you read it, you will realize it is very under specified. We could sit here and say, we don't want crime. But gosh, there's so many pieces of content on TikTok or Meta that are like maybe kind of, I don't know. What do you mean? About, like, people committing crime in GTA, like, robbing a car at a video game.


[00:13:15] Jon Morra: Is that crime? Maybe. And so what you have to do is the hard work of saying, okay, we're gonna take these market facing ideas and really codify them. And this is not necessarily a technical. It's not a technical task. It's really writing down, what does it mean to be these things in as explicit detail as you can. We have a whole team of people who work on codifying this policy in a way that it's representable to our review workforce, whether that be our employees, crowdsource employees, or maybe an ELA.


[00:13:44] David Subar: Okay, so they've now written down some policy. So now some humans have to read it. It needs to be some kind of error quoted, machine understandable. Machine readable. So there's a translation there. What's that look like?


[00:14:01] Jon Morra: So this is where we've really seen a paradigm shift in content moderation as a whole. It used to be that you would work with large workforces, potentially, and the real challenge was to say, okay, we have, let's say, three people, four people who wrote the policy, who know it inside it out, and not only understand the policy as written, but what they meant to convey in their heads. You have to disseminate that to all these people. You have to test that. They understand the policy. You have to be able to listen back up, because you haven't thought of every edge case necessary maybe to resolve some ambiguities or conflicts or whatever. That is a management nightmare. Once they understand the policy, the workforce, whether, again, it's your workforce or crowdsourcing workforce, kind of doesn't matter, then they can start implementing the policy by saying, okay, we have a piece of content.


[00:14:52] Jon Morra: We have a policy. Yes, no, essentially. And then just go down like that. Where the paradigm shift is happening is now, these large language models are so good at understanding really well written policy, and the devil is very much in the detail, whether you make one call, whether you stuff it all in the context window and all this kind of stuff. But they can make these calls really well. And so now you have this idea, and we're discussing this in Zefr right now, like, what exact role do the humans have? I think they will always, at the very least, have an oversight role. I don't know how much of a labeling role they have. So right now, we have them labeling high value content.


[00:15:32] Jon Morra: So for us, high value content is content that got a lot of impressions, because if content got a lot of impressions, it's more important to our customers. So we want to make sure that the answer is they're right. So we'll have a person review it. But do they have to label for the purposes of training less expensive models that we put in production? I don't think so.


[00:15:52] David Subar: Okay. So I want to dig down that, and then we'll pop the stack back up into this whole policy thing. Okay, so you just talked about tiers of information value. So LLMs are expensive. You've kind of hinted at thoughts about that. LLMs are variable cost. Your model, Zefr's model, Zefr's business model, variable cost model, or fixed cost model on a fixed time.


[00:16:35] Jon Morra: Yeah. So what we want to do, this is interesting. This is a unique quirk to how advertising works. If you were to look at most content moderation systems, or any content understanding system for that matter, through an API, they would give you a price. They'd say, this is how much it costs to do one call. It costs whatever fraction of a cent to do 100 calls. Maybe you get a discount, and then you go on from there. In advertising, most transactions occur on a CPM model.


[00:17:02] Jon Morra: So that's cost per thousand impressions, where M is Milli. Very confusing. And so what that means is it's cost per thousand impressions, not per piece of content. So we incur cost over its effort per piece of content. We run inference on not per time that content is viewed by somebody on Facebook, for instance. So what we have.


[00:17:27] David Subar: Let me unroll that for a second. So the advertiser gets charged per views, you get charged comparing content to other content. Right. Am I staying that?


[00:17:43] Jon Morra: Not content. Content to a policy. Right. So every time we run inference. Yeah, yeah.


[00:17:48] David Subar: Fine. Right. And so their cost is. Is abstracted from the price that you're charging.


[00:18:01] Jon Morra: That's right. That's right. Right. And that's our most. We have other billing models, but a CPM billing model is the most common because that's how the industry usually.


[00:18:09] David Subar: Yeah. Okay. Which means that you could be grossly profitable or grossly unprofitable in unpredictable ways, because your cost model and your price model are different.


[00:18:21] Jon Morra: That's exactly right. So we have this mismatch. Right.


[00:18:24] David Subar: Okay. How do you solve that?


[00:18:26] Jon Morra: Exactly. So the naive implementation, this is something that I've tried for a long time, is, why can't we just pass this on to the customers? Why can't we say, hey, customer, you can pay us a, b or c for different quality. You could choose whatever you want. The customer doesn't want to do that because the way that the customer interacts with the product is as a verification provider. Our job is to spot check, essentially check, not spot check check everything that the platform does. And so the customer comes in when we tell them to, we think there's a problem or something that they could improve, and they then spot check us out and then they take action. They're never going to be in a position where they look at all of our inferences. It's just not going to happen.


[00:19:08] Jon Morra: They're never going to use our product that way. What that means is we have to build a tiering system. At Zefr, we have a three tier tiering system. And the idea is content. As it increased in importance, we will devote more money to from an inference standpoint. So the lowest here that we have ingests the data we get from the platform without any extra calls. So this would be like the title of the YouTube video, the thumbnail of the YouTube video, which we get for free right on TikTok. We get the text overlay, for instance, with no extra money.


[00:19:45] Jon Morra: So then we have models that'll run on top of those, and that's tier one. Then we'll have tier two, which ingests expensive features. So, for instance, getting the transcript of a video costs actual money. So we couldn't afford to run that on every single video because that would be cost prohibitive. So those pieces of content that are more important, and I could describe what that means, would get that. Maybe they'll get thumbnail, they'll get frames extracted, they'll get all their expensive features extracted. And we run more expensive models in tier two. And then tier three is human review.


[00:20:17] Jon Morra: So the idea is, as content becomes more important, it flows from tier one to tier two to tier three. And how we tune the thresholds between that flowing allows us to control our costs.


[00:20:29] David Subar: So let me, let me repeat this back to you. Customer. You're on some platform like TikTok, we're going to charge you in ways that you're expecting to be charged. Cost per CPM, cost per thousand. And, but you're. But for things that are not very important, we know because the number of impressions is small, we're going to do a, we're going to look, but we're not going to look too hard. Things that are medium importance, we're going to look harder. Your cost per thousand cost is the same. And things that are super important to you, we're going to look harder because our costs are inferencing


[00:21:13] David Subar: costs on each of these tiers go up for us. So we're going to apply more expensive, higher valuable, more valuable things to things that we know are more valuable to you because how many people are looking at it? And that's the way we're doing the matching. And so there's some intelligence. You have to understand how to do these tiers.


[00:21:35] Jon Morra: Yeah, that's exactly, you said it exactly. Right. And right now the tiering is based mostly on impressions. Right? So as a piece of content gets more impressions, it will be promoted. This is actually an active area of research internally, like what should be the promotion criteria to move up in tiers other than impressions.


[00:21:58] David Subar: Is that itself an AI model in waiting?


[00:22:03] Jon Morra: Oh, that is an interesting thought. I had not considered that. So you can, you dabbled into this idea of active learning. So are you familiar with active learning?


[00:22:15] David Subar: I am, but not everyone might listen. Might. So give us two minutes on that.


[00:22:20] Jon Morra: So great. So active learning is this machine learning paradigm where it's a classification or regression problem. Are you trying to predict a class or some real value number, but in addition to the prediction that you make, you return some side information, which is how much do you want to spend out of some budget in order to get the ground truth? So the common way that we think about this is, for instance, in medicine, I've heard this talked about this way where it's like, hey, we have a patient and we want to do some small diagnostic tests that's cheap, and it gets some answer, and then depending on maybe the output of that answer or the patient themselves, maybe they have preconditions. Oh, we want to spend more money to do a more expensive test because we really want the ground truth. And so you've kind of, I hadn't even thought about it that way. But yes, you could think of this as an active learning problem.


[00:23:09] David Subar: So I think it's worth talking about this because there are a lot of companies diving into some kind of machine learning AI models. Many companies don't know the cost of it. Some then run away. But I think if you have the sophistication, you have, each applying the right model to the right level of problem, which is to say, don't spend all your money everywhere. How do you spend your money properly?


[00:23:39] Jon Morra: I think the reason that we arrived at this is because of our very early adoption of machine learning in general. So a lot of the smaller startups that I talk to either are using a wrapper around an LLM. There's nothing wrong with that, to solve a customer problem, or are replacing a human process with an AI german process. So let's say writing copy for marketing, that's a common use case. You could say, hey, we can go from three people to two people if we use this process. And so now all of a sudden you're doing this cost reduction journey with AI. What we're doing, we're actually, LLMs actually increase our cost by a lot. So we don't have a choice but to be disciplined.


[00:24:24] Jon Morra: Because if we just said, oh, just deploy GPT four or CloudThree and call that on every single category, on every single piece of content. Last time I ran the numbers, it cost something like $150 million a year on our compute budget. And I don't think my CEO would be too happy with me if I did that.


[00:24:42] David Subar: You know, maybe, maybe if you just cut your salary in half, then that would make it up.


[00:24:46] Jon Morra: I don't know if I can quite make half of that, but.


[00:24:50] David Subar: We should have such trouble.


[00:24:52] Jon Morra: Exactly.


[00:24:54] David Subar: Okay, so that you kind of transition to the next interesting thought that I want to share you to share with us is machine learning, AI, data science. Data engineering changes the engineering model. Like how do you deploy people? What kind of people do you have? What should the design be? What should those processes look like in a classic model from classic, like two years ago, we kind of knew squads and maybe different organizational designs, team topology, the team topologies book all. That was great. And then if software were eating the world before. AI is not only eating the world in terms of product, it's eating the way we think about engineering organizations, product management organizations. How do you even think about who do you want in your organization? How should they talk? So I'm going to give you two questions that are related. You're a startup.


[00:26:04] David Subar: What? And let's just say you're at a startup similar to Zefr. Doesn't to be exactly the same, but similar kind of B2B SaaS. You have a limit. You know, you call it a series. What kind of people you are in your organization, how should they talk? And then you're a mature or more sophisticated, mature company. Maybe you're not like top, you know, maybe not flatlining, you're still on the hockey stick of growth, but you're further up. How would you think about it then?


[00:26:35] Jon Morra: So if I were doing organization for series A, you have to be aggressive with what you can roll out. So I, if you're in the tech space, if you're in the image space, which I'm going to assume you are, because there's still like, I was actually having this debate with somebody at a conference recently. Like what is the most deployed model by count of models, not count of inferences in the world. I think it's still probably XGBoost. It's still probably a boosting model because most problems are tabular data, and boosting is very good at tabular data. Let's put that aside. Let's assume that your question is really focused on like text or images, something where you need a bigger one. I would start with wrappers around publicly available LLMs wherever you can, right? Because I think that they know a lot.


[00:27:21] Jon Morra: You can get 80% of the way there for so many problems really, really fast. And everything I would be focusing on would be delivering customer value while extracting feedback as fast as you could. Because if your bet is that you're a series A and you're going to differentiate because you're faster, nimble, you're going to eventually have to differentiate because of quality if you're in the right domain, because there'll be competitors. And the only way you could achieve quality is by constant observation of what people did and then the ability to act on that observation, fine tuning, retraining, color, what you want. That's for the series A, for the more mature, and this is something that we struggle with because we are a more mature machine learning company, is how do you pivot to incorporation of LLMs in your process in a way that makes sense from both a cost and quality standpoint? Let's say you have, like Zefr, some machine learning models deployed. What you need to do is be able to constantly test LLMs while making sure that your quality doesn't fall off. And this is hard. And so this requires maybe more researchers.


[00:28:27] Jon Morra: This requires people who understand the nitty gritty of how these things work. So for instance, how do you tokenize? Right? How do the embeddings work? How do you make sure that the model that you're deploying understands multiple languages or different video? This is hard to do in general, and so you gotta take a more cautious approach.


[00:28:46] David Subar: So in that second, they're both interesting. The second is interesting for a new way that I don't think we had to deal with as much before. In the old model of startups, you deploy an MVP. You release it to the market. The world tells you it likes the green button, not the red button. You iterate, you do it again, and repeat. Keep doing. And you build a product team and engineering team that knows how to iterate and release and instrument.


[00:29:17] David Subar: Do all that very quickly and then learn in this new model for the more mature company. You're suggesting capital investment in teams that have, I'll call it longer term effects. It takes a while to build that data team. It takes a lot while to build whatever inference model you want. It takes a while and it may not work. And so the loop of build release learn, you know, do it again and again is just longer.


[00:29:57] Jon Morra: And I think that's been true for machine learning. I think that, I think that's been true for machine learning teams for years. Right? Like, like I've experienced this for most of my professional career. Is that the act of saying, we're going to gather some data, we're going to do some training, we're going to test the model offline to make sure that it doesn't break the world, whatever that means for your business, and then we're going to deploy it and then we're going to wait for more data to come in. Has always been a slow process. Right. So for me it's like, yes, obviously that takes a little while. I think the rest of the world is realizing that deploying AI because it's so easy, you could just deploy chat GPT wrapper.


[00:30:36] Jon Morra: You're in production with it, like no problem until there's a problem, right? And so, like, one of the interesting things is I'm actually seeing startups and come out around like automated red teaming of like, hey, you have a model that does something or even a prompt that does something. Let me try and figure out all of the different ways offline that people are going to interact with it so I can deploy it more safely or more quickly or whatever, so that you can actually figure out what the heck this thing's going to do in production. Because when I was younger, working at places like eHarmony, you only had a certain breadth of predictions you could output, right. And you could say, match these two people. That's kind of it. Now the output's really unbounded. Write whatever you want. Like, you have to figure out how to handle that, which is hard.


[00:31:18] Jon Morra: So, yeah, I do think we'll need to be a little more cautious.


[00:31:21] David Subar: So how do you decide what to do and not what to do? And then how do you sell to your CEO? You're going to give me X amount of money, and here's the results that may or may not happen, and here's why it's important and that's why you want to give me X, but you're not going to know for six months whether this was a good investment. How do you, how do you have a conversation.


[00:31:45] Jon Morra: I feel like the conversation has changed over the last year and a half. When OpenAI first released chat GPT, it's why aren't you doing this? Why aren't you stopping everything you're doing and just deploying LMs? Because they look amazing when you just test them out of the box. But I'm glad we didn't make that pivot. And now I think the question you're asking is the real question that we're getting, how do you show efficacy? And I think a lot of what you need to be able to do is show incremental improvement offline. So coming back to the real crux of what I think machine learning, at least in our space, is, is a search problem. So it's not about can you get the right answer on the easy questions, right? It's about how do you define difficult, how do you surface that and how do you show that you can get the right answer there? Nothing about testing a model has changed in my opinion. Right. And what I mean by that is you still need some universe of data, you need some labeled examples, right? However you got those, and then you need to be able to show that you're right often enough, and often enough is defined by your business.


[00:32:48] David Subar: Got it. Got it. And there needs to be some trust.


[00:32:53] Jon Morra: There needs to be some trust, but the trust needs to be established in data. And so this is like, you can't just say trust the model. I mean you can, but you get, you don't know what you're going to get.


[00:33:04] David Subar: Yeah. So let's pop to another topic. Vector databases.


[00:33:11] Jon Morra: Sure.


[00:33:13] David Subar: What are they, how do you use them? How do they, you know, how do they address some of the clustering issues you're thinking about? How else do you use them? Yeah, great.


[00:33:24] Jon Morra: So I'd be happy to. So vector databases are relatively new as an idea, and they came about because of an embedding. So basically an embedding is a way to express important information in a fixed length vector, a compression algorithm, if you will. The first real popular embedding was done through word two, vec. This is a model from I think 2004 where you could literally take a word and convert it into a vector and much, much more pop the powerful embedders, also called encoders. If you ever heard the term encoder model, GPT is what's called a decoder style model. So it takes text and decodes it into other text. And encoder style model is Bert and its friends.


[00:34:10] Jon Morra: It takes text and compresses it. There's also encoder models for images. There's encoder models for video, for audio, for a lot of things where there's this notion of an order, right? So in text, it's a 1D order, an image, it's a 2D order. In video, you could argue it's a 3D order of information, because these embeddings have gotten more and more powerful. What you're able to do now is take information, text, image, whatever, compress it down, store that compression in such a way that other items that are semantically similar are near in space to that vector. And so what you, what we, this problem arose of saying, well, now we have to implement an N-Squared algorithm very fast because the common query is, okay, we have this whole store of vectors, we have a new vector from the wild, tell me the stuff that's close to this, and do it very efficiently. And so the squared algorithm...

[00:35:06] David Subar: Because it wants to be N-Squared, it's not going to be efficient unless you do something different.


[00:35:09] Jon Morra: Exactly, exactly. And so company, there's a couple of algorithms, HNSW is one of them. IVPFQ, I think, is the other one, that are able to do approximate nearest neighbor very, very fast. So companies have reason to support this model, and a lot of the big database companies are either in the process of retrofitting their old stuff or offering vector databases as well. This is so important for our use case, for instance, because again, we want to find these needles in haystacks, we want to find those hard examples. So if I can get ten of them and I can go ask somebody else, hey, give me ten that are each like this. Now all of a sudden I have 100. And you can see how this iterates very fast, and it's much faster because you can do it at the semantic level, not at the keyword level, and in images, you can't really do it at all.


[00:36:05] Jon Morra: We didn't have a great way to compare images before.


[00:36:09] David Subar: So normalized databases have its place, but that world is smaller as a percentage of the whole world, because we have this other kind of problem that it turns out to be maybe even a much bigger problem than we've been solving. We were looking up, we're doing basically lookups, which maybe arguably is a search problem, probably not. Now we have a real search problem.


[00:36:33] Jon Morra: Maybe. I mean, you could argue that Elasticsearch and Lucene have been solar have been solving this for decades, right? And now we just have a better way of doing it. I think that's a fair way to think about it. In the tech space.


[00:36:44] David Subar: Yeah, yeah, yeah. I think that's fair. I think it's fair. Okay, now I'm gonna pop the stack again. So you just, in the beginning of our conversation, you described your career from your PhD work to machine learning to apply at different places. That was Apple call, let's call that horizontal maturation of what you're working on, ok? Vertical maturation of what your role was in different companies, right. You're much more senior role at Zefr than you were before earlier in your career. And you've been at Zefr for a while, so you become more and more senior there over time.


[00:37:28] David Subar: What, what has been your experience doing this? What you've had to learn how to do differently, how you've had to communicate with your teams differently, and how you relate to the technology differently in these different stages.


[00:37:46] Jon Morra: That's a fascinating question. So when I started as an IC, right, especially at the company I founded, obviously, when you found a company, you do everything right. But my job was to write software. I would spend all day, every day in code. We had another co-founder who would talk to customers. I didn't talk a ton to them, but all I wanted to do, I lived and breathed writing code, and I lived and breathed implementing, like implementing machine learning algorithms. When I did this, there was one publicly available machine learning library I can remember. It's called Weka.


[00:38:20] Jon Morra: I think it still exists. And it wasn't very good. So we implemented everything from scratch. And then what happened in my career is I kind of had this interesting change. As I was getting more senior, machine learning started to come up in popularity. I remember at the end of the time at my company, I was trying to sell it to somebody, to Cedars, the hospital. And I had a medical doctor who I gave the whole presentation on what it does. And he said, well, this is great, but I don't believe in machine learning.


[00:38:52] Jon Morra: And remember, like, I went, I don't believe the sky is blue, yet we're still here. Like, what do you want? Like, it's a statistical process. And to think that that would happen today is, like insane. But in 2012, I think this was 13 like that. That was how people thought about it. So when I got to eHarmony, what I found that was part of the reason I was successful there is because I was able to deliver value on machine learning projects while explaining why the output was probabilistic. So we worked a lot on not deterministic problems. This image probably means that this person's a good match for this person.


[00:39:29] Jon Morra: And because I was able to codify that really, really well and explain the bounds and really provide benchmarks of like, here's how good the model's doing today, here are the changes I made. Here's how good the model is doing tomorrow. Still not perfect. Far from perfect. Absolutely will be mistakes. But because I had a good holdout set, a good training set, I could show incremental improvement. And this is something I've encountered throughout my whole career, is that one offs always come in, but you miss this one, whatever this is. And it's so obvious.


[00:39:55] Jon Morra: How could you miss this? And so being, coming back and saying, you're right, I did. But here's the holdout data as to why you should still trust me, is invaluable. So I kind of learned that at eHarmony, and that really allowed me to rise. And then the reason that I was successful at eHarmony was because I fell in love with MLOps. So the act of really retraining and redeploying a system was what I wrote, spark code. I used machine learning libraries in C++ and Java. And because I fell in love with it, I could deploy these things fast. And that was really, really important back then because the way people talked changed a lot.


[00:40:34] Jon Morra: Coming over to Zefr, I again wrote, when I started at Zefr, even as a VP, I wrote a lot of the original code. It wasn't me writing the algorithms anymore. There were packages that existed, but me writing all the orchestration layer, again, focusing on quality, focusing on holdout. And then part of the thing that unlocked a lot for me was this ability to build a self healing algorithm. So we would always make mistakes. We were working only on YouTube at the time. YouTube is weird. There were always corner cases.


[00:41:00] Jon Morra: And what I was able to do was build a system where, via human review, when somebody pointed out something that was wrong, we verified it was wrong. We sent, I used to use this phrase, it and its friends to be human reviewed, where its friends were, stuff that were similar, it would get human reviewed. We'd retrain, and we deploy. And then three days later, I'd go back to the person who complained and said, I fixed it. And by the way, here's 10,000 other examples that were now fixed because of what you reported. Thank you very much. And this system worked really, really well. And then what happened? So as I got promoted, I went from an implementer to really an auditor.


[00:41:35] Jon Morra: And I like to joke now that my best language is SQL. So what I spend when I'm working with the data science team or the policy team. And in fact, I'm even going to do this today. I'll sit down in front of my SQL terminal and be like, I want to check this and this and this and this. And I can still write all this stuff. And then I would talk to the team. Did you consider this? Did you consider this? Did you consider this? And sometimes the answer is yes, and sometimes the answer is no. And I feel like as a manager, especially in data science and machine learning, my goal is to do two things at the same time.


[00:42:02] Jon Morra: One is set out a vision of where I think we should be as an Org, and then the other is to check where we are now.


[00:42:11] David Subar: Good. And so what skills did you have to learn that you didn't need before?


[00:42:21] Jon Morra: I think this is true of a lot of people who go from IC to management. But learning to let go was the hardest skill. Learning to say I'm going to be okay with not understanding exactly how every line of this code works is so hard. It was hard for me. It's hard for almost everyone I've encountered. And so that, that is, that skill still exists today, right? Where I think it's also extra hard is I have to let go of the data quality. And that took me honestly longer to do. Like, it's not that it's less important, it's still super important.


[00:42:54] Jon Morra: But I can't be in the weeds on every mistake. Right? And so one of the things I have my teams working on now is I need a written document of known weaknesses by language, by model type, so we can make sure that we're covering all our bases. I could generate that, but that's hard to generate. It's the unknown unknowns. How do you figure out which. Don't know, but that's more of like, that was really hard for me to let go because I love data. And even now when I have free time, I'll sit down and write queries. And that's my happy place.


[00:43:29] David Subar: So you do qualify as a nerd. So we'll give that checkbox. I do, too. That's. That wouldn't be the way I exhibit it. But. But I am also so. Okay,


[00:43:42] David Subar: I'm going to ask. I'm going to ask a reverse question of other people that I asked. So, as you know, I. We do this PE perspective podcast. We work with a lot of PE firms, and one of the questions I ask partners at PE firms is, what do you wish your technology executives knew? And what do you wish they would tell you about, so I'm not asking you the reverse question. What do you wish people on the board knew and what do you wish they would tell you about?


[00:44:14] Jon Morra: I wish that they knew about the true power and the true limitations of these generative models. I think that a lot of the board spends a lot of time, and as they should, by the way, reading press about how great generative models are on one hand, and then also how they're going to kill humanity on the other hand. And I feel like we constantly are in this, oh my God, how are you not doing everything to, how could you even trust them at all? And so having more understanding of how we're using generative models and why, at least in my opinion, it's the most appropriate way to do it, is something I wish they understood a little more. Something I wish they told me a little more is how do we align our output quality to value of the company more? Like I said, we don't have an expectation that everyone's going to look at every piece of content. Definitely don't. But who's going to look when? What bias do they have? Do they want low or false positives or lower false negatives? And then how do we make sure that we're saying that, hey, these situations are more important and because they're more important, we should devote more effort there. That's something I wish that we could get better communication on. Okay.


[00:45:36] David Subar: Well, thank you, Jon. This has been great. As always. I love our conversations and I'm glad that we get to share this one with a bunch of other folks.


[00:45:46] Jon Morra: Thank you so much for having me. I appreciate it.


[00:45:48] David Subar: Oh yeah. If someone wants to get in contact with you, how would they do that?


[00:45:56] Jon Morra: Oh, you can find me on LinkedIn if you search, I think Jonathan Morra, I think my handle is Jmorra. You can hit me up. You can always email me jonmorra@gmail.com.


[00:46:09] David Subar: Excellent. Thank you, Jon.


Comments


bottom of page