Machine Learning in Production for Robots with Brandon Rohrer
Principal Data Scientist at iRobot, Brandon has an incredibly popular ML course at e2eML

Brandon Rohrer is a Mechanical Engineer turned Data Scientist. He's currently a Principal Data Scientist at iRobot and has an incredibly popular Machine Learning course at e2eML, where he has made some wildly popular videos on convolutional neural networks and deep learning. His fascination with robots began after watching Luke Skywalker's prosthetic hand in The Empire Strikes Back. He turned this fascination into a PhD from MIT and subsequently found his way to building some incredible data science products at Facebook, Microsoft and now iRobot.

Brandon's brilliant machine learning course:

Follow Brandon on twitter:

Lukas: Brandon, it's really nice to talk to you and thanks for taking the time. It sounds like you've worked on machine learning at a range of different companies, most recently iRobot. I'd love to hear about what kinds of challenges you have at iRobot and in robotics in general.

Brandon: At iRobot, we get to support these little giant Frisbees that run around people's floors and suck up dirt. We have vacuum cleaners and mops which run around on people's hard floors and clean up messes. What's really fun about this is, if you think about production machine learning systems having to deal with whatever input or badly formed requests they might encounter, imagine taking that to the physical world, where you have something that is bopping around literally every type of home in the world. There are 30 million of these things out there now. As hard as we try, we can't imagine all of the challenges that they will come up against, so this is really fun from an algorithm design and engineering standpoint: making something that can get beat on, can have cats ride on it, can run into all kinds of things. They can encounter cords, socks, Legos, Skittles, and how is it going to handle all of these things? That to me is really fun. It's the polar opposite of the sandbox, a carefully prescribed problem where you know exactly what your data is beforehand and you know it's been cleaned up, so it's going to give you a good, high quality answer at the other end.

Lukas: I feel like I've had a lot of friends in the last couple of years move from consumer internet ML applications to robotics. Was it a big adjustment for you? Were there big changes, or was it mostly kind of the same set of issues?

Brandon: Definitely changes, but for me this was coming home. Robots are where I started. My degrees are in mechanical engineering and my graduate work was all about using robots to rehabilitate stroke patients. Knowing that things could break all the time and not to trust your sensors, that's kind of what I grew up with. Then when data science became a more common career path, I rebranded myself as a data scientist and went to agriculture. I went to Microsoft doing cloud machine learning solutions for a variety of different companies. I went to Facebook infrastructure, which is a fascinating set of problems around keeping one of the biggest networks and sets of data centers in the world up and going and running efficiently. What I enjoy about all of these things is that you could not ignore where the data came from. You had to know something about the people and what state they were in when they generated it, you had to know what pre-processing it had, and you had to know about the assumptions that were made along the way. If you didn't know this, then you couldn't build good models to get answers out of it. Robots just take this, put it front and center and take it to the extreme, because if you don't know what a given sensor value means in the physical world, it's really hard to build a good model around it or know how to interpret it. Naive models, where you just throw things in, just don't work well unless you get really lucky.

Lukas: So does that lead you to kind of like simpler models? Like are you more afraid of complexity then for these applications?

Brandon: My personal strategy when faced with something like that is if I don't know everything it is going to come up against, the biggest thing I want to make sure is that when it performs poorly, it doesn't do horrendously bad.

Lukas: So how do you do that?

Brandon: Simplicity is a good one, so knowing exactly what happens. One example of this is in agriculture. One of the hard modeling problems is: I have a field, I have some corn seed I planted on a certain day, I used this much fertilizer, here's what the weather and the precipitation were all season. How much am I going to harvest at the end of the year? If you had a model that could spit that out, you would have solved agriculture, or at least the yield problem in agriculture. But there are so many variables. The model we were working with was a popular academic model that had literally hundreds of variables in it. There's no way that you have enough data to train that model well, and what was really funny is that when we did some analysis using popular settings for that model, you could compare a really naive model, which just estimated a flat rate for all fields everywhere, for all conditions, with this really elaborate several hundred parameter model, and the flat rate model did like twice as well.

Lukas: It's funny because I would think with plants there would be like a lot of complicated interactions, but I guess you just don't have enough data to know.

Brandon: You're exactly right, and that's why the many parameter model didn't do so well. It did account for a lot of interactions, but to get them to work the way they were supposed to, you had to get all those parameter values correct or in the neighborhood, and we just didn't have enough information to do that. An alternative approach then is to start with a very dumb estimate and then incrementally make it a little smarter. By the time I was done, I was working with like a 3 parameter model, one for, you know, precipitation, one for soil texture and one for something else. You want to be able to check it each time and really just listen to your data. So the same holds true for robots or for anything else: simple is good.
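
The "start dumb, then get incrementally smarter" workflow Brandon describes could be sketched roughly as follows. This is an illustrative sketch only, not the model from the agriculture project: the data values, the single precipitation feature and the function names are all made up for the example. The idea is that a flat-rate baseline is fit first, and a one-feature model is kept only if it beats that baseline on held-out data.

```python
from statistics import mean


def fit_flat_rate(ys):
    """Baseline: predict the same value (the training mean) for every field."""
    flat = mean(ys)
    return lambda x: flat


def fit_one_feature(xs, ys):
    """One step smarter: an ordinary least-squares line on a single feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_abs_error(model, xs, ys):
    """Average absolute prediction error on a held-out set."""
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))


# Hypothetical toy data: seasonal precipitation (mm) and yield (arbitrary units).
train_x = [300.0, 400.0, 500.0, 600.0]
train_y = [5.0, 6.0, 7.0, 8.0]
held_x = [350.0, 550.0]
held_y = [5.6, 7.4]

baseline = fit_flat_rate(train_y)
candidate = fit_one_feature(train_x, train_y)

baseline_err = mean_abs_error(baseline, held_x, held_y)
candidate_err = mean_abs_error(candidate, held_x, held_y)

# Keep the extra parameter only if it actually helps on data it has not seen.
keep_candidate = candidate_err < baseline_err
```

Each added parameter would go through the same check, so the model only grows when the data supports it.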

Lukas: Is there anything else, though, to make sure that your models are sort of robust in the face of different types of data? It sounds like you're also maybe kind of pulling apart the problem into sub problems.

Brandon: Definitely. Some problems are amenable to that, and to the extent you can separate a problem into sub problems, that is a great strategy.

Lukas: Although not everyone thinks that right, like some people talk about sort of like end to end autonomy, right? I do think that's a little bit of a point of view.

Brandon: That's true, that's a good footnote. I will say that that is my opinion. I don't think that's generally accepted, I agree. But it is easy to fall in love with your model and see that it potentially explains so many phenomena, like it must be right if we can just get the data to train it correctly. We've seen this even with some vision models: the actual high quality labeled data to train it well would cost so much to gather that it's impractical. So in that case, the model doesn't do much, and if you close your eyes to that and move ahead with poorly labeled data, then badness happens and you get models that are worse than no model at all.

Lukas: Yeah, I think we've all experienced that. I've developed a real interest later in life in sort of mechanical engineering and electrical engineering, and I feel like you've kind of gone in the opposite direction. I guess one similarity that I've found between mechanical problems and machine learning is that you don't get good error messages in either domain. I'm kind of curious, do you think that your background in mechanical engineering has helped you in certain ways in machine learning? How did you go about learning a new field, because lots of people want to do it, and how did you bring the knowledge you had to help you there?

Brandon: In my case, it was motivated by a problem I was trying to solve. In my work, we used robots to help rehabilitate stroke patients. We saw changes in their movements. A good research question is, well, what's going on in the brain to make their movements get smoother? What's going on there? And the more you dig into human movement, a whole collection of questions bubbles up. How does the brain control this hardware that's sloppy, that changes over time, that's not very accurate compared to precision machine tool robots, and that has huge time delays? Time delays that would take any off the shelf robot and drive it unstable, but the brain does it casually. We're half freezing and our neuromuscular dynamics all change, the brain compensates. We're drunk and all of the time delays change, the brain compensates. How does this happen? So this was the problem that I wanted to solve: how could I make something that could control a piece of hardware that it didn't know or understand very well, and that's going to change all over the place?

So that led me on the path of learning what I could about how the brain works, from the point of view of, you know, now I want to turn it into an algorithm. There are still huge gaps there. Even though we call neural networks neural networks, they have no resemblance at all to anything that goes on in the brain. So studying that and then figuring out, if it was successful, what would I want it to do, led me into studying different signal processing methods and different families of algorithms. Then once I started being a data scientist for my day job, aside from this research interest, there was a whole other professional motivation to dig into these things. Then once I started writing tutorials and teaching these, there was even more motivation to learn these things and to be able to understand them well. So it kind of built on itself. That original problem of building a general purpose brain or controller that you could pop into any robot, so that it could learn what to do with it, is still a long term passion of mine. It's my personal 30 year project.

Lukas: What makes that hard?

Brandon: In the real world the robot is never going to experience the same thing twice. You're never going to get exactly the same camera image two times in a row. So one thing that's hard is that you have to deal with always new experiences. You can never learn exactly what to do in this situation, so you have to learn what other situations are similar. That sameness is very hard, and we humans do it so well that it makes it harder to put down into code.

Lukas: You could say that about like image processing or audio processing, and I feel like when you look at the progress in terms of like facial recognition or understanding voice, it's really spectacular, right? We see it in our lives all the time, for better or worse, but I feel like we don't see robots running around like we might have expected. The feats that robots do that I'm kind of wired to be impressed by are actually like incredibly unimpressive to like my mother, right? Whereas in every other field the stuff that ML is doing is sort of amazing compared to humans, but I feel like in robotics we look at what a Roomba does and it kind of blows our mind, but my cat can do much more impressive stuff, right? Less helpful, kind of creating a mess instead of undoing it, but still. Why is robotics so particularly hard?

Brandon: The similarity problem. It's one thing when you have something concrete, like this face belongs to this person in different situations. But if I'm in a city I've never been to before and I had to guess which way the hospital is, how would I do that? So we have a lot of subtle things that we do to get oriented in novel situations. That's one aspect. Another is that machine learning, the way it's set up right now, requires just a whole lot of data. Learning basic things like categorizing images, which is something that we train animals to do on a regular basis, not even very brilliant animals, requires huge amounts of well labeled data, and if you get some poorly labeled data in there, you can mess it all up. And learning to do a thing you've never done before, where you have to make a reasonable guess your first time? There are some people making efforts in that direction, but it's still very early days.

Lukas: What do you think about the approach of simulating data? Does that seem promising to you?

Brandon: Yes. In fact, another thing that's really hard about doing robots is that it's hard to keep your robot up and going. If you ever work in a robotics lab and you get like three solid runs, it's like, great, write that up, that's your thesis, and do it before the next spring breaks. So a simulation is really useful for that, because you can run a robot for thousands of years in simulation time and generate that volume of training data. It's not without its pitfalls, though. I've worked with simulations, and if you talk to anyone who has, they'll have a story about how there's some quirk in the simulated physics or the simulated world and your reinforcement learning agent learned to take advantage of it. In mine, there was a 7 degree of freedom robot arm, and it learned to reach down into the table, because I made it too soft, use the table as a guide and then come up beneath to pick it up. There is a paper that came out not too long ago about OpenAI using a Shadow robot hand with a whole lot of robot simulation to do some of the steps to solve a Rubik's Cube, and a good part of what they did was getting the simulation right. In fact, if you read closely, they actually went back and modified their experiment and their physical hardware in order to make it simulate-able. It does not come easy, but potentially it's a really useful thing.

Lukas: What is the toughest part of going from a model on your laptop to something in production?

Brandon: So when I'm working with data on my laptop, I run it, it all fits in RAM, I get an answer, it spits it out, makes an image, saves it to a file, that's all good. I get an acceptable error rate, I'm good to go. Taking it into production, that trained model then becomes like the easiest part of the whole thing, depending on what you're using it for. Let's imagine you're using it as part of an app or part of a service where somebody somewhere on their phone or on their laptop has to do something that needs the result of this. Let's say it's a weather predictor or a coronavirus risk predictor or something like that. There are all of the pieces to get that request, to make sure that it's not part of some denial of service attack, to make sure that the request is well-formed and it's not going to gum up your model, to get the answer out, to make sure that it gets delivered. All of those pieces, break them down individually and they're fairly simple. You put them on a piece of paper and it looks like a bunch of blocks connected by arrows and it's like, okay, here's what all the things do, that's great. I don't do this myself, but I sit next to people at work who spend their days making sure that all of these blocks run smoothly and all of these arrows are working the way they're supposed to, and it is a full time preoccupation, a full time demand on your attention, to care for and feed these. They're all running on computers that are in data centers somewhere, they're all running on software that's being regularly updated. Anyone who uses Amazon Web Services is probably familiar with new services and new capabilities coming all the time. Occasionally there are breaking changes, so what worked yesterday doesn't work today, and it is a lot harder than it sounds when someone tweets out, "Oh cool, I spun up a cluster and now this thing's running a thousand times faster!" That's super cool, but that hides a lot of effort that goes on underneath, and it also hides a lot of the long term investment required to keep that up and going.
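
One of the "blocks" in that diagram, checking that a request is well-formed before it reaches the model and making sure a bad result never goes out, might look something like the sketch below. Everything here is a hypothetical illustration: the feature count, the allowed range, the fallback value and the names `is_well_formed`, `guarded_predict` and `toy_model` are assumptions made for the example, not anything from iRobot's systems.

```python
SAFE_DEFAULT = 0.5  # hypothetical fallback answer when the model cannot be trusted


def is_well_formed(features, expected_len, lo, hi):
    """Reject requests with the wrong shape, wrong types, or out-of-range values."""
    if not isinstance(features, (list, tuple)) or len(features) != expected_len:
        return False
    for value in features:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        # NaN fails every comparison, so this range check also rejects NaN and inf.
        if not (lo <= value <= hi):
            return False
    return True


def guarded_predict(model, features, expected_len=3, lo=-100.0, hi=100.0):
    """Return (prediction, status); never let a bad request or bad output escape."""
    if not is_well_formed(features, expected_len, lo, hi):
        return SAFE_DEFAULT, "rejected"
    try:
        raw = model(features)
    except Exception:
        return SAFE_DEFAULT, "model_error"
    if isinstance(raw, bool) or not isinstance(raw, (int, float)) or raw != raw:
        return SAFE_DEFAULT, "bad_output"
    # Clamp to the range a downstream consumer is assumed to expect (a risk in [0, 1]).
    return min(1.0, max(0.0, raw)), "ok"


def toy_model(features):
    """Stand-in for a trained model: any callable taking a list of numbers works."""
    return sum(features) / 10
```

A call such as `guarded_predict(toy_model, [1.0, float("nan"), 3.0])` falls back to the default instead of passing the NaN through. This ties back to the earlier point about making sure that when the system performs poorly, it does not perform horrendously.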

Lukas: I totally agree but the things that you've described feel like the difference between sort of any kind of demo on your laptop and any kind of production thing but I do feel like there is at least kind of like a trope or a meme or something about how machine learning is particularly hard to do this with. Do you think that there's something special about machine learning that makes it extra hard to put the stuff around it and make it stable? Or do you think it's just that people just get too excited about demos in general?

Brandon: Yes. So one machine learning specific issue is that it's almost impossible to consider all the possible inputs you'll get. For instance, say you want to take an image as an input and put a filter on it or do some kind of identification on it. It's very possible that basically your users are now adversarial, and some people out there are going to, either intentionally or by accident, come up with things that will break what you're doing. So being able to identify that matters. I mean, it may not take the service down, but it might produce a result that's undesirable or offensive or at least embarrassing. So you kind of have to keep an eye on that. The other thing is, a lot of times when we train a machine learning model, there is a training set, a validation set, maybe a test set. You train it. You get good results on that data and you go. That assumes that the world doesn't change, which is a terrible assumption, because as soon as you deploy that model, whatever phenomenon you are modeling is going to start gradually shifting. A great example of this is weather. If you had a really good weather predictor in 1970, it would probably not be worth very much today, and having the ability to not have a static model, or to periodically retrain and redeploy, is important if you want to keep that up and going. Those are the two big ways that I've seen machine learning models in particular fail.
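
The second failure mode, the world shifting after deployment, is often handled with some kind of drift check that triggers retraining. The sketch below is one deliberately crude way to do that, under stated assumptions: it watches a single numeric input, measures how far the recent mean has moved in units of the training standard deviation, and uses an arbitrary threshold of 2.0. The names `drift_score`, `needs_retraining` and `training_window`, and the sample values, are invented for the example.

```python
from statistics import mean, stdev


def drift_score(train_values, recent_values):
    """How many training standard deviations the recent mean has moved.

    Assumes at least two training values so a standard deviation exists.
    """
    mu = mean(train_values)
    sigma = stdev(train_values)
    shift = abs(mean(recent_values) - mu)
    if sigma == 0:
        # Constant training data: any shift at all counts as infinite drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / sigma


def needs_retraining(train_values, recent_values, threshold=2.0):
    """Flag the model for retraining when the input has drifted past the threshold."""
    return drift_score(train_values, recent_values) > threshold


# Hypothetical sensor readings seen at training time.
training_window = [10, 11, 9, 10, 12, 8]
```

With these values, recent readings like `[10, 11, 10]` stay under the threshold, while `[14, 15, 16]` would be flagged. A real system would likely track many inputs and the model's error as well, but the retrain-and-redeploy loop has the same shape.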

Lukas: What's so great about robots?

Brandon: For me personally, there is passion around robots. When I was a kid, 5 years old, I'm watching The Empire Strikes Back in the theater, and Luke at the very end gets his hand cut off and ends up with this prosthetic hand, and there are these mechanical actuators in place of tendons, and I just thought that was the coolest thing I'd ever seen. So that right there kind of set my career path. And so I'm in graduate school in mechanical engineering, working with prostheses and stroke rehabilitation, and I have circuit boards out and I'm on the phone with my dad, who's an analog electrical engineer, and I'm like, hey, I need to build a pre-amplifier for this signal because the sensor is not strong enough, and I have to package it up and shrink wrap it and tape it to this motorized prosthesis that we're putting together. Then I have to read it in through the serial port and write some C code to pull the values off of it and read it in, and it's just like down in the electrons. It's the interface between the physical and the software world, and it's maddening and frustrating and so many times it doesn't work, and then after weeks it does, and I just stand back and look at the thing and I think, whoa! There is no boundary between the real world and the imaginary world and the computers. It's all real. There's physical and there's digital, but there's a blurry line in between, and robots span this. So with robots, you embrace all of the chaos of this physical world and you really have to put your money where your mouth is with regards to control and learning. You know that your sensors are gonna fail and you know your actuators are going to change performance over time. You have to be able to handle all of this stuff, and when you're done, if you do it right, you have a little thing that, if you suspend disbelief, looks like it might be alive in some small cool way. That's pretty cool.

Lukas: It feels like we don't actually engage with many robots in our lives today, but I don't think it's so much the cost of the materials, right? I mean, iRobot maybe has one good example of a robot we do use regularly, but it seems like there are a lot more things where you could build them, but it would be hard to make them smart enough to be useful. Software is so amazing, right, because once you make it once you can copy it and put it in everything. I sometimes wonder if a day will come when there will be robots just like all over the place doing useful things for us. Do you imagine that will happen, or are there like breakthroughs that are kind of unforeseeable? What do you think about that?

Brandon: I very much do. The chain of reasoning you just followed is exactly what I did when I was in my graduate program, looking around and thinking, yeah, hardware capabilities are pretty cool. There were some robots at the time out of Berkeley that had a crazy number of degrees of freedom, big as a person, could do all kinds of things. But the gap between "what can I do if I'm controlling it with a joystick" and "what can it do with its own brain" was so big, and a joystick is not even a very good interface. What could I do if it was tied right into my brain? I realized that if you go into robotics, typically you focus on hardware or software. Software seems like the short bottleneck here, so that's what I'm going to focus on. For now, a lot of the emphasis on machine learning methods is driven by performance on benchmarks. That's good if you have to publish papers. You need some basis for saying this method is as good as or better than or close to some previous method, and the benchmarks are good for that. But it's gotten to the point, in my opinion, where the tail is kind of wagging the dog and we only pursue the problems that we have good benchmarks for. In all the world of machine learning stuff, image classification is a tiny, tiny little piece of the problems you could solve, but you wouldn't know that based on the popular press or a random sampling of papers at NeurIPS, for instance. That is starting to change. The last couple of years you see a few more people going rogue with architectures or with the problems that they're willing to handle, and I think more people are losing a little bit of patience with image classification, facial recognition, systems of that flavor. We can do all this pretty well, but we are really bending our universe around this one point. Why not branch out a little bit?

As we're willing to cover a little bit more of the space of the problems you have to solve, we'll get robots that are like the Roomba. You might not watch it and think, man, that's vacuuming more efficiently than I would. You'd be watching and think, "Ok, it's getting to all the corners, all the edges, covering everything in its own time, great. I can go off and have a coffee and be confident it's gonna do its job." I think we're gonna get more and more of that.

Lukas: What do you think is one underrated aspect of machine learning that you think people should pay more attention to?

Brandon: So we have image classification, and also some really cool things with word prediction happening. Ripe for the plucking are unsupervised methods: being able to automatically do clustering, to automatically learn the similarities between things that may have many variables and might be really complex. To be able to say, I've never seen this situation before, but it's kind of like this one I saw in the past, and to be able to make use of what we've seen before. I think that there's a lot of work that could be done there for a modest amount of effort, and it suffers mostly from the fact that there's no one right answer, so it doesn't lend itself to benchmarks. But if anyone hearing this wants to say, "screw benchmarks, I'm going to go work with unsupervised learning," I expect it would be a really fruitful way to spend your time.
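
The simplest version of "I've never seen this situation before, but it's kind of like this one I saw in the past" is a nearest-neighbor lookup over remembered situations. The sketch below is only a stand-in for the richer unsupervised methods Brandon has in mind: the two-number feature vectors, the situation names and the function `most_similar` are hypothetical, and plain Euclidean distance is an assumption that real sensor data would rarely satisfy without careful feature design.

```python
import math


def most_similar(query, memory):
    """Return (name, distance) for the remembered situation closest to the query.

    `memory` is a list of (name, feature_vector) pairs; no labels are needed
    beyond a name to report back, and the vectors must share one length.
    """
    name, vector = min(memory, key=lambda item: math.dist(item[1], query))
    return name, math.dist(vector, query)


# Hypothetical remembered situations, each summarized by two made-up features.
remembered = [
    ("open floor", (0.0, 0.0)),
    ("near wall", (1.0, 0.0)),
    ("under couch", (0.0, 1.0)),
]

# A new, never-before-seen reading is matched to its closest past experience.
closest_name, closest_distance = most_similar((0.9, 0.1), remembered)
```

The hard part he points to is hidden in the choice of features and distance: deciding what "similar" means is exactly what has no single right answer.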

Lukas: What is the biggest challenge of machine learning in the real world?

Brandon: If I had to pick one that is the biggest in terms of impact, it is misapplication. It is easy to treat it like a hammer and beat on anything with it without regard to whether the hammer is the right tool for that and so we see places where models are trained on a grab bag of data about people that can transfer biases and transfer historical injustices, because those are the processes that generated this data and the new model will just blithely perpetuate that. There are few people who are kind of intellectually in a position to see how that works. Some of them are wonderfully vocal, but still not all of them. I think that the biggest downside, the biggest difficulty is that those who don't know or don't want to know about that will continue to use and perpetuate these, to end up hurting people.

Lukas: Do you have a particular example that bothers you or that you'd want to call out?

Brandon: Facial recognition in law enforcement is one that comes to mind right away. It is demonstrably inaccurate, and especially for non-white minorities the accuracy is even worse than average, so it's just a way to cause many more problems than it solves. On the surface, especially when sold the right way, it appears to be a useful tool and you can make lots of great claims about it, but that washes over the harms that it does. That's not even touching facial recognition used for overtly discriminatory purposes, which is completely unethical.

Lukas: Where can people find you if they want to keep this conversation going? What's the best way for them to reach you?

Brandon: I'm online, fairly active on Twitter, where my handle is @_brohrer_, and also on LinkedIn with regular posts as Brandon Rohrer. A lot of my choicest stuff, my labor of love, goes to the End to End Machine Learning School, some course materials I put online, and that's at

Lukas: That was awesome, that was so fun. Thank you so much.

Brandon: Thanks Lukas, I really enjoyed the conversation. I appreciate you and Lavanya setting it up.
