Suzanah is a founder of Machine Learning Tokyo, a nonprofit organization dedicated to democratizing machine learning. They are a team of ML engineers and researchers and a community of more than 3000 people. Follow Suzanah on Twitter.
Lukas: Suzanah, it's so nice to talk to you. I was really looking forward to this because I see that we share two interests in common. One seems to be the democratization of A.I., and another is edge computing, or deploying to hardware. So I'm super excited to hear about what you've been up to.
Lukas: I thought maybe we would start with Machine Learning Tokyo. I would love to hear about why you started it and what it does.
Suzanah: Yeah. First of all, thanks so much for having me. I'm super excited. I love Weights and Biases and visiting SF, so I'm super excited to be on this podcast. So thanks for having me.
Suzanah: Yeah, MLT is a Japan-based nonprofit organization, and our core mission is to democratize machine learning. So we want to make machine learning and deep learning as accessible as possible to as many people as possible, because we believe that, you know, machine learning is going to be everywhere; it's going to be a standard component in the software stack in the very near future. So I think a lot of people should know what it is and be able to navigate it. And we mainly do this through open education and through open source, so we build a lot of open source projects, and through open science, so we work with universities. And yeah, we're here in Tokyo and we support a research and engineering community of about, I think, four and a half thousand members.
Lukas: Oh, four and a half thousand? And so how does it work? Like how do people join the community and what do they do?
Suzanah: So it depends. There are many ways to join the community. You can just be an attendee of the meetups or join a workshop or hands-on session, and then you can just join Meetup and you get all the information you need there on upcoming sessions. But there are also more active ways to join MLT. So if you want to contribute, if you want to work on open source or if you want to, for example, hold a workshop or lead a study session, you can join Slack and talk to me. And there are many ways to be more actively involved in the community.
Lukas: What inspired you to start MLT?
Suzanah: So we started, I think, two and a half years ago, and it was basically just out of our own needs. We were two people and that's how MLT started. And so I'm a domain expert in machine learning. I come from a very traditional academic background: I'm a trained linguist and I was always working with text analysis and NLP. I was using very simple methods. At some point during my Masters, I was working on sentiment and emotion and affect, and I realized that these kinds of very simple statistical methods give us some intuition, some insight about a corpus, about a data set. But language is full of very complex and very beautiful things like metaphors and humor and analogies and irony and sarcasm. And you know, that's not possible to grasp with those very simple tools. So I think three or four years ago, I started reading about machine learning and deep learning and neural networks, and I got super hooked and I realized, OK, having learning algorithms, having algorithms that learn from data directly instead of from rules or lexicons, might be a way to understand language better or to be able to process language better. So I started writing my first machine learning code three years ago, but I also realized, well, coming from a different background, it's pretty challenging. It's pretty difficult. And for me back then I knew, OK, I want to have this collaborative learning environment.
Suzanah: I need to be surrounded by people with different backgrounds, people that have, you know, different skills and know different things than I do. And together, or at least that was what I thought, we could learn faster. And that's exactly what happened. So Yuvraaj, my co-founder, is also coming from a different background, from an electrical engineering hardware background. And he wanted to use machine learning, and he still wants to use it, for edge devices and microcontrollers. And yeah, we started very small and we just met every week and wrote machine learning code. And every week, more and more people joined, even though it was kind of word of mouth. And after a few weeks, there were so many people, we didn't know where to put them anymore. We met in this open co-working space at Yahoo!, and there were too many people! Everybody wanted to write machine learning code. And then we started putting out our first meetups. And ever since, it has been growing pretty fast. So we started very small, but kind of, you know, out of our own need, because in Tokyo there was no such thing back then. Two and a half years ago there were a lot of communities, great communities, but there was no place to actually build AI, no place to work on hands-on stuff. So that's how it all started.
Lukas: That's cool, you built the community that you wanted to be a part of. That's so great.
Lukas: How did you frame it? Like when you were first saying, hey, come join me, what was the thing to do? Was it "let's learn ML together," or read papers, or how did you think about that?
Suzanah: So the very first, I think, six months or so were purely dedicated to going through tutorials. So really learning about how to write machine learning code and, you know, getting a conceptual understanding of different algorithms and of the math, but mainly writing code. And that was how we started. It was just, you know, going through as much stuff as possible. And then, you know, the team grew bigger and more people joined us. So, after six months, we kind of slowly started to broaden. So we did a lot more things. We started doing hands-on deep learning workshops in the first year. So, we had deep learning engineers who were working full time at Japanese companies and they were giving five-hour deep learning workshops where we focus on writing live code from scratch and training a specific model. We first focused a lot on computer vision. So we went through a lot of computer vision stuff and then gradually moved into different areas of machine learning. And as the community progresses and grows, we see that we go into different directions. So now we have a computer vision team that works on CNN architectures and has its own little ecosystem. We have a team that is fully dedicated to edge AI, so running deep learning algorithms on hardware and microcontrollers and edge devices. We have an NLP team that does research in natural language processing. And everything is fully community driven. So there are no full-time employees or anything. It's really how the community evolves and grows, and that kind of broadens into different directions.
Lukas: That's so impressive. So like, how do you run a good workshop? Like, a five-hour workshop? You know, I've seen really good ones and bad ones. Like, how would you do it to make sure that it's a good experience for people?
Suzanah: I think it was learning by doing; in the beginning, we really didn't know what we were doing. So I think two years ago, when we held our first deep learning workshops, a lot of things were pretty difficult and pretty challenging because, you know, people come with different machines, with different skill sets, with different background knowledge, with different software and hardware. So that's pretty difficult. But slowly we got a lot of feedback in the first iterations and worked with that feedback. So one thing that made it easier for us is just, you know, to focus on one thing that is really interesting to us, where we see value, something that can bring value to us as instructors, as deep learning engineers, as well as to the students, something that is very useful. The second thing is to make sure that technically everything runs smoothly. So we switched, I think after the second or third workshop, to Google Colab. That makes it very easy to just write code, and there is no prerequisite except for having a Gmail account. That solves a lot of the technical issues and problems that we had.
Lukas: Yeah. But does everybody build the same thing together, is that how you run it? Like, you pick, say, a problem that everyone works on together? How much do you coordinate everybody doing the exact same thing versus people going off on their own?
Suzanah: So it depends what kind of workshop we're doing. So if we have our standard deep learning workshop, there's typically a topic and we have already prepared a repository with the model that we're going to build. We sit down with 50 people and the instructor, and we do some theory.
Suzanah: So we first do maybe an hour of conceptual understanding of what is going to happen, what we're going to build. And then Demetries, for example, codes from scratch. So he basically walks you through from the very beginning to getting your performance metrics. And these kinds of workshops are designed to do exactly this, only this. People just follow along with the code and they can live-code from scratch. And this is something that people find really useful, especially that live coding aspect, because sometimes when you're on your own, you look at, you know, blocks of code and you're trying to figure out what is happening, trying to figure out your own thing. It's useful if someone actually writes code with you and explains what is happening. You just learn faster, probably. Or this is at least what I find to be useful. On the other hand, we have much more open sessions. So especially in our hardware sessions, where we do edge AI, the only thing we provide is a ton of hardware. These are typically smaller groups, maybe 20-25 people, and people come in, they build teams, they choose their hardware, they come up with their own idea and they build their own stuff. And then at the end of the day, each team presents what they have been working on. So it really depends on the session, I guess.
Lukas: Do you think there's a different kind of culture in Japan than, say, in San Francisco? Are language barriers an issue at all? I guess I know what it's like to be in San Francisco, but do you think there are big differences in Japan?
Suzanah: I don't know San Francisco that well. I was there and I went to a lot of meetups, actually, and they're pretty cool. I think a lot more things are just happening in San Francisco. And I think a lot more things are supported, probably, in S.F. In Japan, language is definitely an issue; it's a huge barrier. It is something that I've been constantly thinking about.
Suzanah: In Japan, there are amazing communities in machine learning. There are two super big machine learning communities: one is the TensorFlow User Group, which is very related, of course, to Google, and then Deep Lab, which is, I think, affiliated with Microsoft. Those guys are very big and they're very, very Japanese, so everything is in Japanese. And then there's us. I think we're similar in size, but we speak in English. And yes, this is one thing that has been bothering me so much, because I'm always trying to find ways to not have these isolated communities. So this is definitely a challenge in Japan, and we're working on it. But other than that, you see that communities are growing and there is also a huge demand for machine learning talent. So, yeah, apart from the very Japan-specific problems like language barriers, I think it's a pretty good and active environment to be in.
Lukas: Yeah, I remember, I went to Japan last year, and I've worked off and on with Japan as a market, and I've always been impressed by how excited people are about, you know, machine learning. Even going back like 10-15 years, it seemed like there was a lot of enthusiasm for it. And actually, I've been kind of wrestling with this. I would like to find a way to translate our documentation into Japanese and keep it up to date.
Lukas: I've been thinking about that lately.
Suzanah: Yeah, I think that would be a good move. We were also only focusing on English, but there needs to be this bridge and we need to start somewhere. So we also started translating. We worked with a TA (teaching assistant) from Stanford to translate their CS deep learning course material, the course notes, into Japanese to make it more accessible and to have bilingual resources for people. So we're also trying very hard to include as many people as possible.
Lukas: That's awesome. What do you think when you think about democratization of A.I.? I mean, what else do you think is important? Like how do you think about that?
Suzanah: Maybe because of my personal background, because I am a domain expert, but I also see how important machine learning is and is going to be in the near future. I feel like, if possible, we should have as many people as possible involved, even in technical stuff. So there have been a lot of democratization efforts. For example, if you look at H2O with AutoML, making it really very easy to experiment, but also, of course, other AutoML platforms from tech giants. For us, it's a lot of education that we do. We work with a lot of universities. Something that I personally like doing is working with research scientists or students coming from different backgrounds. So I think machine learning could be super useful for people that work with a lot of data.
Suzanah: And we worked with a lot of super interesting people. For example, last year in summer, I think, we were at the Tokyo Institute of Technology, where we held a two-day boot camp for ELSI; ELSI is the Earth-Life Science Institute. And those guys are amazing. They are astrophysicists, planetary scientists, computational biologists, chemists... you know, mind-blowing stuff! And we had a room full of people and they all work with different kinds of data sets and problem sets and with different tools and techniques. And machine learning could be one way for them to get new insights and maybe even to advance science. So, personally, for me these kinds of things are super exciting: getting more domain experts involved in technical stuff, doing open education, doing open science. This has been pretty interesting.
Lukas: What about people without math or programming background? Do you think there's room for them to contribute, too?
Suzanah: Yeah, absolutely. I think so. You know, there are Jeremy Howard and Rachel; they've been doing the best job ever in getting domain experts on board. Right?
Suzanah: You do have to have some coding background, so you should be able to write some Python code. But going through the fast.ai courses, for example, it's a more top-down approach. And they're exactly democratizing machine learning, or making it uncool, by having so many more people involved. And this top-down approach allows you to get into deep learning without having to have a PhD from Stanford in computer science or a really strong math background. You build stuff: you start with thinking about your problem and your data and you build stuff, and then afterwards you start digging deeper into the math, for example, that you might need for your particular project or problem. And I really like this kind of approach. It's very similar to what we've been doing with MLT as well, even though we also do a lot of fundamental work. So we also have study sessions for machine learning math and other things. But I think there's definitely room for people who are coming from different backgrounds. And I think if they find it even potentially useful, they should look into it.
Lukas: I mean, you've probably seen people go from kind of novices to knowing a lot about ML. And people ask me all the time, how do I get into this stuff? Do you have any advice from what you're seeing? You know, what should folks do if they have no background and really want to go deep on this stuff?
Suzanah: Yeah, for sure. So I think two things are super important. The first thing is, don't neglect your background; don't think that you have to start over from zero and that you don't know anything. Leverage your background, leverage your professional experience, your academic background, whatever it is that you have been working on in the past years, leverage that. There are many examples for that. For example, you could be a hardware engineer and you know a lot about hardware, and now you're getting into machine learning and deep learning. Now leverage that background and that expertise to learn about machine learning and learn how to combine these two things. In my case, it's language. So I've studied language as a system for many, many years, and I use the combination of language and machine learning to bring interesting and unique insights to the particular project that I'm working on. I talked to a recruiter here in Japan and I asked him, so what does the market need? And he said, well, here in Japan, it's not enough just to know deep learning. You have to have some sort of specialization, you have to have some sort of domain expertise, some way you can use deep learning in combination with something else. It could be software engineering, it could be hardware, it could be language, it could be anything. So this is the one thing.
Suzanah: And the second thing is, when you're coming from a different background and you want to go into machine learning, there are, of course, two approaches. Either you start with the fundamentals, you start with math, or you do what I mentioned earlier, top-down: you start with a project and you just write code, build that project and then figure out the details later.
Suzanah: And I think the most important thing here is to figure out what is interesting to you, what would be something that really catches your attention and that you love working on, and make that decision and then start working on that. Because the problem here is that there are too many options. You could do too many things. Everything seems to be interesting. But if you spend a little time here and a little time there, you will get maybe some shallow understanding of a few things, but you won't advance as quickly as you might want. So figure out what you want to do and leverage your background; that's probably my advice.
Lukas: Do you think that you see people being more successful starting from the fundamentals or starting with a project? Because you mentioned that there's sort of two different approaches and people gravitate towards one or the other. Do you have a preference or can both work?
Suzanah: Both can definitely work. I think we were only talking about domain experts and people coming from different backgrounds. But of course, I think what research, academia and industry need just as much or even more is people with very, very strong CS backgrounds, with very strong math backgrounds, who know how to optimize and know how to work on theoretical things. So I'm not saying this is not important, not at all. I think, of course, this is still the norm and this is probably what employers want to see the most. And if you're coming from a strong CS or math background, I think you already have a strong foundation to go very deep into machine learning and deep learning. But I just want to say, there's room for other people as well.
Lukas: Ok, so this is a little bit outside of the scope maybe of a ML podcast, but I am just fascinated by this, so maybe it is. What about starting a community? Do you have advice on someone in an area like you where they want to find like-minded people? I mean, do you have any advice on that, like if I'm in a city where there isn't already like an ML group, how would you go about finding people?
Suzanah: Yeah. And like so many people write me messages on that.
Lukas: Oh, yeah?
Lukas: That's so great!
Suzanah: They are either in remote areas or in cities where literally something like a machine learning community still doesn't exist. And I would always say, "go for it"! If there is no such thing out there, be the first one to do it. Because MLT has evolved into an amazing community. Like, literally, I'm amazed by how active and how engaged the community is. All those guys, they have full-time jobs, but they still find time to work on open source and to teach other people and to do these kinds of workshops. So it's pretty amazing. So I would really suggest thinking about starting a community wherever you are.
Lukas: Do you have any practices for getting people off the ground? Because it seems kind of daunting to me to try to start that and keep people engaged. How do you get people to keep talking?
Suzanah: I think the most important thing is to do something that you are really interested in, because if you're starting it, a lot of things will depend on you. And the key, I think somebody wrote it on Twitter recently, the key to MLT as well, is consistency. So we consistently just keep doing stuff that we think is exciting and interesting. So start from what you're interested in, start from your own problem set or from your own need, and more people will follow. And then there are more practical things. We started doing remote meetups, so there was no burden of having to find a venue and a sponsor and other things. So this is an option for kicking things off and finding more people who are interested. That makes it very easy; there is probably no easier way than to start with remote meetups.
Suzanah: On the other hand, if you want to start something in your city, first of all, you might want to form a small peer group around yourself and try to figure out what you want to do, and then start to look for a physical place and figure out if you want to do hands-on stuff or if you want to do more educational stuff, learn together and try to reach as many people as possible. And, you know, just yesterday I talked to someone, a journalist, and he said to me, wow, there's no such thing for writers out there. I want to start something for writers out there. And I think it's kind of the same thing. Right? There is a need for all these niche groups and communities. So I think if you get it out there and if you do things that you're very passionate about, people will follow.
Lukas: Do you have any thoughts on diversity and inclusion in ML in these groups that you create? Is that something that is top of mind for you?
Suzanah: Yeah. That's something that is very important to us. Luckily, within MLT, we're a very diverse group of four and a half thousand people in terms of countries and languages and skill sets and backgrounds and professional experiences. So this is really super diverse. But, yeah, women are super underrepresented. I think two years ago, when we started working on deep learning workshops, we had 60 engineers and I was the only woman.
Suzanah: Yeah, so I realized we really needed to do something about that. So we're doing very specific things, not only events but also projects that support diversity and inclusion. We do a lot of women in machine learning events; they are supported by Google Japan, Mercari and other companies. We also do projects like the one I mentioned earlier, where we had about 12 bilingual engineers who worked on translating some of the Stanford course notes into Japanese, to have these bilingual resources for people and just to be more inclusive in general, also to the Japanese community, because we are literally in Japan and we are very diverse. But it still seems like there's a disconnect between the Japanese community and the English-speaking community. And I think it has never been more important. We all know tech in general is multidisciplinary, so machine learning should be multidisciplinary as well. We need people with different skills, with different expertise, with different backgrounds in general. So this is something we all have to work on, I think.
Lukas: Do you have any other suggestions for making the community feel more inclusive?
Suzanah: So in our core team, we decided very early on that we want to create an environment that is very collaborative and very inclusive. That means that we really don't want this to be some kind of elite math machine learning group. We want to include as many people as possible and we want to have open decision processes. We want to have the community involved in what directions we take, what kinds of things we're tackling next. And in every project that we do, in every workshop and every study session, we have that sort of mindset. So when you look at our math sessions, last year we started doing remote math reading sessions. We're going through a book that walks you through some machine learning math, and more than 1000 people signed up from all over the world.
Suzanah: We have sessions in the Bay Area, we have sessions in India and APAC, and here in Japan. And the thing is, it is very inclusive, because the people that join those sessions have very different levels of math. We have complete beginners, we have people that are coming from completely different backgrounds. But in our Tokyo sessions, we also have mathematicians, we have experts, we have PhDs in math, people that have taught math for many years. And it's pretty amazing. After the reading there's a very interactive discussion where people ask all sorts of questions and together we brainstorm around things, and our experts, like Emil and Jason, try to explain mathematical concepts. And it's been pretty amazing. So I think it's really about having this mindset that whatever you do, you need this. It's something that is actually enriching to whatever you do, and that is very important, and having that mindset is probably going to help a lot.
Lukas: Super cool. That sounds really fun. What is something underrated maybe in machine learning that you think people don't pay enough attention to?
Suzanah: Something underrated? So, I think something still underrated in machine learning is data.
Lukas: Still?! Oh my God!
Suzanah: Yeah, I think so. It doesn't matter who I talk to, I always feel like it's seen as a troublesome thing to do. Right? You don't want to work with data; you want to write machine learning algorithms, you want to train models, you want to get good accuracy and then push that accuracy or metric. It's not about data. So data is kind of something that people only think about sometimes, or this is at least my understanding of it.
Suzanah: And I think we should definitely think more about data and put more emphasis on data. Maybe this is also because of my own background, because I've been working with data pretty much all my career and just three years ago started working with machine learning algorithms. But yeah, it all starts with data and it probably ends with data. I think Chip Huyen mentioned recently that whoever owns the data pipeline will own machine learning in production, or the machine learning system. So yeah, I guess. I don't know if that's still the case. Maybe in SF, maybe in the Bay Area, people think more about data. I don't know.
Lukas: I don't know. I think my background is similar to yours. And so I feel like data is so unbelievably important, I guess.
Suzanah: Right, Yeah.
Lukas: It's not possible for data to be properly rated for its contribution to ML. And then when you think about making machine learning work in the real world for real applications, what's the hardest part about getting it to work?
Suzanah: So in our case, we love to experiment with new things. And I think it's difficult when you're trying new things; you need to figure out a lot of stuff. And generally, I think in production environments, there is a lot of experimenting and trying to see what works. So making a production pipeline work and deploying machine learning for different use cases has different challenges, from data all the way to software engineering to monitoring your model and how it changes in different real-world scenarios. So even though things are taking off, there's still a lot of room to work on these kinds of things: infrastructure things, deployment things, finding new use cases, finding use cases that make a lot of sense for machine learning. But at the same time, I think this is super exciting. Probably what excites me the most is thinking about use cases and experimenting a lot and trying new things. At MLT, we do work on production things as well, but it's not our main thing. Our main thing is just trying out new things, experimenting and making POCs (proofs of concept), so we don't actually deploy a lot of things at large scale to production. So maybe I can't talk about the main challenges here, but what I can say is that, if we take edge, for example, we're trying out a lot of things. We're working with different hardware and we're trying to think about different use cases where these things can be deployed. And there are a lot of things that just don't work out and fail, but that's totally fine. That's good as well. This is something that we also need in order to grow and to figure out things. But then at the same time, we also build things that work and that are super interesting. So, yeah, it's a lot of experimenting, I guess. Yeah.
Lukas: OK, so my final question. If, listening to you talk, I get excited about joining one of your virtual events, how do I find out more? How do I get more involved with MLT? Can I do that remotely?
Suzanah: Yes, you can, definitely. So, as I mentioned earlier, on Meetup you can find all of our events, and a lot of them are actually remote. So if you want to be part of an event or a meetup or something like that, you can just join Meetup and we will post everything there. There are also more active things. So if you would like to work on open source or do some other things or get more involved in general, you can join our Slack group. Pretty much the whole community is there, talking about different things, probably in more technical depth. So you can also find people there to work on projects and do other things. And so these are the two main things: Meetup for events and Slack for projects and other stuff.
Lukas: Awesome. Thank you so much. It's great to talk with you.
Suzanah: Yeah. Thank you so much for having me.