Opening Portals With Rust

The following is my interview with Brian Schwind and Matěj Laitl of tonari, a company that is building portals to “fuse” together distributed workplaces and enthusiastically doing it all in Rust! There were a few things that really stood out to me in this interview. The first is that the folks at tonari are really asking better questions about distributed work, and I think it's getting them better answers. The second was a strong ethic of craftsmanship. That comes through very clearly when you hear these engineers talk about how they’ve built deep into the layers of the stack and dialed in the most important pieces of their solution, like latency. It was a pleasure to have this conversation, and I hope reading it will be a pleasure too! To see jobs available at this and other cool Rust companies, check out our extensive Rust job board.

Drew: To set the stage, tonari makes portals. They're these large screens that allow people to have super immersive video calls. What were the problems that tonari identified with the current state of video calling that made the need for this product obvious?

Brian: It started with Ryo and Taj, the co-founders. They were both working at Google Maps at the time. They had tried setting up similar stuff with whatever video calling platform Google had at the time. They would just have a big TV in a couple of conference rooms and just have a call that they didn't hang up on as a way to connect different floors or people in different buildings together. The idea was to just work together without having to call each other up every time.

Brian: When they left Google Maps, they were still thinking about that idea. There was no software out there at the time that allowed you to have something that's just continuously running. A video call is a very temporary thing. They wanted to have something that had two spaces connected in a more or less permanent way. So, one of the things that they wanted was to build a sort of a communication tool for future projects. They wanted something that would enable them to go live their life wherever they wanted but still be connected to the people that they want to work with.

Brian: Most software like Google Meet, Zoom, or Microsoft Teams is very much something where you start the call, you do your business, and then you hang up. There's no permanent connection to it. So, that was probably the biggest thing that they wanted to address by making tonari: having two spaces that feel like they're fused.

Drew: So it was kind of a different problem entirely. It was digitally fusing two physical spaces together.

Matěj: Yeah, exactly. We don't even really categorize it as video conference software. This is supposed to be more ambient and more answering the question of "What technology do we need to make it feel like you're working in the same space?" People still come to the office because they like their colleagues. And, there’s serendipity that happens. It's very easy to have a situation where someone asks a question or complains about something and somebody overhears and says, "I can actually fix that." You don't start a video call for that kind of thing. tonari is answering the question of "Can technology recreate this kind of feeling of being together, working on something together, while you're physically distant?" Technically, it's a video calling product, but we try to be something slightly different than a conference call. It's more about a permanent connection.

Brian: A lot of important interactions are not the scheduled ones that happen. Those spontaneous things that Matěj described don't really happen in a video call because you probably scheduled it to talk about something specific.

Drew: One of the reasons I was so interested to talk to you guys is because I’ve personally thought quite a bit about this problem space. We had this whole remote work period of COVID and then the return to office. With the return to office came these claims that I think have some truth, that doing Zoom calls wasn't cutting it. People felt like there was some collaboration missing. To your point, I think a lot of it is that ambient stuff where you’re catching a conversation that you feel like you should drop into or something like that. So, I just think it's so interesting that this is an angle on that problem. I think remote work should work. There should be a way to use technology to almost completely close the gap or close it like 99.9% of the way.

Drew: Why don't you guys explain some of the things that tonari portals do. Maybe explain how they work and what they do differently to enable this kind of interaction we’re talking about.

Brian: tonari portals let you have a persistent, high resolution, low latency video connection to another space. There is a share screen so you can cast from your laptop or phone, a minimal dial-based UI, and an easy and quick way to switch to other tonari portals. When we first launched, it was just two places connected and that's it. And, that is basically the vision, but it's also kind of hard to sell that if a company wants like three or four or more locations. We found out that a lot of people didn’t want them just limited to one connection. So, in the last year or so, we added the ability to easily change to other locations.

Matěj: The biggest difference from traditional video calling is that this thing is always on. We try to build it to feel kind of natural and pleasant to have on. When you’re on a video call, you basically just want it to end, but we want this to be something you want to have on. With a video call, you feel like you're supposed to stay in place. You're not supposed to go grab a coffee or something. We want to change that. Basically, we want this to feel as much like a doorway as possible. So, there’s a lot that we do to facilitate that. For example, we really try to make latency low so that when you have a conversation you don't step on each other. That's a huge challenge to keep both the video and audio latency as low as possible. Zoom and Google Meet have become slightly better for this lately, but we’re still maybe half the latency of those products. That actually creates quite a different experience. For example, it’s the difference that’s needed to be able to play rock-paper-scissors. So, conversations through the portal feel much more like a real conversation.

Brian: Yeah, this is all within the limits of the speed of light, of course. The connection between Matěj and me from Tokyo to Prague still has, you know, a few hundred milliseconds of latency. One of our goals since the start has always been really getting the lowest possible latency. And, we do that while also working to have high quality audio and video.

Matěj: Another aspect of this is the sheer size of the portal. If someone stands in front of it you can see all of their body language. For example, if you have two people sitting on a sofa chatting, you can see the whole scene.

Drew: What are the dimensions of the portal?

Matěj: The dimensions of the screen are 235 by 140 centimeters, so about 2.3 meters by 1.4 meters. It's bigger than a lot of doorways. What you’re seeing is our second model. Our first model, called tonari pro, was even bigger, but it wouldn’t fit into many buildings or rooms, especially in Tokyo. So, the second one is slightly smaller.

Drew: I guess that's one of those constraints that you wouldn't expect to run into building a technology product until you realize that you’ve actually built a piece of furniture that has to fit in buildings.

Matěj: Yeah, and we could make smaller versions if the need arises. We build these in-house. We use third-party vendors for some components, but we do all the metal pieces and such. It's a completely in-house product in terms of design. The first versions had lots of one-off handmade components and everything was very DIY.

Brian: The very first versions were actually made by a master carpenter. So, that one had a lot of wood, which was cool. The artistry of it was great, but wood is also not the best material for the longer-term where you’re putting a PC and other things inside it that will heat up and warp it over time.

Drew: One of the things I saw that the portal addresses is the eye contact problem. This is one of the things that I've thought about in terms of this “making remote work” problem. Video calls don’t give you anything near eye contact. Can you guys talk about the solution for that?

Matěj: Our solution is simple. We use a projector and a screen, and the camera is right in the middle of the screen. We just make the camera super, super small. It's about the size of an iPhone camera. It's strategically positioned about 1.4 meters high so that it's slightly below the average standing person's eye height because you also want to catch people sitting. This way, if you happen to be looking in the middle of the screen, you happen to be looking into the other person's eyes.

Brian: It’s a simple solution, but that's what we've gone with for now. It does work surprisingly well! When people are standing off to the side, you don't necessarily get perfect eye contact, but you can still very much tell that they're looking at you.

Drew: Simple is good.

Matěj: In the past, we played with multi-cameras, multiple angles, and like fusing them together. We’ve experimented with things like stereo depth estimation, but haven’t yet achieved results that look good or are fast enough for our latency targets.

Drew: You guys said latency is something you really try to solve for. I imagine if you were doing more sophisticated processing that would add to your latency.

Brian: Yeah, that would add to the latency.

Matěj: We still do some image processing, but we try to keep our pipeline as simple as possible. The camera captures the scene, and we try to load the data as quickly as possible to the GPU where all the processing happens. We even encode there without leaving the GPU memory. From there, we take the compressed data and try to pass it over the network as fast as possible. We have metrics that measure the individual steps of the pipeline over time. We spend a lot of time staring at those metrics and shaving off single milliseconds here and there. The GPUs are pretty fast these days. So, the problem is more about avoiding transferring data back and forth. I should also say that we own all parts of the stack, it’s not just something built on top of WebRTC. Our protocol is very simple. We basically take the compressed video frames and fire hose them over a UDP socket.
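
To make the "fire hose over a UDP socket" idea concrete, here is a minimal sketch using only the Rust standard library. It is an illustration, not tonari's protocol: the 8-byte header layout, the 1200-byte payload size, and the names `packetize`, `reassemble`, and `send_frame` are all assumptions made for this example. A compressed frame is usually larger than one datagram, so the sketch splits it into chunks and tags each chunk so the receiver can put the frame back together or notice that a piece was lost.

```rust
use std::net::UdpSocket;

/// Hypothetical payload size per datagram, chosen to stay under a typical MTU.
const MAX_PAYLOAD: usize = 1200;
/// Hypothetical header: frame id (u32), chunk index (u16), chunk count (u16).
const HEADER_LEN: usize = 8;

/// Split one compressed frame into datagrams, each prefixed with the header.
/// Assumes the frame needs fewer than 65536 chunks.
fn packetize(frame_id: u32, frame: &[u8]) -> Vec<Vec<u8>> {
    let chunks: Vec<&[u8]> = frame.chunks(MAX_PAYLOAD).collect();
    let count = chunks.len() as u16;
    chunks
        .iter()
        .enumerate()
        .map(|(i, chunk)| {
            let mut packet = Vec::with_capacity(HEADER_LEN + chunk.len());
            packet.extend_from_slice(&frame_id.to_be_bytes());
            packet.extend_from_slice(&(i as u16).to_be_bytes());
            packet.extend_from_slice(&count.to_be_bytes());
            packet.extend_from_slice(chunk);
            packet
        })
        .collect()
}

/// Rebuild a frame from packets that all belong to the same frame.
/// Packets may arrive in any order; returns None if a chunk is missing.
fn reassemble(packets: &[Vec<u8>]) -> Option<Vec<u8>> {
    let first = packets.first()?;
    if first.len() < HEADER_LEN {
        return None;
    }
    let count = u16::from_be_bytes([first[6], first[7]]) as usize;
    let mut slots: Vec<Option<&[u8]>> = vec![None; count];
    for packet in packets {
        if packet.len() < HEADER_LEN {
            return None;
        }
        let index = u16::from_be_bytes([packet[4], packet[5]]) as usize;
        if index >= count {
            return None;
        }
        slots[index] = Some(&packet[HEADER_LEN..]);
    }
    let mut frame = Vec::new();
    for slot in slots {
        frame.extend_from_slice(slot?);
    }
    Some(frame)
}

/// Send every chunk of a frame immediately; returns the number of datagrams sent.
fn send_frame(socket: &UdpSocket, frame_id: u32, frame: &[u8]) -> std::io::Result<usize> {
    let packets = packetize(frame_id, frame);
    for packet in &packets {
        socket.send(packet)?;
    }
    Ok(packets.len())
}

fn main() -> std::io::Result<()> {
    // Loopback demo on OS-assigned ports.
    let receiver = UdpSocket::bind("127.0.0.1:0")?;
    receiver.set_read_timeout(Some(std::time::Duration::from_secs(1)))?;
    let sender = UdpSocket::bind("127.0.0.1:0")?;
    sender.connect(receiver.local_addr()?)?;

    let frame = vec![0xAB_u8; 3000]; // stand-in for one compressed video frame
    let sent = send_frame(&sender, 1, &frame)?;

    let mut packets = Vec::new();
    let mut buf = [0u8; HEADER_LEN + MAX_PAYLOAD];
    for _ in 0..sent {
        let n = receiver.recv(&mut buf)?;
        packets.push(buf[..n].to_vec());
    }
    println!("sent {sent} datagrams, frame intact: {}", reassemble(&packets) == Some(frame));
    Ok(())
}
```

UDP gives no delivery or ordering guarantees, which is why the sketch carries a chunk index and count in every datagram. What a real system does when `reassemble` returns `None` (drop the frame, request a new one, conceal the error) is a design decision this example leaves open.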

Brian: If we were Apple or someone that had millions of dollars, we might do even more ourselves. For example, they have the Vision Pro and made all of these custom ICs to process video from all the different cameras and do all of that kind of correction. If we had that, maybe we could make something that would correct eye contact properly, or do stereo depth estimation without having that "uncanny valley" problem that we ran into. But, we don't have those kinds of resources.

Matěj: We're a very small team. As you can probably see from our website, we do a bit of everything. As an engineer at tonari, you work on almost everything. If you think something is wrong, you can be vocal about it and make a change in the next release. And, if you're interested, you can work on the hardware, like designing PCBs, new case designs, cabling simplifications, etc. Or, you can work on the other side, doing things like playing with Rust types so the network protocol is safer against packet loss. Within a week, you may work on something very high-level, something very hardware-specific, and something very abstract in software. You can almost choose what interests you most or what you feel is most pressing.

Drew: I feel like that kind of autonomy is one of the fun things about working for a smaller company. How many people are in the company right now?

Brian: It's 16 core members, plus maybe 10 to 15 part-time or extended members. We aren't trying to grow exponentially. We're trying to build something more sustainable in all senses: culturally, team-wise, and financially.

Drew: Brian kind of talked about how the company was started with the founders being at Google and hacking on things. Is there more you would want to share about the company's story?

Brian: There's a little tidbit that might be relevant to people interested in Rust. I joined in 2019, but the company started sometime in 2018. Ryo, one of the co-founders, started the codebase in C++ with WebRTC. They scraped together some demos, but they were very rudimentary. The first engineer they hired was very into Rust and suggested rewriting in Rust, mostly because he happened to enjoy Rust more than C++, but also because it was a bit more of an “up and coming” language that engineers might be excited to work with. He convinced Ryo, and it only took them about a week or two to reproduce most of the code since there wasn’t much of it at that point. That's basically when tonari became a Rust company.

Brian: I actually met that person at a Rust meetup in Tokyo in 2018. We kept in touch, and later, tonari hosted a meetup at their office. I helped make a silly Rust programming game for that meetup where everyone was developing these little AI bots to fight each other in one shared arena. Through working together on that, they asked if I was interested in working at tonari. I was getting into Rust and wanted to work on it full-time, so I took the opportunity. From that point, we had two people who were very into Rust, and we just kept hiring more people like that. From then on, hiring specifically for Rust development has been a magnet for many talented engineers.

Drew: Rust itself does seem to attract talented engineers. Have there been any big wins lately that you'd want to share?

Matěj: No singular big events lately, but we're really ramping up production and deliveries. This is a niche product. So, a year ago, we could handle maybe two installations in a month. Now we can handle two in a week. Previously, it would take a lot of time to prepare everything and provision it very manually. Now, we’re at the point where we’ve shortened everything, standardized the software more, and generally have less manual configuration.

Matěj: Then there’s also something Brian mentioned already which is location hopping. That’s the biggest user-facing software feature.

Brian: Yeah, that was a big change to a codebase that previously assumed you just had two tonaris connected to each other. You can now have an arbitrary number connected, and you need to negotiate switching between them and cases where they're not available and all that kind of stuff. That took up a lot of our time last year. The first time we got it really working, we were kind of blown away by how instantaneous it is to switch between different ones. That's probably one of the features I’m most proud of.

Matěj: We’re always baking new things in secret, so nothing else we could share right now, but the team is pretty busy.

Brian: So no singular big wins, but the momentum feels good.

Drew: I guess I haven't thought that much about what Matěj was talking about with the production ramp. That's always a huge hurdle for companies that are building hardware, right?

Brian: Yeah, and tonari's such a large thing. That's been the most difficult part: just getting it to companies without any issues. How do you pack it? How do you assemble it? It's a lot of specialized knowledge. Shipping that large number of things to another country is a whole thing as well. It gets customs and stuff involved. You also have a number of different power certifications in different countries. All kinds of fun like that.

Matěj: A hardware product startup is "hard mode" compared to a software-only startup. For example, we use Rust and our software is just very robust. We've gotten to the point where we mostly face hardware instabilities and deficiencies. What we actually spend a lot of time on now is fixing hardware issues in software, because software is what you can rely on. The hardware is flaky.

Brian: There are some times I wish our product was just a PCB of a reasonable size and a case around it or something. Something you can hold in one hand would be very easy compared to shipping all the components we have.

Drew: You mentioned that the deliveries are ramping up. I'm curious, where are they being deployed?

Matěj: Most of our customer base is still in Japan. Tokyo-Osaka is probably connected like five times or even more by various companies. We're a Japanese company after all. But lately, we see a lot of international connections where maybe headquarters in Japan connect to Hanoi, Malaysia, Singapore, or maybe Hawaii or California. We also see people connecting European cities with Asia. We have three different customers that have Vietnam to Tokyo connections.

Brian: Have you been to Japan, Drew?

Drew: No, I haven’t yet.

Brian: You should definitely visit sometime, but you obviously know about the bullet trains that they have. The first line was between Tokyo and Osaka, because it's just two major business hubs going back and forth. We have a lot of tonaris connecting those two locations now. So, hopefully we're saving on train rides and other travel between those two.

Matěj: One of the value propositions for companies literally is, "Save on airline costs because you no longer have to fly there." You can have meetings with equal expressive power and convincing power if you have a tonari connection.

Drew: That makes sense. Tell me about the role that Rust plays in your stack and the big advantages of using it for the work you're doing.

Matěj: Basically, everything is Rust. We own the whole stack, from the business logic to low-level things like working with video and audio. When it’s necessary, we can bind to existing C or C++ libraries. For example, working with proprietary camera SDKs or doing some work in CUDA might require that for now. But, whenever there's a native Rust library available, we prefer to use it, even if we have to patch it a bit and contribute our changes upstream. It’s more fun!

Brian: Fun is an important thing on the long days. We also have a bit of firmware running in Rust. I think Rust is great for embedded programming. It’s a good fit for some of our firmware which interacts with hardware like the rotary encoder dial and lighting controls we have. There are new tools like Embassy that use async await in a way that really makes sense for embedded systems, and it really can be fun to work with.

Matěj: For me, Rust brings a feeling of comfort. When we made a big change to the code base to support multiple portals for location hopping, Rust's type system was a huge advantage. If you do your types right, you can introduce a central change, and the Rust compiler will tell you everything you need to fix. Once you fix the compile errors, it just works. I'm often surprised by how reliable this is. It might be a little slower to get started, but once you have an established code base, making big refactors is much easier. You can easily ensure everything is correct. This gives us more velocity later in the code base's lifetime.
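
The refactoring experience Matěj describes can be illustrated with a small sketch. The `Connection` enum and `status_line` function below are hypothetical and not taken from tonari's code base. The point is that a `match` without a wildcard arm must cover every variant, so adding a new variant in one central place makes the compiler reject every `match` that has not been updated yet.

```rust
/// Hypothetical connection state for a portal; the names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
enum Connection {
    Disconnected,
    Connected { peer: String },
    // Imagine this variant being added during a "multiple portals" refactor.
    // Until every `match` on `Connection` handles it, the program does not
    // compile (rustc reports non-exhaustive patterns at each such site).
    Switching { from: String, to: String },
}

/// Deliberately written without a `_ =>` wildcard arm, so the compiler
/// checks that every variant is handled.
fn status_line(conn: &Connection) -> String {
    match conn {
        Connection::Disconnected => "not connected".to_string(),
        Connection::Connected { peer } => format!("connected to {peer}"),
        Connection::Switching { from, to } => format!("switching from {from} to {to}"),
    }
}

fn main() {
    let states = [
        Connection::Disconnected,
        Connection::Connected { peer: "Tokyo".to_string() },
        Connection::Switching { from: "Tokyo".to_string(), to: "Prague".to_string() },
    ];
    for state in &states {
        println!("{}", status_line(state));
    }
}
```

Deleting the `Switching` arm from `status_line` turns this into a compile error rather than a runtime surprise, which is the "compiler tells you everything you need to fix" effect in miniature.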

Drew: Tell me about some of the more interesting technical challenges you guys have taken on in the course of this work.

Matěj: We have a wide range of them. One example is a low-level video issue we call "frame pumping." When we tried to reduce the video bitrate, we noticed a slight pumping effect. We debugged this down to how video codecs work, and we realized that the way we were sending frames wasn't efficient. Brian realized that we could request a frame only when the decoder needs one rather than sending them on a fixed schedule like a TV does. This allowed us to significantly lower our bitrate while maintaining very high video quality. It was a cool moment because we later found a Miracast wireless display protocol extension that does the same thing, so we had independently discovered a similar technique.

Brian: When I joined, our video streams were struggling to hit 45 fps. I like to play games, and 60 fps is typically the bare minimum for a good experience in that environment, and I wanted our video to at least hit that mark. I looked into the code and realized we were constantly moving data between the GPU and the CPU. By stopping those transfers, reducing copies, and processing everything on the GPU, we easily hit 60 fps. We also realized that 60 fps gives video calls a "soap opera effect," which works in our favor and makes the interaction feel more real.

Brian: Another challenge has been automating the camera's black-out region. We have a camera in the middle of a screen that a projector shines on. If a bright scene is on the screen, it can cause a nasty blue flare on the camera. We used to manually define a black region to prevent this, which would solve the problem, but it would often shift. In Japan, earthquakes are common, and that’s enough to shift the display on a short-throw projector. These shifts would happen and we’d have to always recalibrate where we were drawing the black region. So, we developed algorithms that shine patterns on the screen to find the camera's location and then draw a black region around it. This was a hard but fun technical challenge. It's not perfect yet, but it's fairly reliable.
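
The interview does not say which patterns tonari projects, so the following is only one possible way to sketch the idea: a binary search that lights half of the remaining candidate area at a time and asks whether the camera sees the light. Everything here is an assumption made for illustration, including the names `locate_camera` and `blackout_rect`, the `probe` callback that stands in for "project this rectangle and check the camera image", and the idealized noise-free camera response.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect {
    x: u32,
    y: u32,
    w: u32,
    h: u32,
}

impl Rect {
    fn contains(&self, (px, py): (u32, u32)) -> bool {
        px >= self.x && px < self.x + self.w && py >= self.y && py < self.y + self.h
    }
}

/// Binary-search one axis. `is_lit(lo, hi)` lights the band [lo, hi) and
/// reports whether the camera saw it. Invariant: the camera is in [lo, hi).
fn locate_axis(len: u32, mut is_lit: impl FnMut(u32, u32) -> bool) -> u32 {
    let (mut lo, mut hi) = (0, len);
    while hi - lo > 1 {
        let mid = lo + (hi - lo) / 2;
        if is_lit(lo, mid) {
            hi = mid;
        } else {
            lo = mid;
        }
    }
    lo
}

/// Find the camera's pixel position using vertical bands, then horizontal bands.
fn locate_camera(width: u32, height: u32, mut probe: impl FnMut(Rect) -> bool) -> (u32, u32) {
    let x = locate_axis(width, |lo, hi| probe(Rect { x: lo, y: 0, w: hi - lo, h: height }));
    let y = locate_axis(height, |lo, hi| probe(Rect { x: 0, y: lo, w: width, h: hi - lo }));
    (x, y)
}

/// Black-out region: the camera position plus a margin, clamped to the screen.
fn blackout_rect((cx, cy): (u32, u32), margin: u32, width: u32, height: u32) -> Rect {
    let x0 = cx.saturating_sub(margin);
    let y0 = cy.saturating_sub(margin);
    let x1 = (cx + margin + 1).min(width);
    let y1 = (cy + margin + 1).min(height);
    Rect { x: x0, y: y0, w: x1 - x0, h: y1 - y0 }
}

fn main() {
    let (width, height) = (1920, 1080);
    let hidden_camera = (960, 700); // unknown to the search; only the probe "sees" it
    let mut patterns = 0;
    let found = locate_camera(width, height, |lit| {
        patterns += 1;
        lit.contains(hidden_camera)
    });
    println!("camera at {found:?} after {patterns} patterns");
    println!("black-out region: {:?}", blackout_rect(found, 20, width, height));
}
```

The search needs roughly log2(width) + log2(height) patterns. A real calibration would also have to cope with lens flare, exposure changes, and ambient light, none of which this idealized model attempts.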

Matěj: Maybe the last thing from a pure software perspective is that our software uses an actor framework. We use our own which is called tonari-actor. It's open source, but we just didn’t announce it very publicly. Historically, we've been staying away from async code and libraries, but now more and more ecosystem libraries are better in their async versions. So, I was like, "Uh, how do we fit async?" We didn't really want to change our architecture, as we really like this little isolated model.

Matěj: Just last week, I managed to somehow convince our actor framework to support async, but in a contained manner. Every actor is a thread, and now we can say, "Hey, this actor is an async actor." It's got its own runtime, so it's kind of still confined in its own thread. You can combine sync code and async code in one actor system.

Matěj: Because we stayed away from async, I had to really refresh myself on where the ecosystem is, what the difference between an executor, a reactor and a runtime is, etc. There's fresh support for async fn in a trait, which is great, but that one cannot specify the Send bound when using it. Then, I dug into how Tokio is thread-based by default, so your futures must be Send. We didn't really want this. In the end, we managed to enable some experimental configuration flags for Tokio to use a local runtime which can spawn non-Send futures, so that we can use the nice async trait methods in our traits. It kind of looks like this approach might be working.
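
As a rough illustration of "every actor is a thread, and some actors are async", here is a self-contained sketch. It is not tonari-actor's API and it does not use Tokio: a tiny hand-written `block_on` stands in for the per-actor local runtime, and the `AsyncActor` trait, `spawn_actor`, `YieldNow`, and `Greeter` are hypothetical names invented for this example. The sketch shows why confining the executor to the actor's thread helps: the future returned by the `async fn` in the trait holds an `Rc`, so it is not `Send`, yet everything compiles because the future never leaves its thread.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::rc::Rc;
use std::sync::{mpsc, Arc};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// Unparks the actor's thread when a pending future is ready to make progress.
struct ThreadWaker(thread::Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Tiny single-threaded executor: polls one future to completion on this thread.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => thread::park(),
        }
    }
}

/// Future that returns `Pending` once, giving the example a real await point.
struct YieldNow(bool);
impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Hypothetical actor trait. The future returned by `handle` has no `Send` bound.
trait AsyncActor {
    type Message: Send + 'static;
    async fn handle(&mut self, message: Self::Message);
}

/// Runs the actor on its own thread; the actor is built there, so it need not be `Send`.
fn spawn_actor<A, F>(make_actor: F) -> (mpsc::Sender<A::Message>, thread::JoinHandle<()>)
where
    A: AsyncActor + 'static,
    F: FnOnce() -> A + Send + 'static,
{
    let (inbox, messages) = mpsc::channel();
    let handle = thread::spawn(move || {
        let mut actor = make_actor();
        // Ends once every sender for the inbox has been dropped.
        for message in messages {
            block_on(actor.handle(message));
        }
    });
    (inbox, handle)
}

struct Greeter {
    prefix: Rc<str>, // `Rc` is not `Send`, so this actor must stay on one thread
    replies: mpsc::Sender<String>,
}
impl AsyncActor for Greeter {
    type Message = String;
    async fn handle(&mut self, name: String) {
        let prefix = Rc::clone(&self.prefix); // held across the await below
        YieldNow(false).await;
        let _ = self.replies.send(format!("{prefix}, {name}!"));
    }
}

fn main() {
    let (reply_tx, replies) = mpsc::channel();
    let (inbox, actor_thread) =
        spawn_actor(move || Greeter { prefix: Rc::from("Hello"), replies: reply_tx });
    inbox.send("Prague".to_string()).unwrap();
    inbox.send("Tokyo".to_string()).unwrap();
    drop(inbox); // closing the inbox lets the actor thread finish
    actor_thread.join().unwrap();
    for reply in replies {
        println!("{reply}");
    }
}
```

Other actors in the same system can stay fully synchronous and talk to this one through the same channel type. In a real code base, a single-threaded runtime from an async library would replace the hand-written `block_on`; the experimental Tokio configuration mentioned in the interview is not shown here.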

Brian: Just reviewing the generics and types on that has been pretty wild.

Drew: Your mission is enabling remote work, but you're also building hardware. How do you balance being remote with the need to be co-located with the hardware?

Matěj: It's a funny clash. The long-term goal is a smaller product for home offices, but at our current scale it's more viable for connecting small business hubs. Our customers and our own team use tonari to connect these hubs, which is why we're located in a few different places like Tokyo, Okinawa, and Prague. There are times when we have to be physically present to work on the hardware, like during a new feature or a hardware revision. For example, when we worked on location hopping last year, it was almost purely software. A lot of that could be done remotely. So, it really goes back and forth.

Drew: Did having a distributed team happen naturally, or was it a deliberate choice?

Brian: It happened naturally, with COVID being the main driving factor. We started in our Tokyo office and then got another space nearby. Our co-founder, Taj, lives in Hayama, south of Tokyo, and his house was our first tonari connection. Matěj applied to join us during the pandemic, but he couldn't enter the country so he started working remotely. He eventually set up an office in Prague. As we grew, we just put a tonari in every space we used to keep the network connected. It's fascinating how we've built a culture around this. For me, a typical day might involve working with a colleague in Tokyo in the morning, and then switching over to Prague in the afternoon as they're starting their day.

Drew: Are there any other interesting things about the tonari culture you'd point out?

Matěj: We have a flat hierarchy. There's no formal CTO or COO; we just get stuff done. Everyone's roles are fluid and change based on company needs and individual interests. This allows our engineers to work on a wide variety of tasks, from designing PCBs to optimizing our software. We also "dog food" our own product—our portals are always on, and we operate the way we do because of them. This is how we work and solve problems together, and it's a huge part of our culture.

Drew: Are you actively hiring right now?

Matěj: We are not actively hiring for software engineering roles, but we are looking for a hardware engineer to be based in our Tokyo office.

Drew: Is there anything unusual about compensation or benefits at tonari?

Matěj: We have a flat salary policy, which means everyone in the company, including co-founders, gets the same salary. This eliminates the need for negotiation and reflects our belief that we value everyone's time and contribution equally. It fits perfectly with our flat hierarchy.

Drew: What stands out when you're looking at candidates?

Brian: We use an evaluation project after an initial call to see how a candidate works with us. We look for open-source contributions, interesting projects, and a passion for engineering in specific areas like electronics, firmware, or graphics and audio processing. The evaluation project is a paid, week-long collaboration where we work on a small project together. This gives us a chance to see what it's like to work with them on a team, and it gives them a realistic feel for our culture. We try to open-source the project when possible so their work doesn't just disappear.

Matěj: The most interesting candidates are often those who have built similar projects or faced similar problems in a different domain, showing a real passion for a particular area. We look for people who bring expertise that complements the team, which allows us to strategically cover all the things that make the product work.

Matěj: The way we try to assemble our team is to have everybody contribute their expertise and passion in some area. I might be the "Linux nerd" of the engineering team, for example, who custom-builds their kernel. We recently hired someone who is really into audio and maybe had their own little projects. A lot of the time, the best hires and most interesting candidates we've seen are the ones who have built a version of tonari before. That’s not necessarily a direct competitor, but maybe something in a different domain while having faced similar problems. For example, “my brother wanted to compose music with me, but he lives on the other side of the country. So, we made this little ad hoc software that allows us to compose simultaneously.”

Brian: Obviously, the company is not just composed of engineers; we have sales and marketing and other areas. We also have calls with them to make sure everyone is excited about whoever might join the team.

Drew: That was everything that I wanted to ask. Is there anything else you guys wish we had the chance to discuss that we didn't?

Matěj: I think I've dumped all my ideas.

Brian: Yeah, I think hopefully we gave a pretty good representation of what the company is about, what we're building, and how Rust fits into the equation. We're a bunch of Rust fans here.

Drew: All right. Well, thank you Matěj and Brian!

links:

1. tonari-actor