The secrets that make Pixel 10 Pro the world’s smartest phone camera – from inside Google

Google Pixel 10 Pro camera

ZDNET’s Kerry Wan takes a photo with the Google Pixel 10 Pro camera.

Sabrina Ortiz/ZDNET



ZDNET’s key takeaways

  • In a deep-dive interview, a Google leader on the Pixel Camera team pulled back the curtain on how the company used AI to launch new camera features that other phones don’t have.
  • We learned what makes Pro Res Zoom, Auto Best Take, Conversational Editing and other software features work.
  • As the only maker of frontier AI models that also builds its own smartphone, Google has a key advantage over every other smartphone maker.

Isaac Reynolds has been working on the Pixel Camera team at Google for almost a decade — since the first Google Pixel phone launched in 2016. And yet, I think it’s fair to say that the Group Product Manager of Pixel Camera has never been more bullish about the technology Google has integrated into a phone camera than he is with this year’s Pixel 10 Pro. A new wave of AI breakthroughs in the past year has allowed Google to use large language models (LLMs), machine learning, and generative AI imaging to unlock new capabilities that have powered another meaningful leap forward in phone photography.

I got the chance to sit down with Reynolds while he was still catching his breath after the launch of the Pixel 10 phones — and at the same time, ramping up for the next set of camera upgrades the team is preparing for Google’s 2026 smartphones. 

Also: Pixel just zoomed ahead of iPhone in the camera photography race

I peppered Reynolds with all of my burning questions about Pro Res Zoom, Conversational Editing, Camera Coach, AI models, the Tensor G5 chip, Auto Best Take and the larger ambitions of the Pixel Camera team. At the same time, he challenged me with information I didn’t expect on Telephoto Panoramas, C2PA AI metadata, Guided Frame, and educating the public about AI. 

I got to unpack a lot about how the Google team engineered such big advances in the Pixel 10 Pro camera system, and we delved far deeper into the new photography features than Google did at its 2025 Made by Google event or in its published blog post.

Here’s my reporter’s notebook on what I learned.

Mission of the Pixel Camera team

“I think the major thing our team has always been focused on is what I call durable [photography] problems — low light, zoom, dynamic range, and detail,” said Reynolds. “And every generation [of Pixel] has brought new technologies.”

Camera Coach

Reynolds noted, “LLMs have such an enormous context window, and they’re so powerful at understanding that we can actually teach people to do things that tech can’t do. 

“Today, tech cannot move the camera down four feet. Tech can’t walk the camera over 100 yards to the better viewpoint. It can’t tell you to turn 90 degrees. Now, Camera Coach can do that kind of stuff. So that’s just another way we’re using technology to solve some of these durable problems.”

Google Pixel 10 Pro camera

Google Pixel 10 Pro’s Camera Coach.

Sabrina Ortiz/ZDNET
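The pattern behind a feature like this is straightforward to sketch: hand the current viewfinder frame to a multimodal LLM and ask for one physical, actionable instruction. Here is a minimal illustration using the public google-genai Python SDK; the model name, the prompt, and the idea of feeding it a saved frame are my assumptions, not Google’s on-device implementation.

```python
# Minimal sketch of the Camera Coach pattern: show a multimodal LLM the
# current frame and ask for one physical instruction. The model name and
# prompt are illustrative assumptions; Google's version runs on-device.
from google import genai
from PIL import Image

client = genai.Client()  # reads GEMINI_API_KEY from the environment

frame = Image.open("viewfinder_frame.jpg")  # stand-in for a live preview frame

prompt = (
    "You are a photography coach. Give one short, physical instruction the "
    "photographer can act on, such as 'crouch two feet lower', 'walk closer "
    "to the subject', or 'turn 90 degrees to put the sun behind you'."
)

response = client.models.generate_content(
    model="gemini-2.5-flash",  # assumption: any multimodal Gemini model works
    contents=[frame, prompt],
)
print(response.text)
```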

Conversational editing

One of the most surprising new features Google announced with the Pixel 10 was conversational photo editing — although this is technically a feature of the Google Photos app. It lets you simply describe what you want changed in a photo, by voice or typing, and the AI takes care of the rest. You can remove a tree, re-center the image, or add more clouds to the sky, for example.

Conversational editing in Google Photos

Conversational editing in Google Photos.

Google

As Reynolds explained it, “Conversational editing just takes the whole interface away and it’s essentially a mapping function from natural language to the things that were in the editor. So you can say, ‘Erase the thing on the left,’ and it will just figure out what the thing on the left is and then invoke Magic Eraser. You can say, ‘Hey, when I was in Utah I remember the rocks being more red than that’ and it just increases the warmth a little bit. You can say, ‘Can you focus on the thing in the center’ and it puts a little vignette around it. 

“And that mapping is a huge time saver. The promise of AI was not just that it would be informational, but it was that it would take actions for you. And I think this is one of the most perfect cases of the AI not just reminding you of something … but doing it for you. It has been really, really cool to see how effective it is. 

“It even gives you suggestions. The AI will look at a picture and say ‘I think you have some bystanders you would want to remove.’ And so it populates these little suggestion chips. The funniest part of the suggestion chips is when you tap them, all it does is type into the text box. It’s not a separate pathway. You just tap the chip and it sticks something in the text box. You could have written that yourself. It’s not doing anything wildly different than you could do… It’s also got the voice button, which is super cool. You could just talk to it if you want to. The AI is getting so good so much faster than I could imagine, and I’m a professional in this space.”
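Stripped down, the “mapping function” Reynolds describes is an intent-to-tool dispatch layer: free-form text in, a call to an existing editor feature out. The toy sketch below uses keyword matching where Google Photos uses an LLM, and the tool names are hypothetical stand-ins:

```python
# Toy sketch of conversational editing as a mapping from natural language to
# the editor's existing tools. Google Photos uses an LLM for this mapping;
# the keyword rules and tool names below are hypothetical stand-ins.
from dataclasses import dataclass

@dataclass
class EditAction:
    tool: str     # which existing editor feature to invoke
    params: dict  # arguments for that feature

def map_request_to_action(request: str) -> EditAction:
    text = request.lower()
    if "erase" in text or "remove" in text:
        # "Erase the thing on the left" -> locate the object, invoke Magic Eraser
        return EditAction("magic_eraser", {"target": "thing on the left"})
    if "more red" in text or "warmer" in text:
        # "The rocks were more red than that" -> nudge the warmth slider
        return EditAction("adjust_warmth", {"delta": +0.1})
    if "focus on" in text:
        # "Focus on the thing in the center" -> add a subtle vignette
        return EditAction("vignette", {"strength": 0.3})
    return EditAction("noop", {})

# A suggestion chip works the same way: tapping it just types a canned
# request into this same text pathway.
print(map_request_to_action("Erase the thing on the left"))
```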

Pro Res Zoom

As a photographer who loves zoom photography, I wanted to talk with Reynolds about this feature most of all. I take a lot of photos with smartphones, but long-distance zooms are where I most often need to pull out my Sony mirrorless camera and 70-200mm lens. I’ve already written about how excited I am to thoroughly test Pro Res Zoom, since it could help produce a lot more usable zoom photos from a phone by using generative AI to fill in the gaps in digital zoom.

Reynolds commented, “The fundamental problem is, how do I turn a digital zoom where you’ve got a sensor pixel on the far right corner, and then another one on the bottom left corner. And you have to fill in all the pixels in between. You can do an interpolation. You can just set them all to be some color, like just average them. We’ve grown all the way through the process here. We’ve gone through multi-frame denoise. We’ve gone through multiple different generations of upscalers to make better interpolations. We went to a multi-frame merge that was block-by-block. And then the major advancement that was Super Res Zoom was going from a block-by-block multi-frame to a probabilistic pixel-by-pixel multi-frame… In parallel, the upscalers were improving. And the latest generation upscaler is the largest model we’ve ever run in Pixel Camera ever… And it’s just a really, really good interpolator. 

Also: I replaced my Samsung Galaxy S25 Ultra with the Pixel 10 Pro XL for a week – and can’t go back

“It doesn’t just say that’s black and that’s white, and so the middle is gray. It’s like, well, I know that that black pixel is part of a larger structure. I know that that larger structure appears to be the grout in between some brick on a facade. And so probably it’s going to be black up until that point, and then it’s going to turn red — which is so much smarter than just going, ‘Well, it’s black and it’s red. So, I don’t know. I guess we’ll just mix them as we go across.’ So we still have those real things as real pixels, and then we have to fill in what’s in between. And now the models are just so, so good at that.

Pixel 10 Pro Res Zoom at 100x

The top photo is at 0.5x zoom and the bottom is the same framing at 100x on Pixel 10 Pro.

Google (screenshot by Jason Hiner/ZDNET)

“We’ve had a long line of upscalers, and this is the latest one. All the upscalers have artifacts. Different upscalers have different kinds of problems. We’ve had upscalers in the past that were very, very good at text — because text has very harsh lines — but very bad at water, because water is fundamentally chaotic. This upscaler has its own artifacts, and those artifacts are very difficult for the human eye to recognize, because the new models are so good at making content that is 100% authentic to the scene. 

“Like, yes, that’s a leaf on a tree. That’s exactly what a leaf on a tree looks like. It’s flawless. But for a human face, there is so much of the human brain dedicated to recognizing faces, that no level of artifact is effectively acceptable. The level of subtle artifact on a leaf, you may never notice. But the same subtlety on a face, you notice instantly — just because we’re human beings and we’re designed to recognize other human beings. We’re social creatures, so the bar for actually doing a good job with human faces is extraordinarily high.” 

As a result, when Pro Res Zoom recognizes a human face, it won’t use the AI to upscale it. 
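That face rule is easy to picture as a masking step: run the generative upscaler everywhere, then paste conventionally upscaled pixels back over any detected face. Here is a rough sketch in Python with OpenCV; the generative_upscale function is a placeholder, since Google has not published its upscaler.

```python
# Rough sketch of "don't let generative AI touch faces": upscale the whole
# frame with the (placeholder) generative model, then paste conventionally
# upscaled pixels back over every detected face region.
import cv2

SCALE = 4

def generative_upscale(img):
    """Placeholder for Google's unpublished generative upscaler."""
    return cv2.resize(img, None, fx=SCALE, fy=SCALE,
                      interpolation=cv2.INTER_LANCZOS4)

img = cv2.imread("zoom_crop.jpg")
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)
faces = detector.detectMultiScale(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))

result = generative_upscale(img)
conventional = cv2.resize(img, None, fx=SCALE, fy=SCALE,
                          interpolation=cv2.INTER_CUBIC)

for (x, y, w, h) in faces:
    # Keep non-generative pixels wherever a face was found.
    xs, ys, ws, hs = x * SCALE, y * SCALE, w * SCALE, h * SCALE
    result[ys:ys + hs, xs:xs + ws] = conventional[ys:ys + hs, xs:xs + ws]

cv2.imwrite("pro_res_zoom_sketch.jpg", result)
```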

C2PA metadata to label AI

Because Google is now part of the Coalition for Content Provenance and Authenticity (C2PA), it has started to embed metadata in its photos that indicates whether generative AI was used to make them, complementing SynthID, the invisible watermarking technology created by Google DeepMind. Reynolds was deeply involved with the project to bring this to Pixel Camera.

“The [C2PA] metadata identifies whether this was AI or not, and it just generally tells you the history of the picture and we embed it,” said Reynolds. “I was personally the product manager for that. I don’t do things personally like that a lot anymore, but I did take that one because I knew how important, nuanced, and subtle it was. And the deeper I got into that feature, the more I realized how little people actually know about what AI is or isn’t, what it can and can’t do, or how fast or slow it’s progressing.”

An example of Google C2PA metadata for AI

An example of Google C2PA metadata for AI.

Google
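A C2PA manifest is structured, signed data embedded in the image file; the open-source c2patool CLI from the Content Authenticity Initiative can display one. The snippet below shows an illustrative subset of the kinds of fields involved; the labels follow the public C2PA spec, but the values are invented for this example and are not Google’s actual output.

```python
# Illustrative subset of the data a C2PA manifest carries. Field labels
# follow the public C2PA spec; the values are invented for this example
# and are not Google's actual output.
manifest = {
    "claim_generator": "Pixel Camera",  # the app that signed the claim
    "assertions": [
        {
            "label": "c2pa.actions",
            "data": {
                "actions": [
                    {
                        "action": "c2pa.created",
                        # IPTC digital source type distinguishes a standard
                        # computational capture from AI-generated media.
                        "digitalSourceType": (
                            "http://cv.iptc.org/newscodes/digitalsourcetype/"
                            "computationalCapture"
                        ),
                    }
                ]
            },
        }
    ],
    # The manifest is cryptographically signed, so tampering can be detected.
    "signature_info": {"issuer": "Google LLC"},
}
```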

Educating the public about AI

“The world is honestly behind in terms of not realizing how good AI is already. So there’s some education to do. And we realize that AI can do things that I think users would really, really like if they understood better what was going on. So part of what we do in Pro Res Zoom is we don’t touch faces. I think that’ll make people more comfortable. We also show them the before and after — the version with the new upscaler and the one without it, and you get to decide for yourself, what did AI do? Did I find it acceptable or unacceptable? The overwhelming majority are finding it more than acceptable — highly preferred, in fact. They want the upscale. But they wouldn’t know that if they didn’t get to see the side-by-side. 

Google Pixel 10 Pro camera

Google Pixel 10 Pro camera.

Sabrina Ortiz/ZDNET

“And then we also label it with content credentials [C2PA] so that whenever they transmit that photo, somebody else can make their own decision about, ‘How do I imagine this photo? Do I discount this as maybe AI? Or do I go, oh no, the content credentials are right there. They say it’s not AI at all. This is great. I have so much more trust now.’ And as users learn more, as they’re educated more, as they gain more comfort and more real world data points of what is AI and what isn’t, I think they will end up being more comfortable over time, and that’s what we’re seeing with Pro Res Zoom already. The customer satisfaction that we measured pre-launch was so good for that feature. 

“And as the technology gets better, we’ll do more. We will put this stuff into more modes, perhaps. We’ll push the zoom a little higher quality. But we really want to make sure that we’re doing that as users expect and understand it. So we’re giving you options and choices and transparency, but we’re also trying to push the boundaries of technology in a way that keeps customer satisfaction high.”

Telephoto Panoramas 

“There are always little goodies hidden all over the [camera] app,” Reynolds told me. “We build more stuff than we can realistically talk about.”

One of the new photography features in the Pixel 10 Pro that Google hasn’t talked much about is Telephoto Panoramas, or what the team affectionately calls “5x tele-panos.”

These let you take more cinematic landscape shots using the zoom lens, with new viewfinder controls and the ability to shoot up to full 360-degree panoramas at up to 100MP resolution. “There’s something that’s just so nice about zooming in with your lens and then stitching the panorama,” said Reynolds.

But what Google hasn’t talked about is the fact that it’s using an entirely new method of capturing these panoramic images.

Also: Google Pixel 10 Pro vs. iPhone 16 Pro: I’ve tried both flagships, and there’s an easy winner

“A lot of panoramas in the market, and ours historically as well, were video-based,” Reynolds noted. “And what that means is to make a panorama, you take 100 to 1000 images, and each one of them, you stitch a little tiny vertical slice. So that means two things. Number one, it means that the artifacts you get tend to be curves, stretches, and compressions because you’re just going slice by slice. The other problem is that in that 30 seconds, you have to process [up to] 1000 images. 

“So what we did is we said instead of a video we’re going to use photo input. So we’re going to take five pictures, not hundreds, and we’re going to put all of our processing behind it — full HDR Plus, full computational photography, Night Sight — and then we stitch a little bit of overlap. So instead of having a little sliver from each picture, it’s just a little overlap. That’s how [Adobe] Lightroom would do it, for example. We’re using the Lightroom method. 

“And so we get Night Sight Panorama. We get panoramas now up to 100 megapixels. We get just super, super detailed and we can turn on parts of the zoom pipeline that we couldn’t necessarily do before. So you can use the 2x zoom, which on a Pixel phone has optical quality. And you can even invoke the 5x telephoto [on the Pixel Pro]. It’s a very computational-photography-forward, photo-based panorama.”
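The “Lightroom method” Reynolds mentions, merging a few fully processed stills by their overlapping regions, is classic feature-matching panorama stitching. OpenCV ships a stitcher that works the same way, so a minimal sketch of the photo-based approach (standing in for Google’s HDR+ pipeline) looks like this:

```python
# Minimal sketch of photo-based (overlap) panorama stitching, as opposed to
# the older video-based, slice-by-slice method. OpenCV's stitcher stands in
# for Google's HDR+ pipeline: it matches features in the overlapping regions
# of a handful of full photos and warps them into one image.
import cv2

# A few fully processed stills with modest overlap, not hundreds of slices.
images = [cv2.imread(f"tele_pano_{i}.jpg") for i in range(5)]

stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
status, pano = stitcher.stitch(images)

if status == cv2.Stitcher_OK:
    cv2.imwrite("telephoto_panorama.jpg", pano)
else:
    print(f"Stitching failed with status code {status}")
```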

Guided Frame (accessibility feature)

Another feature that has flown under the radar that Reynolds wanted to point out was Guided Frame.

“Guided Frame is an accessibility feature. If you are blind or low-vision, we use Gemini to allow you to frame any photo,” said Reynolds. “In that case, you point the camera, you invoke Guided Frame, and it says, ‘This is a photo of a scene of the woods with some trees off to the right and a person on the left. Person is in frame, smiling, good for a selfie.’ And then it will take the photo. So if you can’t really see the screen that well, it helps take selfies and photos, because [selfies] are how people communicate. Whether you’re blind or low-vision or not, people communicate using pictures. So it gives them that capability.”
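As a flow, Guided Frame is a describe-then-capture loop: narrate the scene, keep checking the framing, and fire the shutter automatically once the subject looks right. The sketch below captures that control flow only; describe_frame, camera, and speak are hypothetical stand-ins for the on-device Gemini model and the phone’s camera and text-to-speech services.

```python
# Sketch of the Guided Frame control flow: narrate the scene, re-check the
# framing, and fire the shutter automatically once the subject looks right.
# describe_frame, camera, and speak are hypothetical stand-ins for the
# on-device Gemini model and the phone's camera and text-to-speech services.
import time

def guided_frame(camera, speak, describe_frame):
    while True:
        frame = camera.preview_frame()
        desc = describe_frame(frame)  # hypothetical multimodal-model call
        speak(desc.summary)           # e.g. "Person on the left, smiling"
        if desc.subject_in_frame and desc.good_for_selfie:
            speak("Hold still")
            camera.capture()          # take the photo for the user
            return
        time.sleep(0.5)               # wait while the user adjusts framing
```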

Auto Best Take

I also asked Reynolds about the evolution of Best Take into Auto Best Take this year, and was surprised to learn that the feature leans on more traditional processing rather than generative AI.

“Auto Best Take is much more traditional processing,” Reynolds commented. “You can imagine this as a decision tree, because that’s essentially what this feature is. You press the shutter once. If that shutter press was perfect and everyone was smiling, everyone’s looking at the camera, then great. Done. One picture. 

“Okay, let’s say it wasn’t perfect. Then we’re going to open the shutter a little longer and we’re going to look at every single frame. So that’s up to 150 frames in just a few seconds. If we see one that’s better, we’ll take it, we’ll save that one, we’ll process it in full HDR Plus quality… So when you go to the gallery, you’re going to see the one that we took as the primary, that’s called Top Shot. So that’s one step down the decision tree. 

Google Pixel 10 Pro selfie camera

Google Pixel 10 Pro selfie camera.

Sabrina Ortiz/ZDNET

“Let’s say we looked at 150 frames and we couldn’t find one that was perfect, but we found one that was almost perfect, and a second one that was almost perfect but in a different way, such as a different face. Then what we’ll do is we’ll save both of those and then we’ll pass that to Best Take and Best Take will blend them into one that is perfect. And Top Shot will intentionally choose a range of pictures so that there’s at least one photo in which every face is smiling. So if there is a picture of every face smiling at least once somewhere in the set then it will do a Best Take. Once you look at 150 pictures, most of the time you get the shot. So very rarely does it actually go to Best Take. So it’s a little odd that we call it Auto Best Take, because in reality, we don’t do it very often, since it’s at the end of the decision tree. 

“The goal is that you press the shutter one time and you get one photo and that photo is perfect. It does not matter how we get there. We never want you to have to take three photos [of the same group picture] again. Because why would you take three random photos when [the AI] can look at 150 photos. So we say just press [the shutter button] once. Give it a couple of seconds. You’ll see it in the UI. It draws boxes around people’s faces. It turns them gold when it thinks it nailed it. So press the shutter, give it a couple seconds, and then watch what you get at the end.”
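Laid out as code, the decision tree Reynolds walks through is short. In this sketch, is_perfect, score, and blend are stand-ins for Google’s face-quality models and the Best Take merge; only the branching mirrors his description.

```python
# The Auto Best Take decision tree as Reynolds describes it. is_perfect,
# score, and blend are stand-ins for Google's face-quality models and the
# Best Take merge; only the branching mirrors the interview.
def auto_best_take(frames, is_perfect, score, blend):
    # Step 1: if the shutter-press frame is already perfect, stop there.
    if is_perfect(frames[0]):
        return frames[0]

    # Step 2 (Top Shot): hold the shutter open longer, scan up to ~150
    # frames, and keep the best one at full HDR+ processing quality.
    ranked = sorted(frames[:150], key=score, reverse=True)
    top_shot = ranked[0]
    if is_perfect(top_shot):
        return top_shot

    # Step 3 (Best Take, rarely reached): no single frame was perfect, so
    # blend two near-perfect frames that fail in different ways, chosen so
    # every face is smiling in at least one of them.
    runner_up = ranked[1] if len(ranked) > 1 else top_shot
    return blend(top_shot, runner_up)
```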

The difference with Tensor G5

Google made a big move in 2025 with its Tensor G5 chip powering the Pixel 10 phones, shifting production of its Tensor chips from Samsung to TSMC’s advanced 3nm process to increase AI performance. I asked Reynolds about the impact.

Also: Considering the Pixel 10 Pro? I recommend buying these 5 phones instead – here’s why

“[The boost with Tensor G5] is one of the largest before-and-afters I’ve ever seen in terms of processing latency,” he noted. “The first versions of Pro Res Zoom took like two minutes [to process]. And then by the end, once they got it on Tensor G5 and all the bugs had been worked out, that got down to just several seconds… So the Tensor G5 TPU is 60% more powerful, and we can definitely see that.”

The AI models powering Pixel photography

Since so many of the Pixel 10’s most important new features are powered by AI advances, I wanted to know more about how the Pixel Camera team works with Google’s internal AI capabilities.

“It’s not like there’s this one monolithic Gemini,” Reynolds said. “It is extremely carefully tuned and tested for one particular use case at a time… There are so many more versions of Gemini inside [Google] than you can see outside. And then you have to decide, am I going to prompt this Gemini or am I going to fine-tune this Gemini? It’s all super, super custom to a particular implementation.” For example, he added, “Magic Eraser is generative, but it’s not Gemini.”

Final thought

Google is the only one among the dozen or so companies in the world building frontier AI models that also makes its own smartphone. And with the Pixel 10 Pro, the impact is starting to show.



