Recently, Visual 1st held a small panel of scientists and technology gurus called “Visual AI — The One-Click Future and Beyond,” which discussed what role AI-driven tools might play in the visual fields. Hans Hartman, president of Suite 48 Analytics, and Christian Rondeau, CTO of Mediaclip, moderated the panel.

About the panel

Visual 1st described the panel:

“Just like in the early days of mobile photography, it’s not the technology shift itself that matters: it’s the breakthrough in solutions for problems — or unmet needs — we didn’t even know we had. 

“Fast-forward from the photo app revolution to today’s surging field of AI-driven tools. What can visual AI really deliver? Solutions for one-click photo curation, one-click auto-layout, one-click photo capture, one-click photo editing — and one-click solutions for what we can’t even fathom yet? And how does this one-click automated future coexist with tools or features that give the user manual control to get exactly what they want?”

The panelists included Appu Shaji, CEO and Chief Scientist at Mobius Labs; Brad Malcolm, CEO and co-founder at EyeQ; and Yair Adato, CTO and co-founder at Bria.

Various approaches to AI

Although never at loggerheads during the panel, the three participants approach AI from different angles. For instance, Shaji finds the idea of non-technical people using AI for problem-solving particularly interesting.

Malcolm cited making corrections in images, as opposed to enhancements. He also mentioned that just because AI is “hot” doesn’t mean that one should always use it. EyeQ uses AI in its analysis to determine when to use AI and when to use more traditional approaches. Malcolm believes this hybrid approach leverages what each approach does best.

Adato discussed utilizing Natural Language Processing, or NLP, to create or invent new concepts. NLP focuses on the ability of computers to understand text and spoken words in much the same way human beings can. The idea here is that NLP could help extract information from documents and aid in categorizing and organizing those documents.
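Adato didn’t go into implementation details, but to make the categorization idea concrete, here’s a minimal sketch using the open-source Hugging Face transformers library. This illustrates the general technique, not Bria’s actual pipeline; the sample document and category labels are made up.

```python
# A minimal sketch of NLP-driven document categorization using the
# open-source Hugging Face transformers library. Illustrative only;
# this is not Bria's pipeline.
from transformers import pipeline

# Downloads a default zero-shot model (facebook/bart-large-mnli) on first run
classifier = pipeline("zero-shot-classification")

document = "Invoice #4821: payment is due within 30 days of receipt."
categories = ["invoice", "contract", "press release", "support ticket"]

result = classifier(document, candidate_labels=categories)
print(result["labels"][0])  # the highest-scoring category, e.g. "invoice"
```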

Snowpixel generated this image using AI based on my word description. I described this as “an abandoned bus with a moon shining behind it.”

Further, as Adato mentioned, AI promises to generate something that was not there before. I have done a bit of dabbling with Snowpixel, for instance, which allows you to describe an image in words and then generates an image based on that description.

I used the same description as above, “an abandoned bus with a moon shining behind it.” This time, Snowpixel generated a different photo altogether.
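Snowpixel doesn’t publish its internals, but open-source text-to-image models follow the same describe-it-in-words workflow. Here’s a minimal sketch using the diffusers library; the specific model name and the GPU assumption are mine, not anything Snowpixel has disclosed.

```python
# A sketch of text-to-image generation with the open-source diffusers
# library. This illustrates the general technique; Snowpixel may work
# differently. Because generation starts from random noise, each run
# produces a different image, just like my two bus photos above.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")  # assumes an NVIDIA GPU is available

prompt = "an abandoned bus with a moon shining behind it"
image = pipe(prompt).images[0]
image.save("abandoned_bus.png")
```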

How do you evaluate whether AI works outside of a laboratory?

Rondeau asked how to determine whether AI works “out in the wild,” outside of scientists’ hands. Adato stated that you need specific strategies for this.

Malcolm continued, stating that EyeQ uses quantitative mathematical data. However, this needs to be evaluated against qualitative A/B tests with customers. In other words, do people feel that the AI works as promised? This feedback is crucial, as the qualitative data must back up the mathematical data.
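Malcolm didn’t describe EyeQ’s testing setup in detail, but the general pattern is easy to sketch: gather user preferences for two variants, then check whether the gap between them is statistically meaningful. Here’s a hypothetical example in Python, with invented numbers:

```python
# A hypothetical sketch of backing a quantitative metric with a qualitative
# A/B test. The counts below are invented for illustration; nothing here
# reflects EyeQ's actual testing setup.
from scipy.stats import chi2_contingency

# How many users preferred each version of the same corrected image
preferred = [412, 355]      # A: AI correction on, B: AI correction off
not_preferred = [188, 245]

chi2, p_value, dof, expected = chi2_contingency([preferred, not_preferred])
print(f"p = {p_value:.4f}")  # a small p suggests the preference gap is real
```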

Shaji continued by stating that AI attempts to mimic human behavior, and that this has particularly taken root in the past 3-4 years. He stated that the scale is enormous because AI can process millions of images. 

However, Shaji pointed out, no systems are ever 100% accurate. AI algorithms require extensive training and tests. He stated that AI resides at an intersection of technical and social science. Shaji felt that AI must evolve alongside humans socially. “Body shaming,” which was more present forty years ago, would not be considered appropriate today.

Snowpixel generated this image based on my description. I used a more complicated description to see what might happen. I described it as a “Klingon wearing a Chicago Cubs shirt using a camera to take photos of the Milky Way.” I don’t think Snowpixel’s AI knows what a Klingon is.

What do they see going forward for AI?

Malcolm’s company is developing AI for corrections on video. He stated that this was now possible due to increases in high-speed performance, mentioning AI Scene Detection to detect specific scenes.
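Malcolm didn’t name specific tools, but scene detection itself is well-trodden ground. As a hypothetical illustration only (EyeQ’s video pipeline is not public), the open-source PySceneDetect library can split a video into scenes in a few lines:

```python
# A sketch of automatic scene detection using the open-source PySceneDetect
# library. An assumption for illustration; EyeQ may use an entirely
# different approach.
from scenedetect import detect, ContentDetector

# Split a video into scenes wherever the visual content changes sharply
scenes = detect("my_video.mp4", ContentDetector())
for start, end in scenes:
    print(f"Scene from {start.get_timecode()} to {end.get_timecode()}")
```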

Shaji again cited the usability of AI. He feels that the flurry of AI applications in the next 5-6 years will mean that AI is being trained by people outside the field, joking that he might put himself out of a job.

Adato continued, stating that he felt that all Fortune 500 companies will be software companies in 10 years, and AI will be in all software companies. He then joked that Shaji’s future is secure. He also mentioned the possibility of converting much of AI to a unified platform.

Part of the discussion centered around TensorFlow, an open-source machine learning platform that many software companies use to develop machine learning and AI applications.
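For anyone curious what building on TensorFlow looks like, here’s a minimal sketch of a tiny image classifier. Everything about it (layer sizes, input shape, the ten categories) is an arbitrary illustration, not anything a panelist described:

```python
# A minimal TensorFlow sketch: the kind of small image classifier developers
# build on the platform. All sizes and categories here are arbitrary.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),  # e.g. 10 photo categories
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()  # training would follow with model.fit(images, labels)
```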

Facebook already leans heavily on AI

Each minute, users upload over 136,000 photos to Facebook. Facebook then uses AI to attempt to “understand” those photos and videos. The company also uses AI in its Deepfake Detection Challenge (DFDC) to attempt to find deepfake videos, and it uses DeepText to determine how people use language, slang, abbreviations and more. Sound familiar? It should. These are the very examples of AI that the Visual 1st panel discussed.

Additionally, Facebook uses AI-based translation. They also use AI for policing content, which I hope improves greatly.

Facebook CEO Mark Zuckerberg has said, “Our goal is to enable computers to understand language more like humans would, instead of rote, machine-like memorization of ones and zeros.”

Facebook enters the metaverse

Facebook recently announced a rebranding of its parent company to Meta, indicating that they are all in on virtual and augmented reality, essentially looking toward social media brought forth in an immersive 3D environment. Some, including Zuckerberg, refer to this as the Metaverse. Just to be clear, we should not confuse this with Marvel’s Multiverse of similar alternate universes!

Investors and companies have been interested in new online, multi-dimensional, immersive spaces for a while now. That interest doesn’t seem to be going away, even if these spaces have so far never quite taken root. To be fair, though, VR/AR has never been implemented particularly well either. But with increased processing power and AI, perhaps it might be.

After all, Facebook has announced that they will hire 10,000 workers in the E.U. over the next five years to develop the metaverse. Facebook Reality Labs, their VR/AR research and development team, announced the creation of a special product group to focus on their vision for the metaverse. And they have owned Oculus since 2014. All roads seem to be leading to the metaverse.

If Facebook and others are fully invested in creating an immersive metaverse, they would need to lean heavily on machine learning and AI to take the load off creating and coding. And that’s in addition to their already enormous use of AI for photos, videos and many other areas.

Snowpixel generated another image based on my description. I used the same complicated description, a “Klingon wearing a Chicago Cubs shirt using a camera to take photos of the Milky Way.” I still don’t think Snowpixel’s AI knows what a Klingon is.

Photographers in the metaverse

The idea of the metaverse, of course, is that it promises more interaction. Photographers might build new communities similar to Facebook groups or message boards. People would earn money and purchase items as digital assets. Users could exchange digital assets as Non-Fungible Tokens, or NFTs, allowing them to “tokenize” their work and sell it in marketplaces throughout the metaverse.

The idea here, too, is that you would have a verified digital identity tied to your digital wallet. The hope is that people cannot create troll accounts as easily, providing a safer environment.

But even if you don’t wish to do this, photographers may find new opportunities. Facebook’s vision for the metaverse is that it will not only be available on more immersive virtual reality devices, but also accessible across different computing platforms. In other words, anything from VR/AR headsets to computers, mobile devices and gaming consoles.

Given how strongly this metaverse relies on an immersive visual environment, there may be opportunities for photography that we have not thought of yet. Could people experience our night panoramas in some sort of virtual reality? Could there be more of an immersive component while experiencing photos? Could one “enter” a photograph while listening to environmental sounds or music?

Thoughts

Is this good? Is this desirable? Will it even really take root? Do we think Facebook, or Meta, would implement this well? 

I don’t have the answers to that. However, my guess is that machine learning and AI will have an enormous part in all this.

For now, my interest in machine learning and AI is perhaps more mundane. I want AI to do all the mundane tasks for me so I can create. If that means getting rid of high-ISO noise and hot pixels such as what Topaz DeNoise AI is addressing, great! Quick and effective masking in images? Fantastic! Eliminating sensor spots from images like Luminar Neo will do? Sure! Finding photos quickly without keywording? I say yes!

As the panelists at Visual 1st pointed out, the idea is to make products usable. Adato mentioned focusing on trying to do something “small” at first to bring something to production. And that sounds great to me.