LDM3D is the industry’s first generative AI model to deliver depth mapping. It has the potential to revolutionize content creation, the metaverse and digital experiences.

SANTA CLARA, Calif.--(BUSINESS WIRE)--Intel (Nasdaq: INTC):

This press release contains multimedia. View the full release here: https://www.businesswire.com/news/home/20230621842353/en/

Intel Labs, in collaboration with Blockade Labs, has introduced the Latent Diffusion Model for 3D (LDM3D), the industry’s first diffusion model to deliver depth mapping to create 3D images with 360-degree views that are vivid and immersive. LDM3D has the potential to revolutionize content creation, metaverse applications and digital experiences, transforming a wide range of industries from entertainment and gaming to architecture and design. (Credit: Intel Corporation)

What’s new: Intel Labs, in partnership with Blockade Labs, has introduced the Latent Diffusion Model for 3D (LDM3D), a new diffusion model that uses generative AI to create realistic 3D imagery. LDM3D is the industry’s first depth-mapping model using the diffusion process to create 3D images with 360-degree views that are vivid and immersive. LDM3D has the potential to revolutionize content creation, metaverse applications and digital experiences, transforming a wide range of industries from entertainment and gaming to architecture and design.
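For readers who want to try this, the model is available through the Hugging Face ecosystem. The following is a minimal, illustrative sketch assuming the diffusers library’s StableDiffusionLDM3DPipeline and the "Intel/ldm3d-4c" checkpoint name; verify both against the current model card before use.

```python
# Minimal sketch: generating an RGB image and a matching depth map from a
# single text prompt with LDM3D. Assumes the Hugging Face `diffusers`
# integration and the "Intel/ldm3d-4c" checkpoint name (check the model card).
import torch
from diffusers import StableDiffusionLDM3DPipeline

pipe = StableDiffusionLDM3DPipeline.from_pretrained(
    "Intel/ldm3d-4c", torch_dtype=torch.float16
).to("cuda")

prompt = "a serene tropical beach at sunset, 360-degree panorama"
output = pipe(prompt)

# One denoising pass yields both modalities -- no separate depth network.
rgb_image = output.rgb[0]      # PIL image
depth_image = output.depth[0]  # PIL image encoding per-pixel relative depth

rgb_image.save("beach_rgb.png")
depth_image.save("beach_depth.png")
```

Note that a single generation pass produces both outputs, which is the property highlighted here: no separate depth-estimation step is needed at inference time.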

“Generative AI technology aims to further augment and enhance human creativity and save time. However, most of today’s generative AI models are limited to generating 2D images, and only very few can generate 3D images from text prompts. Unlike existing latent stable diffusion models, LDM3D allows users to generate an image and a depth map from a given text prompt using almost the same number of parameters. It provides more accurate relative depth for each pixel in an image compared with standard post-processing methods for depth estimation, saving developers significant time when developing scenes.”

—Vasudev Lal, AI/ML Research Scientist, Intel Labs

Why it matters: Closed ecosystems limit scale, and Intel’s commitment to the true democratization of AI will enable broader access to the benefits of AI through an open ecosystem. One area that has seen significant advancement in recent years is computer vision, particularly generative AI. However, many of today’s advanced generative AI models are limited to generating only 2D images. Unlike existing diffusion models, which generally generate only 2D RGB images from text prompts, LDM3D allows users to generate both an image and a depth map from a given text prompt. Using almost the same number of parameters as latent stable diffusion, LDM3D provides more accurate relative depth for each pixel in an image compared with standard post-processing methods for depth estimation.

This research could revolutionize how we interact with digital content by enabling users to experience their text prompts in previously unimaginable ways. The images and depth maps generated by LDM3D allow users to turn the text description of a serene tropical beach, a modern skyscraper or a sci-fi universe into a detailed 360-degree panorama. This ability to capture depth information can instantly enhance overall realism and immersion, enabling innovative applications for industries ranging from entertainment and gaming to interior design and real estate listings, as well as virtual museums and immersive virtual reality (VR) experiences.

On June 20, LDM3D won the award for best poster at the 3DMV workshop at CVPR.

How it works: LDM3D was trained on a dataset constructed from a subset of 10,000 samples of the LAION-400M database, which contains over 400 million image-caption pairs. The team used the Dense Prediction Transformer (DPT) depth estimation model (previously developed at Intel Labs) to annotate the training corpus. The DPT-Large model provides highly accurate relative depth for each pixel in an image. The LAION-400M dataset was constructed for research purposes to enable testing model training at larger scale for a wide range of researchers and other interested communities.
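As an illustration of that annotation step, the sketch below labels an image with relative depth using the publicly released DPT model via the Hugging Face transformers library. The "Intel/dpt-large" checkpoint name is an assumption, and this approximates the described process rather than reproducing Intel’s exact training code.

```python
# Illustrative sketch: annotating a training image with DPT relative depth.
# Assumes the `transformers` DPT integration and the "Intel/dpt-large"
# checkpoint; an approximation of the described pipeline, not Intel's code.
import torch
from PIL import Image
from transformers import DPTImageProcessor, DPTForDepthEstimation

processor = DPTImageProcessor.from_pretrained("Intel/dpt-large")
model = DPTForDepthEstimation.from_pretrained("Intel/dpt-large")

image = Image.open("sample.jpg")  # hypothetical dataset sample
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    predicted_depth = model(**inputs).predicted_depth  # shape (1, H', W')

# Resize the prediction back to the original image resolution.
depth = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),
    size=image.size[::-1],  # PIL size is (W, H); interpolate expects (H, W)
    mode="bicubic",
    align_corners=False,
).squeeze()
```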

The LDM3D model was trained on an Intel AI supercomputer powered by Intel® Xeon® processors and Intel® Habana Gaudi® AI accelerators. The resulting model and pipeline combine a generated RGB image with a depth map to create 360-degree views for immersive experiences.
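To make the "RGB image plus depth map to 360-degree view" step concrete, the sketch below back-projects an equirectangular RGB-D panorama into a 3D point cloud. This illustrates the geometry involved; it is not code from Intel’s pipeline, which renders the experience through TouchDesigner as described below.

```python
# Illustrative sketch (not Intel's actual pipeline): back-projecting an
# equirectangular RGB-D panorama into a 3D point cloud. Each pixel maps to
# spherical angles, and the depth value scales the corresponding unit ray.
import numpy as np

def panorama_to_pointcloud(rgb: np.ndarray, depth: np.ndarray):
    """rgb: (H, W, 3) uint8 panorama; depth: (H, W) relative depth values."""
    h, w = depth.shape
    # Longitude spans [-pi, pi) across the width; latitude spans
    # [pi/2, -pi/2] from the top row to the bottom row.
    lon = (np.arange(w) / w - 0.5) * 2.0 * np.pi
    lat = (0.5 - np.arange(h) / h) * np.pi
    lon, lat = np.meshgrid(lon, lat)

    # Unit ray directions on the sphere, scaled by per-pixel depth.
    x = depth * np.cos(lat) * np.sin(lon)
    y = depth * np.sin(lat)
    z = depth * np.cos(lat) * np.cos(lon)

    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    colors = rgb.reshape(-1, 3)
    return points, colors
```

Once in point-cloud or mesh form, the scene can be rendered from a moving viewpoint, which is what gives the generated panoramas their sense of depth and immersion.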

To demonstrate the potential of LDM3D, Intel and Blockade researchers developed DepthFusion, an application that leverages standard 2D RGB images and depth maps to create an immersive and interactive 360-degree viewing experience. DepthFusion uses TouchDesigner, a node-based visual programming language for real-time interactive multimedia, to turn text prompts into interactive and immersive digital experiences. The LDM3D model is a single model for generating both an RGB image and its depth map, resulting in memory footprint savings and latency improvements.

What’s next: The introduction of LDM3D and DepthFusion paves the way for further advancements in multi-view generative AI and computer vision. Intel will continue exploring the use of generative AI to augment human capabilities and to build a strong ecosystem of open source AI research and development that democratizes access to this technology. Continuing Intel’s strong support of the open ecosystem in AI, LDM3D is open source through Hugging Face. This will allow AI researchers and practitioners to further improve the system and fine-tune it for custom applications.

More context: Intel’s research will be presented at the IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), June 18-22. For more information, see “LDM3D: Latent Diffusion Model for 3D” or view the LDM3D demo.

About Intel

Intel (Nasdaq: INTC) is an industry leader creating world-changing technologies that enable global progress and enrich lives. Inspired by Moore’s Law, we continuously work to advance semiconductor design and manufacturing to help address our customers’ greatest challenges. By embedding intelligence in the cloud, the web, the edge and any computing device, we unlock the potential of data to transform business and society for the better. To learn more about Intel’s innovations, visit newsroom.intel.com and intel.com.

Laura Stadler

laura.stadler@intel.com

Source: Intel
