GeNVS: Generative Novel View Synthesis with 3D-Aware Diffusion Models

  • Eric R. Chan * 1, 2
  • Koki Nagano * 2
  • Matthew A. Chan * 2
  • Alexander W. Bergman * 1
  • Jeong Joon Park * 1
  • Axel Levy 1
  • Miika Aittala 2
  • Shalini De Mello 2
  • Tero Karras 2
  • Gordon Wetzstein 1
  • 1 Stanford University
  • 2 NVIDIA
  • * Equal contribution.

We present a diffusion-based model for 3D-aware generative novel view synthesis from as few as a single input image. Our model samples from the distribution of possible renderings consistent with the input and, even in the presence of ambiguity, is capable of rendering diverse and plausible novel views. To achieve this, our method makes use of existing 2D diffusion backbones but, crucially, incorporates geometry priors in the form of a 3D feature volume. This latent feature field captures the distribution over possible scene representations and improves our method's ability to generate view-consistent novel renderings. In addition to generating novel views, our method has the ability to autoregressively synthesize 3D-consistent sequences. We demonstrate state-of-the-art results on synthetic renderings and room-scale scenes; we also show compelling results for challenging, real-world objects.

  • Code on GitHub (Coming Soon)

We lift and aggregate features from input image(s) into a 3D feature field. Given a query viewpoint, we volume-render a feature image to condition a U-Net image denoiser. The entire model, including feature encoder, volume renderer, and U-Net components, is trained end-to-end as an image-conditional diffusion model. At inference, we generate consistent sequences in an auto-regressive fashion.
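
As a rough illustration of this pipeline, the sketch below lifts 2D image features into a feature volume and alpha-composites them into a 2D feature image that could condition a denoiser. All module names, shapes, and the simplified unprojection are our own assumptions, not the authors' implementation.

```python
# Illustrative PyTorch sketch (not the authors' code): lift 2D features into a
# feature volume, then volume-render a 2D feature image for a query view.
import torch
import torch.nn as nn

class FeatureVolumeRenderer(nn.Module):
    def __init__(self, feat_dim=16, depth_bins=32):
        super().__init__()
        self.encoder = nn.Conv2d(3, feat_dim, 3, padding=1)   # stand-in for a deep 2D encoder
        self.to_density = nn.Linear(feat_dim, 1)
        self.depth_bins = depth_bins

    def lift(self, image):
        # The real method unprojects features along camera rays using intrinsics/pose;
        # here we simply replicate the 2D features over depth as a placeholder.
        feats = self.encoder(image)                            # (B, C, H, W)
        return feats.unsqueeze(2).expand(-1, -1, self.depth_bins, -1, -1)  # (B, C, D, H, W)

    def render(self, volume):
        # Alpha-composite features along the depth axis (volume rendering of features).
        vol = volume.permute(0, 2, 3, 4, 1)                    # (B, D, H, W, C)
        sigma = torch.relu(self.to_density(vol))               # (B, D, H, W, 1)
        alpha = 1.0 - torch.exp(-sigma)
        trans = torch.cumprod(
            torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-7], dim=1), dim=1
        )[:, :-1]
        weights = alpha * trans
        feat_img = (weights * vol).sum(dim=1)                  # (B, H, W, C)
        return feat_img.permute(0, 3, 1, 2)                    # (B, C, H, W)
```

The rendered feature image would then be injected into the U-Net denoiser alongside the noisy target image, and the whole stack trained with the usual denoising objective.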

The following videos demonstrate novel view synthesis with our method, which produces high-quality, multi-view-consistent renderings on varied datasets.

Images, text and video files on this site are made freely available for non-commercial use under the Creative Commons CC BY-NC 4.0 license. Feel free to use any of the material in your own work, as long as you give us appropriate credit by mentioning the title and author list of our paper.

Acknowledgments

We thank David Luebke, Samuli Laine, Tsung-Yi Lin, and Jaakko Lehtinen for feedback on drafts and early discussions. We thank Jonáš Kulhánek and Xuanchi Ren for thoughtful communications and for providing results and data for comparisons. We thank Trevor Chan for help with figures. Koki Nagano and Eric Chan were partially supported by DARPA's Semantic Forensics (SemaFor) contract (HR0011-20-3-0005). JJ Park was supported by ARL grant W911NF-21-2-0104. This project was in part supported by Samsung, the Stanford Institute for Human-Centered AI (HAI), and a PECASE from the ARO. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Government. Distribution Statement "A" (Approved for Public Release, Distribution Unlimited). We based this website on the StyleGAN3 website template.

Novel View Synthesis from a Single Unposed Image via Unsupervised Learning

Index Terms: Information systems > Information systems applications > Multimedia information systems

Published in: ACM Transactions on Multimedia Computing, Communications, and Applications (Association for Computing Machinery, New York, NY, United States).

Author Tags

  • Multimedia applications
  • unsupervised single-view synthesis
  • token transformation module
  • view generation module
  • Research-article

Funding Sources

  • National Key R&D Program of China
  • National Natural Science Foundation of China
  • Natural Science Foundation of Tianjin
  • China Postdoctoral Science Foundation

Fast and Explicit Neural View Synthesis

Authors Pengsheng Guo, Miguel Angel Bautista, Alex Colburn, Liang Yang, Daniel Ulbricht, Joshua M. Susskind, Qi Shan

We study the problem of novel view synthesis from sparse source observations of a scene comprised of 3D objects. We propose a simple yet effective approach that is neither continuous nor implicit, challenging recent trends on view synthesis. Our approach explicitly encodes observations into a volumetric representation that enables amortized rendering. We demonstrate that although continuous radiance field representations have gained a lot of attention due to their expressive power, our simple approach obtains comparable or even better novel view reconstruction quality compared with state-of-the-art baselines while increasing rendering speed by over 400x. Our model is trained in a category-agnostic manner and does not require scene-specific optimization. Therefore, it is able to generalize novel view synthesis to object categories not seen during training. In addition, we show that with our simple formulation, we can use view synthesis as a self-supervision signal for efficient learning of 3D geometry without explicit 3D supervision.
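
To make the speed argument concrete, here is a small, hypothetical comparison: querying an explicit feature grid is a single trilinear interpolation, whereas an implicit field must run an MLP for every sample point. Grid sizes and layer widths below are arbitrary, not the paper's configuration.

```python
# Sketch of why an explicit voxel grid renders faster than an implicit MLP field.
import torch
import torch.nn.functional as F

B, C, D, H, W = 1, 16, 64, 64, 64
voxel_grid = torch.randn(B, C, D, H, W)          # explicit scene representation
pts = torch.rand(B, 1, 1, 200_000, 3) * 2 - 1    # sample points in [-1, 1]^3

# Explicit: one gather + trilinear interpolation for all points at once.
feats_explicit = F.grid_sample(voxel_grid, pts, align_corners=True)  # (B, C, 1, 1, N)

# Implicit (for comparison): an MLP must be evaluated at every sample point,
# which costs far more FLOPs per point than a lookup.
mlp = torch.nn.Sequential(torch.nn.Linear(3, 256), torch.nn.ReLU(), torch.nn.Linear(256, C))
feats_implicit = mlp(pts.reshape(-1, 3))          # (N, C)
```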

Related readings and updates.

  • Pseudo-Generalized Dynamic View Synthesis from a Video
  • Efficient-3DiM: Learning a Generalizable Single Image Novel View Synthesizer in One Day

Novel View Synthesis with Diffusion Models

3D generation from a single image.

We present 3DiM, a diffusion model for 3D novel view synthesis, which is able to translate a single input view into consistent and sharp completions across many views. The core component of 3DiM is a pose-conditional image-to-image diffusion model, which is trained to take a source view and its pose as inputs, and generates a novel view for a target pose as output. 3DiM can then generate multiple views that are approximately 3D consistent using a novel technique called stochastic conditioning . At inference time, the output views are generated autoregressively. When generating each novel view, one selects a random conditioning view from the set of previously generated views at each denoising step. We demonstrate that stochastic conditioning significantly improves 3D consistency compared to a naive sampler for an image-to-image diffusion model, which involves conditioning on a single fixed view. We compare 3DiM to prior work on the SRN ShapeNet dataset, demonstrating that 3DiM's generated completions from a single view achieve much higher fidelity, while being approximately 3D consistent. We also introduce a new evaluation methodology, 3D consistency scoring , to quantify the 3D consistency of a generated object by training a neural field on the model's output views. 3DiM is geometry free, does not rely on hyper-networks or test-time optimization for novel view synthesis, and allows a single model to easily scale to a large number of scenes.

Authored by Daniel Watson, William Chan, Ricardo Martin-Brualla, Jonathan Ho, Andrea Tagliasacchi and Mohammad Norouzi from Google Research.

Examples: ShapeNet renders to 3D

We show samples from a single 3DiM trained on all of ShapeNet. The source views were taken from ShapeNet objects we did not include as part of the training data.

This 3DiM only takes the relative pose as input, rather than source and target absolute poses. This allows us to test the model on arbitrary images where no pose is available, as we only ask for relative changes in pose.

To increase robustness, the training view pairs are rendered with objects at a random orientation and varying scales. We rendered 128 pairs (256 views) for each training object using kubric . We also add random hue augmentation during training. The model was trained for 840K training steps at batch size 512, and has 471M parameters. We use 256 denoising steps at generation time with classifier-free guidance (with a guidance weight of 6).
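
For reference, classifier-free guidance at sampling time combines a conditional and an unconditional noise prediction. A hedged sketch follows, with `denoiser` standing in for a hypothetical pose-conditional image-to-image diffusion model and one common parameterization, (1 + w) * eps_cond - w * eps_uncond, so that the guidance weight mentioned above would be w = 6.

```python
# Sketch of a classifier-free guidance step (hypothetical denoiser interface).
import torch

def guided_eps(denoiser, x_t, src_view, rel_pose, t, w=6.0):
    eps_cond = denoiser(x_t, src_view, rel_pose, t)                      # conditional prediction
    eps_uncond = denoiser(x_t, torch.zeros_like(src_view), rel_pose, t)  # conditioning dropped
    return (1 + w) * eps_cond - w * eps_uncond                           # guided noise estimate
```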

Examples: in-the-wild images to 3D

These samples are from the same 3DiM as the first set of samples. The source images, however, were taken directly from the internet, ensuring only that they correspond to ShapeNet classes, have white backgrounds, and contain little to no shadow.

Examples: Imagen to 3D

These samples are from a single 3DiM trained on all of ShapeNet. The source images were produced with Imagen , an AI system that converts text into images.

To make Imagen generate objects on white backgrounds, we inpaint a 5px white border when generating a 64x64 image from text, and then upsample the images to 128x128 using a text-conditional super-resolution model.

Comparisons to prior work

We compare against prior state-of-the-art methods on novel view synthesis from few images on the SRN ShapeNet benchmark. The methods whose outputs we could acquire all guarantee 3D consistency, due to the use of volume rendering (unlike 3DiM). We render the same trajectories given the same conditioning image.

Prior methods directly regress outputs, often leading to severe blurriness. We show that 3DiM overcomes this problem: it is a generative model by design, and diffusion models have a natural inductive bias towards generating much sharper samples. Below we show more samples from the 3DiMs we trained for prior-work comparisons: a 471M-parameter 3DiM for cars, and a 1.3B-parameter 3DiM for chairs.

State-of-the-art FID scores on SRN ShapeNet

3DiM Technical Details

Generation with 3DiM -- We propose stochastic conditioning, a new sampling strategy in which we generate views autoregressively with an image-to-image diffusion model. At each denoising step, we condition on a random previous view, so that with enough denoising steps the denoising process is guided to be 3D consistent with all previous frames.
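
A minimal sketch of this sampler, with `denoise_step` standing in for one reverse-diffusion update of the pose-conditional model (function names and shapes are illustrative, not the paper's code):

```python
# Stochastic conditioning: generate views autoregressively, re-drawing the
# conditioning frame from all previously generated views at every denoising step.
import random
import torch

def stochastic_conditioning_sample(denoise_step, input_view, target_poses,
                                   num_steps=256, shape=(3, 128, 128)):
    views = [input_view]                        # start from the given source view
    for pose in target_poses:
        x = torch.randn(shape)                  # each new view starts from pure noise
        for t in reversed(range(num_steps)):
            cond = random.choice(views)         # pick a random previous frame each step
            x = denoise_step(x, cond, pose, t)  # one reverse step conditioned on `cond`
        views.append(x)
    return views[1:]                            # the newly generated frames
```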

3DiM research highlights

  • We demonstrate the effectiveness of diffusion models for novel view synthesis.
  • Stochastic conditioning -- novel sampler to achieve approximate 3D consistency.
  • X-UNet -- improved results by modifying the usual image-to-image UNet to use weight-sharing and cross-attention.
  • 3D consistency scoring -- new evaluation method to quantify 3D consistency of geometry-free models.

X-UNet -- Our proposed changes to the image-to-image UNet, which we show are critical to achieve high-quality results.
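
A toy sketch of the weight-sharing plus cross-attention idea (not the paper's architecture): both frames pass through the same convolution, and the target stream attends to the source stream.

```python
# Toy block illustrating shared weights across frames and source->target cross-attention.
import torch
import torch.nn as nn

class SharedCrossAttnBlock(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.conv = nn.Conv2d(dim, dim, 3, padding=1)      # shared weights for both frames
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, src, tgt):
        # src, tgt: (B, C, H, W) feature maps of the source and noisy target frame
        src, tgt = self.conv(src), self.conv(tgt)          # same conv applied to both streams
        B, C, H, W = tgt.shape
        q = tgt.flatten(2).transpose(1, 2)                 # (B, HW, C) target queries
        kv = src.flatten(2).transpose(1, 2)                # (B, HW, C) source keys/values
        attn_out, _ = self.attn(q, kv, kv)
        return src, tgt + attn_out.transpose(1, 2).reshape(B, C, H, W)
```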

TOSS: High-quality Text-guided Novel View Synthesis from a Single Image

Yukai Shi 1,3, Jianan Wang 3, He Cao 2,3, Boshi Tang 1,3, Xianbiao Qi 3, Tianyu Yang 3, Yukun Huang 3, Shilong Liu 1,3, Lei Zhang 3, Heung-Yeung Shum 1,3

1 Tsinghua University   2 Hong Kong University of Science and Technology   3 International Digital Economy Academy

TOSS utilizes text as semantic guidance to further constrain the solution space of NVS, and generates more plausible, controllable, multiview-consistent novel view images from a single image.

TOSS introduces text to the task of novel view synthesis (NVS) from just a single RGB image. While Zero-1-to-3 has demonstrated impressive zero-shot open-set NVS capability, it treats NVS as a pure image-to-image translation problem. This approach suffers from the challengingly under-constrained nature of single-view NVS: the process lacks means of explicit user control and often results in implausible NVS generations. To address this limitation, TOSS uses text as high-level semantic information to constrain the NVS solution space. TOSS fine-tunes text-to-image Stable Diffusion pre-trained on large-scale text-image pairs and introduces modules specifically tailored to image and camera pose conditioning, as well as dedicated training for pose correctness and preservation of fine details. Comprehensive experiments are conducted with results showing that our proposed TOSS outperforms Zero-1-to-3 with more plausible, controllable and multiview-consistent NVS results. We further support these results with comprehensive ablations that underscore the effectiveness and potential of the introduced semantic guidance and architecture design.
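
As a rough, hypothetical illustration of such conditioning (not TOSS's actual modules), text tokens, image tokens, and an embedded relative camera pose can be concatenated into a single context sequence for the diffusion U-Net's cross-attention layers:

```python
# Hypothetical conditioning sketch: fuse text, image, and pose tokens into one
# cross-attention context. The pose parameterization and sizes are assumptions.
import torch
import torch.nn as nn

class NVSConditioner(nn.Module):
    def __init__(self, dim=768):
        super().__init__()
        self.pose_mlp = nn.Sequential(nn.Linear(4, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, text_tokens, image_tokens, rel_pose):
        # text_tokens: (B, Lt, dim), image_tokens: (B, Li, dim)
        # rel_pose: (B, 4), e.g. (d_azimuth, d_elevation, d_radius, 1) -- illustrative only
        pose_token = self.pose_mlp(rel_pose).unsqueeze(1)                 # (B, 1, dim)
        return torch.cat([text_tokens, image_tokens, pose_token], dim=1)  # cross-attention context
```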

The pipeline of TOSS (Left) and our conditioning mechanisms (Right).

Comparing previous image conditioning mechanisms (a-b) and TOSS (c).

Novel View Synthesis Results on the GSO and RTMV Datasets

Quantitative comparison of single-view novel view synthesis on GSO and RTMV.

3D consistency scores on GSO and RTMV.

Qualitative comparison of single-view NVS, on GSO (Left) and RTMV (Right).

NVS examples using TOSS on Synthetic NeRF dataset.

Random sampled novel views using TOSS.

3D Generation Results on the GSO and RTMV Datasets

Quantitative comparison of single-view 3D reconstruction on GSO and RTMV.

Qualitative comparison of 3D reconstruction on GSO and RTMV.

3D generation results based on in-the-wild images.

Novel View Synthesis Competition

Description.

Welcome to the Novel View Synthesis Competition!

View synthesis is the task of generating novel views of a scene or object from a given set of input views. It is a challenging and important problem in computer vision and graphics, with significant applications in virtual and augmented reality, 3D reconstruction, video editing, and more. In particular, this competition focuses on two challenging conditions: (1) only a single reference view is available, and (2) a large view change is requested. We hope this competition brings researchers together to explore and advance state-of-the-art algorithms such as neural rendering and large generative models, producing high-quality views at high efficiency.

Below are some sample image pairs to give you an idea of the task:

(Sample image pairs: Given View / Human Face)

We host the competition on CodaLab. Detailed rules can be found on the competition rules page. Teams that provide top solutions have the opportunity to present at IEEE ICIP, Abu Dhabi, UAE.

Important Dates

  • Data Available: 01/26/2024
  • Qualification Deadline: 02/23/2024
  • Final Competition Deadline: 03/01/2024 => 03/30/2024
  • Report and Code Verification: 03/06/2024 => 03/30/2024
  • Winner Announcement: 03/13/2024 => 03/30/2024
  • Winner(s) submit 2-pager papers to ICIP: 03/27/2024 => 04/03/2024

Links

  • Stable Zero123, 2023
  • DiffPortrait3D: Single-Portrait Novel View Synthesis with 3D-Aware Diffusion, 2024
  • ViVid-1-to-3: Novel View Synthesis with Video Diffusion Models, 2024
  • Novel View Synthesis with View-Dependent Effects from a Single Image, 2024
  • Free3D: Consistent Novel View Synthesis without 3D Representation, 2024
  • MultiDiff: Consistent Novel View Synthesis from a Single Image, 2024
  • DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation, 2024
  • LRM: Large Reconstruction Model for Single Image to 3D, 2024
  • SyncDreamer: Generating Multiview-consistent Images from a Single-view Image, 2024
  • DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model, 2024

Prizes

  • 1st Place: US$150
  • 2nd Place: US$100

NerfBaselines: Consistent and Reproducible Evaluation of Novel View Synthesis Methods

  • Kulhanek, Jonas
  • Sattler, Torsten

Novel view synthesis is an important problem with many applications, including AR/VR, gaming, and simulations for robotics. With the recent rapid development of Neural Radiance Fields (NeRFs) and 3D Gaussian Splatting (3DGS) methods, it is becoming difficult to keep track of the current state of the art (SoTA) due to methods using different evaluation protocols, codebases being difficult to install and use, and methods not generalizing well to novel 3D scenes. Our experiments support this claim by showing that tiny differences in evaluation protocols of various methods can lead to inconsistent reported metrics. To address these issues, we propose a framework called NerfBaselines, which simplifies the installation of various methods, provides consistent benchmarking tools, and ensures reproducibility. We validate our implementation experimentally by reproducing numbers reported in the original papers. To further improve the accessibility, we release a web platform where commonly used methods are compared on standard benchmarks. Web: https://jkulhanek.com/nerfbaselines
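
A tiny, self-contained example of the kind of protocol sensitivity the paper highlights: computing PSNR on float images versus on images first quantized to 8 bits already yields slightly different numbers for the same prediction (the data below is synthetic and purely illustrative).

```python
# PSNR under two slightly different evaluation protocols (float vs. 8-bit quantized).
import numpy as np

def psnr(pred, gt):
    mse = np.mean((pred - gt) ** 2)
    return -10.0 * np.log10(mse)   # assumes images in [0, 1]

rng = np.random.default_rng(0)
gt = rng.random((64, 64, 3)).astype(np.float32)
pred = np.clip(gt + rng.normal(0, 0.05, gt.shape).astype(np.float32), 0, 1)

print(psnr(pred, gt))                                              # float protocol
print(psnr(np.round(pred * 255) / 255, np.round(gt * 255) / 255))  # quantized protocol
```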

  • Computer Science - Computer Vision and Pattern Recognition

SG-NeRF: Sparse-Input Generalized Neural Radiance Fields for Novel View Synthesis

  • Regular Paper
  • Special Section of CVM 2024
  • Published: 26 June 2024

  • Kuo Xu  ( 徐 阔 ) 1 ,
  • Jie Li  ( 李 颉 ) 1 , 2 ,
  • Zhen-Qiang Li  ( 李振强 ) 3 &
  • Yang-Jie Cao  ( 曹仰杰 ) 1  

Traditional neural radiance fields for rendering novel views require intensive input images and pre-scene optimization, which limits their practical applications. We propose a generalization method, named SG-NeRF (Sparse-Input Generalized Neural Radiance Fields), that infers scenes from input images and performs high-quality rendering without pre-scene optimization. Firstly, we construct an improved multi-view stereo structure based on the convolutional attention and multi-level fusion mechanism to obtain the geometric features and appearance features of the scene from the sparse input images, and then these features are aggregated by multi-head attention as the input of the neural radiance fields. This strategy of utilizing neural radiance fields to decode scene features instead of mapping positions and orientations enables our method to perform cross-scene training as well as inference, thus enabling neural radiance fields to generalize for novel view synthesis on unseen scenes. We tested the generalization ability on the DTU dataset, and our PSNR (peak signal-to-noise ratio) improved by 3.14 compared with the baseline method under the same input conditions. In addition, if the scene has dense input views available, the average PSNR can be improved by 1.04 through further refinement training in a short time, and a higher quality rendering effect can be obtained.
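
A hedged sketch of the aggregation idea, with assumed layer sizes and a learned query (not SG-NeRF's exact architecture): per-view features sampled at a 3D point are fused by multi-head attention, and the fused feature, rather than a raw position and direction, is decoded into density and color.

```python
# Multi-head attention over per-source-view features at each 3D sample point.
import torch
import torch.nn as nn

class AttentionAggregator(nn.Module):
    def __init__(self, feat_dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, heads, batch_first=True)
        self.query = nn.Parameter(torch.randn(1, 1, feat_dim))
        self.decoder = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 4))

    def forward(self, view_feats):
        # view_feats: (num_points, num_views, feat_dim) features sampled from each source view
        q = self.query.expand(view_feats.shape[0], -1, -1)
        fused, _ = self.attn(q, view_feats, view_feats)            # (num_points, 1, feat_dim)
        sigma_rgb = self.decoder(fused.squeeze(1))                 # (num_points, 4)
        return sigma_rgb[:, :1], torch.sigmoid(sigma_rgb[:, 1:])   # density, color
```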

Author information

Authors and affiliations.

School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou, 450002, China

Kuo Xu  ( 徐 阔 ), Jie Li  ( 李 颉 ) & Yang-Jie Cao  ( 曹仰杰 )

Intelligent Big Data System (iBDSys) Lab, Shanghai Jiao Tong University, Shanghai, 200240, China

Jie Li  ( 李 颉 )

School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou, 450001, China

Zhen-Qiang Li  ( 李振强 )

Corresponding author

Correspondence to Yang-Jie Cao  ( 曹仰杰 ) .

Ethics declarations

Conflict of Interest The authors declare that they have no conflict of interest.

Additional information

This work is supported by the Zhengzhou Collaborative Innovation Major Project under Grant No. 20XTZX06013 and the Henan Provincial Key Scientific Research Project under Grant No. 22A520042.

Kuo Xu is pursuing his M.S. degree in computer science and technology at the Hanwei Internet of Things Research Institute, School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou. He received his Bachelor’s degree in software engineering from Henan Polytechnic University, Jiaozuo, in 2021. His main research interests include 3D reconstruction and 3D generation.

Jie Li is an Endowed Chair Professor in computer science and engineering of Shanghai Jiao Tong University (SJTU), Shanghai. He is an IEEE fellow. His current research interests include big data, AI, blockchain, network systems and security, and smart

Zhen-Qiang Li is currently a Ph.D. candidate in software engineering at the School Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou. He received his M.Eng. degree in modern equipment engineering from Huazhong Agricultural University, Wuhan, in 2020. His current research interests include 3D generation and 3D scene understanding.

Yang-Jie Cao is a professor, doctoral supervisor, and director of the Institute of Internet of Things Engineering, School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou. He got his Ph.D. degree from Xi’an Jiaotong University, Xi’an, in 2012. His research interests are machine intelligence and human-computer interaction, intelligent processing of big data, cloud computing, and high-performance computing.

About this article

Xu, K., Li, J., Li, ZQ. et al. SG-NeRF: Sparse-Input Generalized Neural Radiance Fields for Novel View Synthesis. J. Comput. Sci. Technol. (2024). https://doi.org/10.1007/s11390-024-4157-6

Download citation

Received : 29 January 2024

Accepted : 29 March 2024

Published : 26 June 2024

DOI : https://doi.org/10.1007/s11390-024-4157-6

  • Neural Radiance Fields (NeRF)
  • multi-view stereo (MVS)
  • new view synthesis (NVS)

Multi-view to Novel View: Synthesizing Novel Views with Self-Learned Confidence

Proceedings of the 15th European Conference on Computer Vision, 2018 · Shao-Hua Sun, Minyoung Huh, Yuan-Hong Liao, Ning Zhang, Joseph J. Lim

We address the task of multi-view novel view synthesis, where we are interested in synthesizing a target image with an arbitrary camera pose from given source images. We propose an end-to-end trainable framework that learns to exploit multiple viewpoints to synthesize a novel view without any 3D supervision. Specifically, our model consists of a flow prediction module and a pixel generation module to directly leverage information presented in source views as well as hallucinate missing pixels from statistical priors. To merge the predictions produced by the two modules given multi-view source images, we introduce a self-learned confidence aggregation mechanism. We evaluate our model on images rendered from 3D object models as well as real and synthesized scenes. We demonstrate that our model is able to achieve state-of-the-art results as well as progressively improve its predictions when more source images are available.
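
A minimal sketch of such confidence-weighted aggregation, assuming each module outputs a candidate target image and an unnormalized confidence map (names and shapes are illustrative, not the authors' code):

```python
# Pixel-wise softmax blend of the flow-based and pixel-generation predictions.
import torch

def aggregate(flow_pred, flow_conf, pixel_pred, pixel_conf):
    # *_pred: (B, 3, H, W) candidate images; *_conf: (B, 1, H, W) unnormalized confidences
    weights = torch.softmax(torch.cat([flow_conf, pixel_conf], dim=1), dim=1)  # (B, 2, H, W)
    return weights[:, :1] * flow_pred + weights[:, 1:] * pixel_pred
```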

Results from the Paper

Task | Dataset | Model | Metric | Value | Global Rank
Novel View Synthesis | KITTI Novel View Synthesis | Multi-view to Novel View | SSIM | 0.626 | #1
Novel View Synthesis | ShapeNet Car | Multi-view to Novel View | SSIM | 0.923 | #1
Novel View Synthesis | ShapeNet Chair | Multi-view to Novel View | SSIM | 0.895 | #1
Novel View Synthesis | Synthia Novel View Synthesis | Multi-view to Novel View | SSIM | 0.697 | #1

STs-NeRF: Novel View Synthesis of Space Targets Based on Improved Neural Radiance Fields

Share and Cite

Ma, K.; Liu, P.; Sun, H.; Teng, J. STs-NeRF: Novel View Synthesis of Space Targets Based on Improved Neural Radiance Fields. Remote Sens. 2024, 16, 2327. https://doi.org/10.3390/rs16132327

Ma K, Liu P, Sun H, Teng J. STs-NeRF: Novel View Synthesis of Space Targets Based on Improved Neural Radiance Fields. Remote Sensing. 2024; 16(13):2327. https://doi.org/10.3390/rs16132327

Ma, Kaidi, Peixun Liu, Haijiang Sun, and Jiawei Teng. 2024. "STs-NeRF: Novel View Synthesis of Space Targets Based on Improved Neural Radiance Fields" Remote Sensing 16, no. 13: 2327. https://doi.org/10.3390/rs16132327

  • DOI: 10.1007/s10008-024-05973-9
  • Corpus ID: 270625653

Synthesis and performance of novel n-type EC/AIE energy storage electrode materials containing triphenylamine and quinazolin groups

  • Pengna Wang , Shengqing Zheng , Bao-Ping Lin
  • Published in Journal of Solid State… 19 June 2024
  • Chemistry, Materials Science, Engineering

Computer Science > Computer Vision and Pattern Recognition

Title: Deep Learning based Novel View Synthesis

Abstract: Predicting novel views of a scene from real-world images has always been a challenging task. In this work, we propose a deep convolutional neural network (CNN) which learns to predict novel views of a scene from a given collection of images. In comparison to prior deep learning based approaches, which can handle only a fixed number of input images to predict the novel view, the proposed approach works with different numbers of input images. The proposed model explicitly performs feature extraction and matching from a given pair of input images and estimates, at each pixel, the probability distribution (pdf) over possible depth levels in the scene. This pdf is then used for estimating the novel view. The model estimates multiple predictions of the novel view, one estimate per input image pair, from the given image collection. The model also estimates an occlusion mask and combines multiple novel view estimates into a single optimal prediction. The finite number of depth levels used in the analysis may cause occasional blurriness in the estimated view. We mitigate this issue with simple multi-resolution analysis which improves the quality of the estimates. We substantiate the performance on different datasets and show competitive performance.
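
A small sketch of the core idea, assuming a precomputed plane-sweep volume (the source image warped into the target view at each candidate depth); the per-pixel depth distribution then weights the planes. Names and shapes are illustrative, not the paper's code.

```python
# Expected novel view from a per-pixel probability distribution over depth levels.
import torch

def expected_view(depth_logits, warped):
    # depth_logits: (B, D, H, W) per-pixel scores over D depth levels
    # warped:       (B, D, 3, H, W) source image warped to the target view at each depth
    p = torch.softmax(depth_logits, dim=1).unsqueeze(2)   # (B, D, 1, H, W) per-pixel pdf
    return (p * warped).sum(dim=1)                        # (B, 3, H, W) expected novel view
```
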
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2107.06812 [cs.CV]

COMMENTS

  1. Novel View Synthesis

    Find papers, benchmarks, datasets and libraries for novel view synthesis, a computer vision task of generating images from arbitrary viewpoints. Explore methods such as NeRF, MPI, DONeRF, X3D and more.

  2. Novel View Synthesis: From Depth-Based Warping to Multi-Plane Images

    Learn about the problem, context, and taxonomy of novel view synthesis, a long-standing problem at the intersection of computer graphics and computer vision. Watch the videos of the talks by the researchers behind the most recent approaches, such as SynSin, Multiplane Images, NeRF, and more.

  3. GeNVS: Generative Novel View Synthesis with 3D-Aware Diffusion Models

    GeNVS is a diffusion-based method that generates novel views from a single input image, incorporating 3D priors in a feature field. It can synthesize diverse and plausible renderings on complex scenes, such as objects and rooms, and produce 3D-consistent sequences.

  4. [2210.04628] Novel View Synthesis with Diffusion Models

    A paper that introduces 3DiM, a diffusion model for 3D novel view synthesis from a single input view. 3DiM uses stochastic conditioning to generate consistent and sharp completions across many views, and outperforms prior work on the SRN ShapeNet dataset.

  5. Novel View Synthesis from a Single Image via Unsupervised Learning

    ... and a novel view of an arbitrary pose is synthesized from the intrinsic representation. In other words, only views with pose information can be chosen as input for synthesis. In a practical multi-view scenario [6]-[7], such as the broadcasting of a sports event, multiple source views are captured ...

  6. Novel View Synthesis from a Single Image via Unsupervised learning

    This paper proposes an unsupervised network to generate novel views from a single source viewpoint image. The network learns a pixel transformation from a pre-defined reference pose and synthesizes an arbitrary view from the transformed features.

  7. Novel View Synthesis

    Find the latest research papers, benchmarks, datasets and libraries for novel view synthesis, a computer vision task that synthesizes a target image with an arbitrary target camera pose from given source images and their camera poses. Explore the methods, results and trends of novel view synthesis with NeRF, MPI and other techniques.

  8. Novel View Synthesis from a Single Unposed Image via Unsupervised Learning

    Novel view synthesis aims to generate novel views from one or more given source views. Although existing methods have achieved promising performance, they usually require paired views with different poses to learn a pixel transformation. This article proposes an unsupervised network to learn such a pixel transformation from a single source image.

  9. PDF Fast and Explicit Neural View Synthesis

    ... recent approaches for novel view synthesis. 3. Methodology: The novel view synthesis problem is defined as follows. Given a set $S = \{(I_i, P_i)\}_{i=0}^{n}$ of one or more source views, where a view is defined as an image $I_i \in \mathbb{R}^{3 \times h \times w}$ together with the camera pose $P_i \in SO(3)$, we want to learn a model $f_\theta$ that can reconstruct a ground-truth target image ...

  10. PDF Continuous Object Representation Networks: Novel View Synthesis without

    Novel view synthesis is the problem of generating new camera perspectives of a scene. Key challenges of novel view synthesis are inferring the scene's 3D structure and inpainting occluded and unseen parts. Existing methods differ in their generality, some aim to

  11. Fast and Explicit Neural View Synthesis

    The task of novel view synthesis aims at generating unseen perspectives of an object or scene from a limited set of input images. Nevertheless, synthesizing novel views from a single image still remains a significant challenge in the ever-evolving realm of computer vision. Previous approaches tackle this problem by adopting mesh prediction ...

  12. View synthesis

    View synthesis. In computer graphics, view synthesis, or novel view synthesis, is a task which consists of generating images of a specific subject or scene from a specific point of view, when the only available information is pictures taken from different points of view. [1]

  13. MultiDiff: Consistent Novel View Synthesis from a Single Image

    We introduce MultiDiff, a novel approach for consistent novel view synthesis of scenes from a single RGB image. The task of synthesizing novel views from a single reference image is highly ill-posed by nature, as there exist multiple, plausible explanations for unobserved areas. To address this issue, we incorporate strong priors in form of monocular depth predictors and video-diffusion models ...

  14. Novel View Synthesis with Diffusion Models

    3DiM is a diffusion model that can generate 3D consistent and sharp completions from a single input view. It uses pose-conditional image-to-image diffusion and stochastic conditioning to achieve high fidelity and 3D consistency on ShapeNet and in-the-wild images.

  15. Novel View Synthesis

    Novel View Synthesis. 353 papers with code • 18 benchmarks • 34 datasets. Synthesize a target image with an arbitrary target camera pose from given source images and their camera poses. See the Wiki for more introductions. Synthesis methods include NeRF, MPI, and so on.

  16. PDF Novel View Synthesis

    Free view synthesis: 1. Render the mesh into the target view to get its depth map $D_t$. 2. For each source image: a. Extract features (using 3 stages of an ImageNet-pretrained VGG). b. Warp each pixel into each source view using $D_t$ and get interpolated features. c. Predict intensity $\hat{I}$ and confidence images using a blending decoder (UNet+GRU) for each ...

  17. TOSS: High-quality Text-guided Novel View Synthesis from a Single Image

    TOSS utilizes text as semantic guidance to further constrain the solution space of NVS, and generates more plausible, controllable, multiview-consistent novel view images from a single image.. TOSS introduces text to the task of novel view synthesis (NVS) from just a single RGB image. While Zero-1-to-3 has demonstrated impressive zero-shot open-set NVS capability, it treats NVS as a pure image ...

  18. Novel View Synthesis with Diffusion Models

    Edit social preview. We present 3DiM, a diffusion model for 3D novel view synthesis, which is able to translate a single input view into consistent and sharp completions across many views. The core component of 3DiM is a pose-conditional image-to-image diffusion model, which takes a source view and its pose as inputs, and generates a novel view ...

  19. Novel View Synthesis Competition

    A competition to generate novel views of scenes/objects from single reference views. Learn the rules, data, and important dates for this challenging and important problem in computer vision and graphics.

  20. Novel-view Acoustic Synthesis

    To our knowledge, this work represents the very first formulation, dataset, and approach to solve the novel-view acoustic synthesis task, which has exciting potential applications ranging from AR/VR to art and design. Unlocked by this work, we believe that the future of novel-view synthesis is in multi-modal learning from videos. ...

  21. NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer

    By harnessing the potent generative capabilities of pre-trained large video diffusion models, we propose NVS-Solver, a new novel view synthesis (NVS) paradigm that operates without the need for training. NVS-Solver adaptively modulates the diffusion sampling process with the given views to enable the creation of remarkable visual experiences from single or multiple views of static ...

  22. NerfBaselines: Consistent and Reproducible Evaluation of Novel View Synthesis Methods

    Novel view synthesis is an important problem with many applications, including AR/VR, gaming, and simulations for robotics. With the recent rapid development of Neural Radiance Fields (NeRFs) and 3D Gaussian Splatting (3DGS) methods, it is becoming difficult to keep track of the current state of the art (SoTA) due to methods using different evaluation protocols, codebases being difficult to ...

  23. SG-NeRF: Sparse-Input Generalized Neural Radiance Fields for Novel View Synthesis

    Traditional neural radiance fields for rendering novel views require intensive input images and pre-scene optimization, which limits their practical applications. We propose a generalization method to infer scenes from input images and perform high-quality rendering without pre-scene optimization named SG-NeRF (Sparse-Input Generalized Neural Radiance Fields). Firstly, we construct an improved ...

  24. Multi-view to Novel View: Synthesizing Novel Views with Self-Learned Confidence

    Edit social preview. We address the task of multi-view novel view synthesis, where we are interested in synthesizing a target image with an arbitrary camera pose from given source images. We propose an end-to-end trainable framework that learns to exploit multiple viewpoints to synthesize a novel view without any 3D supervision.

  25. SinMPI: Novel View Synthesis from a Single Image with Expanded

    Single-image novel view synthesis is a challenging and ongoing problem that aims to generate an infinite number of consistent views from a single input image. Although significant efforts have been made to advance the quality of generated novel views, less attention has been paid to the expansion of the underlying scene representation, which is ...

  26. Remote Sensing

    Since Neural Radiation Field (NeRF) was first proposed, a large number of studies dedicated to them have emerged. These fields achieved very good results in their respective contexts, but they are not sufficiently practical for our project. If we want to obtain novel images of satellites photographed in space by another satellite, we must face problems like inaccurate camera focal lengths and ...

  27. Synthesis and performance of novel n-type EC/AIE energy storage electrode materials containing triphenylamine and quinazolin groups

    Semantic Scholar extracted view of "Synthesis and performance of novel n-type EC/AIE energy storage electrode materials containing triphenylamine and quinazolin groups" by Pengna Wang et al.

  28. [2107.06812] Deep Learning based Novel View Synthesis

    A novel approach to predict novel views of a scene from real-world images using a deep convolutional neural network. The paper presents the model architecture, datasets, results and limitations of the method.