Another terrain, with cheap trees made of ellipsoids and noise. It computes analytic normals for the terrain and clouds. The art is composed to the camera, as usual. Making-of tutorial: https://www.youtube.com/watch?v=BFld4EBO2RE
Shader Inputs
uniform vec3 iResolution; // viewport resolution (in pixels)
uniform float iTime; // shader playback time (in seconds)
uniform float iTimeDelta; // render time (in seconds)
uniform float iFrameRate; // shader frame rate
uniform int iFrame; // shader playback frame
uniform float iChannelTime[4]; // channel playback time (in seconds)
uniform vec3 iChannelResolution[4]; // channel resolution (in pixels)
uniform vec4 iMouse; // mouse pixel coords. xy: current (if MLB down), zw: click
uniform samplerXX iChannel0..3; // input channel. XX = 2D/Cube
uniform vec4 iDate; // (year, month, day, time in seconds)
uniform float iSampleRate; // sound sample rate (i.e., 44100)
// Copyright Inigo Quilez, 2016 - https://iquilezles.org/
// I am the sole copyright owner of this Work.
// You cannot host, display, distribute or share this Work neither
// as it is or altered, here on Shadertoy or anywhere else, in any
// form including physical and digital. You cannot use this Work in any
// commercial or non-commercial product, website or project. You cannot
// sell this Work and you cannot mint an NFTs of it or train a neural
// network with it without permission. I share this Work for educational
// purposes, and you can link to it, through an URL, proper attribution
// and unmodified screenshot, as part of your educational material. If
// these conditions are too restrictive please contact me and we'll
// definitely work it out.

// A rainforest landscape.
//
// Tutorial on Youtube : https://www.youtube.com/watch?v=BFld4EBO2RE
// Tutorial on Bilibili: https://www.bilibili.com/video/BV1Da4y1q78H
//
// Buy a metal or paper print: https://www.redbubble.com/shop/ap/39843511
//
// Normals are analytical (true derivatives) for the terrain and for the
// clouds, including the noise, the fbm and the smoothsteps.
//
// Lighting and art composed for this shot/camera. The trees are really
// ellipsoids with noise, but they kind of do the job in distance and low
// image resolutions Also I used some basic reprojection technique to
// smooth out the render.
//
// See here for more info:
//  https://iquilezles.org/articles/fbm
//  https://iquilezles.org/articles/morenoise


void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    vec2 p = fragCoord/iResolution.xy;

    vec3 col = texture( iChannel0, p ).xyz;
    //vec3 col = texelFetch( iChannel0, ivec2(fragCoord-0.5), 0 ).xyz;

    col *= 0.5 + 0.5*pow( 16.0*p.x*p.y*(1.0-p.x)*(1.0-p.y), 0.05 );

    fragColor = vec4( col, 1.0 );
}
wow
In treesMap(), why not use the floored position for calculating the tree height? vec2 n = floor( p.xz/2.0 ); float base = terrainMap(n*2.).x; The trees seem less distorted by steep terrain this way, since each point within a "cell" isn't getting a different height.
one of your video about this work led me to this wonderful world
really cool, but made my graphics card cry
It's not always worse, and it's more consistent than doing a Hammer projection and its inverse.
@ollj: you mean, computing the image at resolution 6*1024*1024 ? ;-)
if you use cubemapA as buffer for reprojection, the camera could spin and rotate wildly, without having grainier borders, and only on translation-movements you have grainy-parallaxes.
@iq Sorry for that last comment, I don't know why it posted early. So I can just find a "pretty good" step size multiplier for the terrain?
@iq
To prevent overshooting you need to divide your step size by the maximum derivative of the function (the slope of the terrain) in the neighborhood of the area you are sampling. You can do this properly, or just hack it a bit like I am doing here in line 557. First I consider the terrain without the cliffs, and I set a raymarching step multiplier of 0.8, which allows me to raymarch terrains with a maximum slope of about 51 degrees (atan(1/0.8)), which is sufficient for this terrain. Then I need to consider the cliffs, and in this case I do not take their slope, but instead simply apply an extra slowdown when we are in the neighborhood of a cliff. I extract this information as a mask (0 to 1) from the terrain building function itself, which I pass to the raymarcher in the env.y variable.
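A minimal sketch of the stepping rule described in this comment, written as a standalone Shadertoy Image shader. Everything named here is hypothetical: terrainSketch() is a toy stand-in for the real terrain function, raymarchSketch() is not the author's raymarcher, and the 0.75 slowdown strength and the 0.0005 hit tolerance are assumed values. Only the 0.8 multiplier and the idea of a 0..1 cliff mask returned in the .y component come from the comment.

```glsl
// Toy stand-in terrain (hypothetical): returns vec2( height, cliffMask ).
// Gentle hills have slope <= 0.6; the "cliff" band is much steeper.
vec2 terrainSketch( in vec2 p )
{
    float hills = 2.0*sin(0.3*p.x)*cos(0.3*p.y);
    float c     = smoothstep( 0.4, 0.6, sin(0.1*p.x) ); // 0 below the cliff, 1 above
    float mask  = 4.0*c*(1.0-c);                        // 1 in the middle of the cliff, 0 elsewhere
    return vec2( hills + 2.0*c, mask );
}

// Heightfield raymarch. Returns the hit distance, or -1.0 when nothing is hit.
float raymarchSketch( in vec3 ro, in vec3 rd, in float tmin, in float tmax )
{
    float t = tmin;
    for( int i=0; i<256; i++ )
    {
        vec3  pos = ro + t*rd;
        vec2  env = terrainSketch( pos.xz );
        float hei = pos.y - env.x;                 // vertical distance above the terrain
        if( abs(hei)<0.0005*t || t>tmax ) break;
        // 0.8: base multiplier for slopes up to about 51 degrees (from the comment).
        // (1.0-0.75*env.y): extra slowdown near cliffs (0.75 is an assumed strength).
        t += hei*0.8*(1.0-0.75*env.y);
    }
    return (t<tmax) ? t : -1.0;
}

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    vec2 p  = (2.0*fragCoord - iResolution.xy)/iResolution.y;
    vec3 ro = vec3( 0.0, 6.0, 0.0 );
    vec3 rd = normalize( vec3( p.x, p.y-0.3, 1.5 ) );

    float t  = raymarchSketch( ro, rd, 0.1, 200.0 );
    vec3 col = (t>0.0) ? vec3( exp(-0.02*t) ) : vec3( 0.4, 0.6, 0.9 );
    fragColor = vec4( col, 1.0 );
}
```

Setting the slowdown strength to 0.0 in this toy scene is a quick way to see whether the cliff band starts to show overshooting artifacts.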
solving for the overshooting of .y heightmaps has fundamentally the same issues of QUICKLY finding a local minimum of f(xy), and most methods for this involve dividing by a lot of first derivatives.
You under-step by marching only half or a quarter of the distance each step, and then you may need many more steps. Hopefully you add other smart things, like not caring about precision further away from the camera, to compensate for the under-stepping. This compensation may be done with a logarithmic precision of log(distanceToSurface/distanceFromCamera). Let s=distanceToSurface and c=distanceToCamera. if(s<0.01) is a simple arbitrary distance precision; it may be changed to if log(c*c/s/100000.)>0.0 or if log(c*c/s/s/1000000.)>0.5 or if log(c*c*c/s/s/100000000000.)>0.0, as in logeps() of https://www.shadertoy.com/view/tdXXzl, where the variants that divide by larger numbers are better for larger distance fields (seeing a moon at the horizon from a planet surface) and the shorter ones are better for smaller distance fields. Of course the precision must also be scaled to the scaling of the distance field, which is much trickier in log() scales. logeps() tends to do just fine with 1000* as many raymarch iterations, leaving plenty of room to under-step more, because most rays overstep far away from the camera and need a lot fewer iterations that way, and only near-tangential/horizon cases need more steps. You could also estimate the local 3D derivative each step, and hope that dividing by that differential will mostly normalize a skewed Lipschitz constant, but so far this has never been worth it. You could also use the same derivative to do other silly things that have not been worth it, like https://www.shadertoy.com/view/WdfSD2, which may perform better for near-parallel-horizon cases.
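As a rough illustration of under-stepping combined with a tolerance that loosens with distance from the camera, here is a small sketch using the comment's s and c names. It uses a plain linear threshold (s < relEps*c), which is a simpler stand-in and not the logarithmic logeps() from the linked shader. sdSphere(), marchSketch() and both constants are assumptions made for the example.

```glsl
// Exact signed distance to a sphere of radius r centred at the origin.
float sdSphere( in vec3 p, in float r ) { return length(p) - r; }

// Returns the distance travelled to the hit, or -1.0 on a miss.
float marchSketch( in vec3 ro, in vec3 rd )
{
    const float understep = 0.5;    // assumed: advance only half of the reported distance
    const float relEps    = 0.001;  // assumed: tolerance per unit of camera distance
    float c = 0.01;                 // c = distance to the camera
    for( int i=0; i<200; i++ )
    {
        float s = sdSphere( ro + c*rd - vec3(0.0,0.0,5.0), 1.0 ); // s = distance to the surface
        if( s < relEps*c ) return c;   // far hits accept a coarser tolerance than near hits
        c += understep*s;
        if( c > 100.0 ) break;
    }
    return -1.0;
}

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    vec2 p  = (2.0*fragCoord - iResolution.xy)/iResolution.y;
    vec3 rd = normalize( vec3( p, 2.0 ) );
    float c = marchSketch( vec3(0.0), rd );
    vec3 col = (c>0.0) ? vec3( 1.0 - 0.2*(c-4.0) ) : vec3( 0.0 );
    fragColor = vec4( col, 1.0 );
}
```

For an exact distance field like this sphere the under-step is unnecessary; it only pays off when the field overestimates distances, as a heightmap used as a distance does.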
How do you avoid over-marching when the terrain is steep? I can't seem to figure this out in my own version and I can't do very dramatic terrain with the overshooting issues.
yall i have a gtx 980 and this sits at 60fps (in 640x360)
MASTER
Signed up to Shadertoy just so I could like this.
@TinajXD Oh your poor old sad GPU! 😁
It looks great; it's sad that it runs at 9 fps.
not even zoomed in im averaging about 35-37 fps, im using a gtx 1650
@matthatter419 In the series he is showing he takes out a certain number of indexes, that is the filtering step. You can look up Fourier series and Fourier filtering.
I bought a new gaming PC for my son, so I had to test this beautiful shader again: 100 fps at 1200 x 675 windowed, AMD 5800X3D + Radeon 6800XT.
In the accompanying video (with timestamp: https://youtu.be/BFld4EBO2RE?t=1158), Inigo says we can essentially band-pass filter the landscape at synthesis time. I don't see anything in the code that does this filtering, yet the output is clearly filtered (if you remove the tree envelope, it's easy to see). Where is this filtering happening in code?
Awesome!
Inspiration
I am learning a ton from your code and accompanying video, thanks!!
It was a leftover. Removed.
Is line 308 a typo, or what is its purpose?
On my Windows machine (Core i7 6700K and GTX 1080 Ti) it does 30 fps at 1800x1013 and 60 fps at 1200x675. If I zoom to mobile mode on my 4K desktop (by zooming in with the browser) I can get it to run at around 16 fps at 2577x1124 and around 10 fps at 3610x1405 (I'm guessing it syncs the frame rate to a divisor of 60).
Casually, from visual tests in a shop(!), I'd guess the M1 in their iMacs is about equivalent to my old GTX 1060, which isn't really all that bad. I have a 2017 Intel iMac (Radeon Pro 580) that does about 22 fps at 1280x720. On my main dev machine, I'm getting 60 fps at 1200x675. I can't get it to render any larger on Windows 10 without going full screen. I don't know why, but the text editor wants to take up a lot of the screen on Windows. This is on a 4K monitor with no DPI scaling.
@iq It's Apple, there is hardly anything to configure ;) Theoretical FP32 performance should be around 5 TFLOPS. ANGLE is configured to use OpenGL. I could set it to Metal (Apple's Vulkan-like API), but I've had bad experiences with that since Chrome's support for it is lagging behind. Safari does 20-25 fps at 1280x720. The M1 is an ARM processor with an IGP. There are no fans running as far as I can hear.
I get 4.5 fps at 640x360 on my Nvidia Quadro K2100M. Sometimes performance is very different on OpenGL vs Windows.
(in 800x450 resolution)
Firefox on Linux with AMD Radeon Vega 3 Graphics (RAVEN2) 4.5 fps
I don't know anything about Mac computers, but 31-38 fps is really slow, I'm getting 60 fps in my 7 years old laptop. Maybe you have hardware rendering disabled or some other thing misconfigured.
Notebook Mac M1 Pro 16" 16-core GPU version in chrome 1280 x 720 on laptop screen between 14 and 18 fps 800 x 450 on tv screen between 31 and 38 fps
Another datapoint for 3080 12GB on debian 11 (xfce, no wayland, no overclock): Fluid 60fps in 1920x1200 in chromium, stuttery <60fps in firefox
RTX 3080 benchmark: constant 60 fps at 945x531 on all browsers; constant 60 fps at 1805x737 in some browsers (Firefox and Opera GX underperform, not holding a constant 60 fps; Microsoft Edge performed best on Win11); easily 55 fps at 1796x1054 via Microsoft Edge (my fullscreen resolution is lower, this is a mousewheel-up oversized canvas); 30 fps at 1710x1686. Higher resolutions still maintain 30 fps, but I can't tell the resolution due to suboptimal formatting on smaller displays.
col *= 0.5 + 0.5*pow( 16.0*p.x*p.y*(1.0-p.x)*(1.0-p.y), 0.05 ); is like a vignette, darkening the border. col *= 0.5*pow( 16.0*p.x*p.y*(1.0-p.x)*(1.0-p.y), 0.55 ); is an exaggerated variant.
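To make the vignette easier to reason about, the sketch below isolates it in a helper. The product 16.0*p.x*p.y*(1.0-p.x)*(1.0-p.y) equals 1 at the centre (16 * 0.5^4) and falls to 0 on every edge, so the factor runs from 1.0 at the centre towards 0.5 at the border. With the exponent 0.05 the falloff is confined to a thin band: at p = (0.1, 0.5) the product is 0.36 and the factor is still roughly 0.975. The helper name vignetteSketch and its exponent parameter k are illustrative and not part of the shader.

```glsl
// Vignette factor for normalized coordinates p in 0..1.
// k is the falloff exponent: small k keeps the darkening close to the border,
// larger k spreads it towards the centre.
float vignetteSketch( in vec2 p, in float k )
{
    float v = 16.0*p.x*p.y*(1.0-p.x)*(1.0-p.y); // 1 at the centre, 0 on the edges
    return 0.5 + 0.5*pow( v, k );
}

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    vec2 p = fragCoord/iResolution.xy;

    // Left half: the shader's exponent (0.05). Right half: a stronger 0.55 for comparison.
    float k = (p.x<0.5) ? 0.05 : 0.55;

    vec3 col = vec3( 0.8 )*vignetteSketch( p, k );
    fragColor = vec4( col, 1.0 );
}
```

Note that the exaggerated variant quoted in the comment also drops the 0.5 offset, so it goes fully black at the border; the right half here keeps the offset and only changes the exponent.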
What is the post-processing you're doing on the color using the uv?
I love these math paintings you create, and the explanation videos are excellent! Good Job!
[This comment has been hidden by the shader author]
stunning!
Extraordinary shader
Extremely beautiful.
The loop counters are scalar
@iq I checked the difference of loop unrolling using a certain vendor's shader compiler software. 1. The shader is 43028 bytes with loop unrolling, but only 20120 bytes when forcing loops. A bigger size is terrible, because it causes a lot of cache thrashing. 2. LDS is used with loop unrolling. I'm not sure why it's used, but using it will introduce some waits when multiple threads access the same LDS. 3. It looks like forcing a loop increases the scalar ALU count, so maybe some operations become scalar instead of vector? Very hard to say for sure though. Unrolled: Scalar ALU 318, Vector ALU 7363. Forced loop: Scalar ALU 327, Vector ALU 3223. I think the biggest win is the binary code size, which makes sense.
Sorry for the small window, I'm working on a 4K television. The UI still needs updates for smaller screens. I'm working on controls rendered by shaders so you can build synthesizer front panels with it as well. My current plan is to finish it as a complete synth builder and DAW, then separate it into a musicians' part and a programmers' part. Programmers can then push their synth creations for the musicians to use. I already bought the synthastic.com domain for the musicians' part.
cool ! but the code window is really small.
@FabriceNeyret & @iq What do you think of my new hobby project? https://shadersynth.com I'm making a synthesizer that runs on shaders; I got inspired by the sound shaders here. You can create instruments and effects with shaders like here. It plays MIDI files. It currently loads "Great Gig in the Sky" by Pink Floyd and you can start it with the "play" button. It doesn't have a backend yet and stores everything in local storage for now.
thanks ! it's no longer chrome.exe --use-angle=gl ?
I forced Chrome to use OpenGL by setting "Choose ANGLE graphics backend" to OpenGL in chrome://flags. Compile time for both is now around 0.5 seconds and the frame rate is also consistent at around 32 fps. It must be something in combination with D3D. Modern GPUs do have loop and branch support. I didn't really think about the branch predictor remark, but it does indeed make sense that a GPU doesn't have one, since it runs blocks of code in parallel.
NB: all functions get inlined. There is no stack on GPU. And it seems very unlikely there is branch prediction as well.
@Andre: could you please force your firefox/chrome to OpenGL mode and proceed the same benches ?
Compile time goes from 9.2 secs to 1.6 secs on my GTX 1070, Windows 10. FPS goes from 11 to 32 fps (1200 x 675). (I changed the define of ZERO to 0 for the first results.) Maybe the noise functions get inlined, and that together with loop unrolling could lead to a large code size which is maybe bigger than the GPU's fastest code cache? Branch prediction has also grown over the years, so maybe the penalty for not unrolling isn't as high as it used to be.
@iq: "1.2x faster [by] preventing loop unrolling": how strange! Is it true on OpenGL too? Very strange; preventing unrolling generally slows things down, and for good reasons. Code cache: I once asked an Nvidia tech, and it seems almost impossible to overflow the code cache (of course, but with dedicated code, plus these guys don't always know how extreme our shaders are :-D ). Now, for the optimizer there seems to be a windowed range, so who knows in this case. But first, I'm curious whether the gain is in the same direction on OpenGL vs Windows, or Nvidia vs others. Also, the recent Vulkan-based compilers react very differently (sometimes worse performance, sometimes better). Detail: for the first minute or so the performance oscillates a lot, before stabilising at 8.6 fps at 640x360.
Now the shader runs 1.2x faster and compiles almost 4 times faster. For the curious, all I did was prevent loop unrolling for all the fbm() functions. That lowered the compile time from 4 seconds to 1 for me. On the other hand, I'm not sure why that improves performance, but it does. All I can hypothesize is that perhaps there's a code cache in the GPU to speed up instruction fetching, and loops help keep the relevant code in the cache for a longer time? Regardless, it gave a 40% speedup on my GPU. I used that extra performance to tweak the visuals and make the shader look better, while still leaving a 1.2x overall performance improvement. So, triple win.
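One common way to discourage a GLSL compiler from unrolling a loop on Shadertoy is to start the counter from a value it cannot know at compile time. The sketch below assumes that this is what the ZERO define mentioned in the thread does; the body of that define is not shown on this page, so the form used here is an assumption, and hashSketch(), noiseSketch() and fbmSketch() are simple stand-ins rather than the shader's own functions.

```glsl
// Assumed form of the define: always evaluates to 0 at run time, but depends on
// the iFrame uniform, so the compiler cannot treat the loop bounds as constant.
#define ZERO (min(iFrame,0))

// Cheap hash and value noise, used only to give the loop some work to do.
float hashSketch( in vec2 p )
{
    p = 50.0*fract( p*0.3183099 );
    return fract( p.x*p.y*(p.x+p.y) );
}

float noiseSketch( in vec2 x )
{
    vec2 i = floor( x );
    vec2 f = fract( x );
    vec2 u = f*f*(3.0-2.0*f);
    float a = hashSketch( i + vec2(0.0,0.0) );
    float b = hashSketch( i + vec2(1.0,0.0) );
    float c = hashSketch( i + vec2(0.0,1.0) );
    float d = hashSketch( i + vec2(1.0,1.0) );
    return mix( mix(a,b,u.x), mix(c,d,u.x), u.y );
}

// Nine octaves of noise. Replacing ZERO with a literal 0 allows unrolling again.
float fbmSketch( in vec2 x )
{
    float amp = 0.5;
    float sum = 0.0;
    for( int i=ZERO; i<9; i++ )
    {
        sum += amp*noiseSketch( x );
        amp *= 0.5;
        x   *= 2.0;
    }
    return sum;
}

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    vec2 p = 8.0*fragCoord/iResolution.y;
    fragColor = vec4( vec3( fbmSketch(p) ), 1.0 );
}
```

Whether this helps or hurts depends on the driver and the translation layer, as the measurements reported in this thread suggest, so it is worth timing both variants.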
@jarble: well, your depth buffer cuts the trees on the silhouette !
I tried adding a depth buffer to speed up the framerate. It seems to be only slightly faster.
Again, WebGL is not Ada; I doubt the standard now forbids patterns for style reasons (and efficiency is the coder's business). Also, the point is absolutely not about the scope: global variables are still permitted. It is only about where their initialization is done. From a compiler's point of view, I don't see how initialization at global declaration differs from initialization at the beginning of main(). It could even be compiled that way.
it is lazy bad style to have global mutable. you want short scopes for efficiency. in webgl2 you can more easily have longer lists and have more types in them, or invert large matrices, and that may just crash a lot more of parsing or runtime, if long lists have global scope, in some deconstructing ANGLE parsing.
Yes. My question was: what was the motivation to no longer allow this, when it was OK in WebGL 1? What constraint on compilers suggests it should no longer be allowed?
No global mutable outside of functions.
@ollj WebGL 2 obviously still supports global variables. It just no longer allows non-const initialization at global declaration. (I wonder about the motivation behind this change, though.)
My compatible shaders share a [compatibility core] for WebGL 1. The main issues are: no ivecN() types, no integer modulo, no inverse(), many list restrictions. WebGL 2 no longer does global variables.
After managing to get the shader working with WebGL 1, I'm seeing a stable 10 FPS on 576 x 324 on a GTX 260. Works in fullscreen too, but it's, judging by eye, about 1-2 fps.
on a 3090 its smooth at 3840x1600
So goood
similar, even on newer GPU this doesn't quite reach 60fps.
Getting 25 fps on 4K fullscreen using RTX 3080 FE. Otherwise, 60 FPS.
Make sure Chrome/Firefox/Edge is running on your GeForce card instead of the integrated GPU. It is a heavy shader, runs at 60 fps on my laptop but at 2 fps on my phone. I should probably take a pass at optimizing something.
My iGPU is having a HORRIBLE time rendering this.
cool
says 1.1 but its much more like 0.1 fps @ 800 x 450 on my "Intel integrated" lolz
I am only getting mid 40's fps with a 1070 gtx at 640x360
I meant to say... 40 fps at 640x360. Went down below 10 fps at 1080x1920. pity because it's incredible at that resolution.
Wowser, hovering around 40fps on my nvidia g980 win7 chrome. Reminds me of the scenery in Crysis, only better
nice
@Teppich It works extraordinarily well. In fact, I couldn't see a difference except in speed for the rainforest when I did it. I did it in one of the raymarching loops, along with once the point is returned, before it gets rendered. Also, one other speed improvement is checking for the camera storage positions before raymarching.
@ShadowFlare I've been looking into terminating rays early when they're less than half a pixel away from an sdf 'surface'. How did you know when to discard the ray? As in, how does rasterization + comparison work for you?
I was also able to boost the fps to 30 on my gpu by doing a few tricks which didn't make it lose any quality, such as discarding raytracing if the pixel difference wouldn't be visible on the screen via rasterization of the points and comparing them.
Got the river, just need to figure out why it isn't fading out properly.
I think I'm gonna add a small river in the background, in the canyony part. Is there any way to label that part?
it looks so awesome. Keep up the perfect job
965M:15.5-17.9fps
Thanks, that helps explain it. I also saw something else (I think it was by P_Malins) that helped too.
@scratch13764 He's using a small portion of ch0's output buffer to store the current camera transform (ca[0], ca[1], ca[2]), so that he can read it in the next frame as oldCam. Then he transforms the current frag's world space position to screen space and reverts (undoes) the screen space shift o in order to get an aligned frame (o is used to slightly shift the rendering, so that when mixing with the new frame, partial super-sampling is effectively achieved; this is one of the most clever and effective AA techniques I've seen). After that, he goes from screen space to raster space, samples the old frame (texLod(ch0, spos)) and mixes in the new color. The output is essentially used as an IIR history / accumulation buffer.
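The steps in this explanation can be sketched as a small buffer pass. This is a simplified illustration under stated assumptions, not the shader's code: it assumes the pass is a Shadertoy buffer whose iChannel0 is set to the same buffer, it recomputes the previous camera from iTime - iTimeDelta instead of storing it in pixels of the buffer as the real shader reportedly does, and the scene is just an analytic checkered plane. The names cameraPos, setCamera and fl, the jitter constants and the 0.1 blend weight are all assumptions.

```glsl
const float fl = 1.5;   // assumed focal length

// Camera-to-world rotation for a camera at ro looking towards ta.
mat3 setCamera( in vec3 ro, in vec3 ta )
{
    vec3 cw = normalize( ta-ro );
    vec3 cu = normalize( cross( cw, vec3(0.0,1.0,0.0) ) );
    vec3 cv = cross( cu, cw );
    return mat3( cu, cv, cw );
}

// Hypothetical camera path: a slow orbit around the origin.
vec3 cameraPos( in float time )
{
    return vec3( 4.0*sin(0.2*time), 2.0, 4.0*cos(0.2*time) );
}

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    // Sub-pixel shift "o", different every frame, so the history accumulates samples.
    vec2 o = fract( float(iFrame)*vec2(0.754877669,0.569840296) ) - 0.5;
    vec2 p = (2.0*(fragCoord+o) - iResolution.xy)/iResolution.y;

    vec3 ro = cameraPos( iTime );
    mat3 ca = setCamera( ro, vec3(0.0) );
    vec3 rd = ca*normalize( vec3(p,fl) );

    vec3 col = vec3( 0.5, 0.7, 0.9 );                  // sky, not reprojected in this sketch
    if( rd.y<0.0 )
    {
        vec3 wpos = ro - rd*ro.y/rd.y;                 // hit point on the plane y=0
        vec2 cell = floor( wpos.xz );
        col = vec3( 0.2 + 0.6*mod( cell.x+cell.y, 2.0 ) );

        // World space -> previous frame's camera space (the rotation's transpose is its inverse).
        vec3 oldRo = cameraPos( iTime - iTimeDelta );
        mat3 oldCa = setCamera( oldRo, vec3(0.0) );
        vec3 cpos  = transpose(oldCa)*(wpos-oldRo);

        // Camera space -> 0..1 screen space, then undo the shift o.
        vec2 npos = fl*cpos.xy/cpos.z;
        vec2 spos = 0.5 + 0.5*npos*vec2( iResolution.y/iResolution.x, 1.0 );
        spos -= o/iResolution.xy;

        bool visible = cpos.z>0.0 &&
                       all( greaterThan( spos, vec2(0.0) ) ) &&
                       all( lessThan( spos, vec2(1.0) ) );
        if( visible && iFrame>0 )
        {
            vec3 old = textureLod( iChannel0, spos, 0.0 ).xyz;
            col = mix( old, col, 0.1 );                // IIR accumulation: keep 90% of the history
        }
    }
    fragColor = vec4( col, 1.0 );
}
```

With a static camera the reprojected position lands exactly on the current pixel centre, so the buffer converges to a supersampled image; with motion, the history is fetched from wherever that surface point was on screen in the previous frame.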
Can someone explain how the old frame averaging works in this? I can see that the frame is mapped to the terrain and it's really freaking me out how it works.
Amazing!
for mouse control camera rotation (only slow movements, please :-) : add in bufA line 834
    vec2 A = 3.14*(2.*iMouse.xy/iResolution.xy-1.);
    #define rot(a) mat2(cos(a),-sin(a),sin(a),cos(a))
    ta = vec3(0,0,1); ta.xz *= rot(A.x); ta.yz *= rot(-A.y); ta += ro;
looks like the sight from my living room :-)
Next project: GTA V
It's awesome graphics!!!!! TIP: if you want to run in IE change "textureLod" to "texture"
Insane. Level god.
@hammedshh, because it's raymarched, because there are no meshes, because it's running on WebGL, because your GPU is not meant to run this. In normal OpenGL this would be full framerate in HD. But this is not what Shadertoy is about
@hamedshh - That's nearly a 6 year old gfx card!
Why does it run so slowly, at 8.8 fps, on my PC when I have a GT630 graphics card and a Core i5 CPU?
I think a [foggy valley] would look good (as in, easily believable and with high geometry detail) and have good performance, even when valleys are seeded noise. Fog in valleys basically replaces the occlusion that a mountain would have there otherwise. Most people do distance fog poorly, by making it spherical around the camera and not making the fog diminish with increasing height. Google image search [fog valley] for inspiration. Most photos look down into valleys, because looking out of a fog is just not as characteristic. Diminishing fog by height (volume marching a cloud in a valley) makes it easy to look "up" from inside a foggy valley, so you likely see birds and nearby mountain peaks. This mostly diminishes looking nearly horizontally along or through a foggy valley. Then you put all your high detail geometry inside a foggy valley (along rivers), where it is more realistically occluded by fog and surrounded by either mountains or more valley fog. You can fly into valleys and over them, giving the illusion of more detail than there is per frame, from memory of what you saw there.
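A sketch of fog that thins out with height, as opposed to fog that depends only on distance from the camera. It assumes a density of a*exp(-b*y) and integrates it analytically along the ray, which is one standard way to do this and not necessarily what the commenter had in mind (the comment mentions volume marching). The function name fogAmountSketch, the parameters a and b, and the toy scene are all assumptions.

```glsl
// Fraction of light replaced by fog along a ray of length t, for a fog density
// of a*exp(-b*y). The optical depth is the integral of the density along the ray:
//   a*exp(-b*ro.y) * (1 - exp(-b*rd.y*t)) / (b*rd.y)
float fogAmountSketch( in vec3 ro, in vec3 rd, in float t, in float a, in float b )
{
    float k = b*rd.y;
    // For nearly horizontal rays the fraction tends to t, which avoids dividing by ~0.
    float integ = (abs(k)<1e-4) ? t : (1.0 - exp(-k*t))/k;
    float depth = a*exp(-b*ro.y)*integ;
    return 1.0 - exp( -depth );
}

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    vec2 p  = (2.0*fragCoord - iResolution.xy)/iResolution.y;
    vec3 ro = vec3( 0.0, 3.0, 0.0 );
    vec3 rd = normalize( vec3( p, 1.5 ) );

    // Toy scene: a ground plane at y=0 under a flat sky.
    bool  ground = rd.y < -1e-4;
    float t      = ground ? -ro.y/rd.y : 1000.0;
    vec3  col    = ground ? vec3( 0.2, 0.35, 0.1 ) : vec3( 0.4, 0.6, 0.9 );

    float f = fogAmountSketch( ro, rd, t, 0.05, 0.5 );
    col = mix( col, vec3(0.8), f );
    fragColor = vec4( col, 1.0 );
}
```

Looking up from inside the fog, the accumulated amount stays bounded because the density decays along the ray, which matches the comment's point that peaks and sky remain visible from inside a foggy valley.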
so sick!! but yes, a bit heavy load.
Looks beautiful. I just wish my GPU did better than 26fps. (1070)
Awesome colors
Colours and shading are amazing.
[This comment has been hidden by the shader author]
I can hear the rainforest inside my laptop when this runs.... "Whooosh"
30fps on a Titan X (Maxwell), Ubuntu, for me. Is the 1080 simply better than the old Titan X?
Very nice lush colors
Iñigo, I am more impressed by your skills every time. Your works are incredible.
Wow
TAA = Temporal Anti Aliasing
What's TAA refer to in the source comments?
It looks like Intel is not affected by this weirdness: 5fps in chrome in both desktop and Angle modes
Canvas: 800x450, Windows 10, GTX 980 Mobile
    RainForest: 23 fps, Microsoft Edge
    RainForest: 15 fps, Chrome (Angle)
    RainForest: 35 fps, Chrome with --use-gl=desktop
    RainForest: 15 fps, Firefox (Angle)
    RainForest: 28 fps, Firefox with Disable.Angle (and zero compilation time)
    Snail: 30 fps, Microsoft Edge
    Snail: 44 fps, Chrome (Angle)
    Snail: 60 fps, Chrome with --use-gl=desktop
    Snail: 34 fps, Firefox (Angle)
    Snail: 44 fps, Firefox with Disable.Angle (and zero compilation time)
Clearly, Angle (Firefox and Chrome's WebGL/GLSL to DirectX/HLSL translation layer) is fucking things up quite a bit... Rainforest: https://www.shadertoy.com/view/4ttSWf Snail: https://www.shadertoy.com/view/ld3Gz2
Canvas : 800x450, Chromium, Debian Linux, GTX 1070 Snail: 60fps RainForest: 60 fps
Ok, Canvas 800x450, Chrome Win 10 GTX1070 Snail: 60 fps Rainforest: 30 fps Right click, inspect in chrome and you can also read the size of the canvas in the attributes of the canvas
To those people getting 60 fps in this shader with a 1060/1070: could you please test iq's Snail shader and tell me the fps? Please make sure you are in the 1080 layout and the canvas is 800x450 (you can check the canvas size with FabriceNeyret2's iResolution shader).
Thanks for the detailed info Andre. Let me know Casty how your investigations go
6700K and a GTX1070 here on Chrome and Win10: 60 fps on the smallest browser window (canvas is 500x281), 14 fps when I make the browser window bigger (canvas is 1200x675). The Edge browser gives me 20 fps at the largest in-browser size (canvas 1200x675). Firefox gives ~23 fps, varying between 20 and 28.
IQ, I'm getting 20fps which is way more than any Intel can do. And it is shader dependent, Snail and pool shaders do run at their theoretical speed. So it has to be something else. At least other ppl are running at 60fps so I should be able to troubleshoot it. Thanks
Browser seems to make a big difference: 54.1 fps in Chrome, 60.0 fps in Firefox, on a 1080 (Windows 10).
Absolutely incredible. Thanks for sharing!
Awesome!!!. 60 Fps on 1070
Also, a friend said the 1060 was blacklisted in his Chrome; he had to enable it by hand, otherwise the Intel GPU was kicking in for him too.
It should run at about 60 fps on your 1060, Casty. Are you on a laptop? You are probably using the Intel integrated card (close all browser tabs and restart Chrome with right_click + Nvidia GPU). We have tested on a 1060M, and it does run at full framerate. It is 16 fps on my (now old) 980M.
@834144373 I don't know why I'm getting only 20fps with my 1060, are you using the 1080 layout?
Brilliant!! Love your works so much! I hope I can code such beautiful things like all genius here. Unfortunately, I have no graphic mathematics concepts. How to learn from the beginning?
haha... 50.5fps on GTX1060
Outstanding, as always! ^^ @keelo "fbm" stands for "Fractal Brownian Motion"
Very impressive !
I'm in awe, beautiful work.
Please tell us you're working on a new demo. I REALLY would love to see Elevated 2.
gorgeous! what does fbm stand for?
padauz!! so much detail... i really like the clouds falling over the mountains
Amazing. The best looking single shader terrain scene to date.
Fixed, I hope!
@iq, in renderTerrain function out variables teShadow and teDistance are not initialised/set in all program paths. Because of the uninitialised teDistance on mac mini the sky is corrupted. I can't see the difference due to teShadow, but surely it should be initialised. btw 25 fps in smallest resolution (500x280) on mac mini with intel iris. 13 fps in 640x360.
The sense of magnitude in this scene is great, appears far from finite!
Wow...
Beautiful, beautiful terrain running at 19 fps on my workstation!
Awesome!
A little slow, but looks incredible!
Incredible! Are you working for the "next Avatar movie" or just preparing the next elevated 4k - 10th happy birthday? ;-p
excellent. 0.7 fps in rdp software mode
It is a bit strange: I'm getting 20 fps with the GTX1060, but the old Quadro 4000M (equivalent to a GTS 450) is showing 7 fps. In other shaders I get the theoretical 700% difference (Snail and Pool, to name a couple).
It's insanely good. Almost can't believe my eyes.
Gobsmacked. Pure Magic!
few lines?? have you missed the 'Buf A' tab? ;)
fantastic!
This is Incredibly Awesome! WOW!!!
Awesome.
1fps in fullscreen on a GTX950. Totally worth it :-D
Hell Yeah! Brilliant!
Breathtaking!
It looks stunning! quite literally as I have been staring at it for more than 20 minutes... The strokes in the clouds are intriguing, very painterly.
This resembles a photograph. Both in realism and framerate
Amazing!