I'll throw in my thoughts and findings - which are incomplete, and some of it is pure guesswork, but I did dive deep into the relevant specifications. Feel free to correct me.
First off - the original thing about MKV: yes, MKV is currently in my opinion the bestest of all existing single-file formats. I can't think of a single thing other than DV that it doesn't support and couldn't easily add if told how to.
Having dealt with the innards of MP4 after I already knew MKV, I was totally appalled at the stupidity and non-scalability of MP4. That crap only survived because of the lack of alternatives at first, and it still isn't dead now because it's everywhere.
MP4 is a very rigid container format with a lot of ugly hacks applied over time to support things that weren't originally catered for. OK, these things happen.
But somebody explain to me why an audio track needs to define "width" and "height" and other video-related properties.
It's not like the standard was finished and then somebody came along and said "Hey, new idea, let's support audio too. We'll just use the same structure we did for video, because we didn't think of this before."
Also, in order to parse an MP4 file and its content, you need to know the exact format of all structures (atoms). And there are hundreds.
And one gets added for every measly new feature - which is probably why hardly anything ever gets added.
And when defining a new structure, you need to think of all possible future things that might change, adding reserved or optional fields and making a huge mess of data nobody needs.
MKV: if you know the basic types (integer, string, ...), you can parse and display the whole content of the thing.
To add a feature: just add a field. If a field is missing, there's a default value. The player doesn't know a new field? Ignore it. Done. Next.
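Just to illustrate what I mean (a rough sketch, not a reference implementation - the file name is a placeholder, the MASTER_IDS set is only a tiny hypothetical subset of the real Matroska schema, and real parsers handle a few more edge cases like unknown-size elements): every EBML element is nothing but an ID, a size and a payload, so you can walk an entire MKV and simply skip over anything you don't recognise:

```python
import os

# Hypothetical lookup of "master" (container) element IDs - a real parser
# would take these from the Matroska spec, e.g. 0x1A45DFA3 = EBML header,
# 0x18538067 = Segment, 0x1549A966 = SegmentInfo, ...
MASTER_IDS = {0x1A45DFA3, 0x18538067, 0x1549A966}

def read_vint(f, keep_marker=False):
    """Read an EBML variable-length integer: the number of leading zero bits
    in the first byte tells you how many more bytes follow."""
    first = f.read(1)[0]
    if first == 0:
        raise ValueError("VINT longer than 8 bytes (or corrupt data)")
    length, mask = 1, 0x80
    while not (first & mask):
        length += 1
        mask >>= 1
    value = first if keep_marker else first & (mask - 1)
    for _ in range(length - 1):
        value = (value << 8) | f.read(1)[0]
    return value

def walk(f, end, depth=0):
    """Traverse elements; unknown ones are skipped by their declared size."""
    while f.tell() < end:
        elem_id = read_vint(f, keep_marker=True)  # IDs keep their length-marker bits
        size = read_vint(f)                       # sizes don't
        print("  " * depth, hex(elem_id), size)
        if elem_id in MASTER_IDS:                 # container element: recurse
            walk(f, f.tell() + size, depth + 1)
        else:                                     # leaf (or unknown): just skip it
            f.seek(size, 1)

with open("movie.mkv", "rb") as f:                # placeholder path
    walk(f, os.path.getsize("movie.mkv"))
```

You can skip unknown atoms in MP4 by size too, but to actually read anything you need to know every atom's internal byte layout; here, a new feature is just a new ID carrying one of the basic types, and old players step over it without even noticing.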
Adding Dolby Vision would be simple - Dolby would have to say how. They just didn't think of it.
It's not only the extra track; you also need to define the relationship between the video and the DV track, plus the DV profile and such, somewhere in the file.
Simple thing, but a requirement and something that needs to be agreed upon.
MKV would have been a better option, considering that it supports both Dolby "products", Vision and TrueHD.
When we get displays at 4000 nits and higher, Dolby Vision starts to become relevant.
This, I believe, could also be the exact opposite: DV (and HDR10+) seems to me to be useful only as a workaround for displays not supporting the full range.
(Unless we're talking about the additional 2 bits we get through the extra DV track, but I'm not convinced that this would really be noticeable.)
A couple of "things" (I like things):
- Thing 1: plain HDR10 already has all the tools on board to cover the entire defined (BT.2020 and ST 2084) colour and brightness range. It's all there.
- Thing 2: current TVs are limited to around 700-800 nits (ZD9 excepted), so most UHD movies are mastered up to - typically - roughly 1000 nits. That makes a lot of sense.
- Thing 3: just throwing this in for people who are new to this: standard "historical" SDR goes up to ~100 nits, while the HDR specification speaks of up to 10,000 nits.
This sounds like the TV is going to fry you, but brightness is perceived roughly logarithmically. Terms like "twice as bright" probably don't make much sense in the context of the subjective perception of light (which our eyes also auto-adapt to, so they never give us an absolute measurement).
But I'll say that 10,000 nits is probably perceived somewhere between 2 and maybe 8 times as bright as 100 (even though it is 100x the energy) - see the little back-of-the-envelope check after this list.
Just to put these values into perspective.
Still - large areas at 10,000 nits (or even 4000) would be annoyingly bright; you'd be squinting at your screen. These values only make sense for tiny things like sparkles or stars.
- Thing 4: HDR10+ adds per-scene information about average brightness and such - optionally with different values for a number of defined areas.
- Thing 5: Dolby Vision adds, on top of that, two extra bits per pixel (actually per 4 pixels, if that's how they map the 1920x1080 DV stream onto the 3840x2160 video stream - but there may be more magic inside, I don't know; the few specs that actually exist do seem to support that theory).
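That back-of-the-envelope check I promised under Thing 3 (the 1/3 exponent is an assumption - it's the usual textbook Stevens power-law value for brightness and depends heavily on viewing conditions, so treat it as a ballpark, not a measurement):

```python
# Rough sanity check: how much "brighter" does 10,000 nits feel compared to
# the ~100 nits of classic SDR? Stevens' power law with an exponent of about
# 1/3 (assumed value; the real one varies with adaptation and surround).
sdr_nits, hdr_nits = 100, 10_000

energy_ratio = hdr_nits / sdr_nits           # 100x the light coming at you
perceived_ratio = energy_ratio ** (1 / 3)    # ~4.6x "as bright", subjectively

print(f"{energy_ratio:.0f}x the energy, roughly {perceived_ratio:.1f}x as bright")
```

Which lands comfortably inside that 2x to 8x band.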
So given all that, you can conclude:
if you had a plain HDR10 movie mastered for up to 10,000 nits and a TV that supports the required nits in full, there is nothing HDR10+ or DV could add.
Except maybe those two bits, but: there is no TV in existence that can deal with 12 bits - not even the DV ones. The panels simply can't display 12 bits, and that's not likely to change.
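And to back up Thing 1 with numbers: the PQ curve from ST 2084 is what plain HDR10 already uses, and written out it looks like this (constants straight from the spec; the sample code values are just mine):

```python
# PQ (SMPTE ST 2084) EOTF: map a normalised code value (0.0 .. 1.0) to nits.
m1 = 2610 / 16384
m2 = 2523 / 4096 * 128
c1 = 3424 / 4096
c2 = 2413 / 4096 * 32
c3 = 2392 / 4096 * 32

def pq_to_nits(code: float) -> float:
    p = code ** (1 / m2)
    return 10_000 * (max(p - c1, 0.0) / (c2 - c3 * p)) ** (1 / m1)

for code in (0.5, 0.75, 1.0):
    print(f"{code:.2f} -> {pq_to_nits(code):8.1f} nits")
# prints roughly: 0.50 -> ~92 nits, 0.75 -> ~983 nits, 1.00 -> 10000 nits
```

Full scale already decodes to 10,000 nits with plain 10-bit HDR10; more bit depth only makes the steps along that same curve finer, it doesn't extend the range.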
What remains is: currently, HDR10 movies are released with a maximum of 1000 nits, sometimes up to 4000, because that requires less compromise on TVs with a limit in that general area.
That is because a 750-nit LG can't simply cut off everything above 750 nits. That would produce flat white areas with no detail. Instead, splines are used to ease the bright pixels into the given limits.
The downside is that even pixels below 750 nits will have to be darkened as well, instead of being shown at their intended brightness, even though they could be.
HDR10+ and DV can both help reduce that effect by telling the TV in advance whether it will be required to use a smoothing curve at all (note: this can also be done by the TV itself, without HDR10+/DV, by generating a histogram of each picture - and apparently some do exactly that).
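To make the roll-off thing concrete, here is a toy curve - emphatically not what any particular TV or Dolby actually does; the knee position, the ease-out shape and the numbers are made up for illustration (real TVs use fancier splines, e.g. the Hermite spline from BT.2390, and work in PQ space). The interesting parameter is scene_max, which is exactly the per-scene hint that HDR10+/DV metadata - or the TV's own histogram - can provide:

```python
def tone_map(nits: float, scene_max: float,
             display_max: float = 750.0, knee: float = 0.75) -> float:
    """Toy roll-off: leave everything below the knee alone, then ease the rest
    into the panel's limit. Purely illustrative - knee position and curve shape
    are arbitrary."""
    if scene_max <= display_max:
        return nits                       # scene fits the panel: no curve needed
    knee_nits = knee * display_max        # start compressing here
    if nits <= knee_nits:
        return nits
    # squeeze [knee_nits .. scene_max] into [knee_nits .. display_max]
    t = (nits - knee_nits) / (scene_max - knee_nits)
    return knee_nits + (display_max - knee_nits) * t * (2 - t)   # simple ease-out

for n in (300, 700, 1000, 4000):          # pixel brightness as mastered
    print(n, "->", round(tone_map(n, scene_max=4000)))
# 300 -> 300, 700 -> 577, 1000 -> 607, 4000 -> 750
```

Note the 700-nit pixel: the panel could show it as mastered, but because the scene peaks far above the panel it gets dimmed to ~577 nits - exactly the downside described above. And when scene_max is at or below display_max, everything just passes through; knowing in advance that a scene needs no curve at all is the main thing the dynamic metadata (or a histogram) buys you.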
It is possible - but I'm not at all sure - that DV allows the HDR10 base to remain in that "modest" range far below 4000 nits while adding the option of more for devices that can handle it.
I can't say, because all the articles I've read about this so far either just parrot the Dolby Vision marketing, which is vague and full of meaningless bullshit ("up to 10,000 nits" - well, the same goes for HDR10, guys; "12 bit" - okay, but what is that really good for?), or they really don't know what they are talking about - I've seen the prettiest contradictions.
I have a feeling that hardly anyone really understands what this DV shit does.
Which is no surprise, as it's a friggin secret.