Initial implementation of ImageSequenceReference #602
Conversation
I like that it avoids having to choose a format string standard.
My C++ foo is a bit weak, so please excuse "stupid/obvious" questions :) The approach with breaking up the filename and providing zero padding etc. seems like a good idea. Will this support …
@apetrynet Thanks for taking a look! My C++ is pretty rusty too so don't worry :) For your questions:
This approach de-couples frame time from the number it's assigned in the url completely, except for the assumption that you have …
I think I agree with you. In general with media I avoid referring to frames explicitly because many of the container formats support behavior that will break models involving constant frame rates, but in this case a constant frame rate is a base assumption.
My explanation was a bit fuzzy on that. If you have an image sequence at 24 fps where you've rendered every frame (frame numbers 1, 2, 3, 4, etc.) this means you have an implicit frame duration of 1/24th of a second. To set this on the image sequence reference you'd do this:
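A minimal sketch, assuming the rate and frame_step fields described later in this PR:

```python
# given an existing otio.schema.ImageSequenceReference 'ref'
ref.rate = 24.0     # intended playback rate
ref.frame_step = 1  # every frame exists on disk -> implicit frame duration of 1/24 s
```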
If you have the same sequence, but you've only generated every other frame (frame numbers 1, 3, 5, etc.) you have an implicit frame duration of 2/24ths of a second. To set this on the image sequence reference you'd do this:
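And for the every-other-frame case, a sketch under the same assumptions:

```python
# given an existing otio.schema.ImageSequenceReference 'ref'
ref.rate = 24.0     # playback rate is unchanged
ref.frame_step = 2  # only every other frame exists -> implicit frame duration of 2/24 s
```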
While the terminology should keep that pretty clear, I worry that using … The reason I took the approach of using …
This does a good job of capturing the important data. I do worry that it is quite verbose, but it's only a vague concern - it's probably fine. One question though: do we really need to model the …
The reason for modeling …
OK, I see. But what I seem to be missing here is an easy way of mapping the renumbered frames on disk to the … Regarding …
I'm actually leaning towards skipping the …
I would like to raise my use case again, of a sequence of images captured at a variable framerate, such as me pushing the shutter button on my camera. In this case, not only is the frame rate variable, but I expect the duration to be implicit, like a time-filling slug. A related issue would be a missing frame in a rendered test sequence, e.g. a simulation might fail intermittently, but we would still like to use the previous successful frame as a placeholder without needing to duplicate it on disk to satisfy a regex. I'm not spotting in either of these proposals how an irregular framerate can be accommodated, or hold-frame behavior specified. I'm fine if this were to be handled in a different way, in a different schema holding a list of frames and times, but I'm curious if the use cases can be met in the context of the current proposals?
Hi @meshula!
My look at the ImageSequenceReference is the same as a regular ExternalReference. The difference being that ImageSequenceReference points to several source files while ExternalReference points to a single file. The available_range should reflect the full range of the source files available, based either on time code or frame numbers. How the frames are recorded shouldn't really matter as there's most likely an intended playback speed. Even in cases where you increase/decrease the exposure frequency in the middle of a shot, the playback speed should be the same, producing a "slow-mo or fast-forward" effect.
As I read your initial use case again I got a bit confused. By "I expect the duration to be implicit, like a time filling slug", do you mean that you'd like to record the actual time between exposures, producing a sequence that only contains a few images but a total duration of the media reference that reflects the time you used to shoot them? If so, this could be solved by producing a small edit with your frames and using the rendered image sequence as a media_reference.
As I mentioned in an earlier comment, I think this should be up to each consumer to handle. It would be no different than dealing with a corrupt frame in a .mov file, for instance. Since (I believe I read this in another thread, couldn't find it though) it's not in the goals of OTIO to become a media player, I don't think we need to solve this in the media_reference.
I believe frame-holds and adjustment of playback speed should be editorial decisions and be reflected in the timeline or clip parts of the .otio file. I don't think ExternalReference has any support for this either. Does it?
I tend to agree with Daniel's thinking here. There's a boundary between editorial decisions (composition) and details about the media (frame level encoding). I don't think OTIO should include details of individual iframes, bframes, gops, etc. in encoded video, and so I'm resistant to including frame-level information for image sequences as well.
With encoded video there's already a place to write down variable duration media samples (in the packet durations of the video stream) and so OTIO can just provide high level summary information in the MediaReference as a convenience. For an image sequence, there isn't a container to store data in, so frame durations either need to go into individual frames in some format dependent way, or OTIO (or some other intermediate container) would need to provide a spot to put this info. In practice, we just look at the frame numbers in each filename and deduce the durations and ordering from that.
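As an illustration of that last point, a minimal sketch (a hypothetical helper, not part of OTIO) of deducing ordering and per-frame durations from the frame numbers embedded in file names, assuming a 24 fps base rate where each frame holds until the next available frame number:

```python
import re

def durations_from_filenames(filenames, rate=24.0):
    # Extract the trailing frame number from each name and sort to recover ordering.
    frames = sorted(
        int(re.search(r"(\d+)\.\w+$", name).group(1))
        for name in filenames
    )
    # Each frame's duration is the gap to the next available frame number.
    return [(b - a) / rate for a, b in zip(frames, frames[1:])]

print(durations_from_filenames(["shot.0001.exr", "shot.0003.exr", "shot.0004.exr"]))
# -> [0.0833..., 0.0416...]  (a two-frame hold, then a normal frame)
```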
So there is a spectrum of information from summary to detail that could be included for a MediaReference. The summary information for an image sequence is largely the same as what we could provide for encoded video.
Summary Information:
- available_range (min to max)
- format (exr, tif, dpx, mov, mp4, etc.)
- encoding (lzw, rle, h264, dnx, etc.)
- resolution, bit depth, channels, streams, etc.
We could combine image formats and video container formats in "format", and image encoding/compression with video codec into "encoding" in an attempt to allow these to be treated uniformly. Is this a good idea to lump them together?
Image sequence specific information covers two areas: 1. which frames exist, 2. how frames differ from each other.
Which Frames:
- min, skip, max (1-100 on 2s)
- list of ranges (1-50,52-99)
- list of frame numbers (1,2,3,7,8,9,...)
- zero padding
- Nuke/RV style syntax (foo.@@@.tif, %3d, etc.)
If some frames are missing (on 2s, or missing frame 51) then a policy can specify what to do with the gaps (hold last known frame, use nearest, show black, show error, etc.) without needing to alter the duration or rate of anything. That is, an image sequence at 24 fps, but on 2s, should still (in my view) be treated as 24fps, not 12fps.
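For concreteness, a sketch of such a policy as a simple enum; the schema in this PR ultimately exposes ImageSequenceReference.MissingFramePolicy with exactly these flavors:

```python
from enum import Enum

class MissingFramePolicy(Enum):
    error = 0  # treat a missing frame as an error
    hold = 1   # hold the last known frame
    black = 2  # show black for the gap
```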
What's left is how frames in the sequence differ from each other.
Per-Frame Variations:
- duration
- data window / bounds
- format, resolution, bit depth, channels, etc.
In practice, most systems I've seen try hard to avoid allowing per-frame variation in format, resolution, bit depth, channels, etc. When variations do come up, this is usually considered invalid. Per-frame data window variation is more common, but seems out of scope for OTIO. So we're left with just per-frame duration... How common is this in our industry?
Also, I forgot to propose this alternate way to deal with varying durations of frames: Use a Track of Clips, each of which points to a single image, thus allowing each Clip to hold the duration of that image. This moves it up into the realm of composition instead of media. For relatively few images, e.g. the storyboards for a film, this model works well. For loads and loads of images, it might not.
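A sketch of that alternative (file names and durations are hypothetical):

```python
import opentimelineio as otio

# One Clip per still image; each Clip's source_range carries that
# image's individual duration.
track = otio.schema.Track(name="storyboards")
for index, (url, frames_at_24fps) in enumerate([
    ("file:///boards/board_001.png", 48),  # 2.0 s
    ("file:///boards/board_002.png", 84),  # 3.5 s
    ("file:///boards/board_003.png", 30),  # 1.25 s
]):
    track.append(
        otio.schema.Clip(
            name="board_{:03d}".format(index + 1),
            media_reference=otio.schema.ExternalReference(target_url=url),
            source_range=otio.opentime.TimeRange(
                start_time=otio.opentime.RationalTime(0, 24),
                duration=otio.opentime.RationalTime(frames_at_24fps, 24),
            ),
        )
    )
```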
I like your summary and this sounds like a good idea, but I think most software will pull the information they need from the files themselves and handle them thereafter. At least for EXR and DPX files, the metadata will contain a lot of info that will help applications read them correctly, along the lines of what video formats do.
The more I think about it, I really don't see why OTIO should have to deal with missing frames and material on 2's etc. on a "low level". In my experience the software reading image sequences has its own set of policies to deal with that, be it black frames, hold nearest frame etc. So I really like your idea of a policy enum to hold the most common alternatives. Another thing about the media reference actively reacting to missing files is that you should be able to serialize an .otio file with "offline media" without it getting stored as …
This is my impression as well. Some just stick to presenting the images the way the first decoded frame was.
Yeah, this is more or less the same workflow I tried to convey in my previous comment. :)
This suggestion by Josh works for me.
The use case that prompted me was a series of astrophotographs where the shutter is triggered at optically optimal moments, but which still have a sense of a real time associated with the shutter time; not an editorial choice, but a choice reflecting the nature of the data. A track of clips seems heavyweight for this, in terms of very low information density in the resulting json file, but an explicit frame duration of say 10ms with a "hold last known frame" policy would solve it perfectly and satisfy my desire for recording a minimum of data. To be sure, astrophotography is not obviously in the intended use domain for OTIO, but folks doing planetarium shows and the like are definitely closely adjacent to us and use our industry's tools. I use vfx tools and libs in my astro work, and TBH there's my true motivation ;)
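In terms of this PR's schema, that could look roughly like the following sketch (paths and numbering are hypothetical, and it uses the missing_frame_policy enum described in the PR description below):

```python
import opentimelineio as otio

# Sparse captures numbered against a 100 Hz clock (one slot per 10 ms),
# with gaps held at the last captured frame.
ref = otio.schema.ImageSequenceReference(
    target_url_base="file:///astro/session_01/",
    name_prefix="capture.",
    name_suffix=".tif",
    start_frame=1,
    frame_step=1,
    rate=100,  # 10 ms per frame slot
    frame_zero_padding=6,
    missing_frame_policy=otio.schema.ImageSequenceReference.MissingFramePolicy.hold,
)
```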
Good discussion all around on this! The design that backed this PR came from the analysis that in the use cases we're targeting, frame sequences: …
Given OTIO is all about timing information, my goal was to encode metadata allowing the media reference to provide a consuming application with the URL for "on-disk" assets it would need to render a given time range. I really like the previous suggestions about having an enum to specify missing frame behavior preference. My one concern with this approach is strictly what constitutes a "missing" frame. I think this implicit behavior around whether or not certain assets exist could result in a lot of odd error cases. As an aside, a related use case that comes to mind is "chunked" files, where each file contains a sub-range of the source's available range. This is typically done to conform to some arbitrary file size limit; RED R3D files are a common case. If you think of frame files as "chunked" files where the time range they contain is the duration of a single frame, then these two concepts are interchangeable. However, I think the best idea would be to model that case separately as something like …
I see your concern from a producer-of-frames point of view, that you'd like to make sure your content is played back as you intended and not treated as corrupt. Also your concern about the combination of faulty permissions and a sequence on 2's. Another approach could be something along the lines of what Hiero does (actually deprecated now), which is the concept of fragments. Similar to …
I've been thinking of this myself and came to the same conclusion, that this is a third kind of media reference.
…ly accounts for skipped frame adjustments.
…hat to do when a frame isn't on disk
Codecov Report
@@            Coverage Diff             @@
##           master     #602      +/-   ##
==========================================
+ Coverage   81.67%   83.04%   +1.36%
==========================================
  Files          72       74       +2
  Lines        2729     2860     +131
==========================================
+ Hits         2229     2375     +146
+ Misses        500      485      -15
Continue to review full report at Codecov.
…id complicating them with unicode string oddities.
Hi! However, I have a few suggestions for renaming some attributes, as they feel a bit inconsistent and lengthy when writing code. When writing a few examples I found myself having to look up the source code or "dir()" the media_reference quite a few times, since I had a hard time remembering what the different attributes were called. As you can see, I've tried to use …
When it comes to the method names, I'm wondering if we should try to use the term … I've also had a discussion with @reinecke on slack about introducing the additional concept of an abstract target URL. Thanks!
Good suggestions on renaming those three attributes; I agree the suggested terminology feels more intuitive. The core of what @apetrynet and I came to was that it's really useful for adapters to not only query the URLs for frames, but also get information about how to reconstruct those frame sequences in their native programs. Specifically, the gap here is that it would be useful to be able to get: …
So, for example, imagine an … The design problem here is that we want to make sure we: …
A possible solution would be to add methods like frame_for_time and frame_range_for_time_range (both named in the TODO list below). This would allow getting the full sequence range, and also a trimmed frame range, as sketched below.
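A sketch of what those two calls might look like (method names from the TODO list below; exact signatures are assumptions):

```python
# given an ImageSequenceReference 'ref' and a Clip 'clip' (construction omitted)

# Full sequence range: map the available_range endpoints to frame numbers.
ar = ref.available_range
first_frame = ref.frame_for_time(ar.start_time)
last_frame = ref.frame_for_time(ar.end_time_inclusive())

# Trimmed frame range: map a clip's trimmed range endpoints the same way.
tr = clip.trimmed_range()
first_frame = ref.frame_for_time(tr.start_time)
last_frame = ref.frame_for_time(tr.end_time_inclusive())
```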
I think the first example is reasonable; the second example feels a touch heavy. The benefit of the frame_range_for_time_range approach is that it could take a time range and return the frame range for the provided time range as a tuple. Here is where, in the rv adapter, these tools might be used: https://github.com/PixarAnimationStudios/OpenTimelineIO/blob/e20d1e0e03d7dd016b5cfda2b40181ff10940ab9/contrib/opentimelineio_contrib/adapters/extern_rv.py#L282-L303 Thoughts?
I like the general idea that ImageSequenceReference works natively in frames, but the rest of OTIO stays general. That will feel very comfortable to people working with frames. In your trimmed frame range example, shouldn't the last frame be inclusive?
Hi @reinecke
Since it's "impossible" to agree on what symbol is best suited for all, I suggest that there is no default symbol and that the method errors out if the user doesn't provide one at call time. Having a method like this in …
I like the abstract target URL idea a lot! I'll add it to my todo list.
…pected enum value for missing_frame_policy
…stead of long to match RationalTime frame methods
I no longer have any TODOs on the PR. Ready for feedback for final acceptance.
…rence to raise an exception with more details about what went wrong
Nice! Good job @reinecke! I'll compile and try it out again as soon as possible.
src/py-opentimelineio/opentimelineio/schema/image_sequence_reference.py
@reinecke I really like this! In addition to the docstring suggestion above, I have a thought (might be silly though). What I predict will happen in a lot of use cases is:

```python
path, filename = os.path.split('/path/to/image_sequence.1001.ext')
# -> ('/path/to', 'image_sequence.1001.ext')  No separator "/" after dirname

mr = otio.schema.ImageSequenceReference(
    target_url_base=path,  # Oh, oh!
    name_prefix='image_sequence.',  # imagine these also being split by a regex or something :)
    name_suffix='.ext',
    start_frame=1001
)

seq = mr.abstract_target_url(symbol='%04d')
# -> '/path/toimage_sequence.%04d.ext'
```

It would be nice not having to always do something like path + "/" to get a valid target_url_base.
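One possible workaround in the meantime (a hypothetical usage note, not part of the schema):

```python
import os

# Joining with an empty component guarantees a trailing separator.
mr.target_url_base = os.path.join(path, "")  # '/path/to' -> '/path/to/'
```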
I don't think …
Unfortunately …
Isn't this a part of the C++ code as well? At least …
Oh, good call. All the URL building is C++ with the exception of …
So to unify URL handling, should we consider either moving …
…d add when building target_url
Finally got around to another pass on this. I believe I've addressed all the open issues.
Great! I'll give it a spin before Thursday.
Added ImageSequenceReference MediaReference subclass schema.
* Initial implementation of ImageSequenceReference (#602): Added ImageSequenceReference MediaReference subclass schema.
* Implement ImageSequenceReference in extern_rv (#633)
* Added support for ImageSequenceReference to example RV plugin (#637)
* Added support for ImageSequenceReference to the example OTIO RV reader plugin
* Switched to using opentime.to_frames to compute rv's in and out frames

Co-authored-by: Daniel Flehner Heen <[email protected]>
Co-authored-by: Robyn Rindge <[email protected]>
This is meant to be an alternate approach to the one @apetrynet took in PR #536, as a point of discussion. This resolves #69. I wanted to try an approach that more explicitly broke out image sequence parameters and had awareness of timing associated with frames. This also avoids using format strings in urls, which could become ambiguous or need complex escaping logic if a url encoding includes a % (e.g. %20 for a space). The image sequence is expressed with:
- target_url_base - everything leading up to the file name in the target_url
- name_prefix - everything in the file name leading up to the frame number
- name_suffix - everything after the frame number in the file name
- start_frame - first frame number used in file names
- frame_step - step between frame numbers in file names (every other frame is a step of 2)
- rate - (double) frame rate if every frame in the sequence were played back (ignoring skip frames)
- frame_zero_padding - number of digits to pad zeros out to (e.g. frame 10 with a pad of 4 would be 0010)
- missing_frame_policy - enum ImageSequenceReference.MissingFramePolicy {error, hold, black}; allows hinting how a consuming app should behave when an image for which a url is returned is missing from disk

An example for 24fps media with a sample provided each frame, numbered 1-1000, with a path /show/sequence/shot/sample_image_sequence.%04d.exr, might be constructed as in the sketch below. The same duration sequence, but with only every 2nd frame available, would differ only in its frame_step.
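A sketch of both constructions using the fields above (the available_range values are assumptions based on the described frame range):

```python
import opentimelineio as otio

# Frames 1-1000, every frame present, played back at 24 fps.
every_frame = otio.schema.ImageSequenceReference(
    target_url_base="file:///show/sequence/shot/",
    name_prefix="sample_image_sequence.",
    name_suffix=".exr",
    start_frame=1,
    frame_step=1,
    rate=24,
    frame_zero_padding=4,
    available_range=otio.opentime.TimeRange(
        start_time=otio.opentime.RationalTime(0, 24),
        duration=otio.opentime.RationalTime(1000, 24),
    ),
)

# Same duration, but only every other frame exists on disk (1, 3, 5, ...).
every_other_frame = otio.schema.ImageSequenceReference(
    target_url_base="file:///show/sequence/shot/",
    name_prefix="sample_image_sequence.",
    name_suffix=".exr",
    start_frame=1,
    frame_step=2,  # every 2nd frame
    rate=24,       # playback rate is unchanged
    frame_zero_padding=4,
    available_range=otio.opentime.TimeRange(
        start_time=otio.opentime.RationalTime(0, 24),
        duration=otio.opentime.RationalTime(1000, 24),
    ),
)
```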
A target url is generated using the equivalent of the following python format string:
f"{target_url_prefix}{(start_frame + (sample_number * frame_step)):0{value_zero_padding}}{target_url_postfix}"
Negative start_frame is also handled. The above example with a start_frame of -1 would yield the first three target urls as:
- file:///show/sequence/shot/sample_image_sequence.-0001.exr
- file:///show/sequence/shot/sample_image_sequence.0000.exr
- file:///show/sequence/shot/sample_image_sequence.0001.exr
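A runnable sketch of that formula (a hypothetical helper, not the actual C++ implementation), with the sign kept outside the zero padding so that frame -1 pads to -0001 rather than -001:

```python
def target_url_for_image(base, prefix, suffix, start_frame, frame_step,
                         zero_padding, image_number):
    frame = start_frame + image_number * frame_step
    # Pad the digits only; the sign stays outside the zero padding.
    sign = "-" if frame < 0 else ""
    return f"{base}{prefix}{sign}{abs(frame):0{zero_padding}d}{suffix}"

for n in range(3):
    print(target_url_for_image(
        "file:///show/sequence/shot/", "sample_image_sequence.", ".exr",
        start_frame=-1, frame_step=1, zero_padding=4, image_number=n,
    ))
# file:///show/sequence/shot/sample_image_sequence.-0001.exr
# file:///show/sequence/shot/sample_image_sequence.0000.exr
# file:///show/sequence/shot/sample_image_sequence.0001.exr
```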
Benefits of this approach include: …
Downsides: …
Questions:
- Should number_of_images_in_sequence be a property or method in python?
- Should negative frame numbers be padded with {:04d} and make a number section of -001, rather than -0001?
- Should there be an image_number_for_time? This would be useful for people building time-based playback engines, but it would also take the opinion that a frame starts at a given time and holds until the next frame's start time.
References: …
TODO:
- frame_for_time
- frame_range_for_time_range
- abstract_target_url
- int vs. long (in a number of places)

Co-authored by: @apetrynet