Template Parser Limitations?

Hi!

I think I have discovered a bug.

It appears that Toolkit can generate paths from templates with multiple keys in a row, but cannot validate or solve a path against a template with multiple keys in a row.

I did a deep dive into the parsing code, and it appears to assume, without confirming, that the template is made up of “token”|“key”|“token” triplets.

If I have a template like this:

test_broken_path: "sequences/{Sequence}/{Project}{Sequence}_v{version}.nk"

I can generate that path using a tool like tk-multi-workfiles2.
It would give us a path like this (for example):
'/mnt/Projects/SG/sequences/202_054/SG202_054_v001.nk'

But when I try to create a tk-nuke-writenode in that scene, the test_broken_path template cannot validate that our path comes from its own template.

If I put a static token between the “{Project}” and “{Sequence}” keys, for example an underscore, the “token”|“key”|“token” triplet is valid again and the template solves.

test_fixed_path: "sequences/{Sequence}/{Project.tank_name}_{Sequence}_v{version}.nk"

Which would generate a path like:
'/mnt/Projects/SG/sequences/202_054/SG_202_054_v001.nk'

And this new path will validate correctly against the template that created it.


I had a bit of a think on ways to solve a […] sequence where there may be more than 1 key in a row.

A recursive solve might be a good way to approach this. There are already checks that confirm the {Sequence} value found earlier in the path matches the {Sequence} value found later in the path.
We could identify <multi_key> sections in our triplets and prepopulate key values from the keys we have already solved.
As long as all but one of the keys in the section are already solved, the missing key can be inferred from the remaining characters of the <multi_key> substring.
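
A minimal sketch of that inference, assuming the keys in the multi-key section are known in template order and all but one of them already has a value from elsewhere in the path. Note that `infer_missing_key` is a hypothetical helper for illustration, not something that exists in tk-core.

```python
def infer_missing_key(substring, key_order, known):
    """Return the value of the single unsolved key in a multi-key substring.

    substring: text matched by the adjacent keys, e.g. "SG202_054"
    key_order: key names in template order, e.g. ["Project", "Sequence"]
    known:     values already solved elsewhere, e.g. {"Sequence": "202_054"}
    """
    missing = [k for k in key_order if k not in known]
    if len(missing) != 1:
        raise ValueError("expected exactly one unsolved key, got %d" % len(missing))
    idx = key_order.index(missing[0])

    # Known values before the missing key must form the prefix of the
    # substring; known values after it must form the suffix.
    prefix = "".join(known[k] for k in key_order[:idx])
    suffix = "".join(known[k] for k in key_order[idx + 1:])
    if len(prefix) + len(suffix) >= len(substring):
        return None  # nothing would be left over for the missing key
    if not (substring.startswith(prefix) and substring.endswith(suffix)):
        return None  # known values do not fit this substring
    return substring[len(prefix):len(substring) - len(suffix)]


print(infer_missing_key("SG202_054", ["Project", "Sequence"], {"Sequence": "202_054"}))
# SG
```

Returning None rather than raising on a mismatch lets a caller treat the result like a failed validation.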

If more than one of the keys in the multi_key substring is missing, it gets a bit more complicated.
You could brute force every possible split of the multi_key substring and store the results as candidates to be resolved later while solving the rest of the path.

The logic for generating the multi_key combinations is roughly the following.

from itertools import chain, combinations

def partition(message):
    n = len(message)
    b, mid, e = [0], list(range(1, n)), [n]
    splits = (d for i in range(n) for d in combinations(mid, i))
    return [[message[sl] for sl in map(slice, chain(b, d), chain(d, e))]
            for d in splits]

def limit_partition(iterable, key_count):
    # Keep only the partitions that have exactly one piece per key.
    results = partition(iterable)
    for result in results:
        if len(result) == key_count:
            yield result

For example, limit_partition("SG202_054", 2) yields:

["S", "G202_054"]
["SG", "202_054"]
["SG2", "02_054"]
["SG20", "2_054"]
...
["SG202_0", "54"]
["SG202_05", "4"]
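
The filter above builds every partition of the string (2^(n-1) of them for n characters) before throwing most of them away. As a rough sketch of an alternative, the cut points can be chosen directly, and candidates that contradict already-solved key values can be dropped as they are generated. The names `k_way_splits` and `candidate_assignments` are made up for this example and are not part of tk-core.

```python
from itertools import combinations


def k_way_splits(message, key_count):
    """Yield every split of message into key_count non-empty pieces."""
    n = len(message)
    # Pick key_count - 1 cut positions from the gaps between characters.
    for cuts in combinations(range(1, n), key_count - 1):
        bounds = (0,) + cuts + (n,)
        yield [message[a:b] for a, b in zip(bounds, bounds[1:])]


def candidate_assignments(message, key_order, known=None):
    """Yield {key: value} dicts that agree with already-solved key values."""
    known = known or {}
    for parts in k_way_splits(message, len(key_order)):
        candidate = {}
        consistent = True
        for key, value in zip(key_order, parts):
            # Reject a split that contradicts a known value, or that gives
            # two different values to a key that appears twice.
            if known.get(key, value) != value or candidate.get(key, value) != value:
                consistent = False
                break
            candidate[key] = value
        if consistent:
            yield candidate


print(list(candidate_assignments(
    "SG202_054", ["Project", "Sequence"], {"Sequence": "202_054"}
)))
# [{'Project': 'SG', 'Sequence': '202_054'}]
```

With no known values, every split comes back as a candidate, which matches the "store them as options" idea: the list can be narrowed later once another part of the path pins down one of the keys.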

There is a quick way to calculate how many combinations a given substring will return, based on its length and the number of keys in the substring.
That logic is here:

from math import comb  # Python 3.8+

def _get_message_combos(msg, key_count=2):
    # Splitting n characters into key_count non-empty pieces means choosing
    # key_count - 1 cut points out of the n - 1 gaps between characters.
    return comb(len(msg) - 1, key_count - 1)

_get_message_combos("SG202_054", key_count=2)  # -> 8
_get_message_combos("SG202_054", key_count=3)  # -> 28

This is the binomial coefficient C(n-1, k-1), where n == len(msg) and k == key_count.
For three keys it is the same as the triangle number T(n-2).


Another way to solve this problem would be to allow composite keys. This is a direction I did not investigate much, but if we could define a composite key “{ProjectSequence}” that resolves to “{Project}{Sequence}”, that might work? However, without diving deep into it, I bet that would just be kicking the can down the road, and the problem would rear its head again later.


Would love to get some thoughts on this. Perhaps this has already been documented somewhere and I just missed it?

I will attach a basic code sample to demonstrate the issue in the next comment.


A minimal code example is below, assuming that the commented template lines are valid templates in your config.

import sgtk

engine = sgtk.platform.current_engine()

# Templates
# test_broken_path: "sequences/{Sequence}/{Project.tank_name}{Sequence}_v{version}.nk"
# test_fixed_path: "sequences/{Sequence}/{Project.tank_name}_{Sequence}_v{version}.nk"

bad_template = engine.get_template_by_name("test_broken_path")
fix_template = engine.get_template_by_name("test_fixed_path")
bad_path = '/mnt/Projects/SG/sequences/202_054/SG202_054_v01.nk'
fix_path = '/mnt/Projects/SG/sequences/202_054/SG_202_054_v01.nk'

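# Print the results so they are visible when run as a script.
print(bad_template.validate(bad_path))  # reported above as failing (adjacent keys)
print(fix_template.validate(fix_path))  # reported above as validating correctly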

I forgot to mention where in the codebase this issue starts rearing its head.

I think the cause is here:

https://github.com/shotgunsoftware/tk-core/blob/master/python/tank/template_path_parser.py#L311

It appears to be assuming alternating (“static token” “key”) pairs.

But in our case we have a key followed by another key.

/{Project}{Sequence}_
<token>{key}{key}<token>
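
As a stopgap, a config could be scanned for template definitions that contain this key-followed-by-key pattern. The following is only a sketch that works on the definition string itself: `find_adjacent_keys` is a hypothetical helper, not a tk-core function, and it does not account for optional `[...]` sections.

```python
import re

# Matches one {key} that is immediately followed by another {key}, i.e. two
# keys with no static token between them. The lookahead leaves the second
# key unconsumed so that runs of three or more keys report every pair.
ADJACENT_KEYS = re.compile(r"\{([^{}]+)\}(?=\{([^{}]+)\})")


def find_adjacent_keys(definition):
    """Return (left_key, right_key) pairs that have no separator between them."""
    return [(m.group(1), m.group(2)) for m in ADJACENT_KEYS.finditer(definition)]


print(find_adjacent_keys("sequences/{Sequence}/{Project}{Sequence}_v{version}.nk"))
# [('Project', 'Sequence')]
print(find_adjacent_keys("sequences/{Sequence}/{Project}_{Sequence}_v{version}.nk"))
# []
```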

@tannaz is this a known limitation that is out of scope for Toolkit? Or is this something that might be fixed at some point?