Thinnings: Sublist Witnesses and de Bruijn Index Shift Clumping

philipzucker.com · matt_d · 3 days ago
There was a thread on Mastodon recently where Conor McBride was discussing some really cool stuff. I think the intent was to get at something more interesting, but this is the first place I’ve seen thinnings explained without too many complex trappings around them, and it really clicked with me. I think I had a primed mind to see something useful to my current set of problems and solutions there. It unlocked a torrent of ideas related to lambda egraphs and generalized union-finds that I’m quite excited to talk about in the next couple of posts. But first we need to lay some groundwork and discuss thinnings.

I’ve seen thinnings before as some mystifying Agda inductive with no obvious relevance to anything that could be useful to me. But they are a perfectly sensible mathematical object and do not have to be relegated to usage only in dependent type theory. You can perfectly well talk about them in Python or any profane (profane is intended as both compliment and criticism) language of your choosing.

A couple of intuitive highlights:

- Thinnings are witness data to a sublist question.
- Thinnings are visualizable as non-overlapping strings going between n -> m slots in an increasing way.
- Thinnings are representable as bool lists / bitvectors.
- Thinnings are the bulk clumping / compaction of de Bruijn lifting and lowering operations, like how permutations are the clumping of many swaps. Another way of saying it is that they are generated by them.
- Thinnings, like many actions, can be composed `t . s`, not just applied `t(s(x))`. This is often a very small but also a very big shift in perspective.

## Thinnings as Sublist Witnesses

I like thinking about “witness data” https://www.philipzucker.com/proof_objects/ .
Given the witness data, some problem becomes easier in some sense to do (often an improvement in decidability or big O complexity, but I also accept a subjective “just becomes easier to think about”). Some examples of such data:

- A satisfying assignment to a SAT problem is a witness to its satisfiability. It’s easy to check an assignment, hard to invent one.
- Paths through a graph witness the connectedness of two vertices.
- A permutation witnesses that a sorted list has the same elements as its input list.
- The witness that an object is in a list is the index at which to find it.

Witnesses are somewhat related to “parse don’t validate”. Parse trees are witness data to a string being in a language.

Thinnings are witnesses to a sublist question. I can write a function to determine if one list is a sublist of another (or subsequence really. My brain doesn’t distinguish between the words, but many brains do). Traverse through the big one and increment a pointer into the small one when we find a match:

```python
def is_sublist(xs, ys) -> bool:
    if len(xs) == 0:
        return True
    i = 0
    for y in ys:
        if y == xs[i]:
            i += 1
            if i == len(xs):
                return True
    return False

assert is_sublist([], [1, 2, 3])
assert is_sublist([1], [1, 2, 3])
assert is_sublist([1, 3], [1, 2, 3])
assert not is_sublist([4], [1, 2, 3])
assert not is_sublist([2, 2], [1, 2, 3])
```

But I can also do so in a witness-producing way. This is in some sense a trace of the computation:

```python
def is_sublist(xs: list, ys: list) -> list[bool] | None:
    i = 0
    pf = []
    for y in ys:
        if i < len(xs) and y == xs[i]:
            i += 1
            pf.append(True)
        else:
            pf.append(False)
    if i < len(xs):
        return None
    else:
        return pf

assert is_sublist([], [1, 2, 3]) == [False, False, False]
assert is_sublist([1], [1, 2, 3]) == [True, False, False]
assert is_sublist([1, 3], [1, 2, 3]) == [True, False, True]
assert is_sublist([4], [1, 2, 3]) == None
assert is_sublist([2, 2], [1, 2, 3]) == None
```

It is a little easier to verify a certificate than it is to make it. If we represented the certificate in a more compact way (see the widening representation later in the post), it can be O(|ys|) time to find one, but only O(|pf|) = O(|xs|) to check one.

```python
def verify(xs: list, ys: list, pf: list[bool]) -> bool:
    if len(pf) != len(ys):
        return False
    if len(xs) != len([() for p in pf if p]):  # Thanks to Chris Warbo for the correction
        return False
    i = 0
    for y, p in zip(ys, pf):
        if p:
            if xs[i] != y:
                return False
            else:
                i += 1
    return True

assert verify([], [1, 2, 3], is_sublist([], [1, 2, 3]))
assert verify([1], [1, 2, 3], is_sublist([1], [1, 2, 3]))
assert verify([1, 3], [1, 2, 3], is_sublist([1, 3], [1, 2, 3]))
```

Thinnings can also be used as instructions on how to produce the small list from the big list:

```python
def thin(xs: list, pf: list[bool]) -> list:
    return [x for x, b in zip(xs, pf) if b]

assert thin([1, 2, 3], [False, True, False]) == [2]
assert thin([1, 2, 3], is_sublist([1, 3], [1, 2, 3])) == [1, 3]
```

There can be multiple possible certificates in the presence of duplicates. Our proof-producing method only produces one of them (a lex-minimal one, kind of).

```python
assert verify([2], [2, 2], [True, False])
assert verify([2], [2, 2], [False, True])
```

Another nice context to think about this is in Prolog. It is often the case that we can write an n-ary relation and then write an (n+1)-ary version that contains a tracing parameter. This tracing parameter is a proof object / witness.
The Prolog sublist/2 relation is not “good” from the perspective that it backtracks through all the ways in which X is a sublist of Y. This is very reasonable, however, from the perspective that the core philosophy of logic programming is computation ~ proof search. There are many ways to prove one list is a sublist of another in the presence of duplicate elements.

```prolog
%% file /tmp/sublist.pl

% without proof witness field: sublist/2
sublist([], _Ys).
sublist([X|Xs], [X|Ys]) :- sublist(Xs, Ys).
sublist([X|Xs], [_|Ys]) :- sublist([X|Xs], Ys).

% the witness-producing version: sublist/3
% if you like, we could write it using some infix notation as `Ps : sublist(Xs, Ys)`
sublist([], [], []).
sublist([X|Xs], [X|Ys], [true|Ps]) :- sublist(Xs, Ys, Ps).
sublist(Xs, [_|Ys], [false|Ps]) :- sublist(Xs, Ys, Ps).
```

I’ve been tinkering with using Scryer Prolog as a Python library via maturin bindings https://github.com/philzook58/scryerpy . We can run some examples showing how the above predicates work:

```python
import scryer

m = scryer.Machine()
m.query("consult('/tmp/sublist.pl').")

# computing forwards
assert str(m.query_once("sublist([1,2], [1,2,3], Ps).")["Ps"]) == "[true, true, false]"
assert str(m.query_once("sublist([2], [1,2,3], Ps).")["Ps"]) == "[false, true, false]"

# going "backwards"
assert str(m.query_once("sublist(Xs, [1,2,3], [true,false,true]).")["Xs"]) == "[1, 3]"

assert m.query_once("sublist([4], [1,2,3], Ps).") == None  # fails

[str(answer["Ps"]) for answer in m.query("sublist([2], [2,2,3], Ps).")]  # all proofs
```

```
['[true, false, false]', '[false, true, false]']
```

## Representations and Composition of Thinnings

The certificate above is a bitvector. You can implement operations on it using fast machine operations if you want.
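As a sketch of what that might look like (this code is mine, not from the thread): encode the bitvector as a Python int, one bit per position. Composition in this encoding is essentially the “parallel bits deposit” (PDEP) operation from the x86 BMI2 extension: scatter the bits of the small thinning into the set bit positions of the big one.

```python
# Hedged sketch: thinnings as machine integers (bit i set = position i kept).
# Composition deposits the bits of `small` into the set positions of `big`,
# written here as a plain Python loop rather than an intrinsic.

def to_int(bv: list[bool]) -> int:
    return sum(1 << i for i, b in enumerate(bv) if b)

def comp_int(small: int, big: int) -> int:
    res = 0
    j = 0  # which bit of `small` to deposit next
    while big:
        low = big & -big       # lowest set bit of `big`
        if (small >> j) & 1:
            res |= low
        j += 1
        big &= big - 1         # clear that bit and continue
    return res

# Matches comp(y, x) == [True, False, False] from the list-based version:
x = [True, False, True]
y = [True, False]
assert comp_int(to_int(y), to_int(x)) == to_int([True, False, False])
```

On hardware with BMI2 this loop collapses to a single `pdep` instruction, and Python 3.10+ gives you popcount (the size of the small list) via `int.bit_count()`.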
https://msp.cis.strath.ac.uk/types2025/abstracts/TYPES2025_paper37.pdf

Thinnings represent an order-preserving mapping between a big list of size M and a little list of size N. It only makes sense to compose thinnings of the right sizes. We can implement composition and identity thinnings as Python functions. All of this makes thinnings a category.

The following thinning could be represented pictorially, but also as a bit vector, a prefix sum, an index selector list, or an index selector set:

```python
repbitvec = [True, False, True, False, False, True]
repsum = [0, None, 1, None, None, 2]
repwide = ([0, 2, 5], 6)  # Need to store domain explicitly
repset = ({0, 2, 5}, 6)   # order doesn't matter anymore since distinct.
```

Sublist is a transitive and reflexive relation. The witnesses inherit a notion of composition and identity from this. These functions are not so hard to implement in Python in the bitvec representation:

```python
def presum(xs: list[bool]) -> list[int | None]:
    # converts to prefix sum representation
    res = []
    i = 0
    for x in xs:
        if x:
            res.append(i)
            i += 1
        else:
            res.append(None)
    return res

# smalllist -> medium -> biglist
def comp(xs: list[bool], ys: list[bool]) -> list[bool]:
    return [y is not None and xs[y] for y in presum(ys)]

def id(n: int) -> list[bool]:
    return [True] * n

x = [True, False, True]
y = [True, False]
assert comp(id(3), id(3)) == id(3)
assert comp(id(2), x) == x
assert comp(x, id(3)) == x
assert comp(y, x) == [True, False, False]
```

Thinnings also inherit a notion of (monoidal) product from the appending of lists. It is just appending the bitvectors:

```python
def prod(xs, ys):
    return xs + ys

assert verify([2] + [3], [2, 4] + [3, 5], is_sublist([2], [2, 4]) + is_sublist([3], [3, 5]))
```

## De Bruijn Shifting

https://en.wikipedia.org/wiki/De_Bruijn_index

De Bruijn indices are cool.
Variables refer to their binding site by counting the number of binders they must cross to get to it. For example:

```
lam x. x              ==>  lam v0
lam x. (lam y. x)     ==>  lam (lam v1)
lam x. (x, lam y. x)  ==>  lam (v0, lam v1)
```

When you are using de Bruijn indices, when you move a term `t` under a binder during some substitution operation, you need to bump any index that refers across the new binder, but not bump any index that refers inside `t`. http://adam.chlipala.net/cpdt/html/DeBruijn.html

```python
from dataclasses import dataclass

@dataclass
class App:
    f: str
    args: list["Term"]

@dataclass
class Binder:
    body: "Term"

@dataclass
class Var:
    idx: int

type Term = App | Binder | Var

def shift0(min: int, t: Term) -> Term:
    match t:
        case Var(idx):
            if idx < min:
                return Var(idx)
            else:
                return Var(idx + 1)
        case App(f, args):
            return App(f, [shift0(min, arg) for arg in args])
        case Binder(body):
            return Binder(shift0(min + 1, body))

assert shift0(0, Var(0)) == Var(1)
assert shift0(3, Var(4)) == Var(5)
assert shift0(3, Var(2)) == Var(2)
assert shift0(0, Binder(Var(0))) == Binder(Var(0))
assert shift0(0, Binder(Var(1))) == Binder(Var(2))
```

What shift is doing is putting one gap in our variables.

```python
shift0(1, App("f", [Var(0), Var(1), Var(2)]))
```

```
App(f='f', args=[Var(idx=0), Var(idx=2), Var(idx=3)])
```

Sometimes we end up applying many shifts as terms move around.

```python
shift0(4, shift0(3, shift0(1, App("f", [Var(0), Var(1), Var(2)]))))
```

```
App(f='f', args=[Var(idx=0), Var(idx=2), Var(idx=5)])
```

You can compact these shifts together into a “mega shift”. Mega shifts are widenings / thinnings. Thinnings are basically generated by the shift operations, which kind of sort of commute. This is similar to how permutations are a compaction of swapping.
Permutations are “mega swaps”.

```python
def widen(wide: list[int], t: Term) -> Term:
    match t:
        case Var(idx):
            return Var(wide[idx])
        case App(f, args):
            return App(f, [widen(wide, arg) for arg in args])
        case Binder(body):
            # under a binder: index 0 stays put, the rest of the map shifts up
            return Binder(widen([0] + [1 + w for w in wide], body))

assert widen([0, 1, 2], Var(2)) == Var(2)
assert widen([0, 2, 4], App("f", [Var(0), Var(1), Var(2)])) == App("f", [Var(0), Var(2), Var(4)])

# big <- med <- small
def comp(s, t):
    return [s[i] for i in t]

def id(n):
    return list(range(n))

def shift(n: int, ctx: int):
    return [i if i < n else i + 1 for i in range(ctx + 1)]

assert comp([0, 1, 3], [0, 2]) == [0, 3]
assert comp(id(3), id(3)) == id(3)
assert comp(id(3), [0, 2]) == [0, 2]
assert comp([0, 2, 5], id(3)) == [0, 2, 5]
assert shift(1, 3) == [0, 2, 3, 4]
assert comp(shift(4, 4), comp(shift(3, 3), shift(1, 2))) == [0, 2, 5]
assert widen(comp(shift(4, 4), comp(shift(3, 3), shift(1, 2))), App("f", [Var(0), Var(1), Var(2)])) \
    == App("f", [Var(0), Var(2), Var(5)]) \
    == shift0(4, shift0(3, shift0(1, App("f", [Var(0), Var(1), Var(2)]))))

# they "commute" by decrementing one of the shifts
assert comp(shift(3, 4), shift(1, 3)) == comp(shift(1, 4), shift(2, 3))
```

## Bits and Bobbles

Why is this interesting to me: lambda (alpha) egraphs. So the next two posts, I think, will be about thinning union-finds and thinning hash conses.

You can bolt thinnings into a hash cons, like you can throw permutations into a hash cons for something nominal / slotted https://www.philipzucker.com/slotted_hash_cons/ . Retain terms as you go as the pair of an ephemeral part containing a thinning and an interned part that is maximally thinned. I think this basically works and is Co de Bruijn.
I believe this is a form of McBride’s https://arxiv.org/abs/1807.04085 Everybody’s Got to Be Somewhere, which I mistakenly thought contained the Map to Var idea because of what was contained in the hashing modulo alpha paper https://arxiv.org/abs/2105.02856 . It has something more local and interesting, I think.

Used in this maximal way, thinnings kind of serve as a “free variable analysis”. Consider the set form of thinnings described above, `(set[int], int)`. The missing piece in a free variable analysis is kind of the current scope you’re in, the latter half of the pair. Also it is a bit odd, but sensible, to run a free variable analysis on de Bruijn indices; you need to shift the analysis results as you pass binders. I do not think it is a good idea to consider these thinnings as merely an analysis though, since they are part of what it even is to be a well-scoped term.

The idea is kind of related to “let the succs float”: start from

```ocaml
type nat = Succ of nat | Zero

type term =
  | App of (string * term list)
  | Var of nat
  | Bind of term
```

and smush those two types together to

```ocaml
type term =
  | Succ of term
  | Zero
  | App of (string * term list)
  | Bind of term
```

The meaning of Succ is lifting n-ary functions to (n+1)-ary functions. This gives de Bruijn shifting a semantic interpretation. This trick appears in Kiselyov’s SKI semantically https://okmij.org/ftp/tagless-final/ski.pdf , Pavel’s blog post https://pavpanchekha.com/blog/egg-bindings.html , and McBride references Thorsten Altenkirch, Martin Hofmann & Thomas Streicher (1995): Categorical Reconstruction of a Reduction Free Normalization Proof, where there is some neat ML code in the appendix.

You can bolt thinnings into a union-find. Thinnings have some flavor of a monus. This is also similar in many ways to the slotted union-find, which Rudi and I have not found the right way to write about yet. It appears there is a class of union-finds that resolves unions of structured eids by appealing to a unification procedure.
There is something like a unification procedure available for thinnings, which I think is kind of the subject of some of Conor’s Mastodon toots. Thinnings are also a bit like a monus, which also seems to have interesting union-finds associated with it. I don’t know why monus has shown up https://doisinkidney.com/posts/2026-03-03-monus-heaps.html in my feed at the same time. Is this a coincidence? Reed Mullanix also brought them up on Mastodon.

Thinnings are kind of taking slotted’s notion of redundancy but tossing out the permutations. https://dl.acm.org/doi/10.1145/3729326 Slotted egraphs. I think I’ve seen Rudi make a diagram representing slots as little lines running through the edges between children. A “thickened” parent-child relationship. Thinnings are kind of the same, but they do not allow wire crossing.

https://egraphs.zulipchat.com/#narrow/channel/556617-rambling/topic/Philip.20Zucker/near/574167694 Here I allude to maybe this being the case. I wonder where I pulled this from, because I didn’t really know what thinnings were?

https://types.pl/@pigworker/116165967722741785
https://types.pl/@pigworker/116166311939433591
https://types.pl/@pigworker/116166518304559876
https://types.pl/@pigworker/116166766060828508
https://types.pl/@pigworker/116166793347069264
https://types.pl/@pigworker/116166908231448340
https://types.pl/@pigworker/116166917513186826

There are many other kinds of proof objects https://www.philipzucker.com/proof_objects/ . A similar one that is useful is outputting a permutation from a sorting algorithm. Verifying a permutation takes O(n), but sorting under the appropriate assumptions takes O(n ln(n)), so there is a complexity gap there. https://cacm.acm.org/research/program-correctness-through-self-certification/

The clumping of thinnings seems very useful to me when comparing with nominal unification, where the clumping of permutations can get thunked on unification variables.
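To make the sorting example above concrete, here is a sketch (my code, not from the self-certification paper) of a sort that emits its permutation witness, plus a linear-time checker:

```python
# Hedged sketch: proof-producing sort. perm[k] records where ys[k] came from in xs.

def sort_with_witness(xs: list) -> tuple[list, list[int]]:
    perm = sorted(range(len(xs)), key=lambda i: xs[i])
    return [xs[i] for i in perm], perm

def verify_sort(xs: list, ys: list, perm: list[int]) -> bool:
    # O(n): perm is a permutation, ys is xs rearranged by it, and ys is nondecreasing
    n = len(xs)
    if len(ys) != n or len(perm) != n:
        return False
    seen = [False] * n
    for i in perm:
        if not (0 <= i < n) or seen[i]:
            return False
        seen[i] = True
    return all(ys[k] == xs[perm[k]] for k in range(n)) and \
           all(ys[k] <= ys[k + 1] for k in range(n - 1))

ys, perm = sort_with_witness([3, 1, 2])
assert ys == [1, 2, 3] and perm == [1, 2, 0]
assert verify_sort([3, 1, 2], ys, perm)
assert not verify_sort([3, 1, 2], [1, 2, 3], [0, 1, 2])  # right output, wrong witness
```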
https://en.wikipedia.org/wiki/Simplex_category Thinnings are also morphisms in the simplex category (the injective ones; the degeneracy maps are the rest). Simplicial sets are a way of organizing face->edge data that uses this category as an indexing structure. Is it surprising that de Bruijn manipulations and describing shapes could share this structure? Maybe. Simplicial sets add in degenerate triangles (like a triangle with only 2 distinct vertices (kind of a “thick” line) or even only 1 distinct vertex (a “thick” point)). These are added in because it makes things cleaner. This degeneracy and tossing stuff away does kind of jive with thinning. Nameless projections kind of jive with thinning. I dunno. If a triangle is a list of 3 points, the oriented edges are kind of the 2-vertex sublists of that and the vertices are kind of the 1-vertex sublists.

I personally don’t want to emphasize lambda, or at least lambda with beta substitution. I’m more interested in other binders like sum, max, integral, forall, exists. De Bruijn shifting doesn’t per se have to do with beta reduction; it can just be part of what happens in terms with binders and variables while you’re doing substitution in them. You need to do it while manipulating sum expressions also, not just lambda.

Witnesses are indeed related to proof-relevant mathematics as done in dependent type theory, for example, but I kind of like the perspective of deemphasizing that. They’re just data and a concept. They don’t have to be a rearrangement of your entire framework of reasoning into new dots and squiggles and colons and horizontal lines. Although that can be fun too, in the right mood.

Do thinnings help solve the Prolog set or multiset problem? Sets are sorted and deduped lists, multisets are sorted lists. Sublists are an easier problem than subsets.
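The triangle remark above can be made concrete with a small sketch of my own (faces as sublists, nothing more categorical than that): enumerate the k-point faces of a list of points together with their thinning witnesses.

```python
from itertools import combinations

# Hedged sketch: the k-point "faces" of a list of points are its k-sublists,
# and each face comes with a bitvector thinning saying which points survive.
def faces(points: list, k: int) -> list[tuple[list, list[bool]]]:
    n = len(points)
    return [([points[i] for i in keep], [i in keep for i in range(n)])
            for keep in combinations(range(n), k)]

triangle = ["a", "b", "c"]

# the three edges of the triangle...
assert [f for f, _ in faces(triangle, 2)] == [["a", "b"], ["a", "c"], ["b", "c"]]
# ...and their thinnings into the triangle
assert [t for _, t in faces(triangle, 2)] == [
    [True, True, False], [True, False, True], [False, True, True]]
```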
The converse of a thinning is a widening or a thickening.

```python
# monoidal product in widening representation
def prod(w1: tuple[list[int], int], w2: tuple[list[int], int]):
    w1, ctx = w1
    w2, ctx2 = w2
    # shift the second widening past the first one's context, then append
    return (w1 + [ctx + i for i in w2], ctx + ctx2)
```

https://jesper.sikanda.be/posts/1001-syntax-representations.html I’m not so sure this summary does Co de Bruijn justice anymore.

Proof-relevant Prolog is evocative of the reified predicates from “Indexing dif/2” https://arxiv.org/abs/1607.01590 . I wonder if there is a fun connection here. Proof-relevant Prolog also, I think, can perhaps give a nice story about what cut `!` is. When well used, it is some notion of proof irrelevance / proof canonization. Cut is very relevant for doing member checks https://www.swi-prolog.org/pldoc/man?predicate=memberchk/2 , which have a similar issue: there is more than one index witness where an element may occur. I think you can, in a subsumptive and logical way, say that they are all the same to me and that the canonical proof object is the smallest index.

@tausbn writes https://social.wub.site/@tausbn/116199267244002079 : “I know this likely makes no difference with respect to what you wrote (which I’m still digesting), but I can’t help but share my favourite way to do a subsequence check in Python:”

```python
def is_sublist(needle, haystack) -> bool:
    haystack_sweep = iter(haystack)
    return all(elt in haystack_sweep for elt in needle)
```

That’s cute! `iter` is crazy sometimes.