Last week I got the opportunity to share how we are using graph transformer networks, built with PyTorch Geometric, to augment layout transformers for complex semantic modeling of visually rich documents such as receipts. We use graph neural networks augmented with layout transformer embeddings to model semantic topology as a link-prediction task. Check out the talk below.
A big thanks to my collaborator on this work, Boris Kogan!
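
For readers who want a feel for the link-prediction framing before watching, here is a rough, dependency-free sketch. It is not the model from the talk and it does not use PyTorch Geometric: the token features are made-up stand-ins for layout transformer embeddings, the layout edges are hypothetical, the function names `mean_aggregate` and `link_score` are my own, and nothing is trained. It only shows the shape of the idea: pass messages over a layout graph, then score candidate token pairs as possible semantic links.

```python
import math

def mean_aggregate(features, edges):
    """One round of message passing: each node becomes the mean of
    itself and its neighbours (edges are treated as undirected)."""
    neighbours = {i: [i] for i in range(len(features))}
    for src, dst in edges:
        neighbours[src].append(dst)
        neighbours[dst].append(src)
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in (neighbours[i] for i in range(len(features)))
    ]

def link_score(hidden, u, v):
    """Score a candidate link as the sigmoid of the embedding dot product."""
    dot = sum(a * b for a, b in zip(hidden[u], hidden[v]))
    return 1.0 / (1.0 + math.exp(-dot))

# Hypothetical receipt tokens; the 2-d vectors stand in for real
# layout transformer embeddings, which would be much larger.
tokens = ["Total", "$12.50", "Tax", "$1.00"]
features = [[1.0, 0.2], [0.9, 0.1], [-0.8, 0.5], [-0.7, 0.6]]

# Hypothetical layout graph, e.g. tokens that sit next to each other.
layout_edges = [(0, 1), (2, 3), (1, 3)]

hidden = mean_aggregate(features, layout_edges)

# Candidate semantic links to score (key -> value pairs on the receipt).
for u, v in [(0, 1), (0, 3), (2, 3)]:
    print(f"{tokens[u]} -> {tokens[v]}: {link_score(hidden, u, v):.3f}")
```

In a real pipeline the aggregation step would be a learned graph layer and the scorer would be trained against labeled links; the untrained dot product here is only a placeholder for that decoder.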