## Conformer
Implementation of the convolutional module from the Conformer paper, for improving the local inductive bias in Transformers.
## Install
```bash
$ pip install conformer
```
## Usage
The Conformer convolutional module, the main novelty of the paper:
```python
import torch
from conformer import ConformerConvModule
layer = ConformerConvModule(
    dim = 512,
    causal = False,           # whether to make the 1d depthwise convolution causal (left-padded) for auto-regressive use
    expansion_factor = 2,     # multiple of the dimension to expand to for the depthwise convolution
    kernel_size = 31,         # kernel size; 17 - 31 was reported to be optimal
    dropout = 0.              # dropout at the very end
)
x = torch.randn(1, 1024, 512)
x = layer(x) + x
```
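Internally, the module follows the paper's design: a pointwise convolution with GLU gating, a depthwise convolution, batch normalization and Swish, a final pointwise projection, and a residual connection. A minimal sketch in plain PyTorch (class and attribute names here are illustrative, not the library's own):

```python
import torch
from torch import nn

class ConvModuleSketch(nn.Module):
    """Illustrative re-implementation of the paper's convolution module (non-causal case)."""
    def __init__(self, dim, expansion_factor = 2, kernel_size = 31, dropout = 0.):
        super().__init__()
        inner = dim * expansion_factor
        self.norm = nn.LayerNorm(dim)
        self.pointwise_in = nn.Conv1d(dim, inner * 2, 1)   # expand; GLU halves it back
        self.glu = nn.GLU(dim = 1)
        self.depthwise = nn.Conv1d(inner, inner, kernel_size,
                                   padding = (kernel_size - 1) // 2, groups = inner)
        self.bn = nn.BatchNorm1d(inner)
        self.swish = nn.SiLU()                             # Swish activation
        self.pointwise_out = nn.Conv1d(inner, dim, 1)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):                  # x: (batch, seq, dim)
        x = self.norm(x).transpose(1, 2)   # Conv1d expects (batch, dim, seq)
        x = self.glu(self.pointwise_in(x))
        x = self.swish(self.bn(self.depthwise(x)))
        x = self.dropout(self.pointwise_out(x))
        return x.transpose(1, 2)           # back to (batch, seq, dim)

layer = ConvModuleSketch(dim = 512)
x = torch.randn(1, 1024, 512)
out = layer(x) + x                         # residual connection, as in the paper
```

Note that with `'same'` padding on the depthwise convolution, the sequence length is preserved, so the residual adds cleanly.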
A single Conformer block:
```python
import torch
from conformer import ConformerBlock
block = ConformerBlock(
    dim = 512,
    dim_head = 64,
    heads = 8,
    ff_mult = 4,
    conv_expansion_factor = 2,
    conv_kernel_size = 31,
    attn_dropout = 0.,
    ff_dropout = 0.,
    conv_dropout = 0.
)
x = torch.randn(1, 1024, 512)
block(x) # (1, 1024, 512)
```
The full Conformer is just a stack of the `ConformerBlock`s from above:
```python
import torch
from conformer import Conformer
conformer = Conformer(
    dim = 512,
    depth = 12,               # 12 blocks
    dim_head = 64,
    heads = 8,
    ff_mult = 4,
    conv_expansion_factor = 2,
    conv_kernel_size = 31,
    attn_dropout = 0.,
    ff_dropout = 0.,
    conv_dropout = 0.
)
x = torch.randn(1, 1024, 512)
conformer(x) # (1, 1024, 512)
```
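With `causal = True`, the depthwise convolution is padded on the left only, so each output position sees no future frames. A minimal sketch of that padding scheme in plain PyTorch (a standard causal 1d convolution; the library's internals may differ in detail):

```python
import torch
from torch import nn
import torch.nn.functional as F

kernel_size = 31
conv = nn.Conv1d(8, 8, kernel_size, groups = 8, bias = False)  # depthwise conv, no built-in padding

x = torch.randn(1, 8, 100)                  # (batch, channels, seq)
y = conv(F.pad(x, (kernel_size - 1, 0)))    # left-pad only -> output length 100, no lookahead

# causality check: output position t depends only on x[..., :t+1],
# so perturbing the future leaves earlier outputs unchanged
x2 = x.clone()
x2[..., 50:] += 1.0
y2 = conv(F.pad(x2, (kernel_size - 1, 0)))
assert torch.allclose(y[..., :50], y2[..., :50])
```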
## Todo
- [ ] switch to a better relative positional encoding; Shaw's is dated
- [ ] flash attention with a better relative positional encoding
## Citations
```bibtex
@misc{gulati2020conformer,
    title         = {Conformer: Convolution-augmented Transformer for Speech Recognition},
    author        = {Anmol Gulati and James Qin and Chung-Cheng Chiu and Niki Parmar and Yu Zhang and Jiahui Yu and Wei Han and Shibo Wang and Zhengdong Zhang and Yonghui Wu and Ruoming Pang},
    year          = {2020},
    eprint        = {2005.08100},
    archivePrefix = {arXiv},
    primaryClass  = {eess.AS}
}
```