[Paper Walkthrough] End-to-End Video-Language Transformers..Masked Visual-token..[2111.12681]

Author: 秋刀鱼的炼丹工坊

Author bio: This passage of the scripture is so profound that I truly do not understand it.

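Description: Paper title: VIOLET: End-to-End Video-Language Transformers with Masked Visual-token Modeling. Paper: http://arxiv.org/abs/2111.12681 Code: https://github.com/tsujuifu/pytorch_violet * Owing to the uploader's limitations, the video often mixes Chinese and English and includes broken English; please bear with it. If my reading of the paper is off, feel free to call it out.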
