Channel: dblp: Lukasz Kaiser

A Unified Approach to Boundedness Properties in MSO.

Lukasz Kaiser, Martin Lang, Simon Leßenich, Christof Löding: A Unified Approach to Boundedness Properties in MSO. CSL 2015: 441-456

Can Active Memory Replace Attention?

Lukasz Kaiser, Samy Bengio: Can Active Memory Replace Attention? CoRR abs/1610.08613 (2016)

Google's Neural Machine Translation System: Bridging the Gap between Human...

Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu,...

Machine Learning with Guarantees using Descriptive Complexity and SMT Solvers.

Charles Jordan, Lukasz Kaiser: Machine Learning with Guarantees using Descriptive Complexity and SMT Solvers. CoRR abs/1609.02664 (2016)

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems.

Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Gregory S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian J. Goodfellow, Andrew Harp,...

Multi-task Sequence to Sequence Learning.

Minh-Thang Luong, Quoc V. Le, Ilya Sutskever, Oriol Vinyals, Lukasz Kaiser: Multi-task Sequence to Sequence Learning. ICLR (Poster) 2016

Neural GPUs Learn Algorithms.

Lukasz Kaiser, Ilya Sutskever: Neural GPUs Learn Algorithms. ICLR (Poster) 2016

Can Active Memory Replace Attention?

Lukasz Kaiser, Samy Bengio: Can Active Memory Replace Attention? NIPS 2016: 3774-3782

One Model To Learn Them All.

Lukasz Kaiser, Aidan N. Gomez, Noam Shazeer, Ashish Vaswani, Niki Parmar, Llion Jones, Jakob Uszkoreit: One Model To Learn Them All. CoRR abs/1706.05137 (2017)

Attention Is All You Need.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin: Attention Is All You Need. CoRR abs/1706.03762 (2017)

Depthwise Separable Convolutions for Neural Machine Translation.

Lukasz Kaiser, Aidan N. Gomez, François Chollet: Depthwise Separable Convolutions for Neural Machine Translation. CoRR abs/1706.03059 (2017)

Learning to Remember Rare Events.

Lukasz Kaiser, Ofir Nachum, Aurko Roy, Samy Bengio: Learning to Remember Rare Events. CoRR abs/1703.03129 (2017)

Regularizing Neural Networks by Penalizing Confident Output Distributions.

Gabriel Pereyra, George Tucker, Jan Chorowski, Lukasz Kaiser, Geoffrey E. Hinton: Regularizing Neural Networks by Penalizing Confident Output Distributions. CoRR abs/1701.06548 (2017)

Attention is All you Need.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin: Attention is All you Need. NIPS 2017: 5998-6008

Regularizing Neural Networks by Penalizing Confident Output Distributions.

Gabriel Pereyra, George Tucker, Jan Chorowski, Lukasz Kaiser, Geoffrey E. Hinton: Regularizing Neural Networks by Penalizing Confident Output Distributions. ICLR (Workshop) 2017

Learning to Remember Rare Events.

Lukasz Kaiser, Ofir Nachum, Aurko Roy, Samy Bengio: Learning to Remember Rare Events. ICLR (Poster) 2017

Area Attention.

Yang Li, Lukasz Kaiser, Samy Bengio, Si Si: Area Attention. CoRR abs/1810.10126 (2018)

Universal Transformers.

Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, Lukasz Kaiser: Universal Transformers. CoRR abs/1807.03819 (2018)

Tensor2Tensor for Neural Machine Translation.

Ashish Vaswani, Samy Bengio, Eugene Brevdo, François Chollet, Aidan N. Gomez, Stephan Gouws, Llion Jones, Lukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, Jakob Uszkoreit:...

Fast Decoding in Sequence Models using Discrete Latent Variables.

Lukasz Kaiser, Aurko Roy, Ashish Vaswani, Niki Parmar, Samy Bengio, Jakob Uszkoreit, Noam Shazeer: Fast Decoding in Sequence Models using Discrete Latent Variables. CoRR abs/1803.03382 (2018)

Image Transformer.

Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Noam Shazeer, Alexander Ku: Image Transformer. CoRR abs/1802.05751 (2018)

Generating Wikipedia by Summarizing Long Sequences.

Peter J. Liu, Mohammad Saleh, Etienne Pot, Ben Goodrich, Ryan Sepassi, Lukasz Kaiser, Noam Shazeer: Generating Wikipedia by Summarizing Long Sequences. CoRR abs/1801.10198 (2018)

Discrete Autoencoders for Sequence Models.

Lukasz Kaiser, Samy Bengio: Discrete Autoencoders for Sequence Models. CoRR abs/1801.09797 (2018)

Unsupervised Cipher Cracking Using Discrete GANs.

Aidan N. Gomez, Sicong Huang, Ivan Zhang, Bryan M. Li, Muhammad Osama, Lukasz Kaiser: Unsupervised Cipher Cracking Using Discrete GANs. CoRR abs/1801.04883 (2018)

Image Transformer.

Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Noam Shazeer, Alexander Ku, Dustin Tran: Image Transformer. ICML 2018: 4052-4061

Fast Decoding in Sequence Models Using Discrete Latent Variables.

Lukasz Kaiser, Samy Bengio, Aurko Roy, Ashish Vaswani, Niki Parmar, Jakob Uszkoreit, Noam Shazeer: Fast Decoding in Sequence Models Using Discrete Latent Variables. ICML 2018: 2395-2404

Generating Wikipedia by Summarizing Long Sequences.

Peter J. Liu, Mohammad Saleh, Etienne Pot, Ben Goodrich, Ryan Sepassi, Lukasz Kaiser, Noam Shazeer: Generating Wikipedia by Summarizing Long Sequences. ICLR (Poster) 2018

Depthwise Separable Convolutions for Neural Machine Translation.

Lukasz Kaiser, Aidan N. Gomez, François Chollet: Depthwise Separable Convolutions for Neural Machine Translation. ICLR (Poster) 2018

Unsupervised Cipher Cracking Using Discrete GANs.

Aidan N. Gomez, Sicong Huang, Ivan Zhang, Bryan M. Li, Muhammad Osama, Lukasz Kaiser: Unsupervised Cipher Cracking Using Discrete GANs. ICLR (Poster) 2018

Tensor2Tensor for Neural Machine Translation.

Ashish Vaswani, Samy Bengio, Eugene Brevdo, François Chollet, Aidan N. Gomez, Stephan Gouws, Llion Jones, Lukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, Jakob Uszkoreit:...

The Best of Both Worlds: Combining Recent Advances in Neural Machine...

Mia Xu Chen, Orhan Firat, Ankur Bapna, Melvin Johnson, Wolfgang Macherey, George F. Foster, Llion Jones, Mike Schuster, Noam Shazeer, Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser,...

Parallel Scheduled Sampling.

Daniel Duckworth, Arvind Neelakantan, Ben Goodrich, Lukasz Kaiser, Samy Bengio: Parallel Scheduled Sampling. CoRR abs/1906.04331 (2019)

Sample Efficient Text Summarization Using a Single Pre-Trained Transformer.

Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser: Sample Efficient Text Summarization Using a Single Pre-Trained Transformer. CoRR abs/1905.08836 (2019)

Model-Based Reinforcement Learning for Atari.

Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Ryan Sepassi, George Tucker, Henryk...

Area Attention.

Yang Li, Lukasz Kaiser, Samy Bengio, Si Si: Area Attention. ICML 2019: 3846-3855

Universal Transformers.

Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, Lukasz Kaiser: Universal Transformers. ICLR (Poster) 2019

Rethinking Attention with Performers.

Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamás Sarlós, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian...

Reformer: The Efficient Transformer.

Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya: Reformer: The Efficient Transformer. CoRR abs/2001.04451 (2020)

Reformer: The Efficient Transformer.

Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya: Reformer: The Efficient Transformer. ICLR 2020

Model Based Reinforcement Learning for Atari.

Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George...

Sparse is Enough in Scaling Transformers.

Sebastian Jaszczur, Aakanksha Chowdhery, Afroz Mohiuddin, Lukasz Kaiser, Wojciech Gajewski, Henryk Michalewski, Jonni Kanerva: Sparse is Enough in Scaling Transformers. CoRR abs/2111.12763 (2021)

Training Verifiers to Solve Math Word Problems.

Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, John Schulman: Training...

Hierarchical Transformers Are More Efficient Language Models.

Piotr Nawrot, Szymon Tworkowski, Michal Tyrolski, Lukasz Kaiser, Yuhuai Wu, Christian Szegedy, Henryk Michalewski: Hierarchical Transformers Are More Efficient Language Models. CoRR abs/2110.13711 (2021)

Evaluating Large Language Models Trained on Code.

Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pondé de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger,...

Q-Value Weighted Regression: Reinforcement Learning with Limited Data.

Piotr Kozakowski, Lukasz Kaiser, Henryk Michalewski, Afroz Mohiuddin, Katarzyna Kanska: Q-Value Weighted Regression: Reinforcement Learning with Limited Data. CoRR abs/2102.06782 (2021)

Sparse is Enough in Scaling Transformers.

Sebastian Jaszczur, Aakanksha Chowdhery, Afroz Mohiuddin, Lukasz Kaiser, Wojciech Gajewski, Henryk Michalewski, Jonni Kanerva: Sparse is Enough in Scaling Transformers. NeurIPS 2021: 9895-9907

Rethinking Attention with Performers.

Krzysztof Marcin Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamás Sarlós, Peter Hawkins, Jared Quincy Davis, Afroz Mohiuddin, Lukasz Kaiser, David Benjamin Belanger,...

Hierarchical Transformers Are More Efficient Language Models.

Piotr Nawrot, Szymon Tworkowski, Michal Tyrolski, Lukasz Kaiser, Yuhuai Wu, Christian Szegedy, Henryk Michalewski: Hierarchical Transformers Are More Efficient Language Models. NAACL-HLT (Findings)...

Q-Value Weighted Regression: Reinforcement Learning with Limited Data.

Piotr Kozakowski, Lukasz Kaiser, Henryk Michalewski, Afroz Mohiuddin, Katarzyna Kanska: Q-Value Weighted Regression: Reinforcement Learning with Limited Data. IJCNN 2022: 1-8

tsGT: Stochastic Time Series Modeling With Transformer.

Lukasz Kucinski, Witold Drzewakowski, Mateusz Olko, Piotr Kozakowski, Lukasz Maziarka, Marta Emilia Nowakowska, Lukasz Kaiser, Piotr Milos: tsGT: Stochastic Time Series Modeling With Transformer. CoRR...
