"So, without further ado, let’s dive into how attention-weighting and FFN make transformers so powerful." Srijanie Dey, PhD's latest post is an accessible introduction to the exciting world of transformers and the math behind them.
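The two building blocks the post names, attention-weighting and the feed-forward network (FFN), can be sketched in a few lines of NumPy. This is a generic illustration, not code from the article; all function names and shapes are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V

def ffn(x, W1, b1, W2, b2):
    # Position-wise feed-forward network: ReLU(x W1 + b1) W2 + b2
    return np.maximum(0.0, x @ W1 + b1) @ W2 + b2
```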
Towards Data Science’s Post
More Relevant Posts
-
If you would like to understand the math behind a crucial block of Transformers, I explained it in my first Medium article.
Decoding “Attention is all you need”
medium.com
-
From its underlying math to a hands-on, from-scratch implementation, Cristian Leo's new deep dive is a comprehensive introduction to the inner workings of transformers' multi-head attention.
The Math Behind Multi-Head Attention in Transformers
towardsdatascience.com
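The core idea of multi-head attention, projecting the input into several lower-dimensional heads, attending in each head, then concatenating, can be sketched as follows. A minimal NumPy illustration, not Cristian Leo's implementation; weight names and shapes are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
    seq, d_model = X.shape
    d_head = d_model // n_heads

    def split(W):
        # Project, then split into heads: (n_heads, seq, d_head)
        return (X @ W).reshape(seq, n_heads, d_head).transpose(1, 0, 2)

    Q, K, V = split(Wq), split(Wk), split(Wv)
    scores = softmax(Q @ K.transpose(0, 2, 1) / np.sqrt(d_head))
    heads = scores @ V                                   # (n_heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq, d_model)
    return concat @ Wo                                   # final output projection
```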
-
Data Scientist and Machine Learning Engineer | Machine Learning, Deep Learning and NLP | Cloud Certified (Azure, AWS) | Passionate about solving practical business problems with data and AI.
Lovely introduction to transformers based on the recent history of DL. I actually learned deep learning the "prehistoric age" way: https://lnkd.in/d-gtps_2
Stanford CS25: V2 I Introduction to Transformers w/ Andrej Karpathy
https://www.youtube.com/
-
For a powerful blend of math and code, don't miss Cristian Leo's comprehensive introduction to KANs (Kolmogorov-Arnold networks), which explains in detail how they've come to surpass multi-layer perceptrons in accuracy and interpretability.
The Math Behind KAN — Kolmogorov-Arnold Networks
towardsdatascience.com
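The distinguishing feature of KANs is that the learnable univariate functions sit on the edges of the network rather than on the nodes. A toy sketch of one such edge function is below; it uses a Gaussian RBF basis as a stand-in for the B-spline basis of the original formulation, so it is a simplification, not the article's code.

```python
import numpy as np

def silu(x):
    return x / (1.0 + np.exp(-x))

class KANEdge:
    """One learnable edge function: phi(x) = w_b * silu(x) + sum_i c_i * basis_i(x).
    Gaussian RBFs stand in for the B-spline basis (an illustrative simplification)."""

    def __init__(self, n_basis=8, rng=None):
        rng = rng or np.random.default_rng(0)
        self.centers = np.linspace(-2.0, 2.0, n_basis)  # fixed basis centers
        self.coefs = rng.normal(scale=0.1, size=n_basis)  # learnable spline weights
        self.w_base = 1.0                                 # learnable base weight

    def __call__(self, x):
        # basis has shape (..., n_basis); contract it against the coefficients.
        basis = np.exp(-(x[..., None] - self.centers) ** 2)
        return self.w_base * silu(x) + basis @ self.coefs
```

In a full KAN, each layer applies one such function per (input, output) pair and sums the results, and the coefficients are trained by gradient descent.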
-
Full Stack Software Consultant ✔Web Apps ✔Java ✔WebGIS ✔Flutter (Mobile, Web) ✔Aeronautical Information Systems
https://lnkd.in/gzX5B5bE The body of the book consists of self-contained, non-technical, mainstream explanations and examples from the field of mathematics that deals with meaning, called model theory. #ai #artificialintelligence #machinelearning #deeplearning #philosophyofmathematics #modeltheory #mathematicallogic
Hack, Hack, Who's There? A Gentle Introduction to Model Theory
freecomputerbooks.com
-
I have just released a video on Recurrent Neural Networks (RNNs). RNNs are at the heart of numerous breakthroughs in AI, particularly in processing and predicting sequential data. The video covers the concept and the math behind RNNs. Here is the video link: https://lnkd.in/gFZVtDUm #rnn #ai #deeplearning
RNN Theory and Math Clearly Explained
https://www.youtube.com/
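The recurrence at the heart of a vanilla RNN is just two equations: h_t = tanh(Wxh x_t + Whh h_{t-1} + bh) and y_t = Why h_t + by. A minimal NumPy sketch (a generic illustration, not code from the video; names and shapes are assumptions):

```python
import numpy as np

def rnn_forward(xs, Wxh, Whh, Why, bh, by):
    """Run a vanilla RNN over a sequence xs of shape (T, d_in).
    h_t = tanh(Wxh x_t + Whh h_{t-1} + bh);  y_t = Why h_t + by."""
    h = np.zeros(Whh.shape[0])  # initial hidden state h_0 = 0
    ys = []
    for x in xs:
        h = np.tanh(Wxh @ x + Whh @ h + bh)  # hidden state carries context forward
        ys.append(Why @ h + by)
    return np.stack(ys), h
```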
-
In the latest installment of his "Math behind..." series, Cristian Leo leads us on a detailed exploration of batch normalization, its underlying mathematics, and its from-scratch implementation.
The Math Behind Batch Normalization
towardsdatascience.com
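The training-time forward pass of batch normalization fits in a few lines: normalize each feature over the batch, then apply a learnable scale and shift. A minimal sketch (not the article's implementation; the running statistics used at inference time are omitted):

```python
import numpy as np

def batch_norm(X, gamma, beta, eps=1e-5):
    # X: (batch, features). Normalize each feature over the batch dimension.
    mu = X.mean(axis=0)
    var = X.var(axis=0)
    X_hat = (X - mu) / np.sqrt(var + eps)  # zero mean, unit variance per feature
    return gamma * X_hat + beta            # learnable scale and shift
```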
-
'Attention Is All You Need' by Ashish Vaswani et al. in 2017 made a revolutionary impact on Natural Language Processing (NLP) by introducing the Transformer architecture. Nowadays, this architecture is widely used across various domains, with its core structure remaining largely untouched. It's striking that, at its introduction, the authors could not have foreseen its impact, given the continuous growth of deep learning approaches since then. This video from Andrej Karpathy gives an excellent introduction to Transformers and their historical context. Do watch it if you are interested :) CS25 I Stanford Seminar - Transformers United 2023: Introduction to Transformers w/ Andrej Karpathy
Stanford CS25: V2 I Introduction to Transformers w/ Andrej Karpathy
https://www.youtube.com/
-
One of the best videos I've seen explaining how transformers work, aimed at anyone with an undergraduate mathematics education or some exposure to linear algebra and matrix multiplication. Highly recommend checking this out if you want to go deeper on transformers without diving into papers like "Attention Is All You Need": https://lnkd.in/e5iSwKk9
But what is a GPT? Visual intro to transformers | Chapter 5, Deep Learning
https://www.youtube.com/
-
Data Enthusiast | Data Analyst | Data Science | ML/DL/AI | Analytics | Visualization | ETL | UI/UX | NFT | Power Apps | IT | Content Writer | Jobs/Recruitment | Quoran | Follow for more
🚀 Exciting news in the world of Computer Vision! A new method has been proposed to reduce halo artifacts in Local Histogram Equalization algorithms, resulting in visually natural images. This approach leverages insights from the human visual system to address dark and light variants separately. #ComputerVision #ImageProcessing #AI #ML #TechInnovation
arxiv.org