🤓 Yashwanth's Notes

        • 1. Understanding Large Language Models
        • 2. Working with Text Data
        • 3. Coding Attention Mechanisms
        • 4. Implementing a GPT Model From Scratch to Generate Text
        • 5. Pretraining on Unlabeled Data
      • DDPM from Scratch
      • SD from Scratch
        • Inner Products
        • Lengths and Angles of Vectors
        • Matrix Representations of inner products
        • Norms
      • Autocorrelation
      • Hessian Matrix
      • Quasi-Newton Methods
      • Radial Basis Functions (RBFs)
      • Structural risk minimization
      • Symmetric Positive Definite Matrices (SPD Matrices)
      • The Conjugate Gradient Method
      • The Polar Decomposition
      • AdaMuon - Adaptive Muon Optimizer
      • AlexNet - ImageNet Classification with Deep Convolutional Neural Networks
      • Hands-on Bayesian Neural Networks – A Tutorial for Deep Learning Users
      • High-Resolution Image Synthesis with Latent Diffusion Models
      • Identity Mappings in Deep Residual Networks
      • Keeping Neural Networks Simple by Minimizing the Description Length of the Weights
      • LeNet - Gradient-Based Learning Applied to Document Recognition
      • ResNet - Deep Residual Learning for Image Recognition
    Folder: Papers

    8 items under this folder.

    • Sep 18, 2025: AdaMuon - Adaptive Muon Optimizer
    • May 20, 2025: Hands-on Bayesian Neural Networks – A Tutorial for Deep Learning Users
    • Feb 25, 2025: Keeping Neural Networks Simple by Minimizing the Description Length of the Weights
    • Feb 04, 2025: High-Resolution Image Synthesis with Latent Diffusion Models
    • Nov 05, 2024: Identity Mappings in Deep Residual Networks
    • Nov 05, 2024: ResNet - Deep Residual Learning for Image Recognition
    • Oct 26, 2024: AlexNet - ImageNet Classification with Deep Convolutional Neural Networks
    • Oct 25, 2024: LeNet - Gradient-Based Learning Applied to Document Recognition

