• Home
  • Blog
  •  +62 (811) 8751 555
  • Contact
Latest
How to reduce peak VRAM usage and increase Volatile GPU-Util?
TL;DR: split the intermediate matrix so it fits in the GPU L2 cache.
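The TL;DR above points at a tiling technique. A minimal sketch of the idea (NumPy only; the function name, chunk size, and chunking along the row dimension are illustrative assumptions, not the post's actual implementation) that computes a matrix product one tile at a time so the intermediate rows stay small enough to remain resident in fast memory:

```python
import numpy as np

def chunked_matmul(a, b, chunk_size=256):
    # Compute a @ b in row chunks so only one small tile of the
    # intermediate result is live at a time, instead of the full matrix.
    out = np.empty((a.shape[0], b.shape[1]), dtype=np.result_type(a, b))
    for start in range(0, a.shape[0], chunk_size):
        end = min(start + chunk_size, a.shape[0])
        out[start:end] = a[start:end] @ b  # one tile per iteration
    return out
```

On a GPU the same pattern would let each tile fit in cache; the sketch only demonstrates the splitting, and the result is identical to the unsplit product.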
Jason Rich Darmawan
Multi-head Attention is the same as a Linear transformation with less computation
Multi-head Attention is the same as a Linear transformation with less computation. 1) Multi-head Attention is the same as a Linear transformation because it has the same property of "Every output depend...
Apr 10, 2025 Large Language Model
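The excerpt above claims multi-head attention acts like a single linear transformation. One standard equivalence in that spirit (a NumPy sketch with random data and hypothetical shapes, not the post's own derivation): concatenating the heads and applying one output projection gives the same result as projecting each head through its own slice of that matrix and summing.

```python
import numpy as np

rng = np.random.default_rng(0)
n_heads, seq, d_head, d_model = 4, 8, 16, 64

# Hypothetical per-head attention outputs and a shared output projection.
heads = [rng.standard_normal((seq, d_head)) for _ in range(n_heads)]
w_o = rng.standard_normal((n_heads * d_head, d_model))

# Standard formulation: concatenate heads, then one linear map.
concat = np.concatenate(heads, axis=-1) @ w_o

# Equivalent view: each head through its own slice of w_o, summed.
split = sum(h @ w_o[i * d_head:(i + 1) * d_head]
            for i, h in enumerate(heads))

assert np.allclose(concat, split)
```

Every output element is a weighted sum of every head's features, which is exactly the "every output depends on every input" property a dense linear layer has.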
Jason Rich Darmawan
Business value of Artificial Intelligence in classifying Fruits.
The main message is why building AI to classify fruits has business value. The target audience is the management, so no jargon is used in the presentation. The presentation overuses animation to make it ...
May 23, 2024 Blog
Useful Links
  • Home
  • Contact

Connect with Jason
  • Contact
  • +62 (811) 8751-555
  • +86 (132) 5993-9392
Copyright © Jason Rich Darmawan