RL 100
1. Triage
This is a tech blog focused on my research. I'll try to write in both Chinese (my native language) and English.
Code Master Gem Instructions
This guide reviews two important algorithms in reinforcement learning, Conservative Q-Learning (CQL) and Calibrated Q-Learning (Cal-QL), explaining the probl...
Using udev to Create Persistent Device Paths
A brief analysis of the LLaMA3 technical report.
Abstract: In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billio...
LLaMA (Large Language Model Meta AI) is a series of foundational language models developed by Meta AI. The LLaMA models are designed to be efficient and effe...
1. Overall Architecture
We introduce Q-learning, the foundation of value-based RL algorithms, then move from Q-learning to DQN, A2C, and finally SAC.
Check the MKL or OpenBLAS version of NumPy.
Build PX4-Autopilot from source code.
Parameters required for modeling the T-Motor F60Pro Kv2550.
Multicopter Rate Control in PX4 1.13 Release.
Basics of policy gradient methods, from PG to PPO.
Rigid-body motion in three-dimensional space, based on 《视觉SLAM十四讲》 (14 Lectures on Visual SLAM).